release 2014.12.13

Merge remote-tracking branch 'fstirlitz/master'
Merge branch 'master' of github.com:rg3/youtube-dl
2014-12-13 23:13:48 +01:00 · 2014-12-13 23:05:41 +01:00 · 2014-12-13 23:05:28 +01:00 · 2014-12-13 23:05:22 +01:00 · 2014-12-14 03:42:42 +06:00 · 2014-12-14 03:41:17 +06:00
233 changed files with 4029 additions and 1713 deletions
--- a/7
+++ b/7
@@ -86,3 +86,10 @@ Mauroy Sébastien
 William Sewell
 Dao Hoang Son
 Oskar Jauch
+Matthew Rayfield
+t0mm0
+Tithen-Firion
+Zack Fernandes
+cryptonaut
+Adrian Kretz
+Mathias Rav
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,136 @@
+Please include the full output of the command when run with `--verbose`. The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
+
+Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
+
+### Is the description of the issue itself sufficient?
+
+We often get issue reports that we cannot really decipher. While in most cases we eventually get the required information after asking back multiple times, this poses an unnecessary drain on our resources. Many contributors, including myself, are also not native speakers, so we may misread some parts.
+
+So please elaborate on what feature you are requesting, or what bug you want to be fixed. Make sure that it's obvious
+
+- What the problem is
+- How it could be fixed
+- What your proposed solution would look like
+
+If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
+
+For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
+
+Site support requests **must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
+
+###  Are you using the latest version?
+
+Before reporting any issue, type youtube-dl -U. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
+
+###  Is the issue already documented?
+
+Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
+
+###  Why are existing options not enough?
+
+Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
+
+###  Is there enough context in your bug report?
+
+People want to solve problems, and often think they do us a favor by breaking down their larger problems (e.g. wanting to skip already downloaded files) to a specific request (e.g. requesting us to look whether the file exists before downloading the info page). However, what often happens is that they break down the problem into two steps: one simple, and one impossible (or extremely complicated).
+
+We are then presented with a very complicated request when the original problem could be solved far easier, e.g. by recording the downloaded video IDs in a separate file. To avoid this, you must include the greater context where it is non-obvious. In particular, every feature request that does not consist of adding support for a new site should contain a use case scenario that explains in what situation the missing feature would be useful.
+
+###  Does the issue involve one problem, and one problem only?
+
+Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
+
+In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
+
+###  Is anyone going to need the feature?
+
+Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
+
+###  Is your question about youtube-dl?
+
+It may sound strange, but some bug reports we receive are completely unrelated to youtube-dl and relate to a different or even the reporter's own application. Please make sure that you are actually using youtube-dl. If you are using a UI for youtube-dl, report the bug to the maintainer of the actual application providing the UI. On the other hand, if your UI for youtube-dl fails in some way you believe is related to youtube-dl, by all means, go ahead and report the bug.
+
+# DEVELOPER INSTRUCTIONS
+
+Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
+
+To run youtube-dl as a developer, you don't need to build anything either. Simply execute
+
+    python -m youtube_dl
+
+To run the tests, simply invoke your favorite test runner, or execute a test file directly; any of the following work:
+
+    python -m unittest discover
+    python test/test_download.py
+    nosetests
+
+If you want to create a build of youtube-dl yourself, you'll need
+
+* python
+* make
+* pandoc
+* zip
+* nosetests
+
+### Adding support for a new site
+
+If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
+
+1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
+2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
+3. Start a new git branch with `cd youtube-dl; git checkout -b yourextractor`
+4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
+    ```python
+    # coding: utf-8
+    from __future__ import unicode_literals
+
+    from .common import InfoExtractor
+
+
+    class YourExtractorIE(InfoExtractor):
+        _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
+        _TEST = {
+            'url': 'http://yourextractor.com/watch/42',
+            'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
+            'info_dict': {
+                'id': '42',
+                'ext': 'mp4',
+                'title': 'Video title goes here',
+                'thumbnail': 're:^https?://.*\.jpg$',
+                # TODO more properties, either as:
+                # * A value
+                # * MD5 checksum; start the string with md5:
+                # * A regular expression; start the string with re:
+                # * Any Python type (for example int or float)
+            }
+        }
+
+        def _real_extract(self, url):
+            video_id = self._match_id(url)
+            webpage = self._download_webpage(url, video_id)
+
+            # TODO more code goes here, for example ...
+            title = self._html_search_regex(r'<h1>(.*?)</h1>', webpage, 'title')
+
+            return {
+                'id': video_id,
+                'title': title,
+                'description': self._og_search_description(webpage),
+                # TODO more properties (see youtube_dl/extractor/common.py)
+            }
+    ```
+5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
+6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
+7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
+8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
+9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
+
+        $ git add youtube_dl/extractor/__init__.py
+        $ git add youtube_dl/extractor/yourextractor.py
+        $ git commit -m '[yourextractor] Add new extractor'
+        $ git push origin yourextractor
+
+10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
+
+In any case, thank you very much for your contributions!
+
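Note on the extractor template in CONTRIBUTING.md: the `_VALID_URL` pattern can be tried out on its own before an extractor is written. The following is a minimal sketch using only the standard `re` module. The helper `match_id` is a hypothetical stand-in for illustration and is not youtube-dl's `_match_id` implementation.

```python
import re

# Pattern copied from the template extractor in CONTRIBUTING.md.
VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'


def match_id(url):
    """Hypothetical helper: return the named group 'id', or None if no match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


# A video page matches and yields its numeric ID.
print(match_id('http://yourextractor.com/watch/42'))
# The optional "www." prefix and https are accepted as well.
print(match_id('https://www.yourextractor.com/watch/1337'))
# The main page of the service is not a video URL and does not match.
print(match_id('http://yourextractor.com/'))
```

This also shows why a site's main page is not a useful example URL: it carries no video ID for the pattern to capture.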
--- a/7
+++ b/7
@@ -1,7 +1,7 @@
-all: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
+all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish

 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json CONTRIBUTING.md.tmp

 cleanall: clean
 	rm -f youtube-dl youtube-dl.exe
@@ -56,6 +56,9 @@ youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
 README.md: youtube_dl/*.py youtube_dl/*/*.py
 	COLUMNS=80 python -m youtube_dl --help | python devscripts/make_readme.py

+CONTRIBUTING.md: README.md
+	python devscripts/make_contributing.py README.md CONTRIBUTING.md
+
 README.txt: README.md
 	pandoc -f markdown -t plain README.md -o README.txt

--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@ Alternatively, refer to the developer instructions below for how to check out an
 # DESCRIPTION
 **youtube-dl** is a small command-line program to download videos from
 YouTube.com and a few more sites. It requires the Python interpreter, version
-2.6, 2.7, or 3.3+, and it is not platform specific. It should work on
+2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
 your Unix box, on Windows or on Mac OS X. It is released to the public domain,
 which means you can modify it, redistribute it or use it however you like.

@@ -65,10 +65,10 @@ which means you can modify it, redistribute it or use it however you like.
                                     this is not possible instead of searching.
    --ignore-config                  Do not read configuration files. When given
                                     in the global configuration file /etc
-                                     /youtube-dl.conf: do not read the user
-                                     configuration in ~/.config/youtube-dl.conf
-                                     (%APPDATA%/youtube-dl/config.txt on
-                                     Windows)
+                                     /youtube-dl.conf: Do not read the user
+                                     configuration in ~/.config/youtube-
+                                     dl/config (%APPDATA%/youtube-dl/config.txt
+                                     on Windows)
    --flat-playlist                  Do not extract the videos of a playlist,
                                     only list them.

@@ -93,7 +93,8 @@ which means you can modify it, redistribute it or use it however you like.
                                     COUNT views
    --max-views COUNT                Do not download any videos with more than
                                     COUNT views
-    --no-playlist                    download only the currently playing video
+    --no-playlist                    If the URL refers to a video and a
+                                     playlist, download only the video.
    --age-limit YEARS                download only videos suitable for the given
                                     age
    --download-archive FILE          Download only videos not listed in the
@@ -112,12 +113,12 @@ which means you can modify it, redistribute it or use it however you like.
                                     size. By default, the buffer size is
                                     automatically resized from an initial value
                                     of SIZE.
+    --playlist-reverse               Download playlist videos in reverse order

 ## Filesystem Options:
    -a, --batch-file FILE            file containing URLs to download ('-' for
                                     stdin)
    --id                             use only video ID in file name
-    -A, --auto-number                number downloaded files starting from 00000
    -o, --output TEMPLATE            output filename template. Use %(title)s to
                                     get the title, %(uploader)s for the
                                     uploader name, %(uploader_id)s for the
@@ -151,6 +152,9 @@ which means you can modify it, redistribute it or use it however you like.
    --restrict-filenames             Restrict filenames to only ASCII
                                     characters, and avoid "&" and spaces in
                                     filenames
+    -A, --auto-number                [deprecated; use  -o
+                                     "%(autonumber)s-%(title)s.%(ext)s" ] number
+                                     downloaded files starting from 00000
    -t, --title                      [deprecated] use title in file name
                                     (default)
    -l, --literal                    [deprecated] alias of --title
@@ -492,14 +496,15 @@ If you want to add support for a new site, you can follow this quick list (assum

        def _real_extract(self, url):
            video_id = self._match_id(url)
+            webpage = self._download_webpage(url, video_id)

            # TODO more code goes here, for example ...
-            webpage = self._download_webpage(url, video_id)
            title = self._html_search_regex(r'<h1>(.*?)</h1>', webpage, 'title')

            return {
                'id': video_id,
                'title': title,
+                'description': self._og_search_description(webpage),
                # TODO more properties (see youtube_dl/extractor/common.py)
            }
    ```
@@ -534,13 +539,11 @@ Most likely, you'll want to use various options. For a list of what can be done,

 # BUGS

-Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email.
+Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the irc channel #youtube-dl on freenode.

 Please include the full output of the command when run with `--verbose`. The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.

-For discussions, join us in the irc channel #youtube-dl on freenode.
-
-When you submit a request, please re-read it once to avoid a couple of mistakes (you can and should use this as a checklist):
+Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):

 ### Is the description of the issue itself sufficient?

--- a/devscripts/bash-completion.py
+++ b/devscripts/bash-completion.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python
+from __future__ import unicode_literals
+
 import os
 from os.path import dirname as dirn
 import sys
--- a/devscripts/check-porn.py
+++ b/devscripts/check-porn.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+from __future__ import unicode_literals

 """
 This script employs a VERY basic heuristic ('porn' in webpage.lower()) to check
--- a/devscripts/gh-pages/add-version.py
+++ b/devscripts/gh-pages/add-version.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python3
+from __future__ import unicode_literals

 import json
 import sys
--- a/devscripts/gh-pages/generate-download.py
+++ b/devscripts/gh-pages/generate-download.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+from __future__ import unicode_literals
+
 import hashlib
 import urllib.request
 import json
--- a/devscripts/gh-pages/sign-versions.py
+++ b/devscripts/gh-pages/sign-versions.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python3
+from __future__ import unicode_literals, with_statement

 import rsa
 import json
@@ -29,4 +30,5 @@ signature = hexlify(rsa.pkcs1.sign(json.dumps(versions_info, sort_keys=True).enc
 print('signature: ' + signature)

 versions_info['signature'] = signature
-json.dump(versions_info, open('update/versions.json', 'w'), indent=4, sort_keys=True)
+with open('update/versions.json', 'w') as versionsf:
+    json.dump(versions_info, versionsf, indent=4, sort_keys=True)
--- a/devscripts/gh-pages/update-copyright.py
+++ b/devscripts/gh-pages/update-copyright.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 # coding: utf-8

-from __future__ import with_statement
+from __future__ import with_statement, unicode_literals

 import datetime
 import glob
@@ -13,7 +13,7 @@ year = str(datetime.datetime.now().year)
 for fn in glob.glob('*.html*'):
    with io.open(fn, encoding='utf-8') as f:
        content = f.read()
-    newc = re.sub(u'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', u'Copyright © 2006-' + year, content)
+    newc = re.sub(r'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', 'Copyright © 2006-' + year, content)
    if content != newc:
        tmpFn = fn + '.part'
        with io.open(tmpFn, 'wt', encoding='utf-8') as outf:
--- a/devscripts/gh-pages/update-feed.py
+++ b/devscripts/gh-pages/update-feed.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python3
+from __future__ import unicode_literals

 import datetime
 import io
--- a/devscripts/gh-pages/update-sites.py
+++ b/devscripts/gh-pages/update-sites.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python3
+from __future__ import unicode_literals

 import sys
 import os
--- a/devscripts/make_contributing.py
+++ b/devscripts/make_contributing.py
@@ -0,0 +1,32 @@
+#!/usr/bin/env python
+from __future__ import unicode_literals
+
+import argparse
+import io
+import re
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        'INFILE', help='README.md file name to read from')
+    parser.add_argument(
+        'OUTFILE', help='CONTRIBUTING.md file name to write to')
+    args = parser.parse_args()
+
+    with io.open(args.INFILE, encoding='utf-8') as inf:
+        readme = inf.read()
+
+    bug_text = re.search(
+        r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
+    dev_text = re.search(
+        r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
+        readme).group(1)
+
+    out = bug_text + dev_text
+
+    with io.open(args.OUTFILE, 'w', encoding='utf-8') as outf:
+        outf.write(out)
+
+if __name__ == '__main__':
+    main()
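Note on `devscripts/make_contributing.py`: the two regular expressions can be checked against a small input. The sketch below applies them, unchanged, to a made-up and heavily shortened README. The sample text is an assumption for illustration only; just the patterns come from the script.

```python
import re

# Hypothetical, shortened README; only the section order mirrors the real file.
SAMPLE_README = '''# DEVELOPER INSTRUCTIONS

Run python -m youtube_dl.

# EMBEDDING YOUTUBE-DL

Import youtube_dl.

# BUGS

Report bugs on the issue tracker.

Please include verbose output.

# COPYRIGHT

Public domain.
'''

# Same patterns as in devscripts/make_contributing.py.
bug_text = re.search(
    r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', SAMPLE_README).group(1)
dev_text = re.search(
    r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
    SAMPLE_README).group(1)

out = bug_text + dev_text
print(out)
```

On this input, `[^\n]*` consumes the first paragraph after the `# BUGS` heading, so only the second paragraph ends up in `bug_text`. That matches the generated CONTRIBUTING.md above, which starts at "Please include the full output" rather than at the first BUGS paragraph.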
--- a/devscripts/make_readme.py
+++ b/devscripts/make_readme.py
@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
 import io
 import sys
 import re
--- a/devscripts/prepare_manpage.py
+++ b/devscripts/prepare_manpage.py
@@ -1,3 +1,4 @@
+from __future__ import unicode_literals

 import io
 import os.path
--- a/devscripts/zsh-completion.py
+++ b/devscripts/zsh-completion.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python
+from __future__ import unicode_literals
+
 import os
 from os.path import dirname as dirn
 import sys
--- a/setup.py
+++ b/setup.py
@@ -102,7 +102,9 @@ setup(
        "Programming Language :: Python :: 2.6",
        "Programming Language :: Python :: 2.7",
        "Programming Language :: Python :: 3",
-        "Programming Language :: Python :: 3.3"
+        "Programming Language :: Python :: 3.2",
+        "Programming Language :: Python :: 3.3",
+        "Programming Language :: Python :: 3.4",
    ],

    **params
--- a/test/helper.py
+++ b/test/helper.py
@@ -141,7 +141,7 @@ def expect_info_dict(self, expected_dict, got_dict):
    if missing_keys:
        def _repr(v):
            if isinstance(v, compat_str):
-                return "'%s'" % v.replace('\\', '\\\\').replace("'", "\\'")
+                return "'%s'" % v.replace('\\', '\\\\').replace("'", "\\'").replace('\n', '\\n')
            else:
                return repr(v)
        info_dict_str = ''.join(
@@ -161,7 +161,9 @@ def assertRegexpMatches(self, text, regexp, msg=None):
    else:
        m = re.match(regexp, text)
        if not m:
-            note = 'Regexp didn\'t match: %r not found in %r' % (regexp, text)
+            note = 'Regexp didn\'t match: %r not found' % (regexp)
+            if len(text) < 1000:
+                note += ' in %r' % text
            if msg is None:
                msg = note
            else:
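Note on the `assertRegexpMatches` change in `test/helper.py`: the failure note now omits the searched text when it is 1000 characters or longer. A minimal sketch of that message logic in isolation follows. The function name `mismatch_note` and the `limit` parameter are hypothetical; the helper itself builds the note inline.

```python
def mismatch_note(regexp, text, limit=1000):
    """Hypothetical helper: build the failure note, quoting text only if short."""
    note = 'Regexp didn\'t match: %r not found' % (regexp,)
    if len(text) < limit:
        # Short inputs are still quoted in full to ease debugging.
        note += ' in %r' % text
    return note


print(mismatch_note('^a+$', 'bbb'))
# A whole source file or web page would flood the test output, so it is left out.
print(mismatch_note('^a+$', 'b' * 5000))
```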
--- a/test/test_download.py
+++ b/test/test_download.py
@@ -97,7 +97,7 @@ def generator(test_case):
            return
        for other_ie in other_ies:
            if not other_ie.working():
-                print_skipping(u'test depends on %sIE, marked as not WORKING' % other_ie.ie_key())
+                print_skipping('test depends on %sIE, marked as not WORKING' % other_ie.ie_key())
                return

        params = get_params(test_case.get('params', {}))
@@ -143,7 +143,7 @@ def generator(test_case):
                        raise

                    if try_num == RETRIES:
-                        report_warning(u'Failed due to network errors, skipping...')
+                        report_warning('Failed due to network errors, skipping...')
                        return

                    print('Retrying: {0} failed tries\n\n##########\n\n'.format(try_num))
--- a/test/test_subtitles.py
+++ b/test/test_subtitles.py
@@ -238,7 +238,7 @@ class TestVimeoSubtitles(BaseTestSubtitles):
    def test_subtitles(self):
        self.DL.params['writesubtitles'] = True
        subtitles = self.getSubtitles()
-        self.assertEqual(md5(subtitles['en']), '8062383cf4dec168fc40a088aa6d5888')
+        self.assertEqual(md5(subtitles['en']), '26399116d23ae3cf2c087cea94bc43b4')

    def test_subtitles_lang(self):
        self.DL.params['writesubtitles'] = True
--- a/test/test_unicode_literals.py
+++ b/test/test_unicode_literals.py
@@ -1,5 +1,11 @@
 from __future__ import unicode_literals

+# Allow direct execution
+import os
+import sys
+import unittest
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
 import io
 import os
 import re
@@ -9,14 +15,16 @@ rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

 IGNORED_FILES = [
    'setup.py',  # http://bugs.python.org/issue13943
+    'conf.py',
+    'buildserver.py',
 ]


+from test.helper import assertRegexpMatches
+
+
 class TestUnicodeLiterals(unittest.TestCase):
    def test_all_files(self):
-        print('Skipping this test (not yet fully implemented)')
-        return
-
        for dirpath, _, filenames in os.walk(rootDir):
            for basename in filenames:
                if not basename.endswith('.py'):
@@ -30,10 +38,11 @@ class TestUnicodeLiterals(unittest.TestCase):

                if "'" not in code and '"' not in code:
                    continue
-                imps = 'from __future__ import unicode_literals'
-                self.assertTrue(
-                    imps in code,
-                    ' %s  missing in %s' % (imps, fn))
+                assertRegexpMatches(
+                    self,
+                    code,
+                    r'(?:(?:#.*?|\s*)\n)*from __future__ import (?:[a-z_]+,\s*)*unicode_literals',
+                    'unicode_literals import  missing in %s' % fn)

                m = re.search(r'(?<=\s)u[\'"](?!\)|,|$)', code)
                if m is not None:
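Note on the new check in `test/test_unicode_literals.py`: the pattern accepts leading comment lines and blank lines, and other `__future__` names listed before `unicode_literals`. The sketch below runs the same pattern on a few made-up source strings. The wrapper `has_unicode_literals_import` is a hypothetical name used only here.

```python
import re

# Pattern copied from the test; it is anchored at the start of the file by re.match.
PATTERN = (r'(?:(?:#.*?|\s*)\n)*'
           r'from __future__ import (?:[a-z_]+,\s*)*unicode_literals')


def has_unicode_literals_import(code):
    """Hypothetical wrapper: True if the file starts with the expected import."""
    return re.match(PATTERN, code) is not None


with_header = ('#!/usr/bin/env python\n'
               '# coding: utf-8\n'
               'from __future__ import with_statement, unicode_literals\n')
listed_first = 'from __future__ import unicode_literals, with_statement\n'
missing = 'import os\nprint("x")\n'

print(has_unicode_literals_import(with_header))
print(has_unicode_literals_import(listed_first))
print(has_unicode_literals_import(missing))
```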
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -47,6 +47,8 @@ from youtube_dl.utils import (
    js_to_json,
    intlist_to_bytes,
    args_to_str,
+    parse_filesize,
+    version_tuple,
 )


@@ -142,6 +144,9 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(unified_strdate('2012/10/11 01:56:38 +0000'), '20121011')
        self.assertEqual(unified_strdate('1968-12-10'), '19681210')
        self.assertEqual(unified_strdate('28/01/2014 21:00:00 +0100'), '20140128')
+        self.assertEqual(
+            unified_strdate('11/26/2014 11:30:00 AM PST', day_first=False),
+            '20141126')

    def test_find_xpath_attr(self):
        testxml = '''<root>
@@ -170,7 +175,7 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(find('media:song/url').text, 'http://server.com/download.mp3')

    def test_smuggle_url(self):
-        data = {u"ö": u"ö", u"abc": [3]}
+        data = {"ö": "ö", "abc": [3]}
        url = 'https://foo.bar/baz?x=y#a'
        smug_url = smuggle_url(url, data)
        unsmug_url, unsmug_data = unsmuggle_url(smug_url)
@@ -219,6 +224,9 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(parse_duration('0s'), 0)
        self.assertEqual(parse_duration('01:02:03.05'), 3723.05)
        self.assertEqual(parse_duration('T30M38S'), 1838)
+        self.assertEqual(parse_duration('5 s'), 5)
+        self.assertEqual(parse_duration('3 min'), 180)
+        self.assertEqual(parse_duration('2.5 hours'), 9000)

    def test_fix_xml_ampersands(self):
        self.assertEqual(
@@ -367,5 +375,20 @@ class TestUtil(unittest.TestCase):
            'foo ba/r -baz \'2 be\' \'\''
        )

+    def test_parse_filesize(self):
+        self.assertEqual(parse_filesize(None), None)
+        self.assertEqual(parse_filesize(''), None)
+        self.assertEqual(parse_filesize('91 B'), 91)
+        self.assertEqual(parse_filesize('foobar'), None)
+        self.assertEqual(parse_filesize('2 MiB'), 2097152)
+        self.assertEqual(parse_filesize('5 GB'), 5000000000)
+        self.assertEqual(parse_filesize('1.2Tb'), 1200000000000)
+        self.assertEqual(parse_filesize('1,24 KB'), 1240)
+
+    def test_version_tuple(self):
+        self.assertEqual(version_tuple('1'), (1,))
+        self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
+        self.assertEqual(version_tuple('10.1-6'), (10, 1, 6))  # avconv style
+
 if __name__ == '__main__':
    unittest.main()
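Note on the new `parse_filesize` and `version_tuple` tests: the implementations in `youtube_dl/utils.py` are not part of this excerpt. The sketch below is one possible implementation that satisfies the assertions shown above. The `_sketch` function names, the unit table, and the regular expression are assumptions, not the project's code.

```python
import re

# Assumed unit table for this sketch; the real table in utils.py may differ.
_UNITS = {
    'B': 1,
    'KB': 1000, 'KIB': 1024,
    'MB': 1000 ** 2, 'MIB': 1024 ** 2,
    'GB': 1000 ** 3, 'GIB': 1024 ** 3,
    'TB': 1000 ** 4, 'TIB': 1024 ** 4,
}


def parse_filesize_sketch(s):
    """Return the size in bytes, or None if the string cannot be parsed."""
    if s is None:
        return None
    m = re.match(
        r'\s*(?P<num>[0-9]+(?:[.,][0-9]+)?)\s*(?P<unit>[A-Za-z]+)\s*$', s)
    if not m:
        return None
    # Units are compared case-insensitively, so 'Tb' is read as terabytes.
    mult = _UNITS.get(m.group('unit').upper())
    if mult is None:
        return None
    # Accept a decimal comma as well as a decimal point.
    num = float(m.group('num').replace(',', '.'))
    return int(round(num * mult))


def version_tuple_sketch(v):
    """Split a version string on '.' and '-' and convert each part to int."""
    return tuple(int(part) for part in re.split(r'[-.]', v))


print(parse_filesize_sketch('2 MiB'))
print(version_tuple_sketch('10.1-6'))
```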
--- a/test/test_write_annotations.py
+++ b/test/test_write_annotations.py
@@ -1,5 +1,6 @@
 #!/usr/bin/env python
 # coding: utf-8
+from __future__ import unicode_literals

 # Allow direct execution
 import os
--- a/test/test_write_info_json.py
+++ b/test/test_write_info_json.py
@@ -1,5 +1,6 @@
 #!/usr/bin/env python
 # coding: utf-8
+from __future__ import unicode_literals

 # Allow direct execution
 import os
@@ -32,7 +33,7 @@ params = get_params({
 TEST_ID = 'BaW_jenozKc'
 INFO_JSON_FILE = TEST_ID + '.info.json'
 DESCRIPTION_FILE = TEST_ID + '.mp4.description'
-EXPECTED_DESCRIPTION = u'''test chars:  "'/\ä↭𝕐
+EXPECTED_DESCRIPTION = '''test chars:  "'/\ä↭𝕐
 test URL: https://github.com/rg3/youtube-dl/issues/1892

 This is a test video for youtube-dl.
@@ -53,11 +54,11 @@ class TestInfoJSON(unittest.TestCase):
        self.assertTrue(os.path.exists(INFO_JSON_FILE))
        with io.open(INFO_JSON_FILE, 'r', encoding='utf-8') as jsonf:
            jd = json.load(jsonf)
-        self.assertEqual(jd['upload_date'], u'20121002')
+        self.assertEqual(jd['upload_date'], '20121002')
        self.assertEqual(jd['description'], EXPECTED_DESCRIPTION)
        self.assertEqual(jd['id'], TEST_ID)
        self.assertEqual(jd['extractor'], 'youtube')
-        self.assertEqual(jd['title'], u'''youtube-dl test video "'/\ä↭𝕐''')
+        self.assertEqual(jd['title'], '''youtube-dl test video "'/\ä↭𝕐''')
        self.assertEqual(jd['uploader'], 'Philipp Hagemeister')

        self.assertTrue(os.path.exists(DESCRIPTION_FILE))
--- a/test/test_youtube_lists.py
+++ b/test/test_youtube_lists.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+from __future__ import unicode_literals

 # Allow direct execution
 import os
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -7,6 +7,7 @@ import collections
 import datetime
 import errno
 import io
+import itertools
 import json
 import locale
 import os
@@ -123,6 +124,7 @@ class YoutubeDL(object):
    nooverwrites:      Prevent overwriting files.
    playliststart:     Playlist item to start at.
    playlistend:       Playlist item to end at.
+    playlistreverse:   Download playlist items in reverse order.
    matchtitle:        Download only matching titles.
    rejecttitle:       Reject downloads for matching titles.
    logger:            Log messages to a logging.Logger instance.
@@ -621,23 +623,15 @@ class YoutubeDL(object):
                ie_result['url'], ie_key=ie_result.get('ie_key'),
                extra_info=extra_info, download=False, process=False)

-            def make_result(embedded_info):
-                new_result = ie_result.copy()
-                for f in ('_type', 'url', 'ext', 'player_url', 'formats',
-                          'entries', 'ie_key', 'duration',
-                          'subtitles', 'annotations', 'format',
-                          'thumbnail', 'thumbnails'):
-                    if f in new_result:
-                        del new_result[f]
-                    if f in embedded_info:
-                        new_result[f] = embedded_info[f]
-                return new_result
-            new_result = make_result(info)
+            force_properties = dict(
+                (k, v) for k, v in ie_result.items() if v is not None)
+            for f in ('_type', 'url'):
+                if f in force_properties:
+                    del force_properties[f]
+            new_result = info.copy()
+            new_result.update(force_properties)

            assert new_result.get('_type') != 'url_transparent'
-            if new_result.get('_type') == 'compat_list':
-                new_result['entries'] = [
-                    make_result(e) for e in new_result['entries']]

            return self.process_ie_result(
                new_result, download=download, extra_info=extra_info)
@@ -654,21 +648,31 @@ class YoutubeDL(object):
            if playlistend == -1:
                playlistend = None

-            if isinstance(ie_result['entries'], list):
-                n_all_entries = len(ie_result['entries'])
-                entries = ie_result['entries'][playliststart:playlistend]
+            ie_entries = ie_result['entries']
+            if isinstance(ie_entries, list):
+                n_all_entries = len(ie_entries)
+                entries = ie_entries[playliststart:playlistend]
                n_entries = len(entries)
                self.to_screen(
                    "[%s] playlist %s: Collected %d video ids (downloading %d of them)" %
                    (ie_result['extractor'], playlist, n_all_entries, n_entries))
-            else:
-                assert isinstance(ie_result['entries'], PagedList)
-                entries = ie_result['entries'].getslice(
+            elif isinstance(ie_entries, PagedList):
+                entries = ie_entries.getslice(
                    playliststart, playlistend)
                n_entries = len(entries)
                self.to_screen(
                    "[%s] playlist %s: Downloading %d videos" %
                    (ie_result['extractor'], playlist, n_entries))
+            else:  # iterable
+                entries = list(itertools.islice(
+                    ie_entries, playliststart, playlistend))
+                n_entries = len(entries)
+                self.to_screen(
+                    "[%s] playlist %s: Downloading %d videos" %
+                    (ie_result['extractor'], playlist, n_entries))
+
+            if self.params.get('playlistreverse', False):
+                entries = entries[::-1]

            for i, entry in enumerate(entries, 1):
                self.to_screen('[download] Downloading video #%s of %s' % (i, n_entries))
@@ -787,6 +791,10 @@ class YoutubeDL(object):
            info_dict['display_id'] = info_dict['id']

        if info_dict.get('upload_date') is None and info_dict.get('timestamp') is not None:
+            # Working around negative timestamps in Windows
+            # (see http://bugs.python.org/issue1646728)
+            if info_dict['timestamp'] < 0 and os.name == 'nt':
+                info_dict['timestamp'] = 0
            upload_date = datetime.datetime.utcfromtimestamp(
                info_dict['timestamp'])
            info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
@@ -930,8 +938,12 @@ class YoutubeDL(object):
        if self.params.get('forceid', False):
            self.to_stdout(info_dict['id'])
        if self.params.get('forceurl', False):
-            # For RTMP URLs, also include the playpath
-            self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
+            if info_dict.get('requested_formats') is not None:
+                for f in info_dict['requested_formats']:
+                    self.to_stdout(f['url'] + f.get('play_path', ''))
+            else:
+                # For RTMP URLs, also include the playpath
+                self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
        if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
            self.to_stdout(info_dict['thumbnail'])
        if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
--- a/youtube_dl/__init__.py
+++ b/youtube_dl/__init__.py
@@ -249,6 +249,7 @@ def _real_main(argv=None):
        'progress_with_newline': opts.progress_with_newline,
        'playliststart': opts.playliststart,
        'playlistend': opts.playlistend,
+        'playlistreverse': opts.playlist_reverse,
        'noplaylist': opts.noplaylist,
        'logtostderr': opts.outtmpl == '-',
        'consoletitle': opts.consoletitle,
--- a/youtube_dl/__main__.py
+++ b/youtube_dl/__main__.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+from __future__ import unicode_literals

 # Execute with
 # $ python youtube_dl/__main__.py (2.6+)
--- a/youtube_dl/aes.py
+++ b/youtube_dl/aes.py
@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
 __all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']

 import base64
--- a/youtube_dl/compat.py
+++ b/youtube_dl/compat.py
@@ -247,7 +247,7 @@ else:
                userhome = compat_getenv('HOME')
            elif 'USERPROFILE' in os.environ:
                userhome = compat_getenv('USERPROFILE')
-            elif not 'HOMEPATH' in os.environ:
+            elif 'HOMEPATH' not in os.environ:
                return path
            else:
                try:
@@ -270,7 +270,7 @@ if sys.version_info < (3, 0):
        print(s.encode(preferredencoding(), 'xmlcharrefreplace'))
 else:
    def compat_print(s):
-        assert type(s) == type(u'')
+        assert isinstance(s, compat_str)
        print(s)


@@ -297,7 +297,9 @@ else:

 # Old 2.6 and 2.7 releases require kwargs to be bytes
 try:
-    (lambda x: x)(**{'x': 0})
+    def _testfunc(x):
+        pass
+    _testfunc(**{'x': 0})
 except TypeError:
    def compat_kwargs(kwargs):
        return dict((bytes(k), v) for k, v in kwargs.items())
--- a/youtube_dl/downloader/common.py
+++ b/youtube_dl/downloader/common.py
@@ -5,8 +5,8 @@ import re
 import sys
 import time

+from ..compat import compat_str
 from ..utils import (
-    compat_str,
    encodeFilename,
    format_bytes,
    timeconvert,
@@ -80,6 +80,8 @@ class FileDownloader(object):
    def calc_eta(start, now, total, current):
        if total is None:
            return None
+        if now is None:
+            now = time.time()
        dif = now - start
        if current == 0 or dif < 0.001:  # One millisecond
            return None
@@ -146,18 +148,19 @@ class FileDownloader(object):
    def report_error(self, *args, **kargs):
        self.ydl.report_error(*args, **kargs)

-    def slow_down(self, start_time, byte_counter):
+    def slow_down(self, start_time, now, byte_counter):
        """Sleep if the download speed is over the rate limit."""
        rate_limit = self.params.get('ratelimit', None)
        if rate_limit is None or byte_counter == 0:
            return
-        now = time.time()
+        if now is None:
+            now = time.time()
        elapsed = now - start_time
        if elapsed <= 0.0:
            return
        speed = float(byte_counter) / elapsed
        if speed > rate_limit:
-            time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
+            time.sleep(max((byte_counter // rate_limit) - elapsed, 0))

    def temp_name(self, filename):
        """Returns a temporary filename for the given filename."""
--- a/youtube_dl/downloader/f4m.py
+++ b/youtube_dl/downloader/f4m.py
@@ -9,10 +9,12 @@ import xml.etree.ElementTree as etree

 from .common import FileDownloader
 from .http import HttpFD
+from ..compat import (
+    compat_urlparse,
+)
 from ..utils import (
    struct_pack,
    struct_unpack,
-    compat_urlparse,
    format_bytes,
    encodeFilename,
    sanitize_open,
@@ -231,6 +233,7 @@ class F4mFD(FileDownloader):
                'continuedl': True,
                'quiet': True,
                'noprogress': True,
+                'ratelimit': self.params.get('ratelimit', None),
                'test': self.params.get('test', False),
            }
        )
--- a/youtube_dl/downloader/hls.py
+++ b/youtube_dl/downloader/hls.py
@@ -4,10 +4,13 @@ import os
 import re
 import subprocess

+from ..postprocessor.ffmpeg import FFmpegPostProcessor
 from .common import FileDownloader
-from ..utils import (
+from ..compat import (
    compat_urlparse,
    compat_urllib_request,
+)
+from ..utils import (
    check_executable,
    encodeFilename,
 )
@@ -28,14 +31,17 @@ class HlsFD(FileDownloader):
            if check_executable(program, ['-version']):
                break
        else:
-            self.report_error(u'm3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
+            self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
            return False
        cmd = [program] + args

+        ffpp = FFmpegPostProcessor(downloader=self)
+        ffpp.check_version()
+
        retval = subprocess.call(cmd)
        if retval == 0:
            fsize = os.path.getsize(encodeFilename(tmpfilename))
-            self.to_screen(u'\r[%s] %s bytes' % (cmd[0], fsize))
+            self.to_screen('\r[%s] %s bytes' % (cmd[0], fsize))
            self.try_rename(tmpfilename, filename)
            self._hook_progress({
                'downloaded_bytes': fsize,
@@ -45,8 +51,8 @@ class HlsFD(FileDownloader):
            })
            return True
        else:
-            self.to_stderr(u"\n")
-            self.report_error(u'%s exited with code %d' % (program, retval))
+            self.to_stderr('\n')
+            self.report_error('%s exited with code %d' % (program, retval))
            return False


--- a/youtube_dl/downloader/http.py
+++ b/youtube_dl/downloader/http.py
@@ -1,12 +1,15 @@
+from __future__ import unicode_literals
+
 import os
 import time

 from .common import FileDownloader
-from ..utils import (
+from ..compat import (
    compat_urllib_request,
    compat_urllib_error,
+)
+from ..utils import (
    ContentTooShortError,
-
    encodeFilename,
    sanitize_open,
    format_bytes,
@@ -106,7 +109,7 @@ class HttpFD(FileDownloader):
                self.report_retry(count, retries)

        if count > retries:
-            self.report_error(u'giving up after %s retries' % retries)
+            self.report_error('giving up after %s retries' % retries)
            return False

        data_len = data.info().get('Content-length', None)
@@ -124,26 +127,31 @@ class HttpFD(FileDownloader):
            min_data_len = self.params.get("min_filesize", None)
            max_data_len = self.params.get("max_filesize", None)
            if min_data_len is not None and data_len < min_data_len:
-                self.to_screen(u'\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
+                self.to_screen('\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
                return False
            if max_data_len is not None and data_len > max_data_len:
-                self.to_screen(u'\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
+                self.to_screen('\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
                return False

        data_len_str = format_bytes(data_len)
        byte_counter = 0 + resume_len
        block_size = self.params.get('buffersize', 1024)
        start = time.time()
+
+        # measure time over whole while-loop, so slow_down() and best_block_size() work together properly
+        now = None  # needed for slow_down() in the first loop run
+        before = start  # start measuring
        while True:
+
            # Download and write
-            before = time.time()
            data_block = data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
-            after = time.time()
-            if len(data_block) == 0:
-                break
            byte_counter += len(data_block)

-            # Open file just in time
+            # exit loop when download is finished
+            if len(data_block) == 0:
+                break
+
+            # Open destination file just in time
            if stream is None:
                try:
                    (stream, tmpfilename) = sanitize_open(tmpfilename, open_mode)
@@ -151,19 +159,30 @@ class HttpFD(FileDownloader):
                    filename = self.undo_temp_name(tmpfilename)
                    self.report_destination(filename)
                except (OSError, IOError) as err:
-                    self.report_error(u'unable to open for writing: %s' % str(err))
+                    self.report_error('unable to open for writing: %s' % str(err))
                    return False
            try:
                stream.write(data_block)
            except (IOError, OSError) as err:
-                self.to_stderr(u"\n")
-                self.report_error(u'unable to write data: %s' % str(err))
+                self.to_stderr('\n')
+                self.report_error('unable to write data: %s' % str(err))
                return False
+
+            # Apply rate limit
+            self.slow_down(start, now, byte_counter - resume_len)
+
+            # end measuring of one loop run
+            now = time.time()
+            after = now
+
+            # Adjust block size
            if not self.params.get('noresizebuffer', False):
                block_size = self.best_block_size(after - before, len(data_block))

+            before = after
+
            # Progress message
-            speed = self.calc_speed(start, time.time(), byte_counter - resume_len)
+            speed = self.calc_speed(start, now, byte_counter - resume_len)
            if data_len is None:
                eta = percent = None
            else:
@@ -184,14 +203,11 @@ class HttpFD(FileDownloader):
            if is_test and byte_counter == data_len:
                break

-            # Apply rate limit
-            self.slow_down(start, byte_counter - resume_len)
-
        if stream is None:
-            self.to_stderr(u"\n")
-            self.report_error(u'Did not get any data blocks')
+            self.to_stderr('\n')
+            self.report_error('Did not get any data blocks')
            return False
-        if tmpfilename != u'-':
+        if tmpfilename != '-':
            stream.close()
        self.report_finish(data_len_str, (time.time() - start))
        if data_len is not None and byte_counter != data_len:
--- a/youtube_dl/downloader/mplayer.py
+++ b/youtube_dl/downloader/mplayer.py
@@ -1,7 +1,10 @@
+from __future__ import unicode_literals
+
 import os
 import subprocess

 from .common import FileDownloader
+from ..compat import compat_subprocess_get_DEVNULL
 from ..utils import (
    encodeFilename,
 )
@@ -13,19 +16,23 @@ class MplayerFD(FileDownloader):
        self.report_destination(filename)
        tmpfilename = self.temp_name(filename)

-        args = ['mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy', '-dumpstream', '-dumpfile', tmpfilename, url]
+        args = [
+            'mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy',
+            '-dumpstream', '-dumpfile', tmpfilename, url]
        # Check for mplayer first
        try:
-            subprocess.call(['mplayer', '-h'], stdout=(open(os.path.devnull, 'w')), stderr=subprocess.STDOUT)
+            subprocess.call(
+                ['mplayer', '-h'],
+                stdout=compat_subprocess_get_DEVNULL(), stderr=subprocess.STDOUT)
        except (OSError, IOError):
-            self.report_error(u'MMS or RTSP download detected but "%s" could not be run' % args[0])
+            self.report_error('MMS or RTSP download detected but "%s" could not be run' % args[0])
            return False

        # Download using mplayer.
        retval = subprocess.call(args)
        if retval == 0:
            fsize = os.path.getsize(encodeFilename(tmpfilename))
-            self.to_screen(u'\r[%s] %s bytes' % (args[0], fsize))
+            self.to_screen('\r[%s] %s bytes' % (args[0], fsize))
            self.try_rename(tmpfilename, filename)
            self._hook_progress({
                'downloaded_bytes': fsize,
@@ -35,6 +42,6 @@ class MplayerFD(FileDownloader):
            })
            return True
        else:
-            self.to_stderr(u"\n")
-            self.report_error(u'mplayer exited with code %d' % retval)
+            self.to_stderr('\n')
+            self.report_error('mplayer exited with code %d' % retval)
            return False
--- a/youtube_dl/downloader/rtmp.py
+++ b/youtube_dl/downloader/rtmp.py
@@ -7,9 +7,9 @@ import sys
 import time

 from .common import FileDownloader
+from ..compat import compat_str
 from ..utils import (
    check_executable,
-    compat_str,
    encodeFilename,
    format_bytes,
    get_exe_version,
--- a/youtube_dl/extractor/__init__.py
+++ b/youtube_dl/extractor/__init__.py
@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
 from .abc import ABCIE
 from .academicearth import AcademicEarthCourseIE
 from .addanime import AddAnimeIE
@@ -22,11 +24,13 @@ from .arte import (
 )
 from .audiomack import AudiomackIE
 from .auengine import AUEngineIE
+from .azubu import AzubuIE
 from .bambuser import BambuserIE, BambuserChannelIE
 from .bandcamp import BandcampIE, BandcampAlbumIE
 from .bbccouk import BBCCoUkIE
 from .beeg import BeegIE
 from .behindkink import BehindKinkIE
+from .bet import BetIE
 from .bild import BildIE
 from .bilibili import BiliBiliIE
 from .blinkx import BlinkxIE
@@ -36,6 +40,7 @@ from .bpb import BpbIE
 from .br import BRIE
 from .breakcom import BreakIE
 from .brightcove import BrightcoveIE
+from .buzzfeed import BuzzFeedIE
 from .byutv import BYUtvIE
 from .c56 import C56IE
 from .canal13cl import Canal13clIE
@@ -46,7 +51,7 @@ from .cbsnews import CBSNewsIE
 from .ceskatelevize import CeskaTelevizeIE
 from .channel9 import Channel9IE
 from .chilloutzone import ChilloutzoneIE
-from .cinemassacre import CinemassacreIE
+from .cinchcast import CinchcastIE
 from .clipfish import ClipfishIE
 from .cliphunter import CliphunterIE
 from .clipsyndicate import ClipsyndicateIE
@@ -60,6 +65,7 @@ from .cnn import (
 )
 from .collegehumor import CollegeHumorIE
 from .comedycentral import ComedyCentralIE, ComedyCentralShowsIE
+from .comcarcoff import ComCarCoffIE
 from .condenast import CondeNastIE
 from .cracked import CrackedIE
 from .criterion import CriterionIE
@@ -118,6 +124,8 @@ from .fktv import (
 from .flickr import FlickrIE
 from .folketinget import FolketingetIE
 from .fourtube import FourTubeIE
+from .foxgay import FoxgayIE
+from .foxnews import FoxNewsIE
 from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
 from .francetv import (
@@ -141,6 +149,7 @@ from .gamestar import GameStarIE
 from .gametrailers import GametrailersIE
 from .gdcvault import GDCVaultIE
 from .generic import GenericIE
+from .giantbomb import GiantBombIE
 from .glide import GlideIE
 from .globo import GloboIE
 from .godtube import GodTubeIE
@@ -151,6 +160,7 @@ from .googlesearch import GoogleSearchIE
 from .gorillavid import GorillaVidIE
 from .goshgay import GoshgayIE
 from .grooveshark import GroovesharkIE
+from .groupon import GrouponIE
 from .hark import HarkIE
 from .heise import HeiseIE
 from .helsinki import HelsinkiIE
@@ -213,6 +223,7 @@ from .mdr import MDRIE
 from .metacafe import MetacafeIE
 from .metacritic import MetacriticIE
 from .mgoon import MgoonIE
+from .minhateca import MinhatecaIE
 from .ministrygrid import MinistryGridIE
 from .mit import TechTVMITIE, MITIE, OCWMITIE
 from .mitele import MiTeleIE
@@ -239,9 +250,10 @@ from .muenchentv import MuenchenTVIE
 from .musicplayon import MusicPlayOnIE
 from .musicvault import MusicVaultIE
 from .muzu import MuzuTVIE
-from .myspace import MySpaceIE
+from .myspace import MySpaceIE, MySpaceAlbumIE
 from .myspass import MySpassIE
 from .myvideo import MyVideoIE
+from .myvidster import MyVidsterIE
 from .naver import NaverIE
 from .nba import NBAIE
 from .nbc import (
@@ -299,10 +311,12 @@ from .promptfile import PromptFileIE
 from .prosiebensat1 import ProSiebenSat1IE
 from .pyvideo import PyvideoIE
 from .quickvid import QuickVidIE
+from .radiode import RadioDeIE
 from .radiofrance import RadioFranceIE
 from .rai import RaiIE
 from .rbmaradio import RBMARadioIE
 from .redtube import RedTubeIE
+from .restudy import RestudyIE
 from .reverbnation import ReverbNationIE
 from .ringtv import RingTVIE
 from .ro220 import Ro220IE
@@ -311,6 +325,7 @@ from .roxwel import RoxwelIE
 from .rtbf import RTBFIE
 from .rtlnl import RtlXlIE
 from .rtlnow import RTLnowIE
+from .rtp import RTPIE
 from .rts import RTSIE
 from .rtve import RTVEALaCartaIE, RTVELiveIE
 from .ruhd import RUHDIE
@@ -326,6 +341,7 @@ from .savefrom import SaveFromIE
 from .sbs import SBSIE
 from .scivee import SciVeeIE
 from .screencast import ScreencastIE
+from .screenwavemedia import CinemassacreIE, ScreenwaveMediaIE, TeamFourIE
 from .servingsys import ServingSysIE
 from .sexu import SexuIE
 from .sexykarma import SexyKarmaIE
@@ -373,6 +389,7 @@ from .syfy import SyfyIE
 from .sztvhu import SztvHuIE
 from .tagesschau import TagesschauIE
 from .tapely import TapelyIE
+from .tass import TassIE
 from .teachertube import (
    TeacherTubeIE,
    TeacherTubeUserIE,
@@ -393,6 +410,7 @@ from .thesixtyone import TheSixtyOneIE
 from .thisav import ThisAVIE
 from .tinypic import TinyPicIE
 from .tlc import TlcIE, TlcDeIE
+from .tmz import TMZIE
 from .tnaflix import TNAFlixIE
 from .thvideo import (
    THVideoIE,
@@ -412,6 +430,7 @@ from .tutv import TutvIE
 from .tvigle import TvigleIE
 from .tvp import TvpIE
 from .tvplay import TVPlayIE
+from .twentyfourvideo import TwentyFourVideoIE
 from .twitch import TwitchIE
 from .ubu import UbuIE
 from .udemy import (
@@ -483,6 +502,7 @@ from .wrzuta import WrzutaIE
 from .xbef import XBefIE
 from .xboxclips import XboxClipsIE
 from .xhamster import XHamsterIE
+from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xvideos import XVideosIE
 from .xtube import XTubeUserIE, XTubeIE
@@ -512,7 +532,7 @@ from .youtube import (
    YoutubeUserIE,
    YoutubeWatchLaterIE,
 )
-from .zdf import ZDFIE
+from .zdf import ZDFIE, ZDFChannelIE
 from .zingmp3 import (
    ZingMp3SongIE,
    ZingMp3AlbumIE,
--- a/youtube_dl/extractor/academicearth.py
+++ b/youtube_dl/extractor/academicearth.py
@@ -1,4 +1,5 @@
 from __future__ import unicode_literals
+
 import re

 from .common import InfoExtractor
@@ -18,15 +19,14 @@ class AcademicEarthCourseIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        m = re.match(self._VALID_URL, url)
-        playlist_id = m.group('id')
+        playlist_id = self._match_id(url)

        webpage = self._download_webpage(url, playlist_id)
        title = self._html_search_regex(
-            r'<h1 class="playlist-name"[^>]*?>(.*?)</h1>', webpage, u'title')
+            r'<h1 class="playlist-name"[^>]*?>(.*?)</h1>', webpage, 'title')
        description = self._html_search_regex(
            r'<p class="excerpt"[^>]*?>(.*?)</p>',
-            webpage, u'description', fatal=False)
+            webpage, 'description', fatal=False)
        urls = re.findall(
            r'<li class="lecture-preview">\s*?<a target="_blank" href="([^"]+)">',
            webpage)
--- a/youtube_dl/extractor/addanime.py
+++ b/youtube_dl/extractor/addanime.py
@@ -15,8 +15,7 @@ from ..utils import (


 class AddAnimeIE(InfoExtractor):
-
-    _VALID_URL = r'^http://(?:\w+\.)?add-anime\.net/watch_video\.php\?(?:.*?)v=(?P<video_id>[\w_]+)(?:.*)'
+    _VALID_URL = r'^http://(?:\w+\.)?add-anime\.net/watch_video\.php\?(?:.*?)v=(?P<id>[\w_]+)(?:.*)'
    _TEST = {
        'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
        'md5': '72954ea10bc979ab5e2eb288b21425a0',
@@ -29,9 +28,9 @@ class AddAnimeIE(InfoExtractor):
    }

    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
        try:
-            mobj = re.match(self._VALID_URL, url)
-            video_id = mobj.group('video_id')
            webpage = self._download_webpage(url, video_id)
        except ExtractorError as ee:
            if not isinstance(ee.cause, compat_HTTPError) or \
@@ -49,7 +48,7 @@ class AddAnimeIE(InfoExtractor):
                r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);',
                redir_webpage)
            if av is None:
-                raise ExtractorError(u'Cannot find redirect math task')
+                raise ExtractorError('Cannot find redirect math task')
            av_res = int(av.group(1)) + int(av.group(2)) * int(av.group(3))

            parsed_url = compat_urllib_parse_urlparse(url)
--- a/youtube_dl/extractor/adultswim.py
+++ b/youtube_dl/extractor/adultswim.py
@@ -2,123 +2,150 @@
 from __future__ import unicode_literals

 import re
+import json

 from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    xpath_text,
+    float_or_none,
+)


 class AdultSwimIE(InfoExtractor):
-    _VALID_URL = r'https?://video\.adultswim\.com/(?P<path>.+?)(?:\.html)?(?:\?.*)?(?:#.*)?$'
-    _TEST = {
-        'url': 'http://video.adultswim.com/rick-and-morty/close-rick-counters-of-the-rick-kind.html?x=y#title',
+    _VALID_URL = r'https?://(?:www\.)?adultswim\.com/videos/(?P<is_playlist>playlists/)?(?P<show_path>[^/]+)/(?P<episode_path>[^/?#]+)/?'
+
+    _TESTS = [{
+        'url': 'http://adultswim.com/videos/rick-and-morty/pilot',
        'playlist': [
            {
-                'md5': '4da359ec73b58df4575cd01a610ba5dc',
+                'md5': '247572debc75c7652f253c8daa51a14d',
                'info_dict': {
-                    'id': '8a250ba1450996e901453d7f02ca02f5',
+                    'id': 'rQxZvXQ4ROaSOqq-or2Mow-0',
                    'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 1',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
+                    'title': 'Rick and Morty - Pilot Part 1',
+                    'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
+                },
            },
            {
-                'md5': 'ffbdf55af9331c509d95350bd0cc1819',
+                'md5': '77b0e037a4b20ec6b98671c4c379f48d',
                'info_dict': {
-                    'id': '8a250ba1450996e901453d7f4bd102f6',
+                    'id': 'rQxZvXQ4ROaSOqq-or2Mow-3',
                    'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 2',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
+                    'title': 'Rick and Morty - Pilot Part 4',
+                    'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
+                },
            },
+        ],
+        'info_dict': {
+            'title': 'Rick and Morty - Pilot',
+            'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
+        }
+    }, {
+        'url': 'http://www.adultswim.com/videos/playlists/american-parenting/putting-francine-out-of-business/',
+        'playlist': [
            {
-                'md5': 'b92409635540304280b4b6c36bd14a0a',
+                'md5': '2eb5c06d0f9a1539da3718d897f13ec5',
                'info_dict': {
-                    'id': '8a250ba1450996e901453d7fa73c02f7',
+                    'id': '-t8CamQlQ2aYZ49ItZCFog-0',
                    'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 3',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
-            },
-            {
-                'md5': 'e8818891d60e47b29cd89d7b0278156d',
-                'info_dict': {
-                    'id': '8a250ba1450996e901453d7fc8ba02f8',
-                    'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 4',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
+                    'title': 'American Dad - Putting Francine Out of Business',
+                    'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
+                },
            }
-        ]
-    }
+        ],
+        'info_dict': {
+            'title': 'American Dad - Putting Francine Out of Business',
+            'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
+        },
+    }]

-    _video_extensions = {
-        '3500': 'flv',
-        '640': 'mp4',
-        '150': 'mp4',
-        'ipad': 'm3u8',
-        'iphone': 'm3u8'
-    }
-    _video_dimensions = {
-        '3500': (1280, 720),
-        '640': (480, 270),
-        '150': (320, 180)
-    }
+    @staticmethod
+    def find_video_info(collection, slug):
+        for video in collection.get('videos'):
+            if video.get('slug') == slug:
+                return video
+
+    @staticmethod
+    def find_collection_by_linkURL(collections, linkURL):
+        for collection in collections:
+            if collection.get('linkURL') == linkURL:
+                return collection
+
+    @staticmethod
+    def find_collection_containing_video(collections, slug):
+        for collection in collections:
+            for video in collection.get('videos'):
+                if video.get('slug') == slug:
+                    return collection, video

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-        video_path = mobj.group('path')
+        show_path = mobj.group('show_path')
+        episode_path = mobj.group('episode_path')
+        is_playlist = True if mobj.group('is_playlist') else False

-        webpage = self._download_webpage(url, video_path)
-        episode_id = self._html_search_regex(
-            r'<link rel="video_src" href="http://i\.adultswim\.com/adultswim/adultswimtv/tools/swf/viralplayer.swf\?id=([0-9a-f]+?)"\s*/?\s*>',
-            webpage, 'episode_id')
-        title = self._og_search_title(webpage)
+        webpage = self._download_webpage(url, episode_path)

-        index_url = 'http://asfix.adultswim.com/asfix-svc/episodeSearch/getEpisodesByIDs?networkName=AS&ids=%s' % episode_id
-        idoc = self._download_xml(index_url, title, 'Downloading episode index', 'Unable to download episode index')
+        # Extract the value of `bootstrappedData` from the Javascript in the page.
+        bootstrappedDataJS = self._search_regex(r'var bootstrappedData = ({.*});', webpage, episode_path)

-        episode_el = idoc.find('.//episode')
-        show_title = episode_el.attrib.get('collectionTitle')
-        episode_title = episode_el.attrib.get('title')
-        thumbnail = episode_el.attrib.get('thumbnailUrl')
-        description = episode_el.find('./description').text.strip()
+        try:
+            bootstrappedData = json.loads(bootstrappedDataJS)
+        except ValueError as ve:
+            errmsg = '%s: Failed to parse JSON' % episode_path
+            raise ExtractorError(errmsg, cause=ve)
+
+        # Videos from a /videos/playlist/ URL need to be handled differently.
+        # NOTE: Only one video (the current one) is downloaded, not the whole playlist.
+        if is_playlist:
+            collections = bootstrappedData['playlists']['collections']
+            collection = self.find_collection_by_linkURL(collections, show_path)
+            video_info = self.find_video_info(collection, episode_path)
+
+            show_title = video_info['showTitle']
+            segment_ids = [video_info['videoPlaybackID']]
+        else:
+            collections = bootstrappedData['show']['collections']
+            collection, video_info = self.find_collection_containing_video(collections, episode_path)
+
+            show = bootstrappedData['show']
+            show_title = show['title']
+            segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
+
+        episode_id = video_info['id']
+        episode_title = video_info['title']
+        episode_description = video_info['description']
+        episode_duration = video_info.get('duration')

        entries = []
-        segment_els = episode_el.findall('./segments/segment')
+        for part_num, segment_id in enumerate(segment_ids):
+            segment_url = 'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=mobile' % segment_id

-        for part_num, segment_el in enumerate(segment_els):
-            segment_id = segment_el.attrib.get('id')
-            segment_title = '%s %s part %d' % (show_title, episode_title, part_num + 1)
-            thumbnail = segment_el.attrib.get('thumbnailUrl')
-            duration = segment_el.attrib.get('duration')
+            segment_title = '%s - %s' % (show_title, episode_title)
+            if len(segment_ids) > 1:
+                segment_title += ' Part %d' % (part_num + 1)

-            segment_url = 'http://asfix.adultswim.com/asfix-svc/episodeservices/getCvpPlaylist?networkName=AS&id=%s' % segment_id
            idoc = self._download_xml(
                segment_url, segment_title,
                'Downloading segment information', 'Unable to download segment information')

+            segment_duration = float_or_none(
+                xpath_text(idoc, './/trt', 'segment duration').strip())
+
            formats = []
            file_els = idoc.findall('.//files/file')

            for file_el in file_els:
                bitrate = file_el.attrib.get('bitrate')
-                type = file_el.attrib.get('type')
-                width, height = self._video_dimensions.get(bitrate, (None, None))
+                ftype = file_el.attrib.get('type')
+
                formats.append({
-                    'format_id': '%s-%s' % (bitrate, type),
-                    'url': file_el.text,
-                    'ext': self._video_extensions.get(bitrate, 'mp4'),
+                    'format_id': '%s_%s' % (bitrate, ftype),
+                    'url': file_el.text.strip(),
                    # The bitrate may not be a number (for example: 'iphone')
                    'tbr': int(bitrate) if bitrate.isdigit() else None,
-                    'height': height,
-                    'width': width
+                    'quality': 1 if ftype == 'hd' else -1
                })

            self._sort_formats(formats)
@@ -127,18 +154,16 @@ class AdultSwimIE(InfoExtractor):
                'id': segment_id,
                'title': segment_title,
                'formats': formats,
-                'uploader': show_title,
-                'thumbnail': thumbnail,
-                'duration': duration,
-                'description': description
+                'duration': segment_duration,
+                'description': episode_description
            })

        return {
            '_type': 'playlist',
            'id': episode_id,
-            'display_id': video_path,
+            'display_id': episode_path,
            'entries': entries,
-            'title': '%s %s' % (show_title, episode_title),
-            'description': description,
-            'thumbnail': thumbnail
+            'title': '%s - %s' % (show_title, episode_title),
+            'description': episode_description,
+            'duration': episode_duration
        }
--- a/youtube_dl/extractor/allocine.py
+++ b/youtube_dl/extractor/allocine.py
@@ -5,10 +5,9 @@ import re
 import json

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
-    compat_str,
    qualities,
-    determine_ext,
 )


@@ -75,9 +74,7 @@ class AllocineIE(InfoExtractor):
                    'format_id': format_id,
                    'quality': quality(format_id),
                    'url': v,
-                    'ext': determine_ext(v),
                })
-
        self._sort_formats(formats)

        return {
--- a/youtube_dl/extractor/aol.py
+++ b/youtube_dl/extractor/aol.py
@@ -3,7 +3,6 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from .fivemin import FiveMinIE


 class AolIE(InfoExtractor):
@@ -42,31 +41,30 @@ class AolIE(InfoExtractor):
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
-
        playlist_id = mobj.group('playlist_id')
-        if playlist_id and not self._downloader.params.get('noplaylist'):
-            self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
+        if not playlist_id or self._downloader.params.get('noplaylist'):
+            return self.url_result('5min:%s' % video_id)

-            webpage = self._download_webpage(url, playlist_id)
-            title = self._html_search_regex(
-                r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
-            playlist_html = self._search_regex(
-                r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
-                'playlist HTML')
-            entries = [{
-                '_type': 'url',
-                'url': 'aol-video:%s' % m.group('id'),
-                'ie_key': 'Aol',
-            } for m in re.finditer(
-                r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
-                playlist_html)]
+        self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))

-            return {
-                '_type': 'playlist',
-                'id': playlist_id,
-                'display_id': mobj.group('playlist_display_id'),
-                'title': title,
-                'entries': entries,
-            }
+        webpage = self._download_webpage(url, playlist_id)
+        title = self._html_search_regex(
+            r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
+        playlist_html = self._search_regex(
+            r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
+            'playlist HTML')
+        entries = [{
+            '_type': 'url',
+            'url': 'aol-video:%s' % m.group('id'),
+            'ie_key': 'Aol',
+        } for m in re.finditer(
+            r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
+            playlist_html)]

-        return FiveMinIE._build_result(video_id)
+        return {
+            '_type': 'playlist',
+            'id': playlist_id,
+            'display_id': mobj.group('playlist_display_id'),
+            'title': title,
+            'entries': entries,
+        }
--- a/youtube_dl/extractor/aparat.py
+++ b/youtube_dl/extractor/aparat.py
@@ -1,5 +1,4 @@
 # coding: utf-8
-
 from __future__ import unicode_literals

 import re
@@ -26,8 +25,7 @@ class AparatIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        m = re.match(self._VALID_URL, url)
-        video_id = m.group('id')
+        video_id = self._match_id(url)

        # Note: There is an easier-to-parse configuration at
        # http://www.aparat.com/video/video/config/videohash/%video_id
@@ -40,15 +38,15 @@ class AparatIE(InfoExtractor):
        for i, video_url in enumerate(video_urls):
            req = HEADRequest(video_url)
            res = self._request_webpage(
-                req, video_id, note=u'Testing video URL %d' % i, errnote=False)
+                req, video_id, note='Testing video URL %d' % i, errnote=False)
            if res:
                break
        else:
-            raise ExtractorError(u'No working video URLs found')
+            raise ExtractorError('No working video URLs found')

-        title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, u'title')
+        title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
        thumbnail = self._search_regex(
-            r'\s+image:\s*"([^"]+)"', webpage, u'thumbnail', fatal=False)
+            r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)

        return {
            'id': video_id,
--- a/youtube_dl/extractor/appletrailers.py
+++ b/youtube_dl/extractor/appletrailers.py
@@ -4,8 +4,8 @@ import re
 import json

 from .common import InfoExtractor
+from ..compat import compat_urlparse
 from ..utils import (
-    compat_urlparse,
    int_or_none,
 )

@@ -80,7 +80,7 @@ class AppleTrailersIE(InfoExtractor):
            def _clean_json(m):
                return 'iTunes.playURL(%s);' % m.group(1).replace('\'', '&#39;')
            s = re.sub(self._JSON_RE, _clean_json, s)
-            s = '<html>' + s + u'</html>'
+            s = '<html>%s</html>' % s
            return s
        doc = self._download_xml(playlist_url, movie, transform_source=fix_html)

--- a/youtube_dl/extractor/audiomack.py
+++ b/youtube_dl/extractor/audiomack.py
@@ -24,17 +24,17 @@ class AudiomackIE(InfoExtractor):
        },
        # hosted on soundcloud via audiomack
        {
+            'add_ie': ['Soundcloud'],
            'url': 'http://www.audiomack.com/song/xclusiveszone/take-kare',
-            'file': '172419696.mp3',
-            'info_dict':
-            {
+            'info_dict': {
+                'id': '172419696',
                'ext': 'mp3',
+                'description': 'md5:1fc3272ed7a635cce5be1568c2822997',
                'title': 'Young Thug ft Lil Wayne - Take Kare',
-                "upload_date": "20141016",
-                "description": "New track produced by London On Da Track called “Take Kare\"\n\nhttp://instagram.com/theyoungthugworld\nhttps://www.facebook.com/ThuggerThuggerCashMoney\n",
-                "uploader": "Young Thug World"
+                'uploader': 'Young Thug World',
+                'upload_date': '20141016',
            }
-        }
+        },
    ]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/auengine.py
+++ b/youtube_dl/extractor/auengine.py
@@ -3,8 +3,8 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_urllib_parse
 from ..utils import (
-    compat_urllib_parse,
    determine_ext,
    ExtractorError,
 )
--- a/youtube_dl/extractor/azubu.py
+++ b/youtube_dl/extractor/azubu.py
@@ -0,0 +1,93 @@
+from __future__ import unicode_literals
+
+import json
+
+from .common import InfoExtractor
+from ..utils import float_or_none
+
+
+class AzubuIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?azubu\.tv/[^/]+#!/play/(?P<id>\d+)'
+    _TESTS = [
+        {
+            'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
+            'md5': 'a88b42fcf844f29ad6035054bd9ecaf4',
+            'info_dict': {
+                'id': '15575',
+                'ext': 'mp4',
+                'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
+                'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
+                'thumbnail': r're:^https?://.*\.jpe?g',
+                'timestamp': 1417523507.334,
+                'upload_date': '20141202',
+                'duration': 9988.7,
+                'uploader': 'GSL',
+                'uploader_id': 414310,
+                'view_count': int,
+            },
+        },
+        {
+            'url': 'http://www.azubu.tv/FnaticTV#!/play/9344/-fnatic-at-worlds-2014:-toyz---%22i-love-rekkles,-he-has-amazing-mechanics%22-',
+            'md5': 'b72a871fe1d9f70bd7673769cdb3b925',
+            'info_dict': {
+                'id': '9344',
+                'ext': 'mp4',
+                'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
+                'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
+                'thumbnail': r're:^https?://.*\.jpe?g',
+                'timestamp': 1410530893.320,
+                'upload_date': '20140912',
+                'duration': 172.385,
+                'uploader': 'FnaticTV',
+                'uploader_id': 272749,
+                'view_count': int,
+            },
+        },
+    ]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        data = self._download_json(
+            'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
+
+        title = data['title'].strip()
+        description = data['description']
+        thumbnail = data['thumbnail']
+        view_count = data['view_count']
+        uploader = data['user']['username']
+        uploader_id = data['user']['id']
+
+        stream_params = json.loads(data['stream_params'])
+
+        timestamp = float_or_none(stream_params['creationDate'], 1000)
+        duration = float_or_none(stream_params['length'], 1000)
+
+        renditions = stream_params.get('renditions') or []
+        video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
+        if video:
+            renditions.append(video)
+
+        formats = [{
+            'url': fmt['url'],
+            'width': fmt['frameWidth'],
+            'height': fmt['frameHeight'],
+            'vbr': float_or_none(fmt['encodingRate'], 1000),
+            'filesize': fmt['size'],
+            'vcodec': fmt['videoCodec'],
+            'container': fmt['videoContainer'],
+        } for fmt in renditions if fmt['url']]
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'duration': duration,
+            'uploader': uploader,
+            'uploader_id': uploader_id,
+            'view_count': view_count,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/bambuser.py
+++ b/youtube_dl/extractor/bambuser.py
@@ -5,7 +5,7 @@ import json
 import itertools

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_request,
 )

@@ -18,7 +18,7 @@ class BambuserIE(InfoExtractor):
    _TEST = {
        'url': 'http://bambuser.com/v/4050584',
        # MD5 seems to be flaky, see https://travis-ci.org/rg3/youtube-dl/jobs/14051016#L388
-        # u'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
+        # 'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
        'info_dict': {
            'id': '4050584',
            'ext': 'flv',
--- a/youtube_dl/extractor/bandcamp.py
+++ b/youtube_dl/extractor/bandcamp.py
@@ -4,9 +4,11 @@ import json
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_str,
    compat_urlparse,
+)
+from ..utils import (
    ExtractorError,
 )

@@ -104,7 +106,7 @@ class BandcampIE(InfoExtractor):

 class BandcampAlbumIE(InfoExtractor):
    IE_NAME = 'Bandcamp:album'
-    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+))'
+    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+))?'

    _TESTS = [{
        'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -139,6 +141,12 @@ class BandcampAlbumIE(InfoExtractor):
            'title': 'Hierophany of the Open Grave',
        },
        'playlist_mincount': 9,
+    }, {
+        'url': 'http://dotscale.bandcamp.com',
+        'info_dict': {
+            'title': 'Loom',
+        },
+        'playlist_mincount': 7,
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/bbccouk.py
+++ b/youtube_dl/extractor/bbccouk.py
@@ -1,9 +1,10 @@
 from __future__ import unicode_literals

-import re
+import xml.etree.ElementTree

 from .subtitles import SubtitlesInfoExtractor
 from ..utils import ExtractorError
+from ..compat import compat_HTTPError


 class BBCCoUkIE(SubtitlesInfoExtractor):
@@ -55,7 +56,22 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
                'skip_download': True,
            },
            'skip': 'Currently BBC iPlayer TV programmes are available to play in the UK only',
-        }
+        },
+        {
+            'url': 'http://www.bbc.co.uk/iplayer/episode/p026c7jt/tomorrows-worlds-the-unearthly-history-of-science-fiction-2-invasion',
+            'info_dict': {
+                'id': 'b03k3pb7',
+                'ext': 'flv',
+                'title': "Tomorrow's Worlds: The Unearthly History of Science Fiction",
+                'description': '2. Invasion',
+                'duration': 3600,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+            'skip': 'Currently BBC iPlayer TV programmes are available to play in the UK only',
+        },
    ]

    def _extract_asx_playlist(self, connection, programme_id):
@@ -102,6 +118,10 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
        return playlist.findall('./{http://bbc.co.uk/2008/emp/playlist}item')

    def _extract_medias(self, media_selection):
+        error = media_selection.find('./{http://bbc.co.uk/2008/mp/mediaselection}error')
+        if error is not None:
+            raise ExtractorError(
+                '%s returned error: %s' % (self.IE_NAME, error.get('id')), expected=True)
        return media_selection.findall('./{http://bbc.co.uk/2008/mp/mediaselection}media')

    def _extract_connections(self, media):
@@ -158,54 +178,73 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
            subtitles[lang] = srt
        return subtitles

-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        group_id = mobj.group('id')
-
-        webpage = self._download_webpage(url, group_id, 'Downloading video page')
-        if re.search(r'id="emp-error" class="notinuk">', webpage):
-            raise ExtractorError('Currently BBC iPlayer TV programmes are available to play in the UK only',
-                                 expected=True)
-
-        playlist = self._download_xml('http://www.bbc.co.uk/iplayer/playlist/%s' % group_id, group_id,
-                                      'Downloading playlist XML')
-
-        no_items = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}noItems')
-        if no_items is not None:
-            reason = no_items.get('reason')
-            if reason == 'preAvailability':
-                msg = 'Episode %s is not yet available' % group_id
-            elif reason == 'postAvailability':
-                msg = 'Episode %s is no longer available' % group_id
+    def _download_media_selector(self, programme_id):
+        try:
+            media_selection = self._download_xml(
+                'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s' % programme_id,
+                programme_id, 'Downloading media selection XML')
+        except ExtractorError as ee:
+            if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
+                media_selection = xml.etree.ElementTree.fromstring(ee.cause.read())
            else:
-                msg = 'Episode %s is not available: %s' % (group_id, reason)
-            raise ExtractorError(msg, expected=True)
+                raise

        formats = []
        subtitles = None

-        for item in self._extract_items(playlist):
-            kind = item.get('kind')
-            if kind != 'programme' and kind != 'radioProgramme':
-                continue
-            title = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}title').text
-            description = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}summary').text
+        for media in self._extract_medias(media_selection):
+            kind = media.get('kind')
+            if kind == 'audio':
+                formats.extend(self._extract_audio(media, programme_id))
+            elif kind == 'video':
+                formats.extend(self._extract_video(media, programme_id))
+            elif kind == 'captions':
+                subtitles = self._extract_captions(media, programme_id)

-            programme_id = item.get('identifier')
-            duration = int(item.get('duration'))
+        return formats, subtitles

-            media_selection = self._download_xml(
-                'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s' % programme_id,
-                programme_id, 'Downloading media selection XML')
+    def _real_extract(self, url):
+        group_id = self._match_id(url)

-            for media in self._extract_medias(media_selection):
-                kind = media.get('kind')
-                if kind == 'audio':
-                    formats.extend(self._extract_audio(media, programme_id))
-                elif kind == 'video':
-                    formats.extend(self._extract_video(media, programme_id))
-                elif kind == 'captions':
-                    subtitles = self._extract_captions(media, programme_id)
+        webpage = self._download_webpage(url, group_id, 'Downloading video page')
+
+        programme_id = self._search_regex(
+            r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
+        if programme_id:
+            player = self._download_json(
+                'http://www.bbc.co.uk/iplayer/episode/%s.json' % group_id,
+                group_id)['jsConf']['player']
+            title = player['title']
+            description = player['subtitle']
+            duration = player['duration']
+            formats, subtitles = self._download_media_selector(programme_id)
+        else:
+            playlist = self._download_xml(
+                'http://www.bbc.co.uk/iplayer/playlist/%s' % group_id,
+                group_id, 'Downloading playlist XML')
+
+            no_items = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}noItems')
+            if no_items is not None:
+                reason = no_items.get('reason')
+                if reason == 'preAvailability':
+                    msg = 'Episode %s is not yet available' % group_id
+                elif reason == 'postAvailability':
+                    msg = 'Episode %s is no longer available' % group_id
+                elif reason == 'noMedia':
+                    msg = 'Episode %s is not currently available' % group_id
+                else:
+                    msg = 'Episode %s is not available: %s' % (group_id, reason)
+                raise ExtractorError(msg, expected=True)
+
+            for item in self._extract_items(playlist):
+                kind = item.get('kind')
+                if kind != 'programme' and kind != 'radioProgramme':
+                    continue
+                title = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}title').text
+                description = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}summary').text
+                programme_id = item.get('identifier')
+                duration = int(item.get('duration'))
+                formats, subtitles = self._download_media_selector(programme_id)

        if self._downloader.params.get('listsubtitles', False):
            self._list_available_subtitles(programme_id, subtitles)
--- a/youtube_dl/extractor/behindkink.py
+++ b/youtube_dl/extractor/behindkink.py
@@ -10,15 +10,15 @@ from ..utils import url_basename
 class BehindKinkIE(InfoExtractor):
    _VALID_URL = r'http://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
    _TEST = {
-        'url': 'http://www.behindkink.com/2014/08/14/ab1576-performers-voice-finally-heard-the-bill-is-killed/',
-        'md5': '41ad01222b8442089a55528fec43ec01',
+        'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',
+        'md5': '507b57d8fdcd75a41a9a7bdb7989c762',
        'info_dict': {
-            'id': '36370',
+            'id': '37127',
            'ext': 'mp4',
-            'title': 'AB1576 - PERFORMERS VOICE FINALLY HEARD - THE BILL IS KILLED!',
-            'description': 'The adult industry voice was finally heard as Assembly Bill 1576 remained\xa0 in suspense today at the Senate Appropriations Hearing. AB1576 was, among other industry damaging issues, a condom mandate...',
-            'upload_date': '20140814',
-            'thumbnail': 'http://www.behindkink.com/wp-content/uploads/2014/08/36370_AB1576_Win.jpg',
+            'title': 'What are you passionate about – Marley Blaze',
+            'description': 'md5:aee8e9611b4ff70186f752975d9b94b4',
+            'upload_date': '20141205',
+            'thumbnail': 'http://www.behindkink.com/wp-content/uploads/2014/12/blaze-1.jpg',
            'age_limit': 18,
        }
    }
@@ -26,26 +26,19 @@ class BehindKinkIE(InfoExtractor):
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        display_id = mobj.group('id')
-        year = mobj.group('year')
-        month = mobj.group('month')
-        day = mobj.group('day')
-        upload_date = year + month + day

        webpage = self._download_webpage(url, display_id)

        video_url = self._search_regex(
-            r"'file':\s*'([^']+)'",
-            webpage, 'URL base')
-
-        video_id = url_basename(video_url)
-        video_id = video_id.split('_')[0]
+            r'<source src="([^"]+)"', webpage, 'video URL')
+        video_id = url_basename(video_url).split('_')[0]
+        upload_date = mobj.group('year') + mobj.group('month') + mobj.group('day')

        return {
            'id': video_id,
-            'url': video_url,
-            'ext': 'mp4',
-            'title': self._og_search_title(webpage),
            'display_id': display_id,
+            'url': video_url,
+            'title': self._og_search_title(webpage),
            'thumbnail': self._og_search_thumbnail(webpage),
            'description': self._og_search_description(webpage),
            'upload_date': upload_date,
--- a/youtube_dl/extractor/bet.py
+++ b/youtube_dl/extractor/bet.py
@@ -0,0 +1,108 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_urllib_parse
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+    int_or_none,
+    parse_iso8601,
+)
+
+
+class BetIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
+    _TESTS = [
+        {
+            'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
+            'info_dict': {
+                'id': '417cd61c-c793-4e8e-b006-e445ecc45add',
+                'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
+                'ext': 'flv',
+                'title': 'BET News Presents: A Conversation With President Obama',
+                'description': 'md5:5a88d8ae912c1b33e090290af7ec33c6',
+                'duration': 1534,
+                'timestamp': 1418075340,
+                'upload_date': '20141208',
+                'uploader': 'admin',
+                'thumbnail': r're:(?i)^https?://.*\.jpg$',
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+        },
+        {
+            'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
+            'info_dict': {
+                'id': '4160e53b-ad41-43b1-980f-8d85f63121f4',
+                'display_id': 'justice-for-ferguson-a-community-reacts',
+                'ext': 'flv',
+                'title': 'Justice for Ferguson: A Community Reacts',
+                'description': 'A BET News special.',
+                'duration': 1696,
+                'timestamp': 1416942360,
+                'upload_date': '20141125',
+                'uploader': 'admin',
+                'thumbnail': r're:(?i)^https?://.*\.jpg$',
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+        }
+    ]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        media_url = compat_urllib_parse.unquote(self._search_regex(
+            [r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
+            webpage, 'media URL'))
+
+        mrss = self._download_xml(media_url, display_id)
+
+        item = mrss.find('./channel/item')
+
+        NS_MAP = {
+            'dc': 'http://purl.org/dc/elements/1.1/',
+            'media': 'http://search.yahoo.com/mrss/',
+            'ka': 'http://kickapps.com/karss',
+        }
+
+        title = xpath_text(item, './title', 'title')
+        description = xpath_text(
+            item, './description', 'description', fatal=False)
+
+        video_id = xpath_text(item, './guid', 'video id', fatal=False)
+
+        timestamp = parse_iso8601(xpath_text(
+            item, xpath_with_ns('./dc:date', NS_MAP),
+            'upload date', fatal=False))
+        uploader = xpath_text(
+            item, xpath_with_ns('./dc:creator', NS_MAP),
+            'uploader', fatal=False)
+
+        media_content = item.find(
+            xpath_with_ns('./media:content', NS_MAP))
+        duration = int_or_none(media_content.get('duration'))
+        smil_url = media_content.get('url')
+
+        thumbnail = media_content.find(
+            xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
+
+        formats = self._extract_smil_formats(smil_url, display_id)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'uploader': uploader,
+            'duration': duration,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/bilibili.py
+++ b/youtube_dl/extractor/bilibili.py
@@ -4,8 +4,8 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_parse_qs
 from ..utils import (
-    compat_parse_qs,
    ExtractorError,
    int_or_none,
    unified_strdate,
@@ -29,10 +29,9 @@ class BiliBiliIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
+
        video_code = self._search_regex(
            r'(?s)<div itemprop="video".*?>(.*?)</div>', webpage, 'video code')

--- a/youtube_dl/extractor/bliptv.py
+++ b/youtube_dl/extractor/bliptv.py
@@ -4,13 +4,17 @@ import re

 from .common import InfoExtractor
 from .subtitles import SubtitlesInfoExtractor
-from ..utils import (
-    compat_urllib_request,
-    unescapeHTML,
-    parse_iso8601,
-    compat_urlparse,
-    clean_html,
+
+from ..compat import (
    compat_str,
+    compat_urllib_request,
+    compat_urlparse,
+)
+from ..utils import (
+    clean_html,
+    int_or_none,
+    parse_iso8601,
+    unescapeHTML,
 )


@@ -64,7 +68,39 @@ class BlipTVIE(SubtitlesInfoExtractor):
                'uploader': 'redvsblue',
                'uploader_id': '792887',
            }
-        }
+        },
+        {
+            'url': 'http://blip.tv/play/gbk766dkj4Yn',
+            'md5': 'fe0a33f022d49399a241e84a8ea8b8e3',
+            'info_dict': {
+                'id': '1749452',
+                'ext': 'mp4',
+                'upload_date': '20090208',
+                'description': 'Witness the first appearance of the Nostalgia Critic character, as Doug reviews the movie Transformers.',
+                'title': 'Nostalgia Critic: Transformers',
+                'timestamp': 1234068723,
+                'uploader': 'NostalgiaCritic',
+                'uploader_id': '246467',
+            }
+        },
+        {
+            # https://github.com/rg3/youtube-dl/pull/4404
+            'note': 'Audio only',
+            'url': 'http://blip.tv/hilarios-productions/weekly-manga-recap-kingdom-7119982',
+            'md5': '76c0a56f24e769ceaab21fbb6416a351',
+            'info_dict': {
+                'id': '7103299',
+                'ext': 'flv',
+                'title': 'Weekly Manga Recap: Kingdom',
+                'description': 'And then Shin breaks the enemy line, and he&apos;s all like HWAH! And then he slices a guy and it&apos;s all like FWASHING! And... it&apos;s really hard to describe the best parts of this series without breaking down into sound effects, okay?',
+                'timestamp': 1417660321,
+                'upload_date': '20141204',
+                'uploader': 'The Rollo T',
+                'uploader_id': '407429',
+                'duration': 7251,
+                'vcodec': 'none',
+            }
+        },
    ]

    def _real_extract(self, url):
@@ -74,11 +110,13 @@ class BlipTVIE(SubtitlesInfoExtractor):
        # See https://github.com/rg3/youtube-dl/issues/857 and
        # https://github.com/rg3/youtube-dl/issues/4197
        if lookup_id:
-            info_page = self._download_webpage(
-                'http://blip.tv/play/%s.x?p=1' % lookup_id, lookup_id, 'Resolving lookup id')
-            video_id = self._search_regex(r'config\.id\s*=\s*"([0-9]+)', info_page, 'video_id')
-        else:
-            video_id = mobj.group('id')
+            urlh = self._request_webpage(
+                'http://blip.tv/play/%s' % lookup_id, lookup_id, 'Resolving lookup id')
+            url = compat_urlparse.urlparse(urlh.geturl())
+            qs = compat_urlparse.parse_qs(url.query)
+            mobj = re.match(self._VALID_URL, qs['file'][0])
+
+        video_id = mobj.group('id')

        rss = self._download_xml('http://blip.tv/rss/flash/%s' % video_id, video_id, 'Downloading video RSS')

@@ -114,7 +152,7 @@ class BlipTVIE(SubtitlesInfoExtractor):
            msg = self._download_webpage(
                url + '?showplayer=20140425131715&referrer=http://blip.tv&mask=7&skin=flashvars&view=url',
                video_id, 'Resolving URL for %s' % role)
-            real_url = compat_urlparse.parse_qs(msg)['message'][0]
+            real_url = compat_urlparse.parse_qs(msg.strip())['message'][0]

            media_type = media_content.get('type')
            if media_type == 'text/srt' or url.endswith('.srt'):
@@ -129,11 +167,11 @@ class BlipTVIE(SubtitlesInfoExtractor):
                    'url': real_url,
                    'format_id': role,
                    'format_note': media_type,
-                    'vcodec': media_content.get(blip('vcodec')),
+                    'vcodec': media_content.get(blip('vcodec')) or 'none',
                    'acodec': media_content.get(blip('acodec')),
                    'filesize': media_content.get('filesize'),
-                    'width': int(media_content.get('width')),
-                    'height': int(media_content.get('height')),
+                    'width': int_or_none(media_content.get('width')),
+                    'height': int_or_none(media_content.get('height')),
                })
        self._sort_formats(formats)

--- a/youtube_dl/extractor/breakcom.py
+++ b/youtube_dl/extractor/breakcom.py
@@ -14,7 +14,6 @@ class BreakIE(InfoExtractor):
    _VALID_URL = r'http://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
    _TESTS = [{
        'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
-        'md5': '33aa4ff477ecd124d18d7b5d23b87ce5',
        'info_dict': {
            'id': '2468056',
            'ext': 'mp4',
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dl/extractor/brightcove.py
@@ -6,20 +6,21 @@ import json
 import xml.etree.ElementTree

 from .common import InfoExtractor
-from ..utils import (
-    compat_urllib_parse,
-    find_xpath_attr,
-    fix_xml_ampersands,
-    compat_urlparse,
-    compat_str,
-    compat_urllib_request,
+from ..compat import (
    compat_parse_qs,
+    compat_str,
+    compat_urllib_parse,
    compat_urllib_parse_urlparse,
-
+    compat_urllib_request,
+    compat_urlparse,
+)
+from ..utils import (
    determine_ext,
    ExtractorError,
-    unsmuggle_url,
+    find_xpath_attr,
+    fix_xml_ampersands,
    unescapeHTML,
+    unsmuggle_url,
 )


@@ -265,6 +266,7 @@ class BrightcoveIE(InfoExtractor):
                url = rend['defaultURL']
                if not url:
                    continue
+                ext = None
                if rend['remote']:
                    url_comp = compat_urllib_parse_urlparse(url)
                    if url_comp.path.endswith('.m3u8'):
@@ -276,7 +278,7 @@ class BrightcoveIE(InfoExtractor):
                        # akamaihd.net, but they don't use f4m manifests
                        url = url.replace('control/', '') + '?&v=3.3.0&fp=13&r=FEEFJ&g=RTSJIMBMPFPB'
                        ext = 'flv'
-                else:
+                if ext is None:
                    ext = determine_ext(url)
                size = rend.get('size')
                formats.append({
--- a/youtube_dl/extractor/buzzfeed.py
+++ b/youtube_dl/extractor/buzzfeed.py
@@ -0,0 +1,74 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+
+
+class BuzzFeedIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?buzzfeed\.com/[^?#]*?/(?P<id>[^?#]+)'
+    _TESTS = [{
+        'url': 'http://www.buzzfeed.com/abagg/this-angry-ram-destroys-a-punching-bag-like-a-boss?utm_term=4ldqpia',
+        'info_dict': {
+            'id': 'this-angry-ram-destroys-a-punching-bag-like-a-boss',
+            'title': 'This Angry Ram Destroys A Punching Bag Like A Boss',
+            'description': 'Rambro!',
+        },
+        'playlist': [{
+            'info_dict': {
+                'id': 'aVCR29aE_OQ',
+                'ext': 'mp4',
+                'upload_date': '20141024',
+                'uploader_id': 'Buddhanz1',
+                'description': 'He likes to stay in shape with his heavy bag, he wont stop until its on the ground\n\nFollow Angry Ram on Facebook for regular updates -\nhttps://www.facebook.com/pages/Angry-Ram/1436897249899558?ref=hl',
+                'uploader': 'Buddhanz',
+                'title': 'Angry Ram destroys a punching bag',
+            }
+        }]
+    }, {
+        'url': 'http://www.buzzfeed.com/sheridanwatson/look-at-this-cute-dog-omg?utm_term=4ldqpia',
+        'params': {
+            'skip_download': True,  # Got enough YouTube download tests
+        },
+        'info_dict': {
+            'description': 'Munchkin the Teddy Bear is back !',
+            'title': 'You Need To Stop What You\'re Doing And Watching This Dog Walk On A Treadmill',
+        },
+        'playlist': [{
+            'info_dict': {
+                'id': 'mVmBL8B-In0',
+                'ext': 'mp4',
+                'upload_date': '20141124',
+                'uploader_id': 'CindysMunchkin',
+                'description': '© 2014 Munchkin the Shih Tzu\nAll rights reserved\nFacebook: http://facebook.com/MunchkintheShihTzu',
+                'uploader': 'Munchkin the Shih Tzu',
+                'title': 'Munchkin the Teddy Bear gets her exercise',
+            },
+        }]
+    }]
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+        webpage = self._download_webpage(url, playlist_id)
+
+        all_buckets = re.findall(
+            r'(?s)<div class="video-embed[^"]*"..*?rel:bf_bucket_data=\'([^\']+)\'',
+            webpage)
+
+        entries = []
+        for bd_json in all_buckets:
+            bd = json.loads(bd_json)
+            video = bd.get('video') or bd.get('progload_video')
+            if not video:
+                continue
+            entries.append(self.url_result(video['url']))
+
+        return {
+            '_type': 'playlist',
+            'id': playlist_id,
+            'title': self._og_search_title(webpage),
+            'description': self._og_search_description(webpage),
+            'entries': entries,
+        }
--- a/youtube_dl/extractor/cbs.py
+++ b/youtube_dl/extractor/cbs.py
@@ -45,4 +45,4 @@ class CBSIE(InfoExtractor):
        real_id = self._search_regex(
            r"video\.settings\.pid\s*=\s*'([^']+)';",
            webpage, 'real video ID')
-        return self.url_result(u'theplatform:%s' % real_id)
+        return self.url_result('theplatform:%s' % real_id)
--- a/youtube_dl/extractor/ceskatelevize.py
+++ b/youtube_dl/extractor/ceskatelevize.py
@@ -4,10 +4,12 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_request,
    compat_urllib_parse,
    compat_urllib_parse_urlparse,
+)
+from ..utils import (
    ExtractorError,
 )

--- a/youtube_dl/extractor/channel9.py
+++ b/youtube_dl/extractor/channel9.py
@@ -236,16 +236,17 @@ class Channel9IE(InfoExtractor):
        if contents is None:
            return contents

-        session_meta = {'session_code': self._extract_session_code(html),
-                        'session_day': self._extract_session_day(html),
-                        'session_room': self._extract_session_room(html),
-                        'session_speakers': self._extract_session_speakers(html),
-                        }
+        session_meta = {
+            'session_code': self._extract_session_code(html),
+            'session_day': self._extract_session_day(html),
+            'session_room': self._extract_session_room(html),
+            'session_speakers': self._extract_session_speakers(html),
+        }

        for content in contents:
            content.update(session_meta)

-        return contents
+        return self.playlist_result(contents)

    def _extract_list(self, content_path):
        rss = self._download_xml(self._RSS_URL % content_path, content_path, 'Downloading RSS')
--- a/youtube_dl/extractor/cinchcast.py
+++ b/youtube_dl/extractor/cinchcast.py
@@ -0,0 +1,52 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    unified_strdate,
+    xpath_text,
+)
+
+
+class CinchcastIE(InfoExtractor):
+    _VALID_URL = r'https?://player\.cinchcast\.com/.*?assetId=(?P<id>[0-9]+)'
+    _TEST = {
+        # Actual test is run in generic, look for undergroundwellness
+        'url': 'http://player.cinchcast.com/?platformId=1&#038;assetType=single&#038;assetId=7141703',
+        'only_matching': True,
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        doc = self._download_xml(
+            'http://www.blogtalkradio.com/playerasset/mrss?assetType=single&assetId=%s' % video_id,
+            video_id)
+
+        item = doc.find('.//item')
+        title = xpath_text(item, './title', fatal=True)
+        date_str = xpath_text(
+            item, './{http://developer.longtailvideo.com/trac/}date')
+        upload_date = unified_strdate(date_str, day_first=False)
+        # duration is present but wrong
+        formats = []
+        formats.append({
+            'format_id': 'main',
+            'url': item.find(
+                './{http://search.yahoo.com/mrss/}content').attrib['url'],
+        })
+        backup_url = xpath_text(
+            item, './{http://developer.longtailvideo.com/trac/}backupContent')
+        if backup_url:
+            formats.append({
+                'preference': 2,  # seems to be more reliable
+                'format_id': 'backup',
+                'url': backup_url,
+            })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'upload_date': upload_date,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/clipfish.py
+++ b/youtube_dl/extractor/clipfish.py
@@ -24,7 +24,7 @@ class ClipfishIE(InfoExtractor):
            'title': 'FIFA 14 - E3 2013 Trailer',
            'duration': 82,
        },
-        u'skip': 'Blocked in the US'
+        'skip': 'Blocked in the US'
    }

    def _real_extract(self, url):
@@ -34,7 +34,7 @@ class ClipfishIE(InfoExtractor):
        info_url = ('http://www.clipfish.de/devxml/videoinfo/%s?ts=%d' %
                    (video_id, int(time.time())))
        doc = self._download_xml(
-            info_url, video_id, note=u'Downloading info page')
+            info_url, video_id, note='Downloading info page')
        title = doc.find('title').text
        video_url = doc.find('filename').text
        if video_url is None:
--- a/youtube_dl/extractor/cnet.py
+++ b/youtube_dl/extractor/cnet.py
@@ -2,12 +2,10 @@
 from __future__ import unicode_literals

 import json
-import re

 from .common import InfoExtractor
 from ..utils import (
    ExtractorError,
-    int_or_none,
 )


@@ -15,23 +13,24 @@ class CNETIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?cnet\.com/videos/(?P<id>[^/]+)/'
    _TEST = {
        'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
-        'md5': '041233212a0d06b179c87cbcca1577b8',
        'info_dict': {
            'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
-            'ext': 'mp4',
+            'ext': 'flv',
            'title': 'Hands-on with Microsoft Windows 8.1 Update',
            'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.',
            'thumbnail': 're:^http://.*/flmswindows8.jpg$',
-            'uploader_id': 'sarah.mitroff@cbsinteractive.com',
+            'uploader_id': '6085384d-619e-11e3-b231-14feb5ca9861',
            'uploader': 'Sarah Mitroff',
+        },
+        'params': {
+            'skip_download': 'requires rtmpdump',
        }
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        display_id = mobj.group('id')
-
+        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
+
        data_json = self._html_search_regex(
            r"<div class=\"cnetVideoPlayer\"\s+.*?data-cnet-video-options='([^']+)'",
            webpage, 'data json')
@@ -42,37 +41,31 @@ class CNETIE(InfoExtractor):
        if not vdata:
            raise ExtractorError('Cannot find video data')

+        mpx_account = data['config']['players']['default']['mpx_account']
+        vid = vdata['files']['rtmp']
+        tp_link = 'http://link.theplatform.com/s/%s/%s' % (mpx_account, vid)
+
        video_id = vdata['id']
        title = vdata.get('headline')
        if title is None:
            title = vdata.get('title')
        if title is None:
            raise ExtractorError('Cannot find title!')
-        description = vdata.get('dek')
        thumbnail = vdata.get('image', {}).get('path')
        author = vdata.get('author')
        if author:
            uploader = '%s %s' % (author['firstName'], author['lastName'])
-            uploader_id = author.get('email')
+            uploader_id = author.get('id')
        else:
            uploader = None
            uploader_id = None

-        formats = [{
-            'format_id': '%s-%s-%s' % (
-                f['type'], f['format'],
-                int_or_none(f.get('bitrate'), 1000, default='')),
-            'url': f['uri'],
-            'tbr': int_or_none(f.get('bitrate'), 1000),
-        } for f in vdata['files']['data']]
-        self._sort_formats(formats)
-
        return {
+            '_type': 'url_transparent',
+            'url': tp_link,
            'id': video_id,
            'display_id': display_id,
            'title': title,
-            'formats': formats,
-            'description': description,
            'uploader': uploader,
            'uploader_id': uploader_id,
            'thumbnail': thumbnail,
--- a/youtube_dl/extractor/comcarcoff.py
+++ b/youtube_dl/extractor/comcarcoff.py
@@ -0,0 +1,57 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+import json
+
+from .common import InfoExtractor
+from ..utils import parse_iso8601
+
+
+class ComCarCoffIE(InfoExtractor):
+    _VALID_URL = r'http://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
+    _TESTS = [{
+        'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
+        'info_dict': {
+            'id': 'miranda-sings-happy-thanksgiving-miranda',
+            'ext': 'mp4',
+            'upload_date': '20141127',
+            'timestamp': 1417107600,
+            'title': 'Happy Thanksgiving Miranda',
+            'description': 'Jerry Seinfeld and his special guest Miranda Sings cruise around town in search of coffee, complaining and apologizing along the way.',
+            'thumbnail': 'http://ccc.crackle.com/images/s5e4_thumb.jpg',
+        },
+        'params': {
+            'skip_download': 'requires ffmpeg',
+        }
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        if not display_id:
+            display_id = 'comediansincarsgettingcoffee.com'
+        webpage = self._download_webpage(url, display_id)
+
+        full_data = json.loads(self._search_regex(
+            r'<script type="application/json" id="videoData">(?P<json>.+?)</script>',
+            webpage, 'full data json'))
+
+        video_id = full_data['activeVideo']['video']
+        video_data = full_data['videos'][video_id]
+        thumbnails = [{
+            'url': video_data['images']['thumb'],
+        }, {
+            'url': video_data['images']['poster'],
+        }]
+        formats = self._extract_m3u8_formats(
+            video_data['mediaUrl'], video_id, ext='mp4')
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': video_data['title'],
+            'description': video_data.get('description'),
+            'timestamp': parse_iso8601(video_data.get('pubDate')),
+            'thumbnails': thumbnails,
+            'formats': formats,
+            'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
+        }
--- a/youtube_dl/extractor/comedycentral.py
+++ b/youtube_dl/extractor/comedycentral.py
@@ -3,9 +3,11 @@ from __future__ import unicode_literals
 import re

 from .mtv import MTVServicesInfoExtractor
-from ..utils import (
+from ..compat import (
    compat_str,
    compat_urllib_parse,
+)
+from ..utils import (
    ExtractorError,
    float_or_none,
    unified_strdate,
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -13,6 +13,7 @@ import time
 import xml.etree.ElementTree

 from ..compat import (
+    compat_cookiejar,
    compat_http_client,
    compat_urllib_error,
    compat_urllib_parse_urlparse,
@@ -117,6 +118,7 @@ class InfoExtractor(object):

    The following fields are optional:

+    alt_title:      A secondary title of the video.
    display_id      An alternative identifier for the video, not necessarily
                    unique, but available before title. Typically, id is
                    something like "4234987", title "Dancing naked mole rats",
@@ -128,7 +130,7 @@ class InfoExtractor(object):
                        * "resolution" (optional, string "{width}x{height"},
                                        deprecated)
    thumbnail:      Full URL to a video thumbnail image.
-    description:    One-line video description.
+    description:    Full video description.
    uploader:       Full name of the video uploader.
    timestamp:      UNIX timestamp of the moment the video became available.
    upload_date:    Video upload date (YYYYMMDD).
@@ -157,8 +159,8 @@ class InfoExtractor(object):


    _type "playlist" indicates multiple videos.
-    There must be a key "entries", which is a list or a PagedList object, each
-    element of which is a valid dictionary under this specfication.
+    There must be a key "entries", which is a list, an iterable, or a PagedList
+    object, each element of which is a valid dictionary by this specification.

    Additionally, playlists can have "title" and "id" attributes with the same
    semantics as videos (see above).
@@ -173,9 +175,10 @@ class InfoExtractor(object):
    _type "url" indicates that the video must be extracted from another
    location, possibly by a different extractor. Its only required key is:
    "url" - the next URL to extract.
-
-    Additionally, it may have properties believed to be identical to the
-    resolved entity, for example "title" if the title of the referred video is
+    The key "ie_key" can be set to the class name (minus the trailing "IE",
+    e.g. "Youtube") if the extractor class is known in advance.
+    Additionally, the dictionary may have any properties of the resolved entity
+    known in advance, for example "title" if the title of the referred video is
    known ahead of time.


@@ -296,9 +299,11 @@ class InfoExtractor(object):
        content = self._webpage_read_content(urlh, url_or_request, video_id, note, errnote, fatal)
        return (content, urlh)

-    def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True):
+    def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True, prefix=None):
        content_type = urlh.headers.get('Content-Type', '')
        webpage_bytes = urlh.read()
+        if prefix is not None:
+            webpage_bytes = prefix + webpage_bytes
        m = re.match(r'[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+\s*;\s*charset=(.+)', content_type)
        if m:
            encoding = m.group(1)
@@ -387,6 +392,10 @@ class InfoExtractor(object):
            url_or_request, video_id, note, errnote, fatal=fatal)
        if (not fatal) and json_string is False:
            return None
+        return self._parse_json(
+            json_string, video_id, transform_source=transform_source, fatal=fatal)
+
+    def _parse_json(self, json_string, video_id, transform_source=None, fatal=True):
        if transform_source:
            json_string = transform_source(json_string)
        try:
@@ -436,7 +445,7 @@ class InfoExtractor(object):
        return video_info

    @staticmethod
-    def playlist_result(entries, playlist_id=None, playlist_title=None):
+    def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
        """Returns a playlist"""
        video_info = {'_type': 'playlist',
                      'entries': entries}
@@ -444,6 +453,8 @@ class InfoExtractor(object):
            video_info['id'] = playlist_id
        if playlist_title:
            video_info['title'] = playlist_title
+        if playlist_description:
+            video_info['description'] = playlist_description
        return video_info

    def _search_regex(self, pattern, string, name, default=_NO_DEFAULT, fatal=True, flags=0, group=None):
@@ -787,6 +798,49 @@ class InfoExtractor(object):
        self._sort_formats(formats)
        return formats

+    # TODO: improve extraction
+    def _extract_smil_formats(self, smil_url, video_id):
+        smil = self._download_xml(
+            smil_url, video_id, 'Downloading SMIL file',
+            'Unable to download SMIL file')
+
+        base = smil.find('./head/meta').get('base')
+
+        formats = []
+        rtmp_count = 0
+        for video in smil.findall('./body/switch/video'):
+            src = video.get('src')
+            if not src:
+                continue
+            bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
+            width = int_or_none(video.get('width'))
+            height = int_or_none(video.get('height'))
+            proto = video.get('proto')
+            if not proto:
+                if base:
+                    if base.startswith('rtmp'):
+                        proto = 'rtmp'
+                    elif base.startswith('http'):
+                        proto = 'http'
+            ext = video.get('ext')
+            if proto == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(src, video_id, ext))
+            elif proto == 'rtmp':
+                rtmp_count += 1
+                streamer = video.get('streamer') or base
+                formats.append({
+                    'url': streamer,
+                    'play_path': src,
+                    'ext': 'flv',
+                    'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
+                    'tbr': bitrate,
+                    'width': width,
+                    'height': height,
+                })
+        self._sort_formats(formats)
+
+        return formats
+
    def _live_title(self, name):
        """ Generate the title for a live video """
        now = datetime.datetime.now()
@@ -815,6 +869,12 @@ class InfoExtractor(object):
                self._downloader.report_warning(msg)
        return res

+    def _set_cookie(self, domain, name, value, expire_time=None):
+        cookie = compat_cookiejar.Cookie(
+            0, name, value, None, None, domain, None,
+            None, '/', True, False, expire_time, '', None, None, None)
+        self._downloader.cookiejar.set_cookie(cookie)
+

 class SearchInfoExtractor(InfoExtractor):
    """
--- a/youtube_dl/extractor/condenast.py
+++ b/youtube_dl/extractor/condenast.py
@@ -5,12 +5,14 @@ import re
 import json

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
-    orderedSet,
    compat_urllib_parse_urlparse,
    compat_urlparse,
 )
+from ..utils import (
+    orderedSet,
+)


 class CondeNastIE(InfoExtractor):
--- a/youtube_dl/extractor/crunchyroll.py
+++ b/youtube_dl/extractor/crunchyroll.py
@@ -10,10 +10,12 @@ import xml.etree.ElementTree
 from hashlib import sha1
 from math import pow, sqrt, floor
 from .subtitles import SubtitlesInfoExtractor
-from ..utils import (
-    ExtractorError,
+from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
+)
+from ..utils import (
+    ExtractorError,
    bytes_to_intlist,
    intlist_to_bytes,
    unified_strdate,
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dl/extractor/dailymotion.py
@@ -8,13 +8,15 @@ import itertools
 from .common import InfoExtractor
 from .subtitles import SubtitlesInfoExtractor

-from ..utils import (
-    compat_urllib_request,
+from ..compat import (
    compat_str,
+    compat_urllib_request,
+)
+from ..utils import (
+    ExtractorError,
+    int_or_none,
    orderedSet,
    str_to_int,
-    int_or_none,
-    ExtractorError,
    unescapeHTML,
 )

--- a/youtube_dl/extractor/daum.py
+++ b/youtube_dl/extractor/daum.py
@@ -5,7 +5,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
 )

--- a/youtube_dl/extractor/ebaumsworld.py
+++ b/youtube_dl/extractor/ebaumsworld.py
@@ -1,7 +1,5 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor


@@ -20,8 +18,7 @@ class EbaumsWorldIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
        config = self._download_xml(
            'http://www.ebaumsworld.com/video/player/%s' % video_id, video_id)
        video_url = config.find('file').text
--- a/youtube_dl/extractor/ehow.py
+++ b/youtube_dl/extractor/ehow.py
@@ -1,8 +1,6 @@
 from __future__ import unicode_literals

-import re
-
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
 )
 from .common import InfoExtractor
@@ -24,11 +22,10 @@ class EHowIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        video_url = self._search_regex(r'(?:file|source)=(http[^\'"&]*)',
-                                       webpage, 'video URL')
+        video_url = self._search_regex(
+            r'(?:file|source)=(http[^\'"&]*)', webpage, 'video URL')
        final_url = compat_urllib_parse.unquote(video_url)
        uploader = self._html_search_meta('uploader', webpage)
        title = self._og_search_title(webpage).replace(' | eHow', '')
--- a/youtube_dl/extractor/eighttracks.py
+++ b/youtube_dl/extractor/eighttracks.py
@@ -6,7 +6,7 @@ import random
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_str,
 )

@@ -125,7 +125,7 @@ class EightTracksIE(InfoExtractor):
            info = {
                'id': compat_str(track_data['id']),
                'url': track_data['track_file_stream_url'],
-                'title': track_data['performer'] + u' - ' + track_data['name'],
+                'title': track_data['performer'] + ' - ' + track_data['name'],
                'raw_title': track_data['name'],
                'uploader_id': data['user']['login'],
                'ext': 'm4a',
--- a/youtube_dl/extractor/engadget.py
+++ b/youtube_dl/extractor/engadget.py
@@ -3,7 +3,6 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from .fivemin import FiveMinIE
 from ..utils import (
    url_basename,
 )
@@ -27,11 +26,10 @@ class EngadgetIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)

        if video_id is not None:
-            return FiveMinIE._build_result(video_id)
+            return self.url_result('5min:%s' % video_id)
        else:
            title = url_basename(url)
            webpage = self._download_webpage(url, title)
@@ -39,5 +37,5 @@ class EngadgetIE(InfoExtractor):
            return {
                '_type': 'playlist',
                'title': title,
-                'entries': [FiveMinIE._build_result(id) for id in ids]
+                'entries': [self.url_result('5min:%s' % vid) for vid in ids]
            }
--- a/youtube_dl/extractor/escapist.py
+++ b/youtube_dl/extractor/escapist.py
@@ -3,9 +3,10 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
-
+)
+from ..utils import (
    ExtractorError,
 )

--- a/youtube_dl/extractor/everyonesmixtape.py
+++ b/youtube_dl/extractor/everyonesmixtape.py
@@ -3,8 +3,10 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_request,
+)
+from ..utils import (
    ExtractorError,
 )

--- a/youtube_dl/extractor/extremetube.py
+++ b/youtube_dl/extractor/extremetube.py
@@ -3,16 +3,18 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse_urlparse,
    compat_urllib_request,
    compat_urllib_parse,
+)
+from ..utils import (
    str_to_int,
 )


 class ExtremeTubeIE(InfoExtractor):
-    _VALID_URL = r'^(?:https?://)?(?:www\.)?(?P<url>extremetube\.com/.*?video/.+?(?P<videoid>[0-9]+))(?:[/?&]|$)'
+    _VALID_URL = r'https?://(?:www\.)?(?P<url>extremetube\.com/.*?video/.+?(?P<id>[0-9]+))(?:[/?&]|$)'
    _TESTS = [{
        'url': 'http://www.extremetube.com/video/music-video-14-british-euro-brit-european-cumshots-swallow-652431',
        'md5': '1fb9228f5e3332ec8c057d6ac36f33e0',
@@ -31,7 +33,7 @@ class ExtremeTubeIE(InfoExtractor):

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('videoid')
+        video_id = mobj.group('id')
        url = 'http://www.' + mobj.group('url')

        req = compat_urllib_request.Request(url)
--- a/youtube_dl/extractor/facebook.py
+++ b/youtube_dl/extractor/facebook.py
@@ -13,9 +13,10 @@ from ..compat import (
    compat_urllib_request,
 )
 from ..utils import (
-    urlencode_postdata,
    ExtractorError,
+    int_or_none,
    limit_length,
+    urlencode_postdata,
 )


@@ -36,7 +37,6 @@ class FacebookIE(InfoExtractor):
        'info_dict': {
            'id': '637842556329505',
            'ext': 'mp4',
-            'duration': 38,
            'title': 're:Did you know Kei Nishikori is the first Asian man to ever reach a Grand Slam',
        }
    }, {
@@ -107,9 +107,7 @@ class FacebookIE(InfoExtractor):
        self._login()

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        url = 'https://www.facebook.com/video/video.php?v=%s' % video_id
        webpage = self._download_webpage(url, video_id)

@@ -149,6 +147,6 @@ class FacebookIE(InfoExtractor):
            'id': video_id,
            'title': video_title,
            'url': video_url,
-            'duration': int(video_data['video_duration']),
-            'thumbnail': video_data['thumbnail_src'],
+            'duration': int_or_none(video_data.get('video_duration')),
+            'thumbnail': video_data.get('thumbnail_src'),
        }
--- a/youtube_dl/extractor/fc2.py
+++ b/youtube_dl/extractor/fc2.py
@@ -1,19 +1,20 @@
 #! -*- coding: utf-8 -*-
 from __future__ import unicode_literals

-import re
 import hashlib

 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
+from ..compat import (
    compat_urllib_request,
    compat_urlparse,
 )
+from ..utils import (
+    ExtractorError,
+)


 class FC2IE(InfoExtractor):
-    _VALID_URL = r'^http://video\.fc2\.com/((?P<lang>[^/]+)/)?content/(?P<id>[^/]+)'
+    _VALID_URL = r'^http://video\.fc2\.com/(?:[^/]+/)?content/(?P<id>[^/]+)'
    IE_NAME = 'fc2'
    _TEST = {
        'url': 'http://video.fc2.com/en/content/20121103kUan1KHs',
@@ -26,9 +27,7 @@ class FC2IE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        self._downloader.cookiejar.clear_session_cookies()  # must clear

--- a/youtube_dl/extractor/firedrive.py
+++ b/youtube_dl/extractor/firedrive.py
@@ -4,11 +4,13 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
+from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
 )
+from ..utils import (
+    ExtractorError,
+)


 class FiredriveIE(InfoExtractor):
@@ -28,11 +30,8 @@ class FiredriveIE(InfoExtractor):
    }]

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        url = 'http://firedrive.com/file/%s' % video_id
-
        webpage = self._download_webpage(url, video_id)

        if re.search(self._FILE_DELETED_REGEX, webpage) is not None:
--- a/youtube_dl/extractor/fivemin.py
+++ b/youtube_dl/extractor/fivemin.py
@@ -1,11 +1,11 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_str,
    compat_urllib_parse,
+)
+from ..utils import (
    ExtractorError,
 )

@@ -13,7 +13,7 @@ from ..utils import (
 class FiveMinIE(InfoExtractor):
    IE_NAME = '5min'
    _VALID_URL = r'''(?x)
-        (?:https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js\?(.*?&)?playList=|
+        (?:https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js\?(?:.*?&)?playList=|
            5min:)
        (?P<id>\d+)
        '''
@@ -41,13 +41,8 @@ class FiveMinIE(InfoExtractor):
        },
    ]

-    @classmethod
-    def _build_result(cls, video_id):
-        return cls.url_result('5min:%s' % video_id, cls.ie_key())
-
    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
        embed_url = 'https://embed.5min.com/playerseed/?playList=%s' % video_id
        embed_page = self._download_webpage(embed_url, video_id,
                                            'Downloading embed page')
--- a/youtube_dl/extractor/fourtube.py
+++ b/youtube_dl/extractor/fourtube.py
@@ -3,12 +3,14 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_request,
-    unified_strdate,
-    str_to_int,
-    parse_duration,
+)
+from ..utils import (
    clean_html,
+    parse_duration,
+    str_to_int,
+    unified_strdate,
 )


@@ -31,9 +33,7 @@ class FourTubeIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
        webpage_url = 'http://www.4tube.com/videos/' + video_id
        webpage = self._download_webpage(webpage_url, video_id)

--- a/youtube_dl/extractor/foxgay.py
+++ b/youtube_dl/extractor/foxgay.py
@@ -0,0 +1,48 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class FoxgayIE(InfoExtractor):
+    _VALID_URL = r'http://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml'
+    _TEST = {
+        'url': 'http://foxgay.com/videos/fuck-turkish-style-2582.shtml',
+        'md5': '80d72beab5d04e1655a56ad37afe6841',
+        'info_dict': {
+            'id': '2582',
+            'ext': 'mp4',
+            'title': 'md5:6122f7ae0fc6b21ebdf59c5e083ce25a',
+            'description': 'md5:5e51dc4405f1fd315f7927daed2ce5cf',
+            'age_limit': 18,
+            'thumbnail': 're:https?://.*\.jpg$',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        title = self._html_search_regex(
+            r'<title>(?P<title>.*?)</title>',
+            webpage, 'title', fatal=False)
+        description = self._html_search_regex(
+            r'<div class="ico_desc"><h2>(?P<description>.*?)</h2>',
+            webpage, 'description', fatal=False)
+
+        # Find the URL for the iFrame which contains the actual video.
+        iframe = self._download_webpage(
+            self._html_search_regex(r'iframe src="(?P<frame>.*?)"', webpage, 'video frame'),
+            video_id)
+        video_url = self._html_search_regex(
+            r"v_path = '(?P<vid>http://.*?)'", iframe, 'url')
+        thumb_url = self._html_search_regex(
+            r"t_path = '(?P<thumb>http://.*?)'", iframe, 'thumbnail', fatal=False)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'url': video_url,
+            'description': description,
+            'thumbnail': thumb_url,
+            'age_limit': 18,
+        }
--- a/youtube_dl/extractor/foxnews.py
+++ b/youtube_dl/extractor/foxnews.py
@@ -0,0 +1,94 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    parse_iso8601,
+    int_or_none,
+)
+
+
+class FoxNewsIE(InfoExtractor):
+    _VALID_URL = r'https?://video\.foxnews\.com/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)'
+    _TESTS = [
+        {
+            'url': 'http://video.foxnews.com/v/3937480/frozen-in-time/#sp=show-clips',
+            'md5': '32aaded6ba3ef0d1c04e238d01031e5e',
+            'info_dict': {
+                'id': '3937480',
+                'ext': 'flv',
+                'title': 'Frozen in Time',
+                'description': 'Doctors baffled by 16-year-old girl that is the size of a toddler',
+                'duration': 265,
+                'timestamp': 1304411491,
+                'upload_date': '20110503',
+                'thumbnail': 're:^https?://.*\.jpg$',
+            },
+        },
+        {
+            'url': 'http://video.foxnews.com/v/3922535568001/rep-luis-gutierrez-on-if-obamas-immigration-plan-is-legal/#sp=show-clips',
+            'md5': '5846c64a1ea05ec78175421b8323e2df',
+            'info_dict': {
+                'id': '3922535568001',
+                'ext': 'mp4',
+                'title': "Rep. Luis Gutierrez on if Obama's immigration plan is legal",
+                'description': "Congressman discusses the president's executive action",
+                'duration': 292,
+                'timestamp': 1417662047,
+                'upload_date': '20141204',
+                'thumbnail': 're:^https?://.*\.jpg$',
+            },
+        },
+        {
+            'url': 'http://video.foxnews.com/v/video-embed.html?video_id=3937480&d=video.foxnews.com',
+            'only_matching': True,
+        },
+    ]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        video = self._download_json(
+            'http://video.foxnews.com/v/feed/video/%s.js?template=fox' % video_id, video_id)
+
+        item = video['channel']['item']
+        title = item['title']
+        description = item['description']
+        timestamp = parse_iso8601(item['dc-date'])
+
+        media_group = item['media-group']
+        duration = None
+        formats = []
+        for media in media_group['media-content']:
+            attributes = media['@attributes']
+            video_url = attributes['url']
+            if video_url.endswith('.f4m'):
+                formats.extend(self._extract_f4m_formats(video_url + '?hdcore=3.4.0&plugin=aasp-3.4.0.132.124', video_id))
+            elif video_url.endswith('.m3u8'):
+                formats.extend(self._extract_m3u8_formats(video_url, video_id, 'flv'))
+            elif not video_url.endswith('.smil'):
+                duration = int_or_none(attributes.get('duration'))
+                formats.append({
+                    'url': video_url,
+                    'format_id': media['media-category']['@attributes']['label'],
+                    'preference': 1,
+                    'vbr': int_or_none(attributes.get('bitrate')),
+                    'filesize': int_or_none(attributes.get('fileSize'))
+                })
+        self._sort_formats(formats)
+
+        media_thumbnail = media_group['media-thumbnail']['@attributes']
+        thumbnails = [{
+            'url': media_thumbnail['url'],
+            'width': int_or_none(media_thumbnail.get('width')),
+            'height': int_or_none(media_thumbnail.get('height')),
+        }] if media_thumbnail else []
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'duration': duration,
+            'timestamp': timestamp,
+            'formats': formats,
+            'thumbnails': thumbnails,
+        }
--- a/youtube_dl/extractor/franceculture.py
+++ b/youtube_dl/extractor/franceculture.py
@@ -5,7 +5,7 @@ import json
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_parse_qs,
    compat_urlparse,
 )
--- a/youtube_dl/extractor/francetv.py
+++ b/youtube_dl/extractor/francetv.py
@@ -6,13 +6,15 @@ import re
 import json

 from .common import InfoExtractor
-from ..utils import (
-    compat_urlparse,
-    ExtractorError,
-    clean_html,
-    parse_duration,
+from ..compat import (
    compat_urllib_parse_urlparse,
+    compat_urlparse,
+)
+from ..utils import (
+    clean_html,
+    ExtractorError,
    int_or_none,
+    parse_duration,
 )


@@ -40,8 +42,6 @@ class FranceTVBaseInfoExtractor(InfoExtractor):
        else:
            georestricted = False

-
-
        formats = []
        for video in info['videos']:
            if video['statut'] != 'ONLINE':
--- a/youtube_dl/extractor/gamekings.py
+++ b/youtube_dl/extractor/gamekings.py
@@ -11,7 +11,7 @@ class GamekingsIE(InfoExtractor):
        'url': 'http://www.gamekings.tv/videos/phoenix-wright-ace-attorney-dual-destinies-review/',
        # MD5 is flaky, seems to change regularly
        # 'md5': '2f32b1f7b80fdc5cb616efb4f387f8a3',
-        u'info_dict': {
+        'info_dict': {
            'id': '20130811',
            'ext': 'mp4',
            'title': 'Phoenix Wright: Ace Attorney \u2013 Dual Destinies Review',
--- a/youtube_dl/extractor/gamespot.py
+++ b/youtube_dl/extractor/gamespot.py
@@ -4,9 +4,11 @@ import re
 import json

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
    compat_urlparse,
+)
+from ..utils import (
    unescapeHTML,
 )

--- a/youtube_dl/extractor/gdcvault.py
+++ b/youtube_dl/extractor/gdcvault.py
@@ -3,7 +3,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
 )
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -445,6 +445,39 @@ class GenericIE(InfoExtractor):
                'title': 'Rosetta #CometLanding webcast HL 10',
            }
        },
+        # LazyYT
+        {
+            'url': 'http://discourse.ubuntu.com/t/unity-8-desktop-mode-windows-on-mir/1986',
+            'info_dict': {
+                'title': 'Unity 8 desktop-mode windows on Mir! - Ubuntu Discourse',
+            },
+            'playlist_mincount': 2,
+        },
+        # Direct link with incorrect MIME type
+        {
+            'url': 'http://ftp.nluug.nl/video/nluug/2014-11-20_nj14/zaal-2/5_Lennart_Poettering_-_Systemd.webm',
+            'md5': '4ccbebe5f36706d85221f204d7eb5913',
+            'info_dict': {
+                'url': 'http://ftp.nluug.nl/video/nluug/2014-11-20_nj14/zaal-2/5_Lennart_Poettering_-_Systemd.webm',
+                'id': '5_Lennart_Poettering_-_Systemd',
+                'ext': 'webm',
+                'title': '5_Lennart_Poettering_-_Systemd',
+                'upload_date': '20141120',
+            },
+            'expected_warnings': [
+                'URL could be a direct video link, returning it as such.'
+            ]
+        },
+        # Cinchcast embed
+        {
+            'url': 'http://undergroundwellness.com/podcasts/306-5-steps-to-permanent-gut-healing/',
+            'info_dict': {
+                'id': '7141703',
+                'ext': 'mp3',
+                'upload_date': '20141126',
+                'title': 'Jack Tips: 5 Steps to Permanent Gut Healing',
+            }
+        },
    ]

    def report_following_redirect(self, new_url):
@@ -598,10 +631,28 @@ class GenericIE(InfoExtractor):
        if not self._downloader.params.get('test', False) and not is_intentional:
            self._downloader.report_warning('Falling back on generic information extractor.')

-        if full_response:
-            webpage = self._webpage_read_content(full_response, url, video_id)
-        else:
-            webpage = self._download_webpage(url, video_id)
+        if not full_response:
+            full_response = self._request_webpage(url, video_id)
+
+        # Maybe it's a direct link to a video?
+        # Be careful not to download the whole thing!
+        first_bytes = full_response.read(512)
+        if not re.match(r'^\s*<', first_bytes.decode('utf-8', 'replace')):
+            self._downloader.report_warning(
+                'URL could be a direct video link, returning it as such.')
+            upload_date = unified_strdate(
+                head_response.headers.get('Last-Modified'))
+            return {
+                'id': video_id,
+                'title': os.path.splitext(url_basename(url))[0],
+                'direct': True,
+                'url': url,
+                'upload_date': upload_date,
+            }
+
+        webpage = self._webpage_read_content(
+            full_response, url, video_id, prefix=first_bytes)
+
        self.report_extraction(video_id)

        # Is it an RSS feed?
@@ -702,6 +753,12 @@ class GenericIE(InfoExtractor):
            return _playlist_from_matches(
                matches, lambda m: unescapeHTML(m[1]))

+        # Look for lazyYT YouTube embed
+        matches = re.findall(
+            r'class="lazyYT" data-youtube-id="([^"]+)"', webpage)
+        if matches:
+            return _playlist_from_matches(matches, lambda m: unescapeHTML(m))
+
        # Look for embedded Dailymotion player
        matches = re.findall(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?dailymotion\.com/embed/video/.+?)\1', webpage)
@@ -914,6 +971,13 @@ class GenericIE(InfoExtractor):
        if mobj is not None:
            return self.url_result(mobj.group('url'), 'SBS')

+        # Look for embedded Cinchcast player
+        mobj = re.search(
+            r'<iframe[^>]+?src=(["\'])(?P<url>https?://player\.cinchcast\.com/.+?)\1',
+            webpage)
+        if mobj is not None:
+            return self.url_result(mobj.group('url'), 'Cinchcast')
+
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>https?://m(?:lb)?\.mlb\.com/shared/video/embed/embed\.html\?.+?)\1',
            webpage)
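Review note: the new direct-link check in generic.py reads only the first 512 bytes of the response and treats anything that does not start with `<` (after optional whitespace) as a media file rather than a webpage. The following standalone sketch reproduces that heuristic outside the extractor. The function names `looks_like_markup` and `sniff` are hypothetical, and an `io.BytesIO` object stands in for the HTTP response.

```python
import io
import re


def looks_like_markup(first_bytes):
    """Return True if the leading bytes look like the start of an HTML/XML document."""
    text = first_bytes.decode('utf-8', 'replace')
    return re.match(r'^\s*<', text) is not None


def sniff(response, size=512):
    """Read at most `size` bytes and classify the response as 'webpage' or 'direct'."""
    first_bytes = response.read(size)
    kind = 'webpage' if looks_like_markup(first_bytes) else 'direct'
    # The caller needs first_bytes back: they were consumed from the stream
    # and must be prepended when the rest of a webpage is read later.
    return kind, first_bytes


html_kind, _ = sniff(io.BytesIO(b'\n  <!DOCTYPE html><html></html>'))
media_kind, media_head = sniff(io.BytesIO(b'\x1aE\xdf\xa3' + b'\x00' * 600))
```

Returning the consumed bytes mirrors the `prefix=first_bytes` argument that the hunk passes to `_webpage_read_content`; without it the start of the page would be lost.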
--- a/youtube_dl/extractor/giantbomb.py
+++ b/youtube_dl/extractor/giantbomb.py
@@ -0,0 +1,81 @@
+from __future__ import unicode_literals
+
+import re
+import json
+
+from .common import InfoExtractor
+from ..utils import (
+    unescapeHTML,
+    qualities,
+    int_or_none,
+)
+
+
+class GiantBombIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?giantbomb\.com/videos/(?P<display_id>[^/]+)/(?P<id>\d+-\d+)'
+    _TEST = {
+        'url': 'http://www.giantbomb.com/videos/quick-look-destiny-the-dark-below/2300-9782/',
+        'md5': '57badeface303ecf6b98b812de1b9018',
+        'info_dict': {
+            'id': '2300-9782',
+            'display_id': 'quick-look-destiny-the-dark-below',
+            'ext': 'mp4',
+            'title': 'Quick Look: Destiny: The Dark Below',
+            'description': 'md5:0aa3aaf2772a41b91d44c63f30dfad24',
+            'duration': 2399,
+            'thumbnail': 're:^https?://.*\.jpg$',
+        }
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        display_id = mobj.group('display_id')
+
+        webpage = self._download_webpage(url, display_id)
+
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)
+        thumbnail = self._og_search_thumbnail(webpage)
+
+        video = json.loads(unescapeHTML(self._search_regex(
+            r'data-video="([^"]+)"', webpage, 'data-video')))
+
+        duration = int_or_none(video.get('lengthSeconds'))
+
+        quality = qualities([
+            'f4m_low', 'progressive_low', 'f4m_high',
+            'progressive_high', 'f4m_hd', 'progressive_hd'])
+
+        formats = []
+        for format_id, video_url in video['videoStreams'].items():
+            if format_id == 'f4m_stream':
+                continue
+            if video_url.endswith('.f4m'):
+                f4m_formats = self._extract_f4m_formats(video_url + '?hdcore=3.3.1', display_id)
+                if f4m_formats:
+                    f4m_formats[0]['quality'] = quality(format_id)
+                    formats.extend(f4m_formats)
+            else:
+                formats.append({
+                    'url': video_url,
+                    'format_id': format_id,
+                    'quality': quality(format_id),
+                })
+
+        if not formats:
+            youtube_id = video.get('youtubeID')
+            if youtube_id:
+                return self.url_result(youtube_id, 'Youtube')
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }
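Review note: giantbomb.py ranks its streams with `qualities([...])`, passing the format IDs in ascending order of preference. The sketch below shows one way such a ranking helper can work: it returns a function that maps a format ID to its position in the list, and unknown IDs to -1. This is a simplified stand-in written for illustration, not a copy of the helper in `youtube_dl/utils.py`.

```python
def qualities(quality_ids):
    """Build a ranking function from a list ordered worst to best."""
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            # Unknown format IDs rank below every known one.
            return -1
    return q


quality = qualities([
    'f4m_low', 'progressive_low', 'f4m_high',
    'progressive_high', 'f4m_hd', 'progressive_hd'])

# Hypothetical stream list, sorted so that the preferred format comes last.
streams = ['progressive_hd', 'f4m_low', 'mystery', 'progressive_high']
ranked = sorted(streams, key=quality)
```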
--- a/youtube_dl/extractor/goldenmoustache.py
+++ b/youtube_dl/extractor/goldenmoustache.py
@@ -1,9 +1,6 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-)


 class GoldenMoustacheIE(InfoExtractor):
@@ -17,7 +14,6 @@ class GoldenMoustacheIE(InfoExtractor):
            'title': 'Suricate - Le Poker',
            'description': 'md5:3d1f242f44f8c8cb0a106f1fd08e5dc9',
            'thumbnail': 're:^https?://.*\.jpg$',
-            'view_count': int,
        }
    }, {
        'url': 'http://www.goldenmoustache.com/le-lab-tout-effacer-mc-fly-et-carlito-55249/',
@@ -28,7 +24,6 @@ class GoldenMoustacheIE(InfoExtractor):
            'title': 'Le LAB - Tout Effacer (Mc Fly et Carlito)',
            'description': 'md5:9b7fbf11023fb2250bd4b185e3de3b2a',
            'thumbnail': 're:^https?://.*\.(?:png|jpg)$',
-            'view_count': int,
        }
    }]

@@ -42,9 +37,6 @@ class GoldenMoustacheIE(InfoExtractor):
            r'<title>(.*?)(?: - Golden Moustache)?</title>', webpage, 'title')
        thumbnail = self._og_search_thumbnail(webpage)
        description = self._og_search_description(webpage)
-        view_count = int_or_none(self._html_search_regex(
-            r'<strong>([0-9]+)</strong>\s*VUES</span>',
-            webpage, 'view count', fatal=False))

        return {
            'id': video_id,
@@ -53,5 +45,4 @@ class GoldenMoustacheIE(InfoExtractor):
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
-            'view_count': view_count,
        }
--- a/youtube_dl/extractor/golem.py
+++ b/youtube_dl/extractor/golem.py
@@ -2,8 +2,10 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urlparse,
+)
+from ..utils import (
    determine_ext,
 )

--- a/youtube_dl/extractor/googlesearch.py
+++ b/youtube_dl/extractor/googlesearch.py
@@ -4,7 +4,7 @@ import itertools
 import re

 from .common import SearchInfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
 )

--- a/youtube_dl/extractor/gorillavid.py
+++ b/youtube_dl/extractor/gorillavid.py
@@ -4,19 +4,21 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-    determine_ext,
+from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
 )
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+)


 class GorillaVidIE(InfoExtractor):
-    IE_DESC = 'GorillaVid.in, daclips.in and movpod.in'
+    IE_DESC = 'GorillaVid.in, daclips.in, movpod.in and fastvideo.in'
    _VALID_URL = r'''(?x)
        https?://(?P<host>(?:www\.)?
-            (?:daclips\.in|gorillavid\.in|movpod\.in))/
+            (?:daclips\.in|gorillavid\.in|movpod\.in|fastvideo\.in))/
        (?:embed-)?(?P<id>[0-9a-zA-Z]+)(?:-[0-9]+x[0-9]+\.html)?
    '''

@@ -49,6 +51,16 @@ class GorillaVidIE(InfoExtractor):
            'title': 'Micro Pig piglets ready on 16th July 2009-bG0PdrCdxUc',
            'thumbnail': 're:http://.*\.jpg',
        }
+    }, {
+        # video with countdown timeout
+        'url': 'http://fastvideo.in/1qmdn1lmsmbw',
+        'md5': '8b87ec3f6564a3108a0e8e66594842ba',
+        'info_dict': {
+            'id': '1qmdn1lmsmbw',
+            'ext': 'mp4',
+            'title': 'Man of Steel - Trailer',
+            'thumbnail': 're:http://.*\.jpg',
+        },
    }, {
        'url': 'http://movpod.in/0wguyyxi1yca',
        'only_matching': True,
@@ -71,6 +83,12 @@ class GorillaVidIE(InfoExtractor):
            ''', webpage))

        if fields['op'] == 'download1':
+            countdown = int_or_none(self._search_regex(
+                r'<span id="countdown_str">(?:[Ww]ait)?\s*<span id="cxc">(\d+)</span>\s*(?:seconds?)?</span>',
+                webpage, 'countdown', default=None))
+            if countdown:
+                self._sleep(countdown, video_id)
+
            post = compat_urllib_parse.urlencode(fields)

            req = compat_urllib_request.Request(url, post)
@@ -78,14 +96,17 @@ class GorillaVidIE(InfoExtractor):

            webpage = self._download_webpage(req, video_id, 'Downloading video page')

-        title = self._search_regex(r'style="z-index: [0-9]+;">([^<]+)</span>', webpage, 'title')
-        video_url = self._search_regex(r'file\s*:\s*\'(http[^\']+)\',', webpage, 'file url')
-        thumbnail = self._search_regex(r'image\s*:\s*\'(http[^\']+)\',', webpage, 'thumbnail', fatal=False)
+        title = self._search_regex(
+            r'style="z-index: [0-9]+;">([^<]+)</span>',
+            webpage, 'title', default=None) or self._og_search_title(webpage)
+        video_url = self._search_regex(
+            r'file\s*:\s*["\'](http[^"\']+)["\'],', webpage, 'file url')
+        thumbnail = self._search_regex(
+            r'image\s*:\s*["\'](http[^"\']+)["\'],', webpage, 'thumbnail', fatal=False)

        formats = [{
            'format_id': 'sd',
            'url': video_url,
-            'ext': determine_ext(video_url),
            'quality': 1,
        }]

--- a/youtube_dl/extractor/goshgay.py
+++ b/youtube_dl/extractor/goshgay.py
@@ -2,57 +2,52 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
+from ..compat import (
+    compat_parse_qs,
+)
 from ..utils import (
-    compat_urlparse,
-    ExtractorError,
+    parse_duration,
 )


 class GoshgayIE(InfoExtractor):
-    _VALID_URL = r'^(?:https?://)www.goshgay.com/video(?P<id>\d+?)($|/)'
+    _VALID_URL = r'https?://www\.goshgay\.com/video(?P<id>\d+?)($|/)'
    _TEST = {
-        'url': 'http://www.goshgay.com/video4116282',
-        'md5': '268b9f3c3229105c57859e166dd72b03',
+        'url': 'http://www.goshgay.com/video299069/diesel_sfw_xxx_video',
+        'md5': '027fcc54459dff0feb0bc06a7aeda680',
        'info_dict': {
-            'id': '4116282',
+            'id': '299069',
            'ext': 'flv',
-            'title': 'md5:089833a4790b5e103285a07337f245bf',
-            'thumbnail': 're:http://.*\.jpg',
+            'title': 'DIESEL SFW XXX Video',
+            'thumbnail': 're:^http://.*\.jpg$',
+            'duration': 79,
            'age_limit': 18,
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
-
        webpage = self._download_webpage(url, video_id)
-        title = self._og_search_title(webpage)
-        thumbnail = self._og_search_thumbnail(webpage)
+
+        title = self._html_search_regex(
+            r'<h2>(.*?)<', webpage, 'title')
+        duration = parse_duration(self._html_search_regex(
+            r'<span class="duration">\s*-?\s*(.*?)</span>',
+            webpage, 'duration', fatal=False))
        family_friendly = self._html_search_meta(
            'isFamilyFriendly', webpage, default='false')
-        config_url = self._search_regex(
-            r"'config'\s*:\s*'([^']+)'", webpage, 'config URL')

-        config = self._download_xml(
-            config_url, video_id, 'Downloading player config XML')
-
-        if config is None:
-            raise ExtractorError('Missing config XML')
-        if config.tag != 'config':
-            raise ExtractorError('Missing config attribute')
-        fns = config.findall('file')
-        if len(fns) < 1:
-            raise ExtractorError('Missing media URI')
-        video_url = fns[0].text
-
-        url_comp = compat_urlparse.urlparse(url)
-        ref = "%s://%s%s" % (url_comp[0], url_comp[1], url_comp[2])
+        flashvars = compat_parse_qs(self._html_search_regex(
+            r'<embed.+?id="flash-player-embed".+?flashvars="([^"]+)"',
+            webpage, 'flashvars'))
+        thumbnail = flashvars.get('url_bigthumb', [None])[0]
+        video_url = flashvars['flv_url'][0]

        return {
            'id': video_id,
            'url': video_url,
            'title': title,
            'thumbnail': thumbnail,
-            'http_referer': ref,
+            'duration': duration,
            'age_limit': 0 if family_friendly == 'true' else 18,
        }
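Review note: the rewritten goshgay.py no longer downloads a player config XML. It parses the `flashvars` attribute of the embed tag as a query string instead, treating `flv_url` as required and `url_bigthumb` as optional. A minimal sketch of that lookup follows, using the Python 3 standard library `parse_qs` as a stand-in for `compat_parse_qs`. The function name `parse_flashvars` and the sample URLs are made up for illustration.

```python
from urllib.parse import parse_qs


def parse_flashvars(flashvars):
    """Extract the video URL (required) and thumbnail URL (optional)."""
    params = parse_qs(flashvars)
    # parse_qs maps every key to a list of values, hence the [0].
    video_url = params['flv_url'][0]
    thumbnail = params.get('url_bigthumb', [None])[0]
    return video_url, thumbnail


sample = ('flv_url=http%3A%2F%2Fcdn.example.com%2Fv.flv'
          '&url_bigthumb=http%3A%2F%2Fcdn.example.com%2Ft.jpg')
video_url, thumbnail = parse_flashvars(sample)
```

A missing `flv_url` raises `KeyError` in this sketch, which matches the hunk's choice of indexing `flashvars['flv_url']` directly while using `.get()` for the thumbnail.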
--- a/youtube_dl/extractor/groupon.py
+++ b/youtube_dl/extractor/groupon.py
@@ -0,0 +1,50 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class GrouponIE(InfoExtractor):
+    _VALID_URL = r'https?://www\.groupon\.com/deals/(?P<id>[^?#]+)'
+
+    _TEST = {
+        'url': 'https://www.groupon.com/deals/bikram-yoga-huntington-beach-2#ooid=tubGNycTo_9Uxg82uESj4i61EYX8nyuf',
+        'info_dict': {
+            'id': 'bikram-yoga-huntington-beach-2',
+            'title': '$49 for 10 Yoga Classes or One Month of Unlimited Classes at Bikram Yoga Huntington Beach ($180 Value)',
+            'description': 'Studio kept at 105 degrees and 40% humidity with anti-microbial and anti-slip Flotex flooring; certified instructors',
+        },
+        'playlist': [{
+            'info_dict': {
+                'id': 'tubGNycTo_9Uxg82uESj4i61EYX8nyuf',
+                'ext': 'mp4',
+                'title': 'Bikram Yoga Huntington Beach | Orange County',
+            },
+        }],
+        'params': {
+            'skip_download': 'HLS',
+        }
+    }
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+        webpage = self._download_webpage(url, playlist_id)
+
+        payload = self._parse_json(self._search_regex(
+            r'var\s+payload\s*=\s*(.*?);\n', webpage, 'payload'), playlist_id)
+        videos = payload['carousel'].get('dealVideos', [])
+        entries = []
+        for v in videos:
+            if v.get('provider') != 'OOYALA':
+                self.report_warning(
+                    '%s: Unsupported video provider %s, skipping video' %
+                    (playlist_id, v.get('provider')))
+                continue
+            entries.append(self.url_result('ooyala:%s' % v['media']))
+
+        return {
+            '_type': 'playlist',
+            'id': playlist_id,
+            'entries': entries,
+            'title': self._og_search_title(webpage),
+            'description': self._og_search_description(webpage),
+        }
--- a/youtube_dl/extractor/helsinki.py
+++ b/youtube_dl/extractor/helsinki.py
@@ -2,9 +2,8 @@

 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
+from ..utils import js_to_json


 class HelsinkiIE(InfoExtractor):
@@ -24,39 +23,21 @@ class HelsinkiIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        formats = []
-
-        mobj = re.search(r'file=((\w+):[^&]+)', webpage)
-        if mobj:
-            formats.append({
-                'ext': mobj.group(2),
-                'play_path': mobj.group(1),
-                'url': 'rtmp://flashvideo.it.helsinki.fi/vod/',
-                'player_url': 'http://video.helsinki.fi/player.swf',
-                'format_note': 'sd',
-                'quality': 0,
-            })
-
-        mobj = re.search(r'hd\.file=((\w+):[^&]+)', webpage)
-        if mobj:
-            formats.append({
-                'ext': mobj.group(2),
-                'play_path': mobj.group(1),
-                'url': 'rtmp://flashvideo.it.helsinki.fi/vod/',
-                'player_url': 'http://video.helsinki.fi/player.swf',
-                'format_note': 'hd',
-                'quality': 1,
-            })

+        params = self._parse_json(self._html_search_regex(
+            r'(?s)jwplayer\("player"\).setup\((\{.*?\})\);',
+            webpage, 'player code'), video_id, transform_source=js_to_json)
+        formats = [{
+            'url': s['file'],
+            'ext': 'mp4',
+        } for s in params['sources']]
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': self._og_search_title(webpage).replace('Video: ', ''),
            'description': self._og_search_description(webpage),
-            'thumbnail': self._og_search_thumbnail(webpage),
            'formats': formats,
        }
--- a/youtube_dl/extractor/hostingbulk.py
+++ b/youtube_dl/extractor/hostingbulk.py
@@ -4,9 +4,11 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import (
+    compat_urllib_request,
+)
 from ..utils import (
    ExtractorError,
-    compat_urllib_request,
    int_or_none,
    urlencode_postdata,
 )
@@ -30,9 +32,7 @@ class HostingBulkIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        url = 'http://hostingbulk.com/{0:}.html'.format(video_id)

        # Custom request with cookie to set language to English, so our file
--- a/youtube_dl/extractor/hotnewhiphop.py
+++ b/youtube_dl/extractor/hotnewhiphop.py
@@ -1,12 +1,13 @@
 from __future__ import unicode_literals

-import re
 import base64

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
+)
+from ..utils import (
    ExtractorError,
    HEADRequest,
 )
@@ -16,25 +17,24 @@ class HotNewHipHopIE(InfoExtractor):
    _VALID_URL = r'http://www\.hotnewhiphop\.com/.*\.(?P<id>.*)\.html'
    _TEST = {
        'url': 'http://www.hotnewhiphop.com/freddie-gibbs-lay-it-down-song.1435540.html',
-        'file': '1435540.mp3',
        'md5': '2c2cd2f76ef11a9b3b581e8b232f3d96',
        'info_dict': {
+            'id': '1435540',
+            'ext': 'mp3',
            'title': 'Freddie Gibbs - Lay It Down'
        }
    }

    def _real_extract(self, url):
-        m = re.match(self._VALID_URL, url)
-        video_id = m.group('id')
-
-        webpage_src = self._download_webpage(url, video_id)
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)

        video_url_base64 = self._search_regex(
-            r'data-path="(.*?)"', webpage_src, u'video URL', fatal=False)
+            r'data-path="(.*?)"', webpage, 'video URL', default=None)

        if video_url_base64 is None:
            video_url = self._search_regex(
-                r'"contentUrl" content="(.*?)"', webpage_src, u'video URL')
+                r'"contentUrl" content="(.*?)"', webpage, 'content URL')
            return self.url_result(video_url, ie='Youtube')

        reqdata = compat_urllib_parse.urlencode([
@@ -59,11 +59,11 @@ class HotNewHipHopIE(InfoExtractor):
        if video_url.endswith('.html'):
            raise ExtractorError('Redirect failed')

-        video_title = self._og_search_title(webpage_src).strip()
+        video_title = self._og_search_title(webpage).strip()

        return {
            'id': video_id,
            'url': video_url,
            'title': video_title,
-            'thumbnail': self._og_search_thumbnail(webpage_src),
+            'thumbnail': self._og_search_thumbnail(webpage),
        }