Compare commits


106 Commits

Author SHA1 Message Date
Philipp Hagemeister
9afb76c5ad release 2014.04.07.1 2014-04-07 15:28:55 +02:00
Philipp Hagemeister
dfb2cb5cfd [teamcoco] Simplify ID management (Closes #2715) 2014-04-07 15:25:35 +02:00
Philipp Hagemeister
650d688d10 release 2014.04.07 2014-04-07 13:11:37 +02:00
Philipp Hagemeister
0ba77818f3 [ted] Add width and height (Fixes #2716) 2014-04-07 13:11:30 +02:00
Sergey M․
09baa7da7e [rts] Update test 2014-04-07 00:34:23 +07:00
Sergey M․
85e787f51d [cbsnews] Add support for cbsnews.com (Closes #2691) 2014-04-06 06:03:58 +07:00
Philipp Hagemeister
2a9e1e453a Merge branch 'master' of github.com:rg3/youtube-dl 2014-04-05 20:05:47 +02:00
Philipp Hagemeister
ee1e199685 [justin.tv] Modernize (Fixes #2705) 2014-04-05 17:56:36 +02:00
Sergey M․
17c5a00774 [novamov] Simplify 2014-04-05 19:36:22 +07:00
Sergey M․
15c0e8e7b2 [generic] Generalize novamov based embeds 2014-04-05 17:20:05 +07:00
Sergey M․
cca37fba48 [divxstage] Fix typo in IE_NAME 2014-04-05 17:15:43 +07:00
Sergey M․
9d0993ec4a [movshare] Support more domains 2014-04-05 17:00:18 +07:00
Sergey M․
342f33bf9e [divxstage] Support more domains 2014-04-05 16:50:05 +07:00
Sergey M․
7cd3bc5f99 [nowvideo] Support more domains 2014-04-05 16:38:57 +07:00
Sergey M․
931055e6cb [videoweed] Revert _FILE_DELETED_REGEX 2014-04-05 16:32:14 +07:00
Sergey M․
d0e4cf82f1 [movshare] Add _FILE_DELETED_REGEX 2014-04-05 16:31:38 +07:00
Sergey M․
6f88df2c57 [divxstage] Add support for divxstage.eu 2014-04-05 16:29:44 +07:00
Sergey M․
4479bf2762 [videoweed] Simplify 2014-04-05 16:09:28 +07:00
Sergey M․
1ff7c0f7d8 [movshare] Add support for movshare.net 2014-04-05 16:09:03 +07:00
Sergey M․
610e47c87e Credit @sainyamkapoor for videoweed extractor 2014-04-05 15:53:50 +07:00
Sergey M․
50f566076f [generic] Add support for videoweed embeds 2014-04-05 15:49:45 +07:00
Sergey M․
92810ff497 [nowvideo] Improve _VALID_URL 2014-04-05 15:35:21 +07:00
Sergey M․
60ccc59a1c [novamov] Improve _VALID_URL 2014-04-05 15:34:54 +07:00
Sergey M․
91745595d3 [videoweed] Simplify 2014-04-05 15:32:55 +07:00
Sainyam Kapoor
d6e40507d0 [videoweed]Cleanup 2014-04-05 10:53:22 +05:30
Sainyam Kapoor
deed48b472 [Videoweed] Added support for videoweed. 2014-04-05 10:40:03 +05:30
Philipp Hagemeister
e4d41bfca5 Merge pull request #2696 from anovicecodemonkey/support-ustream-embeds
[UstreamIE] [generic] Added support for Ustream embed URLs (Fixes #2694)
2014-04-04 23:33:08 +02:00
Philipp Hagemeister
a355b70f27 [cspan] Do not test number of playlist entries
Apparently, CSpan switches between single-file and multiple-file results. Either one is fine as long as we get the full four hours.
2014-04-04 23:16:22 +02:00
Philipp Hagemeister
f8514f6186 [rts] Use visible id in file names
Maybe the internal ID is more precise, but it's totally confusing, and the obvious ID still allows a google search.
2014-04-04 23:13:55 +02:00
Philipp Hagemeister
e09b8fcd9d [ro220] Make test case more flexible
Either one or two spaces is fine here.
2014-04-04 23:08:33 +02:00
Philipp Hagemeister
7d1b527ff9 [motorsport] Fix on Python 3 2014-04-04 23:06:27 +02:00
Philipp Hagemeister
f943c7b622 release 2014.04.04.7 2014-04-04 23:01:45 +02:00
Philipp Hagemeister
676eb3f2dd Fix unicode_escape (Fixes #2695) 2014-04-04 23:00:51 +02:00
Philipp Hagemeister
98b7cf1ace release 2014.04.04.6 2014-04-04 22:48:35 +02:00
Philipp Hagemeister
c465afd736 [teamcoco] Fix regex in 2.6 (#2700)
The re engine does not want to repeat an empty string, for fear that something like

    (.*)*

could be matching the tokens ...

    ""
    "" ""
    "" "" ""

Of course, that's harmless with a question mark, although still somewhat strange.
2014-04-04 22:46:47 +02:00
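
A sketch of the degenerate case described above, in Python (illustrative only — this is not the actual teamcoco pattern; modern interpreters accept both forms, while Python 2.6's engine rejected some patterns that greedily repeat a possibly-empty group):

    import re

    # '(.*)*' repeats a group that can match the empty string, so the engine
    # must guard against looping on zero-width matches; '(.*)?' bounds the
    # repetition to at most one (possibly empty) match and is unambiguous.
    print(re.match(r'(.*)*', 'abc').group(0))  # 'abc'
    print(re.match(r'(.*)?', 'abc').group(0))  # 'abc'
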
Philipp Hagemeister
b84d6e7fc4 Merge remote-tracking branch 'AGSPhoenix/teamcoco-fix' 2014-04-04 22:44:49 +02:00
Philipp Hagemeister
2efd5d78c1 release 2014.04.04.5 2014-04-04 22:24:45 +02:00
Philipp Hagemeister
c8edf47b3a [yahoo] Support https and -uploader URLs (Fixes #2701) 2014-04-04 22:23:59 +02:00
Philipp Hagemeister
3b4c26a428 [pornhd] Avoid shadowing variable url 2014-04-04 22:22:30 +02:00
Philipp Hagemeister
1525148114 Remove unused imports 2014-04-04 22:22:11 +02:00
Philipp Hagemeister
9e0c5791c1 release 2014.04.04.4 2014-04-04 22:15:32 +02:00
Philipp Hagemeister
29a1ab2afc Add alternative --prefer-unsecure spelling (Closes #2697) 2014-04-04 22:15:21 +02:00
AGSPhoenix
fa387d2d99 Revert "Workaround for regex engine limitation"
This reverts commit 6d0d573eca.
2014-04-04 15:37:49 -04:00
AGSPhoenix
6d0d573eca Workaround for regex engine limitation 2014-04-04 15:25:28 -04:00
AGSPhoenix
bb799e811b Add a test for the new URL pages
Add a test for the pages with the video_id in the URL.
2014-04-04 13:52:35 -04:00
AGSPhoenix
04ee53eca1 Support TeamCoco URLs with video_id in the title
If the URL has the video_id in it, use that since the current method of
finding the id breaks on those pages.

Fixes 2698.
2014-04-04 13:42:34 -04:00
Jaime Marquínez Ferrándiz
659eb98a53 [breakcom] Fix YouTube videos extraction (fixes #2699) 2014-04-04 19:01:18 +02:00
anovicecodemonkey
ca6aada48e Fix _TEST for Ustream embed URLs 2014-04-05 03:26:29 +10:30
Jaime Marquínez Ferrándiz
43df5a7e71 [keezmovies] Modernize 2014-04-04 18:52:43 +02:00
Jaime Marquínez Ferrándiz
88f1c6de7b [yahoo] Modernize 2014-04-04 18:52:43 +02:00
Sergey M․
65a40ab82b [pornhd] Update test checksum 2014-04-04 22:47:38 +07:00
Sergey M․
4b9cced103 [pornhd] Fix extraction (Closes #2693) 2014-04-04 22:45:39 +07:00
anovicecodemonkey
5c38625259 [UstreamIE] [generic] Added support for Ustream embed URLs (Fixes #2694) 2014-04-05 00:53:09 +10:30
Sergey M․
6344fa04bb [rts] Add more formats and audio support (Closes #2689) 2014-04-04 20:42:06 +07:00
Jaime Marquínez Ferrándiz
e3ced9ed61 [downloader/common] Use compat_str with the error in try_rename (appeared in #2389)
Otherwise on python 2.x we get `UnicodeDecodeError` because it may contain non ascii characters.
2014-04-04 14:59:11 +02:00
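
A minimal sketch of the failure mode, assuming a Python 2 interpreter and illustrative values (`compat_str` is `unicode` on Python 2 and `str` on Python 3):

    # -*- coding: utf-8 -*-
    byte_msg = 'unable to rename caf\xc3\xa9.mp4'  # bytes, e.g. from str(err)
    try:
        u'ERROR: %s' % byte_msg  # implicit ASCII decode of non-ASCII bytes
    except UnicodeDecodeError:
        print(u'ERROR: %s' % byte_msg.decode('utf-8'))  # explicit decode works
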
Philipp Hagemeister
5075d598bc release 2014.04.04.2 2014-04-04 02:24:21 +02:00
Philipp Hagemeister
68eb8e90e6 [youtube:playlist] Fix playlists for logged-in users (Fixes #2690) 2014-04-04 02:23:36 +02:00
Philipp Hagemeister
d3a96346c4 release 2014.04.04.3 2014-04-04 02:09:16 +02:00
Philipp Hagemeister
0e518e2fea [cnet] Fall back to "videos" key 2014-04-04 02:09:04 +02:00
Philipp Hagemeister
1e0a235f39 [dailymotion] Fix playlist+user 2014-04-04 02:04:16 +02:00
Philipp Hagemeister
9ad400f75e [generic] Remove test case that has become a 404 2014-04-04 01:47:17 +02:00
Philipp Hagemeister
3537b93d8a [tests] Fix YoutubeDL tests
Since bec1fad, the id, title, and url (also in formats) keys are mandatory. Change the tests to reflect that.
2014-04-04 01:45:49 +02:00
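
Concretely, the minimal shape an extractor result must now have (field values here are placeholders):

    info_dict = {
        'id': 'testid',         # mandatory
        'title': 'test title',  # mandatory
        'formats': [
            # every format entry must carry a 'url'
            {'format_id': 'sd', 'ext': 'mp4', 'url': 'http://example.com/v.mp4'},
        ],
    }
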
Philipp Hagemeister
56eca2e956 release 2014.04.04.1 2014-04-04 00:25:43 +02:00
Philipp Hagemeister
2ad4d1ba07 [morningstar] Add new extractor (Fixes #2687) 2014-04-04 00:25:35 +02:00
Philipp Hagemeister
4853de808b release 2014.04.04 2014-04-04 00:06:06 +02:00
Philipp Hagemeister
6ff5f12218 [motorsport] Add extractor (Fixes #2688) 2014-04-04 00:05:43 +02:00
Philipp Hagemeister
52a180684f [README] Fix VALID_URL in extractor example 2014-04-03 23:25:23 +02:00
Philipp Hagemeister
b21e25702f Merge pull request #2681 from phihag/readme-dev-instructions
[README] Improve developer instructions
2014-04-03 23:06:15 +02:00
Jaime Marquínez Ferrándiz
983af2600f [wimp] Detect youtube videos (fixes #2686) 2014-04-03 20:44:51 +02:00
Philipp Hagemeister
f34e6a2cd6 [comedycentral:shows] Do no include 6-digit identifier in display ID 2014-04-03 18:39:00 +02:00
Philipp Hagemeister
a9f304031b release 2014.04.03.3 2014-04-03 16:21:54 +02:00
Philipp Hagemeister
9271bc8355 [cnet] Add new extractor (Fixes #2679) 2014-04-03 16:21:21 +02:00
Philipp Hagemeister
d1b3e3dd75 [README] Add md5 to code example 2014-04-03 15:59:04 +02:00
Philipp Hagemeister
968ed2a777 [comedycentral] Add test for #2677 2014-04-03 15:31:04 +02:00
Philipp Hagemeister
24de5d2556 release 2014.04.03.2 2014-04-03 15:28:56 +02:00
Philipp Hagemeister
d26e981df4 Correct check for empty dirname (Fixes #2683) 2014-04-03 15:28:41 +02:00
Jaime Marquínez Ferrándiz
e45d40b171 [youtube:subscriptions] Add space to the description 2014-04-03 15:13:52 +02:00
Sergey M․
4a419b8851 [c56] Modernize and add duration extraction 2014-04-03 19:53:11 +07:00
Philipp Hagemeister
5fbd672c38 [README] Improve developer instructions
Add a longer tutorial that should cover everything needed to start developing IEs.

Fixes #2676
2014-04-03 14:46:24 +02:00
Philipp Hagemeister
bec1fad223 [YouTubeDL] Throw an early error if the info_dict result is invalid 2014-04-03 14:38:16 +02:00
Philipp Hagemeister
177fed41bc [comedycentral:shows] Support guest/ URLs (Fixes #2677) 2014-04-03 14:38:16 +02:00
Jaime Marquínez Ferrándiz
b900e7cba4 [downloader/f4m] Close the final video 2014-04-03 13:35:07 +02:00
Jaime Marquínez Ferrándiz
14cb4979f0 MANIFEST.in: Only list the files from the docs folder that will be included (closes #2623)
Pruning the _build folder produced the message `no previously-included directories found matching 'docs/_build'` when installing from the source distribution.
2014-04-03 13:26:27 +02:00
Philipp Hagemeister
69e61e30fe release 2014.04.03.1 2014-04-03 08:55:59 +02:00
Philipp Hagemeister
cce929eaac [franceculture] Add extractor (Fixes #2669) 2014-04-03 08:55:38 +02:00
Philipp Hagemeister
b6cfde99b7 Only mention websense URL once 2014-04-03 08:12:53 +02:00
Philipp Hagemeister
1be99f052d release 2014.04.03 2014-04-03 06:09:45 +02:00
Philipp Hagemeister
2410c43d83 Detect Websense censorship (Fixes #2670) 2014-04-03 06:09:38 +02:00
Philipp Hagemeister
aea6e7fc3c [cspan] Support multiple segments (Fixes #2674) 2014-04-03 06:09:38 +02:00
Sergey M․
91a76c40c0 [musicplayon] Add support for musicplayon.com 2014-04-02 22:10:20 +07:00
Philipp Hagemeister
d2b194607c release 2014.04.02 2014-04-02 14:26:34 +02:00
Jaime Marquínez Ferrándiz
f6177462db [youtube] feeds: Also look for the html in the 'content_html' field (fixes #2671) 2014-04-02 14:13:08 +02:00
Jaime Marquínez Ferrándiz
9ddaf4ef8c [comedycentral] Change XPath .//guid to ./guid (fixes #2668)
It fails to find the element in python 2.6 and it's not required, the
element is a direct child of the item node.
2014-04-01 21:38:07 +02:00
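
Sketched with a toy document (sample XML, not the real Comedy Central feed): `./guid` addresses a direct child, while `.//guid` searches all descendants.

    import xml.etree.ElementTree as ET

    item = ET.fromstring('<item><guid>mgid:arc:episode:1234</guid></item>')
    # Both match here because <guid> is a direct child of <item>, but the
    # './' form avoids the descendant search that misbehaved on Python 2.6.
    print(item.find('./guid').text)
    print(item.find('.//guid').text)
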
Jaime Marquínez Ferrándiz
97b5573848 [comedycentral] Update test title for 34cbc7ee8d 2014-04-01 21:29:40 +02:00
Jaime Marquínez Ferrándiz
18c95c1ab0 [rutube] Use _download_json 2014-04-01 20:30:22 +02:00
Sergey M․
0479c625a4 [brightcove] Encode object_str with utf-8 2014-04-01 20:17:35 +07:00
Sergey M․
f659951e22 [vk] Support optional dash for oid in embedded links 2014-04-01 19:38:42 +07:00
Philipp Hagemeister
5853a7316e release 2014.04.01.3 2014-04-01 13:17:15 +02:00
Philipp Hagemeister
a612753db9 [utils] Correct decoding of large unicode codepoints in uppercase_escape (Fixes #2664) 2014-04-01 13:17:07 +02:00
Philipp Hagemeister
c8fc3fb524 release 2014.04.01.2 2014-04-01 05:57:15 +02:00
Philipp Hagemeister
5912c639df [youtube] Transform google's JSON dialect (fixes #2663) 2014-04-01 05:56:56 +02:00
Philipp Hagemeister
017e4dd58c release 2014.04.01.1 2014-04-01 00:25:17 +02:00
Philipp Hagemeister
651486621d [comedycentral] Allow URLs with query parts (fixes #2661) 2014-04-01 00:25:11 +02:00
Philipp Hagemeister
28d9032c88 release 2014.04.01 2014-04-01 00:02:39 +02:00
Philipp Hagemeister
16f4eb723a [comedycentral] Add support for /videos URLs (Fixes #2660) 2014-04-01 00:02:32 +02:00
Sergey M․
1cbd410620 [pyvideo] Modernize 2014-03-31 19:31:48 +07:00
46 changed files with 1152 additions and 297 deletions

MANIFEST.in

@@ -3,5 +3,4 @@ include test/*.py
 include test/*.json
 include youtube-dl.bash-completion
 include youtube-dl.1
-recursive-include docs *
+recursive-include docs Makefile conf.py *.rst
-prune docs/_build

README.md

@@ -371,7 +371,67 @@ If you want to create a build of youtube-dl yourself, you'll need
 ### Adding support for a new site
 
-If you want to add support for a new site, copy *any* [recently modified](https://github.com/rg3/youtube-dl/commits/master/youtube_dl/extractor) file in `youtube_dl/extractor`, add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py). Have a look at [`youtube_dl/common/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Don't forget to run the tests with `python test/test_download.py TestDownload.test_YourExtractor`! For a detailed tutorial, refer to [this blog post](http://filippo.io/add-support-for-a-new-video-site-to-youtube-dl/).
+If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
+
+1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
+2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
+3. Start a new git branch with `cd youtube-dl; git checkout -b yourextractor`
+4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
+
+        # coding: utf-8
+        from __future__ import unicode_literals
+
+        import re
+
+        from .common import InfoExtractor
+
+
+        class YourExtractorIE(InfoExtractor):
+            _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
+            _TEST = {
+                'url': 'http://yourextractor.com/watch/42',
+                'md5': 'TODO: md5 sum of the first 10KiB of the video file',
+                'info_dict': {
+                    'id': '42',
+                    'ext': 'mp4',
+                    'title': 'Video title goes here',
+                    # TODO more properties, either as:
+                    # * A value
+                    # * MD5 checksum; start the string with md5:
+                    # * A regular expression; start the string with re:
+                    # * Any Python type (for example int or float)
+                }
+            }
+
+            def _real_extract(self, url):
+                mobj = re.match(self._VALID_URL, url)
+                video_id = mobj.group('id')
+
+                # TODO more code goes here, for example ...
+                webpage = self._download_webpage(url, video_id)
+                title = self._html_search_regex(r'<h1>(.*?)</h1>', webpage, 'title')
+
+                return {
+                    'id': video_id,
+                    'title': title,
+                    # TODO more properties (see youtube_dl/extractor/common.py)
+                }
+
+5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
+6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done.
+7. Have a look at [`youtube_dl/common/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
+8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
+9. When the tests pass, [add](https://www.kernel.org/pub/software/scm/git/docs/git-add.html) the new files and [commit](https://www.kernel.org/pub/software/scm/git/docs/git-commit.html) them and [push](https://www.kernel.org/pub/software/scm/git/docs/git-push.html) the result, like this:
+
+        $ git add youtube_dl/extractor/__init__.py
+        $ git add youtube_dl/extractor/yourextractor.py
+        $ git commit -m '[yourextractor] Add new extractor'
+        $ git push origin yourextractor
+
+10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
+
+In any case, thank you very much for your contributions!
 
 # BUGS

test/test_YoutubeDL.py

@@ -26,16 +26,27 @@ class YDL(FakeYDL):
         self.msgs.append(msg)
 
 
+def _make_result(formats, **kwargs):
+    res = {
+        'formats': formats,
+        'id': 'testid',
+        'title': 'testttitle',
+        'extractor': 'testex',
+    }
+    res.update(**kwargs)
+    return res
+
+
 class TestFormatSelection(unittest.TestCase):
     def test_prefer_free_formats(self):
         # Same resolution => download webm
         ydl = YDL()
         ydl.params['prefer_free_formats'] = True
         formats = [
-            {'ext': 'webm', 'height': 460},
-            {'ext': 'mp4', 'height': 460},
+            {'ext': 'webm', 'height': 460, 'url': 'x'},
+            {'ext': 'mp4', 'height': 460, 'url': 'y'},
         ]
-        info_dict = {'formats': formats, 'extractor': 'test'}
+        info_dict = _make_result(formats)
         yie = YoutubeIE(ydl)
         yie._sort_formats(info_dict['formats'])
         ydl.process_ie_result(info_dict)
@@ -46,8 +57,8 @@ class TestFormatSelection(unittest.TestCase):
         ydl = YDL()
         ydl.params['prefer_free_formats'] = True
         formats = [
-            {'ext': 'webm', 'height': 720},
-            {'ext': 'mp4', 'height': 1080},
+            {'ext': 'webm', 'height': 720, 'url': 'a'},
+            {'ext': 'mp4', 'height': 1080, 'url': 'b'},
         ]
         info_dict['formats'] = formats
         yie = YoutubeIE(ydl)
@@ -60,9 +71,9 @@ class TestFormatSelection(unittest.TestCase):
         ydl = YDL()
         ydl.params['prefer_free_formats'] = False
         formats = [
-            {'ext': 'webm', 'height': 720},
-            {'ext': 'mp4', 'height': 720},
-            {'ext': 'flv', 'height': 720},
+            {'ext': 'webm', 'height': 720, 'url': '_'},
+            {'ext': 'mp4', 'height': 720, 'url': '_'},
+            {'ext': 'flv', 'height': 720, 'url': '_'},
         ]
         info_dict['formats'] = formats
         yie = YoutubeIE(ydl)
@@ -74,8 +85,8 @@ class TestFormatSelection(unittest.TestCase):
         ydl = YDL()
         ydl.params['prefer_free_formats'] = False
         formats = [
-            {'ext': 'flv', 'height': 720},
-            {'ext': 'webm', 'height': 720},
+            {'ext': 'flv', 'height': 720, 'url': '_'},
+            {'ext': 'webm', 'height': 720, 'url': '_'},
         ]
         info_dict['formats'] = formats
         yie = YoutubeIE(ydl)
@@ -91,8 +102,7 @@ class TestFormatSelection(unittest.TestCase):
             {'format_id': 'great', 'url': 'http://example.com/great', 'preference': 3},
             {'format_id': 'excellent', 'url': 'http://example.com/exc', 'preference': 4},
         ]
-        info_dict = {
-            'formats': formats, 'extractor': 'test', 'id': 'testvid'}
+        info_dict = _make_result(formats)
 
         ydl = YDL()
         ydl.process_ie_result(info_dict)
@@ -120,12 +130,12 @@ class TestFormatSelection(unittest.TestCase):
     def test_format_selection(self):
         formats = [
-            {'format_id': '35', 'ext': 'mp4', 'preference': 1},
-            {'format_id': '45', 'ext': 'webm', 'preference': 2},
-            {'format_id': '47', 'ext': 'webm', 'preference': 3},
-            {'format_id': '2', 'ext': 'flv', 'preference': 4},
+            {'format_id': '35', 'ext': 'mp4', 'preference': 1, 'url': '_'},
+            {'format_id': '45', 'ext': 'webm', 'preference': 2, 'url': '_'},
+            {'format_id': '47', 'ext': 'webm', 'preference': 3, 'url': '_'},
+            {'format_id': '2', 'ext': 'flv', 'preference': 4, 'url': '_'},
         ]
-        info_dict = {'formats': formats, 'extractor': 'test'}
+        info_dict = _make_result(formats)
 
         ydl = YDL({'format': '20/47'})
         ydl.process_ie_result(info_dict.copy())
@@ -154,12 +164,12 @@ class TestFormatSelection(unittest.TestCase):
     def test_format_selection_audio(self):
         formats = [
-            {'format_id': 'audio-low', 'ext': 'webm', 'preference': 1, 'vcodec': 'none'},
-            {'format_id': 'audio-mid', 'ext': 'webm', 'preference': 2, 'vcodec': 'none'},
-            {'format_id': 'audio-high', 'ext': 'flv', 'preference': 3, 'vcodec': 'none'},
-            {'format_id': 'vid', 'ext': 'mp4', 'preference': 4},
+            {'format_id': 'audio-low', 'ext': 'webm', 'preference': 1, 'vcodec': 'none', 'url': '_'},
+            {'format_id': 'audio-mid', 'ext': 'webm', 'preference': 2, 'vcodec': 'none', 'url': '_'},
+            {'format_id': 'audio-high', 'ext': 'flv', 'preference': 3, 'vcodec': 'none', 'url': '_'},
+            {'format_id': 'vid', 'ext': 'mp4', 'preference': 4, 'url': '_'},
         ]
-        info_dict = {'formats': formats, 'extractor': 'test'}
+        info_dict = _make_result(formats)
 
         ydl = YDL({'format': 'bestaudio'})
         ydl.process_ie_result(info_dict.copy())
@@ -172,10 +182,10 @@ class TestFormatSelection(unittest.TestCase):
         self.assertEqual(downloaded['format_id'], 'audio-low')
 
         formats = [
-            {'format_id': 'vid-low', 'ext': 'mp4', 'preference': 1},
-            {'format_id': 'vid-high', 'ext': 'mp4', 'preference': 2},
+            {'format_id': 'vid-low', 'ext': 'mp4', 'preference': 1, 'url': '_'},
+            {'format_id': 'vid-high', 'ext': 'mp4', 'preference': 2, 'url': '_'},
        ]
-        info_dict = {'formats': formats, 'extractor': 'test'}
+        info_dict = _make_result(formats)
 
         ydl = YDL({'format': 'bestaudio/worstaudio/best'})
         ydl.process_ie_result(info_dict.copy())
@@ -184,11 +194,11 @@ class TestFormatSelection(unittest.TestCase):
     def test_format_selection_video(self):
         formats = [
-            {'format_id': 'dash-video-low', 'ext': 'mp4', 'preference': 1, 'acodec': 'none'},
-            {'format_id': 'dash-video-high', 'ext': 'mp4', 'preference': 2, 'acodec': 'none'},
-            {'format_id': 'vid', 'ext': 'mp4', 'preference': 3},
+            {'format_id': 'dash-video-low', 'ext': 'mp4', 'preference': 1, 'acodec': 'none', 'url': '_'},
+            {'format_id': 'dash-video-high', 'ext': 'mp4', 'preference': 2, 'acodec': 'none', 'url': '_'},
+            {'format_id': 'vid', 'ext': 'mp4', 'preference': 3, 'url': '_'},
         ]
-        info_dict = {'formats': formats, 'extractor': 'test'}
+        info_dict = _make_result(formats)
 
         ydl = YDL({'format': 'bestvideo'})
         ydl.process_ie_result(info_dict.copy())
@@ -217,10 +227,12 @@ class TestFormatSelection(unittest.TestCase):
         for f1id, f2id in zip(order, order[1:]):
             f1 = YoutubeIE._formats[f1id].copy()
             f1['format_id'] = f1id
+            f1['url'] = 'url:' + f1id
             f2 = YoutubeIE._formats[f2id].copy()
             f2['format_id'] = f2id
+            f2['url'] = 'url:' + f2id
 
-            info_dict = {'formats': [f1, f2], 'extractor': 'youtube'}
+            info_dict = _make_result([f1, f2], extractor='youtube')
             ydl = YDL()
             yie = YoutubeIE(ydl)
             yie._sort_formats(info_dict['formats'])
@@ -228,7 +240,7 @@ class TestFormatSelection(unittest.TestCase):
             downloaded = ydl.downloaded_info_dicts[0]
             self.assertEqual(downloaded['format_id'], f1id)
 
-            info_dict = {'formats': [f2, f1], 'extractor': 'youtube'}
+            info_dict = _make_result([f2, f1], extractor='youtube')
             ydl = YDL()
             yie = YoutubeIE(ydl)
             yie._sort_formats(info_dict['formats'])

test/test_all_urls.py

@@ -144,7 +144,24 @@ class TestAllURLsMatching(unittest.TestCase):
         self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['PBS'])
 
     def test_ComedyCentralShows(self):
-        self.assertMatch('http://thedailyshow.cc.com/extended-interviews/xm3fnq/andrew-napolitano-extended-interview', ['ComedyCentralShows'])
+        self.assertMatch(
+            'http://thedailyshow.cc.com/extended-interviews/xm3fnq/andrew-napolitano-extended-interview',
+            ['ComedyCentralShows'])
+        self.assertMatch(
+            'http://thecolbertreport.cc.com/videos/29w6fx/-realhumanpraise-for-fox-news',
+            ['ComedyCentralShows'])
+        self.assertMatch(
+            'http://thecolbertreport.cc.com/videos/gh6urb/neil-degrasse-tyson-pt--1?xrs=eml_col_031114',
+            ['ComedyCentralShows'])
+        self.assertMatch(
+            'http://thedailyshow.cc.com/guests/michael-lewis/3efna8/exclusive---michael-lewis-extended-interview-pt--3',
+            ['ComedyCentralShows'])
+
+    def test_yahoo_https(self):
+        # https://github.com/rg3/youtube-dl/issues/2701
+        self.assertMatch(
+            'https://screen.yahoo.com/smartwatches-latest-wearable-gadgets-163745379-cbs.html',
+            ['Yahoo'])
 
 if __name__ == '__main__':
     unittest.main()

test/test_playlists.py

@@ -42,6 +42,7 @@ from youtube_dl.extractor import (
     ToypicsUserIE,
     XTubeUserIE,
     InstagramUserIE,
+    CSpanIE,
 )
@@ -314,6 +315,18 @@ class TestPlaylists(unittest.TestCase):
         }
         expect_info_dict(self, EXPECTED, test_video)
 
+    def test_CSpan_playlist(self):
+        dl = FakeYDL()
+        ie = CSpanIE(dl)
+        result = ie.extract(
+            'http://www.c-span.org/video/?318608-1/gm-ignition-switch-recall')
+        self.assertIsPlaylist(result)
+        self.assertEqual(result['id'], '342759')
+        self.assertEqual(
+            result['title'], 'General Motors Ignition Switch Recall')
+        whole_duration = sum(e['duration'] for e in result['entries'])
+        self.assertEqual(whole_duration, 14855)
+
 if __name__ == '__main__':
     unittest.main()

test/test_utils.py

@@ -38,6 +38,7 @@ from youtube_dl.utils import (
     xpath_with_ns,
     parse_iso8601,
     strip_jsonp,
+    uppercase_escape,
 )
 
 if sys.version_info < (3, 0):
@@ -279,6 +280,9 @@ class TestUtil(unittest.TestCase):
         d = json.loads(stripped)
         self.assertEqual(d, [{"id": "532cb", "x": 3}])
 
+    def test_uppercase_escpae(self):
+        self.assertEqual(uppercase_escape(u''), u'')
+        self.assertEqual(uppercase_escape(u'\\U0001d550'), u'𝕐')
+
 if __name__ == '__main__':
     unittest.main()
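
The helper exercised by the new test decodes literal `\UXXXXXXXX` escapes, including codepoints above U+FFFF. A minimal sketch consistent with the test above (the real implementation lives in `youtube_dl/utils.py` and may differ in detail):

    import codecs
    import re

    def uppercase_escape(s):
        # Decode each literal \UXXXXXXXX escape via the unicode_escape
        # codec, which handles codepoints outside the BMP.
        unicode_escape = codecs.getdecoder('unicode_escape')
        return re.sub(
            r'\\U[0-9a-fA-F]{8}',
            lambda m: unicode_escape(m.group(0))[0],
            s)

    assert uppercase_escape(u'\\U0001d550') == u'\U0001d550'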

youtube_dl/YoutubeDL.py

@@ -702,6 +702,11 @@ class YoutubeDL(object):
     def process_video_result(self, info_dict, download=True):
         assert info_dict.get('_type', 'video') == 'video'
 
+        if 'id' not in info_dict:
+            raise ExtractorError('Missing "id" field in extractor result')
+        if 'title' not in info_dict:
+            raise ExtractorError('Missing "title" field in extractor result')
+
         if 'playlist' not in info_dict:
             # It isn't part of a playlist
             info_dict['playlist'] = None
@@ -733,6 +738,9 @@ class YoutubeDL(object):
         # We check that all the formats have the format and format_id fields
         for i, format in enumerate(formats):
+            if 'url' not in format:
+                raise ExtractorError('Missing "url" key in result (index %d)' % i)
+
             if format.get('format_id') is None:
                 format['format_id'] = compat_str(i)
             if format.get('format') is None:
@@ -743,7 +751,7 @@ class YoutubeDL(object):
                 )
             # Automatically determine file extension if missing
             if 'ext' not in format:
-                format['ext'] = determine_ext(format['url'])
+                format['ext'] = determine_ext(format['url']).lower()
 
         format_limit = self.params.get('format_limit', None)
         if format_limit:
@@ -868,7 +876,7 @@ class YoutubeDL(object):
         try:
             dn = os.path.dirname(encodeFilename(filename))
-            if dn != '' and not os.path.exists(dn):
+            if dn and not os.path.exists(dn):
                 os.makedirs(dn)
         except (OSError, IOError) as err:
             self.report_error('unable to create directory ' + compat_str(err))

youtube_dl/__init__.py

@@ -52,6 +52,7 @@ __authors__ = (
     'Juan C. Olivares',
     'Mattias Harrysson',
     'phaer',
+    'Sainyam Kapoor',
 )
 
 __license__ = 'Public Domain'
@@ -242,7 +243,7 @@ def parseOpts(overrideArguments=None):
         help='Use the specified HTTP/HTTPS proxy. Pass in an empty string (--proxy "") for direct connection')
     general.add_option('--no-check-certificate', action='store_true', dest='no_check_certificate', default=False, help='Suppress HTTPS certificate validation.')
     general.add_option(
-        '--prefer-insecure', action='store_true', dest='prefer_insecure',
+        '--prefer-insecure', '--prefer-unsecure', action='store_true', dest='prefer_insecure',
         help='Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)')
     general.add_option(
         '--cache-dir', dest='cachedir', default=get_cachedir(), metavar='DIR',

youtube_dl/downloader/common.py

@@ -4,9 +4,10 @@ import sys
 import time
 
 from ..utils import (
+    compat_str,
     encodeFilename,
-    timeconvert,
     format_bytes,
+    timeconvert,
 )
 
@@ -173,7 +174,7 @@ class FileDownloader(object):
                 return
             os.rename(encodeFilename(old_filename), encodeFilename(new_filename))
         except (IOError, OSError) as err:
-            self.report_error(u'unable to rename file: %s' % str(err))
+            self.report_error(u'unable to rename file: %s' % compat_str(err))
 
     def try_utime(self, filename, last_modified_hdr):
         """Try to set the last-modified time of the given file."""

youtube_dl/downloader/f4m.py

@@ -297,6 +297,7 @@ class F4mFD(FileDownloader):
                 break
             frags_filenames.append(frag_filename)
 
+        dest_stream.close()
         self.report_finish(format_bytes(state['downloaded_bytes']), time.time() - start)
 
         self.try_rename(tmpfilename, filename)

youtube_dl/extractor/__init__.py

@@ -32,6 +32,7 @@ from .canal13cl import Canal13clIE
 from .canalplus import CanalplusIE
 from .canalc2 import Canalc2IE
 from .cbs import CBSIE
+from .cbsnews import CBSNewsIE
 from .ceskatelevize import CeskaTelevizeIE
 from .channel9 import Channel9IE
 from .chilloutzone import ChilloutzoneIE
@@ -40,6 +41,7 @@ from .clipfish import ClipfishIE
 from .cliphunter import CliphunterIE
 from .clipsyndicate import ClipsyndicateIE
 from .cmt import CMTIE
+from .cnet import CNETIE
 from .cnn import (
     CNNIE,
     CNNBlogsIE,
@@ -61,6 +63,7 @@ from .dotsub import DotsubIE
 from .dreisat import DreiSatIE
 from .defense import DefenseGouvFrIE
 from .discovery import DiscoveryIE
+from .divxstage import DivxStageIE
 from .dropbox import DropboxIE
 from .ebaumsworld import EbaumsWorldIE
 from .ehow import EHowIE
@@ -83,6 +86,7 @@ from .fktv import (
 )
 from .flickr import FlickrIE
 from .fourtube import FourTubeIE
+from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
 from .francetv import (
     PluzzIE,
@@ -152,10 +156,14 @@ from .mixcloud import MixcloudIE
 from .mpora import MporaIE
 from .mofosex import MofosexIE
 from .mooshare import MooshareIE
+from .morningstar import MorningstarIE
+from .motorsport import MotorsportIE
+from .movshare import MovShareIE
 from .mtv import (
     MTVIE,
     MTVIggyIE,
 )
+from .musicplayon import MusicPlayOnIE
 from .muzu import MuzuTVIE
 from .myspace import MySpaceIE
 from .myspass import MySpassIE
@@ -271,6 +279,7 @@ from .videodetective import VideoDetectiveIE
 from .videolecturesnet import VideoLecturesNetIE
 from .videofyme import VideofyMeIE
 from .videopremium import VideoPremiumIE
+from .videoweed import VideoWeedIE
 from .vimeo import (
     VimeoIE,
     VimeoChannelIE,

youtube_dl/extractor/breakcom.py

@@ -27,9 +27,10 @@ class BreakIE(InfoExtractor):
             webpage, 'info json', flags=re.DOTALL)
         info = json.loads(info_json)
         video_url = info['videoUri']
-        m_youtube = re.search(r'(https?://www\.youtube\.com/watch\?v=.*)', video_url)
-        if m_youtube is not None:
-            return self.url_result(m_youtube.group(1), 'Youtube')
+        youtube_id = info.get('youtubeId')
+        if youtube_id:
+            return self.url_result(youtube_id, 'Youtube')
+
         final_url = video_url + '?' + info['AuthToken']
         return {
             'id': video_id,

youtube_dl/extractor/brightcove.py

@@ -87,7 +87,7 @@ class BrightcoveIE(InfoExtractor):
         object_str = object_str.replace('<--', '<!--')
         object_str = fix_xml_ampersands(object_str)
 
-        object_doc = xml.etree.ElementTree.fromstring(object_str)
+        object_doc = xml.etree.ElementTree.fromstring(object_str.encode('utf-8'))
 
         fv_el = find_xpath_attr(object_doc, './param', 'name', 'flashVars')
         if fv_el is not None:

youtube_dl/extractor/c56.py

@@ -2,39 +2,46 @@
 from __future__ import unicode_literals
 
 import re
-import json
 
 from .common import InfoExtractor
 
 
 class C56IE(InfoExtractor):
-    _VALID_URL = r'https?://((www|player)\.)?56\.com/(.+?/)?(v_|(play_album.+-))(?P<textid>.+?)\.(html|swf)'
+    _VALID_URL = r'https?://(?:(?:www|player)\.)?56\.com/(?:.+?/)?(?:v_|(?:play_album.+-))(?P<textid>.+?)\.(?:html|swf)'
     IE_NAME = '56.com'
     _TEST = {
         'url': 'http://www.56.com/u39/v_OTM0NDA3MTY.html',
-        'file': '93440716.flv',
         'md5': 'e59995ac63d0457783ea05f93f12a866',
         'info_dict': {
+            'id': '93440716',
+            'ext': 'flv',
             'title': '网事知多少 第32期车怒',
+            'duration': 283.813,
         },
     }
 
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url, flags=re.VERBOSE)
         text_id = mobj.group('textid')
-        info_page = self._download_webpage('http://vxml.56.com/json/%s/' % text_id,
-                                           text_id, 'Downloading video info')
-        info = json.loads(info_page)['info']
-        formats = [{
-            'format_id': f['type'],
-            'filesize': int(f['filesize']),
-            'url': f['url']
-        } for f in info['rfiles']]
+
+        page = self._download_json(
+            'http://vxml.56.com/json/%s/' % text_id, text_id, 'Downloading video info')
+
+        info = page['info']
+
+        formats = [
+            {
+                'format_id': f['type'],
+                'filesize': int(f['filesize']),
+                'url': f['url']
+            } for f in info['rfiles']
+        ]
         self._sort_formats(formats)
 
         return {
             'id': info['vid'],
             'title': info['Subject'],
+            'duration': int(info['duration']) / 1000.0,
             'formats': formats,
             'thumbnail': info.get('bimg') or info.get('img'),
         }

youtube_dl/extractor/cbsnews.py

@@ -0,0 +1,87 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+import re
+import json
+
+from .common import InfoExtractor
+
+
+class CBSNewsIE(InfoExtractor):
+    IE_DESC = 'CBS News'
+    _VALID_URL = r'http://(?:www\.)?cbsnews\.com/(?:[^/]+/)+(?P<id>[\da-z_-]+)'
+
+    _TESTS = [
+        {
+            'url': 'http://www.cbsnews.com/news/tesla-and-spacex-elon-musks-industrial-empire/',
+            'info_dict': {
+                'id': 'tesla-and-spacex-elon-musks-industrial-empire',
+                'ext': 'flv',
+                'title': 'Tesla and SpaceX: Elon Musk\'s industrial empire',
+                'thumbnail': 'http://beta.img.cbsnews.com/i/2014/03/30/60147937-2f53-4565-ad64-1bdd6eb64679/60-0330-pelley-640x360.jpg',
+                'duration': 791,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+        },
+        {
+            'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
+            'info_dict': {
+                'id': 'fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack',
+                'ext': 'flv',
+                'title': 'Fort Hood shooting: Army downplays mental illness as cause of attack',
+                'thumbnail': 'http://cbsnews2.cbsistatic.com/hub/i/r/2014/04/04/0c9fbc66-576b-41ca-8069-02d122060dd2/thumbnail/140x90/6dad7a502f88875ceac38202984b6d58/en-0404-werner-replace-640x360.jpg',
+                'duration': 205,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+        },
+    ]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+
+        webpage = self._download_webpage(url, video_id)
+
+        video_info = json.loads(self._html_search_regex(
+            r'(?:<ul class="media-list items" id="media-related-items"><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
+            webpage, 'video JSON info'))
+
+        item = video_info['item'] if 'item' in video_info else video_info
+        title = item.get('articleTitle') or item.get('hed')
+        duration = item.get('duration')
+        thumbnail = item.get('mediaImage') or item.get('thumbnail')
+
+        formats = []
+        for format_id in ['RtmpMobileLow', 'RtmpMobileHigh', 'Hls', 'RtmpDesktop']:
+            uri = item.get('media' + format_id + 'URI')
+            if not uri:
+                continue
+            fmt = {
+                'url': uri,
+                'format_id': format_id,
+            }
+            if uri.startswith('rtmp'):
+                fmt.update({
+                    'app': 'ondemand?auth=cbs',
+                    'play_path': 'mp4:' + uri.split('<break>')[-1],
+                    'player_url': 'http://www.cbsnews.com/[[IMPORT]]/vidtech.cbsinteractive.com/player/3_3_0/CBSI_PLAYER_HD.swf',
+                    'page_url': 'http://www.cbsnews.com',
+                    'ext': 'flv',
+                })
+            elif uri.endswith('.m3u8'):
+                fmt['ext'] = 'mp4'
+            formats.append(fmt)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }

youtube_dl/extractor/cnet.py

@@ -0,0 +1,75 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+)
+
+
+class CNETIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?cnet\.com/videos/(?P<id>[^/]+)/'
+    _TEST = {
+        'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
+        'md5': '041233212a0d06b179c87cbcca1577b8',
+        'info_dict': {
+            'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
+            'ext': 'mp4',
+            'title': 'Hands-on with Microsoft Windows 8.1 Update',
+            'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.',
+            'thumbnail': 're:^http://.*/flmswindows8.jpg$',
+            'uploader_id': 'sarah.mitroff@cbsinteractive.com',
+            'uploader': 'Sarah Mitroff',
+        }
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        display_id = mobj.group('id')
+
+        webpage = self._download_webpage(url, display_id)
+        data_json = self._html_search_regex(
+            r"<div class=\"cnetVideoPlayer\" data-cnet-video-options='([^']+)'",
+            webpage, 'data json')
+        data = json.loads(data_json)
+        vdata = data['video']
+        if not vdata:
+            vdata = data['videos'][0]
+        if not vdata:
+            raise ExtractorError('Cannot find video data')
+
+        video_id = vdata['id']
+        title = vdata['headline']
+        description = vdata.get('dek')
+        thumbnail = vdata.get('image', {}).get('path')
+        author = vdata.get('author')
+        if author:
+            uploader = '%s %s' % (author['firstName'], author['lastName'])
+            uploader_id = author.get('email')
+        else:
+            uploader = None
+            uploader_id = None
+
+        formats = [{
+            'format_id': '%s-%s-%s' % (
+                f['type'], f['format'],
+                int_or_none(f.get('bitrate'), 1000, default='')),
+            'url': f['uri'],
+            'tbr': int_or_none(f.get('bitrate'), 1000),
+        } for f in vdata['files']['data']]
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'formats': formats,
+            'description': description,
+            'uploader': uploader,
+            'uploader_id': uploader_id,
+            'thumbnail': thumbnail,
+        }

youtube_dl/extractor/comedycentral.py

@@ -41,13 +41,15 @@ class ComedyCentralShowsIE(InfoExtractor):
     _VALID_URL = r'''(?x)^(:(?P<shortname>tds|thedailyshow|cr|colbert|colbertnation|colbertreport)
                       |https?://(:www\.)?
                           (?P<showname>thedailyshow|thecolbertreport)\.(?:cc\.)?com/
-                         (full-episodes/(?P<episode>.*)|
+                         (full-episodes/(?:[0-9a-z]{6}/)?(?P<episode>.*)|
                           (?P<clip>
-                              (the-colbert-report-(videos|collections)/(?P<clipID>[0-9]+)/[^/]*/(?P<cntitle>.*?))
-                              |(watch/(?P<date>[^/]*)/(?P<tdstitle>.*)))|
+                              (?:(?:guests/[^/]+|videos)/[^/]+/(?P<videotitle>[^/?#]+))
+                              |(the-colbert-report-(videos|collections)/(?P<clipID>[0-9]+)/[^/]*/(?P<cntitle>.*?))
+                              |(watch/(?P<date>[^/]*)/(?P<tdstitle>.*))
+                          )|
                           (?P<interview>
                               extended-interviews/(?P<interID>[0-9a-z]+)/(?:playlist_tds_extended_)?(?P<interview_title>.*?)(/.*?)?)))
-                     $'''
+                     (?:[?#].*|$)'''
     _TEST = {
         'url': 'http://thedailyshow.cc.com/watch/thu-december-13-2012/kristen-stewart',
         'md5': '4e2f5cb088a83cd8cdb7756132f9739d',
@@ -57,7 +59,7 @@ class ComedyCentralShowsIE(InfoExtractor):
             'upload_date': '20121213',
             'description': 'Kristen Stewart learns to let loose in "On the Road."',
             'uploader': 'thedailyshow',
-            'title': 'thedailyshow-kristen-stewart part 1',
+            'title': 'thedailyshow kristen-stewart part 1',
         }
     }
 
@@ -102,7 +104,9 @@ class ComedyCentralShowsIE(InfoExtractor):
         assert mobj is not None
 
         if mobj.group('clip'):
-            if mobj.group('showname') == 'thedailyshow':
+            if mobj.group('videotitle'):
+                epTitle = mobj.group('videotitle')
+            elif mobj.group('showname') == 'thedailyshow':
                 epTitle = mobj.group('tdstitle')
             else:
                 epTitle = mobj.group('cntitle')
@@ -161,7 +165,7 @@ class ComedyCentralShowsIE(InfoExtractor):
             content = itemEl.find('.//{http://search.yahoo.com/mrss/}content')
             duration = float_or_none(content.attrib.get('duration'))
             mediagen_url = content.attrib['url']
-            guid = itemEl.find('.//guid').text.rpartition(':')[-1]
+            guid = itemEl.find('./guid').text.rpartition(':')[-1]
 
             cdoc = self._download_xml(
                 mediagen_url, epTitle,

youtube_dl/extractor/common.py

@@ -252,6 +252,17 @@ class InfoExtractor(object):
                 outf.write(webpage_bytes)
 
         content = webpage_bytes.decode(encoding, 'replace')
+
+        if (u'<title>Access to this site is blocked</title>' in content and
+                u'Websense' in content[:512]):
+            msg = u'Access to this webpage has been blocked by Websense filtering software in your network.'
+            blocked_iframe = self._html_search_regex(
+                r'<iframe src="([^"]+)"', content,
+                u'Websense information URL', default=None)
+            if blocked_iframe:
+                msg += u' Visit %s for more details' % blocked_iframe
+            raise ExtractorError(msg, expected=True)
+
         return (content, urlh)
 
     def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True):

youtube_dl/extractor/cspan.py

@@ -4,6 +4,7 @@ import re
 
 from .common import InfoExtractor
 from ..utils import (
+    int_or_none,
     unescapeHTML,
     find_xpath_attr,
 )
@@ -54,18 +55,29 @@ class CSpanIE(InfoExtractor):
         info_url = 'http://c-spanvideo.org/videoLibrary/assets/player/ajax-player.php?os=android&html5=program&id=' + video_id
         data = self._download_json(info_url, video_id)
 
-        url = unescapeHTML(data['video']['files'][0]['path']['#text'])
-
-        doc = self._download_xml('http://www.c-span.org/common/services/flashXml.php?programid=' + video_id,
+        doc = self._download_xml(
+            'http://www.c-span.org/common/services/flashXml.php?programid=' + video_id,
             video_id)
 
-        def find_string(s):
-            return find_xpath_attr(doc, './/string', 'name', s).text
+        title = find_xpath_attr(doc, './/string', 'name', 'title').text
+        thumbnail = find_xpath_attr(doc, './/string', 'name', 'poster').text
+
+        files = data['video']['files']
+
+        entries = [{
+            'id': '%s_%d' % (video_id, partnum + 1),
+            'title': (
+                title if len(files) == 1 else
+                '%s part %d' % (title, partnum + 1)),
+            'url': unescapeHTML(f['path']['#text']),
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': int_or_none(f.get('length', {}).get('#text')),
+        } for partnum, f in enumerate(files)]
 
         return {
+            '_type': 'playlist',
+            'entries': entries,
+            'title': title,
             'id': video_id,
-            'title': find_string('title'),
-            'url': url,
-            'description': description,
-            'thumbnail': find_string('poster'),
         }

youtube_dl/extractor/dailymotion.py

@@ -8,7 +8,6 @@ from .subtitles import SubtitlesInfoExtractor
 from ..utils import (
     compat_urllib_request,
     compat_str,
-    get_element_by_attribute,
     get_element_by_id,
     orderedSet,
     str_to_int,
@@ -180,7 +179,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
 class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
     IE_NAME = u'dailymotion:playlist'
     _VALID_URL = r'(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/playlist/(?P<id>.+?)/'
-    _MORE_PAGES_INDICATOR = r'<div class="next">.*?<a.*?href="/playlist/.+?".*?>.*?</a>.*?</div>'
+    _MORE_PAGES_INDICATOR = r'(?s)<div class="pages[^"]*">.*?<a\s+class="[^"]*?icon-arrow_right[^"]*?"'
    _PAGE_TEMPLATE = 'https://www.dailymotion.com/playlist/%s/%s'
 
     def _extract_entries(self, id):
@@ -190,10 +189,9 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
             webpage = self._download_webpage(request,
                                              id, u'Downloading page %s' % pagenum)
 
-            playlist_el = get_element_by_attribute(u'class', u'row video_list', webpage)
-            video_ids.extend(re.findall(r'data-id="(.+?)"', playlist_el))
+            video_ids.extend(re.findall(r'data-id="(.+?)"', webpage))
 
-            if re.search(self._MORE_PAGES_INDICATOR, webpage, re.DOTALL) is None:
+            if re.search(self._MORE_PAGES_INDICATOR, webpage) is None:
                 break
         return [self.url_result('http://www.dailymotion.com/video/%s' % video_id, 'Dailymotion')
                 for video_id in orderedSet(video_ids)]
@@ -212,8 +210,7 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
 class DailymotionUserIE(DailymotionPlaylistIE):
     IE_NAME = u'dailymotion:user'
-    _VALID_URL = r'(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/user/(?P<user>[^/]+)'
-    _MORE_PAGES_INDICATOR = r'<div class="next">.*?<a.*?href="/user/.+?".*?>.*?</a>.*?</div>'
+    _VALID_URL = r'https?://(?:www\.)?dailymotion\.[a-z]{2,3}/user/(?P<user>[^/]+)'
     _PAGE_TEMPLATE = 'http://www.dailymotion.com/user/%s/%s'
 
     def _real_extract(self, url):

youtube_dl/extractor/divxstage.py

@@ -0,0 +1,27 @@
+from __future__ import unicode_literals
+
+from .novamov import NovaMovIE
+
+
+class DivxStageIE(NovaMovIE):
+    IE_NAME = 'divxstage'
+    IE_DESC = 'DivxStage'
+
+    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'divxstage\.(?:eu|net|ch|co|at|ag)'}
+
+    _HOST = 'www.divxstage.eu'
+
+    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
+    _TITLE_REGEX = r'<div class="video_det">\s*<strong>([^<]+)</strong>'
+    _DESCRIPTION_REGEX = r'<div class="video_det">\s*<strong>[^<]+</strong>\s*<p>([^<]+)</p>'
+
+    _TEST = {
+        'url': 'http://www.divxstage.eu/video/57f238e2e5e01',
+        'md5': '63969f6eb26533a1968c4d325be63e72',
+        'info_dict': {
+            'id': '57f238e2e5e01',
+            'ext': 'flv',
+            'title': 'youtubedl test video',
+            'description': 'This is a test video for youtubedl.',
+        }
+    }

youtube_dl/extractor/franceculture.py

@@ -0,0 +1,77 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    compat_parse_qs,
+    compat_urlparse,
+)
+
+
+class FranceCultureIE(InfoExtractor):
+    _VALID_URL = r'(?P<baseurl>http://(?:www\.)?franceculture\.fr/)player/reecouter\?play=(?P<id>[0-9]+)'
+    _TEST = {
+        'url': 'http://www.franceculture.fr/player/reecouter?play=4795174',
+        'info_dict': {
+            'id': '4795174',
+            'ext': 'mp3',
+            'title': 'Rendez-vous au pays des geeks',
+            'vcodec': 'none',
+            'uploader': 'Colette Fellous',
+            'upload_date': '20140301',
+            'duration': 3601,
+            'thumbnail': r're:^http://www\.franceculture\.fr/.*/images/player/Carnet-nomade\.jpg$',
+            'description': 'Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats-Unis dans la S ...',
+        }
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        baseurl = mobj.group('baseurl')
+
+        webpage = self._download_webpage(url, video_id)
+        params_code = self._search_regex(
+            r"<param name='movie' value='/sites/all/modules/rf/rf_player/swf/loader.swf\?([^']+)' />",
+            webpage, 'parameter code')
+        params = compat_parse_qs(params_code)
+        video_url = compat_urlparse.urljoin(baseurl, params['urlAOD'][0])
+
+        title = self._html_search_regex(
+            r'<h1 class="title[^"]+">(.+?)</h1>', webpage, 'title')
+        uploader = self._html_search_regex(
+            r'(?s)<div id="emission".*?<span class="author">(.*?)</span>',
+            webpage, 'uploader', fatal=False)
+        thumbnail_part = self._html_search_regex(
+            r'(?s)<div id="emission".*?<img src="([^"]+)"', webpage,
+            'thumbnail', fatal=False)
+        if thumbnail_part is None:
+            thumbnail = None
+        else:
+            thumbnail = compat_urlparse.urljoin(baseurl, thumbnail_part)
+        description = self._html_search_regex(
+            r'(?s)<p class="desc">(.*?)</p>', webpage, 'description')
+
+        info = json.loads(params['infoData'][0])[0]
+        duration = info.get('media_length')
+        upload_date_candidate = info.get('media_section5')
+        upload_date = (
+            upload_date_candidate
+            if (upload_date_candidate is not None and
+                re.match(r'[0-9]{8}$', upload_date_candidate))
+            else None)
+
+        return {
+            'id': video_id,
+            'url': video_url,
+            'vcodec': 'none' if video_url.lower().endswith('.mp3') else None,
+            'duration': duration,
+            'uploader': uploader,
+            'upload_date': upload_date,
+            'title': title,
+            'thumbnail': thumbnail,
+            'description': description,
+        }

youtube_dl/extractor/generic.py

@@ -82,6 +82,17 @@ class GenericIE(InfoExtractor):
             },
             'add_ie': ['Brightcove'],
         },
+        {
+            'url': 'http://www.championat.com/video/football/v/87/87499.html',
+            'md5': 'fb973ecf6e4a78a67453647444222983',
+            'info_dict': {
+                'id': '3414141473001',
+                'ext': 'mp4',
+                'title': 'Видео. Удаление Дзагоева (ЦСКА)',
+                'description': 'Онлайн-трансляция матча ЦСКА - "Волга"',
+                'uploader': 'Championat',
+            },
+        },
         # Direct link to a video
         {
             'url': 'http://media.w3.org/2010/05/sintel/trailer.mp4',
@@ -103,20 +114,6 @@ class GenericIE(InfoExtractor):
                 'title': '2cc213299525360.mov',  # that's what we get
             },
         },
-        # second style of embedded ooyala videos
-        {
-            'url': 'http://www.smh.com.au/tv/business/show/financial-review-sunday/behind-the-scenes-financial-review-sunday--4350201.html',
-            'info_dict': {
-                'id': '13djJjYjptA1XpPx8r9kuzPyj3UZH0Uk',
-                'ext': 'mp4',
-                'title': 'Behind-the-scenes: Financial Review Sunday ',
-                'description': 'Step inside Channel Nine studios for an exclusive tour of its upcoming financial business show.',
-            },
-            'params': {
-                # m3u8 download
-                'skip_download': True,
-            },
-        },
         # google redirect
         {
             'url': 'http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCUQtwIwAA&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcmQHVoWB5FY&ei=F-sNU-LLCaXk4QT52ICQBQ&usg=AFQjCNEw4hL29zgOohLXvpJ-Bdh2bils1Q&bvm=bv.61965928,d.bGE',
@@ -187,6 +184,17 @@ class GenericIE(InfoExtractor):
                 'description': 'md5:ddb2a40ecd6b6a147e400e535874947b',
             }
         },
+        # Embeded Ustream video
+        {
+            'url': 'http://www.american.edu/spa/pti/nsa-privacy-janus-2014.cfm',
+            'md5': '27b99cdb639c9b12a79bca876a073417',
+            'info_dict': {
+                'id': '45734260',
+                'ext': 'flv',
+                'uploader': 'AU SPA: The NSA and Privacy',
+                'title': 'NSA and Privacy Forum Debate featuring General Hayden and Barton Gellman'
+            }
+        },
         # nowvideo embed hidden behind percent encoding
         {
             'url': 'http://www.waoanime.tv/the-super-dimension-fortress-macross-episode-1/',
@@ -503,17 +511,18 @@ class GenericIE(InfoExtractor):
         if mobj is not None:
             return self.url_result(mobj.group(1), 'Mpora')
 
-        # Look for embedded NovaMov player
+        # Look for embedded NovaMov-based player
         mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>http://(?:(?:embed|www)\.)?novamov\.com/embed\.php.+?)\1', webpage)
+            r'''(?x)<iframe[^>]+?src=(["\'])
+                    (?P<url>http://(?:(?:embed|www)\.)?
+                        (?:novamov\.com|
+                           nowvideo\.(?:ch|sx|eu|at|ag|co)|
+                           videoweed\.(?:es|com)|
+                           movshare\.(?:net|sx|ag)|
+                           divxstage\.(?:eu|net|ch|co|at|ag))
+                        /embed\.php.+?)\1''', webpage)
         if mobj is not None:
-            return self.url_result(mobj.group('url'), 'NovaMov')
-
-        # Look for embedded NowVideo player
-        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>http://(?:(?:embed|www)\.)?nowvideo\.(?:ch|sx|eu)/embed\.php.+?)\1', webpage)
-        if mobj is not None:
-            return self.url_result(mobj.group('url'), 'NowVideo')
+            return self.url_result(mobj.group('url'))
 
         # Look for embedded Facebook player
         mobj = re.search(
@@ -559,6 +568,12 @@ class GenericIE(InfoExtractor):
         if mobj is not None:
             return self.url_result(mobj.group('url'), 'TED')
 
+        # Look for embedded Ustream videos
+        mobj = re.search(
+            r'<iframe[^>]+?src=(["\'])(?P<url>http://www\.ustream\.tv/embed/.+?)\1', webpage)
+        if mobj is not None:
+            return self.url_result(mobj.group('url'), 'Ustream')
+
         # Look for embedded arte.tv player
         mobj = re.search(
             r'<script [^>]*?src="(?P<url>http://www\.arte\.tv/playerv2/embed[^"]+)"',

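As a quick sanity check of the consolidated NovaMov-family pattern above, here is a standalone sketch; the page snippet and the EMBED_RE name are hypothetical, but the regex is the one from the diff:

import re

EMBED_RE = r'''(?x)<iframe[^>]+?src=(["\'])
    (?P<url>http://(?:(?:embed|www)\.)?
        (?:novamov\.com|
           nowvideo\.(?:ch|sx|eu|at|ag|co)|
           videoweed\.(?:es|com)|
           movshare\.(?:net|sx|ag)|
           divxstage\.(?:eu|net|ch|co|at|ag))
        /embed\.php.+?)\1'''

# Hypothetical page snippet; any of the listed hosts would match.
webpage = '<iframe width="600" src="http://embed.nowvideo.sx/embed.php?v=aaaabbbbccccd"></iframe>'
m = re.search(EMBED_RE, webpage)
print(m.group('url'))  # http://embed.nowvideo.sx/embed.php?v=aaaabbbbccccd
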
View File

@@ -1,9 +1,12 @@
+from __future__ import unicode_literals
+
import json
import os
import re

from .common import InfoExtractor
from ..utils import (
+   compat_str,
    ExtractorError,
    formatSeconds,
)
@@ -24,34 +27,31 @@ class JustinTVIE(InfoExtractor):
        /?(?:\#.*)?$
        """
    _JUSTIN_PAGE_LIMIT = 100
-   IE_NAME = u'justin.tv'
+   IE_NAME = 'justin.tv'
+   IE_DESC = 'justin.tv and twitch.tv'
    _TEST = {
-       u'url': u'http://www.twitch.tv/thegamedevhub/b/296128360',
-       u'file': u'296128360.flv',
-       u'md5': u'ecaa8a790c22a40770901460af191c9a',
-       u'info_dict': {
-           u"upload_date": u"20110927",
-           u"uploader_id": 25114803,
-           u"uploader": u"thegamedevhub",
-           u"title": u"Beginner Series - Scripting With Python Pt.1"
+       'url': 'http://www.twitch.tv/thegamedevhub/b/296128360',
+       'md5': 'ecaa8a790c22a40770901460af191c9a',
+       'info_dict': {
+           'id': '296128360',
+           'ext': 'flv',
+           'upload_date': '20110927',
+           'uploader_id': 25114803,
+           'uploader': 'thegamedevhub',
+           'title': 'Beginner Series - Scripting With Python Pt.1'
        }
    }

-   def report_download_page(self, channel, offset):
-       """Report attempt to download a single page of videos."""
-       self.to_screen(u'%s: Downloading video information from %d to %d' %
-                      (channel, offset, offset + self._JUSTIN_PAGE_LIMIT))
-
    # Return count of items, list of *valid* items
    def _parse_page(self, url, video_id):
        info_json = self._download_webpage(url, video_id,
-                                          u'Downloading video info JSON',
-                                          u'unable to download video info JSON')
+                                          'Downloading video info JSON',
+                                          'unable to download video info JSON')

        response = json.loads(info_json)
        if type(response) != list:
            error_text = response.get('error', 'unknown error')
-           raise ExtractorError(u'Justin.tv API: %s' % error_text)
+           raise ExtractorError('Justin.tv API: %s' % error_text)
        info = []
        for clip in response:
            video_url = clip['video_file_url']
@@ -62,7 +62,7 @@ class JustinTVIE(InfoExtractor):
            video_id = clip['id']
            video_title = clip.get('title', video_id)
            info.append({
-               'id': video_id,
+               'id': compat_str(video_id),
                'url': video_url,
                'title': video_title,
                'uploader': clip.get('channel_name', video_uploader_id),
@@ -74,8 +74,6 @@ class JustinTVIE(InfoExtractor):
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-       if mobj is None:
-           raise ExtractorError(u'invalid URL: %s' % url)

        api_base = 'http://api.justin.tv'
        paged = False
@@ -89,40 +87,41 @@ class JustinTVIE(InfoExtractor):
            webpage = self._download_webpage(url, chapter_id)
            m = re.search(r'PP\.archive_id = "([0-9]+)";', webpage)
            if not m:
-               raise ExtractorError(u'Cannot find archive of a chapter')
+               raise ExtractorError('Cannot find archive of a chapter')
            archive_id = m.group(1)

            api = api_base + '/broadcast/by_chapter/%s.xml' % chapter_id
-           doc = self._download_xml(api, chapter_id,
-                                    note=u'Downloading chapter information',
-                                    errnote=u'Chapter information download failed')
+           doc = self._download_xml(
+               api, chapter_id,
+               note='Downloading chapter information',
+               errnote='Chapter information download failed')
            for a in doc.findall('.//archive'):
                if archive_id == a.find('./id').text:
                    break
            else:
-               raise ExtractorError(u'Could not find chapter in chapter information')
+               raise ExtractorError('Could not find chapter in chapter information')

            video_url = a.find('./video_file_url').text
-           video_ext = video_url.rpartition('.')[2] or u'flv'
+           video_ext = video_url.rpartition('.')[2] or 'flv'

-           chapter_api_url = u'https://api.twitch.tv/kraken/videos/c' + chapter_id
-           chapter_info_json = self._download_webpage(chapter_api_url, u'c' + chapter_id,
-                                                      note='Downloading chapter metadata',
-                                                      errnote='Download of chapter metadata failed')
-           chapter_info = json.loads(chapter_info_json)
+           chapter_api_url = 'https://api.twitch.tv/kraken/videos/c' + chapter_id
+           chapter_info = self._download_json(
+               chapter_api_url, 'c' + chapter_id,
+               note='Downloading chapter metadata',
+               errnote='Download of chapter metadata failed')

            bracket_start = int(doc.find('.//bracket_start').text)
            bracket_end = int(doc.find('.//bracket_end').text)

            # TODO determine start (and probably fix up file)
            #  youtube-dl -v http://www.twitch.tv/firmbelief/c/1757457
-           #video_url += u'?start=' + TODO:start_timestamp
+           #video_url += '?start=' + TODO:start_timestamp
            # bracket_start is 13290, but we want 51670615
-           self._downloader.report_warning(u'Chapter detected, but we can just download the whole file. '
-                                           u'Chapter starts at %s and ends at %s' % (formatSeconds(bracket_start), formatSeconds(bracket_end)))
+           self._downloader.report_warning('Chapter detected, but we can just download the whole file. '
+                                           'Chapter starts at %s and ends at %s' % (formatSeconds(bracket_start), formatSeconds(bracket_end)))

            info = {
-               'id': u'c' + chapter_id,
+               'id': 'c' + chapter_id,
                'url': video_url,
                'ext': video_ext,
                'title': chapter_info['title'],
@@ -131,14 +130,12 @@ class JustinTVIE(InfoExtractor):
                'uploader': chapter_info['channel']['display_name'],
                'uploader_id': chapter_info['channel']['name'],
            }
-           return [info]
+           return info
        else:
            video_id = mobj.group('videoid')
            api = api_base + '/broadcast/by_archive/%s.json' % video_id

-           self.report_extraction(video_id)
-           info = []
+           entries = []
            offset = 0
            limit = self._JUSTIN_PAGE_LIMIT
            while True:
@@ -146,8 +143,12 @@ class JustinTVIE(InfoExtractor):
                self.report_download_page(video_id, offset)
                page_url = api + ('?offset=%d&limit=%d' % (offset, limit))
                page_count, page_info = self._parse_page(page_url, video_id)
-               info.extend(page_info)
+               entries.extend(page_info)
                if not paged or page_count != limit:
                    break
                offset += limit
-           return info
+           return {
+               '_type': 'playlist',
+               'id': video_id,
+               'entries': entries,
+           }

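The rewrite above switches the multi-page case from returning a bare list to returning a playlist dict. A minimal sketch of that paged-collection pattern, with a hypothetical fetch_page() standing in for the Justin.tv API call:

PAGE_LIMIT = 100

def fetch_page(offset, limit):
    # Stand-in for the API call; pretends there are 250 items in total.
    items = list(range(offset, min(offset + limit, 250)))
    return len(items), items

def collect(video_id='296128360'):
    entries = []
    offset = 0
    while True:
        count, items = fetch_page(offset, PAGE_LIMIT)
        entries.extend(items)
        if count != PAGE_LIMIT:  # a short page means we reached the end
            break
        offset += PAGE_LIMIT
    return {
        '_type': 'playlist',
        'id': video_id,
        'entries': entries,
    }

print(len(collect()['entries']))  # 250
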
View File

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
import os
import re
@@ -11,22 +13,22 @@ from ..aes import (
    aes_decrypt_text
)

class KeezMoviesIE(InfoExtractor):
-   _VALID_URL = r'^(?:https?://)?(?:www\.)?(?P<url>keezmovies\.com/video/.+?(?P<videoid>[0-9]+))(?:[/?&]|$)'
+   _VALID_URL = r'^https?://(?:www\.)?keezmovies\.com/video/.+?(?P<videoid>[0-9]+)(?:[/?&]|$)'
    _TEST = {
-       u'url': u'http://www.keezmovies.com/video/petite-asian-lady-mai-playing-in-bathtub-1214711',
-       u'file': u'1214711.mp4',
-       u'md5': u'6e297b7e789329923fcf83abb67c9289',
-       u'info_dict': {
-           u"title": u"Petite Asian Lady Mai Playing In Bathtub",
-           u"age_limit": 18,
+       'url': 'http://www.keezmovies.com/video/petite-asian-lady-mai-playing-in-bathtub-1214711',
+       'file': '1214711.mp4',
+       'md5': '6e297b7e789329923fcf83abb67c9289',
+       'info_dict': {
+           'title': 'Petite Asian Lady Mai Playing In Bathtub',
+           'age_limit': 18,
        }
    }

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('videoid')
-       url = 'http://www.' + mobj.group('url')

        req = compat_urllib_request.Request(url)
        req.add_header('Cookie', 'age_verified=1')
@@ -38,10 +40,10 @@ class KeezMoviesIE(InfoExtractor):
            embedded_url = mobj.group(1)
            return self.url_result(embedded_url)

-       video_title = self._html_search_regex(r'<h1 [^>]*>([^<]+)', webpage, u'title')
-       video_url = compat_urllib_parse.unquote(self._html_search_regex(r'video_url=(.+?)&amp;', webpage, u'video_url'))
-       if webpage.find('encrypted=true')!=-1:
-           password = self._html_search_regex(r'video_title=(.+?)&amp;', webpage, u'password')
+       video_title = self._html_search_regex(r'<h1 [^>]*>([^<]+)', webpage, 'title')
+       video_url = compat_urllib_parse.unquote(self._html_search_regex(r'video_url=(.+?)&amp;', webpage, 'video_url'))
+       if 'encrypted=true' in webpage:
+           password = self._html_search_regex(r'video_title=(.+?)&amp;', webpage, 'password')
            video_url = aes_decrypt_text(video_url, password, 32).decode('utf-8')
        path = compat_urllib_parse_urlparse(video_url).path
        extension = os.path.splitext(path)[1][1:]

View File

@@ -0,0 +1,47 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor


class MorningstarIE(InfoExtractor):
    IE_DESC = 'morningstar.com'
    _VALID_URL = r'https?://(?:www\.)?morningstar\.com/cover/videocenter\.aspx\?id=(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://www.morningstar.com/cover/videocenter.aspx?id=615869',
        'md5': '6c0acface7a787aadc8391e4bbf7b0f5',
        'info_dict': {
            'id': '615869',
            'ext': 'mp4',
            'title': 'Get Ahead of the Curve on 2013 Taxes',
            'description': "Vanguard's Joel Dickson on managing higher tax rates for high-income earners and fund capital-gain distributions in 2013.",
            'thumbnail': r're:^https?://.*m(?:orning)?star\.com/.+thumb\.jpg$'
        }
    }

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')

        webpage = self._download_webpage(url, video_id)

        title = self._html_search_regex(
            r'<h1 id="titleLink">(.*?)</h1>', webpage, 'title')
        video_url = self._html_search_regex(
            r'<input type="hidden" id="hidVideoUrl" value="([^"]+)"',
            webpage, 'video URL')
        thumbnail = self._html_search_regex(
            r'<input type="hidden" id="hidSnapshot" value="([^"]+)"',
            webpage, 'thumbnail', fatal=False)
        description = self._html_search_regex(
            r'<div id="mstarDeck".*?>(.*?)</div>',
            webpage, 'description', fatal=False)

        return {
            'id': video_id,
            'title': title,
            'url': video_url,
            'thumbnail': thumbnail,
            'description': description,
        }

View File

@@ -0,0 +1,63 @@
# coding: utf-8
from __future__ import unicode_literals

import hashlib
import json
import re
import time

from .common import InfoExtractor
from ..utils import (
    compat_parse_qs,
    compat_str,
    int_or_none,
)


class MotorsportIE(InfoExtractor):
    IE_DESC = 'motorsport.com'
    _VALID_URL = r'http://www\.motorsport\.com/[^/?#]+/video/(?:[^/?#]+/)(?P<id>[^/]+)/(?:$|[?#])'
    _TEST = {
        'url': 'http://www.motorsport.com/f1/video/main-gallery/red-bull-racing-2014-rules-explained/',
        'md5': '5592cb7c5005d9b2c163df5ac3dc04e4',
        'info_dict': {
            'id': '7063',
            'ext': 'mp4',
            'title': 'Red Bull Racing: 2014 Rules Explained',
            'duration': 207,
            'description': 'A new clip from Red Bull sees Daniel Ricciardo and Sebastian Vettel explain the 2014 Formula One regulations which are arguably the most complex the sport has ever seen.',
            'uploader': 'rainiere',
            'thumbnail': r're:^http://.*motorsport\.com/.+\.jpg$'
        }
    }

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        display_id = mobj.group('id')

        webpage = self._download_webpage(url, display_id)

        flashvars_code = self._html_search_regex(
            r'<embed id="player".*?flashvars="([^"]+)"', webpage, 'flashvars')
        flashvars = compat_parse_qs(flashvars_code)
        params = json.loads(flashvars['parameters'][0])

        e = compat_str(int(time.time()) + 24 * 60 * 60)
        base_video_url = params['location'] + '?e=' + e
        s = 'h3hg713fh32'
        h = hashlib.md5((s + base_video_url).encode('utf-8')).hexdigest()
        video_url = base_video_url + '&h=' + h

        uploader = self._html_search_regex(
            r'(?s)<span class="label">Video by: </span>(.*?)</a>', webpage,
            'uploader', fatal=False)

        return {
            'id': params['video_id'],
            'display_id': display_id,
            'title': params['title'],
            'url': video_url,
            'description': params.get('description'),
            'thumbnail': params.get('main_thumb'),
            'duration': int_or_none(params.get('duration')),
            'uploader': uploader,
        }

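The new extractor signs video URLs itself: it appends an expiry timestamp, then an MD5 over a constant salt plus the URL. A standalone sketch of that scheme; the location URL is hypothetical, the salt is the one from the extractor:

import hashlib
import time

def sign_video_url(location, salt='h3hg713fh32'):
    # Expiry 24 hours out, as in the extractor above.
    e = str(int(time.time()) + 24 * 60 * 60)
    base_video_url = location + '?e=' + e
    h = hashlib.md5((salt + base_video_url).encode('utf-8')).hexdigest()
    return base_video_url + '&h=' + h

print(sign_video_url('http://example.com/video/7063.mp4'))  # hypothetical location
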
View File

@@ -0,0 +1,27 @@
from __future__ import unicode_literals

from .novamov import NovaMovIE


class MovShareIE(NovaMovIE):
    IE_NAME = 'movshare'
    IE_DESC = 'MovShare'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'movshare\.(?:net|sx|ag)'}

    _HOST = 'www.movshare.net'

    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
    _TITLE_REGEX = r'<strong>Title:</strong> ([^<]+)</p>'
    _DESCRIPTION_REGEX = r'<strong>Description:</strong> ([^<]+)</p>'

    _TEST = {
        'url': 'http://www.movshare.net/video/559e28be54d96',
        'md5': 'abd31a2132947262c50429e1d16c1bfd',
        'info_dict': {
            'id': '559e28be54d96',
            'ext': 'flv',
            'title': 'dissapeared image',
            'description': 'optical illusion dissapeared image magic illusion',
        }
    }

View File

@@ -0,0 +1,75 @@
# encoding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..utils import int_or_none


class MusicPlayOnIE(InfoExtractor):
    _VALID_URL = r'https?://(?:.+?\.)?musicplayon\.com/play(?:-touch)?\?(?:v|pl=100&play)=(?P<id>\d+)'

    _TEST = {
        'url': 'http://en.musicplayon.com/play?v=433377',
        'info_dict': {
            'id': '433377',
            'ext': 'mp4',
            'title': 'Rick Ross - Interview On Chelsea Lately (2014)',
            'description': 'Rick Ross Interview On Chelsea Lately',
            'duration': 342,
            'uploader': 'ultrafish',
        },
        'params': {
            # m3u8 download
            'skip_download': True,
        },
    }

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')

        page = self._download_webpage(url, video_id)

        title = self._og_search_title(page)
        description = self._og_search_description(page)
        thumbnail = self._og_search_thumbnail(page)
        duration = self._html_search_meta('video:duration', page, 'duration', fatal=False)
        view_count = self._og_search_property('count', page, fatal=False)
        uploader = self._html_search_regex(
            r'<div>by&nbsp;<a href="[^"]+" class="purple">([^<]+)</a></div>', page, 'uploader', fatal=False)

        formats = [
            {
                'url': 'http://media0-eu-nl.musicplayon.com/stream-mobile?id=%s&type=.mp4' % video_id,
                'ext': 'mp4',
            }
        ]

        manifest = self._download_webpage(
            'http://en.musicplayon.com/manifest.m3u8?v=%s' % video_id, video_id, 'Downloading manifest')

        for entry in manifest.split('#')[1:]:
            if entry.startswith('EXT-X-STREAM-INF:'):
                meta, url, _ = entry.split('\n')
                params = dict(param.split('=') for param in meta.split(',')[1:])
                formats.append({
                    'url': url,
                    'ext': 'mp4',
                    'tbr': int(params['BANDWIDTH']),
                    'width': int(params['RESOLUTION'].split('x')[0]),
                    'height': int(params['RESOLUTION'].split('x')[-1]),
                    'format_note': params['NAME'].replace('"', '').strip(),
                })

        return {
            'id': video_id,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'uploader': uploader,
            'duration': int_or_none(duration),
            'view_count': int_or_none(view_count),
            'formats': formats,
        }

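A standalone sketch of the hand-rolled #EXT-X-STREAM-INF parsing used above, run on a hypothetical two-variant manifest. Note the idiom assumes attribute values contain no ',' or '=' themselves, which holds for the fields read here:

manifest = (
    '#EXTM3U\n'
    '#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000,RESOLUTION=1280x720,NAME="720p"\n'
    'http://example.com/stream-720.m3u8\n'
    '#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=640000,RESOLUTION=854x480,NAME="480p"\n'
    'http://example.com/stream-480.m3u8\n'
)

formats = []
for entry in manifest.split('#')[1:]:
    if entry.startswith('EXT-X-STREAM-INF:'):
        meta, url, _ = entry.split('\n')
        # Skip the PROGRAM-ID token, turn the rest into a dict.
        params = dict(param.split('=') for param in meta.split(',')[1:])
        width, height = params['RESOLUTION'].split('x')
        formats.append({
            'url': url,
            'tbr': int(params['BANDWIDTH']),
            'width': int(width),
            'height': int(height),
            'format_note': params['NAME'].replace('"', ''),
        })

print([f['format_note'] for f in formats])  # ['720p', '480p']
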
View File

@@ -13,7 +13,8 @@ class NovaMovIE(InfoExtractor):
    IE_NAME = 'novamov'
    IE_DESC = 'NovaMov'

-   _VALID_URL = r'http://(?:(?:www\.)?%(host)s/video/|(?:(?:embed|www)\.)%(host)s/embed\.php\?(?:.*?&)?v=)(?P<videoid>[a-z\d]{13})' % {'host': 'novamov\.com'}
+   _VALID_URL_TEMPLATE = r'http://(?:(?:www\.)?%(host)s/(?:file|video)/|(?:(?:embed|www)\.)%(host)s/embed\.php\?(?:.*?&)?v=)(?P<id>[a-z\d]{13})'
+   _VALID_URL = _VALID_URL_TEMPLATE % {'host': 'novamov\.com'}

    _HOST = 'www.novamov.com'
@@ -36,18 +37,17 @@ class NovaMovIE(InfoExtractor):
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-       video_id = mobj.group('videoid')
+       video_id = mobj.group('id')

        page = self._download_webpage(
            'http://%s/video/%s' % (self._HOST, video_id), video_id, 'Downloading video page')

        if re.search(self._FILE_DELETED_REGEX, page) is not None:
-           raise ExtractorError(u'Video %s does not exist' % video_id, expected=True)
+           raise ExtractorError('Video %s does not exist' % video_id, expected=True)

        filekey = self._search_regex(self._FILEKEY_REGEX, page, 'filekey')

        title = self._html_search_regex(self._TITLE_REGEX, page, 'title', fatal=False)
        description = self._html_search_regex(self._DESCRIPTION_REGEX, page, 'description', default='', fatal=False)

        api_response = self._download_webpage(

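The point of the new _VALID_URL_TEMPLATE is that each NovaMov clone only has to supply its host pattern; the file/video paths, embed URLs, and the 13-character ID group are shared. A standalone sketch using the template from the diff:

import re

_VALID_URL_TEMPLATE = r'http://(?:(?:www\.)?%(host)s/(?:file|video)/|(?:(?:embed|www)\.)%(host)s/embed\.php\?(?:.*?&)?v=)(?P<id>[a-z\d]{13})'

# Each NovaMov-style site only has to supply its host pattern:
videoweed_url = _VALID_URL_TEMPLATE % {'host': r'videoweed\.(?:es|com)'}
m = re.match(videoweed_url, 'http://www.videoweed.es/file/b42178afbea14')
print(m.group('id'))  # b42178afbea14
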
View File

@@ -7,7 +7,7 @@ class NowVideoIE(NovaMovIE):
    IE_NAME = 'nowvideo'
    IE_DESC = 'NowVideo'

-   _VALID_URL = r'http://(?:(?:www\.)?%(host)s/video/|(?:(?:embed|www)\.)%(host)s/embed\.php\?(?:.*?&)?v=)(?P<videoid>[a-z\d]{13})' % {'host': 'nowvideo\.(?:ch|sx|eu)'}
+   _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'nowvideo\.(?:ch|sx|eu|at|ag|co)'}

    _HOST = 'www.nowvideo.ch'

View File

@@ -1,44 +1,81 @@
from __future__ import unicode_literals

import re
+import json

from .common import InfoExtractor
-from ..utils import compat_urllib_parse
+from ..utils import int_or_none


class PornHdIE(InfoExtractor):
-   _VALID_URL = r'(?:http://)?(?:www\.)?pornhd\.com/(?:[a-z]{2,4}/)?videos/(?P<video_id>[0-9]+)/(?P<video_title>.+)'
+   _VALID_URL = r'http://(?:www\.)?pornhd\.com/(?:[a-z]{2,4}/)?videos/(?P<id>\d+)'
    _TEST = {
        'url': 'http://www.pornhd.com/videos/1962/sierra-day-gets-his-cum-all-over-herself-hd-porn-video',
-       'file': '1962.flv',
-       'md5': '35272469887dca97abd30abecc6cdf75',
+       'md5': '956b8ca569f7f4d8ec563e2c41598441',
        'info_dict': {
-           "title": "sierra-day-gets-his-cum-all-over-herself-hd-porn-video",
-           "age_limit": 18,
+           'id': '1962',
+           'ext': 'mp4',
+           'title': 'Sierra loves doing laundry',
+           'description': 'md5:8ff0523848ac2b8f9b065ba781ccf294',
+           'age_limit': 18,
        }
    }

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-       video_id = mobj.group('video_id')
-       video_title = mobj.group('video_title')
+       video_id = mobj.group('id')

        webpage = self._download_webpage(url, video_id)

-       next_url = self._html_search_regex(
-           r'&hd=(http.+?)&', webpage, 'video URL')
-       next_url = compat_urllib_parse.unquote(next_url)
-
-       video_url = self._download_webpage(
-           next_url, video_id, note='Retrieving video URL',
-           errnote='Could not retrieve video URL')
-       age_limit = 18
+       title = self._og_search_title(webpage)
+       TITLE_SUFFIX = ' porn HD Video | PornHD.com '
+       if title.endswith(TITLE_SUFFIX):
+           title = title[:-len(TITLE_SUFFIX)]
+
+       description = self._html_search_regex(
+           r'<div class="description">([^<]+)</div>', webpage, 'description', fatal=False)
+       view_count = int_or_none(self._html_search_regex(
+           r'(\d+) views </span>', webpage, 'view count', fatal=False))
+
+       formats = [
+           {
+               'url': format_url,
+               'ext': format.lower(),
+               'format_id': '%s-%s' % (format.lower(), quality.lower()),
+               'quality': 1 if quality.lower() == 'high' else 0,
+           } for format, quality, format_url in re.findall(
+               r'var __video([\da-zA-Z]+?)(Low|High)StreamUrl = \'(http://.+?)\?noProxy=1\'', webpage)
+       ]
+
+       mobj = re.search(r'flashVars = (?P<flashvars>{.+?});', webpage)
+       if mobj:
+           flashvars = json.loads(mobj.group('flashvars'))
+           formats.extend([
+               {
+                   'url': flashvars['hashlink'].replace('?noProxy=1', ''),
+                   'ext': 'flv',
+                   'format_id': 'flv-low',
+                   'quality': 0,
+               },
+               {
+                   'url': flashvars['hd'].replace('?noProxy=1', ''),
+                   'ext': 'flv',
+                   'format_id': 'flv-high',
+                   'quality': 1,
+               }
+           ])
+           thumbnail = flashvars['urlWallpaper']
+       else:
+           thumbnail = self._og_search_thumbnail(webpage)
+
+       self._sort_formats(formats)

        return {
            'id': video_id,
-           'url': video_url,
-           'ext': 'flv',
-           'title': video_title,
-           'age_limit': age_limit,
+           'title': title,
+           'description': description,
+           'thumbnail': thumbnail,
+           'view_count': view_count,
+           'formats': formats,
+           'age_limit': 18,
        }

View File

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
import re
import os
@@ -5,45 +7,50 @@ from .common import InfoExtractor

class PyvideoIE(InfoExtractor):
-   _VALID_URL = r'(?:http://)?(?:www\.)?pyvideo\.org/video/(?P<id>\d+)/(.*)'
+   _VALID_URL = r'http://(?:www\.)?pyvideo\.org/video/(?P<id>\d+)/(.*)'

-   _TESTS = [{
-       u'url': u'http://pyvideo.org/video/1737/become-a-logging-expert-in-30-minutes',
-       u'file': u'24_4WWkSmNo.mp4',
-       u'md5': u'de317418c8bc76b1fd8633e4f32acbc6',
-       u'info_dict': {
-           u"title": u"Become a logging expert in 30 minutes",
-           u"description": u"md5:9665350d466c67fb5b1598de379021f7",
-           u"upload_date": u"20130320",
-           u"uploader": u"NextDayVideo",
-           u"uploader_id": u"NextDayVideo",
-       },
-       u'add_ie': ['Youtube'],
-   },
-   {
-       u'url': u'http://pyvideo.org/video/2542/gloriajw-spotifywitherikbernhardsson182m4v',
-       u'md5': u'5fe1c7e0a8aa5570330784c847ff6d12',
-       u'info_dict': {
-           u'id': u'2542',
-           u'ext': u'm4v',
-           u'title': u'Gloriajw-SpotifyWithErikBernhardsson182',
-       },
-   },
+   _TESTS = [
+       {
+           'url': 'http://pyvideo.org/video/1737/become-a-logging-expert-in-30-minutes',
+           'md5': 'de317418c8bc76b1fd8633e4f32acbc6',
+           'info_dict': {
+               'id': '24_4WWkSmNo',
+               'ext': 'mp4',
+               'title': 'Become a logging expert in 30 minutes',
+               'description': 'md5:9665350d466c67fb5b1598de379021f7',
+               'upload_date': '20130320',
+               'uploader': 'NextDayVideo',
+               'uploader_id': 'NextDayVideo',
+           },
+           'add_ie': ['Youtube'],
+       },
+       {
+           'url': 'http://pyvideo.org/video/2542/gloriajw-spotifywitherikbernhardsson182m4v',
+           'md5': '5fe1c7e0a8aa5570330784c847ff6d12',
+           'info_dict': {
+               'id': '2542',
+               'ext': 'm4v',
+               'title': 'Gloriajw-SpotifyWithErikBernhardsson182',
+           },
+       },
    ]

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')

        webpage = self._download_webpage(url, video_id)

        m_youtube = re.search(r'(https?://www\.youtube\.com/watch\?v=.*)', webpage)
        if m_youtube is not None:
            return self.url_result(m_youtube.group(1), 'Youtube')

-       title = self._html_search_regex(r'<div class="section">.*?<h3>([^>]+?)</h3>',
-           webpage, u'title', flags=re.DOTALL)
-       video_url = self._search_regex([r'<source src="(.*?)"',
-           r'<dt>Download</dt>.*?<a href="(.+?)"'],
-           webpage, u'video url', flags=re.DOTALL)
+       title = self._html_search_regex(
+           r'<div class="section">.*?<h3>([^>]+?)</h3>', webpage, 'title', flags=re.DOTALL)
+       video_url = self._search_regex(
+           [r'<source src="(.*?)"', r'<dt>Download</dt>.*?<a href="(.+?)"'],
+           webpage, 'video url', flags=re.DOTALL)

        return {
            'id': video_id,
            'title': os.path.splitext(title)[0],

View File

@@ -18,7 +18,7 @@ class Ro220IE(InfoExtractor):
        'md5': '03af18b73a07b4088753930db7a34add',
        'info_dict': {
            "title": "Luati-le Banii sez 4 ep 1",
-           "description": "Iata-ne reveniti dupa o binemeritata vacanta. Va astept si pe Facebook cu pareri si comentarii.",
+           "description": "re:^Iata-ne reveniti dupa o binemeritata vacanta\. +Va astept si pe Facebook cu pareri si comentarii.$",
        }
    }

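The expected description now starts with 're:', which the test harness treats as a regular expression to match rather than an exact string, so one space or two both pass. A simplified sketch of that kind of check; expect_field is hypothetical, not the harness's actual helper:

import re

def expect_field(got, expected):
    # Values prefixed with 're:' are matched as regular expressions.
    if isinstance(expected, str) and expected.startswith('re:'):
        assert re.match(expected[len('re:'):], got)
    else:
        assert got == expected

expect_field(
    'Iata-ne reveniti dupa o binemeritata vacanta.  Va astept si pe Facebook cu pareri si comentarii.',
    're:^Iata-ne reveniti dupa o binemeritata vacanta\\. +Va astept si pe Facebook cu pareri si comentarii.$')
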
View File

@@ -9,46 +9,136 @@ from ..utils import (
    parse_duration,
    parse_iso8601,
    unescapeHTML,
+   compat_str,
)


class RTSIE(InfoExtractor):
    IE_DESC = 'RTS.ch'
-   _VALID_URL = r'^https?://(?:www\.)?rts\.ch/archives/tv/[^/]+/(?P<id>[0-9]+)-.*?\.html'
+   _VALID_URL = r'^https?://(?:www\.)?rts\.ch/(?:[^/]+/){2,}(?P<id>[0-9]+)-.*?\.html'

-   _TEST = {
-       'url': 'http://www.rts.ch/archives/tv/divers/3449373-les-enfants-terribles.html',
-       'md5': '753b877968ad8afaeddccc374d4256a5',
-       'info_dict': {
-           'id': '3449373',
-           'ext': 'mp4',
-           'duration': 1488,
-           'title': 'Les Enfants Terribles',
-           'description': 'France Pommier et sa soeur Luce Feral, les deux filles de ce groupe de 5.',
-           'uploader': 'Divers',
-           'upload_date': '19680921',
-           'timestamp': -40280400,
-           'thumbnail': 're:^https?://.*\.image'
-       },
-   }
+   _TESTS = [
+       {
+           'url': 'http://www.rts.ch/archives/tv/divers/3449373-les-enfants-terribles.html',
+           'md5': '753b877968ad8afaeddccc374d4256a5',
+           'info_dict': {
+               'id': '3449373',
+               'ext': 'mp4',
+               'duration': 1488,
+               'title': 'Les Enfants Terribles',
+               'description': 'France Pommier et sa soeur Luce Feral, les deux filles de ce groupe de 5.',
+               'uploader': 'Divers',
+               'upload_date': '19680921',
+               'timestamp': -40280400,
+               'thumbnail': 're:^https?://.*\.image'
+           },
+       },
+       {
+           'url': 'http://www.rts.ch/emissions/passe-moi-les-jumelles/5624067-entre-ciel-et-mer.html',
+           'md5': 'c148457a27bdc9e5b1ffe081a7a8337b',
+           'info_dict': {
+               'id': '5624067',
+               'ext': 'mp4',
+               'duration': 3720,
+               'title': 'Les yeux dans les cieux - Mon homard au Canada',
+               'description': 'md5:d22ee46f5cc5bac0912e5a0c6d44a9f7',
+               'uploader': 'Passe-moi les jumelles',
+               'upload_date': '20140404',
+               'timestamp': 1396635300,
+               'thumbnail': 're:^https?://.*\.image'
+           },
+       },
+       {
+           'url': 'http://www.rts.ch/video/sport/hockey/5745975-1-2-kloten-fribourg-5-2-second-but-pour-gotteron-par-kwiatowski.html',
+           'md5': 'b4326fecd3eb64a458ba73c73e91299d',
+           'info_dict': {
+               'id': '5745975',
+               'ext': 'mp4',
+               'duration': 48,
+               'title': '1/2, Kloten - Fribourg (5-2): second but pour Gottéron par Kwiatowski',
+               'description': 'Hockey - Playoff',
+               'uploader': 'Hockey',
+               'upload_date': '20140403',
+               'timestamp': 1396556882,
+               'thumbnail': 're:^https?://.*\.image'
+           },
+           'skip': 'Blocked outside Switzerland',
+       },
+       {
+           'url': 'http://www.rts.ch/video/info/journal-continu/5745356-londres-cachee-par-un-epais-smog.html',
+           'md5': '9bb06503773c07ce83d3cbd793cebb91',
+           'info_dict': {
+               'id': '5745356',
+               'ext': 'mp4',
+               'duration': 33,
+               'title': 'Londres cachée par un épais smog',
+               'description': 'Un important voile de smog recouvre Londres depuis mercredi, provoqué par la pollution et du sable du Sahara.',
+               'uploader': 'Le Journal en continu',
+               'upload_date': '20140403',
+               'timestamp': 1396537322,
+               'thumbnail': 're:^https?://.*\.image'
+           },
+       },
+       {
+           'url': 'http://www.rts.ch/audio/couleur3/programmes/la-belle-video-de-stephane-laurenceau/5706148-urban-hippie-de-damien-krisl-03-04-2014.html',
+           'md5': 'dd8ef6a22dff163d063e2a52bc8adcae',
+           'info_dict': {
+               'id': '5706148',
+               'ext': 'mp3',
+               'duration': 123,
+               'title': '"Urban Hippie", de Damien Krisl',
+               'description': 'Des Hippies super glam.',
+               'upload_date': '20140403',
+               'timestamp': 1396551600,
+           },
+       },
+   ]

    def _real_extract(self, url):
        m = re.match(self._VALID_URL, url)
        video_id = m.group('id')

-       all_info = self._download_json(
-           'http://www.rts.ch/a/%s.html?f=json/article' % video_id, video_id)
-       info = all_info['video']['JSONinfo']
+       def download_json(internal_id):
+           return self._download_json(
+               'http://www.rts.ch/a/%s.html?f=json/article' % internal_id,
+               video_id)
+
+       all_info = download_json(video_id)
+
+       # video_id extracted out of URL is not always a real id
+       if 'video' not in all_info and 'audio' not in all_info:
+           page = self._download_webpage(url, video_id)
+           internal_id = self._html_search_regex(
+               r'<(?:video|audio) data-id="([0-9]+)"', page,
+               'internal video id')
+           all_info = download_json(internal_id)
+
+       info = all_info['video']['JSONinfo'] if 'video' in all_info else all_info['audio']

        upload_timestamp = parse_iso8601(info.get('broadcast_date'))
-       duration = parse_duration(info.get('duration'))
+       duration = info.get('duration') or info.get('cutout') or info.get('cutduration')
+       if isinstance(duration, compat_str):
+           duration = parse_duration(duration)
+       view_count = info.get('plays')
        thumbnail = unescapeHTML(info.get('preview_image_url'))
+
+       def extract_bitrate(url):
+           return int_or_none(self._search_regex(
+               r'-([0-9]+)k\.', url, 'bitrate', default=None))
+
        formats = [{
            'format_id': fid,
            'url': furl,
-           'tbr': int_or_none(self._search_regex(
-               r'-([0-9]+)k\.', furl, 'bitrate', default=None)),
+           'tbr': extract_bitrate(furl),
        } for fid, furl in info['streams'].items()]
+
+       if 'media' in info:
+           formats.extend([{
+               'format_id': '%s-%sk' % (media['ext'], media['rate']),
+               'url': 'http://download-video.rts.ch/%s' % media['url'],
+               'tbr': media['rate'] or extract_bitrate(media['url']),
+           } for media in info['media'] if media.get('rate')])
+
        self._sort_formats(formats)

        return {
@@ -57,6 +147,7 @@ class RTSIE(InfoExtractor):
            'title': info['title'],
            'description': info.get('intro'),
            'duration': duration,
+           'view_count': view_count,
            'uploader': info.get('programName'),
            'timestamp': upload_timestamp,
            'thumbnail': thumbnail,

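extract_bitrate() above recovers the bitrate from tokens like '-500k.' in a media URL. A standalone sketch of the same regex with hypothetical URLs:

import re

def extract_bitrate(url):
    # Pull the '-500k.' style token out of a media URL, as in the extractor above.
    m = re.search(r'-([0-9]+)k\.', url)
    return int(m.group(1)) if m else None

print(extract_bitrate('http://example.com/emission-500k.mp4'))  # 500
print(extract_bitrate('http://example.com/emission.mp4'))       # None
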
View File

@@ -2,7 +2,6 @@
from __future__ import unicode_literals

import re
-import json
import itertools

from .common import InfoExtractor
@@ -39,17 +38,15 @@ class RutubeIE(InfoExtractor):
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
-       api_response = self._download_webpage(
+       video = self._download_json(
            'http://rutube.ru/api/video/%s/?format=json' % video_id,
            video_id, 'Downloading video JSON')
-       video = json.loads(api_response)

-       api_response = self._download_webpage(
+       trackinfo = self._download_json(
            'http://rutube.ru/api/play/trackinfo/%s/?format=json' % video_id,
            video_id, 'Downloading trackinfo JSON')
-       trackinfo = json.loads(api_response)

        # Some videos don't have the author field
        author = trackinfo.get('author') or {}
        m3u8_url = trackinfo['video_balancer'].get('m3u8')
@@ -82,10 +79,9 @@ class RutubeChannelIE(InfoExtractor):
    def _extract_videos(self, channel_id, channel_title=None):
        entries = []
        for pagenum in itertools.count(1):
-           api_response = self._download_webpage(
+           page = self._download_json(
                self._PAGE_TEMPLATE % (channel_id, pagenum),
                channel_id, 'Downloading page %s' % pagenum)
-           page = json.loads(api_response)
            results = page['results']
            if not results:
                break
@@ -111,10 +107,9 @@ class RutubeMovieIE(RutubeChannelIE):
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        movie_id = mobj.group('id')
-       api_response = self._download_webpage(
+       movie = self._download_json(
            self._MOVIE_TEMPLATE % movie_id, movie_id,
            'Downloading movie JSON')
-       movie = json.loads(api_response)
        movie_name = movie['name']
        return self._extract_videos(movie_id, movie_name)

View File

@@ -9,8 +9,18 @@ from ..utils import (
class TeamcocoIE(InfoExtractor):
-   _VALID_URL = r'http://teamcoco\.com/video/(?P<url_title>.*)'
-   _TEST = {
+   _VALID_URL = r'http://teamcoco\.com/video/(?P<video_id>[0-9]+)?/?(?P<display_id>.*)'
+   _TESTS = [
+       {
+           'url': 'http://teamcoco.com/video/80187/conan-becomes-a-mary-kay-beauty-consultant',
+           'file': '80187.mp4',
+           'md5': '3f7746aa0dc86de18df7539903d399ea',
+           'info_dict': {
+               'title': 'Conan Becomes A Mary Kay Beauty Consultant',
+               'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.'
+           }
+       },
+       {
            'url': 'http://teamcoco.com/video/louis-ck-interview-george-w-bush',
            'file': '19705.mp4',
            'md5': 'cde9ba0fa3506f5f017ce11ead928f9a',
@@ -19,22 +29,23 @@ class TeamcocoIE(InfoExtractor):
                "title": "Louis C.K. Interview Pt. 1 11/3/11"
            }
        }
+   ]

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-       if mobj is None:
-           raise ExtractorError('Invalid URL: %s' % url)
-       url_title = mobj.group('url_title')
-       webpage = self._download_webpage(url, url_title)
+       display_id = mobj.group('display_id')
+       webpage = self._download_webpage(url, display_id)

-       video_id = self._html_search_regex(
-           r'<article class="video" data-id="(\d+?)"',
-           webpage, 'video id')
-
-       self.report_extraction(video_id)
+       video_id = mobj.group("video_id")
+       if not video_id:
+           video_id = self._html_search_regex(
+               r'<article class="video" data-id="(\d+?)"',
+               webpage, 'video id')

        data_url = 'http://teamcoco.com/cvp/2.0/%s.xml' % video_id
-       data = self._download_xml(data_url, video_id, 'Downloading data webpage')
+       data = self._download_xml(
+           data_url, display_id, 'Downloading data webpage')

        qualities = ['500k', '480p', '1000k', '720p', '1080p']
        formats = []
@@ -69,6 +80,7 @@ class TeamcocoIE(InfoExtractor):
        return {
            'id': video_id,
+           'display_id': display_id,
            'formats': formats,
            'title': self._og_search_title(webpage),
            'thumbnail': self._og_search_thumbnail(webpage),

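The new _VALID_URL makes the numeric ID optional, so both URL styles work; when the slug carries no ID, the extractor falls back to scraping data-id from the page. A standalone check of the pattern against the two test URLs:

import re

_VALID_URL = r'http://teamcoco\.com/video/(?P<video_id>[0-9]+)?/?(?P<display_id>.*)'

for url in ('http://teamcoco.com/video/80187/conan-becomes-a-mary-kay-beauty-consultant',
            'http://teamcoco.com/video/louis-ck-interview-george-w-bush'):
    m = re.match(_VALID_URL, url)
    print(m.group('video_id'), m.group('display_id'))
# -> 80187 conan-becomes-a-mary-kay-beauty-consultant
# -> None louis-ck-interview-george-w-bush
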
View File

@@ -37,6 +37,7 @@ class TEDIE(SubtitlesInfoExtractor):
                'consciousness, but that half the time our brains are '
                'actively fooling us.'),
            'uploader': 'Dan Dennett',
+           'width': 854,
        }
    }, {
        'url': 'http://www.ted.com/watch/ted-institute/ted-bcg/vishal-sikka-the-beauty-and-power-of-algorithms',
@@ -50,10 +51,10 @@ class TEDIE(SubtitlesInfoExtractor):
        }
    }]

-   _FORMATS_PREFERENCE = {
-       'low': 1,
-       'medium': 2,
-       'high': 3,
+   _NATIVE_FORMATS = {
+       'low': {'preference': 1, 'width': 320, 'height': 180},
+       'medium': {'preference': 2, 'width': 512, 'height': 288},
+       'high': {'preference': 3, 'width': 854, 'height': 480},
    }

    def _extract_info(self, webpage):
@@ -98,12 +99,14 @@ class TEDIE(SubtitlesInfoExtractor):
        talk_info = self._extract_info(webpage)['talks'][0]

        formats = [{
+           'ext': 'mp4',
            'url': format_url,
            'format_id': format_id,
            'format': format_id,
-           'preference': self._FORMATS_PREFERENCE.get(format_id, -1),
        } for (format_id, format_url) in talk_info['nativeDownloads'].items()]
+       for f in formats:
+           finfo = self._NATIVE_FORMATS.get(f['format_id'])
+           if finfo:
+               f.update(finfo)
        self._sort_formats(formats)

        video_id = compat_str(talk_info['id'])

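Rather than carrying a bare preference number, each native TED format now merges in a dict of known metadata. A minimal sketch of that merge; the nativeDownloads sample is hypothetical:

_NATIVE_FORMATS = {
    'low': {'preference': 1, 'width': 320, 'height': 180},
    'medium': {'preference': 2, 'width': 512, 'height': 288},
    'high': {'preference': 3, 'width': 854, 'height': 480},
}

native_downloads = {
    'low': 'http://example.com/talk-low.mp4',
    'high': 'http://example.com/talk-high.mp4',
}

formats = [{
    'ext': 'mp4',
    'url': format_url,
    'format_id': format_id,
} for (format_id, format_url) in native_downloads.items()]
for f in formats:
    finfo = _NATIVE_FORMATS.get(f['format_id'])
    if finfo:
        f.update(finfo)  # folds preference/width/height into the format dict

print(sorted(f['height'] for f in formats))  # [180, 480]
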
View File

@@ -11,7 +11,7 @@ from ..utils import (
class UstreamIE(InfoExtractor):
-   _VALID_URL = r'https?://www\.ustream\.tv/recorded/(?P<videoID>\d+)'
+   _VALID_URL = r'https?://www\.ustream\.tv/(?P<type>recorded|embed)/(?P<videoID>\d+)'
    IE_NAME = 'ustream'
    _TEST = {
        'url': 'http://www.ustream.tv/recorded/20274954',
@@ -25,6 +25,13 @@ class UstreamIE(InfoExtractor):
    def _real_extract(self, url):
        m = re.match(self._VALID_URL, url)
+       if m.group('type') == 'embed':
+           video_id = m.group('videoID')
+           webpage = self._download_webpage(url, video_id)
+           desktop_video_id = self._html_search_regex(r'ContentVideoIds=\["([^"]*?)"\]', webpage, 'desktop_video_id')
+           desktop_url = 'http://www.ustream.tv/recorded/' + desktop_video_id
+           return self.url_result(desktop_url, 'Ustream')
+
        video_id = m.group('videoID')

        video_url = 'http://tcdn.ustream.tv/video/%s' % video_id

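Embed URLs are resolved to the canonical /recorded/ URL by scraping ContentVideoIds from the embed page and re-dispatching. A standalone sketch of the same regex on a hypothetical page excerpt:

import re

webpage = '... ContentVideoIds=["45734260"] ...'  # hypothetical embed-page excerpt
desktop_video_id = re.search(
    r'ContentVideoIds=\["([^"]*?)"\]', webpage).group(1)
print('http://www.ustream.tv/recorded/' + desktop_video_id)
# -> http://www.ustream.tv/recorded/45734260
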
View File

@@ -0,0 +1,26 @@
from __future__ import unicode_literals

from .novamov import NovaMovIE


class VideoWeedIE(NovaMovIE):
    IE_NAME = 'videoweed'
    IE_DESC = 'VideoWeed'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'videoweed\.(?:es|com)'}

    _HOST = 'www.videoweed.es'

    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
    _TITLE_REGEX = r'<h1 class="text_shadow">([^<]+)</h1>'

    _TEST = {
        'url': 'http://www.videoweed.es/file/b42178afbea14',
        'md5': 'abd31a2132947262c50429e1d16c1bfd',
        'info_dict': {
            'id': 'b42178afbea14',
            'ext': 'flv',
            'title': 'optical illusion dissapeared image magic illusion',
            'description': ''
        },
    }

View File

@@ -16,7 +16,7 @@ from ..utils import (
class VKIE(InfoExtractor):
    IE_NAME = 'vk.com'
-   _VALID_URL = r'https?://vk\.com/(?:video_ext\.php\?.*?\boid=(?P<oid>\d+).*?\bid=(?P<id>\d+)|(?:videos.*?\?.*?z=)?video(?P<videoid>.*?)(?:\?|%2F|$))'
+   _VALID_URL = r'https?://vk\.com/(?:video_ext\.php\?.*?\boid=(?P<oid>-?\d+).*?\bid=(?P<id>\d+)|(?:videos.*?\?.*?z=)?video(?P<videoid>.*?)(?:\?|%2F|$))'
    _NETRC_MACHINE = 'vk'

    _TESTS = [

View File

@@ -3,11 +3,12 @@ from __future__ import unicode_literals
import re

from .common import InfoExtractor
+from .youtube import YoutubeIE


class WimpIE(InfoExtractor):
    _VALID_URL = r'http://(?:www\.)?wimp\.com/([^/]+)/'
-   _TEST = {
+   _TESTS = [{
        'url': 'http://www.wimp.com/maruexhausted/',
        'md5': 'f1acced123ecb28d9bb79f2479f2b6a1',
        'info_dict': {
@@ -16,7 +17,20 @@ class WimpIE(InfoExtractor):
            'title': 'Maru is exhausted.',
            'description': 'md5:57e099e857c0a4ea312542b684a869b8',
        }
-   }
+   }, {
+       # youtube video
+       'url': 'http://www.wimp.com/clowncar/',
+       'info_dict': {
+           'id': 'cG4CEr2aiSg',
+           'ext': 'mp4',
+           'title': 'Basset hound clown car...incredible!',
+           'description': 'md5:8d228485e0719898c017203f900b3a35',
+           'uploader': 'Gretchen Hoey',
+           'uploader_id': 'gretchenandjeff1',
+           'upload_date': '20140303',
+       },
+       'add_ie': ['Youtube'],
+   }]

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
@@ -24,6 +38,13 @@ class WimpIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)
        video_url = self._search_regex(
            r's1\.addVariable\("file",\s*"([^"]+)"\);', webpage, 'video URL')
+       if YoutubeIE.suitable(video_url):
+           self.to_screen('Found YouTube video')
+           return {
+               '_type': 'url',
+               'url': video_url,
+               'ie_key': YoutubeIE.ie_key(),
+           }

        return {
            'id': video_id,
@@ -31,4 +52,4 @@ class WimpIE(InfoExtractor):
            'title': self._og_search_title(webpage),
            'thumbnail': self._og_search_thumbnail(webpage),
            'description': self._og_search_description(webpage),
        }

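When the scraped file URL is really a YouTube link, the extractor returns a url-type result so YoutubeIE takes over. A minimal sketch of that hand-off shape; the suitable() test is replaced by a naive substring check here:

def route(video_url):
    if 'youtube.com/watch' in video_url:  # stand-in for YoutubeIE.suitable()
        return {'_type': 'url', 'url': video_url, 'ie_key': 'Youtube'}
    return {'_type': 'video', 'url': video_url}

print(route('http://www.youtube.com/watch?v=cG4CEr2aiSg'))
# -> {'_type': 'url', 'url': '...', 'ie_key': 'Youtube'}
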
View File

@@ -15,22 +15,24 @@ from ..utils import (
class YahooIE(InfoExtractor):
    IE_DESC = 'Yahoo screen'
-   _VALID_URL = r'http://screen\.yahoo\.com/.*?-(?P<id>\d*?)\.html'
+   _VALID_URL = r'https?://screen\.yahoo\.com/.*?-(?P<id>[0-9]+)(?:-[a-z]+)?\.html'
    _TESTS = [
        {
            'url': 'http://screen.yahoo.com/julian-smith-travis-legg-watch-214727115.html',
-           'file': '214727115.mp4',
            'md5': '4962b075c08be8690a922ee026d05e69',
            'info_dict': {
+               'id': '214727115',
+               'ext': 'mp4',
                'title': 'Julian Smith & Travis Legg Watch Julian Smith',
                'description': 'Julian and Travis watch Julian Smith',
            },
        },
        {
            'url': 'http://screen.yahoo.com/wired/codefellas-s1-ep12-cougar-lies-103000935.html',
-           'file': '103000935.mp4',
            'md5': 'd6e6fc6e1313c608f316ddad7b82b306',
            'info_dict': {
+               'id': '103000935',
+               'ext': 'mp4',
                'title': 'Codefellas - The Cougar Lies with Spanish Moss',
                'description': 'Agent Topple\'s mustache does its dirty work, and Nicole brokers a deal for peace. But why is the NSA collecting millions of Instagram brunch photos? And if your waffles have nothing to hide, what are they so worried about?',
            },
@@ -60,10 +62,9 @@ class YahooIE(InfoExtractor):
            'env': 'prod',
            'format': 'json',
        })
-       query_result_json = self._download_webpage(
+       query_result = self._download_json(
            'http://video.query.yahoo.com/v1/public/yql?' + data,
            video_id, 'Downloading video info')
-       query_result = json.loads(query_result_json)

        info = query_result['query']['results']['mediaObj'][0]
        meta = info['meta']
@@ -86,7 +87,6 @@ class YahooIE(InfoExtractor):
            else:
                format_url = compat_urlparse.urljoin(host, path)
            format_info['url'] = format_url
            formats.append(format_info)

        self._sort_formats(formats)
@@ -134,27 +134,25 @@ class YahooSearchIE(SearchInfoExtractor):
    def _get_n_results(self, query, n):
        """Get a specified number of results for a query"""
-       res = {
-           '_type': 'playlist',
-           'id': query,
-           'entries': []
-       }
+       entries = []
        for pagenum in itertools.count(0):
            result_url = 'http://video.search.yahoo.com/search/?p=%s&fr=screen&o=js&gs=0&b=%d' % (compat_urllib_parse.quote_plus(query), pagenum * 30)
-           webpage = self._download_webpage(result_url, query,
-               note='Downloading results page '+str(pagenum+1))
-           info = json.loads(webpage)
+           info = self._download_json(result_url, query,
+               note='Downloading results page '+str(pagenum+1))
            m = info['m']
            results = info['results']

            for (i, r) in enumerate(results):
-               if (pagenum * 30) +i >= n:
+               if (pagenum * 30) + i >= n:
                    break
                mobj = re.search(r'(?P<url>screen\.yahoo\.com/.*?-\d*?\.html)"', r)
                e = self.url_result('http://' + mobj.group('url'), 'Yahoo')
-               res['entries'].append(e)
+               entries.append(e)
-           if (pagenum * 30 +i >= n) or (m['last'] >= (m['total'] -1)):
+           if (pagenum * 30 + i >= n) or (m['last'] >= (m['total'] - 1)):
                break

-       return res
+       return {
+           '_type': 'playlist',
+           'id': query,
+           'entries': entries,
+       }

View File

@@ -1446,12 +1446,15 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
                break

            more = self._download_json(
-               'https://youtube.com/%s' % mobj.group('more'), playlist_id, 'Downloading page #%s' % page_num)
+               'https://youtube.com/%s' % mobj.group('more'), playlist_id,
+               'Downloading page #%s' % page_num,
+               transform_source=uppercase_escape)
            content_html = more['content_html']
            more_widget_html = more['load_more_widget_html']

        playlist_title = self._html_search_regex(
-           r'<h1 class="pl-header-title">\s*(.*?)\s*</h1>', page, u'title')
+           r'(?s)<h1 class="pl-header-title[^"]*">\s*(.*?)\s*</h1>',
+           page, u'title')

        url_results = self._ids_to_results(ids)
        return self.playlist_result(url_results, playlist_id, playlist_title)
@@ -1736,11 +1739,10 @@ class YoutubeFeedsInfoExtractor(YoutubeBaseInfoExtractor):
        feed_entries = []
        paging = 0
        for i in itertools.count(1):
-           info = self._download_webpage(self._FEED_TEMPLATE % paging,
+           info = self._download_json(self._FEED_TEMPLATE % paging,
                                          u'%s feed' % self._FEED_NAME,
                                          u'Downloading page %s' % i)
-           info = json.loads(info)
-           feed_html = info['feed_html']
+           feed_html = info.get('feed_html') or info.get('content_html')
            m_ids = re.finditer(r'"/watch\?v=(.*?)["&]', feed_html)
            ids = orderedSet(m.group(1) for m in m_ids)
            feed_entries.extend(
@@ -1752,7 +1754,7 @@ class YoutubeFeedsInfoExtractor(YoutubeBaseInfoExtractor):
        return self.playlist_result(feed_entries, playlist_title=self._PLAYLIST_TITLE)


class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
-   IE_DESC = u'YouTube.com subscriptions feed, "ytsubs" keyword(requires authentication)'
+   IE_DESC = u'YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)'
    _VALID_URL = r'https?://www\.youtube\.com/feed/subscriptions|:ytsubs(?:criptions)?'
    _FEED_NAME = 'subscriptions'
    _PLAYLIST_TITLE = u'Youtube Subscriptions'

View File

@@ -2,6 +2,7 @@
# -*- coding: utf-8 -*-

import calendar
+import codecs
import contextlib
import ctypes
import datetime
@@ -1176,12 +1177,12 @@ class HEADRequest(compat_urllib_request.Request):
        return "HEAD"


-def int_or_none(v, scale=1):
-   return v if v is None else (int(v) // scale)
+def int_or_none(v, scale=1, default=None):
+   return default if v is None else (int(v) // scale)


-def float_or_none(v, scale=1):
-   return v if v is None else (float(v) / scale)
+def float_or_none(v, scale=1, default=None):
+   return default if v is None else (float(v) / scale)


def parse_duration(s):
@@ -1263,9 +1264,11 @@ class PagedList(object):

def uppercase_escape(s):
+   unicode_escape = codecs.getdecoder('unicode_escape')
    return re.sub(
-       r'\\U([0-9a-fA-F]{8})',
-       lambda m: compat_chr(int(m.group(1), base=16)), s)
+       r'\\U[0-9a-fA-F]{8}',
+       lambda m: unicode_escape(m.group(0))[0],
+       s)

try:
    struct.pack(u'!I', 0)

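The rewritten uppercase_escape() delegates to the unicode_escape codec instead of compat_chr(int(...)); presumably this is what lets non-BMP escapes like \U0001F600 work on narrow Python 2 builds, where chr()/unichr() cannot produce a single character above U+FFFF. A standalone check of the new implementation:

import codecs
import re

def uppercase_escape(s):
    # Decode 8-digit \U escapes via the unicode_escape codec.
    unicode_escape = codecs.getdecoder('unicode_escape')
    return re.sub(
        r'\\U[0-9a-fA-F]{8}',
        lambda m: unicode_escape(m.group(0))[0],
        s)

print(uppercase_escape('Hello \\U0001F600'))  # Hello 😀
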
View File

@@ -1,2 +1,2 @@
-__version__ = '2014.03.30.1'
+__version__ = '2014.04.07.1'