release 2016.06.23.1

[jsinterp] Relax JS function regex (Closes #9863)
[nbc:nbcnews] improve extraction and add msnbc to the extractor
2016-06-23 09:42:56 +07:00 · 2016-06-23 09:41:34 +07:00 · 2016-06-23 01:36:19 +01:00 · 2016-06-23 00:14:34 +01:00 · 2016-06-23 04:29:34 +07:00 · 2016-06-23 04:27:10 +07:00
40 changed files with 916 additions and 701 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.18.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.23.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.18.1**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.23.1**
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.06.18.1
+[debug] youtube-dl version 2016.06.23.1
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/README.md
+++ b/README.md
@@ -44,7 +44,7 @@ Or with [MacPorts](https://www.macports.org/):
 Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
 # DESCRIPTION
-**youtube-dl** is a small command-line program to download videos from
+**youtube-dl** is a command-line program to download videos from
 YouTube.com and a few more sites. It requires the Python interpreter, version
 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
 your Unix box, on Windows or on Mac OS X. It is released to the public domain,
--- a/devscripts/make_lazy_extractors.py
+++ b/devscripts/make_lazy_extractors.py
@@ -14,15 +14,17 @@ if os.path.exists(lazy_extractors_filename):
    os.remove(lazy_extractors_filename)
 from youtube_dl.extractor import _ALL_CLASSES
-from youtube_dl.extractor.common import InfoExtractor
+from youtube_dl.extractor.common import InfoExtractor, SearchInfoExtractor
 with open('devscripts/lazy_load_template.py', 'rt') as f:
    module_template = f.read()
-module_contents = [module_template + '\n' + getsource(InfoExtractor.suitable)]
+module_contents = [
    module_template + '\n' + getsource(InfoExtractor.suitable) + '\n',
    'class LazyLoadSearchExtractor(LazyLoadExtractor):\n    pass\n']
 ie_template = '''
-class {name}(LazyLoadExtractor):
+class {name}({bases}):
    _VALID_URL = {valid_url!r}
    _module = '{module}'
 '''
@@ -34,10 +36,20 @@ make_valid_template = '''
 '''
 def get_base_name(base):
    if base is InfoExtractor:
        return 'LazyLoadExtractor'
    elif base is SearchInfoExtractor:
        return 'LazyLoadSearchExtractor'
    else:
        return base.__name__
 def build_lazy_ie(ie, name):
    valid_url = getattr(ie, '_VALID_URL', None)
    s = ie_template.format(
        name=name,
        bases=', '.join(map(get_base_name, ie.__bases__)),
        valid_url=valid_url,
        module=ie.__module__)
    if ie.suitable.__func__ is not InfoExtractor.suitable.__func__:
@@ -47,11 +59,34 @@ def build_lazy_ie(ie, name):
        s += make_valid_template.format(valid_url=ie._make_valid_url())
    return s
 # find the correct sorting and add the required base classes so that subclasses
 # can be correctly created
 classes = _ALL_CLASSES[:-1]
 ordered_cls = []
 while classes:
    for c in classes[:]:
        bases = set(c.__bases__) - set((object, InfoExtractor, SearchInfoExtractor))
        stop = False
        for b in bases:
            if b not in classes and b not in ordered_cls:
                if b.__name__ == 'GenericIE':
                    exit()
                classes.insert(0, b)
                stop = True
        if stop:
            break
        if all(b in ordered_cls for b in bases):
            ordered_cls.append(c)
            classes.remove(c)
            break
 ordered_cls.append(_ALL_CLASSES[-1])
 names = []
-for ie in list(sorted(_ALL_CLASSES[:-1], key=lambda cls: cls.ie_key())) + _ALL_CLASSES[-1:]:
+for ie in ordered_cls:
-    name = ie.ie_key() + 'IE'
+    name = ie.__name__
    src = build_lazy_ie(ie, name)
    module_contents.append(src)
    if ie in _ALL_CLASSES:
        names.append(name)
 module_contents.append(
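The ordering loop added to make_lazy_extractors.py behaves like a topological sort over class bases: a class is emitted only after all of its non-root bases have been emitted, and bases missing from the input list are pulled in. A minimal sketch of that idea follows, detached from youtube-dl. The names `order_by_bases`, `Base`, `Mid` and `Leaf` are hypothetical and exist only for illustration.

```python
def order_by_bases(classes, roots=(object,)):
    """Return classes ordered so that every base precedes its subclasses.

    Bases that are not in `classes` (and are not in `roots`) are added
    to the work list, mirroring what the script does for helper bases.
    """
    pending = list(classes)
    ordered = []
    while pending:
        progressed = False
        for cls in list(pending):
            bases = [b for b in cls.__bases__ if b not in roots]
            missing = [b for b in bases
                       if b not in pending and b not in ordered]
            if missing:
                # A base was not listed: schedule it first and rescan.
                pending = missing + pending
                progressed = True
                break
            if all(b in ordered for b in bases):
                ordered.append(cls)
                pending.remove(cls)
                progressed = True
        if not progressed:
            raise ValueError('classes cannot be ordered')
    return ordered


# Hypothetical hierarchy: Mid is deliberately left out of the input.
class Base(object):
    pass


class Mid(Base):
    pass


class Leaf(Mid):
    pass


ordered = order_by_bases([Leaf, Base])
print([cls.__name__ for cls in ordered])
```

Unlike the script, the sketch raises an error instead of calling `exit()` when it cannot make progress; that is a choice made for the example, not something the diff does.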
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -44,8 +44,8 @@
 - **appletrailers:section**
 - **archive.org**: archive.org videos
 - **ARD**
 - **ARD:mediathek**: Saarländischer Rundfunk
 - **ARD:mediathek**
 - **ARD:mediathek**: Saarländischer Rundfunk
 - **arte.tv**
 - **arte.tv:+7**
 - **arte.tv:cinema**
@@ -128,6 +128,7 @@
 - **cliphunter**
 - **ClipRs**
 - **Clipsyndicate**
 - **CloserToTruth**
 - **cloudtime**: CloudTime
 - **Cloudy**
 - **Clubic**
@@ -247,7 +248,6 @@
 - **Gamersyde**
 - **GameSpot**
 - **GameStar**
 - **Gametrailers**
 - **Gazeta**
 - **GDCVault**
 - **generic**: Generic downloader that works on some sites
@@ -385,7 +385,6 @@
 - **MovieFap**
 - **Moviezine**
 - **MPORA**
 - **MSNBC**
 - **MTV**
 - **mtv.de**
 - **mtviggy.com**
@@ -521,6 +520,7 @@
 - **qqmusic:singer**: QQ音乐 - 歌手
 - **qqmusic:toplist**: QQ音乐 - 排行榜
 - **R7**
 - **R7Article**
 - **radio.de**
 - **radiobremen**
 - **radiocanada**
--- a/youtube_dl/downloader/hls.py
+++ b/youtube_dl/downloader/hls.py
@@ -2,14 +2,24 @@ from __future__ import unicode_literals
 import os.path
 import re
 import binascii
 try:
    from Crypto.Cipher import AES
    can_decrypt_frag = True
 except ImportError:
    can_decrypt_frag = False
 from .fragment import FragmentFD
 from .external import FFmpegFD
-from ..compat import compat_urlparse
+from ..compat import (
    compat_urlparse,
    compat_struct_pack,
 )
 from ..utils import (
    encodeFilename,
    sanitize_open,
    parse_m3u8_attributes,
 )
@@ -21,7 +31,7 @@ class HlsFD(FragmentFD):
    @staticmethod
    def can_download(manifest):
        UNSUPPORTED_FEATURES = (
-            r'#EXT-X-KEY:METHOD=(?!NONE)',  # encrypted streams [1]
+            r'#EXT-X-KEY:METHOD=(?!NONE|AES-128)',  # encrypted streams [1]
            r'#EXT-X-BYTERANGE',  # playlists composed of byte ranges of media files [2]
            # Live streams heuristic does not always work (e.g. geo restricted to Germany
@@ -39,7 +49,9 @@ class HlsFD(FragmentFD):
            # 3. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.2
            # 4. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.5
        )
-        return all(not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES)
+        check_results = [not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES]
        check_results.append(can_decrypt_frag or '#EXT-X-KEY:METHOD=AES-128' not in manifest)
        return all(check_results)
    def real_download(self, filename, info_dict):
        man_url = info_dict['url']
@@ -57,36 +69,60 @@ class HlsFD(FragmentFD):
                fd.add_progress_hook(ph)
            return fd.real_download(filename, info_dict)
-        fragment_urls = []
+        total_frags = 0
        for line in s.splitlines():
            line = line.strip()
            if line and not line.startswith('#'):
-                segment_url = (
+                total_frags += 1
                    line
                    if re.match(r'^https?://', line)
                    else compat_urlparse.urljoin(man_url, line))
                fragment_urls.append(segment_url)
                # We only download the first fragment during the test
                if self.params.get('test', False):
                    break
        ctx = {
            'filename': filename,
-            'total_frags': len(fragment_urls),
+            'total_frags': total_frags,
        }
        self._prepare_and_start_frag_download(ctx)
        i = 0
        media_sequence = 0
        decrypt_info = {'METHOD': 'NONE'}
        frags_filenames = []
-        for i, frag_url in enumerate(fragment_urls):
+        for line in s.splitlines():
            line = line.strip()
            if line:
                if not line.startswith('#'):
                    frag_url = (
                        line
                        if re.match(r'^https?://', line)
                        else compat_urlparse.urljoin(man_url, line))
                    frag_filename = '%s-Frag%d' % (ctx['tmpfilename'], i)
                    success = ctx['dl'].download(frag_filename, {'url': frag_url})
                    if not success:
                        return False
                    down, frag_sanitized = sanitize_open(frag_filename, 'rb')
-            ctx['dest_stream'].write(down.read())
+                    frag_content = down.read()
                    down.close()
                    if decrypt_info['METHOD'] == 'AES-128':
                        iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
                        frag_content = AES.new(
                            decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
                    ctx['dest_stream'].write(frag_content)
                    frags_filenames.append(frag_sanitized)
                    # We only download the first fragment during the test
                    if self.params.get('test', False):
                        break
                    i += 1
                    media_sequence += 1
                elif line.startswith('#EXT-X-KEY'):
                    decrypt_info = parse_m3u8_attributes(line[11:])
                    if decrypt_info['METHOD'] == 'AES-128':
                        if 'IV' in decrypt_info:
                            decrypt_info['IV'] = binascii.unhexlify(decrypt_info['IV'][2:])
                        if not re.match(r'^https?://', decrypt_info['URI']):
                            decrypt_info['URI'] = compat_urlparse.urljoin(
                                man_url, decrypt_info['URI'])
                        decrypt_info['KEY'] = self.ydl.urlopen(decrypt_info['URI']).read()
                elif line.startswith('#EXT-X-MEDIA-SEQUENCE'):
                    media_sequence = int(line[22:])
        self._finish_frag_download(ctx)
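The AES-128 branch in hls.py picks the initialization vector in one of two ways: it decodes the hexadecimal `IV` attribute of `#EXT-X-KEY` when present, and otherwise packs the media sequence number into 16 big-endian bytes. A minimal sketch of just that selection, using only the standard library, is shown here. `resolve_iv` is a hypothetical helper name, and the actual decryption step (which needs the optional `Crypto` package) is left out.

```python
import binascii
import struct


def resolve_iv(iv_attr, media_sequence):
    """Return the 16-byte IV for one fragment.

    iv_attr is the raw IV attribute value (for example '0x0001...'),
    or None when the playlist does not specify one.
    """
    if iv_attr:
        # Drop the leading '0x' / '0X' and decode the hex digits.
        return binascii.unhexlify(iv_attr[2:])
    # '>8xq': 8 zero pad bytes followed by a big-endian 64-bit integer,
    # i.e. the sequence number left-padded to 128 bits.
    return struct.pack('>8xq', media_sequence)


default_iv = resolve_iv(None, 7)
explicit_iv = resolve_iv('0x000102030405060708090a0b0c0d0e0f', 7)
print(binascii.hexlify(default_iv), binascii.hexlify(explicit_iv))
```

Because the fallback depends on the media sequence number, the downloader has to increment that counter for every fragment it writes, which is why the diff tracks `media_sequence` alongside the fragment index.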
--- a/youtube_dl/extractor/adobetv.py
+++ b/youtube_dl/extractor/adobetv.py
@@ -156,7 +156,10 @@ class AdobeTVVideoIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
-        video_data = self._download_json(url + '?format=json', video_id)
+        webpage = self._download_webpage(url, video_id)
        video_data = self._parse_json(self._search_regex(
            r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
        formats = [{
            'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),
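The adobetv.py change stops requesting `?format=json` and instead reads the JSON assigned to a `bridge` variable in the page source. A rough, standalone sketch of that extraction step follows. The regular expression is the one from the diff, while `extract_bridge` and the sample page are assumptions made up for the example.

```python
import json
import re

# Pattern taken from the diff: capture everything up to the first ';'.
BRIDGE_RE = r'var\s+bridge\s*=\s*([^;]+);'


def extract_bridge(webpage):
    """Parse the JSON object assigned to `var bridge` in an HTML page."""
    match = re.search(BRIDGE_RE, webpage)
    if match is None:
        raise ValueError('bridge variable not found')
    return json.loads(match.group(1))


# Invented sample markup, not real Adobe TV output.
sample_page = (
    '<script>var bridge = {"title": "Demo", "sources": '
    '[{"src": "http://example.com/v.mp4", "height": 720}]};</script>')

bridge = extract_bridge(sample_page)
print(bridge['title'], bridge['sources'][0]['height'])
```

One limitation worth keeping in mind: because the capture group is `[^;]+`, a semicolon inside any JSON string value would cut the match short and make parsing fail.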
--- a/youtube_dl/extractor/aftonbladet.py
+++ b/youtube_dl/extractor/aftonbladet.py
@@ -24,10 +24,10 @@ class AftonbladetIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)
        # find internal video meta data
-        meta_url = 'http://aftonbladet-play.drlib.aptoma.no/video/%s.json'
+        meta_url = 'http://aftonbladet-play-metadata.cdn.drvideo.aptoma.no/video/%s.json'
        player_config = self._parse_json(self._html_search_regex(
            r'data-player-config="([^"]+)"', webpage, 'player config'), video_id)
-        internal_meta_id = player_config['videoId']
+        internal_meta_id = player_config['aptomaVideoId']
        internal_meta_url = meta_url % internal_meta_id
        internal_meta_json = self._download_json(
            internal_meta_url, video_id, 'Downloading video meta data')
--- a/youtube_dl/extractor/ard.py
+++ b/youtube_dl/extractor/ard.py
@@ -8,7 +8,6 @@ from .generic import GenericIE
 from ..utils import (
    determine_ext,
    ExtractorError,
    get_element_by_attribute,
    qualities,
    int_or_none,
    parse_duration,
@@ -274,41 +273,3 @@ class ARDIE(InfoExtractor):
            'upload_date': upload_date,
            'thumbnail': thumbnail,
        }
 class SportschauIE(ARDMediathekIE):
    IE_NAME = 'Sportschau'
    _VALID_URL = r'(?P<baseurl>https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video(?P<id>[^/#?]+))\.html'
    _TESTS = [{
        'url': 'http://www.sportschau.de/tourdefrance/videoseppeltkokainhatnichtsmitklassischemdopingzutun100.html',
        'info_dict': {
            'id': 'seppeltkokainhatnichtsmitklassischemdopingzutun100',
            'ext': 'mp4',
            'title': 'Seppelt: "Kokain hat nichts mit klassischem Doping zu tun"',
            'thumbnail': 're:^https?://.*\.jpg$',
            'description': 'Der ARD-Doping Experte Hajo Seppelt gibt seine Einschätzung zum ersten Dopingfall der diesjährigen Tour de France um den Italiener Luca Paolini ab.',
        },
        'params': {
            # m3u8 download
            'skip_download': True,
        },
    }]
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
        base_url = mobj.group('baseurl')
        webpage = self._download_webpage(url, video_id)
        title = get_element_by_attribute('class', 'headline', webpage)
        description = self._html_search_meta('description', webpage, 'description')
        info = self._extract_media_info(
            base_url + '-mc_defaultQuality-h.json', webpage, video_id)
        info.update({
            'title': title,
            'description': description,
        })
        return info
--- a/youtube_dl/extractor/arte.py
+++ b/youtube_dl/extractor/arte.py
@@ -180,11 +180,14 @@ class ArteTVBaseIE(InfoExtractor):
 class ArteTVPlus7IE(ArteTVBaseIE):
    IE_NAME = 'arte.tv:+7'
-    _VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/(?:(?:sendungen|emissions|embed)/)?(?P<id>[^/]+)/(?P<name>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/[^/]+/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
        'only_matching': True,
    }, {
        'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
        'only_matching': True,
    }]
    @classmethod
@@ -240,10 +243,10 @@ class ArteTVPlus7IE(ArteTVBaseIE):
            return self._extract_from_json_url(json_url, video_id, lang, title=title)
        # Different kind of embed URL (e.g.
        # http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
-        embed_url = self._search_regex(
+        entries = [
-            r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1',
+            self.url_result(url)
-            webpage, 'embed url', group='url')
+            for _, url in re.findall(r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1', webpage)]
-        return self.url_result(embed_url)
+        return self.playlist_result(entries)
 # It also uses the arte_vp_url url from the webpage to extract the information
@@ -252,22 +255,17 @@ class ArteTVCreativeIE(ArteTVPlus7IE):
    _VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
-        'url': 'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design',
+        'url': 'http://creative.arte.tv/fr/episode/osmosis-episode-1',
        'info_dict': {
-            'id': '72176',
+            'id': '057405-001-A',
            'ext': 'mp4',
-            'title': 'Folge 2 - Corporate Design',
+            'title': 'OSMOSIS - N\'AYEZ PLUS PEUR D\'AIMER (1)',
-            'upload_date': '20131004',
+            'upload_date': '20150716',
        },
    }, {
        'url': 'http://creative.arte.tv/fr/Monty-Python-Reunion',
-        'info_dict': {
+        'playlist_count': 11,
-            'id': '160676',
+        'add_ie': ['Youtube'],
            'ext': 'mp4',
            'title': 'Monty Python live (mostly)',
            'description': 'Événement ! Quarante-cinq ans après leurs premiers succès, les légendaires Monty Python remontent sur scène.\n',
            'upload_date': '20140805',
        }
    }, {
        'url': 'http://creative.arte.tv/de/episode/agentur-amateur-4-der-erste-kunde',
        'only_matching': True,
@@ -349,14 +347,13 @@ class ArteTVCinemaIE(ArteTVPlus7IE):
    _VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
    _TESTS = [{
-        'url': 'http://cinema.arte.tv/de/node/38291',
+        'url': 'http://cinema.arte.tv/fr/article/les-ailes-du-desir-de-julia-reck',
-        'md5': '6b275511a5107c60bacbeeda368c3aa1',
+        'md5': 'a5b9dd5575a11d93daf0e3f404f45438',
        'info_dict': {
-            'id': '055876-000_PWA12025-D',
+            'id': '062494-000-A',
            'ext': 'mp4',
-            'title': 'Tod auf dem Nil',
+            'title': 'Film lauréat du concours web - "Les ailes du désir" de Julia Reck',
-            'upload_date': '20160122',
+            'upload_date': '20150807',
            'description': 'md5:7f749bbb77d800ef2be11d54529b96bc',
        },
    }]
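In the arte.py hunks above, the single `_search_regex` call for an embed URL is replaced by `re.findall`, so a page with several iframes yields a playlist rather than one result. Since the pattern has two groups (the quote character and the URL), `re.findall` returns 2-tuples, which is why the diff unpacks `for _, url in ...`. A small sketch with invented markup, where `iframe_urls` is a hypothetical helper name:

```python
import re

# Same pattern as in the diff: group 1 is the opening quote, which the
# backreference \1 requires again as the closing quote.
IFRAME_SRC_RE = r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1'


def iframe_urls(webpage):
    """Return the src URL of every iframe found in the page, in order."""
    return [url for _, url in re.findall(IFRAME_SRC_RE, webpage)]


# Invented sample markup using both quoting styles.
sample_page = (
    '<iframe width="1" src="http://a.example/1"></iframe>'
    "<iframe src='http://b.example/2'></iframe>")

urls = iframe_urls(sample_page)
print(urls)
```

In the extractor each URL is then wrapped with `self.url_result(...)` and the list is passed to `self.playlist_result(...)`; the sketch stops at the plain list of strings.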
--- a/youtube_dl/extractor/azubu.py
+++ b/youtube_dl/extractor/azubu.py
@@ -46,6 +46,7 @@ class AzubuIE(InfoExtractor):
                'uploader_id': 272749,
                'view_count': int,
            },
            'skip': 'Channel offline',
        },
    ]
@@ -56,22 +57,26 @@ class AzubuIE(InfoExtractor):
            'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
        title = data['title'].strip()
-        description = data['description']
+        description = data.get('description')
-        thumbnail = data['thumbnail']
+        thumbnail = data.get('thumbnail')
-        view_count = data['view_count']
+        view_count = data.get('view_count')
-        uploader = data['user']['username']
+        user = data.get('user', {})
-        uploader_id = data['user']['id']
+        uploader = user.get('username')
        uploader_id = user.get('id')
        stream_params = json.loads(data['stream_params'])
-        timestamp = float_or_none(stream_params['creationDate'], 1000)
+        timestamp = float_or_none(stream_params.get('creationDate'), 1000)
-        duration = float_or_none(stream_params['length'], 1000)
+        duration = float_or_none(stream_params.get('length'), 1000)
        renditions = stream_params.get('renditions') or []
        video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
        if video:
            renditions.append(video)
        if not renditions and not user.get('channel', {}).get('is_live', True):
            raise ExtractorError('%s said: channel is offline.' % self.IE_NAME, expected=True)
        formats = [{
            'url': fmt['url'],
            'width': fmt['frameWidth'],
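The azubu.py change swaps mandatory key lookups for `.get()` so that missing optional metadata no longer raises `KeyError`. Only `title` remains a hard requirement. The sketch below shows the same pattern in isolation. `float_or_none_sketch` is a simplified stand-in written for this example and is not youtube-dl's own `float_or_none`; `read_metadata` and the sample dictionaries are likewise hypothetical.

```python
def float_or_none_sketch(value, scale=1):
    """Simplified stand-in: float(value) / scale, or None if unusable."""
    if value is None:
        return None
    try:
        return float(value) / scale
    except (TypeError, ValueError):
        return None


def read_metadata(data, stream_params):
    """Collect optional fields without failing when they are absent."""
    # The diff uses data.get('user', {}); `or {}` additionally covers an
    # explicit null value, which is an assumption of this sketch.
    user = data.get('user') or {}
    return {
        'title': data['title'].strip(),  # still required
        'description': data.get('description'),
        'uploader': user.get('username'),
        'uploader_id': user.get('id'),
        # Millisecond values are scaled down to seconds.
        'timestamp': float_or_none_sketch(
            stream_params.get('creationDate'), 1000),
        'duration': float_or_none_sketch(stream_params.get('length'), 1000),
    }


sparse = read_metadata({'title': ' Demo '}, {'length': 90500})
print(sparse)
```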
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@@ -192,6 +192,7 @@ class BBCCoUkIE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
            'skip': 'Now it\'s really geo-restricted',
        }, {
            # compact player (https://github.com/rg3/youtube-dl/issues/8147)
            'url': 'http://www.bbc.co.uk/programmes/p028bfkf/player',
--- a/youtube_dl/extractor/bet.py
+++ b/youtube_dl/extractor/bet.py
@@ -1,31 +1,27 @@
 from __future__ import unicode_literals
-from .common import InfoExtractor
+from .mtv import MTVServicesInfoExtractor
-from ..compat import compat_urllib_parse_unquote
+from ..utils import unified_strdate
-from ..utils import (
+from ..compat import compat_urllib_parse_urlencode
    xpath_text,
    xpath_with_ns,
    int_or_none,
    parse_iso8601,
 )
-class BetIE(InfoExtractor):
+class BetIE(MTVServicesInfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
    _TESTS = [
        {
            'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
            'info_dict': {
-                'id': 'news/national/2014/a-conversation-with-president-obama',
+                'id': '07e96bd3-8850-3051-b856-271b457f0ab8',
                'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
                'ext': 'flv',
                'title': 'A Conversation With President Obama',
-                'description': 'md5:699d0652a350cf3e491cd15cc745b5da',
+                'description': 'President Obama urges persistence in confronting racism and bias.',
                'duration': 1534,
                'timestamp': 1418075340,
                'upload_date': '20141208',
                'uploader': 'admin',
                'thumbnail': 're:(?i)^https?://.*\.jpg$',
                'subtitles': {
                    'en': 'mincount:2',
                }
            },
            'params': {
                # rtmp download
@@ -35,16 +31,17 @@ class BetIE(InfoExtractor):
        {
            'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
            'info_dict': {
-                'id': 'news/national/2014/justice-for-ferguson-a-community-reacts',
+                'id': '9f516bf1-7543-39c4-8076-dd441b459ba9',
                'display_id': 'justice-for-ferguson-a-community-reacts',
                'ext': 'flv',
                'title': 'Justice for Ferguson: A Community Reacts',
                'description': 'A BET News special.',
                'duration': 1696,
                'timestamp': 1416942360,
                'upload_date': '20141125',
                'uploader': 'admin',
                'thumbnail': 're:(?i)^https?://.*\.jpg$',
                'subtitles': {
                    'en': 'mincount:2',
                }
            },
            'params': {
                # rtmp download
@@ -53,57 +50,32 @@ class BetIE(InfoExtractor):
        }
    ]
    _FEED_URL = "http://feeds.mtvnservices.com/od/feed/bet-mrss-player"
    def _get_feed_query(self, uri):
        return compat_urllib_parse_urlencode({
            'uuid': uri,
        })
    def _extract_mgid(self, webpage):
        return self._search_regex(r'data-uri="([^"]+)', webpage, 'mgid')
    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        mgid = self._extract_mgid(webpage)
        videos_info = self._get_videos_info(mgid)
-        media_url = compat_urllib_parse_unquote(self._search_regex(
+        info_dict = videos_info['entries'][0]
            [r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
            webpage, 'media URL'))
-        video_id = self._search_regex(
+        upload_date = unified_strdate(self._html_search_meta('date', webpage))
-            r'/video/(.*)/_jcr_content/', media_url, 'video id')
+        description = self._html_search_meta('description', webpage)
-        mrss = self._download_xml(media_url, display_id)
+        info_dict.update({
        item = mrss.find('./channel/item')
        NS_MAP = {
            'dc': 'http://purl.org/dc/elements/1.1/',
            'media': 'http://search.yahoo.com/mrss/',
            'ka': 'http://kickapps.com/karss',
        }
        title = xpath_text(item, './title', 'title')
        description = xpath_text(
            item, './description', 'description', fatal=False)
        timestamp = parse_iso8601(xpath_text(
            item, xpath_with_ns('./dc:date', NS_MAP),
            'upload date', fatal=False))
        uploader = xpath_text(
            item, xpath_with_ns('./dc:creator', NS_MAP),
            'uploader', fatal=False)
        media_content = item.find(
            xpath_with_ns('./media:content', NS_MAP))
        duration = int_or_none(media_content.get('duration'))
        smil_url = media_content.get('url')
        thumbnail = media_content.find(
            xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
        formats = self._extract_smil_formats(smil_url, display_id)
        self._sort_formats(formats)
        return {
            'id': video_id,
            'display_id': display_id,
            'title': title,
            'description': description,
-            'thumbnail': thumbnail,
+            'upload_date': upload_date,
-            'timestamp': timestamp,
+        })
-            'uploader': uploader,
+
-            'duration': duration,
+        return info_dict
            'formats': formats,
        }
--- a/youtube_dl/extractor/br.py
+++ b/youtube_dl/extractor/br.py
@@ -29,7 +29,8 @@ class BRIE(InfoExtractor):
                'duration': 180,
                'uploader': 'Reinhard Weber',
                'upload_date': '20150422',
-            }
+            },
            'skip': '404 not found',
        },
        {
            'url': 'http://www.br.de/nachrichten/oberbayern/inhalt/muenchner-polizeipraesident-schreiber-gestorben-100.html',
@@ -40,7 +41,8 @@ class BRIE(InfoExtractor):
                'title': 'Manfred Schreiber ist tot',
                'description': 'md5:b454d867f2a9fc524ebe88c3f5092d97',
                'duration': 26,
-            }
+            },
            'skip': '404 not found',
        },
        {
            'url': 'https://www.br-klassik.de/audio/peeping-tom-premierenkritik-dance-festival-muenchen-100.html',
@@ -51,7 +53,8 @@ class BRIE(InfoExtractor):
                'title': 'Kurzweilig und sehr bewegend',
                'description': 'md5:0351996e3283d64adeb38ede91fac54e',
                'duration': 296,
-            }
+            },
            'skip': '404 not found',
        },
        {
            'url': 'http://www.br.de/radio/bayern1/service/team/videos/team-video-erdelt100.html',
--- a/youtube_dl/extractor/cbs.py
+++ b/youtube_dl/extractor/cbs.py
@@ -1,17 +1,13 @@
 from __future__ import unicode_literals
-import re
+from .theplatform import ThePlatformFeedIE
 from .theplatform import ThePlatformIE
 from ..utils import (
    xpath_text,
    xpath_element,
    int_or_none,
    find_xpath_attr,
 )
-class CBSBaseIE(ThePlatformIE):
+class CBSBaseIE(ThePlatformFeedIE):
    def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
        closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL')
        return {
@@ -21,9 +17,22 @@ class CBSBaseIE(ThePlatformIE):
            }]
        } if closed_caption_e is not None and closed_caption_e.attrib.get('value') else []
    def _extract_video_info(self, filter_query, video_id):
        return self._extract_feed_info(
            'dJ5BDC', 'VxxJg8Ymh8sE', filter_query, video_id, lambda entry: {
                'series': entry.get('cbs$SeriesTitle'),
                'season_number': int_or_none(entry.get('cbs$SeasonNumber')),
                'episode': entry.get('cbs$EpisodeTitle'),
                'episode_number': int_or_none(entry.get('cbs$EpisodeNumber')),
            }, {
                'StreamPack': {
                    'manifest': 'm3u',
                }
            })
 class CBSIE(CBSBaseIE):
-    _VALID_URL = r'(?:cbs:(?P<content_id>\w+)|https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/(?:video|artist)|colbertlateshow\.com/(?:video|podcasts))/[^/]+/(?P<display_id>[^/]+))'
+    _VALID_URL = r'(?:cbs:|https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/video|colbertlateshow\.com/(?:video|podcasts))/)(?P<id>[\w-]+)'
    _TESTS = [{
        'url': 'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
@@ -38,25 +47,7 @@ class CBSIE(CBSBaseIE):
            'upload_date': '20131127',
            'uploader': 'CBSI-NEW',
        },
-        'params': {
+        'expected_warnings': ['Failed to download m3u8 information'],
            # rtmp download
            'skip_download': True,
        },
        '_skip': 'Blocked outside the US',
    }, {
        'url': 'http://www.cbs.com/shows/liveonletterman/artist/221752/st-vincent/',
        'info_dict': {
            'id': 'WWF_5KqY3PK1',
            'display_id': 'st-vincent',
            'ext': 'flv',
            'title': 'Live on Letterman - St. Vincent',
            'description': 'Live On Letterman: St. Vincent in concert from New York\'s Ed Sullivan Theater on Tuesday, July 16, 2014.',
            'duration': 3221,
        },
        'params': {
            # rtmp download
            'skip_download': True,
        },
        '_skip': 'Blocked outside the US',
    }, {
        'url': 'http://colbertlateshow.com/video/8GmB0oY0McANFvp2aEffk9jZZZ2YyXxy/the-colbeard/',
@@ -68,44 +59,5 @@ class CBSIE(CBSBaseIE):
    TP_RELEASE_URL_TEMPLATE = 'http://link.theplatform.com/s/dJ5BDC/%s?mbr=true'
    def _real_extract(self, url):
-        content_id, display_id = re.match(self._VALID_URL, url).groups()
+        content_id = self._match_id(url)
-        if not content_id:
+        return self._extract_video_info('byGuid=%s' % content_id, content_id)
            webpage = self._download_webpage(url, display_id)
            content_id = self._search_regex(
                [r"video\.settings\.content_id\s*=\s*'([^']+)';", r"cbsplayer\.contentId\s*=\s*'([^']+)';"],
                webpage, 'content id')
        items_data = self._download_xml(
            'http://can.cbs.com/thunder/player/videoPlayerService.php',
            content_id, query={'partner': 'cbs', 'contentId': content_id})
        video_data = xpath_element(items_data, './/item')
        title = xpath_text(video_data, 'videoTitle', 'title', True)
        subtitles = {}
        formats = []
        for item in items_data.findall('.//item'):
            pid = xpath_text(item, 'pid')
            if not pid:
                continue
            tp_release_url = self.TP_RELEASE_URL_TEMPLATE % pid
            if '.m3u8' in xpath_text(item, 'contentUrl', default=''):
                tp_release_url += '&manifest=m3u'
            tp_formats, tp_subtitles = self._extract_theplatform_smil(
                tp_release_url, content_id, 'Downloading %s SMIL data' % pid)
            formats.extend(tp_formats)
            subtitles = self._merge_subtitles(subtitles, tp_subtitles)
        self._sort_formats(formats)
        info = self.get_metadata('dJ5BDC/media/guid/2198311517/%s' % content_id, content_id)
        info.update({
            'id': content_id,
            'display_id': display_id,
            'title': title,
            'series': xpath_text(video_data, 'seriesTitle'),
            'season_number': int_or_none(xpath_text(video_data, 'seasonNumber')),
            'episode_number': int_or_none(xpath_text(video_data, 'episodeNumber')),
            'duration': int_or_none(xpath_text(video_data, 'videoLength'), 1000),
            'thumbnail': xpath_text(video_data, 'previewImageURL'),
            'formats': formats,
            'subtitles': subtitles,
        })
        return info
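The new `_VALID_URL` in cbs.py collapses the two former groups (`content_id` and `display_id`) into a single `id` group that is taken directly from the `cbs:` shorthand or from the path segment after `/video/`. The sketch below runs the added pattern against URLs that appear in the diff's test cases. `match_id` is a hypothetical stand-in for the extractor's `_match_id` helper, which raises on a failed match rather than returning `None`.

```python
import re

# Pattern copied from the added _VALID_URL line in the diff.
VALID_URL = (
    r'(?:cbs:|https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/video|'
    r'colbertlateshow\.com/(?:video|podcasts))/)(?P<id>[\w-]+)')


def match_id(url):
    """Return the captured id, or None when the URL is not supported."""
    match = re.match(VALID_URL, url)
    return match.group('id') if match else None


show_id = match_id(
    'http://www.cbs.com/shows/garth-brooks/video/'
    '_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/')
colbert_id = match_id(
    'http://colbertlateshow.com/video/'
    '8GmB0oY0McANFvp2aEffk9jZZZ2YyXxy/the-colbeard/')
short_id = match_id('cbs:_u7W953k6la293J7EPTd9oHkSPs6Xn6_')
artist_id = match_id(
    'http://www.cbs.com/shows/liveonletterman/artist/221752/st-vincent/')
print(show_id, colbert_id, short_id, artist_id)
```

The last call returns `None`: the `artist` alternative was dropped from the pattern, which is consistent with the removal of the corresponding test case in the same hunk.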
--- a/youtube_dl/extractor/cbsnews.py
+++ b/youtube_dl/extractor/cbsnews.py
@@ -30,9 +30,12 @@ class CBSNewsIE(CBSBaseIE):
        {
            'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
            'info_dict': {
-                'id': 'fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack',
+                'id': 'SNJBOYzXiWBOvaLsdzwH8fmtP1SCd91Y',
                'ext': 'mp4',
                'title': 'Fort Hood shooting: Army downplays mental illness as cause of attack',
                'description': 'md5:4a6983e480542d8b333a947bfc64ddc7',
                'upload_date': '19700101',
                'uploader': 'CBSI-NEW',
                'thumbnail': 're:^https?://.*\.jpg$',
                'duration': 205,
                'subtitles': {
@ -58,30 +61,8 @@ class CBSNewsIE(CBSBaseIE):
            webpage, 'video JSON info'), video_id)
        item = video_info['item'] if 'item' in video_info else video_info
-        title = item.get('articleTitle') or item.get('hed')
-        duration = item.get('duration')
-        thumbnail = item.get('mediaImage') or item.get('thumbnail')
-        subtitles = {}
-        formats = []
-        for format_id in ['RtmpMobileLow', 'RtmpMobileHigh', 'Hls', 'RtmpDesktop']:
-            pid = item.get('media' + format_id)
-            if not pid:
-                continue
-            release_url = 'http://link.theplatform.com/s/dJ5BDC/%s?mbr=true' % pid
-            tp_formats, tp_subtitles = self._extract_theplatform_smil(release_url, video_id, 'Downloading %s SMIL data' % pid)
-            formats.extend(tp_formats)
-            subtitles = self._merge_subtitles(subtitles, tp_subtitles)
-        self._sort_formats(formats)
-        return {
-            'id': video_id,
-            'title': title,
-            'thumbnail': thumbnail,
-            'duration': duration,
-            'formats': formats,
-            'subtitles': subtitles,
-        }
+        guid = item['mpxRefId']
+        return self._extract_video_info('byGuid=%s' % guid, guid)
 class CBSNewsLiveVideoIE(InfoExtractor):
--- a/youtube_dl/extractor/cbssports.py
+++ b/youtube_dl/extractor/cbssports.py
@ -1,30 +1,28 @@
 from __future__ import unicode_literals
-import re
+from .cbs import CBSBaseIE
 from .common import InfoExtractor
-class CBSSportsIE(InfoExtractor):
+class CBSSportsIE(CBSBaseIE):
-    _VALID_URL = r'https?://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://www\.cbssports\.com/video/player/[^/]+/(?P<id>\d+)'
-    _TEST = {
+    _TESTS = [{
-        'url': 'http://www.cbssports.com/video/player/tennis/318462531970/0/us-open-flashbacks-1990s',
+        'url': 'http://www.cbssports.com/video/player/videos/708337219968/0/ben-simmons-the-next-lebron?-not-so-fast',
        'info_dict': {
-            'id': '_d5_GbO8p1sT',
+            'id': '708337219968',
-            'ext': 'flv',
+            'ext': 'mp4',
-            'title': 'US Open flashbacks: 1990s',
+            'title': 'Ben Simmons the next LeBron? Not so fast',
-            'description': 'Bill Macatee relives the best moments in US Open history from the 1990s.',
+            'description': 'md5:854294f627921baba1f4b9a990d87197',
            'timestamp': 1466293740,
            'upload_date': '20160618',
            'uploader': 'CBSI-NEW',
        },
        'params': {
            # m3u8 download
            'skip_download': True,
        }
    }]
    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        section = mobj.group('section')
-        video_id = mobj.group('id')
-        all_videos = self._download_json(
-            'http://www.cbssports.com/data/video/player/getVideos/%s?as=json' % section,
-            video_id)
-        # The json file contains the info of all the videos in the section
-        video_info = next(v for v in all_videos if v['pcid'] == video_id)
-        return self.url_result('theplatform:%s' % video_info['pid'], 'ThePlatform')
+        video_id = self._match_id(url)
+        return self._extract_video_info('byId=%s' % video_id, video_id)
--- a/youtube_dl/extractor/closertotruth.py
+++ b/youtube_dl/extractor/closertotruth.py
@ -0,0 +1,92 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 class CloserToTruthIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?closertotruth\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://closertotruth.com/series/solutions-the-mind-body-problem#video-3688',
        'info_dict': {
            'id': '0_zof1ktre',
            'display_id': 'solutions-the-mind-body-problem',
            'ext': 'mov',
            'title': 'Solutions to the Mind-Body Problem?',
            'upload_date': '20140221',
            'timestamp': 1392956007,
            'uploader_id': 'CTTXML'
        },
        'params': {
            'skip_download': True,
        },
    }, {
        'url': 'http://closertotruth.com/episodes/how-do-brains-work',
        'info_dict': {
            'id': '0_iuxai6g6',
            'display_id': 'how-do-brains-work',
            'ext': 'mov',
            'title': 'How do Brains Work?',
            'upload_date': '20140221',
            'timestamp': 1392956024,
            'uploader_id': 'CTTXML'
        },
        'params': {
            'skip_download': True,
        },
    }, {
        'url': 'http://closertotruth.com/interviews/1725',
        'info_dict': {
            'id': '1725',
            'title': 'AyaFr-002',
        },
        'playlist_mincount': 2,
    }]
    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        partner_id = self._search_regex(
            r'<script[^>]+src=["\'].*?\b(?:partner_id|p)/(\d+)',
            webpage, 'kaltura partner_id')
        title = self._search_regex(
            r'<title>(.+?)\s*\|\s*.+?</title>', webpage, 'video title')
        select = self._search_regex(
            r'(?s)<select[^>]+id="select-version"[^>]*>(.+?)</select>',
            webpage, 'select version', default=None)
        if select:
            entry_ids = set()
            entries = []
            for mobj in re.finditer(
                    r'<option[^>]+value=(["\'])(?P<id>[0-9a-z_]+)(?:#.+?)?\1[^>]*>(?P<title>[^<]+)',
                    webpage):
                entry_id = mobj.group('id')
                if entry_id in entry_ids:
                    continue
                entry_ids.add(entry_id)
                entries.append({
                    '_type': 'url_transparent',
                    'url': 'kaltura:%s:%s' % (partner_id, entry_id),
                    'ie_key': 'Kaltura',
                    'title': mobj.group('title'),
                })
            if entries:
                return self.playlist_result(entries, display_id, title)
        entry_id = self._search_regex(
            r'<a[^>]+id=(["\'])embed-kaltura\1[^>]+data-kaltura=(["\'])(?P<id>[0-9a-z_]+)\2',
            webpage, 'kaltura entry_id', group='id')
        return {
            '_type': 'url_transparent',
            'display_id': display_id,
            'url': 'kaltura:%s:%s' % (partner_id, entry_id),
            'ie_key': 'Kaltura',
            'title': title
        }
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@ -53,6 +53,7 @@ from ..utils import (
    mimetype2ext,
    update_Request,
    update_url_query,
+    parse_m3u8_attributes,
 )
@ -1150,23 +1151,11 @@ class InfoExtractor(object):
            }]
        last_info = None
        last_media = None
-        kv_rex = re.compile(
-            r'(?P<key>[a-zA-Z_-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)')
        for line in m3u8_doc.splitlines():
            if line.startswith('#EXT-X-STREAM-INF:'):
-                last_info = {}
-                for m in kv_rex.finditer(line):
-                    v = m.group('val')
-                    if v.startswith('"'):
-                        v = v[1:-1]
-                    last_info[m.group('key')] = v
+                last_info = parse_m3u8_attributes(line)
            elif line.startswith('#EXT-X-MEDIA:'):
-                last_media = {}
-                for m in kv_rex.finditer(line):
-                    v = m.group('val')
-                    if v.startswith('"'):
-                        v = v[1:-1]
-                    last_media[m.group('key')] = v
+                last_media = parse_m3u8_attributes(line)
            elif line.startswith('#') or not line.strip():
                continue
            else:
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@ -44,7 +44,6 @@ from .archiveorg import ArchiveOrgIE
 from .ard import (
    ARDIE,
    ARDMediathekIE,
-    SportschauIE,
 )
 from .arte import (
    ArteTvIE,
@ -141,6 +140,7 @@ from .cliprs import ClipRsIE
 from .clipfish import ClipfishIE
 from .cliphunter import CliphunterIE
 from .clipsyndicate import ClipsyndicateIE
+from .closertotruth import CloserToTruthIE
 from .cloudy import CloudyIE
 from .clubic import ClubicIE
 from .clyp import ClypIE
@ -285,7 +285,6 @@ from .gameone import (
 from .gamersyde import GamersydeIE
 from .gamespot import GameSpotIE
 from .gamestar import GameStarIE
-from .gametrailers import GametrailersIE
 from .gazeta import GazetaIE
 from .gdcvault import GDCVaultIE
 from .generic import GenericIE
@ -481,7 +480,6 @@ from .nbc import (
    NBCNewsIE,
    NBCSportsIE,
    NBCSportsVPlayerIE,
-    MSNBCIE,
 )
 from .ndr import (
    NDRIE,
@ -631,7 +629,10 @@ from .qqmusic import (
    QQMusicToplistIE,
    QQMusicPlaylistIE,
 )
-from .r7 import R7IE
+from .r7 import (
+    R7IE,
+    R7ArticleIE,
+)
 from .radiocanada import (
    RadioCanadaIE,
    RadioCanadaAudioVideoIE,
@ -747,6 +748,7 @@ from .sportbox import (
    SportBoxEmbedIE,
 )
 from .sportdeutschland import SportDeutschlandIE
+from .sportschau import SportschauIE
 from .srgssr import (
    SRGSSRIE,
    SRGSSRPlayIE,
--- a/youtube_dl/extractor/facebook.py
+++ b/youtube_dl/extractor/facebook.py
@ -239,6 +239,8 @@ class FacebookIE(InfoExtractor):
        formats = []
        for format_id, f in video_data.items():
+            if f and isinstance(f, dict):
+                f = [f]
            if not f or not isinstance(f, list):
                continue
            for quality in ('sd', 'hd'):
--- a/youtube_dl/extractor/foxsports.py
+++ b/youtube_dl/extractor/foxsports.py
@ -1,7 +1,10 @@
 from __future__ import unicode_literals
 from .common import InfoExtractor
-from ..utils import smuggle_url
+from ..utils import (
    smuggle_url,
    update_url_query,
 )
 class FoxSportsIE(InfoExtractor):
@ -9,11 +12,15 @@ class FoxSportsIE(InfoExtractor):
    _TEST = {
        'url': 'http://www.foxsports.com/video?vid=432609859715',
        'md5': 'b49050e955bebe32c301972e4012ac17',
        'info_dict': {
-            'id': 'gA0bHB3Ladz3',
+            'id': 'i0qKWsk3qJaM',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Courtney Lee on going up 2-0 in series vs. Blazers',
            'description': 'Courtney Lee talks about Memphis being focused.',
            'upload_date': '20150423',
            'timestamp': 1429761109,
            'uploader': 'NEWA-FNG-FOXSPORTS',
        },
        'add_ie': ['ThePlatform'],
    }
@ -28,5 +35,8 @@ class FoxSportsIE(InfoExtractor):
                r"data-player-config='([^']+)'", webpage, 'data player config'),
            video_id)
-        return self.url_result(smuggle_url(
-            config['releaseURL'] + '&manifest=f4m', {'force_smil_url': True}))
+        return self.url_result(smuggle_url(update_url_query(
+            config['releaseURL'], {
+                'mbr': 'true',
+                'switch': 'http',
+            }), {'force_smil_url': True}))
--- a/youtube_dl/extractor/gamespot.py
+++ b/youtube_dl/extractor/gamespot.py
@ -1,19 +1,19 @@
 from __future__ import unicode_literals
 import re
 import json
-from .common import InfoExtractor
+from .once import OnceIE
 from ..compat import (
    compat_urllib_parse_unquote,
    compat_urlparse,
 )
 from ..utils import (
    unescapeHTML,
    url_basename,
    dict_get,
 )
-class GameSpotIE(InfoExtractor):
+class GameSpotIE(OnceIE):
    _VALID_URL = r'https?://(?:www\.)?gamespot\.com/.*-(?P<id>\d+)/?'
    _TESTS = [{
        'url': 'http://www.gamespot.com/videos/arma-3-community-guide-sitrep-i/2300-6410818/',
@ -39,29 +39,73 @@ class GameSpotIE(InfoExtractor):
        webpage = self._download_webpage(url, page_id)
        data_video_json = self._search_regex(
            r'data-video=["\'](.*?)["\']', webpage, 'data video')
-        data_video = json.loads(unescapeHTML(data_video_json))
+        data_video = self._parse_json(unescapeHTML(data_video_json), page_id)
        streams = data_video['videoStreams']
        manifest_url = None
        formats = []
        f4m_url = streams.get('f4m_stream')
-        if f4m_url is not None:
+        if f4m_url:
-            # Transform the manifest url to a link to the mp4 files
+            manifest_url = f4m_url
-            # they are used in mobile devices.
+            formats.extend(self._extract_f4m_formats(
-            f4m_path = compat_urlparse.urlparse(f4m_url).path
+                f4m_url + '?hdcore=3.7.0', page_id, f4m_id='hds', fatal=False))
        m3u8_url = streams.get('m3u8_stream')
        if m3u8_url:
            manifest_url = m3u8_url
            m3u8_formats = self._extract_m3u8_formats(
                m3u8_url, page_id, 'mp4', 'm3u8_native',
                m3u8_id='hls', fatal=False)
            formats.extend(m3u8_formats)
        progressive_url = dict_get(
            streams, ('progressive_hd', 'progressive_high', 'progressive_low'))
        if progressive_url and manifest_url:
            qualities_basename = self._search_regex(
                '/([^/]+)\.csmil/',
                manifest_url, 'qualities basename', default=None)
            if qualities_basename:
                QUALITIES_RE = r'((,\d+)+,?)'
-            qualities = self._search_regex(QUALITIES_RE, f4m_path, 'qualities').strip(',').split(',')
+                qualities = self._search_regex(
-            http_path = f4m_path[1:].split('/', 1)[1]
+                    QUALITIES_RE, qualities_basename,
-            http_template = re.sub(QUALITIES_RE, r'%s', http_path)
+                    'qualities', default=None)
-            http_template = http_template.replace('.csmil/manifest.f4m', '')
+                if qualities:
-            http_template = compat_urlparse.urljoin(
+                    qualities = list(map(lambda q: int(q), qualities.strip(',').split(',')))
-                'http://video.gamespotcdn.com/', http_template)
+                    qualities.sort()
                    http_template = re.sub(QUALITIES_RE, r'%d', qualities_basename)
                    http_url_basename = url_basename(progressive_url)
                    if m3u8_formats:
                        self._sort_formats(m3u8_formats)
                        m3u8_formats = list(filter(
                            lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
                            m3u8_formats))
                    if len(qualities) == len(m3u8_formats):
                        for q, m3u8_format in zip(qualities, m3u8_formats):
                            f = m3u8_format.copy()
                            f.update({
                                'url': progressive_url.replace(
                                    http_url_basename, http_template % q),
                                'format_id': f['format_id'].replace('hls', 'http'),
                                'protocol': 'http',
                            })
                            formats.append(f)
                    else:
                        for q in qualities:
                            formats.append({
-                    'url': http_template % q,
+                                'url': progressive_url.replace(
                                    http_url_basename, http_template % q),
                                'ext': 'mp4',
-                    'format_id': q,
+                                'format_id': 'http-%d' % q,
                                'tbr': q,
                            })
-        else:
+
        onceux_json = self._search_regex(
            r'data-onceux-options=["\'](.*?)["\']', webpage, 'data video', default=None)
        if onceux_json:
            onceux_url = self._parse_json(unescapeHTML(onceux_json), page_id).get('metadataUri')
            if onceux_url:
                formats.extend(self._extract_once_formats(re.sub(
                    r'https?://[^/]+', 'http://once.unicornmedia.com', onceux_url).replace('ads/vmap/', '')))
        if not formats:
            for quality in ['sd', 'hd']:
                # It's actually a link to a flv file
                flv_url = streams.get('f4m_{0}'.format(quality))
@ -71,6 +115,7 @@ class GameSpotIE(InfoExtractor):
                        'ext': 'flv',
                        'format_id': quality,
                    })
        self._sort_formats(formats)
        return {
            'id': data_video['guid'],
--- a/youtube_dl/extractor/gametrailers.py
+++ b/youtube_dl/extractor/gametrailers.py
@ -1,62 +0,0 @@
 from __future__ import unicode_literals
 from .common import InfoExtractor
 from ..utils import (
    int_or_none,
    parse_age_limit,
    url_basename,
 )
 class GametrailersIE(InfoExtractor):
    _VALID_URL = r'https?://www\.gametrailers\.com/videos/view/[^/]+/(?P<id>.+)'
    _TEST = {
        'url': 'http://www.gametrailers.com/videos/view/gametrailers-com/116437-Just-Cause-3-Review',
        'md5': 'f28c4efa0bdfaf9b760f6507955b6a6a',
        'info_dict': {
            'id': '2983958',
            'ext': 'mp4',
            'display_id': '116437-Just-Cause-3-Review',
            'title': 'Just Cause 3 - Review',
            'description': 'It\'s a lot of fun to shoot at things and then watch them explode in Just Cause 3, but should there be more to the experience than that?',
        },
    }
    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        title = self._html_search_regex(
            r'<title>(.+?)\|', webpage, 'title').strip()
        embed_url = self._proto_relative_url(
            self._search_regex(
                r'src=\'(//embed.gametrailers.com/embed/[^\']+)\'', webpage,
                'embed url'),
            scheme='http:')
        video_id = url_basename(embed_url)
        embed_page = self._download_webpage(embed_url, video_id)
        embed_vars_json = self._search_regex(
            r'(?s)var embedVars = (\{.*?\})\s*</script>', embed_page,
            'embed vars')
        info = self._parse_json(embed_vars_json, video_id)
        formats = []
        for media in info['media']:
            if media['mediaPurpose'] == 'play':
                formats.append({
                    'url': media['uri'],
                    'height': media['height'],
                    'width:': media['width'],
                })
        self._sort_formats(formats)
        return {
            'id': video_id,
            'display_id': display_id,
            'title': title,
            'formats': formats,
            'thumbnail': info.get('thumbUri'),
            'description': self._og_search_description(webpage),
            'duration': int_or_none(info.get('videoLengthInSeconds')),
            'age_limit': parse_age_limit(info.get('audienceRating')),
        }
--- a/youtube_dl/extractor/mtv.py
+++ b/youtube_dl/extractor/mtv.py
@ -85,9 +85,10 @@ class MTVServicesInfoExtractor(InfoExtractor):
                rtmp_video_url = rendition.find('./src').text
                if rtmp_video_url.endswith('siteunavail.png'):
                    continue
+                new_url = self._transform_rtmp_url(rtmp_video_url)
                formats.append({
-                    'ext': ext,
-                    'url': self._transform_rtmp_url(rtmp_video_url),
+                    'ext': 'flv' if new_url.startswith('rtmp') else ext,
+                    'url': new_url,
                    'format_id': rendition.get('bitrate'),
                    'width': int(rendition.get('width')),
                    'height': int(rendition.get('height')),
--- a/youtube_dl/extractor/nbc.py
+++ b/youtube_dl/extractor/nbc.py
@ -9,10 +9,6 @@ from ..utils import (
    lowercase_escape,
    smuggle_url,
    unescapeHTML,
-    update_url_query,
-    int_or_none,
-    HEADRequest,
-    parse_iso8601,
 )
@ -192,9 +188,9 @@ class CSNNEIE(InfoExtractor):
 class NBCNewsIE(ThePlatformIE):
-    _VALID_URL = r'''(?x)https?://(?:www\.)?(?:nbcnews|today)\.com/
+    _VALID_URL = r'''(?x)https?://(?:www\.)?(?:nbcnews|today|msnbc)\.com/
        (?:video/.+?/(?P<id>\d+)|
-        ([^/]+/)*(?P<display_id>[^/?]+))
+        ([^/]+/)*(?:.*-)?(?P<mpx_id>[^/?]+))
        '''
    _TESTS = [
@ -216,13 +212,16 @@ class NBCNewsIE(ThePlatformIE):
                'ext': 'mp4',
                'title': 'How Twitter Reacted To The Snowden Interview',
                'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
                'uploader': 'NBCU-NEWS',
                'timestamp': 1401363060,
                'upload_date': '20140529',
            },
        },
        {
            'url': 'http://www.nbcnews.com/feature/dateline-full-episodes/full-episode-family-business-n285156',
            'md5': 'fdbf39ab73a72df5896b6234ff98518a',
            'info_dict': {
-                'id': 'Wjf9EDR3A_60',
+                'id': '529953347624',
                'ext': 'mp4',
                'title': 'FULL EPISODE: Family Business',
                'description': 'md5:757988edbaae9d7be1d585eb5d55cc04',
@ -237,6 +236,9 @@ class NBCNewsIE(ThePlatformIE):
                'ext': 'mp4',
                'title': 'Nightly News with Brian Williams Full Broadcast (February 4)',
                'description': 'md5:1c10c1eccbe84a26e5debb4381e2d3c5',
                'timestamp': 1423104900,
                'uploader': 'NBCU-NEWS',
                'upload_date': '20150205',
            },
        },
        {
@ -245,10 +247,12 @@ class NBCNewsIE(ThePlatformIE):
            'info_dict': {
                'id': '529953347624',
                'ext': 'mp4',
-                'title': 'Volkswagen U.S. Chief: We \'Totally Screwed Up\'',
+                'title': 'Volkswagen U.S. Chief:\xa0 We Have Totally Screwed Up',
-                'description': 'md5:d22d1281a24f22ea0880741bb4dd6301',
+                'description': 'md5:c8be487b2d80ff0594c005add88d8351',
                'upload_date': '20150922',
                'timestamp': 1442917800,
                'uploader': 'NBCU-NEWS',
            },
            'expected_warnings': ['http-6000 is not available']
        },
        {
            'url': 'http://www.today.com/video/see-the-aurora-borealis-from-space-in-stunning-new-nasa-video-669831235788',
@ -260,6 +264,22 @@ class NBCNewsIE(ThePlatformIE):
                'description': 'md5:74752b7358afb99939c5f8bb2d1d04b1',
                'upload_date': '20160420',
                'timestamp': 1461152093,
                'uploader': 'NBCU-NEWS',
            },
        },
        {
            'url': 'http://www.msnbc.com/all-in-with-chris-hayes/watch/the-chaotic-gop-immigration-vote-314487875924',
            'md5': '6d236bf4f3dddc226633ce6e2c3f814d',
            'info_dict': {
                'id': '314487875924',
                'ext': 'mp4',
                'title': 'The chaotic GOP immigration vote',
                'description': 'The Republican House votes on a border bill that has no chance of getting through the Senate or signed by the President and is drawing criticism from all sides.',
                'thumbnail': 're:^https?://.*\.jpg$',
                'timestamp': 1406937606,
                'upload_date': '20140802',
                'uploader': 'NBCU-NEWS',
                'categories': ['MSNBC/Topics/Franchise/Best of last night', 'MSNBC/Topics/General/Congress'],
            },
        },
        {
@ -290,15 +310,16 @@ class NBCNewsIE(ThePlatformIE):
            }
        else:
            # "feature" and "nightly-news" pages use theplatform.com
-            display_id = mobj.group('display_id')
+            video_id = mobj.group('mpx_id')
-            webpage = self._download_webpage(url, display_id)
+            if not video_id.isdigit():
                webpage = self._download_webpage(url, video_id)
                info = None
                bootstrap_json = self._search_regex(
                    [r'(?m)(?:var\s+(?:bootstrapJson|playlistData)|NEWS\.videoObj)\s*=\s*({.+});?\s*$',
                     r'videoObj\s*:\s*({.+})', r'data-video="([^"]+)"'],
                    webpage, 'bootstrap json', default=None)
                bootstrap = self._parse_json(
-                bootstrap_json, display_id, transform_source=unescapeHTML)
+                    bootstrap_json, video_id, transform_source=unescapeHTML)
                if 'results' in bootstrap:
                    info = bootstrap['results'][0]['video']
                elif 'video' in bootstrap:
@ -306,89 +327,11 @@ class NBCNewsIE(ThePlatformIE):
                else:
                    info = bootstrap
                video_id = info['mpxId']
-            title = info['title']
-            subtitles = {}
-            caption_links = info.get('captionLinks')
-            if caption_links:
-                for (sub_key, sub_ext) in (('smpte-tt', 'ttml'), ('web-vtt', 'vtt'), ('srt', 'srt')):
-                    sub_url = caption_links.get(sub_key)
-                    if sub_url:
-                        subtitles.setdefault('en', []).append({
-                            'url': sub_url,
-                            'ext': sub_ext,
-                        })
-            formats = []
-            for video_asset in info['videoAssets']:
-                video_url = video_asset.get('publicUrl')
-                if not video_url:
-                    continue
-                container = video_asset.get('format')
-                asset_type = video_asset.get('assetType') or ''
-                if container == 'ISM' or asset_type == 'FireTV-Once':
-                    continue
-                elif asset_type == 'OnceURL':
-                    tp_formats, tp_subtitles = self._extract_theplatform_smil(
-                        video_url, video_id)
-                    formats.extend(tp_formats)
-                    subtitles = self._merge_subtitles(subtitles, tp_subtitles)
-                else:
-                    tbr = int_or_none(video_asset.get('bitRate') or video_asset.get('bitrate'), 1000)
-                    format_id = 'http%s' % ('-%d' % tbr if tbr else '')
-                    video_url = update_url_query(
-                        video_url, {'format': 'redirect'})
-                    # resolve the url so that we can check availability and detect the correct extension
-                    head = self._request_webpage(
-                        HEADRequest(video_url), video_id,
-                        'Checking %s url' % format_id,
-                        '%s is not available' % format_id,
-                        fatal=False)
-                    if head:
-                        video_url = head.geturl()
-                        formats.append({
-                            'format_id': format_id,
-                            'url': video_url,
-                            'width': int_or_none(video_asset.get('width')),
-                            'height': int_or_none(video_asset.get('height')),
-                            'tbr': tbr,
-                            'container': video_asset.get('format'),
-                        })
-            self._sort_formats(formats)
            return {
+                '_type': 'url_transparent',
                'id': video_id,
-                'title': title,
-                'description': info.get('description'),
-                'thumbnail': info.get('thumbnail'),
-                'duration': int_or_none(info.get('duration')),
-                'timestamp': parse_iso8601(info.get('pubDate') or info.get('pub_date')),
-                'formats': formats,
-                'subtitles': subtitles,
+                # http://feed.theplatform.com/f/2E2eJC/nbcnews also works
+                'url': 'http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews?byId=%s' % video_id,
+                'ie_key': 'ThePlatformFeed',
            }
-class MSNBCIE(InfoExtractor):
-    # https URLs redirect to corresponding http ones
-    _VALID_URL = r'https?://www\.msnbc\.com/[^/]+/watch/(?P<id>[^/]+)'
-    _TEST = {
-        'url': 'http://www.msnbc.com/all-in-with-chris-hayes/watch/the-chaotic-gop-immigration-vote-314487875924',
-        'md5': '6d236bf4f3dddc226633ce6e2c3f814d',
-        'info_dict': {
-            'id': 'n_hayes_Aimm_140801_272214',
-            'ext': 'mp4',
-            'title': 'The chaotic GOP immigration vote',
-            'description': 'The Republican House votes on a border bill that has no chance of getting through the Senate or signed by the President and is drawing criticism from all sides.',
-            'thumbnail': 're:^https?://.*\.jpg$',
-            'timestamp': 1406937606,
-            'upload_date': '20140802',
-            'uploader': 'NBCU-NEWS',
-            'categories': ['MSNBC/Topics/Franchise/Best of last night', 'MSNBC/Topics/General/Congress'],
-        },
-    }
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-        embed_url = self._html_search_meta('embedURL', webpage)
-        return self.url_result(embed_url)
--- a/youtube_dl/extractor/r7.py
+++ b/youtube_dl/extractor/r7.py
@ -2,15 +2,12 @@
 from __future__ import unicode_literals
 from .common import InfoExtractor
-from ..utils import (
-    js_to_json,
-    unescapeHTML,
-    int_or_none,
-)
+from ..utils import int_or_none
 class R7IE(InfoExtractor):
-    _VALID_URL = r'''(?x)https?://
+    _VALID_URL = r'''(?x)
+                        https?://
                        (?:
                            (?:[a-zA-Z]+)\.r7\.com(?:/[^/]+)+/idmedia/|
                            noticias\.r7\.com(?:/[^/]+)+/[^/]+-|
@ -25,6 +22,7 @@ class R7IE(InfoExtractor):
            'id': '54e7050b0cf2ff57e0279389',
            'ext': 'mp4',
            'title': 'Policiais humilham suspeito à beira da morte: "Morre com dignidade"',
            'description': 'md5:01812008664be76a6479aa58ec865b72',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 98,
            'like_count': int,
@ -44,45 +42,72 @@ class R7IE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
-        webpage = self._download_webpage(
-            'http://player.r7.com/video/i/%s' % video_id, video_id)
-        item = self._parse_json(js_to_json(self._search_regex(
-            r'(?s)var\s+item\s*=\s*({.+?});', webpage, 'player')), video_id)
-        title = unescapeHTML(item['title'])
-        thumbnail = item.get('init', {}).get('thumbUri')
-        duration = None
-        statistics = item.get('statistics', {})
-        like_count = int_or_none(statistics.get('likes'))
-        view_count = int_or_none(statistics.get('views'))
+        video = self._download_json(
+            'http://player-api.r7.com/video/i/%s' % video_id, video_id)
+        title = video['title']
        formats = []
-        for format_key, format_dict in item['playlist'][0].items():
-            src = format_dict.get('src')
-            if not src:
-                continue
-            format_id = format_dict.get('format') or format_key
-            if duration is None:
-                duration = format_dict.get('duration')
-            if '.f4m' in src:
-                formats.extend(self._extract_f4m_formats(src, video_id, preference=-1))
-            elif src.endswith('.m3u8'):
-                formats.extend(self._extract_m3u8_formats(src, video_id, 'mp4', preference=-2))
-            else:
-                formats.append({
-                    'url': src,
-                    'format_id': format_id,
-                })
+        media_url_hls = video.get('media_url_hls')
+        if media_url_hls:
+            formats.extend(self._extract_m3u8_formats(
+                media_url_hls, video_id, 'mp4', entry_protocol='m3u8_native',
+                m3u8_id='hls', fatal=False))
+        media_url = video.get('media_url')
+        if media_url:
+            f = {
+                'url': media_url,
+                'format_id': 'http',
+            }
+            # m3u8 format always matches the http format, let's copy metadata from
+            # one to another
+            m3u8_formats = list(filter(
+                lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
+                formats))
+            if len(m3u8_formats) == 1:
+                f_copy = m3u8_formats[0].copy()
+                f_copy.update(f)
+                f_copy['protocol'] = 'http'
+                f = f_copy
+            formats.append(f)
        self._sort_formats(formats)
+        description = video.get('description')
+        thumbnail = video.get('thumb')
+        duration = int_or_none(video.get('media_duration'))
+        like_count = int_or_none(video.get('likes'))
+        view_count = int_or_none(video.get('views'))
        return {
            'id': video_id,
            'title': title,
+            'description': description,
            'thumbnail': thumbnail,
            'duration': duration,
            'like_count': like_count,
            'view_count': view_count,
            'formats': formats,
        }
 class R7ArticleIE(InfoExtractor):
    _VALID_URL = r'https?://(?:[a-zA-Z]+)\.r7\.com/(?:[^/]+/)+[^/?#&]+-(?P<id>\d+)'
    _TEST = {
        'url': 'http://tv.r7.com/record-play/balanco-geral/videos/policiais-humilham-suspeito-a-beira-da-morte-morre-com-dignidade-16102015',
        'only_matching': True,
    }
    @classmethod
    def suitable(cls, url):
        return False if R7IE.suitable(url) else super(R7ArticleIE, cls).suitable(url)
    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        video_id = self._search_regex(
            r'<div[^>]+(?:id=["\']player-|class=["\']embed["\'][^>]+id=["\'])([\da-f]{24})',
            webpage, 'video id')
        return self.url_result('http://player.r7.com/video/i/%s' % video_id, R7IE.ie_key())
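Note on the R7IE change above: the new code copies metadata from the single HLS variant onto the plain HTTP format. The following is a minimal standalone sketch of that merge step, not the extractor itself; `merge_http_format` and the sample URLs and values are hypothetical names introduced only for illustration.

```python
def merge_http_format(formats, http_format):
    """Return http_format enriched with metadata from the only video
    variant in `formats`, or unchanged if there is not exactly one."""
    # Same filter as in the diff: skip audio-only entries and the
    # multi-resolution master playlist entry.
    candidates = [
        f for f in formats
        if f.get('vcodec') != 'none' and f.get('resolution') != 'multiple'
    ]
    if len(candidates) == 1:
        merged = candidates[0].copy()   # do not mutate the HLS entry
        merged.update(http_format)      # HTTP url/format_id take precedence
        merged['protocol'] = 'http'
        return merged
    return http_format


# Hypothetical sample data, not taken from a real R7 response.
hls_formats = [{
    'url': 'http://example.com/master/index.m3u8',
    'format_id': 'hls-800',
    'protocol': 'm3u8_native',
    'width': 640,
    'height': 360,
    'tbr': 800,
}]
http_format = {'url': 'http://example.com/video.mp4', 'format_id': 'http'}
merged = merge_http_format(hls_formats, http_format)
```

Copying before updating keeps the original HLS entry intact, so both formats can be appended to the same list.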
--- a/youtube_dl/extractor/radiojavan.py
+++ b/youtube_dl/extractor/radiojavan.py
@ -3,7 +3,7 @@ from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
-from ..utils import(
+from ..utils import (
    unified_strdate,
    str_to_int,
 )
--- a/youtube_dl/extractor/sportschau.py
+++ b/youtube_dl/extractor/sportschau.py
@ -0,0 +1,38 @@
 # coding: utf-8
 from __future__ import unicode_literals
 from .wdr import WDRBaseIE
 from ..utils import get_element_by_attribute
 class SportschauIE(WDRBaseIE):
    IE_NAME = 'Sportschau'
    _VALID_URL = r'https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video-?(?P<id>[^/#?]+)\.html'
    _TEST = {
        'url': 'http://www.sportschau.de/uefaeuro2016/videos/video-dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100.html',
        'info_dict': {
            'id': 'mdb-1140188',
            'display_id': 'dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100',
            'ext': 'mp4',
            'title': 'DFB-Team geht gut gelaunt ins Spiel gegen Polen',
            'description': 'Vor dem zweiten Gruppenspiel gegen Polen herrscht gute Stimmung im deutschen Team. Insbesondere Bastian Schweinsteiger strotzt vor Optimismus nach seinem Tor gegen die Ukraine.',
            'upload_date': '20160615',
        },
        'skip': 'Geo-restricted to Germany',
    }
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        title = get_element_by_attribute('class', 'headline', webpage)
        description = self._html_search_meta('description', webpage, 'description')
        info = self._extract_wdr_video(webpage, video_id)
        info.update({
            'title': title,
            'description': description,
        })
        return info
--- a/youtube_dl/extractor/streamcloud.py
+++ b/youtube_dl/extractor/streamcloud.py
@ -6,7 +6,6 @@ import re
 from .common import InfoExtractor
 from ..utils import (
    ExtractorError,
-    sanitized_Request,
    urlencode_postdata,
 )
@ -45,20 +44,26 @@ class StreamcloudIE(InfoExtractor):
            (?:id="[^"]+"\s+)?
            value="([^"]*)"
            ''', orig_webpage)
-        post = urlencode_postdata(fields)
        self._sleep(12, video_id)
-        headers = {
-            b'Content-Type': b'application/x-www-form-urlencoded',
-        }
-        req = sanitized_Request(url, post, headers)
        webpage = self._download_webpage(
-            req, video_id, note='Downloading video page ...')
+            url, video_id, data=urlencode_postdata(fields), headers={
+                b'Content-Type': b'application/x-www-form-urlencoded',
+            })
        try:
            title = self._html_search_regex(
                r'<h1[^>]*>([^<]+)<', webpage, 'title')
            video_url = self._search_regex(
                r'file:\s*"([^"]+)"', webpage, 'video URL')
        except ExtractorError:
            message = self._html_search_regex(
                r'(?s)<div[^>]+class=(["\']).*?msgboxinfo.*?\1[^>]*>(?P<message>.+?)</div>',
                webpage, 'message', default=None, group='message')
            if message:
                raise ExtractorError('%s said: %s' % (self.IE_NAME, message), expected=True)
            raise
        thumbnail = self._search_regex(
            r'image:\s*"([^"]+)"', webpage, 'thumbnail URL', fatal=False)
--- a/youtube_dl/extractor/svt.py
+++ b/youtube_dl/extractor/svt.py
@ -6,17 +6,14 @@ import re
 from .common import InfoExtractor
 from ..utils import (
    determine_ext,
    dict_get,
    int_or_none,
    try_get,
 )
 class SVTBaseIE(InfoExtractor):
-    def _extract_video(self, url, video_id):
-        info = self._download_json(url, video_id)
-        title = info['context']['title']
-        thumbnail = info['context'].get('thumbnailImage')
-        video_info = info['video']
+    def _extract_video(self, video_info, video_id):
        formats = []
        for vr in video_info['videoReferences']:
            player_type = vr.get('playerType')
@ -40,27 +37,49 @@ class SVTBaseIE(InfoExtractor):
                    'format_id': player_type,
                    'url': vurl,
                })
        if not formats and video_info.get('rights', {}).get('geoBlockedSweden'):
            self.raise_geo_restricted('This video is only available in Sweden')
        self._sort_formats(formats)
        subtitles = {}
-        subtitle_references = video_info.get('subtitleReferences')
+        subtitle_references = dict_get(video_info, ('subtitles', 'subtitleReferences'))
        if isinstance(subtitle_references, list):
            for sr in subtitle_references:
                subtitle_url = sr.get('url')
+                subtitle_lang = sr.get('language', 'sv')
                if subtitle_url:
-                    subtitles.setdefault('sv', []).append({'url': subtitle_url})
-        duration = video_info.get('materialLength')
-        age_limit = 18 if video_info.get('inappropriateForChildren') else 0
+                    if determine_ext(subtitle_url) == 'm3u8':
+                        # TODO(yan12125): handle WebVTT in m3u8 manifests
+                        continue
+                    subtitles.setdefault(subtitle_lang, []).append({'url': subtitle_url})
+
+        title = video_info.get('title')
+        series = video_info.get('programTitle')
+        season_number = int_or_none(video_info.get('season'))
+        episode = video_info.get('episodeTitle')
+        episode_number = int_or_none(video_info.get('episodeNumber'))
+        duration = int_or_none(dict_get(video_info, ('materialLength', 'contentDuration')))
+        age_limit = None
+        adult = dict_get(
+            video_info, ('inappropriateForChildren', 'blockedForChildren'),
+            skip_false_values=False)
+        if adult is not None:
+            age_limit = 18 if adult else 0
        return {
            'id': video_id,
            'title': title,
            'formats': formats,
            'subtitles': subtitles,
-            'thumbnail': thumbnail,
            'duration': duration,
            'age_limit': age_limit,
+            'series': series,
+            'season_number': season_number,
+            'episode': episode,
+            'episode_number': episode_number,
        }
@ -68,11 +87,11 @@ class SVTIE(SVTBaseIE):
    _VALID_URL = r'https?://(?:www\.)?svt\.se/wd\?(?:.*?&)?widgetId=(?P<widget_id>\d+)&.*?\barticleId=(?P<id>\d+)'
    _TEST = {
        'url': 'http://www.svt.se/wd?widgetId=23991&sectionId=541&articleId=2900353&type=embed&contextSectionId=123&autostart=false',
-        'md5': '9648197555fc1b49e3dc22db4af51d46',
+        'md5': '33e9a5d8f646523ce0868ecfb0eed77d',
        'info_dict': {
            'id': '2900353',
-            'ext': 'flv',
+            'ext': 'mp4',
-            'title': 'Här trycker Jagr till Giroux (under SVT-intervjun)',
+            'title': 'Stjärnorna skojar till det - under SVT-intervjun',
            'duration': 27,
            'age_limit': 0,
        },
@ -89,15 +108,20 @@ class SVTIE(SVTBaseIE):
        mobj = re.match(self._VALID_URL, url)
        widget_id = mobj.group('widget_id')
        article_id = mobj.group('id')
-        return self._extract_video(
+
+        info = self._download_json(
            'http://www.svt.se/wd?widgetId=%s&articleId=%s&format=json&type=embed&output=json' % (widget_id, article_id),
            article_id)
+        info_dict = self._extract_video(info['video'], article_id)
+        info_dict['title'] = info['context']['title']
+        return info_dict
 class SVTPlayIE(SVTBaseIE):
    IE_DESC = 'SVT Play and Öppet arkiv'
-    _VALID_URL = r'https?://(?:www\.)?(?P<host>svtplay|oppetarkiv)\.se/video/(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:svtplay|oppetarkiv)\.se/video/(?P<id>[0-9]+)'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.svtplay.se/video/5996901/flygplan-till-haile-selassie/flygplan-till-haile-selassie-2',
        'md5': '2b6704fe4a28801e1a098bbf3c5ac611',
        'info_dict': {
@ -113,12 +137,47 @@ class SVTPlayIE(SVTBaseIE):
                }]
            },
        },
-    }
+    }, {
        # geo restricted to Sweden
        'url': 'http://www.oppetarkiv.se/video/5219710/trollflojten',
        'only_matching': True,
    }]
    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        host = mobj.group('host')
-        return self._extract_video(
-            'http://www.%s.se/video/%s?output=json' % (host, video_id),
-            video_id)
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        data = self._parse_json(
+            self._search_regex(
+                r'root\["__svtplay"\]\s*=\s*([^;]+);',
+                webpage, 'embedded data', default='{}'),
+            video_id, fatal=False)
+        thumbnail = self._og_search_thumbnail(webpage)
+        if data:
+            video_info = try_get(
+                data, lambda x: x['context']['dispatcher']['stores']['VideoTitlePageStore']['data']['video'],
+                dict)
+            if video_info:
+                info_dict = self._extract_video(video_info, video_id)
+                info_dict.update({
+                    'title': data['context']['dispatcher']['stores']['MetaStore']['title'],
+                    'thumbnail': thumbnail,
+                })
+                return info_dict
+        video_id = self._search_regex(
+            r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
+            webpage, 'video id', default=None)
+        if video_id:
+            data = self._download_json(
+                'http://www.svt.se/videoplayer-api/video/%s' % video_id, video_id)
+            info_dict = self._extract_video(data, video_id)
+            if not info_dict.get('title'):
+                info_dict['title'] = re.sub(
+                    r'\s*\|\s*.+?$', '',
+                    info_dict.get('episode') or self._og_search_title(webpage))
+            return info_dict
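Note on the `age_limit` change in svt.py above: the new code passes `skip_false_values=False` so that an explicit `False` flag yields 0 rather than being treated as missing. Below is a small sketch of that distinction. The `dict_get` here is a simplified local stand-in written for this note, assumed to mirror the helper's behaviour for a tuple of keys; it is not the library code, and `age_limit_from` is a hypothetical wrapper.

```python
def dict_get(d, keys, default=None, skip_false_values=True):
    # Simplified stand-in (assumption): return the first usable value
    # found under any of `keys`, optionally skipping falsy values.
    for key in keys:
        if key not in d or d[key] is None:
            continue
        if skip_false_values and not d[key]:
            continue
        return d[key]
    return default


def age_limit_from(video_info):
    # Hypothetical helper wrapping the logic shown in the diff.
    adult = dict_get(
        video_info, ('inappropriateForChildren', 'blockedForChildren'),
        skip_false_values=False)
    if adult is None:
        return None          # flag absent: age limit unknown
    return 18 if adult else 0


limits = [
    age_limit_from({'inappropriateForChildren': False}),
    age_limit_from({'blockedForChildren': True}),
    age_limit_from({}),
]
```

With the default `skip_false_values=True`, the first case would fall through to `None`, which is the situation the keyword argument avoids.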
--- a/youtube_dl/extractor/tf1.py
+++ b/youtube_dl/extractor/tf1.py
@ -48,6 +48,6 @@ class TF1IE(InfoExtractor):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        wat_id = self._html_search_regex(
-            r'(["\'])(?:https?:)?//www\.wat\.tv/embedframe/.*?(?P<id>\d{8}).*?\1',
+            r'(["\'])(?:https?:)?//www\.wat\.tv/embedframe/.*?(?P<id>\d{8})\1',
            webpage, 'wat id', group='id')
        return self.url_result('wat:%s' % wat_id, 'Wat')
--- a/youtube_dl/extractor/theplatform.py
+++ b/youtube_dl/extractor/theplatform.py
@ -277,9 +277,9 @@ class ThePlatformIE(ThePlatformBaseIE):
 class ThePlatformFeedIE(ThePlatformBaseIE):
-    _URL_TEMPLATE = '%s//feed.theplatform.com/f/%s/%s?form=json&byGuid=%s'
+    _URL_TEMPLATE = '%s//feed.theplatform.com/f/%s/%s?form=json&%s'
-    _VALID_URL = r'https?://feed\.theplatform\.com/f/(?P<provider_id>[^/]+)/(?P<feed_id>[^?/]+)\?(?:[^&]+&)*byGuid=(?P<id>[a-zA-Z0-9_]+)'
+    _VALID_URL = r'https?://feed\.theplatform\.com/f/(?P<provider_id>[^/]+)/(?P<feed_id>[^?/]+)\?(?:[^&]+&)*(?P<filter>by(?:Gui|I)d=(?P<id>[\w-]+))'
-    _TEST = {
+    _TESTS = [{
        # From http://player.theplatform.com/p/7wvmTC/MSNBCEmbeddedOffSite?guid=n_hardball_5biden_140207
        'url': 'http://feed.theplatform.com/f/7wvmTC/msnbc_video-p-test?form=json&pretty=true&range=-40&byGuid=n_hardball_5biden_140207',
        'md5': '6e32495b5073ab414471b615c5ded394',
@ -295,30 +295,36 @@ class ThePlatformFeedIE(ThePlatformBaseIE):
            'categories': ['MSNBC/Issues/Democrats', 'MSNBC/Issues/Elections/Election 2016'],
            'uploader': 'NBCU-NEWS',
        },
-    }
+    }]
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-
-        video_id = mobj.group('id')
-        provider_id = mobj.group('provider_id')
-        feed_id = mobj.group('feed_id')
-        real_url = self._URL_TEMPLATE % (self.http_scheme(), provider_id, feed_id, video_id)
-        feed = self._download_json(real_url, video_id)
-        entry = feed['entries'][0]
+    def _extract_feed_info(self, provider_id, feed_id, filter_query, video_id, custom_fields=None, asset_types_query={}):
+        real_url = self._URL_TEMPLATE % (self.http_scheme(), provider_id, feed_id, filter_query)
+        entry = self._download_json(real_url, video_id)['entries'][0]
        formats = []
        subtitles = {}
        first_video_id = None
        duration = None
+        asset_types = []
        for item in entry['media$content']:
-            smil_url = item['plfile$url'] + '&mbr=true'
+            smil_url = item['plfile$url']
            cur_video_id = ThePlatformIE._match_id(smil_url)
            if first_video_id is None:
                first_video_id = cur_video_id
                duration = float_or_none(item.get('plfile$duration'))
-            cur_formats, cur_subtitles = self._extract_theplatform_smil(smil_url, video_id, 'Downloading SMIL data for %s' % cur_video_id)
+            for asset_type in item['plfile$assetTypes']:
+                if asset_type in asset_types:
+                    continue
+                asset_types.append(asset_type)
+                query = {
+                    'mbr': 'true',
+                    'formats': item['plfile$format'],
+                    'assetTypes': asset_type,
+                }
+                if asset_type in asset_types_query:
+                    query.update(asset_types_query[asset_type])
+                cur_formats, cur_subtitles = self._extract_theplatform_smil(update_url_query(
+                    smil_url, query), video_id, 'Downloading SMIL data for %s' % asset_type)
                formats.extend(cur_formats)
                subtitles = self._merge_subtitles(subtitles, cur_subtitles)
@ -344,5 +350,17 @@ class ThePlatformFeedIE(ThePlatformBaseIE):
            'timestamp': timestamp,
            'categories': categories,
        })
+        if custom_fields:
+            ret.update(custom_fields(entry))
+        return ret
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        provider_id = mobj.group('provider_id')
+        feed_id = mobj.group('feed_id')
+        filter_query = mobj.group('filter')
+        return self._extract_feed_info(provider_id, feed_id, filter_query, video_id)
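Note on the ThePlatformFeedIE change above: the new `_VALID_URL` captures the whole `byGuid=...` or `byId=...` pair in a `filter` group, which is then substituted into `_URL_TEMPLATE`. The sketch below replays that with the two patterns copied from the diff; `rebuild_feed_url`, its `scheme` parameter, and the `byId` sample URL are hypothetical additions for illustration.

```python
import re

# Both strings are copied from the diff above.
VALID_URL = (
    r'https?://feed\.theplatform\.com/f/(?P<provider_id>[^/]+)/'
    r'(?P<feed_id>[^?/]+)\?(?:[^&]+&)*'
    r'(?P<filter>by(?:Gui|I)d=(?P<id>[\w-]+))')
URL_TEMPLATE = '%s//feed.theplatform.com/f/%s/%s?form=json&%s'


def rebuild_feed_url(url, scheme='http:'):
    """Rebuild the canonical feed URL, keeping only the byGuid/byId filter."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return URL_TEMPLATE % (
        scheme, mobj.group('provider_id'), mobj.group('feed_id'),
        mobj.group('filter'))


by_guid = rebuild_feed_url(
    'http://feed.theplatform.com/f/7wvmTC/msnbc_video-p-test'
    '?form=json&pretty=true&range=-40&byGuid=n_hardball_5biden_140207')
by_id = rebuild_feed_url('http://feed.theplatform.com/f/abc/def?byId=123-x')
```

Capturing the full `key=value` pair lets one template serve both filter types, where the old template hard-coded `byGuid=`.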
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@ -8,6 +8,7 @@ import itertools
 from .common import InfoExtractor
 from ..compat import (
    compat_HTTPError,
+    compat_str,
    compat_urlparse,
 )
 from ..utils import (
@ -24,6 +25,7 @@ from ..utils import (
    urlencode_postdata,
    unescapeHTML,
    parse_filesize,
+    try_get,
 )
@ -144,7 +146,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
                            \.
                        )?
                        vimeo(?P<pro>pro)?\.com/
-                        (?!channels/[^/?#]+/?(?:$|[?#])|[^/]+/review/|(?:album|ondemand)/)
+                        (?!(?:channels|album)/[^/?#]+/?(?:$|[?#])|[^/]+/review/|ondemand/)
                        (?:.*?/)?
                        (?:
                            (?:
@ -225,8 +227,6 @@ class VimeoIE(VimeoBaseInfoExtractor):
        {
            'url': 'http://vimeo.com/channels/keypeele/75629013',
            'md5': '2f86a05afe9d7abc0b9126d229bbe15d',
-            'note': 'Video is freely available via original URL '
-                    'and protected with password when accessed via http://vimeo.com/75629013',
            'info_dict': {
                'id': '75629013',
                'ext': 'mp4',
@ -270,7 +270,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
        {
            # contains original format
            'url': 'https://vimeo.com/33951933',
-            'md5': '53c688fa95a55bf4b7293d37a89c5c53',
+            'md5': '2d9f5475e0537f013d0073e812ab89e6',
            'info_dict': {
                'id': '33951933',
                'ext': 'mp4',
@ -282,6 +282,29 @@ class VimeoIE(VimeoBaseInfoExtractor):
                'description': 'md5:ae23671e82d05415868f7ad1aec21147',
            },
        },
+        {
+            # only available via https://vimeo.com/channels/tributes/6213729 and
+            # not via https://vimeo.com/6213729
+            'url': 'https://vimeo.com/channels/tributes/6213729',
+            'info_dict': {
+                'id': '6213729',
+                'ext': 'mp4',
+                'title': 'Vimeo Tribute: The Shining',
+                'uploader': 'Casey Donahue',
+                'uploader_url': 're:https?://(?:www\.)?vimeo\.com/caseydonahue',
+                'uploader_id': 'caseydonahue',
+                'upload_date': '20090821',
+                'description': 'md5:bdbf314014e58713e6e5b66eb252f4a6',
+            },
+            'params': {
+                'skip_download': True,
+            },
+            'expected_warnings': ['Unable to download JSON metadata'],
+        },
+        {
+            'url': 'http://vimeo.com/moogaloop.swf?clip_id=2539741',
+            'only_matching': True,
+        },
+        {
        {
            'url': 'https://vimeo.com/109815029',
            'note': 'Video not completely processed, "failed" seed status',
@ -291,6 +314,10 @@ class VimeoIE(VimeoBaseInfoExtractor):
            'url': 'https://vimeo.com/groups/travelhd/videos/22439234',
            'only_matching': True,
        },
+        {
+            'url': 'https://vimeo.com/album/2632481/video/79010983',
+            'only_matching': True,
+        },
        {
            # source file returns 403: Forbidden
            'url': 'https://vimeo.com/7809605',
@ -367,7 +394,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
        orig_url = url
        if mobj.group('pro') or mobj.group('player'):
            url = 'https://player.vimeo.com/video/' + video_id
-        else:
+        elif any(p in url for p in ('play_redirect_hls', 'moogaloop.swf')):
            url = 'https://vimeo.com/' + video_id
        # Retrieve video webpage to extract further information
@ -445,7 +472,18 @@ class VimeoIE(VimeoBaseInfoExtractor):
            if config.get('view') == 4:
                config = self._verify_player_video_password(url, video_id)
        def is_rented():
            if '>You rented this title.<' in webpage:
                return True
            if config.get('user', {}).get('purchased'):
                return True
            label = try_get(
                config, lambda x: x['video']['vod']['purchase_options'][0]['label_string'], compat_str)
            if label and label.startswith('You rented this'):
                return True
            return False
        if is_rented():
            feature_id = config.get('video', {}).get('vod', {}).get('feature_id')
            if feature_id and not data.get('force_feature_id', False):
                return self.url_result(smuggle_url(
@ -617,8 +655,21 @@ class VimeoChannelIE(VimeoBaseInfoExtractor):
                webpage = self._login_list_password(page_url, list_id, webpage)
                yield self._extract_list_title(webpage)
-            for video_id in re.findall(r'id="clip_(\d+?)"', webpage):
-                yield self.url_result('https://vimeo.com/%s' % video_id, 'Vimeo')
+            # Try extracting href first since not all videos are available via
+            # short https://vimeo.com/id URL (e.g. https://vimeo.com/channels/tributes/6213729)
+            clips = re.findall(
+                r'id="clip_(\d+)"[^>]*>\s*<a[^>]+href="(/(?:[^/]+/)*\1)', webpage)
+            if clips:
+                for video_id, video_url in clips:
+                    yield self.url_result(
+                        compat_urlparse.urljoin(base_url, video_url),
+                        VimeoIE.ie_key(), video_id=video_id)
+            # More relaxed fallback
+            else:
+                for video_id in re.findall(r'id=["\']clip_(\d+)', webpage):
+                    yield self.url_result(
+                        'https://vimeo.com/%s' % video_id,
+                        VimeoIE.ie_key(), video_id=video_id)
            if re.search(self._MORE_PAGES_INDICATOR, webpage, re.DOTALL) is None:
                break
@ -655,7 +706,7 @@ class VimeoUserIE(VimeoChannelIE):
 class VimeoAlbumIE(VimeoChannelIE):
    IE_NAME = 'vimeo:album'
-    _VALID_URL = r'https://vimeo\.com/album/(?P<id>\d+)'
+    _VALID_URL = r'https://vimeo\.com/album/(?P<id>\d+)(?:$|[?#]|/(?!video))'
    _TITLE_RE = r'<header id="page_header">\n\s*<h1>(.*?)</h1>'
    _TESTS = [{
        'url': 'https://vimeo.com/album/2632481',
@ -675,6 +726,13 @@ class VimeoAlbumIE(VimeoChannelIE):
        'params': {
            'videopassword': 'youtube-dl',
        }
    }, {
        'url': 'https://vimeo.com/album/2632481/sort:plays/format:thumbnail',
        'only_matching': True,
    }, {
        # TODO: respect page number
        'url': 'https://vimeo.com/album/2632481/page:2/sort:plays/format:thumbnail',
        'only_matching': True,
    }]
    def _page_url(self, base_url, pagenum):
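Note on the VimeoAlbumIE `_VALID_URL` change above: the added suffix keeps the album extractor from claiming `/album/<id>/video/<id>` URLs, which the diff routes to VimeoIE instead. A quick check of the new pattern against the test URLs from the diff could look like this; `is_album_url` is a hypothetical helper name.

```python
import re

# Pattern copied from the diff above.
ALBUM_URL_RE = r'https://vimeo\.com/album/(?P<id>\d+)(?:$|[?#]|/(?!video))'


def is_album_url(url):
    """True if the URL would be handled as an album (playlist) page."""
    return re.match(ALBUM_URL_RE, url) is not None


results = {
    url: is_album_url(url)
    for url in (
        'https://vimeo.com/album/2632481',
        'https://vimeo.com/album/2632481/sort:plays/format:thumbnail',
        'https://vimeo.com/album/2632481/video/79010983',
    )
}
```

The negative lookahead `(?!video)` rejects only the single-video form; sort and format suffixes still match.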
--- a/youtube_dl/extractor/vine.py
+++ b/youtube_dl/extractor/vine.py
@ -24,6 +24,7 @@ class VineIE(InfoExtractor):
            'upload_date': '20130519',
            'uploader': 'Jack Dorsey',
            'uploader_id': '76',
+            'view_count': int,
            'like_count': int,
            'comment_count': int,
            'repost_count': int,
@ -39,6 +40,7 @@ class VineIE(InfoExtractor):
            'upload_date': '20140815',
            'uploader': 'Mars Ruiz',
            'uploader_id': '1102363502380728320',
+            'view_count': int,
            'like_count': int,
            'comment_count': int,
            'repost_count': int,
@ -54,6 +56,7 @@ class VineIE(InfoExtractor):
            'upload_date': '20130430',
            'uploader': 'Z3k3',
            'uploader_id': '936470460173008896',
+            'view_count': int,
            'like_count': int,
            'comment_count': int,
            'repost_count': int,
@ -71,6 +74,7 @@ class VineIE(InfoExtractor):
            'upload_date': '20150705',
            'uploader': 'Pimry_zaa',
            'uploader_id': '1135760698325307392',
+            'view_count': int,
            'like_count': int,
            'comment_count': int,
            'repost_count': int,
@ -109,6 +113,7 @@ class VineIE(InfoExtractor):
            'upload_date': unified_strdate(data.get('created')),
            'uploader': username,
            'uploader_id': data.get('userIdStr'),
+            'view_count': int_or_none(data.get('loops', {}).get('count')),
            'like_count': int_or_none(data.get('likes', {}).get('count')),
            'comment_count': int_or_none(data.get('comments', {}).get('count')),
            'repost_count': int_or_none(data.get('reposts', {}).get('count')),
--- a/youtube_dl/extractor/vk.py
+++ b/youtube_dl/extractor/vk.py
@ -3,6 +3,7 @@ from __future__ import unicode_literals
 import re
 import json
+import sys
 from .common import InfoExtractor
 from ..compat import compat_str
@ -10,7 +11,6 @@ from ..utils import (
    ExtractorError,
    int_or_none,
    orderedSet,
-    sanitized_Request,
    str_to_int,
    unescapeHTML,
    unified_strdate,
@ -190,7 +190,7 @@ class VKIE(InfoExtractor):
        if username is None:
            return
-        login_page = self._download_webpage(
+        login_page, url_handle = self._download_webpage_handle(
            'https://vk.com', None, 'Downloading login page')
        login_form = self._hidden_inputs(login_page)
@ -200,11 +200,26 @@ class VKIE(InfoExtractor):
            'pass': password.encode('cp1251'),
        })
-        request = sanitized_Request(
-            'https://login.vk.com/?act=login',
-            urlencode_postdata(login_form))
+        # https://new.vk.com/ serves two same remixlhk cookies in Set-Cookie header
+        # and expects the first one to be set rather than second (see
+        # https://github.com/rg3/youtube-dl/issues/9841#issuecomment-227871201).
+        # As of RFC6265 the newer one cookie should be set into cookie store
+        # what actually happens.
+        # We will workaround this VK issue by resetting the remixlhk cookie to
+        # the first one manually.
+        cookies = url_handle.headers.get('Set-Cookie')
+        if sys.version_info[0] >= 3:
+            cookies = cookies.encode('iso-8859-1')
+        cookies = cookies.decode('utf-8')
+        remixlhk = re.search(r'remixlhk=(.+?);.*?\bdomain=(.+?)(?:[,;]|$)', cookies)
+        if remixlhk:
+            value, domain = remixlhk.groups()
+            self._set_cookie(domain, 'remixlhk', value)
        login_page = self._download_webpage(
-            request, None, note='Logging in as %s' % username)
+            'https://login.vk.com/?act=login', None,
+            note='Logging in as %s' % username,
+            data=urlencode_postdata(login_form))
        if re.search(r'onLoginFailed', login_page):
            raise ExtractorError(
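Note on the VK login change above: the workaround relies on `re.search` returning the first `remixlhk` cookie in a combined `Set-Cookie` header. The sketch below replays the regex from the diff on a made-up header; the header string is an assumption for illustration, not a captured VK response, and `first_remixlhk` is a hypothetical helper name.

```python
import re

# Regex copied from the diff above.
REMIXLHK_RE = r'remixlhk=(.+?);.*?\bdomain=(.+?)(?:[,;]|$)'


def first_remixlhk(set_cookie_header):
    """Return (value, domain) of the first remixlhk cookie, or None."""
    mobj = re.search(REMIXLHK_RE, set_cookie_header)
    return mobj.groups() if mobj else None


# Hypothetical header carrying the cookie twice.
header = (
    'remixlhk=first; path=/; domain=.vk.com, '
    'remixlhk=second; path=/; domain=.vk.com')
found = first_remixlhk(header)
```

Because `re.search` stops at the leftmost match, the second value is never considered, which matches the behaviour the comment in the diff describes.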
--- a/youtube_dl/extractor/wdr.py
+++ b/youtube_dl/extractor/wdr.py
@ -15,7 +15,87 @@ from ..utils import (
 )
-class WDRIE(InfoExtractor):
+class WDRBaseIE(InfoExtractor):
    def _extract_wdr_video(self, webpage, display_id):
        # for wdr.de the data-extension is in a tag with the class "mediaLink"
        # for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
        # for wdrmaus its in a link to the page in a multiline "videoLink"-tag
        json_metadata = self._html_search_regex(
            r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
            webpage, 'media link', default=None, flags=re.MULTILINE)
        if not json_metadata:
            return
        media_link_obj = self._parse_json(json_metadata, display_id,
                                          transform_source=js_to_json)
        jsonp_url = media_link_obj['mediaObj']['url']
        metadata = self._download_json(
            jsonp_url, 'metadata', transform_source=strip_jsonp)
        metadata_tracker_data = metadata['trackerData']
        metadata_media_resource = metadata['mediaResource']
        formats = []
        # check if the metadata contains a direct URL to a file
        for kind, media_resource in metadata_media_resource.items():
            if kind not in ('dflt', 'alt'):
                continue
            for tag_name, medium_url in media_resource.items():
                if tag_name not in ('videoURL', 'audioURL'):
                    continue
                ext = determine_ext(medium_url)
                if ext == 'm3u8':
                    formats.extend(self._extract_m3u8_formats(
                        medium_url, display_id, 'mp4', 'm3u8_native',
                        m3u8_id='hls'))
                elif ext == 'f4m':
                    manifest_url = update_url_query(
                        medium_url, {'hdcore': '3.2.0', 'plugin': 'aasp-3.2.0.77.18'})
                    formats.extend(self._extract_f4m_formats(
                        manifest_url, display_id, f4m_id='hds', fatal=False))
                elif ext == 'smil':
                    formats.extend(self._extract_smil_formats(
                        medium_url, 'stream', fatal=False))
                else:
                    a_format = {
                        'url': medium_url
                    }
                    if ext == 'unknown_video':
                        urlh = self._request_webpage(
                            medium_url, display_id, note='Determining extension')
                        ext = urlhandle_detect_ext(urlh)
                        a_format['ext'] = ext
                    formats.append(a_format)
        self._sort_formats(formats)
        subtitles = {}
        caption_url = metadata_media_resource.get('captionURL')
        if caption_url:
            subtitles['de'] = [{
                'url': caption_url,
                'ext': 'ttml',
            }]
        title = metadata_tracker_data['trackerClipTitle']
        return {
            'id': metadata_tracker_data.get('trackerClipId', display_id),
            'display_id': display_id,
            'title': title,
            'alt_title': metadata_tracker_data.get('trackerClipSubcategory'),
            'formats': formats,
            'subtitles': subtitles,
            'upload_date': unified_strdate(metadata_tracker_data.get('trackerClipAirTime')),
        }
 class WDRIE(WDRBaseIE):
    _CURRENT_MAUS_URL = r'https?://(?:www\.)wdrmaus.de/(?:[^/]+/){1,2}[^/?#]+\.php5'
    _PAGE_REGEX = r'/(?:mediathek/)?[^/]+/(?P<type>[^/]+)/(?P<display_id>.+)\.html'
    _VALID_URL = r'(?P<page_url>https?://(?:www\d\.)?wdr\d?\.de)' + _PAGE_REGEX + '|' + _CURRENT_MAUS_URL
@ -91,10 +171,10 @@ class WDRIE(InfoExtractor):
        },
        {
            'url': 'http://www.wdrmaus.de/sachgeschichten/sachgeschichten/achterbahn.php5',
-            # HDS download, MD5 is unstable
+            'md5': '803138901f6368ee497b4d195bb164f2',
            'info_dict': {
                'id': 'mdb-186083',
-                'ext': 'flv',
+                'ext': 'mp4',
                'upload_date': '20130919',
                'title': 'Sachgeschichte - Achterbahn ',
                'description': '- Die Sendung mit der Maus -',
@ -120,14 +200,9 @@ class WDRIE(InfoExtractor):
        display_id = mobj.group('display_id')
        webpage = self._download_webpage(url, display_id)
-        # for wdr.de the data-extension is in a tag with the class "mediaLink"
-        # for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
-        # for wdrmaus its in a link to the page in a multiline "videoLink"-tag
-        json_metadata = self._html_search_regex(
-            r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
-            webpage, 'media link', default=None, flags=re.MULTILINE)
-        if not json_metadata:
+        info_dict = self._extract_wdr_video(webpage, display_id)
+        if not info_dict:
            entries = [
                self.url_result(page_url + href[0], 'WDR')
                for href in re.findall(
@ -140,86 +215,22 @@ class WDRIE(InfoExtractor):
            raise ExtractorError('No downloadable streams found', expected=True)
        media_link_obj = self._parse_json(json_metadata, display_id,
                                          transform_source=js_to_json)
        jsonp_url = media_link_obj['mediaObj']['url']
        metadata = self._download_json(
            jsonp_url, 'metadata', transform_source=strip_jsonp)
        metadata_tracker_data = metadata['trackerData']
        metadata_media_resource = metadata['mediaResource']
        formats = []
        # check if the metadata contains a direct URL to a file
        for kind, media_resource in metadata_media_resource.items():
            if kind not in ('dflt', 'alt'):
                continue
            for tag_name, medium_url in media_resource.items():
                if tag_name not in ('videoURL', 'audioURL'):
                    continue
                ext = determine_ext(medium_url)
                if ext == 'm3u8':
                    formats.extend(self._extract_m3u8_formats(
                        medium_url, display_id, 'mp4', 'm3u8_native',
                        m3u8_id='hls'))
                elif ext == 'f4m':
                    manifest_url = update_url_query(
                        medium_url, {'hdcore': '3.2.0', 'plugin': 'aasp-3.2.0.77.18'})
                    formats.extend(self._extract_f4m_formats(
                        manifest_url, display_id, f4m_id='hds', fatal=False))
                elif ext == 'smil':
                    formats.extend(self._extract_smil_formats(
                        medium_url, 'stream', fatal=False))
                else:
                    a_format = {
                        'url': medium_url
                    }
                    if ext == 'unknown_video':
                        urlh = self._request_webpage(
                            medium_url, display_id, note='Determining extension')
                        ext = urlhandle_detect_ext(urlh)
                        a_format['ext'] = ext
                    formats.append(a_format)
        self._sort_formats(formats)
        subtitles = {}
        caption_url = metadata_media_resource.get('captionURL')
        if caption_url:
            subtitles['de'] = [{
                'url': caption_url,
                'ext': 'ttml',
            }]
        title = metadata_tracker_data.get('trackerClipTitle')
        is_live = url_type == 'live'
        if is_live:
-            title = self._live_title(title)
+            info_dict.update({
-            upload_date = None
+                'title': self._live_title(info_dict['title']),
-        elif 'trackerClipAirTime' in metadata_tracker_data:
+                'upload_date': None,
-            upload_date = metadata_tracker_data['trackerClipAirTime']
+            })
-        else:
+        elif 'upload_date' not in info_dict:
-            upload_date = self._html_search_meta('DC.Date', webpage, 'upload date')
+            info_dict['upload_date'] = unified_strdate(self._html_search_meta('DC.Date', webpage, 'upload date'))
-        if upload_date:
+        info_dict.update({
            upload_date = unified_strdate(upload_date)
        return {
            'id': metadata_tracker_data.get('trackerClipId', display_id),
            'display_id': display_id,
            'title': title,
            'alt_title': metadata_tracker_data.get('trackerClipSubcategory'),
            'formats': formats,
            'upload_date': upload_date,
            'description': self._html_search_meta('Description', webpage),
            'is_live': is_live,
-            'subtitles': subtitles,
+        })
-        }
+
        return info_dict
 class WDRMobileIE(InfoExtractor):
--- a/youtube_dl/extractor/xnxx.py
+++ b/youtube_dl/extractor/xnxx.py
@ -6,17 +6,23 @@ from ..compat import compat_urllib_parse_unquote
 class XNXXIE(InfoExtractor):
-    _VALID_URL = r'^https?://(?:video|www)\.xnxx\.com/video(?P<id>[0-9]+)/(.*)'
+    _VALID_URL = r'https?://(?:video|www)\.xnxx\.com/video-?(?P<id>[0-9a-z]+)/'
-    _TEST = {
+    _TESTS = [{
-        'url': 'http://video.xnxx.com/video1135332/lida_naked_funny_actress_5_',
+        'url': 'http://www.xnxx.com/video-55awb78/skyrim_test_video',
-        'md5': '0831677e2b4761795f68d417e0b7b445',
+        'md5': 'ef7ecee5af78f8b03dca2cf31341d3a0',
        'info_dict': {
-            'id': '1135332',
+            'id': '55awb78',
            'ext': 'flv',
-            'title': 'lida » Naked Funny Actress  (5)',
+            'title': 'Skyrim Test Video',
            'age_limit': 18,
-        }
+        },
-    }
+    }, {
        'url': 'http://video.xnxx.com/video1135332/lida_naked_funny_actress_5_',
        'only_matching': True,
    }, {
        'url': 'http://www.xnxx.com/video-55awb78/',
        'only_matching': True,
    }]
    def _real_extract(self, url):
        video_id = self._match_id(url)
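
The effect of the `_VALID_URL` change above can be checked in isolation. The following is a minimal sketch, not part of the commit: the two patterns and the test URLs are copied from the hunk, while the constant names `OLD_VALID_URL` and `NEW_VALID_URL` and the helper `match_id` are hypothetical names used only for illustration.

```python
import re

# Patterns copied from the removed and added lines of the hunk above.
OLD_VALID_URL = r'^https?://(?:video|www)\.xnxx\.com/video(?P<id>[0-9]+)/(.*)'
NEW_VALID_URL = r'https?://(?:video|www)\.xnxx\.com/video-?(?P<id>[0-9a-z]+)/'


def match_id(pattern, url):
    """Hypothetical helper: return the 'id' group, or None if the URL does not match."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


new_style = 'http://www.xnxx.com/video-55awb78/skyrim_test_video'
old_style = 'http://video.xnxx.com/video1135332/lida_naked_funny_actress_5_'

# The old pattern only accepts numeric ids directly after "video".
old_on_new = match_id(OLD_VALID_URL, new_style)
# The new pattern accepts both URL styles.
new_on_new = match_id(NEW_VALID_URL, new_style)
new_on_old = match_id(NEW_VALID_URL, old_style)

print(old_on_new, new_on_new, new_on_old)
```

The optional hyphen (`-?`) and the widened character class (`[0-9a-z]+`) are what keep old-style numeric URLs matching while admitting the new alphanumeric ids.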
--- a/youtube_dl/jsinterp.py
+++ b/youtube_dl/jsinterp.py
@ -232,7 +232,7 @@ class JSInterpreter(object):
    def extract_function(self, funcname):
        func_m = re.search(
            r'''(?x)
-                (?:function\s+%s|[{;,]%s\s*=\s*function|var\s+%s\s*=\s*function)\s*
+                (?:function\s+%s|[{;,]\s*%s\s*=\s*function|var\s+%s\s*=\s*function)\s*
                \((?P<args>[^)]*)\)\s*
                \{(?P<code>[^}]+)\}''' % (
                re.escape(funcname), re.escape(funcname), re.escape(funcname)),
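
To see what the added `\s*` changes, the old and new patterns can be compared on a small input. This is a sketch under stated assumptions: the two regex templates are copied from the hunk above, but the JavaScript snippet, the function name `sig`, and the helper `find_function` are made up for illustration and are not taken from issue #9863.

```python
import re

# Template from the removed line: the name must directly follow "{", ";" or ",".
OLD_TEMPLATE = r'''(?x)
    (?:function\s+%s|[{;,]%s\s*=\s*function|var\s+%s\s*=\s*function)\s*
    \((?P<args>[^)]*)\)\s*
    \{(?P<code>[^}]+)\}'''

# Template from the added line: whitespace is now allowed before the name.
NEW_TEMPLATE = r'''(?x)
    (?:function\s+%s|[{;,]\s*%s\s*=\s*function|var\s+%s\s*=\s*function)\s*
    \((?P<args>[^)]*)\)\s*
    \{(?P<code>[^}]+)\}'''


def find_function(template, funcname, js):
    """Hypothetical helper mirroring how the hunk fills in the template."""
    name = re.escape(funcname)
    return re.search(template % (name, name, name), js)


# Illustrative input: a space separates ";" from the assigned name.
js = 'a=1; sig = function(a){return a.reverse()};'

old_match = find_function(OLD_TEMPLATE, 'sig', js)
new_match = find_function(NEW_TEMPLATE, 'sig', js)

print(old_match is None)
print(new_match.group('args'), '|', new_match.group('code'))
```

With this input the old template finds nothing, while the new one captures the argument list and the function body.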
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@ -2852,3 +2852,12 @@ def decode_packed_codes(code):
    return re.sub(
        r'\b(\w+)\b', lambda mobj: symbol_table[mobj.group(0)],
        obfucasted_code)
 def parse_m3u8_attributes(attrib):
    info = {}
    for (key, val) in re.findall(r'(?P<key>[A-Z0-9-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)', attrib):
        if val.startswith('"'):
            val = val[1:-1]
        info[key] = val
    return info
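
A short usage sketch for the new helper follows. The function body is reproduced from the hunk above so the example runs on its own; the sample attribute string is an assumption, modelled on a typical `#EXT-X-STREAM-INF` attribute list rather than taken from the commit.

```python
import re


def parse_m3u8_attributes(attrib):
    # Body reproduced from the hunk above.
    info = {}
    for (key, val) in re.findall(r'(?P<key>[A-Z0-9-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)', attrib):
        if val.startswith('"'):
            # Strip the surrounding double quotes from quoted values.
            val = val[1:-1]
        info[key] = val
    return info


# Assumed sample input: a quoted value may itself contain commas.
sample = 'BANDWIDTH=1280000,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=640x360'
attrs = parse_m3u8_attributes(sample)

for key in sorted(attrs):
    print(key, '=', attrs[key])
```

The quoted alternative in the regex is tried first, which is why the comma inside `CODECS` does not split the value. All values are returned as strings, so numeric fields such as `BANDWIDTH` still need converting by the caller.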
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2016.06.18.1'
+__version__ = '2016.06.23.1'
Author	SHA1	Message	Date
Sergey M․	011bd3221b	release 2016.06.23.1	2016-06-23 09:42:56 +07:00
Sergey M․	b46eabecd3	[jsinterp] Relax JS function regex (Closes #9863 )	2016-06-23 09:41:34 +07:00
Remita Amine	0437307a41	[nbc:nbcnews] improve extraction and add msnbc to the extractor	2016-06-23 01:36:19 +01:00
Remita Amine	22b7ac13ef	[tf1] fix wat id extraction(closes #9862 )	2016-06-23 00:14:34 +01:00
Sergey M․	96f88e91b7	release 2016.06.23	2016-06-23 04:29:34 +07:00
Sergey M․	3331a4644d	[vk] Remove unused import	2016-06-23 04:27:10 +07:00
Sergey M․	adf1921dc1	[xnxx] Improve _VALID_URL (Closes #9858 )	2016-06-23 04:26:49 +07:00
Sergey M․	97674f0419	[xnxx] Replace test	2016-06-23 04:24:00 +07:00
rr-	73843ae8ac	[xnxx] fix url regex The pattern has changed from "video123412" to "video-o8xa19". The changes maintain backwards compatibility with old-style URLs.	2016-06-23 04:19:55 +07:00
Sergey M․	f2bb8c036a	[vk] Modernize	2016-06-23 04:18:43 +07:00
Sergey M․	75ca6bcee2	[vk] Workaround buggy new.vk.com Set-Cookie headers	2016-06-23 04:17:13 +07:00
Sergey M․	089657ed1f	[vimeo:album] Add paged example URL	2016-06-23 02:00:03 +07:00
Sergey M․	b5eab86c24	[vimeo:album] Impove _VALID_URL	2016-06-23 01:56:58 +07:00
Sergey M․	c8e3e0974b	[vimeo:channel] Improve playlist extraction	2016-06-23 01:28:36 +07:00
Purdea Andrei	dfc8f46e1c	[vimeo:channel] Add video id to url_result This will allow us to decide much faster that we don't want an already archived video, and will allow having to download webpages for each video that has already been downloaded, thus significantly speeding up the archival of channels that have no new content.	2016-06-23 01:26:27 +07:00
Sergey M․	c143ddce5d	[vimeo] Override original URL only when necessary	2016-06-23 00:51:36 +07:00
Jaime Marquínez Ferrándiz	169d836feb	lazy-extractors: Fix after commit `6e6b9f600f` The problem was in the following code: class ArteTVPlus7IE(ArteTVBaseIE): ... @classmethod def suitable(cls, url): return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url) And its sublcasses like ArteTVCinemaIE. Since in the lazy_extractors.py file ArteTVCinemaIE was not a subclass of ArteTVPlus7IE, super(ArteTVPlus7IE, cls) failed. To fix it we have to make it a subclass. Since the order of _ALL_CLASSES is arbitrary we must sort them so that the base classes are defined first. We also must add base classes like YoutubeBaseInfoExtractor.	2016-06-22 19:20:50 +02:00
TRox1972	6ae938b295	[Vine] Extract view count	2016-06-22 23:57:35 +07:00
Sergey M․	cf40fdf5c1	release 2016.06.22	2016-06-22 23:43:24 +07:00
Sergey M․	23bdae0955	[svt] Various improvements + [svt:play] Add fallback path looking for video id and fix extraction for oppetarkiv * [svt:base] Detect geo restriction * [svt:base] Extract series related metadata	2016-06-22 23:36:07 +07:00
Shai Coleman	ca74c90bf5	Fix issue downloading facebook videos youtube-dl expects the format items to be returned as a list, but when there's only one item Facebook returns a dict instead, this wraps the dict in a list if necessary	2016-06-22 12:52:15 +01:00
Sergey M․	7cfc1e2a10	[gametrailers] Remove extractor gametrailers closed (see http://www.polygon.com/2016/2/8/10944452/gametrailers-shuts-down-after-13-year-run)	2016-06-21 22:31:41 +07:00
Remita Amine	1ac5705f62	[gamespot] extract all formats	2016-06-21 13:37:57 +01:00
Yen Chi Hsuan	e4f90ea0a7	[svt] Fix extraction for SVTPlay (closes #9809 )	2016-06-21 17:55:53 +08:00
Sergey M․	cdfc187cd5	[cbs] Remove unused import	2016-06-20 22:40:33 +07:00
Sergey M․	feef925f49	[streamcloud] Capture error message (#9840 )	2016-06-20 22:40:22 +07:00
Sergey M․	19e2d1cdea	release 2016.06.20	2016-06-20 20:50:01 +07:00
Sergey M․	8369a4fe76	[downloader/hls] Simplify and carry long lines	2016-06-20 21:55:17 +07:00
Philipp Hagemeister	1f749b6658	Revert "[jsinterp] Avoid double key lookup for setting new key" This reverts commit `7c05097633`.	2016-06-20 13:29:13 +02:00
Remita Amine	819707920a	[cbs] fix _VALID_URL	2016-06-19 23:55:19 +01:00
Remita Amine	43518503a6	[cbs,cbsnews,cbssports] reduce requests while extracting all formats	2016-06-19 23:40:00 +01:00
Remita Amine	5839d556e4	[theplatform] reduce requests for theplatform feed info extraction	2016-06-19 23:37:05 +01:00
Yen Chi Hsuan	6c83e583b3	[radiojavan] PEP8 E275 is added in pycodestyle 2.6 See https://github.com/PyCQA/pycodestyle/pull/491	2016-06-19 13:32:08 +08:00
Yen Chi Hsuan	6aeb64b673	Merge pull request #8201 from remitamine/hls-aes [downloader/hls] Add support for AES-128 encrypted segments in hlsnative downloader	2016-06-19 13:25:08 +08:00
Remita Amine	6cd64b6806	[foxsports] extract http formats	2016-06-19 05:45:48 +01:00
remitamine	e154c65128	[downloader/hls] Add support for AES-128 encrypted segments in hlsnative downloader	2016-06-19 01:01:40 +01:00
Sergey M․	a50fd6e026	release 2016.06.19.1	2016-06-19 03:57:14 +07:00
Sergey M․	6a55bb66ee	[vimeo] Fix rented videos (Closes #9830 )	2016-06-19 03:56:01 +07:00
Lucas Moura	7c05097633	[jsinterp] Avoid double key lookup for setting new key In order to add a new key to both __objects and __functions dicts on jsinterp.py, it is necessary to first verify if a key was present and if not, create the key and assign it to a value. However, this can be done with a single step using dict setdefault method.	2016-06-19 03:29:45 +07:00
Sergey M․	589568789f	release 2016.06.19	2016-06-19 02:30:29 +07:00
Sergey M․	7577d849a6	[r7] Fix extraction and add support for articles (Closes #9826 )	2016-06-19 02:25:34 +07:00
Sergey M․	cb23192bc4	[closertotruth] Update and improve (Closes #8680 )	2016-06-19 00:35:29 +07:00
Steven Gosseling	41c1023300	[closertotruth] Add extractor Removed print statement from code. Replaced two regex searches with the corret ones. Removed some unnecessary semicolumns fixed title extraction refactored everything to search_regex processed comments on commit 5650b0d, fixed feedback from flake8 Improved regexes and returns info dict now. Added support for closertotruth interview URL Added support for episodes page	2016-06-18 23:19:56 +07:00
Sergey M․	90b6288cce	[arte:+7] Simplify _VALID_URL	2016-06-18 22:23:48 +07:00
Sergey M․	c1823c8ad9	[README.md] Remove 'small' from description (#9814 )	2016-06-18 22:08:48 +07:00
Sergey M․	d7c6c656c5	[arte:+7] Expand _VALID_URL (Closes #9820 )	2016-06-18 21:42:17 +07:00
Yen Chi Hsuan	b0b128049a	[extractors] Update references to sportschau (#9799 )	2016-06-18 13:43:47 +08:00
Yen Chi Hsuan	e8f13f2637	[sportschau.de] Fix extraction and moved to its own file (closes #9799 )	2016-06-18 13:42:58 +08:00
Yen Chi Hsuan	b5aad37f6b	[ard] Remove SportschauIE, which is now based on WDR (#9799 )	2016-06-18 13:42:39 +08:00
Yen Chi Hsuan	6d0d4fc26d	[wdr] Add WDRBaseIE, for Sportschau (#9799 )	2016-06-18 13:40:55 +08:00
Yen Chi Hsuan	0278aa443f	[br] Skip invalid tests	2016-06-18 12:53:48 +08:00
Yen Chi Hsuan	1f35745758	[azubu] Don't fail on optional fields	2016-06-18 12:39:08 +08:00
Yen Chi Hsuan	573c35272f	[bbc] Skip a geo-restricted test case	2016-06-18 12:35:55 +08:00
Yen Chi Hsuan	09e3f91e40	[arte] Update _TESTS and fix for pages with multiple YouTube videos Some tests are from #6895 and #6613	2016-06-18 12:34:58 +08:00
Yen Chi Hsuan	1b6cf16be7	[aftonbladet] Fix extraction	2016-06-18 12:27:39 +08:00
Yen Chi Hsuan	26264cb056	[adobetv] Use embedded data in the webpage Sometimes the HTML webpage is returned even with '?format=json'	2016-06-18 12:21:40 +08:00
Yen Chi Hsuan	a72df5f36f	[mtvservices] Fix ext for RTMP streams	2016-06-18 12:19:06 +08:00
Yen Chi Hsuan	c878e635de	[bet] Moved to MTVServices	2016-06-18 12:17:24 +08:00