release 2016.06.12

[streamcloud] Detect removed videos (Closes #3768)
[nrk:skole] Fix extraction
2016-06-12 12:06:48 +07:00 · 2016-06-12 11:08:39 +07:00 · 2016-06-12 07:20:37 +07:00 · 2016-06-12 06:57:04 +07:00 · 2016-06-12 06:39:31 +07:00 · 2016-06-12 06:06:04 +07:00
20 changed files with 352 additions and 155 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.11.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.11.1**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.12**
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.06.11.1
+[debug] youtube-dl version 2016.06.12
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/README.md
+++ b/README.md
@ -511,6 +511,9 @@ The basic usage is not to set any template arguments when downloading a single f
 - `autonumber`: Five-digit number that will be increased with each download, starting at zero
 - `playlist`: Name or id of the playlist that contains the video
 - `playlist_index`: Index of the video in the playlist padded with leading zeros according to the total length of the playlist
 - `playlist_id`: Playlist identifier
 - `playlist_title`: Playlist title
 Available for the video that belongs to some logical chapter or section:
 - `chapter`: Name or title of the chapter the video belongs to
@ -550,6 +553,10 @@ The current default template is `%(title)s-%(id)s.%(ext)s`.
 In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:
 #### Output template and Windows batch files
 If you are using output template inside a Windows batch file then you must escape plain percent characters (`%`) by doubling, so that `-o "%(title)s-%(id)s.%(ext)s"` should become `-o "%%(title)s-%%(id)s.%%(ext)s"`. However you should not touch `%`'s that are not plain characters, e.g. environment variables for expansion should stay intact: `-o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s"`.
 #### Output template examples
 Note on Windows you may need to use double quotes instead of single.
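The batch-file rule documented in the README hunk above can be sketched in a few lines of Python. `escape_for_batch` is a hypothetical helper, not part of youtube-dl, and it assumes the template has not been escaped already. It only doubles the percent sign that opens a `%(field)s` placeholder, so environment variables such as `%HOMEPATH%` are left alone.

```python
def escape_for_batch(template):
    """Double the percent sign of every %(field)s placeholder.

    Hypothetical helper for illustration only; it assumes the template
    has not been escaped before (running it twice would add more signs).
    """
    return template.replace('%(', '%%(')


# Plain output template, as it would be typed on an interactive shell.
print(escape_for_batch('%(title)s-%(id)s.%(ext)s'))

# Environment variables are not followed by "(", so they stay intact.
print(escape_for_batch(r'C:\%HOMEPATH%\Desktop\%(title)s.%(ext)s'))
```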
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@ -44,8 +44,8 @@
 - **appletrailers:section**
 - **archive.org**: archive.org videos
 - **ARD**
 - **ARD:mediathek**
 - **ARD:mediathek**: Saarländischer Rundfunk
 - **ARD:mediathek**
 - **arte.tv**
 - **arte.tv:+7**
 - **arte.tv:cinema**
@ -647,6 +647,7 @@
 - **Telegraaf**
 - **TeleMB**
 - **TeleTask**
+ - **Telewebion**
 - **TF1**
 - **TheIntercept**
 - **ThePlatform**
--- a/setup.py
+++ b/setup.py
@ -122,6 +122,7 @@ setup(
        "Programming Language :: Python :: 3.2",
        "Programming Language :: Python :: 3.3",
        "Programming Language :: Python :: 3.4",
        "Programming Language :: Python :: 3.5",
    ],
    cmdclass={'build_lazy_extractors': build_lazy_extractors},
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@ -777,6 +777,7 @@ from .telecinco import TelecincoIE
 from .telegraaf import TelegraafIE
 from .telemb import TeleMBIE
 from .teletask import TeleTaskIE
+from .telewebion import TelewebionIE
 from .testurl import TestURLIE
 from .tf1 import TF1IE
 from .theintercept import TheInterceptIE
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@ -1073,20 +1073,6 @@ class GenericIE(InfoExtractor):
                'skip_download': True,
            }
        },
-        # Contains a SMIL manifest
-        {
-            'url': 'http://www.telewebion.com/fa/1263668/%D9%82%D8%B1%D8%B9%D9%87%E2%80%8C%DA%A9%D8%B4%DB%8C-%D9%84%DB%8C%DA%AF-%D9%82%D9%87%D8%B1%D9%85%D8%A7%D9%86%D8%A7%D9%86-%D8%A7%D8%B1%D9%88%D9%BE%D8%A7/%2B-%D9%81%D9%88%D8%AA%D8%A8%D8%A7%D9%84.html',
-            'info_dict': {
-                'id': 'file',
-                'ext': 'flv',
-                'title': '+ Football: Lottery Champions League Europe',
-                'uploader': 'www.telewebion.com',
-            },
-            'params': {
-                # rtmpe downloads
-                'skip_download': True,
-            }
-        },
        # Brightcove URL in single quotes
        {
            'url': 'http://www.sportsnet.ca/baseball/mlb/sn-presents-russell-martin-world-citizen/',
--- a/youtube_dl/extractor/indavideo.py
+++ b/youtube_dl/extractor/indavideo.py
@ -60,7 +60,8 @@ class IndavideoEmbedIE(InfoExtractor):
        formats = [{
            'url': video_url,
-            'height': self._search_regex(r'\.(\d{3,4})\.mp4$', video_url, 'height', default=None),
+            'height': int_or_none(self._search_regex(
+                r'\.(\d{3,4})\.mp4(?:\?|$)', video_url, 'height', default=None)),
        } for video_url in video_urls]
        self._sort_formats(formats)
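The effect of the indavideo change can be checked outside the extractor. The sketch below uses a simplified stand-in for `int_or_none` and hypothetical example URLs. It shows that the new pattern still finds the height when the URL carries a query string, and that the result is an integer rather than a string.

```python
import re


def int_or_none(value):
    # Simplified stand-in for youtube_dl.utils.int_or_none.
    return None if value is None else int(value)


OLD_HEIGHT_RE = re.compile(r'\.(\d{3,4})\.mp4$')
NEW_HEIGHT_RE = re.compile(r'\.(\d{3,4})\.mp4(?:\?|$)')


def height_from_url(video_url, pattern=NEW_HEIGHT_RE):
    """Return the height encoded in the file name, or None if absent."""
    match = pattern.search(video_url)
    return int_or_none(match.group(1) if match else None)


# Hypothetical URLs, used only to exercise the two patterns.
plain_url = 'http://example.com/video.720.mp4'
signed_url = 'http://example.com/video.360.mp4?token=abc'

print(height_from_url(plain_url))
print(height_from_url(signed_url))
print(height_from_url(signed_url, OLD_HEIGHT_RE))
```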
--- a/youtube_dl/extractor/instagram.py
+++ b/youtube_dl/extractor/instagram.py
@ -8,6 +8,7 @@ from ..utils import (
    int_or_none,
    limit_length,
    lowercase_escape,
+    try_get,
 )
@ -19,10 +20,16 @@ class InstagramIE(InfoExtractor):
        'info_dict': {
            'id': 'aye83DjauH',
            'ext': 'mp4',
-            'uploader_id': 'naomipq',
            'title': 'Video by naomipq',
            'description': 'md5:1f17f0ab29bd6fe2bfad705f58de3cb8',
-        }
+            'thumbnail': 're:^https?://.*\.jpg',
+            'timestamp': 1371748545,
+            'upload_date': '20130620',
+            'uploader_id': 'naomipq',
+            'uploader': 'Naomi Leonor Phan-Quang',
+            'like_count': int,
+            'comment_count': int,
+        },
    }, {
        # missing description
        'url': 'https://www.instagram.com/p/BA-pQFBG8HZ/?taken-by=britneyspears',
@ -31,6 +38,13 @@ class InstagramIE(InfoExtractor):
            'ext': 'mp4',
            'uploader_id': 'britneyspears',
            'title': 'Video by britneyspears',
+            'thumbnail': 're:^https?://.*\.jpg',
+            'timestamp': 1453760977,
+            'upload_date': '20160125',
+            'uploader_id': 'britneyspears',
+            'uploader': 'Britney Spears',
+            'like_count': int,
+            'comment_count': int,
        },
        'params': {
            'skip_download': True,
@ -67,21 +81,57 @@ class InstagramIE(InfoExtractor):
        url = mobj.group('url')
        webpage = self._download_webpage(url, video_id)
-        uploader_id = self._search_regex(r'"owner":{"username":"(.+?)"',
-                                         webpage, 'uploader id', fatal=False)
-        desc = self._search_regex(
-            r'"caption":"(.+?)"', webpage, 'description', default=None)
-        if desc is not None:
-            desc = lowercase_escape(desc)
+
+        (video_url, description, thumbnail, timestamp, uploader,
+         uploader_id, like_count, comment_count) = [None] * 8
+
+        shared_data = self._parse_json(
+            self._search_regex(
+                r'window\._sharedData\s*=\s*({.+?});',
+                webpage, 'shared data', default='{}'),
+            video_id, fatal=False)
+        if shared_data:
+            media = try_get(
+                shared_data, lambda x: x['entry_data']['PostPage'][0]['media'], dict)
+            if media:
+                video_url = media.get('video_url')
+                description = media.get('caption')
+                thumbnail = media.get('display_src')
+                timestamp = int_or_none(media.get('date'))
+                uploader = media.get('owner', {}).get('full_name')
+                uploader_id = media.get('owner', {}).get('username')
+                like_count = int_or_none(media.get('likes', {}).get('count'))
+                comment_count = int_or_none(media.get('comments', {}).get('count'))
+        if not video_url:
+            video_url = self._og_search_video_url(webpage, secure=False)
+        if not uploader_id:
+            uploader_id = self._search_regex(
+                r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"',
+                webpage, 'uploader id', fatal=False)
+        if not description:
+            description = self._search_regex(
+                r'"caption"\s*:\s*"(.+?)"', webpage, 'description', default=None)
+            if description is not None:
+                description = lowercase_escape(description)
+        if not thumbnail:
+            thumbnail = self._og_search_thumbnail(webpage)
        return {
            'id': video_id,
-            'url': self._og_search_video_url(webpage, secure=False),
+            'url': video_url,
            'ext': 'mp4',
            'title': 'Video by %s' % uploader_id,
-            'thumbnail': self._og_search_thumbnail(webpage),
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
            'uploader_id': uploader_id,
-            'description': desc,
+            'uploader': uploader,
+            'like_count': like_count,
+            'comment_count': comment_count,
        }
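The Instagram hunk relies on `try_get` to walk the nested `_sharedData` structure without raising. The sketch below is a simplified stand-in written for illustration, not the library's own implementation, and the sample dictionary is made up. It shows why a missing key or an empty `PostPage` list falls through to the regex fallbacks instead of aborting extraction.

```python
def try_get(src, getter, expected_type=None):
    """Apply getter to src and return None on lookup or type errors.

    Simplified stand-in for youtube_dl.utils.try_get, for illustration.
    """
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def media_getter(data):
    # Same lookup path as in the diff above.
    return data['entry_data']['PostPage'][0]['media']


# Made-up sample shaped like the structure the diff expects.
shared_data = {
    'entry_data': {
        'PostPage': [{
            'media': {
                'video_url': 'http://example.com/v.mp4',
                'owner': {'username': 'naomipq'},
            },
        }],
    },
}

media = try_get(shared_data, media_getter, dict)
print(media['owner']['username'])
print(try_get({'entry_data': {'PostPage': []}}, media_getter, dict))
```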
--- a/youtube_dl/extractor/kuwo.py
+++ b/youtube_dl/extractor/kuwo.py
@ -148,8 +148,8 @@ class KuwoAlbumIE(InfoExtractor):
        'url': 'http://www.kuwo.cn/album/502294/',
        'info_dict': {
            'id': '502294',
-            'title': 'M',
+            'title': 'Made\xa0Series\xa0《M》',
-            'description': 'md5:6a7235a84cc6400ec3b38a7bdaf1d60c',
+            'description': 'md5:d463f0d8a0ff3c3ea3d6ed7452a9483f',
        },
        'playlist_count': 2,
    }
@ -209,7 +209,7 @@ class KuwoSingerIE(InfoExtractor):
        'url': 'http://www.kuwo.cn/mingxing/bruno+mars/',
        'info_dict': {
            'id': 'bruno+mars',
-            'title': 'Bruno Mars',
+            'title': 'Bruno\xa0Mars',
        },
        'playlist_mincount': 329,
    }, {
@ -306,7 +306,7 @@ class KuwoMvIE(KuwoBaseIE):
            'id': '6480076',
            'ext': 'mp4',
            'title': 'My HouseMV',
-            'creator': '2PM',
+            'creator': 'PM02:00',
        },
        # In this video, music URLs (anti.s) are blocked outside China and
        # USA, while the MV URL (mvurl) is available globally, so force the MV
--- a/youtube_dl/extractor/leeco.py
+++ b/youtube_dl/extractor/leeco.py
@ -28,7 +28,7 @@ from ..utils import (
 class LeIE(InfoExtractor):
    IE_DESC = '乐视网'
-    _VALID_URL = r'https?://www\.le\.com/ptv/vplay/(?P<id>\d+)\.html'
+    _VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|sports\.le\.com/video)/(?P<id>\d+)\.html'
    _URL_TEMPLATE = 'http://www.le.com/ptv/vplay/%s.html'
@ -69,6 +69,9 @@ class LeIE(InfoExtractor):
            'hls_prefer_native': True,
        },
        'skip': 'Only available in China',
+    }, {
+        'url': 'http://sports.le.com/video/25737697.html',
+        'only_matching': True,
    }]
    @staticmethod
@ -196,7 +199,7 @@ class LeIE(InfoExtractor):
 class LePlaylistIE(InfoExtractor):
-    _VALID_URL = r'https?://[a-z]+\.le\.com/[a-z]+/(?P<id>[a-z0-9_]+)'
+    _VALID_URL = r'https?://[a-z]+\.le\.com/(?!video)[a-z]+/(?P<id>[a-z0-9_]+)'
    _TESTS = [{
        'url': 'http://www.le.com/tv/46177.html',
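The two leeco patterns interact: once `LeIE` accepts `sports.le.com/video/...`, the playlist pattern has to stop matching the same URL, which is what the `(?!video)` lookahead does. The sketch below copies the three patterns from the diff and checks them with plain `re.match`. The constant names are chosen for this example only.

```python
import re

# Patterns copied from the diff; the names are chosen for this sketch.
LE_VIDEO_RE = r'https?://(?:www\.le\.com/ptv/vplay|sports\.le\.com/video)/(?P<id>\d+)\.html'
OLD_PLAYLIST_RE = r'https?://[a-z]+\.le\.com/[a-z]+/(?P<id>[a-z0-9_]+)'
NEW_PLAYLIST_RE = r'https?://[a-z]+\.le\.com/(?!video)[a-z]+/(?P<id>[a-z0-9_]+)'

sports_url = 'http://sports.le.com/video/25737697.html'
playlist_url = 'http://www.le.com/tv/46177.html'

# The single-video pattern now accepts the sports URL.
print(re.match(LE_VIDEO_RE, sports_url).group('id'))

# The old playlist pattern would also have claimed it; the new one does not.
print(bool(re.match(OLD_PLAYLIST_RE, sports_url)))
print(bool(re.match(NEW_PLAYLIST_RE, sports_url)))

# Ordinary playlist URLs are unaffected by the lookahead.
print(re.match(NEW_PLAYLIST_RE, playlist_url).group('id'))
```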
--- a/youtube_dl/extractor/limelight.py
+++ b/youtube_dl/extractor/limelight.py
@ -98,13 +98,19 @@ class LimelightBaseIE(InfoExtractor):
        } for thumbnail in properties.get('thumbnails', []) if thumbnail.get('url')]
        subtitles = {}
-        for caption in properties.get('captions', {}):
+        for caption in properties.get('captions', []):
            lang = caption.get('language_code')
            subtitles_url = caption.get('url')
            if lang and subtitles_url:
-                subtitles[lang] = [{
+                subtitles.setdefault(lang, []).append({
                    'url': subtitles_url,
-                }]
+                })
+        closed_captions_url = properties.get('closed_captions_url')
+        if closed_captions_url:
+            subtitles.setdefault('en', []).append({
+                'url': closed_captions_url,
+                'ext': 'ttml',
+            })
        return {
            'id': video_id,
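The subtitle change in the hunk above replaces assignment with `setdefault(...).append(...)`, so several tracks for the same language accumulate instead of overwriting each other. Below is a minimal standalone sketch of that logic. `collect_subtitles` and the sample `properties` dictionary are hypothetical and only mirror the field names used in the diff.

```python
def collect_subtitles(properties):
    """Group caption URLs by language, keeping every track."""
    subtitles = {}
    for caption in properties.get('captions', []):
        lang = caption.get('language_code')
        subtitles_url = caption.get('url')
        if lang and subtitles_url:
            # setdefault creates the list once, later tracks are appended.
            subtitles.setdefault(lang, []).append({'url': subtitles_url})
    closed_captions_url = properties.get('closed_captions_url')
    if closed_captions_url:
        subtitles.setdefault('en', []).append({
            'url': closed_captions_url,
            'ext': 'ttml',
        })
    return subtitles


# Made-up input with two English tracks plus a closed-captions URL.
properties = {
    'captions': [
        {'language_code': 'en', 'url': 'http://example.com/en-1.vtt'},
        {'language_code': 'en', 'url': 'http://example.com/en-2.vtt'},
        {'language_code': 'de', 'url': 'http://example.com/de.vtt'},
        {'language_code': 'fr'},
    ],
    'closed_captions_url': 'http://example.com/cc.ttml',
}

subtitles = collect_subtitles(properties)
print(sorted(subtitles))
print(len(subtitles['en']))
```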
@ -123,7 +129,18 @@ class LimelightBaseIE(InfoExtractor):
 class LimelightMediaIE(LimelightBaseIE):
    IE_NAME = 'limelight'
-    _VALID_URL = r'(?:limelight:media:|https?://link\.videoplatform\.limelight\.com/media/\??\bmediaId=)(?P<id>[a-z0-9]{32})'
+    _VALID_URL = r'''(?x)
+                        (?:
+                            limelight:media:|
+                            https?://
+                                (?:
+                                    link\.videoplatform\.limelight\.com/media/|
+                                    assets\.delvenetworks\.com/player/loader\.swf
+                                )
+                                \?.*?\bmediaId=
+                        )
+                        (?P<id>[a-z0-9]{32})
+                    '''
    _TESTS = [{
        'url': 'http://link.videoplatform.limelight.com/media/?mediaId=3ffd040b522b4485b6d84effc750cd86',
        'info_dict': {
@ -158,6 +175,9 @@ class LimelightMediaIE(LimelightBaseIE):
            # rtmp download
            'skip_download': True,
        },
+    }, {
+        'url': 'https://assets.delvenetworks.com/player/loader.swf?mediaId=8018a574f08d416e95ceaccae4ba0452',
+        'only_matching': True,
    }]
    _PLAYLIST_SERVICE_PATH = 'media'
    _API_PATH = 'media'
@ -176,15 +196,29 @@ class LimelightMediaIE(LimelightBaseIE):
 class LimelightChannelIE(LimelightBaseIE):
    IE_NAME = 'limelight:channel'
-    _VALID_URL = r'(?:limelight:channel:|https?://link\.videoplatform\.limelight\.com/media/\??\bchannelId=)(?P<id>[a-z0-9]{32})'
+    _VALID_URL = r'''(?x)
+                        (?:
+                            limelight:channel:|
+                            https?://
+                                (?:
+                                    link\.videoplatform\.limelight\.com/media/|
+                                    assets\.delvenetworks\.com/player/loader\.swf
+                                )
+                                \?.*?\bchannelId=
+                        )
+                        (?P<id>[a-z0-9]{32})
+                    '''
-    _TEST = {
+    _TESTS = [{
        'url': 'http://link.videoplatform.limelight.com/media/?channelId=ab6a524c379342f9b23642917020c082',
        'info_dict': {
            'id': 'ab6a524c379342f9b23642917020c082',
            'title': 'Javascript Sample Code',
        },
        'playlist_mincount': 3,
-    }
+    }, {
+        'url': 'http://assets.delvenetworks.com/player/loader.swf?channelId=ab6a524c379342f9b23642917020c082',
+        'only_matching': True,
+    }]
    _PLAYLIST_SERVICE_PATH = 'channel'
    _API_PATH = 'channels'
@ -207,15 +241,29 @@ class LimelightChannelIE(LimelightBaseIE):
 class LimelightChannelListIE(LimelightBaseIE):
    IE_NAME = 'limelight:channel_list'
-    _VALID_URL = r'(?:limelight:channel_list:|https?://link\.videoplatform\.limelight\.com/media/\?.*?\bchannelListId=)(?P<id>[a-z0-9]{32})'
+    _VALID_URL = r'''(?x)
+                        (?:
+                            limelight:channel_list:|
+                            https?://
+                                (?:
+                                    link\.videoplatform\.limelight\.com/media/|
+                                    assets\.delvenetworks\.com/player/loader\.swf
+                                )
+                                \?.*?\bchannelListId=
+                        )
+                        (?P<id>[a-z0-9]{32})
+                    '''
-    _TEST = {
+    _TESTS = [{
        'url': 'http://link.videoplatform.limelight.com/media/?channelListId=301b117890c4465c8179ede21fd92e2b',
        'info_dict': {
            'id': '301b117890c4465c8179ede21fd92e2b',
            'title': 'Website - Hero Player',
        },
        'playlist_mincount': 2,
-    }
+    }, {
+        'url': 'https://assets.delvenetworks.com/player/loader.swf?channelListId=301b117890c4465c8179ede21fd92e2b',
+        'only_matching': True,
+    }]
    _PLAYLIST_SERVICE_PATH = 'channel_list'
    def _real_extract(self, url):
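The limelight patterns were rewritten in verbose mode (`(?x)`), where whitespace and line breaks inside the pattern are ignored. The sketch below compiles the media pattern from the diff on its own and runs it against the two test URLs from the same file and against the internal `limelight:media:` form. The variable name `MEDIA_URL_RE` is introduced here for illustration.

```python
import re

# Media pattern as in the diff; (?x) lets it span several indented lines.
MEDIA_URL_RE = re.compile(r'''(?x)
    (?:
        limelight:media:|
        https?://
            (?:
                link\.videoplatform\.limelight\.com/media/|
                assets\.delvenetworks\.com/player/loader\.swf
            )
            \?.*?\bmediaId=
    )
    (?P<id>[a-z0-9]{32})
''')

urls = [
    'http://link.videoplatform.limelight.com/media/?mediaId=3ffd040b522b4485b6d84effc750cd86',
    'https://assets.delvenetworks.com/player/loader.swf?mediaId=8018a574f08d416e95ceaccae4ba0452',
    'limelight:media:3ffd040b522b4485b6d84effc750cd86',
]

media_ids = [MEDIA_URL_RE.match(url).group('id') for url in urls]
for media_id in media_ids:
    print(media_id)
```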
--- a/youtube_dl/extractor/matchtv.py
+++ b/youtube_dl/extractor/matchtv.py
@ -4,16 +4,12 @@ from __future__ import unicode_literals
 import random
 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_urlencode
-from ..utils import (
-    sanitized_Request,
-    xpath_text,
-)
+from ..utils import xpath_text
 class MatchTVIE(InfoExtractor):
-    _VALID_URL = r'https?://matchtv\.ru/?#live-player'
+    _VALID_URL = r'https?://matchtv\.ru(?:/on-air|/?#live-player)'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://matchtv.ru/#live-player',
        'info_dict': {
            'id': 'matchtv-live',
@ -24,12 +20,16 @@ class MatchTVIE(InfoExtractor):
        'params': {
            'skip_download': True,
        },
-    }
+    }, {
+        'url': 'http://matchtv.ru/on-air/',
+        'only_matching': True,
+    }]
    def _real_extract(self, url):
        video_id = 'matchtv-live'
-        request = sanitized_Request(
+        video_url = self._download_json(
-            'http://player.matchtv.ntvplus.tv/player/smil?%s' % compat_urllib_parse_urlencode({
+            'http://player.matchtv.ntvplus.tv/player/smil', video_id,
+            query={
                'ts': '',
                'quality': 'SD',
                'contentId': '561d2c0df7159b37178b4567',
@ -40,11 +40,10 @@ class MatchTVIE(InfoExtractor):
                'contentType': 'channel',
                'timeShift': '0',
                'platform': 'portal',
-            }),
+            },
            headers={
                'Referer': 'http://player.matchtv.ntvplus.tv/embed-player/NTVEmbedPlayer.swf',
-            })
+            })['data']['videoUrl']
-        video_url = self._download_json(request, video_id)['data']['videoUrl']
        f4m_url = xpath_text(self._download_xml(video_url, video_id), './to')
        formats = self._extract_f4m_formats(f4m_url, video_id)
        self._sort_formats(formats)
--- a/youtube_dl/extractor/nrk.py
+++ b/youtube_dl/extractor/nrk.py
@ -163,7 +163,7 @@ class NRKTVIE(NRKBaseIE):
            'ext': 'mp4',
            'title': '20 spørsmål 23.05.2014',
            'description': 'md5:bdea103bc35494c143c6a9acdd84887a',
-            'duration': 1741.52,
+            'duration': 1741,
        },
    }, {
        'url': 'https://tv.nrk.no/program/mdfp15000514',
@ -173,7 +173,7 @@ class NRKTVIE(NRKBaseIE):
            'ext': 'mp4',
            'title': 'Grunnlovsjubiléet - Stor ståhei for ingenting 24.05.2014',
            'description': 'md5:89290c5ccde1b3a24bb8050ab67fe1db',
-            'duration': 4605.08,
+            'duration': 4605,
        },
    }, {
        # single playlist video
@ -260,30 +260,34 @@ class NRKPlaylistIE(InfoExtractor):
 class NRKSkoleIE(InfoExtractor):
    IE_DESC = 'NRK Skole'
-    _VALID_URL = r'https?://(?:www\.)?nrk\.no/skole/klippdetalj?.*\btopic=(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:www\.)?nrk\.no/skole/?\?.*\bmediaId=(?P<id>\d+)'
    _TESTS = [{
-        'url': 'http://nrk.no/skole/klippdetalj?topic=nrk:klipp/616532',
+        'url': 'https://www.nrk.no/skole/?page=search&q=&mediaId=14099',
-        'md5': '04cd85877cc1913bce73c5d28a47e00f',
+        'md5': '6bc936b01f9dd8ed45bc58b252b2d9b6',
        'info_dict': {
            'id': '6021',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Genetikk og eneggede tvillinger',
            'description': 'md5:3aca25dcf38ec30f0363428d2b265f8d',
            'duration': 399,
        },
    }, {
-        'url': 'http://www.nrk.no/skole/klippdetalj?topic=nrk%3Aklipp%2F616532#embed',
+        'url': 'https://www.nrk.no/skole/?page=objectives&subject=naturfag&objective=K15114&mediaId=19355',
        'only_matching': True,
-    }, {
-        'url': 'http://www.nrk.no/skole/klippdetalj?topic=urn:x-mediadb:21379',
-        'only_matching': True,
    }]
    def _real_extract(self, url):
-        video_id = compat_urllib_parse_unquote(self._match_id(url))
+        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        webpage = self._download_webpage(
+            'https://mimir.nrk.no/plugin/1.0/static?mediaId=%s' % video_id,
+            video_id)
-        nrk_id = self._search_regex(r'data-nrk-id=["\'](\d+)', webpage, 'nrk id')
+        nrk_id = self._parse_json(
+            self._search_regex(
+                r'<script[^>]+type=["\']application/json["\'][^>]*>({.+?})</script>',
+                webpage, 'application json'),
+            video_id)['activeMedia']['psId']
        return self.url_result('nrk:%s' % nrk_id)
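The reworked NRK Skole extractor does two things: it takes the numeric `mediaId` from the URL, then reads `activeMedia.psId` from a JSON blob embedded in a `<script type="application/json">` tag. The sketch below reproduces both steps with only `re` and `json`. The sample markup is invented for the example and is not the real page, and `extract_ps_id` is a hypothetical helper.

```python
import json
import re

# Both patterns are copied from the diff.
NRK_SKOLE_RE = r'https?://(?:www\.)?nrk\.no/skole/?\?.*\bmediaId=(?P<id>\d+)'
SCRIPT_JSON_RE = r'<script[^>]+type=["\']application/json["\'][^>]*>({.+?})</script>'


def extract_ps_id(webpage):
    """Return activeMedia.psId from the embedded JSON (hypothetical helper)."""
    match = re.search(SCRIPT_JSON_RE, webpage)
    if match is None:
        raise ValueError('no application/json script tag found')
    return json.loads(match.group(1))['activeMedia']['psId']


url = 'https://www.nrk.no/skole/?page=search&q=&mediaId=14099'
media_id = re.match(NRK_SKOLE_RE, url).group('id')

# Invented single-line markup, only shaped like what the regex expects.
sample_page = (
    '<html><script type="application/json">'
    '{"activeMedia": {"psId": "6021"}}'
    '</script></html>'
)
ps_id = extract_ps_id(sample_page)

print(media_id)
print('nrk:%s' % ps_id)
```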
--- a/youtube_dl/extractor/streamcloud.py
+++ b/youtube_dl/extractor/streamcloud.py
@ -5,6 +5,7 @@ import re
 from .common import InfoExtractor
 from ..utils import (
+    ExtractorError,
    sanitized_Request,
    urlencode_postdata,
 )
@ -14,7 +15,7 @@ class StreamcloudIE(InfoExtractor):
    IE_NAME = 'streamcloud.eu'
    _VALID_URL = r'https?://streamcloud\.eu/(?P<id>[a-zA-Z0-9_-]+)(?:/(?P<fname>[^#?]*)\.html)?'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://streamcloud.eu/skp9j99s4bpz/youtube-dl_test_video_____________-BaW_jenozKc.mp4.html',
        'md5': '6bea4c7fa5daaacc2a946b7146286686',
        'info_dict': {
@ -23,7 +24,10 @@ class StreamcloudIE(InfoExtractor):
            'title': 'youtube-dl test video  \'/\\ ä ↭',
        },
        'skip': 'Only available from the EU'
-    }
+    }, {
+        'url': 'http://streamcloud.eu/ua8cmfh1nbe6/NSHIP-148--KUC-NG--H264-.mp4.html',
+        'only_matching': True,
+    }]
    def _real_extract(self, url):
        video_id = self._match_id(url)
@ -31,6 +35,10 @@ class StreamcloudIE(InfoExtractor):
        orig_webpage = self._download_webpage(url, video_id)
+        if '>File Not Found<' in orig_webpage:
+            raise ExtractorError(
+                'Video %s does not exist' % video_id, expected=True)
        fields = re.findall(r'''(?x)<input\s+
            type="(?:hidden|submit)"\s+
            name="([^"]+)"\s+
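The removed-video check added to streamcloud is a plain substring test followed by an "expected" error, so the user sees a short message instead of a bug-report prompt. A self-contained sketch follows. The `ExtractorError` class here is a minimal stand-in for the one in `youtube_dl.utils`, and `check_not_removed` is a hypothetical wrapper around the three added lines.

```python
class ExtractorError(Exception):
    """Minimal stand-in for youtube_dl.utils.ExtractorError."""

    def __init__(self, msg, expected=False):
        super(ExtractorError, self).__init__(msg)
        # expected=True marks a known condition rather than a bug.
        self.expected = expected


def check_not_removed(webpage, video_id):
    """Raise an expected error when the page reports a missing file."""
    if '>File Not Found<' in webpage:
        raise ExtractorError(
            'Video %s does not exist' % video_id, expected=True)


try:
    check_not_removed('<b>File Not Found</b>', 'ua8cmfh1nbe6')
except ExtractorError as err:
    print('%s (expected=%s)' % (err, err.expected))
```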
--- a/youtube_dl/extractor/telewebion.py
+++ b/youtube_dl/extractor/telewebion.py
@ -0,0 +1,55 @@
+# coding: utf-8
+from __future__ import unicode_literals
+from .common import InfoExtractor
+class TelewebionIE(InfoExtractor):
+    _VALID_URL = r'https?://www\.telewebion\.com/#!/episode/(?P<id>\d+)'
+    _TEST = {
+        'url': 'http://www.telewebion.com/#!/episode/1263668/',
+        'info_dict': {
+            'id': '1263668',
+            'ext': 'mp4',
+            'title': 'قرعه\u200cکشی لیگ قهرمانان اروپا',
+            'thumbnail': 're:^https?://.*\.jpg',
+            'view_count': int,
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+    }
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        secure_token = self._download_webpage(
+            'http://m.s2.telewebion.com/op/op?action=getSecurityToken', video_id)
+        episode_details = self._download_json(
+            'http://m.s2.telewebion.com/op/op', video_id,
+            query={'action': 'getEpisodeDetails', 'episode_id': video_id})
+        m3u8_url = 'http://m.s1.telewebion.com/smil/%s.m3u8?filepath=%s&m3u8=1&secure_token=%s' % (
+            video_id, episode_details['file_path'], secure_token)
+        formats = self._extract_m3u8_formats(
+            m3u8_url, video_id, ext='mp4', m3u8_id='hls')
+        picture_paths = [
+            episode_details.get('picture_path'),
+            episode_details.get('large_picture_path'),
+        ]
+        thumbnails = [{
+            'url': picture_path,
+            'preference': idx,
+        } for idx, picture_path in enumerate(picture_paths) if picture_path is not None]
+        return {
+            'id': video_id,
+            'title': episode_details['title'],
+            'formats': formats,
+            'thumbnails': thumbnails,
+            'view_count': episode_details.get('view_count'),
+        }
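One detail of the new Telewebion extractor is easy to miss: `enumerate` runs over the unfiltered list, so the large picture keeps preference 1 even when the small one is absent. The sketch below isolates that list comprehension. `build_thumbnails` and the sample paths are hypothetical and used for illustration only.

```python
def build_thumbnails(episode_details):
    """Build the thumbnails list the same way the diff does."""
    picture_paths = [
        episode_details.get('picture_path'),
        episode_details.get('large_picture_path'),
    ]
    # idx comes from the position in the unfiltered list, so the
    # preference of the large picture does not depend on the small one.
    return [{
        'url': picture_path,
        'preference': idx,
    } for idx, picture_path in enumerate(picture_paths) if picture_path is not None]


both = build_thumbnails({
    'picture_path': 'http://example.com/small.jpg',
    'large_picture_path': 'http://example.com/large.jpg',
})
large_only = build_thumbnails({
    'large_picture_path': 'http://example.com/large.jpg',
})

print(both)
print(large_only)
```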
--- a/youtube_dl/extractor/viki.py
+++ b/youtube_dl/extractor/viki.py
@ -101,10 +101,13 @@ class VikiBaseIE(InfoExtractor):
            self.report_warning('Unable to get session token, login has probably failed')
    @staticmethod
-    def dict_selection(dict_obj, preferred_key):
+    def dict_selection(dict_obj, preferred_key, allow_fallback=True):
        if preferred_key in dict_obj:
            return dict_obj.get(preferred_key)
+        if not allow_fallback:
+            return
        filtered_dict = list(filter(None, [dict_obj.get(k) for k in dict_obj.keys()]))
        return filtered_dict[0] if filtered_dict else None
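The new `allow_fallback` flag changes what happens when the preferred language is missing: with the default, any non-empty value is returned, while with `allow_fallback=False` the caller gets `None` and can build its own title. The function below is a standalone rewrite of the method for illustration, and the sample dictionaries are made up.

```python
def dict_selection(dict_obj, preferred_key, allow_fallback=True):
    """Return the preferred entry, optionally falling back to any other."""
    if preferred_key in dict_obj:
        return dict_obj.get(preferred_key)
    if not allow_fallback:
        return None
    # Fall back to the first non-empty value, whatever its key is.
    filtered = [value for value in dict_obj.values() if value]
    return filtered[0] if filtered else None


# Made-up title dictionaries keyed by language code.
korean_only = {'ko': 'Kkotboda Namja'}
with_english = {'en': 'Boys Over Flowers', 'ko': 'Kkotboda Namja'}

print(dict_selection(with_english, 'en'))
print(dict_selection(korean_only, 'en'))
print(dict_selection(korean_only, 'en', allow_fallback=False))
```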
@ -127,7 +130,7 @@ class VikiIE(VikiBaseIE):
    }, {
        # clip
        'url': 'http://www.viki.com/videos/1067139v-the-avengers-age-of-ultron-press-conference',
-        'md5': '86c0b5dbd4d83a6611a79987cc7a1989',
+        'md5': 'feea2b1d7b3957f70886e6dfd8b8be84',
        'info_dict': {
            'id': '1067139v',
            'ext': 'mp4',
@ -156,17 +159,18 @@ class VikiIE(VikiBaseIE):
        'params': {
            # m3u8 download
            'skip_download': True,
-        }
+        },
+        'skip': 'Blocked in the US',
    }, {
        # episode
        'url': 'http://www.viki.com/videos/44699v-boys-over-flowers-episode-1',
-        'md5': '190f3ef426005ba3a080a63325955bc3',
+        'md5': '1f54697dabc8f13f31bf06bb2e4de6db',
        'info_dict': {
            'id': '44699v',
            'ext': 'mp4',
            'title': 'Boys Over Flowers - Episode 1',
-            'description': 'md5:52617e4f729c7d03bfd4bcbbb6e946f2',
+            'description': 'md5:b89cf50038b480b88b5b3c93589a9076',
-            'duration': 4155,
+            'duration': 4204,
            'timestamp': 1270496524,
            'upload_date': '20100405',
            'uploader': 'group8',
@ -196,7 +200,7 @@ class VikiIE(VikiBaseIE):
    }, {
        # non-English description
        'url': 'http://www.viki.com/videos/158036v-love-in-magic',
-        'md5': '1713ae35df5a521b31f6dc40730e7c9c',
+        'md5': '013dc282714e22acf9447cad14ff1208',
        'info_dict': {
            'id': '158036v',
            'ext': 'mp4',
@ -217,7 +221,7 @@ class VikiIE(VikiBaseIE):
        self._check_errors(video)
-        title = self.dict_selection(video.get('titles', {}), 'en')
+        title = self.dict_selection(video.get('titles', {}), 'en', allow_fallback=False)
        if not title:
            title = 'Episode %d' % video.get('number') if video.get('type') == 'episode' else video.get('id') or video_id
            container_titles = video.get('container', {}).get('titles', {})
@ -302,7 +306,7 @@ class VikiChannelIE(VikiBaseIE):
            'title': 'Boys Over Flowers',
            'description': 'md5:ecd3cff47967fe193cff37c0bec52790',
        },
-        'playlist_count': 70,
+        'playlist_mincount': 71,
    }, {
        'url': 'http://www.viki.com/tv/1354c-poor-nastya-complete',
        'info_dict': {
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@ -66,6 +66,69 @@ class VimeoBaseInfoExtractor(InfoExtractor):
    def _set_vimeo_cookie(self, name, value):
        self._set_cookie('vimeo.com', name, value)
+    def _vimeo_sort_formats(self, formats):
+        # Bitrates are completely broken. Single m3u8 may contain entries in kbps and bps
+        # at the same time without actual units specified. This lead to wrong sorting.
+        self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'format_id'))
+    def _parse_config(self, config, video_id):
+        # Extract title
+        video_title = config['video']['title']
+        # Extract uploader, uploader_url and uploader_id
+        video_uploader = config['video'].get('owner', {}).get('name')
+        video_uploader_url = config['video'].get('owner', {}).get('url')
+        video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
+        # Extract video thumbnail
+        video_thumbnail = config['video'].get('thumbnail')
+        if video_thumbnail is None:
+            video_thumbs = config['video'].get('thumbs')
+            if video_thumbs and isinstance(video_thumbs, dict):
+                _, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
+        # Extract video duration
+        video_duration = int_or_none(config['video'].get('duration'))
+        formats = []
+        config_files = config['video'].get('files') or config['request'].get('files', {})
+        for f in config_files.get('progressive', []):
+            video_url = f.get('url')
+            if not video_url:
+                continue
+            formats.append({
+                'url': video_url,
+                'format_id': 'http-%s' % f.get('quality'),
+                'width': int_or_none(f.get('width')),
+                'height': int_or_none(f.get('height')),
+                'fps': int_or_none(f.get('fps')),
+                'tbr': int_or_none(f.get('bitrate')),
+            })
+        m3u8_url = config_files.get('hls', {}).get('url')
+        if m3u8_url:
+            formats.extend(self._extract_m3u8_formats(
+                m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+        subtitles = {}
+        text_tracks = config['request'].get('text_tracks')
+        if text_tracks:
+            for tt in text_tracks:
+                subtitles[tt['lang']] = [{
+                    'ext': 'vtt',
+                    'url': 'https://vimeo.com' + tt['url'],
+                }]
+        return {
+            'title': video_title,
+            'uploader': video_uploader,
+            'uploader_id': video_uploader_id,
+            'uploader_url': video_uploader_url,
+            'thumbnail': video_thumbnail,
+            'duration': video_duration,
+            'formats': formats,
+            'subtitles': subtitles,
+        }
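The thumbnail fallback in `_parse_config` sorts `(width, url)` pairs numerically and takes the last one, treating any non-numeric key as width 0. A standalone sketch of that selection follows. `pick_widest_thumbnail` and the sample `thumbs` mapping are hypothetical. The explicit `int()` conversion matters because a plain string sort would rank `'640'` above `'1280'`.

```python
def pick_widest_thumbnail(video_thumbs):
    """Return the URL stored under the largest numeric width key."""
    if not video_thumbs or not isinstance(video_thumbs, dict):
        return None
    # Non-numeric keys are treated as width 0, so they sort first.
    _, thumbnail_url = sorted(
        (int(width if width.isdigit() else 0), t_url)
        for width, t_url in video_thumbs.items())[-1]
    return thumbnail_url


# Made-up mapping in the shape the diff expects: width string -> URL.
thumbs = {
    '640': 'http://example.com/640.jpg',
    '1280': 'http://example.com/1280.jpg',
    'base': 'http://example.com/base.jpg',
}

print(pick_widest_thumbnail(thumbs))
```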
 class VimeoIE(VimeoBaseInfoExtractor):
    """Information extractor for vimeo.com."""
@ -153,7 +216,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
                'uploader_id': 'user18948128',
                'uploader': 'Jaime Marquínez Ferrándiz',
                'duration': 10,
-                'description': 'This is "youtube-dl password protected test video" by Jaime Marquínez Ferrándiz on Vimeo, the home for high quality videos and the people\u2026',
+                'description': 'This is "youtube-dl password protected test video" by  on Vimeo, the home for high quality videos and the people who love them.',
            },
            'params': {
                'videopassword': 'youtube-dl',
@ -389,21 +452,6 @@ class VimeoIE(VimeoBaseInfoExtractor):
                    'https://player.vimeo.com/player/%s' % feature_id,
                    {'force_feature_id': True}), 'Vimeo')
-        # Extract title
-        video_title = config['video']['title']
-        # Extract uploader, uploader_url and uploader_id
-        video_uploader = config['video'].get('owner', {}).get('name')
-        video_uploader_url = config['video'].get('owner', {}).get('url')
-        video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
-        # Extract video thumbnail
-        video_thumbnail = config['video'].get('thumbnail')
-        if video_thumbnail is None:
-            video_thumbs = config['video'].get('thumbs')
-            if video_thumbs and isinstance(video_thumbs, dict):
-                _, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
        # Extract video description
        video_description = self._html_search_regex(
@ -423,9 +471,6 @@ class VimeoIE(VimeoBaseInfoExtractor):
        if not video_description and not mobj.group('player'):
            self._downloader.report_warning('Cannot find video description')
        # Extract video duration
        video_duration = int_or_none(config['video'].get('duration'))
        # Extract upload date
        video_upload_date = None
        mobj = re.search(r'<time[^>]+datetime="([^"]+)"', webpage)
@ -463,53 +508,22 @@ class VimeoIE(VimeoBaseInfoExtractor):
                            'format_id': source_name,
                            'preference': 1,
                        })
        config_files = config['video'].get('files') or config['request'].get('files', {})
        for f in config_files.get('progressive', []):
            video_url = f.get('url')
            if not video_url:
                continue
            formats.append({
                'url': video_url,
                'format_id': 'http-%s' % f.get('quality'),
                'width': int_or_none(f.get('width')),
                'height': int_or_none(f.get('height')),
                'fps': int_or_none(f.get('fps')),
                'tbr': int_or_none(f.get('bitrate')),
            })
        m3u8_url = config_files.get('hls', {}).get('url')
        if m3u8_url:
            formats.extend(self._extract_m3u8_formats(
                m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
        # Bitrates are completely broken. Single m3u8 may contain entries in kbps and bps
        # at the same time without actual units specified. This lead to wrong sorting.
        self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'format_id'))
-        subtitles = {}
-        text_tracks = config['request'].get('text_tracks')
-        if text_tracks:
-            for tt in text_tracks:
-                subtitles[tt['lang']] = [{
-                    'ext': 'vtt',
-                    'url': 'https://vimeo.com' + tt['url'],
-                }]
-        return {
+        info_dict = self._parse_config(config, video_id)
+        formats.extend(info_dict['formats'])
+        self._vimeo_sort_formats(formats)
+        info_dict.update({
             'id': video_id,
-            'uploader': video_uploader,
-            'uploader_url': video_uploader_url,
-            'uploader_id': video_uploader_id,
-            'upload_date': video_upload_date,
-            'title': video_title,
-            'thumbnail': video_thumbnail,
-            'description': video_description,
-            'duration': video_duration,
             'formats': formats,
+            'upload_date': video_upload_date,
+            'description': video_description,
             'webpage_url': url,
             'view_count': view_count,
             'like_count': like_count,
             'comment_count': comment_count,
-            'subtitles': subtitles,
-        }
+        })
+
+        return info_dict
 class VimeoOndemandIE(VimeoBaseInfoExtractor):
@ -692,7 +706,7 @@ class VimeoGroupsIE(VimeoAlbumIE):
        return self._extract_videos(name, 'https://vimeo.com/groups/%s' % name)
-class VimeoReviewIE(InfoExtractor):
+class VimeoReviewIE(VimeoBaseInfoExtractor):
    IE_NAME = 'vimeo:review'
    IE_DESC = 'Review pages on vimeo'
    _VALID_URL = r'https://vimeo\.com/[^/]+/review/(?P<id>[^/]+)'
@ -704,6 +718,7 @@ class VimeoReviewIE(InfoExtractor):
            'ext': 'mp4',
            'title': "DICK HARDWICK 'Comedian'",
            'uploader': 'Richard Hardwick',
            'uploader_id': 'user21297594',
        }
    }, {
        'note': 'video player needs Referer',
@ -716,14 +731,18 @@ class VimeoReviewIE(InfoExtractor):
            'uploader': 'DevWeek Events',
            'duration': 2773,
            'thumbnail': 're:^https?://.*\.jpg$',
            'uploader_id': 'user22258446',
        }
    }]
    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        player_url = 'https://player.vimeo.com/player/' + video_id
-        return self.url_result(player_url, 'Vimeo', video_id)
+        video_id = self._match_id(url)
+        config = self._download_json(
+            'https://player.vimeo.com/video/%s/config' % video_id, video_id)
+        info_dict = self._parse_config(config, video_id)
+        self._vimeo_sort_formats(info_dict['formats'])
+        info_dict['id'] = video_id
+        return info_dict
 class VimeoWatchLaterIE(VimeoChannelIE):
--- a/youtube_dl/extractor/youporn.py
+++ b/youtube_dl/extractor/youporn.py
@ -17,7 +17,7 @@ class YouPornIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?youporn\.com/watch/(?P<id>\d+)/(?P<display_id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://www.youporn.com/watch/505835/sex-ed-is-it-safe-to-masturbate-daily/',
-        'md5': '71ec5fcfddacf80f495efa8b6a8d9a89',
+        'md5': '3744d24c50438cf5b6f6d59feb5055c2',
        'info_dict': {
            'id': '505835',
            'display_id': 'sex-ed-is-it-safe-to-masturbate-daily',
@ -121,21 +121,21 @@ class YouPornIE(InfoExtractor):
            webpage, 'thumbnail', fatal=False, group='thumbnail')
        uploader = self._html_search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfoBy(?:\s+[^"\']+)?["\'][^>]*>\s*By:\s*</div>(.+?)</(?:a|div)>',
+            r'(?s)<div[^>]+class=["\']submitByLink["\'][^>]*>(.+?)</div>',
            webpage, 'uploader', fatal=False)
        upload_date = unified_strdate(self._html_search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfoTime["\'][^>]*>(.+?)</div>',
+            r'(?s)<div[^>]+class=["\']videoInfo(?:Date|Time)["\'][^>]*>(.+?)</div>',
            webpage, 'upload date', fatal=False))
        age_limit = self._rta_search(webpage)
        average_rating = int_or_none(self._search_regex(
-            r'<div[^>]+class=["\']videoInfoRating["\'][^>]*>\s*<div[^>]+class=["\']videoRatingPercentage["\'][^>]*>(\d+)%</div>',
+            r'<div[^>]+class=["\']videoRatingPercentage["\'][^>]*>(\d+)%</div>',
            webpage, 'average rating', fatal=False))
        view_count = str_to_int(self._search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfoViews["\'][^>]*>.*?([\d,.]+)\s*</div>',
-            webpage, 'view count', fatal=False))
+            r'(?s)<div[^>]+class=(["\']).*?\bvideoInfoViews\b.*?\1[^>]*>.*?(?P<count>[\d,.]+)<',
+            webpage, 'view count', fatal=False, group='count'))
        comment_count = str_to_int(self._search_regex(
            r'>All [Cc]omments? \(([\d,.]+)\)',
            webpage, 'comment count', fatal=False))
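
The new view-count pattern in the hunk above matches `videoInfoViews` anywhere inside the `class` attribute, uses the backreference `\1` to require the same closing quote, and exposes the number through the named group `count`. A minimal sketch of how that pattern behaves with plain `re` follows. The HTML snippets are hypothetical, not real youporn markup, and the `to_int` helper is a simplified stand-in for the `str_to_int` call in the diff, not its actual implementation.

```python
import re

# Pattern copied from the added line in the youporn.py hunk.
VIEW_COUNT_RE = r'(?s)<div[^>]+class=(["\']).*?\bvideoInfoViews\b.*?\1[^>]*>.*?(?P<count>[\d,.]+)<'


def to_int(text):
    # Hypothetical stand-in: drop thousands separators, then convert.
    return int(re.sub(r'[,.]', '', text))


# Hypothetical markup: extra class names and either quote style should match.
html_double = '<div class="videoInfoViews sprite"><span>Views</span> 1,234,567</div>'
html_single = "<div id='x' class='box videoInfoViews'>Views: 8,001</div>"

views_double = to_int(re.search(VIEW_COUNT_RE, html_double).group('count'))
views_single = to_int(re.search(VIEW_COUNT_RE, html_single).group('count'))

# A class name that merely contains the token as a substring is rejected by \b.
no_match = re.search(VIEW_COUNT_RE, '<div class="videoInfoViewsOld">42</div>')
```

Using a named group here keeps the call site independent of the group numbering, which changed once the quote character became capture group 1.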
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@ -76,7 +76,7 @@ def register_socks_protocols():
 compiled_regex_type = type(re.compile(''))
 std_headers = {
-    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/44.0 (Chrome)',
+    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome)',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
@ -1901,6 +1901,16 @@ def dict_get(d, key_or_keys, default=None, skip_false_values=True):
    return d.get(key_or_keys, default)
 def try_get(src, getter, expected_type=None):
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        pass
    else:
        if expected_type is None or isinstance(v, expected_type):
            return v
 def encode_compat_str(string, encoding=preferredencoding(), errors='strict'):
    return string if isinstance(string, compat_str) else compat_str(string, encoding, errors)
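
As a rough illustration of what the `try_get` helper added above is for, the sketch below copies its body from the hunk and applies it to a nested dictionary. The `config` structure and its keys are hypothetical and only stand in for the kind of JSON an extractor might receive.

```python
def try_get(src, getter, expected_type=None):
    # Body copied from the utils.py hunk: swallow the usual lookup errors
    # and return None unless the value has the expected type.
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        pass
    else:
        if expected_type is None or isinstance(v, expected_type):
            return v


# Hypothetical JSON payload; the keys are illustrative only.
config = {'video': {'owner': {'name': 'Example User'}, 'thumbs': None}}

# Successful lookup with a type check.
owner_name = try_get(config, lambda c: c['video']['owner']['name'], str)

# Indexing None raises TypeError, which is swallowed.
missing_thumb = try_get(config, lambda c: c['video']['thumbs']['640'], str)

# The value exists but is a dict, so the type check rejects it.
wrong_type = try_get(config, lambda c: c['video']['owner'], str)

# A missing top-level key raises KeyError, which is swallowed.
first_file = try_get(config, lambda c: c['request']['files'][0])
```

Each failed lookup yields `None` instead of raising, so a caller can chain a fallback with `or` rather than nesting `.get(..., {})` calls.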
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2016.06.11.1'
+__version__ = '2016.06.12'
Author	SHA1	Message	Date
Sergey M․	77a9a9c295	release 2016.06.12	2016-06-12 12:06:48 +07:00
Sergey M․	84dcd1c4e4	[streamcloud] Detect removed videos (Closes #3768)	2016-06-12 11:08:39 +07:00
Sergey M․	971e3b7520	[nrk:skole] Fix extraction	2016-06-12 07:20:37 +07:00
Sergey M․	4e79011729	[nrktv] Fix tests	2016-06-12 06:57:04 +07:00
Sergey M․	a936ac321c	[README.md] Document using output template in batch files (Closes #9717)	2016-06-12 06:39:31 +07:00
Sergey M․	98960c911c	[instagram] Extract metadata from JSON	2016-06-12 06:06:04 +07:00
Sergey M․	329ca3bef6	[utils] Add try_get. To reduce boilerplate when accessing JSON	2016-06-12 06:05:34 +07:00
Sergey M․	2c3322e36e	[youporn] Fix metadata extraction	2016-06-12 04:49:37 +07:00
Sergey M․	80ae228b34	[matchtv] Modernize	2016-06-12 01:57:23 +07:00
Yen Chi Hsuan	6d28c408cf	[viki] Do not use a fallback language for title in the first try. In test_Viki_3, 'titles' gives a Hebrew title.	2016-06-11 23:00:44 +08:00
Yen Chi Hsuan	c83b35d4aa	[viki] Update _TESTS	2016-06-11 22:39:13 +08:00
Yen Chi Hsuan	94e5d6aedb	[viki] Skip a geo-restricted test	2016-06-11 21:49:01 +08:00
Yen Chi Hsuan	531a74968c	[vimeo] Fix extraction for VimeoReview videos	2016-06-11 21:35:08 +08:00
Yen Chi Hsuan	c5edd147d1	[generic] Remove an invalid test. Now handled by telewebion.py	2016-06-11 18:39:58 +08:00
Yen Chi Hsuan	856150d056	[telewebion] Add new extractor (closes #5135)	2016-06-11 18:39:58 +08:00
Yen Chi Hsuan	03ebea89b0	Merge pull request #9755 from vxbinaca/patch-2: [utils] Change Firefox 44 to 47	2016-06-11 17:38:45 +08:00
Paul Henning	15d106787e	[utils] Change Firefox 44 to 47. See commit title.	2016-06-11 05:36:31 -04:00
Yen Chi Hsuan	7aab3696dd	[kuwo] Update _TESTS	2016-06-11 15:37:04 +08:00
Yen Chi Hsuan	47787efa2b	[leeco] Recognize Le Sports URLs (fixes #9750)	2016-06-11 13:14:41 +08:00
Sergey M․	4a420119a6	release 2016.06.11.3	2016-06-11 08:34:30 +07:00
Sergey M․	33751818d3	release 2016.06.11.2	2016-06-11 08:28:51 +07:00
Sergey M․	698f127c1a	[setup.py] Add python 3.5 classifier	2016-06-11 06:14:22 +07:00
Sergey M․	fe458b6596	[limelight] Extract ttml subtitles (Closes #9739)	2016-06-11 05:57:27 +07:00
Sergey M․	21ac1a8ac3	[limelight] Fix typo	2016-06-11 05:52:50 +07:00
Sergey M․	79027c0ea0	[limelight] Improve _VALID_URLs	2016-06-11 05:40:02 +07:00
Sergey M․	4cad2929cd	[limelight] Fix _VALID_URLs	2016-06-11 05:30:44 +07:00
Sergey M․	62666af99f	[indavideo] Fix formats' height (Closes #9744)	2016-06-11 05:13:05 +07:00
Sergey M․	9ddc289f88	[README.md] Document missing playlist fields in output template	2016-06-11 04:59:47 +07:00