release 2016.06.12

[streamcloud] Detect removed videos (Closes #3768)
[nrk:skole] Fix extraction
2016-06-12 12:06:48 +07:00 · 2016-06-12 11:08:39 +07:00 · 2016-06-12 07:20:37 +07:00 · 2016-06-12 06:57:04 +07:00 · 2016-06-12 06:39:31 +07:00 · 2016-06-12 06:06:04 +07:00
20 changed files with 352 additions and 155 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.11.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.11.1**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.12**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.06.11.1
+[debug] youtube-dl version 2016.06.12
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/README.md
+++ b/README.md
@ -511,6 +511,9 @@ The basic usage is not to set any template arguments when downloading a single f
 - `autonumber`: Five-digit number that will be increased with each download, starting at zero
 - `playlist`: Name or id of the playlist that contains the video
 - `playlist_index`: Index of the video in the playlist padded with leading zeros according to the total length of the playlist
+ - `playlist_id`: Playlist identifier
+ - `playlist_title`: Playlist title
+

 Available for the video that belongs to some logical chapter or section:
 - `chapter`: Name or title of the chapter the video belongs to
@ -550,6 +553,10 @@ The current default template is `%(title)s-%(id)s.%(ext)s`.

 In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:

+#### Output template and Windows batch files
+
+If you are using output template inside a Windows batch file then you must escape plain percent characters (`%`) by doubling, so that `-o "%(title)s-%(id)s.%(ext)s"` should become `-o "%%(title)s-%%(id)s.%%(ext)s"`. However you should not touch `%`'s that are not plain characters, e.g. environment variables for expansion should stay intact: `-o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s"`.
+
 #### Output template examples

 Note on Windows you may need to use double quotes instead of single.
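Review note on the batch-file paragraph added above: the doubling rule can be sketched in Python. The helper name `escape_template_for_batch` is hypothetical and not part of youtube-dl. It assumes every output-template field has the form `%(name)...`, so only a `%` directly followed by `(` is doubled and `%VARIABLE%` references are left alone.

```python
import re


def escape_template_for_batch(template):
    """Double each '%' that begins a %(field)s placeholder.

    Hypothetical helper for illustration only. A '%' followed by '('
    is treated as a plain percent character that a batch file would
    otherwise consume; environment variables such as %HOMEPATH% are
    not followed by '(' and therefore stay intact.
    """
    return re.sub(r'%(?=\()', '%%', template)


print(escape_template_for_batch('%(title)s-%(id)s.%(ext)s'))
print(escape_template_for_batch(r'C:\%HOMEPATH%\Desktop\%(title)s.%(ext)s'))
```

The lookahead keeps the rule narrow on purpose; a template containing a literal `%` in some other position would need manual handling.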
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@ -44,8 +44,8 @@
 - **appletrailers:section**
 - **archive.org**: archive.org videos
 - **ARD**
- - **ARD:mediathek**
 - **ARD:mediathek**: Saarländischer Rundfunk
+ - **ARD:mediathek**
 - **arte.tv**
 - **arte.tv:+7**
 - **arte.tv:cinema**
@ -647,6 +647,7 @@
 - **Telegraaf**
 - **TeleMB**
 - **TeleTask**
+ - **Telewebion**
 - **TF1**
 - **TheIntercept**
 - **ThePlatform**
--- a/setup.py
+++ b/setup.py
@ -122,6 +122,7 @@ setup(
        "Programming Language :: Python :: 3.2",
        "Programming Language :: Python :: 3.3",
        "Programming Language :: Python :: 3.4",
+        "Programming Language :: Python :: 3.5",
    ],

    cmdclass={'build_lazy_extractors': build_lazy_extractors},
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@ -777,6 +777,7 @@ from .telecinco import TelecincoIE
 from .telegraaf import TelegraafIE
 from .telemb import TeleMBIE
 from .teletask import TeleTaskIE
+from .telewebion import TelewebionIE
 from .testurl import TestURLIE
 from .tf1 import TF1IE
 from .theintercept import TheInterceptIE
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@ -1073,20 +1073,6 @@ class GenericIE(InfoExtractor):
                'skip_download': True,
            }
        },
-        # Contains a SMIL manifest
-        {
-            'url': 'http://www.telewebion.com/fa/1263668/%D9%82%D8%B1%D8%B9%D9%87%E2%80%8C%DA%A9%D8%B4%DB%8C-%D9%84%DB%8C%DA%AF-%D9%82%D9%87%D8%B1%D9%85%D8%A7%D9%86%D8%A7%D9%86-%D8%A7%D8%B1%D9%88%D9%BE%D8%A7/%2B-%D9%81%D9%88%D8%AA%D8%A8%D8%A7%D9%84.html',
-            'info_dict': {
-                'id': 'file',
-                'ext': 'flv',
-                'title': '+ Football: Lottery Champions League Europe',
-                'uploader': 'www.telewebion.com',
-            },
-            'params': {
-                # rtmpe downloads
-                'skip_download': True,
-            }
-        },
        # Brightcove URL in single quotes
        {
            'url': 'http://www.sportsnet.ca/baseball/mlb/sn-presents-russell-martin-world-citizen/',
--- a/youtube_dl/extractor/indavideo.py
+++ b/youtube_dl/extractor/indavideo.py
@ -60,7 +60,8 @@ class IndavideoEmbedIE(InfoExtractor):

        formats = [{
            'url': video_url,
-            'height': self._search_regex(r'\.(\d{3,4})\.mp4$', video_url, 'height', default=None),
+            'height': int_or_none(self._search_regex(
+                r'\.(\d{3,4})\.mp4(?:\?|$)', video_url, 'height', default=None)),
        } for video_url in video_urls]
        self._sort_formats(formats)
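Review note on the indavideo hunk: the old pattern anchored on the end of the string, so a URL carrying a query string presumably yielded no height. The sketch below compares both patterns. The URLs are made up for illustration, and `int_or_none` here is a simplified stand-in for the youtube-dl utility of the same name.

```python
import re

OLD_HEIGHT_RE = r'\.(\d{3,4})\.mp4$'
NEW_HEIGHT_RE = r'\.(\d{3,4})\.mp4(?:\?|$)'


def int_or_none(value):
    # Simplified stand-in: convert to int, or return None on failure.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def height_from_url(pattern, video_url):
    # Search the URL and convert the captured digits to an integer.
    mobj = re.search(pattern, video_url)
    return int_or_none(mobj.group(1) if mobj else None)


# Hypothetical URLs, not taken from the site.
plain_url = 'http://example.com/video/clip.720.mp4'
signed_url = 'http://example.com/video/clip.720.mp4?token=abc'

for url in (plain_url, signed_url):
    print(height_from_url(OLD_HEIGHT_RE, url), height_from_url(NEW_HEIGHT_RE, url))
```

Wrapping the result in `int_or_none` also means `height` is an integer rather than a string, which matters when formats are sorted numerically.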

--- a/youtube_dl/extractor/instagram.py
+++ b/youtube_dl/extractor/instagram.py
@ -8,6 +8,7 @@ from ..utils import (
    int_or_none,
    limit_length,
    lowercase_escape,
+    try_get,
 )


@ -19,10 +20,16 @@ class InstagramIE(InfoExtractor):
        'info_dict': {
            'id': 'aye83DjauH',
            'ext': 'mp4',
-            'uploader_id': 'naomipq',
            'title': 'Video by naomipq',
            'description': 'md5:1f17f0ab29bd6fe2bfad705f58de3cb8',
-        }
+            'thumbnail': 're:^https?://.*\.jpg',
+            'timestamp': 1371748545,
+            'upload_date': '20130620',
+            'uploader_id': 'naomipq',
+            'uploader': 'Naomi Leonor Phan-Quang',
+            'like_count': int,
+            'comment_count': int,
+        },
    }, {
        # missing description
        'url': 'https://www.instagram.com/p/BA-pQFBG8HZ/?taken-by=britneyspears',
@ -31,6 +38,13 @@ class InstagramIE(InfoExtractor):
            'ext': 'mp4',
            'uploader_id': 'britneyspears',
            'title': 'Video by britneyspears',
+            'thumbnail': 're:^https?://.*\.jpg',
+            'timestamp': 1453760977,
+            'upload_date': '20160125',
+            'uploader_id': 'britneyspears',
+            'uploader': 'Britney Spears',
+            'like_count': int,
+            'comment_count': int,
        },
        'params': {
            'skip_download': True,
@ -67,21 +81,57 @@ class InstagramIE(InfoExtractor):
        url = mobj.group('url')

        webpage = self._download_webpage(url, video_id)
-        uploader_id = self._search_regex(r'"owner":{"username":"(.+?)"',
-                                         webpage, 'uploader id', fatal=False)
-        desc = self._search_regex(
-            r'"caption":"(.+?)"', webpage, 'description', default=None)
-        if desc is not None:
-            desc = lowercase_escape(desc)
+
+        (video_url, description, thumbnail, timestamp, uploader,
+         uploader_id, like_count, comment_count) = [None] * 8
+
+        shared_data = self._parse_json(
+            self._search_regex(
+                r'window\._sharedData\s*=\s*({.+?});',
+                webpage, 'shared data', default='{}'),
+            video_id, fatal=False)
+        if shared_data:
+            media = try_get(
+                shared_data, lambda x: x['entry_data']['PostPage'][0]['media'], dict)
+            if media:
+                video_url = media.get('video_url')
+                description = media.get('caption')
+                thumbnail = media.get('display_src')
+                timestamp = int_or_none(media.get('date'))
+                uploader = media.get('owner', {}).get('full_name')
+                uploader_id = media.get('owner', {}).get('username')
+                like_count = int_or_none(media.get('likes', {}).get('count'))
+                comment_count = int_or_none(media.get('comments', {}).get('count'))
+
+        if not video_url:
+            video_url = self._og_search_video_url(webpage, secure=False)
+
+        if not uploader_id:
+            uploader_id = self._search_regex(
+                r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"',
+                webpage, 'uploader id', fatal=False)
+
+        if not description:
+            description = self._search_regex(
+                r'"caption"\s*:\s*"(.+?)"', webpage, 'description', default=None)
+            if description is not None:
+                description = lowercase_escape(description)
+
+        if not thumbnail:
+            thumbnail = self._og_search_thumbnail(webpage)

        return {
            'id': video_id,
-            'url': self._og_search_video_url(webpage, secure=False),
+            'url': video_url,
            'ext': 'mp4',
            'title': 'Video by %s' % uploader_id,
-            'thumbnail': self._og_search_thumbnail(webpage),
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
            'uploader_id': uploader_id,
-            'description': desc,
+            'uploader': uploader,
+            'like_count': like_count,
+            'comment_count': comment_count,
        }
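Review note on the instagram hunk: the extractor now reads `window._sharedData` first and falls back to regexes only when a field is missing. The nested lookup relies on `try_get`. The function below is a standalone approximation of how that helper is used in the hunk; the real implementation lives in `youtube_dl/utils.py` and may differ in detail. The sample dictionary is invented.

```python
def try_get(src, getter, expected_type=None):
    """Apply getter to src, returning None instead of raising.

    Approximation for illustration: lookup errors are swallowed, and
    the value is returned only if it has the expected type (if given).
    """
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


# Invented sample shaped like the structure accessed in the hunk.
shared_data = {
    'entry_data': {
        'PostPage': [{'media': {'video_url': 'http://example.com/v.mp4'}}],
    },
}

media = try_get(
    shared_data, lambda x: x['entry_data']['PostPage'][0]['media'], dict)
print(media.get('video_url') if media else None)

# A page without PostPage data yields None rather than an exception.
print(try_get({}, lambda x: x['entry_data']['PostPage'][0]['media'], dict))
```

Because every step can fail independently, the extractor keeps the previous regex and Open Graph lookups as fallbacks instead of treating missing JSON as fatal.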


--- a/youtube_dl/extractor/kuwo.py
+++ b/youtube_dl/extractor/kuwo.py
@ -148,8 +148,8 @@ class KuwoAlbumIE(InfoExtractor):
        'url': 'http://www.kuwo.cn/album/502294/',
        'info_dict': {
            'id': '502294',
-            'title': 'M',
-            'description': 'md5:6a7235a84cc6400ec3b38a7bdaf1d60c',
+            'title': 'Made\xa0Series\xa0《M》',
+            'description': 'md5:d463f0d8a0ff3c3ea3d6ed7452a9483f',
        },
        'playlist_count': 2,
    }
@ -209,7 +209,7 @@ class KuwoSingerIE(InfoExtractor):
        'url': 'http://www.kuwo.cn/mingxing/bruno+mars/',
        'info_dict': {
            'id': 'bruno+mars',
-            'title': 'Bruno Mars',
+            'title': 'Bruno\xa0Mars',
        },
        'playlist_mincount': 329,
    }, {
@ -306,7 +306,7 @@ class KuwoMvIE(KuwoBaseIE):
            'id': '6480076',
            'ext': 'mp4',
            'title': 'My HouseMV',
-            'creator': '2PM',
+            'creator': 'PM02:00',
        },
        # In this video, music URLs (anti.s) are blocked outside China and
        # USA, while the MV URL (mvurl) is available globally, so force the MV
--- a/youtube_dl/extractor/leeco.py
+++ b/youtube_dl/extractor/leeco.py
@ -28,7 +28,7 @@ from ..utils import (

 class LeIE(InfoExtractor):
    IE_DESC = '乐视网'
-    _VALID_URL = r'https?://www\.le\.com/ptv/vplay/(?P<id>\d+)\.html'
+    _VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|sports\.le\.com/video)/(?P<id>\d+)\.html'

    _URL_TEMPLATE = 'http://www.le.com/ptv/vplay/%s.html'

@ -69,6 +69,9 @@ class LeIE(InfoExtractor):
            'hls_prefer_native': True,
        },
        'skip': 'Only available in China',
+    }, {
+        'url': 'http://sports.le.com/video/25737697.html',
+        'only_matching': True,
    }]

    @staticmethod
@ -196,7 +199,7 @@ class LeIE(InfoExtractor):


 class LePlaylistIE(InfoExtractor):
-    _VALID_URL = r'https?://[a-z]+\.le\.com/[a-z]+/(?P<id>[a-z0-9_]+)'
+    _VALID_URL = r'https?://[a-z]+\.le\.com/(?!video)[a-z]+/(?P<id>[a-z0-9_]+)'

    _TESTS = [{
        'url': 'http://www.le.com/tv/46177.html',
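Review note on the leeco hunk: the `(?!video)` lookahead was presumably added because the old playlist pattern also matched the new `sports.le.com/video/...` URLs now handled by `LeIE`. The sketch below checks both patterns against the two test URLs quoted in this diff.

```python
import re

OLD_PLAYLIST_RE = r'https?://[a-z]+\.le\.com/[a-z]+/(?P<id>[a-z0-9_]+)'
NEW_PLAYLIST_RE = r'https?://[a-z]+\.le\.com/(?!video)[a-z]+/(?P<id>[a-z0-9_]+)'

sports_url = 'http://sports.le.com/video/25737697.html'
playlist_url = 'http://www.le.com/tv/46177.html'

# The old pattern claims the sports video URL as a playlist.
print(bool(re.match(OLD_PLAYLIST_RE, sports_url)))
# The new pattern rejects it but still accepts the playlist URL.
print(bool(re.match(NEW_PLAYLIST_RE, sports_url)))
print(re.match(NEW_PLAYLIST_RE, playlist_url).group('id'))
```

Note that the lookahead excludes any first path segment that starts with `video`, not only the exact segment `video`.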
--- a/youtube_dl/extractor/limelight.py
+++ b/youtube_dl/extractor/limelight.py
@ -98,13 +98,19 @@ class LimelightBaseIE(InfoExtractor):
        } for thumbnail in properties.get('thumbnails', []) if thumbnail.get('url')]

        subtitles = {}
-        for caption in properties.get('captions', {}):
+        for caption in properties.get('captions', []):
            lang = caption.get('language_code')
            subtitles_url = caption.get('url')
            if lang and subtitles_url:
-                subtitles[lang] = [{
+                subtitles.setdefault(lang, []).append({
                    'url': subtitles_url,
-                }]
+                })
+        closed_captions_url = properties.get('closed_captions_url')
+        if closed_captions_url:
+            subtitles.setdefault('en', []).append({
+                'url': closed_captions_url,
+                'ext': 'ttml',
+            })

        return {
            'id': video_id,
@ -123,7 +129,18 @@ class LimelightBaseIE(InfoExtractor):

 class LimelightMediaIE(LimelightBaseIE):
    IE_NAME = 'limelight'
-    _VALID_URL = r'(?:limelight:media:|https?://link\.videoplatform\.limelight\.com/media/\??\bmediaId=)(?P<id>[a-z0-9]{32})'
+    _VALID_URL = r'''(?x)
+                        (?:
+                            limelight:media:|
+                            https?://
+                                (?:
+                                    link\.videoplatform\.limelight\.com/media/|
+                                    assets\.delvenetworks\.com/player/loader\.swf
+                                )
+                                \?.*?\bmediaId=
+                        )
+                        (?P<id>[a-z0-9]{32})
+                    '''
    _TESTS = [{
        'url': 'http://link.videoplatform.limelight.com/media/?mediaId=3ffd040b522b4485b6d84effc750cd86',
        'info_dict': {
@ -158,6 +175,9 @@ class LimelightMediaIE(LimelightBaseIE):
            # rtmp download
            'skip_download': True,
        },
+    }, {
+        'url': 'https://assets.delvenetworks.com/player/loader.swf?mediaId=8018a574f08d416e95ceaccae4ba0452',
+        'only_matching': True,
    }]
    _PLAYLIST_SERVICE_PATH = 'media'
    _API_PATH = 'media'
@ -176,15 +196,29 @@ class LimelightMediaIE(LimelightBaseIE):

 class LimelightChannelIE(LimelightBaseIE):
    IE_NAME = 'limelight:channel'
-    _VALID_URL = r'(?:limelight:channel:|https?://link\.videoplatform\.limelight\.com/media/\??\bchannelId=)(?P<id>[a-z0-9]{32})'
-    _TEST = {
+    _VALID_URL = r'''(?x)
+                        (?:
+                            limelight:channel:|
+                            https?://
+                                (?:
+                                    link\.videoplatform\.limelight\.com/media/|
+                                    assets\.delvenetworks\.com/player/loader\.swf
+                                )
+                                \?.*?\bchannelId=
+                        )
+                        (?P<id>[a-z0-9]{32})
+                    '''
+    _TESTS = [{
        'url': 'http://link.videoplatform.limelight.com/media/?channelId=ab6a524c379342f9b23642917020c082',
        'info_dict': {
            'id': 'ab6a524c379342f9b23642917020c082',
            'title': 'Javascript Sample Code',
        },
        'playlist_mincount': 3,
-    }
+    }, {
+        'url': 'http://assets.delvenetworks.com/player/loader.swf?channelId=ab6a524c379342f9b23642917020c082',
+        'only_matching': True,
+    }]
    _PLAYLIST_SERVICE_PATH = 'channel'
    _API_PATH = 'channels'

@ -207,15 +241,29 @@ class LimelightChannelIE(LimelightBaseIE):

 class LimelightChannelListIE(LimelightBaseIE):
    IE_NAME = 'limelight:channel_list'
-    _VALID_URL = r'(?:limelight:channel_list:|https?://link\.videoplatform\.limelight\.com/media/\?.*?\bchannelListId=)(?P<id>[a-z0-9]{32})'
-    _TEST = {
+    _VALID_URL = r'''(?x)
+                        (?:
+                            limelight:channel_list:|
+                            https?://
+                                (?:
+                                    link\.videoplatform\.limelight\.com/media/|
+                                    assets\.delvenetworks\.com/player/loader\.swf
+                                )
+                                \?.*?\bchannelListId=
+                        )
+                        (?P<id>[a-z0-9]{32})
+                    '''
+    _TESTS = [{
        'url': 'http://link.videoplatform.limelight.com/media/?channelListId=301b117890c4465c8179ede21fd92e2b',
        'info_dict': {
            'id': '301b117890c4465c8179ede21fd92e2b',
            'title': 'Website - Hero Player',
        },
        'playlist_mincount': 2,
-    }
+    }, {
+        'url': 'https://assets.delvenetworks.com/player/loader.swf?channelListId=301b117890c4465c8179ede21fd92e2b',
+        'only_matching': True,
+    }]
    _PLAYLIST_SERVICE_PATH = 'channel_list'

    def _real_extract(self, url):
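Review note on the limelight hunks: the three `_VALID_URL` patterns were rewritten in verbose mode to accept the `assets.delvenetworks.com` loader URLs as well. The sketch below copies the media pattern from the hunk and runs it against the two test URLs from this diff plus the internal `limelight:media:` form. The constant name is chosen here for illustration.

```python
import re

# Pattern copied from the LimelightMediaIE hunk; (?x) lets the
# whitespace and line breaks inside the pattern be ignored.
LIMELIGHT_MEDIA_RE = r'''(?x)
    (?:
        limelight:media:|
        https?://
            (?:
                link\.videoplatform\.limelight\.com/media/|
                assets\.delvenetworks\.com/player/loader\.swf
            )
            \?.*?\bmediaId=
    )
    (?P<id>[a-z0-9]{32})
'''

urls = [
    'http://link.videoplatform.limelight.com/media/?mediaId=3ffd040b522b4485b6d84effc750cd86',
    'https://assets.delvenetworks.com/player/loader.swf?mediaId=8018a574f08d416e95ceaccae4ba0452',
    'limelight:media:3ffd040b522b4485b6d84effc750cd86',
]

for url in urls:
    print(re.match(LIMELIGHT_MEDIA_RE, url).group('id'))
```

The `\?.*?\b` prefix also allows other query parameters to precede `mediaId`, which the previous `\??\bmediaId=` form did not.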
--- a/youtube_dl/extractor/matchtv.py
+++ b/youtube_dl/extractor/matchtv.py
@ -4,16 +4,12 @@ from __future__ import unicode_literals
 import random

 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_urlencode
-from ..utils import (
-    sanitized_Request,
-    xpath_text,
-)
+from ..utils import xpath_text


 class MatchTVIE(InfoExtractor):
-    _VALID_URL = r'https?://matchtv\.ru/?#live-player'
-    _TEST = {
+    _VALID_URL = r'https?://matchtv\.ru(?:/on-air|/?#live-player)'
+    _TESTS = [{
        'url': 'http://matchtv.ru/#live-player',
        'info_dict': {
            'id': 'matchtv-live',
@ -24,12 +20,16 @@ class MatchTVIE(InfoExtractor):
        'params': {
            'skip_download': True,
        },
-    }
+    }, {
+        'url': 'http://matchtv.ru/on-air/',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        video_id = 'matchtv-live'
-        request = sanitized_Request(
-            'http://player.matchtv.ntvplus.tv/player/smil?%s' % compat_urllib_parse_urlencode({
+        video_url = self._download_json(
+            'http://player.matchtv.ntvplus.tv/player/smil', video_id,
+            query={
                'ts': '',
                'quality': 'SD',
                'contentId': '561d2c0df7159b37178b4567',
@ -40,11 +40,10 @@ class MatchTVIE(InfoExtractor):
                'contentType': 'channel',
                'timeShift': '0',
                'platform': 'portal',
-            }),
+            },
            headers={
                'Referer': 'http://player.matchtv.ntvplus.tv/embed-player/NTVEmbedPlayer.swf',
-            })
-        video_url = self._download_json(request, video_id)['data']['videoUrl']
+            })['data']['videoUrl']
        f4m_url = xpath_text(self._download_xml(video_url, video_id), './to')
        formats = self._extract_f4m_formats(f4m_url, video_id)
        self._sort_formats(formats)
--- a/youtube_dl/extractor/nrk.py
+++ b/youtube_dl/extractor/nrk.py
@ -163,7 +163,7 @@ class NRKTVIE(NRKBaseIE):
            'ext': 'mp4',
            'title': '20 spørsmål 23.05.2014',
            'description': 'md5:bdea103bc35494c143c6a9acdd84887a',
-            'duration': 1741.52,
+            'duration': 1741,
        },
    }, {
        'url': 'https://tv.nrk.no/program/mdfp15000514',
@ -173,7 +173,7 @@ class NRKTVIE(NRKBaseIE):
            'ext': 'mp4',
            'title': 'Grunnlovsjubiléet - Stor ståhei for ingenting 24.05.2014',
            'description': 'md5:89290c5ccde1b3a24bb8050ab67fe1db',
-            'duration': 4605.08,
+            'duration': 4605,
        },
    }, {
        # single playlist video
@ -260,30 +260,34 @@ class NRKPlaylistIE(InfoExtractor):

 class NRKSkoleIE(InfoExtractor):
    IE_DESC = 'NRK Skole'
-    _VALID_URL = r'https?://(?:www\.)?nrk\.no/skole/klippdetalj?.*\btopic=(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:www\.)?nrk\.no/skole/?\?.*\bmediaId=(?P<id>\d+)'

    _TESTS = [{
-        'url': 'http://nrk.no/skole/klippdetalj?topic=nrk:klipp/616532',
-        'md5': '04cd85877cc1913bce73c5d28a47e00f',
+        'url': 'https://www.nrk.no/skole/?page=search&q=&mediaId=14099',
+        'md5': '6bc936b01f9dd8ed45bc58b252b2d9b6',
        'info_dict': {
            'id': '6021',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Genetikk og eneggede tvillinger',
            'description': 'md5:3aca25dcf38ec30f0363428d2b265f8d',
            'duration': 399,
        },
    }, {
-        'url': 'http://www.nrk.no/skole/klippdetalj?topic=nrk%3Aklipp%2F616532#embed',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.nrk.no/skole/klippdetalj?topic=urn:x-mediadb:21379',
+        'url': 'https://www.nrk.no/skole/?page=objectives&subject=naturfag&objective=K15114&mediaId=19355',
        'only_matching': True,
    }]

    def _real_extract(self, url):
-        video_id = compat_urllib_parse_unquote(self._match_id(url))
+        video_id = self._match_id(url)

-        webpage = self._download_webpage(url, video_id)
+        webpage = self._download_webpage(
+            'https://mimir.nrk.no/plugin/1.0/static?mediaId=%s' % video_id,
+            video_id)
+
+        nrk_id = self._parse_json(
+            self._search_regex(
+                r'<script[^>]+type=["\']application/json["\'][^>]*>({.+?})</script>',
+                webpage, 'application json'),
+            video_id)['activeMedia']['psId']

-        nrk_id = self._search_regex(r'data-nrk-id=["\'](\d+)', webpage, 'nrk id')
        return self.url_result('nrk:%s' % nrk_id)
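Review note on the NRK Skole hunk: the id is now read from a JSON blob embedded in a `<script type="application/json">` tag instead of a `data-nrk-id` attribute. A minimal standalone sketch of that step follows. The function name `extract_ps_id` and the sample markup are invented for illustration; only the regex and the `activeMedia`/`psId` keys come from the hunk.

```python
import json
import re

SCRIPT_JSON_RE = (
    r'<script[^>]+type=["\']application/json["\'][^>]*>({.+?})</script>')


def extract_ps_id(webpage):
    # Locate the embedded JSON document and read activeMedia.psId.
    mobj = re.search(SCRIPT_JSON_RE, webpage)
    if mobj is None:
        raise ValueError('application json not found')
    return json.loads(mobj.group(1))['activeMedia']['psId']


# Invented page fragment; the real response may carry more fields.
sample_page = (
    '<html><script type="application/json">'
    '{"activeMedia": {"psId": "6021"}}'
    '</script></html>')

print(extract_ps_id(sample_page))
```

The non-greedy `{.+?}` stops at the first `}` that is immediately followed by `</script>`, so nested objects inside the blob are still captured whole.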
--- a/youtube_dl/extractor/streamcloud.py
+++ b/youtube_dl/extractor/streamcloud.py
@ -5,6 +5,7 @@ import re

 from .common import InfoExtractor
 from ..utils import (
+    ExtractorError,
    sanitized_Request,
    urlencode_postdata,
 )
@ -14,7 +15,7 @@ class StreamcloudIE(InfoExtractor):
    IE_NAME = 'streamcloud.eu'
    _VALID_URL = r'https?://streamcloud\.eu/(?P<id>[a-zA-Z0-9_-]+)(?:/(?P<fname>[^#?]*)\.html)?'

-    _TEST = {
+    _TESTS = [{
        'url': 'http://streamcloud.eu/skp9j99s4bpz/youtube-dl_test_video_____________-BaW_jenozKc.mp4.html',
        'md5': '6bea4c7fa5daaacc2a946b7146286686',
        'info_dict': {
@ -23,7 +24,10 @@ class StreamcloudIE(InfoExtractor):
            'title': 'youtube-dl test video  \'/\\ ä ↭',
        },
        'skip': 'Only available from the EU'
-    }
+    }, {
+        'url': 'http://streamcloud.eu/ua8cmfh1nbe6/NSHIP-148--KUC-NG--H264-.mp4.html',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
@ -31,6 +35,10 @@ class StreamcloudIE(InfoExtractor):

        orig_webpage = self._download_webpage(url, video_id)

+        if '>File Not Found<' in orig_webpage:
+            raise ExtractorError(
+                'Video %s does not exist' % video_id, expected=True)
+
        fields = re.findall(r'''(?x)<input\s+
            type="(?:hidden|submit)"\s+
            name="([^"]+)"\s+
--- a/youtube_dl/extractor/telewebion.py
+++ b/youtube_dl/extractor/telewebion.py
@ -0,0 +1,55 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class TelewebionIE(InfoExtractor):
+    _VALID_URL = r'https?://www\.telewebion\.com/#!/episode/(?P<id>\d+)'
+
+    _TEST = {
+        'url': 'http://www.telewebion.com/#!/episode/1263668/',
+        'info_dict': {
+            'id': '1263668',
+            'ext': 'mp4',
+            'title': 'قرعه\u200cکشی لیگ قهرمانان اروپا',
+            'thumbnail': 're:^https?://.*\.jpg',
+            'view_count': int,
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        secure_token = self._download_webpage(
+            'http://m.s2.telewebion.com/op/op?action=getSecurityToken', video_id)
+        episode_details = self._download_json(
+            'http://m.s2.telewebion.com/op/op', video_id,
+            query={'action': 'getEpisodeDetails', 'episode_id': video_id})
+
+        m3u8_url = 'http://m.s1.telewebion.com/smil/%s.m3u8?filepath=%s&m3u8=1&secure_token=%s' % (
+            video_id, episode_details['file_path'], secure_token)
+        formats = self._extract_m3u8_formats(
+            m3u8_url, video_id, ext='mp4', m3u8_id='hls')
+
+        picture_paths = [
+            episode_details.get('picture_path'),
+            episode_details.get('large_picture_path'),
+        ]
+
+        thumbnails = [{
+            'url': picture_path,
+            'preference': idx,
+        } for idx, picture_path in enumerate(picture_paths) if picture_path is not None]
+
+        return {
+            'id': video_id,
+            'title': episode_details['title'],
+            'formats': formats,
+            'thumbnails': thumbnails,
+            'view_count': episode_details.get('view_count'),
+        }
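Review note on the new Telewebion extractor: the thumbnail list is built from two optional fields, and `preference` is the position in the candidate list rather than in the filtered result. The sketch below isolates that comprehension. The function name `build_thumbnails` and the sample URLs are hypothetical.

```python
def build_thumbnails(episode_details):
    # Candidates in ascending preference order, as in the extractor.
    picture_paths = [
        episode_details.get('picture_path'),
        episode_details.get('large_picture_path'),
    ]
    # enumerate() runs before the filter, so a missing small picture
    # still leaves the large one with preference 1.
    return [{
        'url': picture_path,
        'preference': idx,
    } for idx, picture_path in enumerate(picture_paths)
        if picture_path is not None]


print(build_thumbnails({'large_picture_path': 'http://example.com/large.jpg'}))
print(build_thumbnails({}))
```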
--- a/youtube_dl/extractor/viki.py
+++ b/youtube_dl/extractor/viki.py
@ -101,10 +101,13 @@ class VikiBaseIE(InfoExtractor):
            self.report_warning('Unable to get session token, login has probably failed')

    @staticmethod
-    def dict_selection(dict_obj, preferred_key):
+    def dict_selection(dict_obj, preferred_key, allow_fallback=True):
        if preferred_key in dict_obj:
            return dict_obj.get(preferred_key)

+        if not allow_fallback:
+            return
+
        filtered_dict = list(filter(None, [dict_obj.get(k) for k in dict_obj.keys()]))
        return filtered_dict[0] if filtered_dict else None

@ -127,7 +130,7 @@ class VikiIE(VikiBaseIE):
    }, {
        # clip
        'url': 'http://www.viki.com/videos/1067139v-the-avengers-age-of-ultron-press-conference',
-        'md5': '86c0b5dbd4d83a6611a79987cc7a1989',
+        'md5': 'feea2b1d7b3957f70886e6dfd8b8be84',
        'info_dict': {
            'id': '1067139v',
            'ext': 'mp4',
@ -156,17 +159,18 @@ class VikiIE(VikiBaseIE):
        'params': {
            # m3u8 download
            'skip_download': True,
-        }
+        },
+        'skip': 'Blocked in the US',
    }, {
        # episode
        'url': 'http://www.viki.com/videos/44699v-boys-over-flowers-episode-1',
-        'md5': '190f3ef426005ba3a080a63325955bc3',
+        'md5': '1f54697dabc8f13f31bf06bb2e4de6db',
        'info_dict': {
            'id': '44699v',
            'ext': 'mp4',
            'title': 'Boys Over Flowers - Episode 1',
-            'description': 'md5:52617e4f729c7d03bfd4bcbbb6e946f2',
-            'duration': 4155,
+            'description': 'md5:b89cf50038b480b88b5b3c93589a9076',
+            'duration': 4204,
            'timestamp': 1270496524,
            'upload_date': '20100405',
            'uploader': 'group8',
@ -196,7 +200,7 @@ class VikiIE(VikiBaseIE):
    }, {
        # non-English description
        'url': 'http://www.viki.com/videos/158036v-love-in-magic',
-        'md5': '1713ae35df5a521b31f6dc40730e7c9c',
+        'md5': '013dc282714e22acf9447cad14ff1208',
        'info_dict': {
            'id': '158036v',
            'ext': 'mp4',
@ -217,7 +221,7 @@ class VikiIE(VikiBaseIE):

        self._check_errors(video)

-        title = self.dict_selection(video.get('titles', {}), 'en')
+        title = self.dict_selection(video.get('titles', {}), 'en', allow_fallback=False)
        if not title:
            title = 'Episode %d' % video.get('number') if video.get('type') == 'episode' else video.get('id') or video_id
            container_titles = video.get('container', {}).get('titles', {})
@ -302,7 +306,7 @@ class VikiChannelIE(VikiBaseIE):
            'title': 'Boys Over Flowers',
            'description': 'md5:ecd3cff47967fe193cff37c0bec52790',
        },
-        'playlist_count': 70,
+        'playlist_mincount': 71,
    }, {
        'url': 'http://www.viki.com/tv/1354c-poor-nastya-complete',
        'info_dict': {
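Review note on the viki hunks: `dict_selection` gains an `allow_fallback` flag, and the title lookup passes `allow_fallback=False` so that a missing English title triggers the "Episode N" construction instead of silently using another language. The function below is a standalone copy of the logic for illustration; the sample titles are invented.

```python
def dict_selection(dict_obj, preferred_key, allow_fallback=True):
    # Prefer the requested key when it is present.
    if preferred_key in dict_obj:
        return dict_obj.get(preferred_key)

    # With fallback disabled, report "not found" to the caller.
    if not allow_fallback:
        return None

    # Otherwise return the first non-empty value, if any.
    filtered = [value for value in dict_obj.values() if value]
    return filtered[0] if filtered else None


titles = {'ko': 'Kkotboda Namja'}

print(dict_selection(titles, 'en'))
print(dict_selection(titles, 'en', allow_fallback=False))
```

Other callers that omit the new argument keep the old behaviour, since the default remains `True`.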
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@ -66,6 +66,69 @@ class VimeoBaseInfoExtractor(InfoExtractor):
    def _set_vimeo_cookie(self, name, value):
        self._set_cookie('vimeo.com', name, value)

+    def _vimeo_sort_formats(self, formats):
+        # Bitrates are completely broken. Single m3u8 may contain entries in kbps and bps
+        # at the same time without actual units specified. This lead to wrong sorting.
+        self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'format_id'))
+
+    def _parse_config(self, config, video_id):
+        # Extract title
+        video_title = config['video']['title']
+
+        # Extract uploader, uploader_url and uploader_id
+        video_uploader = config['video'].get('owner', {}).get('name')
+        video_uploader_url = config['video'].get('owner', {}).get('url')
+        video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
+
+        # Extract video thumbnail
+        video_thumbnail = config['video'].get('thumbnail')
+        if video_thumbnail is None:
+            video_thumbs = config['video'].get('thumbs')
+            if video_thumbs and isinstance(video_thumbs, dict):
+                _, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
+
+        # Extract video duration
+        video_duration = int_or_none(config['video'].get('duration'))
+
+        formats = []
+        config_files = config['video'].get('files') or config['request'].get('files', {})
+        for f in config_files.get('progressive', []):
+            video_url = f.get('url')
+            if not video_url:
+                continue
+            formats.append({
+                'url': video_url,
+                'format_id': 'http-%s' % f.get('quality'),
+                'width': int_or_none(f.get('width')),
+                'height': int_or_none(f.get('height')),
+                'fps': int_or_none(f.get('fps')),
+                'tbr': int_or_none(f.get('bitrate')),
+            })
+        m3u8_url = config_files.get('hls', {}).get('url')
+        if m3u8_url:
+            formats.extend(self._extract_m3u8_formats(
+                m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+
+        subtitles = {}
+        text_tracks = config['request'].get('text_tracks')
+        if text_tracks:
+            for tt in text_tracks:
+                subtitles[tt['lang']] = [{
+                    'ext': 'vtt',
+                    'url': 'https://vimeo.com' + tt['url'],
+                }]
+
+        return {
+            'title': video_title,
+            'uploader': video_uploader,
+            'uploader_id': video_uploader_id,
+            'uploader_url': video_uploader_url,
+            'thumbnail': video_thumbnail,
+            'duration': video_duration,
+            'formats': formats,
+            'subtitles': subtitles,
+        }
+

 class VimeoIE(VimeoBaseInfoExtractor):
    """Information extractor for vimeo.com."""
@ -153,7 +216,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
                'uploader_id': 'user18948128',
                'uploader': 'Jaime Marquínez Ferrándiz',
                'duration': 10,
-                'description': 'This is "youtube-dl password protected test video" by Jaime Marquínez Ferrándiz on Vimeo, the home for high quality videos and the people\u2026',
+                'description': 'This is "youtube-dl password protected test video" by  on Vimeo, the home for high quality videos and the people who love them.',
            },
            'params': {
                'videopassword': 'youtube-dl',
@ -389,21 +452,6 @@ class VimeoIE(VimeoBaseInfoExtractor):
                    'https://player.vimeo.com/player/%s' % feature_id,
                    {'force_feature_id': True}), 'Vimeo')

-        # Extract title
-        video_title = config['video']['title']
-
-        # Extract uploader, uploader_url and uploader_id
-        video_uploader = config['video'].get('owner', {}).get('name')
-        video_uploader_url = config['video'].get('owner', {}).get('url')
-        video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
-
-        # Extract video thumbnail
-        video_thumbnail = config['video'].get('thumbnail')
-        if video_thumbnail is None:
-            video_thumbs = config['video'].get('thumbs')
-            if video_thumbs and isinstance(video_thumbs, dict):
-                _, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
-
        # Extract video description

        video_description = self._html_search_regex(
@ -423,9 +471,6 @@ class VimeoIE(VimeoBaseInfoExtractor):
        if not video_description and not mobj.group('player'):
            self._downloader.report_warning('Cannot find video description')

-        # Extract video duration
-        video_duration = int_or_none(config['video'].get('duration'))
-
        # Extract upload date
        video_upload_date = None
        mobj = re.search(r'<time[^>]+datetime="([^"]+)"', webpage)
@ -463,53 +508,22 @@ class VimeoIE(VimeoBaseInfoExtractor):
                            'format_id': source_name,
                            'preference': 1,
                        })
-        config_files = config['video'].get('files') or config['request'].get('files', {})
-        for f in config_files.get('progressive', []):
-            video_url = f.get('url')
-            if not video_url:
-                continue
-            formats.append({
-                'url': video_url,
-                'format_id': 'http-%s' % f.get('quality'),
-                'width': int_or_none(f.get('width')),
-                'height': int_or_none(f.get('height')),
-                'fps': int_or_none(f.get('fps')),
-                'tbr': int_or_none(f.get('bitrate')),
-            })
-        m3u8_url = config_files.get('hls', {}).get('url')
-        if m3u8_url:
-            formats.extend(self._extract_m3u8_formats(
-                m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
-        # Bitrates are completely broken. Single m3u8 may contain entries in kbps and bps
-        # at the same time without actual units specified. This lead to wrong sorting.
-        self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'format_id'))

-        subtitles = {}
-        text_tracks = config['request'].get('text_tracks')
-        if text_tracks:
-            for tt in text_tracks:
-                subtitles[tt['lang']] = [{
-                    'ext': 'vtt',
-                    'url': 'https://vimeo.com' + tt['url'],
-                }]
-
-        return {
+        info_dict = self._parse_config(config, video_id)
+        formats.extend(info_dict['formats'])
+        self._vimeo_sort_formats(formats)
+        info_dict.update({
            'id': video_id,
-            'uploader': video_uploader,
-            'uploader_url': video_uploader_url,
-            'uploader_id': video_uploader_id,
-            'upload_date': video_upload_date,
-            'title': video_title,
-            'thumbnail': video_thumbnail,
-            'description': video_description,
-            'duration': video_duration,
            'formats': formats,
+            'upload_date': video_upload_date,
+            'description': video_description,
            'webpage_url': url,
            'view_count': view_count,
            'like_count': like_count,
            'comment_count': comment_count,
-            'subtitles': subtitles,
-        }
+        })
+
+        return info_dict


 class VimeoOndemandIE(VimeoBaseInfoExtractor):
@ -692,7 +706,7 @@ class VimeoGroupsIE(VimeoAlbumIE):
        return self._extract_videos(name, 'https://vimeo.com/groups/%s' % name)


-class VimeoReviewIE(InfoExtractor):
+class VimeoReviewIE(VimeoBaseInfoExtractor):
    IE_NAME = 'vimeo:review'
    IE_DESC = 'Review pages on vimeo'
    _VALID_URL = r'https://vimeo\.com/[^/]+/review/(?P<id>[^/]+)'
@ -704,6 +718,7 @@ class VimeoReviewIE(InfoExtractor):
            'ext': 'mp4',
            'title': "DICK HARDWICK 'Comedian'",
            'uploader': 'Richard Hardwick',
+            'uploader_id': 'user21297594',
        }
    }, {
        'note': 'video player needs Referer',
@ -716,14 +731,18 @@ class VimeoReviewIE(InfoExtractor):
            'uploader': 'DevWeek Events',
            'duration': 2773,
            'thumbnail': 're:^https?://.*\.jpg$',
+            'uploader_id': 'user22258446',
        }
    }]

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        player_url = 'https://player.vimeo.com/player/' + video_id
-        return self.url_result(player_url, 'Vimeo', video_id)
+        video_id = self._match_id(url)
+        config = self._download_json(
+            'https://player.vimeo.com/video/%s/config' % video_id, video_id)
+        info_dict = self._parse_config(config, video_id)
+        self._vimeo_sort_formats(info_dict['formats'])
+        info_dict['id'] = video_id
+        return info_dict


 class VimeoWatchLaterIE(VimeoChannelIE):
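
The Vimeo hunks above move the config parsing into a shared `_parse_config` helper on the base class, so `VimeoIE` and `VimeoReviewIE` can both use it. Below is a minimal standalone sketch of the progressive-format loop from that helper. The names `parse_progressive` and `sample_config` are hypothetical, and `int_or_none` here is a simplified stand-in for the helper in `youtube_dl/utils.py`, not the real implementation.

```python
def int_or_none(v):
    # Simplified stand-in (assumption): convert to int, or return None on failure.
    try:
        return int(v)
    except (TypeError, ValueError):
        return None


def parse_progressive(config):
    """Collect HTTP formats from a player config dict, skipping entries without a URL."""
    # Same lookup order as the diff: prefer video.files, fall back to request.files.
    config_files = config['video'].get('files') or config['request'].get('files', {})
    formats = []
    for f in config_files.get('progressive', []):
        video_url = f.get('url')
        if not video_url:
            continue
        formats.append({
            'url': video_url,
            'format_id': 'http-%s' % f.get('quality'),
            'width': int_or_none(f.get('width')),
            'height': int_or_none(f.get('height')),
            'fps': int_or_none(f.get('fps')),
            'tbr': int_or_none(f.get('bitrate')),
        })
    return formats


# Hypothetical payload shaped like the keys the diff reads.
sample_config = {
    'video': {'title': 'demo'},
    'request': {'files': {'progressive': [
        {'url': 'https://example.com/a.mp4', 'quality': '720p',
         'width': '1280', 'height': 720, 'fps': 30, 'bitrate': None},
        {'quality': '360p'},  # no URL, so this entry is skipped
    ]}},
}
formats = parse_progressive(sample_config)
```

Keeping the loop in one place means a subclass only has to fetch the config JSON and add its own fields, which is what the new `VimeoReviewIE._real_extract` does.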
--- a/youtube_dl/extractor/youporn.py
+++ b/youtube_dl/extractor/youporn.py
@ -17,7 +17,7 @@ class YouPornIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?youporn\.com/watch/(?P<id>\d+)/(?P<display_id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://www.youporn.com/watch/505835/sex-ed-is-it-safe-to-masturbate-daily/',
-        'md5': '71ec5fcfddacf80f495efa8b6a8d9a89',
+        'md5': '3744d24c50438cf5b6f6d59feb5055c2',
        'info_dict': {
            'id': '505835',
            'display_id': 'sex-ed-is-it-safe-to-masturbate-daily',
@ -121,21 +121,21 @@ class YouPornIE(InfoExtractor):
            webpage, 'thumbnail', fatal=False, group='thumbnail')

        uploader = self._html_search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfoBy(?:\s+[^"\']+)?["\'][^>]*>\s*By:\s*</div>(.+?)</(?:a|div)>',
+            r'(?s)<div[^>]+class=["\']submitByLink["\'][^>]*>(.+?)</div>',
            webpage, 'uploader', fatal=False)
        upload_date = unified_strdate(self._html_search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfoTime["\'][^>]*>(.+?)</div>',
+            r'(?s)<div[^>]+class=["\']videoInfo(?:Date|Time)["\'][^>]*>(.+?)</div>',
            webpage, 'upload date', fatal=False))

        age_limit = self._rta_search(webpage)

        average_rating = int_or_none(self._search_regex(
-            r'<div[^>]+class=["\']videoInfoRating["\'][^>]*>\s*<div[^>]+class=["\']videoRatingPercentage["\'][^>]*>(\d+)%</div>',
+            r'<div[^>]+class=["\']videoRatingPercentage["\'][^>]*>(\d+)%</div>',
            webpage, 'average rating', fatal=False))

        view_count = str_to_int(self._search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfoViews["\'][^>]*>.*?([\d,.]+)\s*</div>',
-            webpage, 'view count', fatal=False))
+            r'(?s)<div[^>]+class=(["\']).*?\bvideoInfoViews\b.*?\1[^>]*>.*?(?P<count>[\d,.]+)<',
+            webpage, 'view count', fatal=False, group='count'))
        comment_count = str_to_int(self._search_regex(
            r'>All [Cc]omments? \(([\d,.]+)\)',
            webpage, 'comment count', fatal=False))
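
The new view-count pattern above matches `videoInfoViews` anywhere inside the `class` attribute, using a backreference to the opening quote, and returns the number through the named group `count`. A minimal sketch of how it behaves follows. The HTML snippet is a made-up example, not taken from the site, and `str_to_int` is a simplified stand-in for the helper in `youtube_dl/utils.py`.

```python
import re

# Pattern copied from the added line in the diff.
VIEW_COUNT_RE = r'(?s)<div[^>]+class=(["\']).*?\bvideoInfoViews\b.*?\1[^>]*>.*?(?P<count>[\d,.]+)<'


def str_to_int(s):
    # Simplified stand-in (assumption): drop thousands separators, then convert.
    return int(re.sub(r'[,.]', '', s))


# Hypothetical markup with an extra class name and nested markup before the number.
sample_html = '<div class="videoInfoViews sprite"><b>Views:</b> 1,234,567</div>'

m = re.search(VIEW_COUNT_RE, sample_html)
view_count = str_to_int(m.group('count')) if m else None
```

Because the pattern now has two groups, the extractor call passes `group='count'` to select the number rather than the quote character.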
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@ -76,7 +76,7 @@ def register_socks_protocols():
 compiled_regex_type = type(re.compile(''))

 std_headers = {
-    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/44.0 (Chrome)',
+    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome)',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
@ -1901,6 +1901,16 @@ def dict_get(d, key_or_keys, default=None, skip_false_values=True):
    return d.get(key_or_keys, default)


+def try_get(src, getter, expected_type=None):
+    try:
+        v = getter(src)
+    except (AttributeError, KeyError, TypeError, IndexError):
+        pass
+    else:
+        if expected_type is None or isinstance(v, expected_type):
+            return v
+
+
 def encode_compat_str(string, encoding=preferredencoding(), errors='strict'):
    return string if isinstance(string, compat_str) else compat_str(string, encoding, errors)
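
A short usage sketch for the `try_get` helper added above. The function body is copied from the diff; the `media` dictionary and its keys are hypothetical and only illustrate the nested-JSON case the commit message mentions.

```python
def try_get(src, getter, expected_type=None):
    # Body as added in the diff: swallow lookup errors and type mismatches.
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        pass
    else:
        if expected_type is None or isinstance(v, expected_type):
            return v


# Hypothetical JSON payload.
media = {'owner': {'username': 'alice'}, 'likes': {'count': '12'}}

uploader = try_get(media, lambda x: x['owner']['username'], str)  # value present, right type
missing = try_get(media, lambda x: x['caption']['text'])          # KeyError is swallowed
wrong_type = try_get(media, lambda x: x['likes']['count'], int)   # value is a str, not an int
```

In each failing case the function falls through without an explicit `return`, so the caller gets `None` instead of an exception.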

--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.06.11.1'
+__version__ = '2016.06.12'
Author	SHA1	Message	Date
Sergey M․	77a9a9c295	release 2016.06.12	2016-06-12 12:06:48 +07:00
Sergey M․	84dcd1c4e4	[streamcloud] Detect removed videos (Closes #3768)	2016-06-12 11:08:39 +07:00
Sergey M․	971e3b7520	[nrk:skole] Fix extraction	2016-06-12 07:20:37 +07:00
Sergey M․	4e79011729	[nrktv] Fix tests	2016-06-12 06:57:04 +07:00
Sergey M․	a936ac321c	[README.md] Document using output template in batch files (Closes #9717)	2016-06-12 06:39:31 +07:00
Sergey M․	98960c911c	[instagram] Extract metadata from JSON	2016-06-12 06:06:04 +07:00
Sergey M․	329ca3bef6	[utils] Add try_get to reduce boilerplate when accessing JSON	2016-06-12 06:05:34 +07:00
Sergey M․	2c3322e36e	[youporn] Fix metadata extraction	2016-06-12 04:49:37 +07:00
Sergey M․	80ae228b34	[matchtv] Modernize	2016-06-12 01:57:23 +07:00
Yen Chi Hsuan	6d28c408cf	[viki] Do not use a fallback language for title in the first try. In test_Viki_3, 'titles' gives a Hebrew title.	2016-06-11 23:00:44 +08:00
Yen Chi Hsuan	c83b35d4aa	[viki] Update _TESTS	2016-06-11 22:39:13 +08:00
Yen Chi Hsuan	94e5d6aedb	[viki] Skip a geo-restricted test	2016-06-11 21:49:01 +08:00
Yen Chi Hsuan	531a74968c	[vimeo] Fix extraction for VimeoReview videos	2016-06-11 21:35:08 +08:00
Yen Chi Hsuan	c5edd147d1	[generic] Remove an invalid test (now handled by telewebion.py)	2016-06-11 18:39:58 +08:00
Yen Chi Hsuan	856150d056	[telewebion] Add new extractor (closes #5135)	2016-06-11 18:39:58 +08:00
Yen Chi Hsuan	03ebea89b0	Merge pull request #9755 from vxbinaca/patch-2: [utils] Change Firefox 44 to 47	2016-06-11 17:38:45 +08:00
Paul Henning	15d106787e	[utils] Change Firefox 44 to 47. See commit title.	2016-06-11 05:36:31 -04:00
Yen Chi Hsuan	7aab3696dd	[kuwo] Update _TESTS	2016-06-11 15:37:04 +08:00
Yen Chi Hsuan	47787efa2b	[leeco] Recognize Le Sports URLs (fixes #9750)	2016-06-11 13:14:41 +08:00
Sergey M․	4a420119a6	release 2016.06.11.3	2016-06-11 08:34:30 +07:00
Sergey M․	33751818d3	release 2016.06.11.2	2016-06-11 08:28:51 +07:00
Sergey M․	698f127c1a	[setup.py] Add python 3.5 classifier	2016-06-11 06:14:22 +07:00
Sergey M․	fe458b6596	[limelight] Extract ttml subtitles (Closes #9739)	2016-06-11 05:57:27 +07:00
Sergey M․	21ac1a8ac3	[limelight] Fix typo	2016-06-11 05:52:50 +07:00
Sergey M․	79027c0ea0	[limelight] Improve _VALID_URLs	2016-06-11 05:40:02 +07:00
Sergey M․	4cad2929cd	[limelight] Fix _VALID_URLs	2016-06-11 05:30:44 +07:00
Sergey M․	62666af99f	[indavideo] Fix formats' height (Closes #9744)	2016-06-11 05:13:05 +07:00
Sergey M․	9ddc289f88	[README.md] Document missing playlist fields in output template	2016-06-11 04:59:47 +07:00