release 2017.02.27

[ChangeLog] Actualize
[npo] Relax _VALID_URL for zapp.nl
2017-02-27 23:26:07 +07:00 · 2017-02-27 23:24:03 +07:00 · 2017-02-27 23:13:51 +07:00 · 2017-02-27 23:10:29 +07:00 · 2017-02-27 23:10:00 +07:00 · 2017-02-27 22:43:19 +07:00
27 changed files with 469 additions and 125 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.24**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.27*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.27**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2017.02.24
+[debug] youtube-dl version 2017.02.27
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,41 @@
+version 2017.02.27
+
+Core
+* [downloader/common] Limit displaying 2 digits after decimal point in sleep
+  interval message (#12183)
++ [extractor/common] Add preference to _parse_html5_media_entries
+
+Extractors
++ [npo] Add support for zapp.nl
++ [npo] Add support for hetklokhuis.nl (#12293)
+- [scivee] Remove extractor (#9315)
++ [cda] Decode download URL (#12255)
++ [crunchyroll] Improve uploader extraction (#12267)
++ [youtube] Raise GeoRestrictedError
++ [dailymotion] Raise GeoRestrictedError
++ [mdr] Recognize more URL patterns (#12169)
++ [tvigle] Raise GeoRestrictedError
+* [vevo] Fix extraction for videos with the new streams/streamsV3 format
+  (#11719)
++ [freshlive] Add support for freshlive.tv (#12175)
++ [xhamster] Capture and output videoClosed error (#12263)
++ [etonline] Add support for etonline.com (#12236)
++ [njpwworld] Add support for njpwworld.com (#11561)
+* [amcnetworks] Relax URL regular expression (#12127)
+
+
+version 2017.02.24.1
+
+Extractors
+* [noco] Modernize
+* [noco] Switch login URL to https (#12246)
++ [thescene] Extract more metadata
+* [thescene] Fix extraction (#12235)
++ [tubitv] Use geo bypass mechanism
+* [openload] Fix extraction (#10408)
++ [ivi] Raise GeoRestrictedError
+
+
 version 2017.02.24

 Core
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -239,6 +239,7 @@
 - **ESPN**
 - **ESPNArticle**
 - **EsriVideo**
+ - **ETOnline**
 - **Europa**
 - **EveryonesMixtape**
 - **ExpoTV**
@@ -274,6 +275,7 @@
 - **francetvinfo.fr**
 - **Freesound**
 - **freespeech.org**
+ - **FreshLive**
 - **Funimation**
 - **FunnyOrDie**
 - **Fusion**
@@ -310,6 +312,7 @@
 - **HellPorno**
 - **Helsinki**: helsinki.fi
 - **HentaiStigma**
+ - **hetklokhuis**
 - **hgtv.com:show**
 - **HistoricFilms**
 - **history:topic**: History.com Topic
@@ -511,6 +514,7 @@
 - **Nintendo**
 - **njoy**: N-JOY
 - **njoy:embed**
+ - **NJPWWorld**: 新日本プロレスワールド
 - **NobelPrize**
 - **Noco**
 - **Normalboots**
@@ -666,7 +670,6 @@
 - **savefrom.net**
 - **SBS**: sbs.com.au
 - **schooltv**
- - **SciVee**
 - **screen.yahoo:search**: Yahoo screen search
 - **Screencast**
 - **ScreencastOMatic**
--- a/youtube_dl/compat.py
+++ b/youtube_dl/compat.py
@@ -2760,8 +2760,10 @@ else:
    compat_kwargs = lambda kwargs: kwargs


-compat_numeric_types = ((int, float, long, complex) if sys.version_info[0] < 3
-                        else (int, float, complex))
+try:
+    compat_numeric_types = (int, float, long, complex)
+except NameError:  # Python 3
+    compat_numeric_types = (int, float, complex)


 if sys.version_info < (2, 7):
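
Note on the hunk above (illustrative sketch, not part of the patch): the new code probes for the Python 2-only name `long` and falls back when the lookup raises `NameError`, instead of branching on `sys.version_info`. The names `numeric_types` and `is_number` below are hypothetical stand-ins chosen for this example only.

```python
# Probe for the Python 2-only builtin `long`; on Python 3 the name does not
# exist, so evaluating the tuple raises NameError and the fallback is used.
try:
    numeric_types = (int, float, long, complex)  # noqa: F821 (Python 2 only)
except NameError:  # Python 3
    numeric_types = (int, float, complex)


def is_number(value):
    """Hypothetical helper: True if value is one of the numeric types."""
    return isinstance(value, numeric_types)


print(is_number(3), is_number(2.5), is_number('3'))
```

Both forms should yield the same tuple on a given interpreter; the try/except variant simply avoids an explicit version check.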
--- a/youtube_dl/downloader/common.py
+++ b/youtube_dl/downloader/common.py
@@ -347,7 +347,10 @@ class FileDownloader(object):
        if min_sleep_interval:
            max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
            sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
-            self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
+            self.to_screen(
+                '[download] Sleeping %s seconds...' % (
+                    int(sleep_interval) if sleep_interval.is_integer()
+                    else '%.2f' % sleep_interval))
            time.sleep(sleep_interval)

        return self.real_download(filename, info_dict)
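
Note on the hunk above (illustrative sketch, not part of the patch): the message now prints whole-number intervals without a fractional part and everything else with two decimals. The function name `format_sleep_interval` is a hypothetical wrapper around the same expression; it assumes a `float` argument, which is what `random.uniform` returns.

```python
def format_sleep_interval(sleep_interval):
    """Hypothetical wrapper mirroring the expression used in the patch."""
    return '[download] Sleeping %s seconds...' % (
        # Whole numbers are shown as integers, anything else with 2 decimals.
        int(sleep_interval) if sleep_interval.is_integer()
        else '%.2f' % sleep_interval)


print(format_sleep_interval(5.0))      # [download] Sleeping 5 seconds...
print(format_sleep_interval(3.14159))  # [download] Sleeping 3.14 seconds...
```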
--- a/youtube_dl/extractor/amcnetworks.py
+++ b/youtube_dl/extractor/amcnetworks.py
@@ -10,7 +10,7 @@ from ..utils import (


 class AMCNetworksIE(ThePlatformIE):
-    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
        'md5': '',
@@ -44,6 +44,12 @@ class AMCNetworksIE(ThePlatformIE):
    }, {
        'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
        'only_matching': True,
+    }, {
+        'url': 'http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
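
Note on the hunks above (illustrative sketch, not part of the patch): the relaxed pattern accepts any number of path segments under `shows/` and takes the last one as the id. The sketch below checks both patterns with plain `re` against URLs from the test list; `match_id` is a hypothetical helper, not the extractor's own `_match_id`.

```python
import re

OLD_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
NEW_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'


def match_id(pattern, url):
    """Hypothetical helper: return the `id` group, or None if no match."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


wetv_url = ('http://www.wetv.com/shows/mama-june-from-not-to-hot/'
            'full-episode/season-01/thin-tervention')
print(match_id(OLD_VALID_URL, wetv_url))  # None
print(match_id(NEW_VALID_URL, wetv_url))  # thin-tervention
```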
--- a/youtube_dl/extractor/cda.py
+++ b/youtube_dl/extractor/cda.py
@@ -1,6 +1,7 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import codecs
 import re

 from .common import InfoExtractor
@@ -96,6 +97,10 @@ class CDAIE(InfoExtractor):
            if not video or 'file' not in video:
                self.report_warning('Unable to extract %s version information' % version)
                return
+            if video['file'].startswith('uggc'):
+                video['file'] = codecs.decode(video['file'], 'rot_13')
+                if video['file'].endswith('adc.mp4'):
+                    video['file'] = video['file'].replace('adc.mp4', '.mp4')
            f = {
                'url': video['file'],
            }
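
Note on the hunk above (illustrative sketch, not part of the patch): `uggc` is `http` under ROT13, so the prefix check detects an obfuscated URL before decoding it and stripping the `adc` marker in front of `.mp4`. The function name `decode_file_url` and the example.com URL are hypothetical.

```python
import codecs


def decode_file_url(file_url):
    """Hypothetical standalone version of the decoding step in the patch."""
    if file_url.startswith('uggc'):  # 'http' after ROT13
        file_url = codecs.decode(file_url, 'rot_13')
        if file_url.endswith('adc.mp4'):
            file_url = file_url.replace('adc.mp4', '.mp4')
    return file_url


# Build an obfuscated sample by encoding a made-up URL.
sample = codecs.encode('http://example.com/videoadc.mp4', 'rot_13')
print(decode_file_url(sample))  # http://example.com/video.mp4
```

URLs that do not start with `uggc` pass through unchanged.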
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -2010,7 +2010,7 @@ class InfoExtractor(object):
                })
        return formats

-    def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None):
+    def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
        def absolute_url(video_url):
            return compat_urlparse.urljoin(base_url, video_url)

@@ -2032,7 +2032,8 @@ class InfoExtractor(object):
                is_plain_url = False
                formats = self._extract_m3u8_formats(
                    full_url, video_id, ext='mp4',
-                    entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id)
+                    entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id,
+                    preference=preference)
            elif ext == 'mpd':
                is_plain_url = False
                formats = self._extract_mpd_formats(
--- a/youtube_dl/extractor/crunchyroll.py
+++ b/youtube_dl/extractor/crunchyroll.py
@@ -207,6 +207,21 @@ class CrunchyrollIE(CrunchyrollBaseIE):
            # Just test metadata extraction
            'skip_download': True,
        },
+    }, {
+        # make sure we can extract an uploader name that's not a link
+        'url': 'http://www.crunchyroll.com/hakuoki-reimeiroku/episode-1-dawn-of-the-divine-warriors-606899',
+        'info_dict': {
+            'id': '606899',
+            'ext': 'mp4',
+            'title': 'Hakuoki Reimeiroku Episode 1 – Dawn of the Divine Warriors',
+            'description': 'Ryunosuke was left to die, but Serizawa-san asked him a simple question "Do you want to live?"',
+            'uploader': 'Geneon Entertainment',
+            'upload_date': '20120717',
+        },
+        'params': {
+            # just test metadata extraction
+            'skip_download': True,
+        },
    }]

    _FORMAT_IDS = {
@@ -388,8 +403,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
        if video_upload_date:
            video_upload_date = unified_strdate(video_upload_date)
        video_uploader = self._html_search_regex(
-            r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', webpage,
-            'video_uploader', fatal=False)
+            # try looking for both an uploader that's a link and one that's not
+            [r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
+            webpage, 'video_uploader', fatal=False)

        available_fmts = []
        for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):
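
Note on the hunk above (illustrative sketch, not part of the patch): the uploader lookup now tries two patterns in order, first the linked publisher, then the plain-text form. The sketch reproduces that fallback with plain `re`; `find_uploader` and the HTML fragments are hypothetical and only approximate what the real page might contain.

```python
import re

UPLOADER_PATTERNS = [
    # Uploader rendered as a link to the publisher page.
    r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>',
    # Uploader rendered as plain text.
    r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>',
]


def find_uploader(webpage):
    """Hypothetical helper: first pattern that matches wins."""
    for pattern in UPLOADER_PATTERNS:
        mobj = re.search(pattern, webpage)
        if mobj:
            return mobj.group(1).strip()
    return None


linked = '<a href="/publisher/example" class="text-link">Example Studio</a>'
plain = '<div>Publisher: <span> Geneon Entertainment </span></div>'
print(find_uploader(linked))  # Example Studio
print(find_uploader(plain))   # Geneon Entertainment
```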
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dl/extractor/dailymotion.py
@@ -282,9 +282,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
        }

    def _check_error(self, info):
+        error = info.get('error')
        if info.get('error') is not None:
+            title = error['title']
+            # See https://developer.dailymotion.com/api#access-error
+            if error.get('code') == 'DM007':
+                self.raise_geo_restricted(msg=title)
            raise ExtractorError(
-                '%s said: %s' % (self.IE_NAME, info['error']['title']), expected=True)
+                '%s said: %s' % (self.IE_NAME, title), expected=True)

    def _get_subtitles(self, video_id, webpage):
        try:
--- a/youtube_dl/extractor/etonline.py
+++ b/youtube_dl/extractor/etonline.py
@@ -0,0 +1,39 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+
+
+class ETOnlineIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?etonline\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
+    _TESTS = [{
+        'url': 'http://www.etonline.com/tv/211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale/',
+        'info_dict': {
+            'id': '211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale',
+            'title': 'md5:a21ec7d3872ed98335cbd2a046f34ee6',
+            'description': 'md5:8b94484063f463cca709617c79618ccd',
+        },
+        'playlist_count': 2,
+    }, {
+        'url': 'http://www.etonline.com/media/video/here_are_the_stars_who_love_bringing_their_moms_as_dates_to_the_oscars-211359/',
+        'only_matching': True,
+    }]
+    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        entries = [
+            self.url_result(
+                self.BRIGHTCOVE_URL_TEMPLATE % video_id, 'BrightcoveNew', video_id)
+            for video_id in re.findall(
+                r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)]
+
+        return self.playlist_result(
+            entries, playlist_id,
+            self._og_search_title(webpage, fatal=False),
+            self._og_search_description(webpage))
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -288,6 +288,7 @@ from .espn import (
    ESPNArticleIE,
 )
 from .esri import EsriVideoIE
+from .etonline import ETOnlineIE
 from .europa import EuropaIE
 from .everyonesmixtape import EveryonesMixtapeIE
 from .expotv import ExpoTVIE
@@ -338,6 +339,7 @@ from .francetv import (
 )
 from .freesound import FreesoundIE
 from .freespeech import FreespeechIE
+from .freshlive import FreshLiveIE
 from .funimation import FunimationIE
 from .funnyordie import FunnyOrDieIE
 from .fusion import FusionIE
@@ -637,6 +639,7 @@ from .ninecninemedia import (
 from .ninegag import NineGagIE
 from .ninenow import NineNowIE
 from .nintendo import NintendoIE
+from .njpwworld import NJPWWorldIE
 from .nobelprize import NobelPrizeIE
 from .noco import NocoIE
 from .normalboots import NormalbootsIE
@@ -666,6 +669,7 @@ from .npo import (
    NPORadioIE,
    NPORadioFragmentIE,
    SchoolTVIE,
+    HetKlokhuisIE,
    VPROIE,
    WNLIE,
 )
@@ -835,7 +839,6 @@ from .safari import (
 from .sapo import SapoIE
 from .savefrom import SaveFromIE
 from .sbs import SBSIE
-from .scivee import SciVeeIE
 from .screencast import ScreencastIE
 from .screencastomatic import ScreencastOMaticIE
 from .scrippsnetworks import ScrippsNetworksWatchIE
--- a/youtube_dl/extractor/freshlive.py
+++ b/youtube_dl/extractor/freshlive.py
@@ -0,0 +1,84 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    try_get,
+    unified_timestamp,
+)
+
+
+class FreshLiveIE(InfoExtractor):
+    _VALID_URL = r'https?://freshlive\.tv/[^/]+/(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://freshlive.tv/satotv/74712',
+        'md5': '9f0cf5516979c4454ce982df3d97f352',
+        'info_dict': {
+            'id': '74712',
+            'ext': 'mp4',
+            'title': 'テスト',
+            'description': 'テスト',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 1511,
+            'timestamp': 1483619655,
+            'upload_date': '20170105',
+            'uploader': 'サトTV',
+            'uploader_id': 'satotv',
+            'view_count': int,
+            'comment_count': int,
+            'is_live': False,
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        options = self._parse_json(
+            self._search_regex(
+                r'window\.__CONTEXT__\s*=\s*({.+?});\s*</script>',
+                webpage, 'initial context'),
+            video_id)
+
+        info = options['context']['dispatcher']['stores']['ProgramStore']['programs'][video_id]
+
+        title = info['title']
+
+        if info.get('status') == 'upcoming':
+            raise ExtractorError('Stream %s is upcoming' % video_id, expected=True)
+
+        stream_url = info.get('liveStreamUrl') or info['archiveStreamUrl']
+
+        is_live = info.get('liveStreamUrl') is not None
+
+        formats = self._extract_m3u8_formats(
+            stream_url, video_id, ext='mp4',
+            entry_protocol='m3u8' if is_live else 'm3u8_native',
+            m3u8_id='hls')
+
+        if is_live:
+            title = self._live_title(title)
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'title': title,
+            'description': info.get('description'),
+            'thumbnail': info.get('thumbnailUrl'),
+            'duration': int_or_none(info.get('airTime')),
+            'timestamp': unified_timestamp(info.get('createdAt')),
+            'uploader': try_get(
+                info, lambda x: x['channel']['title'], compat_str),
+            'uploader_id': try_get(
+                info, lambda x: x['channel']['code'], compat_str),
+            'uploader_url': try_get(
+                info, lambda x: x['channel']['permalink'], compat_str),
+            'view_count': int_or_none(info.get('viewCount')),
+            'comment_count': int_or_none(info.get('commentCount')),
+            'tags': info.get('tags', []),
+            'is_live': is_live,
+        }
--- a/youtube_dl/extractor/ivi.py
+++ b/youtube_dl/extractor/ivi.py
@@ -16,6 +16,8 @@ class IviIE(InfoExtractor):
    IE_DESC = 'ivi.ru'
    IE_NAME = 'ivi'
    _VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)'
+    _GEO_BYPASS = False
+    _GEO_COUNTRIES = ['RU']

    _TESTS = [
        # Single movie
@@ -91,7 +93,11 @@ class IviIE(InfoExtractor):

        if 'error' in video_json:
            error = video_json['error']
-            if error['origin'] == 'NoRedisValidData':
+            origin = error['origin']
+            if origin == 'NotAllowedForLocation':
+                self.raise_geo_restricted(
+                    msg=error['message'], countries=self._GEO_COUNTRIES)
+            elif origin == 'NoRedisValidData':
                raise ExtractorError('Video %s does not exist' % video_id, expected=True)
            raise ExtractorError(
                'Unable to download video %s: %s' % (video_id, error['message']),
--- a/youtube_dl/extractor/mdr.py
+++ b/youtube_dl/extractor/mdr.py
@@ -14,7 +14,7 @@ from ..utils import (

 class MDRIE(InfoExtractor):
    IE_DESC = 'MDR.DE and KiKA'
-    _VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z]+-?(?P<id>\d+)(?:_.+?)?\.html'
+    _VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'

    _TESTS = [{
        # MDR regularly deletes its videos
@@ -31,6 +31,7 @@ class MDRIE(InfoExtractor):
            'duration': 250,
            'uploader': 'MITTELDEUTSCHER RUNDFUNK',
        },
+        'skip': '404 not found',
    }, {
        'url': 'http://www.kika.de/baumhaus/videos/video19636.html',
        'md5': '4930515e36b06c111213e80d1e4aad0e',
@@ -41,6 +42,7 @@ class MDRIE(InfoExtractor):
            'duration': 134,
            'uploader': 'KIKA',
        },
+        'skip': '404 not found',
    }, {
        'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html',
        'md5': '5fe9c4dd7d71e3b238f04b8fdd588357',
@@ -49,11 +51,21 @@ class MDRIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Beutolomäus und der geheime Weihnachtswunsch',
            'description': 'md5:b69d32d7b2c55cbe86945ab309d39bbd',
-            'timestamp': 1450950000,
-            'upload_date': '20151224',
+            'timestamp': 1482541200,
+            'upload_date': '20161224',
            'duration': 4628,
            'uploader': 'KIKA',
        },
+    }, {
+        # audio with alternative playerURL pattern
+        'url': 'http://www.mdr.de/kultur/videos-und-audios/audio-radio/operation-mindfuck-robert-wilson100.html',
+        'info_dict': {
+            'id': '100',
+            'ext': 'mp4',
+            'title': 'Feature: Operation Mindfuck - Robert Anton Wilson',
+            'duration': 3239,
+            'uploader': 'MITTELDEUTSCHER RUNDFUNK',
+        },
    }, {
        'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
        'only_matching': True,
@@ -71,7 +83,7 @@ class MDRIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)

        data_url = self._search_regex(
-            r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+/(?:video|audio)-?[0-9]+-avCustom\.xml)\1',
+            r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+?-avCustom\.xml)\1',
            webpage, 'data url', group='url').replace(r'\/', '/')

        doc = self._download_xml(
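
Note on the last hunk above (illustrative sketch, not part of the patch): the data URL pattern no longer requires a `video`/`audio` file-name prefix and accepts anything ending in `-avCustom.xml`. The sketch compares both patterns with plain `re`; `find_data_url` and the two page fragments are hypothetical, made up only to exercise the regexes.

```python
import re

OLD_DATA_URL_RE = r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+/(?:video|audio)-?[0-9]+-avCustom\.xml)\1'
NEW_DATA_URL_RE = r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+?-avCustom\.xml)\1'


def find_data_url(pattern, webpage):
    """Hypothetical helper: extract the URL and undo JSON-style `\\/` escapes."""
    mobj = re.search(pattern, webpage)
    if not mobj:
        return None
    return mobj.group('url').replace(r'\/', '/')


# Made-up fragments: one with the classic file name, one with a free-form name.
classic = 'dataURL: "/baumhaus/videos/video19636-avCustom.xml"'
alternative = 'playerXml: "\\/kultur\\/operation-mindfuck-robert-wilson100-avCustom.xml"'

print(find_data_url(OLD_DATA_URL_RE, alternative))  # None
print(find_data_url(NEW_DATA_URL_RE, alternative))
```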
--- a/youtube_dl/extractor/njpwworld.py
+++ b/youtube_dl/extractor/njpwworld.py
@@ -0,0 +1,83 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_urlparse
+from ..utils import (
+    get_element_by_class,
+    urlencode_postdata,
+)
+
+
+class NJPWWorldIE(InfoExtractor):
+    _VALID_URL = r'https?://njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
+    IE_DESC = '新日本プロレスワールド'
+    _NETRC_MACHINE = 'njpwworld'
+
+    _TEST = {
+        'url': 'http://njpwworld.com/p/s_series_00155_1_9/',
+        'info_dict': {
+            'id': 's_series_00155_1_9',
+            'ext': 'mp4',
+            'title': '第9試合　ランディ・サベージ　vs　リック・スタイナー',
+            'tags': list,
+        },
+        'params': {
+            'skip_download': True,  # AES-encrypted m3u8
+        },
+        'skip': 'Requires login',
+    }
+
+    def _real_initialize(self):
+        self._login()
+
+    def _login(self):
+        username, password = self._get_login_info()
+        # No authentication to be performed
+        if not username:
+            return True
+
+        webpage, urlh = self._download_webpage_handle(
+            'https://njpwworld.com/auth/login', None,
+            note='Logging in', errnote='Unable to login',
+            data=urlencode_postdata({'login_id': username, 'pw': password}))
+        # /auth/login will return 302 for successful logins
+        if urlh.geturl() == 'https://njpwworld.com/auth/login':
+            self.report_warning('unable to login')
+            return False
+
+        return True
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        formats = []
+        for player_url, kind in re.findall(r'<a[^>]+href="(/player[^"]+)".+?<img[^>]+src="[^"]+qf_btn_([^".]+)', webpage):
+            player_url = compat_urlparse.urljoin(url, player_url)
+
+            player_page = self._download_webpage(
+                player_url, video_id, note='Downloading player page')
+
+            entries = self._parse_html5_media_entries(
+                player_url, player_page, video_id, m3u8_id='hls-%s' % kind,
+                m3u8_entry_protocol='m3u8_native',
+                preference=2 if 'hq' in kind else 1)
+            formats.extend(entries[0]['formats'])
+
+        self._sort_formats(formats)
+
+        post_content = get_element_by_class('post-content', webpage)
+        tags = re.findall(
+            r'<li[^>]+class="tag-[^"]+"><a[^>]*>([^<]+)</a></li>', post_content
+        ) if post_content else None
+
+        return {
+            'id': video_id,
+            'title': self._og_search_title(webpage),
+            'formats': formats,
+            'tags': tags,
+        }
--- a/youtube_dl/extractor/noco.py
+++ b/youtube_dl/extractor/noco.py
@@ -23,7 +23,7 @@ from ..utils import (

 class NocoIE(InfoExtractor):
    _VALID_URL = r'https?://(?:(?:www\.)?noco\.tv/emission/|player\.noco\.tv/\?idvideo=)(?P<id>\d+)'
-    _LOGIN_URL = 'http://noco.tv/do.php'
+    _LOGIN_URL = 'https://noco.tv/do.php'
    _API_URL_TEMPLATE = 'https://api.noco.tv/1.1/%s?ts=%s&tk=%s'
    _SUB_LANG_TEMPLATE = '&sub_lang=%s'
    _NETRC_MACHINE = 'noco'
@@ -69,16 +69,17 @@ class NocoIE(InfoExtractor):
        if username is None:
            return

-        login_form = {
-            'a': 'login',
-            'cookie': '1',
-            'username': username,
-            'password': password,
-        }
-        request = sanitized_Request(self._LOGIN_URL, urlencode_postdata(login_form))
-        request.add_header('Content-Type', 'application/x-www-form-urlencoded; charset=UTF-8')
-
-        login = self._download_json(request, None, 'Logging in as %s' % username)
+        login = self._download_json(
+            self._LOGIN_URL, None, 'Logging in as %s' % username,
+            data=urlencode_postdata({
+                'a': 'login',
+                'cookie': '1',
+                'username': username,
+                'password': password,
+            }),
+            headers={
+                'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
+            })

        if 'erreur' in login:
            raise ExtractorError('Unable to login: %s' % clean_html(login['erreur']), expected=True)
--- a/youtube_dl/extractor/npo.py
+++ b/youtube_dl/extractor/npo.py
@@ -51,7 +51,8 @@ class NPOIE(NPOBaseIE):
                            (?:
                                npo\.nl/(?!live|radio)(?:[^/]+/){2}|
                                ntr\.nl/(?:[^/]+/){2,}|
-                                omroepwnl\.nl/video/fragment/[^/]+__
+                                omroepwnl\.nl/video/fragment/[^/]+__|
+                                zapp\.nl/[^/]+/[^/]+/
                            )
                        )
                        (?P<id>[^/?#]+)
@@ -140,6 +141,18 @@ class NPOIE(NPOBaseIE):
                'upload_date': '20150508',
                'duration': 462,
            },
+        },
+        {
+            'url': 'http://www.zapp.nl/de-bzt-show/gemist/KN_1687547',
+            'only_matching': True,
+        },
+        {
+            'url': 'http://www.zapp.nl/de-bzt-show/filmpjes/POMS_KN_7315118',
+            'only_matching': True,
+        },
+        {
+            'url': 'http://www.zapp.nl/beste-vrienden-quiz/extra-video-s/WO_NTR_1067990',
+            'only_matching': True,
        }
    ]

@@ -416,7 +429,21 @@ class NPORadioFragmentIE(InfoExtractor):
        }


-class SchoolTVIE(InfoExtractor):
+class NPODataMidEmbedIE(InfoExtractor):
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        video_id = self._search_regex(
+            r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
+        return {
+            '_type': 'url_transparent',
+            'ie_key': 'NPO',
+            'url': 'npo:%s' % video_id,
+            'display_id': display_id
+        }
+
+
+class SchoolTVIE(NPODataMidEmbedIE):
    IE_NAME = 'schooltv'
    _VALID_URL = r'https?://(?:www\.)?schooltv\.nl/video/(?P<id>[^/?#&]+)'

@@ -435,17 +462,25 @@ class SchoolTVIE(InfoExtractor):
        }
    }

-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        video_id = self._search_regex(
-            r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
-        return {
-            '_type': 'url_transparent',
-            'ie_key': 'NPO',
-            'url': 'npo:%s' % video_id,
-            'display_id': display_id
+
+class HetKlokhuisIE(NPODataMidEmbedIE):
+    IE_NAME = 'hetklokhuis'
+    _VALID_URL = r'https?://(?:www\.)?hetklokhuis\.nl/[^/]+/\d+/(?P<id>[^/?#&]+)'
+
+    _TEST = {
+        'url': 'http://hetklokhuis.nl/tv-uitzending/3471/Zwaartekrachtsgolven',
+        'info_dict': {
+            'id': 'VPWON_1260528',
+            'display_id': 'Zwaartekrachtsgolven',
+            'ext': 'm4v',
+            'title': 'Het Klokhuis: Zwaartekrachtsgolven',
+            'description': 'md5:c94f31fb930d76c2efa4a4a71651dd48',
+            'upload_date': '20170223',
+        },
+        'params': {
+            'skip_download': True
        }
+    }


 class NPOPlaylistBaseIE(NPOIE):
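
Note on the hunks above (illustrative sketch, not part of the patch): `NPODataMidEmbedIE` pulls the `data-mid` attribute out of the page and hands it to the NPO extractor through a `url_transparent` result. The sketch shows the regex step with plain `re`; `build_npo_result` and the HTML fragment are hypothetical.

```python
import re

# The backreference \1 makes the closing quote match the opening one, and the
# negative lookahead stops the id at that quote.
DATA_MID_RE = r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1'


def build_npo_result(webpage, display_id):
    """Hypothetical helper returning the same dict shape as the patch."""
    mobj = re.search(DATA_MID_RE, webpage)
    if not mobj:
        return None
    return {
        '_type': 'url_transparent',
        'ie_key': 'NPO',
        'url': 'npo:%s' % mobj.group('id'),
        'display_id': display_id,
    }


page = '<div class="player" data-mid="VPWON_1260528"></div>'  # made-up markup
print(build_npo_result(page, 'Zwaartekrachtsgolven')['url'])  # npo:VPWON_1260528
```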
--- a/youtube_dl/extractor/openload.py
+++ b/youtube_dl/extractor/openload.py
@@ -72,16 +72,21 @@ class OpenloadIE(InfoExtractor):
            raise ExtractorError('File not found', expected=True)

        ol_id = self._search_regex(
-            '<span[^>]+id="[^"]+"[^>]*>([0-9]+)</span>',
+            '<span[^>]+id="[^"]+"[^>]*>([0-9A-Za-z]+)</span>',
            webpage, 'openload ID')

-        first_two_chars = int(float(ol_id[0:][:2]))
+        first_char = int(ol_id[0])
        urlcode = []
-        num = 2
+        num = 1

        while num < len(ol_id):
-            key = int(float(ol_id[num + 3:][:2]))
-            urlcode.append((key, compat_chr(int(float(ol_id[num:][:3])) - first_two_chars)))
+            i = ord(ol_id[num])
+            key = 0
+            if i <= 90:
+                key = i - 65
+            elif i >= 97:
+                key = 25 + i - 97
+            urlcode.append((key, compat_chr(int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char)))
            num += 5

        video_url = 'https://openload.co/stream/' + ''.join(
--- a/youtube_dl/extractor/scivee.py
+++ b/youtube_dl/extractor/scivee.py
@@ -1,57 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import int_or_none
-
-
-class SciVeeIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?scivee\.tv/node/(?P<id>\d+)'
-
-    _TEST = {
-        'url': 'http://www.scivee.tv/node/62352',
-        'md5': 'b16699b74c9e6a120f6772a44960304f',
-        'info_dict': {
-            'id': '62352',
-            'ext': 'mp4',
-            'title': 'Adam Arkin at the 2014 DOE JGI Genomics of Energy & Environment Meeting',
-            'description': 'md5:81f1710638e11a481358fab1b11059d7',
-        },
-        'skip': 'Not accessible from Travis CI server',
-    }
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
-        # annotations XML is malformed
-        annotations = self._download_webpage(
-            'http://www.scivee.tv/assets/annotations/%s' % video_id, video_id, 'Downloading annotations')
-
-        title = self._html_search_regex(r'<title>([^<]+)</title>', annotations, 'title')
-        description = self._html_search_regex(r'<abstract>([^<]+)</abstract>', annotations, 'abstract', fatal=False)
-        filesize = int_or_none(self._html_search_regex(
-            r'<filesize>([^<]+)</filesize>', annotations, 'filesize', fatal=False))
-
-        formats = [
-            {
-                'url': 'http://www.scivee.tv/assets/audio/%s' % video_id,
-                'ext': 'mp3',
-                'format_id': 'audio',
-            },
-            {
-                'url': 'http://www.scivee.tv/assets/video/%s' % video_id,
-                'ext': 'mp4',
-                'format_id': 'video',
-                'filesize': filesize,
-            },
-        ]
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': description,
-            'thumbnail': 'http://www.scivee.tv/assets/videothumb/%s' % video_id,
-            'formats': formats,
-        }
--- a/youtube_dl/extractor/thescene.py
+++ b/youtube_dl/extractor/thescene.py
@@ -3,7 +3,10 @@ from __future__ import unicode_literals
 from .common import InfoExtractor

 from ..compat import compat_urlparse
-from ..utils import qualities
+from ..utils import (
+    int_or_none,
+    qualities,
+)


 class TheSceneIE(InfoExtractor):
@@ -16,6 +19,11 @@ class TheSceneIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Narciso Rodriguez: Spring 2013 Ready-to-Wear',
            'display_id': 'narciso-rodriguez-spring-2013-ready-to-wear',
+            'duration': 127,
+            'series': 'Style.com Fashion Shows',
+            'season': 'Ready To Wear Spring 2013',
+            'tags': list,
+            'categories': list,
        },
    }

@@ -32,21 +40,29 @@ class TheSceneIE(InfoExtractor):
        player = self._download_webpage(player_url, display_id)
        info = self._parse_json(
            self._search_regex(
-                r'(?m)var\s+video\s+=\s+({.+?});$', player, 'info json'),
+                r'(?m)video\s*:\s*({.+?}),$', player, 'info json'),
            display_id)

+        video_id = info['id']
+        title = info['title']
+
        qualities_order = qualities(('low', 'high'))
        formats = [{
            'format_id': '{0}-{1}'.format(f['type'].split('/')[0], f['quality']),
            'url': f['src'],
            'quality': qualities_order(f['quality']),
-        } for f in info['sources'][0]]
+        } for f in info['sources']]
        self._sort_formats(formats)

        return {
-            'id': info['id'],
+            'id': video_id,
            'display_id': display_id,
-            'title': info['title'],
+            'title': title,
            'formats': formats,
            'thumbnail': info.get('poster_frame'),
+            'duration': int_or_none(info.get('duration')),
+            'series': info.get('series_title'),
+            'season': info.get('season_title'),
+            'tags': info.get('tags'),
+            'categories': info.get('categories'),
        }
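
Note on the thescene.py hunk above: a minimal sketch of how the relaxed pattern picks the `video: {...},` object out of a player page and how formats are now built from every entry of `sources`. The page fragment, `extract_info` and `build_formats` are hypothetical stand-ins for illustration; the real player page layout is not shown in this diff, and `_search_regex`/`_parse_json` are replaced here by plain `re` and `json`.

```python
import json
import re

# Hypothetical player page fragment (assumption: the real markup is not in the diff).
PLAYER_SNIPPET = '''
var config = {
    video: {"id": "abc123", "title": "Sample", "sources": [{"type": "video/mp4", "quality": "low", "src": "http://example.com/low.mp4"}, {"type": "video/mp4", "quality": "high", "src": "http://example.com/high.mp4"}]},
    autoplay: false
};
'''

# Pattern copied from the added line of the hunk.
INFO_RE = r'(?m)video\s*:\s*({.+?}),$'


def extract_info(player):
    """Return the parsed JSON object that follows 'video:' on its own line."""
    match = re.search(INFO_RE, player)
    if match is None:
        raise ValueError('info json not found')
    return json.loads(match.group(1))


def build_formats(info):
    """Build one format dict per source, iterating info['sources'] directly."""
    return [{
        'format_id': '{0}-{1}'.format(f['type'].split('/')[0], f['quality']),
        'url': f['src'],
    } for f in info['sources']]


info = extract_info(PLAYER_SNIPPET)
formats = build_formats(info)
```

The lazy `.+?` only stops at a closing brace that is followed by a comma at the end of a line, which is why the nested braces inside `sources` do not end the match early.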
--- a/youtube_dl/extractor/tubitv.py
+++ b/youtube_dl/extractor/tubitv.py
@ -16,6 +16,7 @@ class TubiTvIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?tubitv\.com/video/(?P<id>[0-9]+)'
    _LOGIN_URL = 'http://tubitv.com/login'
    _NETRC_MACHINE = 'tubitv'
+    _GEO_COUNTRIES = ['US']
    _TEST = {
        'url': 'http://tubitv.com/video/283829/the_comedian_at_the_friday',
        'md5': '43ac06be9326f41912dc64ccf7a80320',
--- a/youtube_dl/extractor/tvigle.py
+++ b/youtube_dl/extractor/tvigle.py
@ -17,6 +17,9 @@ class TvigleIE(InfoExtractor):
    IE_DESC = 'Интернет-телевидение Tvigle.ru'
    _VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'

+    _GEO_BYPASS = False
+    _GEO_COUNTRIES = ['RU']
+
    _TESTS = [
        {
            'url': 'http://www.tvigle.ru/video/sokrat/',
@ -72,8 +75,13 @@ class TvigleIE(InfoExtractor):

        error_message = item.get('errorMessage')
        if not videos and error_message:
-            raise ExtractorError(
-                '%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
+            if item.get('isGeoBlocked') is True:
+                self.raise_geo_restricted(
+                    msg=error_message, countries=self._GEO_COUNTRIES)
+            else:
+                raise ExtractorError(
+                    '%s returned error: %s' % (self.IE_NAME, error_message),
+                    expected=True)

        title = item['title']
        description = item.get('description')
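
Note on the tvigle.py hunk above: a rough sketch of the new error branching, assuming simplified stand-in exception classes rather than youtube-dl's own `ExtractorError` and `raise_geo_restricted`. `check_item` and the sample payloads are hypothetical and exist only to show which branch fires.

```python
class ExtractorError(Exception):
    """Simplified stand-in for youtube-dl's ExtractorError."""


class GeoRestrictedError(ExtractorError):
    """Simplified stand-in that carries the list of allowed countries."""

    def __init__(self, msg, countries=None):
        super().__init__(msg)
        self.countries = countries


GEO_COUNTRIES = ['RU']  # mirrors _GEO_COUNTRIES in the hunk


def check_item(item, videos, ie_name='tvigle'):
    """Raise a geo error when the API flags the item as geo blocked,
    otherwise fall back to a generic error carrying the API message."""
    error_message = item.get('errorMessage')
    if not videos and error_message:
        if item.get('isGeoBlocked') is True:
            raise GeoRestrictedError(error_message, countries=GEO_COUNTRIES)
        raise ExtractorError(
            '%s returned error: %s' % (ie_name, error_message))


def classify(item, videos):
    """Small helper for demonstration: name the outcome instead of raising."""
    try:
        check_item(item, videos)
    except GeoRestrictedError as e:
        return 'geo', e.countries
    except ExtractorError:
        return 'error', None
    return 'ok', None
```

The `is True` comparison keeps truthy non-boolean values from triggering the geo branch, matching the check in the hunk.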
--- a/youtube_dl/extractor/vevo.py
+++ b/youtube_dl/extractor/vevo.py
@ -17,12 +17,12 @@ from ..utils import (


 class VevoBaseIE(InfoExtractor):
-    def _extract_json(self, webpage, video_id, item):
+    def _extract_json(self, webpage, video_id):
        return self._parse_json(
            self._search_regex(
                r'window\.__INITIAL_STORE__\s*=\s*({.+?});\s*</script>',
                webpage, 'initial store'),
-            video_id)['default'][item]
+            video_id)


 class VevoIE(VevoBaseIE):
@ -139,6 +139,11 @@ class VevoIE(VevoBaseIE):
        # no genres available
        'url': 'http://www.vevo.com/watch/INS171400764',
        'only_matching': True,
+    }, {
+        # Another case available only via the webpage; using streams/streamsV3 formats
+        # Geo-restricted to Netherlands/Germany
+        'url': 'http://www.vevo.com/watch/boostee/pop-corn-clip-officiel/FR1A91600909',
+        'only_matching': True,
    }]
    _VERSIONS = {
        0: 'youtube',  # only in AuthenticateVideo videoVersions
@ -193,7 +198,14 @@ class VevoIE(VevoBaseIE):
        # https://github.com/rg3/youtube-dl/issues/9366)
        if not video_versions:
            webpage = self._download_webpage(url, video_id)
-            video_versions = self._extract_json(webpage, video_id, 'streams')[video_id][0]
+            json_data = self._extract_json(webpage, video_id)
+            if 'streams' in json_data.get('default', {}):
+                video_versions = json_data['default']['streams'][video_id][0]
+            else:
+                video_versions = [
+                    value
+                    for key, value in json_data['apollo']['data'].items()
+                    if key.startswith('%s.streams' % video_id)]

        uploader = None
        artist = None
@ -207,7 +219,7 @@ class VevoIE(VevoBaseIE):

        formats = []
        for video_version in video_versions:
-            version = self._VERSIONS.get(video_version['version'])
+            version = self._VERSIONS.get(video_version.get('version'), 'generic')
            version_url = video_version.get('url')
            if not version_url:
                continue
@ -339,7 +351,7 @@ class VevoPlaylistIE(VevoBaseIE):
            if video_id:
                return self.url_result('vevo:%s' % video_id, VevoIE.ie_key())

-        playlists = self._extract_json(webpage, playlist_id, '%ss' % playlist_kind)
+        playlists = self._extract_json(webpage, playlist_id)['default']['%ss' % playlist_kind]

        playlist = (list(playlists.values())[0]
                    if playlist_kind == 'playlist' else playlists[playlist_id])
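
Note on the vevo.py hunks above: a minimal sketch of the fallback that selects stream entries from either the older `default.streams` layout or the newer `apollo.data` layout. The function name `select_video_versions` and both sample payloads (including the exact key shape `<id>.streams.<n>`) are assumptions made for illustration; only the selection logic is taken from the diff.

```python
def select_video_versions(json_data, video_id):
    """Prefer default.streams when present; otherwise collect every
    apollo.data value whose key starts with '<video_id>.streams'."""
    if 'streams' in json_data.get('default', {}):
        return json_data['default']['streams'][video_id][0]
    return [
        value
        for key, value in json_data['apollo']['data'].items()
        if key.startswith('%s.streams' % video_id)]


VIDEO_ID = 'FR1A91600909'

# Hypothetical payload in the older layout.
old_layout = {
    'default': {
        'streams': {VIDEO_ID: [[{'version': 2, 'url': 'http://example.com/a'}]]},
    },
}

# Hypothetical payload in the newer layout; the key names are assumed.
new_layout = {
    'default': {},
    'apollo': {
        'data': {
            VIDEO_ID + '.streams.0': {'url': 'http://example.com/b'},
            VIDEO_ID + '.streams.1': {'url': 'http://example.com/c'},
            VIDEO_ID + '.artists.0': {'name': 'unrelated'},
        },
    },
}

old_versions = select_video_versions(old_layout, VIDEO_ID)
new_versions = select_video_versions(new_layout, VIDEO_ID)
```

Entries in the newer layout may lack a `version` field, which would explain why the later hunk switches to `video_version.get('version')` with a `'generic'` default.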
--- a/youtube_dl/extractor/xhamster.py
+++ b/youtube_dl/extractor/xhamster.py
@ -5,6 +5,7 @@ import re
 from .common import InfoExtractor
 from ..utils import (
    dict_get,
+    ExtractorError,
    int_or_none,
    parse_duration,
    unified_strdate,
@ -57,6 +58,10 @@ class XHamsterIE(InfoExtractor):
    }, {
        'url': 'https://xhamster.com/movies/2272726/amber_slayed_by_the_knight.html',
        'only_matching': True,
+    }, {
+        # This video is visible for marcoalfa123456's friends only
+        'url': 'https://it.xhamster.com/movies/7263980/la_mia_vicina.html',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@ -78,6 +83,12 @@ class XHamsterIE(InfoExtractor):
        mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo)
        webpage = self._download_webpage(mrss_url, video_id)

+        error = self._html_search_regex(
+            r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>',
+            webpage, 'error', default=None)
+        if error:
+            raise ExtractorError(error, expected=True)
+
        title = self._html_search_regex(
            [r'<h1[^>]*>([^<]+)</h1>',
             r'<meta[^>]+itemprop=".*?caption.*?"[^>]+content="(.+?)"',
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@ -47,7 +47,6 @@ from ..utils import (
    unsmuggle_url,
    uppercase_escape,
    urlencode_postdata,
-    ISO3166Utils,
 )


@ -371,6 +370,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
    }
    _SUBTITLE_FORMATS = ('ttml', 'vtt')

+    _GEO_BYPASS = False
+
    IE_NAME = 'youtube'
    _TESTS = [
        {
@ -917,7 +918,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            # itag 212
            'url': '1t24XAntNCY',
            'only_matching': True,
-        }
+        },
+        {
+            # geo restricted to JP
+            'url': 'sJL6WA-aGkQ',
+            'only_matching': True,
+        },
    ]

    def __init__(self, *args, **kwargs):
@ -1376,11 +1382,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
        if 'token' not in video_info:
            if 'reason' in video_info:
                if 'The uploader has not made this video available in your country.' in video_info['reason']:
-                    regions_allowed = self._html_search_meta('regionsAllowed', video_webpage, default=None)
-                    if regions_allowed:
-                        raise ExtractorError('YouTube said: This video is available in %s only' % (
-                            ', '.join(map(ISO3166Utils.short2full, regions_allowed.split(',')))),
-                            expected=True)
+                    regions_allowed = self._html_search_meta(
+                        'regionsAllowed', video_webpage, default=None)
+                    countries = regions_allowed.split(',') if regions_allowed else None
+                    self.raise_geo_restricted(
+                        msg=video_info['reason'][0], countries=countries)
                raise ExtractorError(
                    'YouTube said: %s' % video_info['reason'][0],
                    expected=True, video_id=video_id)
@ -2226,7 +2232,7 @@ class YoutubeUserIE(YoutubeChannelIE):
        'url': 'https://www.youtube.com/gametrailers',
        'only_matching': True,
    }, {
-        # This channel is not available.
+        # This channel is not available, geo restricted to JP
        'url': 'https://www.youtube.com/user/kananishinoSMEJ/videos',
        'only_matching': True,
    }]
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2017.02.24'
+__version__ = '2017.02.27'
Author	SHA1	Message	Date
Sergey M․	ef48a1175d	release 2017.02.27	2017-02-27 23:26:07 +07:00
Sergey M․	c6184bcf7b	[ChangeLog] Actualize	2017-02-27 23:24:03 +07:00
Sergey M․	18abb74376	[npo] Relax _VALID_URL for zapp.nl	2017-02-27 23:13:51 +07:00
Sergey M․	dbc01fdb6f	[hetklokhuis] Fix IE_NAME	2017-02-27 23:10:29 +07:00
Sergey M․	f264c62334	[npo] Add support for zapp.nl	2017-02-27 23:10:00 +07:00
Sergey M․	0dc5a86a32	[npo] Add support for hetklokhuis.nl (closes #12293)	2017-02-27 22:43:19 +07:00
Sergey M․	0e879f432a	[youtube:channel] Remove duplicate test	2017-02-27 22:22:43 +07:00
Yen Chi Hsuan	892b47ab6c	[scivee] Remove extractor (#9315) The Wikipedia page is changed from active to down: https://en.wikipedia.org/w/index.php?title=SciVee&diff=prev&oldid=723161154 Some other interesting bits: $ nslookup www.scivee.tv Server: 8.8.8.8 Address: 8.8.8.8#53 Non-authoritative answer: www.scivee.tv canonical name = scivee.rcsb.org. Name: scivee.rcsb.org Address: 132.249.231.211 $ nslookup rcsb.org Server: 8.8.8.8 Address: 8.8.8.8#53 Non-authoritative answer: Name: rcsb.org Address: 132.249.231.77 Both IPs are from UCSD. I guess it's maintained by a lab and they don't maintain it anymore.	2017-02-27 21:34:33 +08:00
Alex Seiler	fdeea72611	[cda] Decode URL (fixes #12255)	2017-02-26 22:05:52 +08:00
xbe	7fd4655256	[crunchyroll] Extract uploader name that's not a link Provide the Crunchyroll extractor with the ability to extract uploader names that aren't links. Add a test for this new functionality. This fixes #12267.	2017-02-26 19:08:10 +08:00
Sergey M․	fd5c4aab59	[youtube] Raise GeoRestrictedError	2017-02-26 16:52:40 +07:00
Sergey M․	8878789f11	[dailymotion] Raise GeoRestrictedError	2017-02-26 16:52:40 +07:00
Yen Chi Hsuan	a5cf17989b	[MDR] Relax _VALID_URL and playerURL matching and update _TESTS Ref: #12169	2017-02-26 17:24:54 +08:00
Sergey M․	b3aec47665	[tvigle] Raise GeoRestrictedError	2017-02-25 23:27:45 +07:00
Yen Chi Hsuan	9d0c08a02c	[vevo] Fix videos with the new streams/streamsV3 format (closes #11719)	2017-02-26 00:15:49 +08:00
Sergey M․	e498758b9c	[freshlive] Fix issues and improve (closes #12175)	2017-02-25 22:56:42 +07:00
Ricardo Constantino	5fc8d89361	[freshlive] Add extractor	2017-02-25 22:55:17 +07:00
Pratyush Singh	d374d943f3	[downloader/common] Limit displaying 2 digits after decimal point in sleep interval message	2017-02-25 20:59:04 +07:00
Sergey M․	103f8c8d36	[xhamster] Capture and output videoClosed error (#12263)	2017-02-25 20:38:21 +07:00
Sergey M․	922ab7840b	[etonline] Add extractor (closes #12236)	2017-02-25 20:16:40 +07:00
Sergey M․	831217291a	[compat] Use try except for compat_numeric_types	2017-02-25 19:44:50 +07:00
Yen Chi Hsuan	db182c63fb	[njpwworld] Add new extractor (closes #11561)	2017-02-25 18:44:39 +08:00
Yen Chi Hsuan	eeb0a95684	[extractor/common] Add 'preference' to _parse_html5_media_entries Some websites, like NJPWorld, put different qualities on different player pages.	2017-02-25 18:40:05 +08:00
Sergey M․	231bcd0b6b	[amcnetworks] Relax _VALID_URL (#12127)	2017-02-25 02:51:53 +07:00
Sergey M․	204efc8509	release 2017.02.24.1	2017-02-24 21:59:39 +07:00
Sergey M․	5d3a51e1b9	[ChangeLog] Actualize	2017-02-24 21:57:39 +07:00
Sergey M․	ad3033037c	[noco] Modernize	2017-02-24 21:51:56 +07:00
Sergey M․	f3bc281239	[noco] Swtich login URL to https (closes #12246)	2017-02-24 21:48:34 +07:00
Sergey M․	441d7a32e5	[thescene] Extract more metadata	2017-02-24 21:22:29 +07:00
Thomas Christlieb	51ed496307	[thescene] Fix extraction (closes #12235)	2017-02-24 22:08:45 +08:00
Remita Amine	68f17a9c2d	[tubitv] use geo bypass mechanism	2017-02-24 12:27:56 +01:00
Remita Amine	39e7277ed1	[openload] fix extraction(closes #10408)	2017-02-24 11:21:58 +01:00
Sergey M․	42dcdbe11c	[ivi] Raise GeoRestrictedError	2017-02-24 10:54:39 +07:00