Compare commits

...

33 Commits

Author SHA1 Message Date
ef48a1175d release 2017.02.27 2017-02-27 23:26:07 +07:00
c6184bcf7b [ChangeLog] Actualize 2017-02-27 23:24:03 +07:00
18abb74376 [npo] Relax _VALID_URL for zapp.nl 2017-02-27 23:13:51 +07:00
dbc01fdb6f [hetklokhuis] Fix IE_NAME 2017-02-27 23:10:29 +07:00
f264c62334 [npo] Add support for zapp.nl 2017-02-27 23:10:00 +07:00
0dc5a86a32 [npo] Add support for hetklokhuis.nl (closes #12293) 2017-02-27 22:43:19 +07:00
0e879f432a [youtube:channel] Remove duplicate test 2017-02-27 22:22:43 +07:00
892b47ab6c [scivee] Remove extractor (#9315)
The Wikipedia page is changed from active to down:
https://en.wikipedia.org/w/index.php?title=SciVee&diff=prev&oldid=723161154

Some other interesting bits:

$ nslookup www.scivee.tv
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
www.scivee.tv   canonical name = scivee.rcsb.org.
Name:   scivee.rcsb.org
Address: 132.249.231.211

$ nslookup rcsb.org
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   rcsb.org
Address: 132.249.231.77

Both IPs are from UCSD. I guess it's maintained by a lab and they don't
maintain it anymore.
2017-02-27 21:34:33 +08:00
fdeea72611 [cda] Decode URL (fixes #12255) 2017-02-26 22:05:52 +08:00
xbe
7fd4655256 [crunchyroll] Extract uploader name that's not a link
Provide the Crunchyroll extractor with the ability to extract uploader
names that aren't links. Add a test for this new functionality.
This fixes #12267.
2017-02-26 19:08:10 +08:00
fd5c4aab59 [youtube] Raise GeoRestrictedError 2017-02-26 16:52:40 +07:00
8878789f11 [dailymotion] Raise GeoRestrictedError 2017-02-26 16:52:40 +07:00
a5cf17989b [MDR] Relax _VALID_URL and playerURL matching and update _TESTS
Ref: #12169
2017-02-26 17:24:54 +08:00
b3aec47665 [tvigle] Raise GeoRestrictedError 2017-02-25 23:27:45 +07:00
9d0c08a02c [vevo] Fix videos with the new streams/streamsV3 format (closes #11719) 2017-02-26 00:15:49 +08:00
e498758b9c [freshlive] Fix issues and improve (closes #12175) 2017-02-25 22:56:42 +07:00
5fc8d89361 [freshlive] Add extractor 2017-02-25 22:55:17 +07:00
d374d943f3 [downloader/common] Limit displaying 2 digits after decimal point in sleep interval message 2017-02-25 20:59:04 +07:00
103f8c8d36 [xhamster] Capture and output videoClosed error (#12263) 2017-02-25 20:38:21 +07:00
922ab7840b [etonline] Add extractor (closes #12236) 2017-02-25 20:16:40 +07:00
831217291a [compat] Use try except for compat_numeric_types 2017-02-25 19:44:50 +07:00
db182c63fb [njpwworld] Add new extractor (closes #11561) 2017-02-25 18:44:39 +08:00
eeb0a95684 [extractor/common] Add 'preference' to _parse_html5_media_entries
Some websites, like NJPWorld, put different qualities on different
player pages.
2017-02-25 18:40:05 +08:00
231bcd0b6b [amcnetworks] Relax _VALID_URL (#12127) 2017-02-25 02:51:53 +07:00
204efc8509 release 2017.02.24.1 2017-02-24 21:59:39 +07:00
5d3a51e1b9 [ChangeLog] Actualize 2017-02-24 21:57:39 +07:00
ad3033037c [noco] Modernize 2017-02-24 21:51:56 +07:00
f3bc281239 [noco] Switch login URL to https (closes #12246) 2017-02-24 21:48:34 +07:00
441d7a32e5 [thescene] Extract more metadata 2017-02-24 21:22:29 +07:00
51ed496307 [thescene] Fix extraction (closes #12235) 2017-02-24 22:08:45 +08:00
68f17a9c2d [tubitv] use geo bypass mechanism 2017-02-24 12:27:56 +01:00
39e7277ed1 [openload] fix extraction (closes #10408) 2017-02-24 11:21:58 +01:00
42dcdbe11c [ivi] Raise GeoRestrictedError 2017-02-24 10:54:39 +07:00
27 changed files with 469 additions and 125 deletions

View File

@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.24**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.27*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.27**
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2017.02.24
+[debug] youtube-dl version 2017.02.27
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}

View File

@ -1,3 +1,41 @@
version 2017.02.27
Core
* [downloader/common] Limit displaying 2 digits after decimal point in sleep
interval message (#12183)
+ [extractor/common] Add preference to _parse_html5_media_entries
Extractors
+ [npo] Add support for zapp.nl
+ [npo] Add support for hetklokhuis.nl (#12293)
- [scivee] Remove extractor (#9315)
+ [cda] Decode download URL (#12255)
+ [crunchyroll] Improve uploader extraction (#12267)
+ [youtube] Raise GeoRestrictedError
+ [dailymotion] Raise GeoRestrictedError
+ [mdr] Recognize more URL patterns (#12169)
+ [tvigle] Raise GeoRestrictedError
* [vevo] Fix extraction for videos with the new streams/streamsV3 format
(#11719)
+ [freshlive] Add support for freshlive.tv (#12175)
+ [xhamster] Capture and output videoClosed error (#12263)
+ [etonline] Add support for etonline.com (#12236)
+ [njpwworld] Add support for njpwworld.com (#11561)
* [amcnetworks] Relax URL regular expression (#12127)
version 2017.02.24.1
Extractors
* [noco] Modernize
* [noco] Switch login URL to https (#12246)
+ [thescene] Extract more metadata
* [thescene] Fix extraction (#12235)
+ [tubitv] Use geo bypass mechanism
* [openload] Fix extraction (#10408)
+ [ivi] Raise GeoRestrictedError
version 2017.02.24 version 2017.02.24
Core Core

View File

@ -239,6 +239,7 @@
- **ESPN** - **ESPN**
- **ESPNArticle** - **ESPNArticle**
- **EsriVideo** - **EsriVideo**
- **ETOnline**
- **Europa** - **Europa**
- **EveryonesMixtape** - **EveryonesMixtape**
- **ExpoTV** - **ExpoTV**
@ -274,6 +275,7 @@
- **francetvinfo.fr** - **francetvinfo.fr**
- **Freesound** - **Freesound**
- **freespeech.org** - **freespeech.org**
- **FreshLive**
- **Funimation** - **Funimation**
- **FunnyOrDie** - **FunnyOrDie**
- **Fusion** - **Fusion**
@ -310,6 +312,7 @@
- **HellPorno** - **HellPorno**
- **Helsinki**: helsinki.fi - **Helsinki**: helsinki.fi
- **HentaiStigma** - **HentaiStigma**
- **hetklokhuis**
- **hgtv.com:show** - **hgtv.com:show**
- **HistoricFilms** - **HistoricFilms**
- **history:topic**: History.com Topic - **history:topic**: History.com Topic
@ -511,6 +514,7 @@
- **Nintendo** - **Nintendo**
- **njoy**: N-JOY - **njoy**: N-JOY
- **njoy:embed** - **njoy:embed**
- **NJPWWorld**: 新日本プロレスワールド
- **NobelPrize** - **NobelPrize**
- **Noco** - **Noco**
- **Normalboots** - **Normalboots**
@ -666,7 +670,6 @@
- **savefrom.net** - **savefrom.net**
- **SBS**: sbs.com.au - **SBS**: sbs.com.au
- **schooltv** - **schooltv**
- **SciVee**
- **screen.yahoo:search**: Yahoo screen search - **screen.yahoo:search**: Yahoo screen search
- **Screencast** - **Screencast**
- **ScreencastOMatic** - **ScreencastOMatic**

View File

@ -2760,8 +2760,10 @@ else:
     compat_kwargs = lambda kwargs: kwargs
-compat_numeric_types = ((int, float, long, complex) if sys.version_info[0] < 3
-                        else (int, float, complex))
+try:
+    compat_numeric_types = (int, float, long, complex)
+except NameError:  # Python 3
+    compat_numeric_types = (int, float, complex)
 if sys.version_info < (2, 7):
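
The change in this hunk can be illustrated in isolation. This is a minimal sketch of the pattern it switches to: probe for the Python 2-only name `long` and fall back on `NameError`, rather than branching on `sys.version_info`. The names `numeric_types` and `is_number` are illustrative only and are not part of youtube-dl.

```python
# Probe for the Python 2-only builtin ``long`` instead of checking the version.
try:
    numeric_types = (int, float, long, complex)  # noqa: F821 (Python 2 only)
except NameError:  # Python 3: ``long`` no longer exists
    numeric_types = (int, float, complex)


def is_number(value):
    """Return True if value is an instance of one of the numeric types."""
    return isinstance(value, numeric_types)
```

A likely benefit of the try/except form is that the `long` reference is isolated in one guarded expression, so the module behaves the same on both interpreters without a version comparison.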

View File

@ -347,7 +347,10 @@ class FileDownloader(object):
         if min_sleep_interval:
             max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
             sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
-            self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
+            self.to_screen(
+                '[download] Sleeping %s seconds...' % (
+                    int(sleep_interval) if sleep_interval.is_integer()
+                    else '%.2f' % sleep_interval))
             time.sleep(sleep_interval)
         return self.real_download(filename, info_dict)
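
The formatting rule introduced in this hunk can be tried on its own. The sketch below assumes a float input, as `random.uniform` returns; `format_sleep_interval` is a hypothetical helper name, not a function in youtube-dl.

```python
def format_sleep_interval(sleep_interval):
    """Format a float number of seconds the way the new message does."""
    # Whole values are shown without a fractional part.
    if sleep_interval.is_integer():
        return '%s' % int(sleep_interval)
    # Everything else is limited to two digits after the decimal point.
    return '%.2f' % sleep_interval


print('[download] Sleeping %s seconds...' % format_sleep_interval(2.34567))
```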

View File

@ -10,7 +10,7 @@ from ..utils import (
 class AMCNetworksIE(ThePlatformIE):
-    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
         'md5': '',
@ -44,6 +44,12 @@ class AMCNetworksIE(ThePlatformIE):
}, { }, {
'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version', 'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention',
'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
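
To see what relaxing `_VALID_URL` buys, the old and new patterns from this file can be compared against the test URLs in the hunk. This is a standalone sketch using only `re`; `match_id` is a hypothetical stand-in for the extractor's own ID matching.

```python
import re

OLD_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
NEW_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'


def match_id(pattern, url):
    """Return the 'id' group if the pattern matches at the start of url."""
    match = re.match(pattern, url)
    return match.group('id') if match else None


WETV_URL = ('http://www.wetv.com/shows/mama-june-from-not-to-hot/'
            'full-episode/season-01/thin-tervention')
IFC_URL = 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1'
```

The new pattern accepts any number of path segments under `shows/` and takes the last one as the ID, so it no longer depends on an `episode-NN` segment being present.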

View File

@ -1,6 +1,7 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import codecs
import re import re
from .common import InfoExtractor from .common import InfoExtractor
@ -96,6 +97,10 @@ class CDAIE(InfoExtractor):
if not video or 'file' not in video: if not video or 'file' not in video:
self.report_warning('Unable to extract %s version information' % version) self.report_warning('Unable to extract %s version information' % version)
return return
if video['file'].startswith('uggc'):
video['file'] = codecs.decode(video['file'], 'rot_13')
if video['file'].endswith('adc.mp4'):
video['file'] = video['file'].replace('adc.mp4', '.mp4')
f = { f = {
'url': video['file'], 'url': video['file'],
} }
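
The decoding step added to the CDA extractor can be sketched outside the extractor. `'uggc'` is `'http'` under ROT13, which is why the prefix check works. `decode_file_url` is a hypothetical helper, and the `example.com` URL in the usage line is made up.

```python
import codecs


def decode_file_url(file_url):
    """Undo the ROT13 obfuscation and the 'adc' marker, as the hunk does."""
    # 'uggc' is 'http' under ROT13, so the prefix signals an obfuscated URL.
    if file_url.startswith('uggc'):
        file_url = codecs.decode(file_url, 'rot_13')
    # Mirror the hunk: drop the 'adc' marker that precedes the extension.
    if file_url.endswith('adc.mp4'):
        file_url = file_url.replace('adc.mp4', '.mp4')
    return file_url


obfuscated = codecs.encode('http://example.com/clipadc.mp4', 'rot_13')
print(decode_file_url(obfuscated))
```

ROT13 only rotates ASCII letters, so digits and punctuation in the URL pass through unchanged.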

View File

@ -2010,7 +2010,7 @@ class InfoExtractor(object):
}) })
         return formats
-    def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None):
+    def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
         def absolute_url(video_url):
             return compat_urlparse.urljoin(base_url, video_url)
@ -2032,7 +2032,8 @@ class InfoExtractor(object):
                 is_plain_url = False
                 formats = self._extract_m3u8_formats(
                     full_url, video_id, ext='mp4',
-                    entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id)
+                    entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id,
+                    preference=preference)
             elif ext == 'mpd':
                 is_plain_url = False
                 formats = self._extract_mpd_formats(

View File

@ -207,6 +207,21 @@ class CrunchyrollIE(CrunchyrollBaseIE):
# Just test metadata extraction # Just test metadata extraction
'skip_download': True, 'skip_download': True,
}, },
}, {
# make sure we can extract an uploader name that's not a link
'url': 'http://www.crunchyroll.com/hakuoki-reimeiroku/episode-1-dawn-of-the-divine-warriors-606899',
'info_dict': {
'id': '606899',
'ext': 'mp4',
'title': 'Hakuoki Reimeiroku Episode 1 Dawn of the Divine Warriors',
'description': 'Ryunosuke was left to die, but Serizawa-san asked him a simple question "Do you want to live?"',
'uploader': 'Geneon Entertainment',
'upload_date': '20120717',
},
'params': {
# just test metadata extraction
'skip_download': True,
},
}] }]
_FORMAT_IDS = { _FORMAT_IDS = {
@ -388,8 +403,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
         if video_upload_date:
             video_upload_date = unified_strdate(video_upload_date)
         video_uploader = self._html_search_regex(
-            r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', webpage,
-            'video_uploader', fatal=False)
+            # try looking for both an uploader that's a link and one that's not
+            [r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
+            webpage, 'video_uploader', fatal=False)
         available_fmts = []
         for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):
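
The uploader change relies on trying several patterns in order and taking the first hit. The sketch below reproduces that idea with plain `re`. `search_first` is a hypothetical stand-in and does not replicate the HTML cleaning that `_html_search_regex` presumably performs, and the sample markup in the usage line is invented.

```python
import re

# The two patterns from the hunk: a linked publisher, then a plain-text one.
UPLOADER_PATTERNS = [
    r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>',
    r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>',
]


def search_first(patterns, text):
    """Return group 1 of the first pattern that matches, else None."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1).strip()
    return None


plain = '<div>\n Publisher:\n <span>Geneon Entertainment</span>\n</div>'
print(search_first(UPLOADER_PATTERNS, plain))
```

Ordering matters here: the link pattern is tried first, so pages that still use a publisher link behave as before.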

View File

@ -282,9 +282,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
} }
     def _check_error(self, info):
+        error = info.get('error')
         if info.get('error') is not None:
+            title = error['title']
+            # See https://developer.dailymotion.com/api#access-error
+            if error.get('code') == 'DM007':
+                self.raise_geo_restricted(msg=title)
             raise ExtractorError(
-                '%s said: %s' % (self.IE_NAME, info['error']['title']), expected=True)
+                '%s said: %s' % (self.IE_NAME, title), expected=True)
     def _get_subtitles(self, video_id, webpage):
         try:

View File

@ -0,0 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class ETOnlineIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?etonline\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.etonline.com/tv/211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale/',
'info_dict': {
'id': '211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale',
'title': 'md5:a21ec7d3872ed98335cbd2a046f34ee6',
'description': 'md5:8b94484063f463cca709617c79618ccd',
},
'playlist_count': 2,
}, {
'url': 'http://www.etonline.com/media/video/here_are_the_stars_who_love_bringing_their_moms_as_dates_to_the_oscars-211359/',
'only_matching': True,
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % video_id, 'BrightcoveNew', video_id)
for video_id in re.findall(
r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)]
return self.playlist_result(
entries, playlist_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage))
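
The core of the new ETOnline extractor is the `re.findall` over the page source. A standalone sketch of that step follows. The sample `webpage` string is invented markup chosen to fit the regex, not a capture of the real site, and `brightcove_urls` is a hypothetical helper.

```python
import re

BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'


def brightcove_urls(webpage):
    """Collect one Brightcove player URL per embedded reference ID."""
    return [
        BRIGHTCOVE_URL_TEMPLATE % video_id
        for video_id in re.findall(
            r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)]


# Invented sample markup with two embeds, one per quoting style.
webpage = "site.brightcove('player', 'title_211130'); site.brightcove(\"player\", \"title_42\")"
for player_url in brightcove_urls(webpage):
    print(player_url)
```

In the extractor each URL is wrapped with `self.url_result(...)` so that the `BrightcoveNew` extractor handles the actual media.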

View File

@ -288,6 +288,7 @@ from .espn import (
ESPNArticleIE, ESPNArticleIE,
) )
from .esri import EsriVideoIE from .esri import EsriVideoIE
from .etonline import ETOnlineIE
from .europa import EuropaIE from .europa import EuropaIE
from .everyonesmixtape import EveryonesMixtapeIE from .everyonesmixtape import EveryonesMixtapeIE
from .expotv import ExpoTVIE from .expotv import ExpoTVIE
@ -338,6 +339,7 @@ from .francetv import (
) )
from .freesound import FreesoundIE from .freesound import FreesoundIE
from .freespeech import FreespeechIE from .freespeech import FreespeechIE
from .freshlive import FreshLiveIE
from .funimation import FunimationIE from .funimation import FunimationIE
from .funnyordie import FunnyOrDieIE from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE from .fusion import FusionIE
@ -637,6 +639,7 @@ from .ninecninemedia import (
from .ninegag import NineGagIE from .ninegag import NineGagIE
from .ninenow import NineNowIE from .ninenow import NineNowIE
from .nintendo import NintendoIE from .nintendo import NintendoIE
from .njpwworld import NJPWWorldIE
from .nobelprize import NobelPrizeIE from .nobelprize import NobelPrizeIE
from .noco import NocoIE from .noco import NocoIE
from .normalboots import NormalbootsIE from .normalboots import NormalbootsIE
@ -666,6 +669,7 @@ from .npo import (
NPORadioIE, NPORadioIE,
NPORadioFragmentIE, NPORadioFragmentIE,
SchoolTVIE, SchoolTVIE,
HetKlokhuisIE,
VPROIE, VPROIE,
WNLIE, WNLIE,
) )
@ -835,7 +839,6 @@ from .safari import (
from .sapo import SapoIE from .sapo import SapoIE
from .savefrom import SaveFromIE from .savefrom import SaveFromIE
from .sbs import SBSIE from .sbs import SBSIE
from .scivee import SciVeeIE
from .screencast import ScreencastIE from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE from .screencastomatic import ScreencastOMaticIE
from .scrippsnetworks import ScrippsNetworksWatchIE from .scrippsnetworks import ScrippsNetworksWatchIE

View File

@ -0,0 +1,84 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
try_get,
unified_timestamp,
)
class FreshLiveIE(InfoExtractor):
_VALID_URL = r'https?://freshlive\.tv/[^/]+/(?P<id>\d+)'
_TEST = {
'url': 'https://freshlive.tv/satotv/74712',
'md5': '9f0cf5516979c4454ce982df3d97f352',
'info_dict': {
'id': '74712',
'ext': 'mp4',
'title': 'テスト',
'description': 'テスト',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1511,
'timestamp': 1483619655,
'upload_date': '20170105',
'uploader': 'サトTV',
'uploader_id': 'satotv',
'view_count': int,
'comment_count': int,
'is_live': False,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
options = self._parse_json(
self._search_regex(
r'window\.__CONTEXT__\s*=\s*({.+?});\s*</script>',
webpage, 'initial context'),
video_id)
info = options['context']['dispatcher']['stores']['ProgramStore']['programs'][video_id]
title = info['title']
if info.get('status') == 'upcoming':
raise ExtractorError('Stream %s is upcoming' % video_id, expected=True)
stream_url = info.get('liveStreamUrl') or info['archiveStreamUrl']
is_live = info.get('liveStreamUrl') is not None
formats = self._extract_m3u8_formats(
stream_url, video_id, ext='mp4',
entry_protocol='m3u8' if is_live else 'm3u8_native',
m3u8_id='hls')
if is_live:
title = self._live_title(title)
return {
'id': video_id,
'formats': formats,
'title': title,
'description': info.get('description'),
'thumbnail': info.get('thumbnailUrl'),
'duration': int_or_none(info.get('airTime')),
'timestamp': unified_timestamp(info.get('createdAt')),
'uploader': try_get(
info, lambda x: x['channel']['title'], compat_str),
'uploader_id': try_get(
info, lambda x: x['channel']['code'], compat_str),
'uploader_url': try_get(
info, lambda x: x['channel']['permalink'], compat_str),
'view_count': int_or_none(info.get('viewCount')),
'comment_count': int_or_none(info.get('commentCount')),
'tags': info.get('tags', []),
'is_live': is_live,
}
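
The `try_get` calls above read nested, optional fields without raising. The following is a rough sketch of that idea under stated assumptions: `safe_get` is a hypothetical helper written for illustration and is not youtube-dl's implementation of `try_get`.

```python
def safe_get(src, getter, expected_type=None):
    """Apply getter to src; return None on lookup failure or type mismatch."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        # A missing key or a None along the path means "no value".
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


info = {'channel': {'code': 'satotv'}}
print(safe_get(info, lambda x: x['channel']['code'], str))
```

The type check matters for metadata fields: a value of an unexpected type is treated the same as a missing one, so it never reaches the result dictionary.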

View File

@ -16,6 +16,8 @@ class IviIE(InfoExtractor):
IE_DESC = 'ivi.ru' IE_DESC = 'ivi.ru'
IE_NAME = 'ivi' IE_NAME = 'ivi'
_VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['RU']
_TESTS = [ _TESTS = [
# Single movie # Single movie
@ -91,7 +93,11 @@ class IviIE(InfoExtractor):
         if 'error' in video_json:
             error = video_json['error']
-            if error['origin'] == 'NoRedisValidData':
+            origin = error['origin']
+            if origin == 'NotAllowedForLocation':
+                self.raise_geo_restricted(
+                    msg=error['message'], countries=self._GEO_COUNTRIES)
+            elif origin == 'NoRedisValidData':
                 raise ExtractorError('Video %s does not exist' % video_id, expected=True)
             raise ExtractorError(
                 'Unable to download video %s: %s' % (video_id, error['message']),

View File

@ -14,7 +14,7 @@ from ..utils import (
 class MDRIE(InfoExtractor):
     IE_DESC = 'MDR.DE and KiKA'
-    _VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z]+-?(?P<id>\d+)(?:_.+?)?\.html'
+    _VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
_TESTS = [{ _TESTS = [{
# MDR regularly deletes its videos # MDR regularly deletes its videos
@ -31,6 +31,7 @@ class MDRIE(InfoExtractor):
'duration': 250, 'duration': 250,
'uploader': 'MITTELDEUTSCHER RUNDFUNK', 'uploader': 'MITTELDEUTSCHER RUNDFUNK',
}, },
'skip': '404 not found',
}, { }, {
'url': 'http://www.kika.de/baumhaus/videos/video19636.html', 'url': 'http://www.kika.de/baumhaus/videos/video19636.html',
'md5': '4930515e36b06c111213e80d1e4aad0e', 'md5': '4930515e36b06c111213e80d1e4aad0e',
@ -41,6 +42,7 @@ class MDRIE(InfoExtractor):
'duration': 134, 'duration': 134,
'uploader': 'KIKA', 'uploader': 'KIKA',
}, },
'skip': '404 not found',
}, { }, {
'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html', 'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html',
'md5': '5fe9c4dd7d71e3b238f04b8fdd588357', 'md5': '5fe9c4dd7d71e3b238f04b8fdd588357',
@ -49,11 +51,21 @@ class MDRIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Beutolomäus und der geheime Weihnachtswunsch',
             'description': 'md5:b69d32d7b2c55cbe86945ab309d39bbd',
-            'timestamp': 1450950000,
-            'upload_date': '20151224',
+            'timestamp': 1482541200,
+            'upload_date': '20161224',
             'duration': 4628,
             'uploader': 'KIKA',
         },
+    }, {
+        # audio with alternative playerURL pattern
+        'url': 'http://www.mdr.de/kultur/videos-und-audios/audio-radio/operation-mindfuck-robert-wilson100.html',
+        'info_dict': {
+            'id': '100',
+            'ext': 'mp4',
+            'title': 'Feature: Operation Mindfuck - Robert Anton Wilson',
+            'duration': 3239,
+            'uploader': 'MITTELDEUTSCHER RUNDFUNK',
+        },
     }, {
         'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
         'only_matching': True,
@ -71,7 +83,7 @@ class MDRIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)
         data_url = self._search_regex(
-            r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+/(?:video|audio)-?[0-9]+-avCustom\.xml)\1',
+            r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+?-avCustom\.xml)\1',
             webpage, 'data url', group='url').replace(r'\/', '/')
         doc = self._download_xml(

View File

@ -0,0 +1,83 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
get_element_by_class,
urlencode_postdata,
)
class NJPWWorldIE(InfoExtractor):
_VALID_URL = r'https?://njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
IE_DESC = '新日本プロレスワールド'
_NETRC_MACHINE = 'njpwworld'
_TEST = {
'url': 'http://njpwworld.com/p/s_series_00155_1_9/',
'info_dict': {
'id': 's_series_00155_1_9',
'ext': 'mp4',
'title': '第9試合 ランディ・サベージ vs リック・スタイナー',
'tags': list,
},
'params': {
'skip_download': True, # AES-encrypted m3u8
},
'skip': 'Requires login',
}
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
# No authentication to be performed
if not username:
return True
webpage, urlh = self._download_webpage_handle(
'https://njpwworld.com/auth/login', None,
note='Logging in', errnote='Unable to login',
data=urlencode_postdata({'login_id': username, 'pw': password}))
# /auth/login will return 302 for successful logins
if urlh.geturl() == 'https://njpwworld.com/auth/login':
self.report_warning('unable to login')
return False
return True
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
formats = []
for player_url, kind in re.findall(r'<a[^>]+href="(/player[^"]+)".+?<img[^>]+src="[^"]+qf_btn_([^".]+)', webpage):
player_url = compat_urlparse.urljoin(url, player_url)
player_page = self._download_webpage(
player_url, video_id, note='Downloading player page')
entries = self._parse_html5_media_entries(
player_url, player_page, video_id, m3u8_id='hls-%s' % kind,
m3u8_entry_protocol='m3u8_native',
preference=2 if 'hq' in kind else 1)
formats.extend(entries[0]['formats'])
self._sort_formats(formats)
post_content = get_element_by_class('post-content', webpage)
tags = re.findall(
r'<li[^>]+class="tag-[^"]+"><a[^>]*>([^<]+)</a></li>', post_content
) if post_content else None
return {
'id': video_id,
'title': self._og_search_title(webpage),
'formats': formats,
'tags': tags,
}

View File

@ -23,7 +23,7 @@ from ..utils import (
 class NocoIE(InfoExtractor):
     _VALID_URL = r'https?://(?:(?:www\.)?noco\.tv/emission/|player\.noco\.tv/\?idvideo=)(?P<id>\d+)'
-    _LOGIN_URL = 'http://noco.tv/do.php'
+    _LOGIN_URL = 'https://noco.tv/do.php'
     _API_URL_TEMPLATE = 'https://api.noco.tv/1.1/%s?ts=%s&tk=%s'
     _SUB_LANG_TEMPLATE = '&sub_lang=%s'
     _NETRC_MACHINE = 'noco'
@ -69,16 +69,17 @@ class NocoIE(InfoExtractor):
         if username is None:
             return
-        login_form = {
-            'a': 'login',
-            'cookie': '1',
-            'username': username,
-            'password': password,
-        }
-        request = sanitized_Request(self._LOGIN_URL, urlencode_postdata(login_form))
-        request.add_header('Content-Type', 'application/x-www-form-urlencoded; charset=UTF-8')
-        login = self._download_json(request, None, 'Logging in as %s' % username)
+        login = self._download_json(
+            self._LOGIN_URL, None, 'Logging in as %s' % username,
+            data=urlencode_postdata({
+                'a': 'login',
+                'cookie': '1',
+                'username': username,
+                'password': password,
+            }),
+            headers={
+                'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
+            })
         if 'erreur' in login:
             raise ExtractorError('Unable to login: %s' % clean_html(login['erreur']), expected=True)

View File

@ -51,7 +51,8 @@ class NPOIE(NPOBaseIE):
(?: (?:
npo\.nl/(?!live|radio)(?:[^/]+/){2}| npo\.nl/(?!live|radio)(?:[^/]+/){2}|
ntr\.nl/(?:[^/]+/){2,}| ntr\.nl/(?:[^/]+/){2,}|
omroepwnl\.nl/video/fragment/[^/]+__ omroepwnl\.nl/video/fragment/[^/]+__|
zapp\.nl/[^/]+/[^/]+/
) )
) )
(?P<id>[^/?#]+) (?P<id>[^/?#]+)
@ -140,6 +141,18 @@ class NPOIE(NPOBaseIE):
'upload_date': '20150508', 'upload_date': '20150508',
'duration': 462, 'duration': 462,
}, },
},
{
'url': 'http://www.zapp.nl/de-bzt-show/gemist/KN_1687547',
'only_matching': True,
},
{
'url': 'http://www.zapp.nl/de-bzt-show/filmpjes/POMS_KN_7315118',
'only_matching': True,
},
{
'url': 'http://www.zapp.nl/beste-vrienden-quiz/extra-video-s/WO_NTR_1067990',
'only_matching': True,
} }
] ]
@ -416,7 +429,21 @@ class NPORadioFragmentIE(InfoExtractor):
} }
-class SchoolTVIE(InfoExtractor):
+class NPODataMidEmbedIE(InfoExtractor):
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        video_id = self._search_regex(
+            r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
+        return {
+            '_type': 'url_transparent',
+            'ie_key': 'NPO',
+            'url': 'npo:%s' % video_id,
+            'display_id': display_id
+        }
+class SchoolTVIE(NPODataMidEmbedIE):
     IE_NAME = 'schooltv'
     _VALID_URL = r'https?://(?:www\.)?schooltv\.nl/video/(?P<id>[^/?#&]+)'
@ -435,17 +462,25 @@ class SchoolTVIE(InfoExtractor):
} }
} }
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        video_id = self._search_regex(
-            r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
-        return {
-            '_type': 'url_transparent',
-            'ie_key': 'NPO',
-            'url': 'npo:%s' % video_id,
-            'display_id': display_id
-        }
+class HetKlokhuisIE(NPODataMidEmbedIE):
+    IE_NAME = 'hetklokhuis'
+    _VALID_URL = r'https?://(?:www\.)?hetklokhuis.nl/[^/]+/\d+/(?P<id>[^/?#&]+)'
+    _TEST = {
+        'url': 'http://hetklokhuis.nl/tv-uitzending/3471/Zwaartekrachtsgolven',
+        'info_dict': {
+            'id': 'VPWON_1260528',
+            'display_id': 'Zwaartekrachtsgolven',
+            'ext': 'm4v',
+            'title': 'Het Klokhuis: Zwaartekrachtsgolven',
+            'description': 'md5:c94f31fb930d76c2efa4a4a71651dd48',
+            'upload_date': '20170223',
+        },
+        'params': {
+            'skip_download': True
+        }
+    }
 class NPOPlaylistBaseIE(NPOIE):
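
The shared `_real_extract` that this refactoring moves into `NPODataMidEmbedIE` hinges on one regex: it captures a `data-mid` attribute value quoted with either quote style, using a backreference to find the matching closing quote. A standalone sketch follows. `find_mid` is a hypothetical helper and the sample markup is invented.

```python
import re

# (["\']) captures the opening quote; (?!\1). consumes characters until the
# same quote character appears again; the final \1 matches the closing quote.
DATA_MID_RE = r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1'


def find_mid(webpage):
    """Return the data-mid attribute value, or None if it is absent."""
    match = re.search(DATA_MID_RE, webpage)
    return match.group('id') if match else None


print(find_mid('<div class="player" data-mid="VPWON_1260528"></div>'))
```

In the extractor the captured ID is then handed to the NPO extractor through a `url_transparent` result with the URL `'npo:%s' % video_id`.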

View File

@ -72,16 +72,21 @@ class OpenloadIE(InfoExtractor):
             raise ExtractorError('File not found', expected=True)
         ol_id = self._search_regex(
-            '<span[^>]+id="[^"]+"[^>]*>([0-9]+)</span>',
+            '<span[^>]+id="[^"]+"[^>]*>([0-9A-Za-z]+)</span>',
             webpage, 'openload ID')
-        first_two_chars = int(float(ol_id[0:][:2]))
+        first_char = int(ol_id[0])
         urlcode = []
-        num = 2
+        num = 1
         while num < len(ol_id):
-            key = int(float(ol_id[num + 3:][:2]))
-            urlcode.append((key, compat_chr(int(float(ol_id[num:][:3])) - first_two_chars)))
+            i = ord(ol_id[num])
+            key = 0
+            if i <= 90:
+                key = i - 65
+            elif i >= 97:
+                key = 25 + i - 97
+            urlcode.append((key, compat_chr(int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char)))
             num += 5
         video_url = 'https://openload.co/stream/' + ''.join(

View File

@@ -1,57 +0,0 @@
- from __future__ import unicode_literals
- import re
- from .common import InfoExtractor
- from ..utils import int_or_none
- class SciVeeIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?scivee\.tv/node/(?P<id>\d+)'
- _TEST = {
- 'url': 'http://www.scivee.tv/node/62352',
- 'md5': 'b16699b74c9e6a120f6772a44960304f',
- 'info_dict': {
- 'id': '62352',
- 'ext': 'mp4',
- 'title': 'Adam Arkin at the 2014 DOE JGI Genomics of Energy & Environment Meeting',
- 'description': 'md5:81f1710638e11a481358fab1b11059d7',
- },
- 'skip': 'Not accessible from Travis CI server',
- }
- def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- video_id = mobj.group('id')
- # annotations XML is malformed
- annotations = self._download_webpage(
- 'http://www.scivee.tv/assets/annotations/%s' % video_id, video_id, 'Downloading annotations')
- title = self._html_search_regex(r'<title>([^<]+)</title>', annotations, 'title')
- description = self._html_search_regex(r'<abstract>([^<]+)</abstract>', annotations, 'abstract', fatal=False)
- filesize = int_or_none(self._html_search_regex(
- r'<filesize>([^<]+)</filesize>', annotations, 'filesize', fatal=False))
- formats = [
- {
- 'url': 'http://www.scivee.tv/assets/audio/%s' % video_id,
- 'ext': 'mp3',
- 'format_id': 'audio',
- },
- {
- 'url': 'http://www.scivee.tv/assets/video/%s' % video_id,
- 'ext': 'mp4',
- 'format_id': 'video',
- 'filesize': filesize,
- },
- ]
- return {
- 'id': video_id,
- 'title': title,
- 'description': description,
- 'thumbnail': 'http://www.scivee.tv/assets/videothumb/%s' % video_id,
- 'formats': formats,
- }


@@ -3,7 +3,10 @@ from __future__ import unicode_literals
  from .common import InfoExtractor
  from ..compat import compat_urlparse
- from ..utils import qualities
+ from ..utils import (
+ int_or_none,
+ qualities,
+ )
  class TheSceneIE(InfoExtractor):
@@ -16,6 +19,11 @@ class TheSceneIE(InfoExtractor):
  'ext': 'mp4',
  'title': 'Narciso Rodriguez: Spring 2013 Ready-to-Wear',
  'display_id': 'narciso-rodriguez-spring-2013-ready-to-wear',
+ 'duration': 127,
+ 'series': 'Style.com Fashion Shows',
+ 'season': 'Ready To Wear Spring 2013',
+ 'tags': list,
+ 'categories': list,
  },
  }
@@ -32,21 +40,29 @@ class TheSceneIE(InfoExtractor):
  player = self._download_webpage(player_url, display_id)
  info = self._parse_json(
  self._search_regex(
- r'(?m)var\s+video\s+=\s+({.+?});$', player, 'info json'),
+ r'(?m)video\s*:\s*({.+?}),$', player, 'info json'),
  display_id)
+ video_id = info['id']
+ title = info['title']
  qualities_order = qualities(('low', 'high'))
  formats = [{
  'format_id': '{0}-{1}'.format(f['type'].split('/')[0], f['quality']),
  'url': f['src'],
  'quality': qualities_order(f['quality']),
- } for f in info['sources'][0]]
+ } for f in info['sources']]
  self._sort_formats(formats)
  return {
- 'id': info['id'],
+ 'id': video_id,
  'display_id': display_id,
- 'title': info['title'],
+ 'title': title,
  'formats': formats,
  'thumbnail': info.get('poster_frame'),
+ 'duration': int_or_none(info.get('duration')),
+ 'series': info.get('series_title'),
+ 'season': info.get('season_title'),
+ 'tags': info.get('tags'),
+ 'categories': info.get('categories'),
  }
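To see what the new `video\s*:\s*({.+?}),$` pattern picks up, here is a small sketch using only `re` and `json`. The player snippet is invented for illustration (the real page layout is not shown in this diff), and `extract_video_info` and `build_format_ids` are hypothetical helpers rather than extractor methods.

```python
import json
import re

# Invented sample: the video object sits on one line and ends with a comma,
# which is the shape the new regex expects.
PLAYER_SAMPLE = '''
var config = {
    video: {"id": "abc123", "title": "Sample", "sources": [{"src": "https://example.com/v.mp4", "type": "video/mp4", "quality": "high"}]},
    autoplay: true
};
'''


def extract_video_info(player):
    """Find the `video: {...},` line and parse the object as JSON."""
    match = re.search(r'(?m)video\s*:\s*({.+?}),$', player)
    if match is None:
        raise ValueError('info json not found')
    return json.loads(match.group(1))


def build_format_ids(info):
    """Iterate info['sources'] directly, as the new side of the hunk does."""
    return ['{0}-{1}'.format(f['type'].split('/')[0], f['quality'])
            for f in info['sources']]


info = extract_video_info(PLAYER_SAMPLE)
print(info['id'], build_format_ids(info))
```

The lazy `.+?` only stops at a `}` that is followed by a comma at the end of a line, so the nested source objects do not cut the match short in this sample.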


@@ -16,6 +16,7 @@ class TubiTvIE(InfoExtractor):
  _VALID_URL = r'https?://(?:www\.)?tubitv\.com/video/(?P<id>[0-9]+)'
  _LOGIN_URL = 'http://tubitv.com/login'
  _NETRC_MACHINE = 'tubitv'
+ _GEO_COUNTRIES = ['US']
  _TEST = {
  'url': 'http://tubitv.com/video/283829/the_comedian_at_the_friday',
  'md5': '43ac06be9326f41912dc64ccf7a80320',


@@ -17,6 +17,9 @@ class TvigleIE(InfoExtractor):
  IE_DESC = 'Интернет-телевидение Tvigle.ru'
  _VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'
+ _GEO_BYPASS = False
+ _GEO_COUNTRIES = ['RU']
  _TESTS = [
  {
  'url': 'http://www.tvigle.ru/video/sokrat/',
@@ -72,8 +75,13 @@ class TvigleIE(InfoExtractor):
  error_message = item.get('errorMessage')
  if not videos and error_message:
- raise ExtractorError(
- '%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
+ if item.get('isGeoBlocked') is True:
+ self.raise_geo_restricted(
+ msg=error_message, countries=self._GEO_COUNTRIES)
+ else:
+ raise ExtractorError(
+ '%s returned error: %s' % (self.IE_NAME, error_message),
+ expected=True)
  title = item['title']
  description = item.get('description')


@@ -17,12 +17,12 @@ from ..utils import (
  class VevoBaseIE(InfoExtractor):
- def _extract_json(self, webpage, video_id, item):
+ def _extract_json(self, webpage, video_id):
  return self._parse_json(
  self._search_regex(
  r'window\.__INITIAL_STORE__\s*=\s*({.+?});\s*</script>',
  webpage, 'initial store'),
- video_id)['default'][item]
+ video_id)
  class VevoIE(VevoBaseIE):
@@ -139,6 +139,11 @@ class VevoIE(VevoBaseIE):
  # no genres available
  'url': 'http://www.vevo.com/watch/INS171400764',
  'only_matching': True,
+ }, {
+ # Another case available only via the webpage; using streams/streamsV3 formats
+ # Geo-restricted to Netherlands/Germany
+ 'url': 'http://www.vevo.com/watch/boostee/pop-corn-clip-officiel/FR1A91600909',
+ 'only_matching': True,
  }]
  _VERSIONS = {
  0: 'youtube', # only in AuthenticateVideo videoVersions
@@ -193,7 +198,14 @@ class VevoIE(VevoBaseIE):
  # https://github.com/rg3/youtube-dl/issues/9366)
  if not video_versions:
  webpage = self._download_webpage(url, video_id)
- video_versions = self._extract_json(webpage, video_id, 'streams')[video_id][0]
+ json_data = self._extract_json(webpage, video_id)
+ if 'streams' in json_data.get('default', {}):
+ video_versions = json_data['default']['streams'][video_id][0]
+ else:
+ video_versions = [
+ value
+ for key, value in json_data['apollo']['data'].items()
+ if key.startswith('%s.streams' % video_id)]
  uploader = None
  artist = None
@@ -207,7 +219,7 @@ class VevoIE(VevoBaseIE):
  formats = []
  for video_version in video_versions:
- version = self._VERSIONS.get(video_version['version'])
+ version = self._VERSIONS.get(video_version.get('version'), 'generic')
  version_url = video_version.get('url')
  if not version_url:
  continue
@@ -339,7 +351,7 @@ class VevoPlaylistIE(VevoBaseIE):
  if video_id:
  return self.url_result('vevo:%s' % video_id, VevoIE.ie_key())
- playlists = self._extract_json(webpage, playlist_id, '%ss' % playlist_kind)
+ playlists = self._extract_json(webpage, playlist_id)['default']['%ss' % playlist_kind]
  playlist = (list(playlists.values())[0]
  if playlist_kind == 'playlist' else playlists[playlist_id])
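The fallback added in the `-193,7 +198,14` hunk can be exercised on plain dictionaries. The sketch below lifts that branch into a hypothetical free function, `pick_video_versions`. Both sample stores are invented, and the `FR1A91600909.streams.0` key shape is only an assumption consistent with the `startswith('%s.streams' % video_id)` check in the diff.

```python
def pick_video_versions(json_data, video_id):
    """Prefer default.streams; otherwise collect matching apollo.data entries."""
    if 'streams' in json_data.get('default', {}):
        return json_data['default']['streams'][video_id][0]
    return [
        value
        for key, value in json_data['apollo']['data'].items()
        if key.startswith('%s.streams' % video_id)]


# Invented store in the older layout.
old_store = {'default': {'streams': {'VID1': [[{'version': 2, 'url': 'u1'}]]}}}

# Invented store in the newer layout; the key names are assumptions.
new_store = {
    'default': {},
    'apollo': {'data': {
        'VID1.streams.0': {'url': 'u2'},
        'VID1.artists.0': {'name': 'someone'},
    }},
}

print(pick_video_versions(old_store, 'VID1'))
print(pick_video_versions(new_store, 'VID1'))
```

Entries from the newer layout carry no `version` key in this sample, which is the case the `video_version.get('version')` change with its `'generic'` default appears to cover.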


@@ -5,6 +5,7 @@ import re
  from .common import InfoExtractor
  from ..utils import (
  dict_get,
+ ExtractorError,
  int_or_none,
  parse_duration,
  unified_strdate,
@@ -57,6 +58,10 @@ class XHamsterIE(InfoExtractor):
  }, {
  'url': 'https://xhamster.com/movies/2272726/amber_slayed_by_the_knight.html',
  'only_matching': True,
+ }, {
+ # This video is visible for marcoalfa123456's friends only
+ 'url': 'https://it.xhamster.com/movies/7263980/la_mia_vicina.html',
+ 'only_matching': True,
  }]
  def _real_extract(self, url):
@@ -78,6 +83,12 @@ class XHamsterIE(InfoExtractor):
  mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo)
  webpage = self._download_webpage(mrss_url, video_id)
+ error = self._html_search_regex(
+ r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>',
+ webpage, 'error', default=None)
+ if error:
+ raise ExtractorError(error, expected=True)
  title = self._html_search_regex(
  [r'<h1[^>]*>([^<]+)</h1>',
  r'<meta[^>]+itemprop=".*?caption.*?"[^>]+content="(.+?)"',
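The `videoClosed` check added in the last hunk can be approximated with plain `re`, as a rough sketch. `find_closed_message` is a hypothetical helper and both HTML snippets are invented. Unlike `_html_search_regex`, it does no HTML clean-up beyond stripping whitespace.

```python
import re

CLOSED_RE = r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>'


def find_closed_message(webpage):
    """Return the text of the videoClosed div, or None when it is absent."""
    match = re.search(CLOSED_RE, webpage)
    return match.group(1).strip() if match else None


# Invented pages: one with the closed-video notice, one without.
closed_page = '<div class="box" id="videoClosed"> Visible for friends only </div>'
open_page = '<h1>Some title</h1>'

print(find_closed_message(closed_page))
print(find_closed_message(open_page))
```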


@@ -47,7 +47,6 @@ from ..utils import (
  unsmuggle_url,
  uppercase_escape,
  urlencode_postdata,
- ISO3166Utils,
  )
@@ -371,6 +370,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
  }
  _SUBTITLE_FORMATS = ('ttml', 'vtt')
+ _GEO_BYPASS = False
  IE_NAME = 'youtube'
  _TESTS = [
  {
@@ -917,7 +918,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
  # itag 212
  'url': '1t24XAntNCY',
  'only_matching': True,
- }
+ },
+ {
+ # geo restricted to JP
+ 'url': 'sJL6WA-aGkQ',
+ 'only_matching': True,
+ },
  ]
  def __init__(self, *args, **kwargs):
@@ -1376,11 +1382,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
  if 'token' not in video_info:
  if 'reason' in video_info:
  if 'The uploader has not made this video available in your country.' in video_info['reason']:
- regions_allowed = self._html_search_meta('regionsAllowed', video_webpage, default=None)
- if regions_allowed:
- raise ExtractorError('YouTube said: This video is available in %s only' % (
- ', '.join(map(ISO3166Utils.short2full, regions_allowed.split(',')))),
- expected=True)
+ regions_allowed = self._html_search_meta(
+ 'regionsAllowed', video_webpage, default=None)
+ countries = regions_allowed.split(',') if regions_allowed else None
+ self.raise_geo_restricted(
+ msg=video_info['reason'][0], countries=countries)
  raise ExtractorError(
  'YouTube said: %s' % video_info['reason'][0],
  expected=True, video_id=video_id)
@@ -2226,7 +2232,7 @@ class YoutubeUserIE(YoutubeChannelIE):
  'url': 'https://www.youtube.com/gametrailers',
  'only_matching': True,
  }, {
- # This channel is not available.
+ # This channel is not available, geo restricted to JP
  'url': 'https://www.youtube.com/user/kananishinoSMEJ/videos',
  'only_matching': True,
  }]


@@ -1,3 +1,3 @@
  from __future__ import unicode_literals
- __version__ = '2017.02.24'
+ __version__ = '2017.02.27'