Compare commits

...

33 Commits

Author SHA1 Message Date
ef48a1175d release 2017.02.27 2017-02-27 23:26:07 +07:00
c6184bcf7b [ChangeLog] Actualize 2017-02-27 23:24:03 +07:00
18abb74376 [npo] Relax _VALID_URL for zapp.nl 2017-02-27 23:13:51 +07:00
dbc01fdb6f [hetklokhuis] Fix IE_NAME 2017-02-27 23:10:29 +07:00
f264c62334 [npo] Add support for zapp.nl 2017-02-27 23:10:00 +07:00
0dc5a86a32 [npo] Add support for hetklokhuis.nl (closes #12293) 2017-02-27 22:43:19 +07:00
0e879f432a [youtube:channel] Remove duplicate test 2017-02-27 22:22:43 +07:00
892b47ab6c [scivee] Remove extractor (#9315)
The Wikipedia page is changed from active to down:
https://en.wikipedia.org/w/index.php?title=SciVee&diff=prev&oldid=723161154

Some other interesting bits:

$ nslookup www.scivee.tv
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
www.scivee.tv   canonical name = scivee.rcsb.org.
Name:   scivee.rcsb.org
Address: 132.249.231.211

$ nslookup rcsb.org
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   rcsb.org
Address: 132.249.231.77

Both IPs are from UCSD. I guess it's maintained by a lab and they don't
maintain it anymore.
2017-02-27 21:34:33 +08:00
fdeea72611 [cda] Decode URL (fixes #12255) 2017-02-26 22:05:52 +08:00
xbe
7fd4655256 [crunchyroll] Extract uploader name that's not a link
Provide the Crunchyroll extractor with the ability to extract uploader
names that aren't links. Add a test for this new functionality.
This fixes #12267.
2017-02-26 19:08:10 +08:00
fd5c4aab59 [youtube] Raise GeoRestrictedError 2017-02-26 16:52:40 +07:00
8878789f11 [dailymotion] Raise GeoRestrictedError 2017-02-26 16:52:40 +07:00
a5cf17989b [MDR] Relax _VALID_URL and playerURL matching and update _TESTS
Ref: #12169
2017-02-26 17:24:54 +08:00
b3aec47665 [tvigle] Raise GeoRestrictedError 2017-02-25 23:27:45 +07:00
9d0c08a02c [vevo] Fix videos with the new streams/streamsV3 format (closes #11719) 2017-02-26 00:15:49 +08:00
e498758b9c [freshlive] Fix issues and improve (closes #12175) 2017-02-25 22:56:42 +07:00
5fc8d89361 [freshlive] Add extractor 2017-02-25 22:55:17 +07:00
d374d943f3 [downloader/common] Limit displaying 2 digits after decimal point in sleep interval message 2017-02-25 20:59:04 +07:00
103f8c8d36 [xhamster] Capture and output videoClosed error (#12263) 2017-02-25 20:38:21 +07:00
922ab7840b [etonline] Add extractor (closes #12236) 2017-02-25 20:16:40 +07:00
831217291a [compat] Use try except for compat_numeric_types 2017-02-25 19:44:50 +07:00
db182c63fb [njpwworld] Add new extractor (closes #11561) 2017-02-25 18:44:39 +08:00
eeb0a95684 [extractor/common] Add 'preference' to _parse_html5_media_entries
Some websites, like NJPWorld, put different qualities on different
player pages.
2017-02-25 18:40:05 +08:00
231bcd0b6b [amcnetworks] Relax _VALID_URL (#12127) 2017-02-25 02:51:53 +07:00
204efc8509 release 2017.02.24.1 2017-02-24 21:59:39 +07:00
5d3a51e1b9 [ChangeLog] Actualize 2017-02-24 21:57:39 +07:00
ad3033037c [noco] Modernize 2017-02-24 21:51:56 +07:00
f3bc281239 [noco] Swtich login URL to https (closes #12246) 2017-02-24 21:48:34 +07:00
441d7a32e5 [thescene] Extract more metadata 2017-02-24 21:22:29 +07:00
51ed496307 [thescene] Fix extraction (closes #12235) 2017-02-24 22:08:45 +08:00
68f17a9c2d [tubitv] use geo bypass mechanism 2017-02-24 12:27:56 +01:00
39e7277ed1 [openload] fix extraction(closes #10408) 2017-02-24 11:21:58 +01:00
42dcdbe11c [ivi] Raise GeoRestrictedError 2017-02-24 10:54:39 +07:00
27 changed files with 469 additions and 125 deletions

View File

@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.24**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.27*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.27**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.02.24
[debug] youtube-dl version 2017.02.27
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@ -1,3 +1,41 @@
version 2017.02.27
Core
* [downloader/common] Limit displaying 2 digits after decimal point in sleep
interval message (#12183)
+ [extractor/common] Add preference to _parse_html5_media_entries
Extractors
+ [npo] Add support for zapp.nl
+ [npo] Add support for hetklokhuis.nl (#12293)
- [scivee] Remove extractor (#9315)
+ [cda] Decode download URL (#12255)
+ [crunchyroll] Improve uploader extraction (#12267)
+ [youtube] Raise GeoRestrictedError
+ [dailymotion] Raise GeoRestrictedError
+ [mdr] Recognize more URL patterns (#12169)
+ [tvigle] Raise GeoRestrictedError
* [vevo] Fix extraction for videos with the new streams/streamsV3 format
(#11719)
+ [freshlive] Add support for freshlive.tv (#12175)
+ [xhamster] Capture and output videoClosed error (#12263)
+ [etonline] Add support for etonline.com (#12236)
+ [njpwworld] Add support for njpwworld.com (#11561)
* [amcnetworks] Relax URL regular expression (#12127)
version 2017.02.24.1
Extractors
* [noco] Modernize
* [noco] Switch login URL to https (#12246)
+ [thescene] Extract more metadata
* [thescene] Fix extraction (#12235)
+ [tubitv] Use geo bypass mechanism
* [openload] Fix extraction (#10408)
+ [ivi] Raise GeoRestrictedError
version 2017.02.24
Core

View File

@ -239,6 +239,7 @@
- **ESPN**
- **ESPNArticle**
- **EsriVideo**
- **ETOnline**
- **Europa**
- **EveryonesMixtape**
- **ExpoTV**
@ -274,6 +275,7 @@
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- **FreshLive**
- **Funimation**
- **FunnyOrDie**
- **Fusion**
@ -310,6 +312,7 @@
- **HellPorno**
- **Helsinki**: helsinki.fi
- **HentaiStigma**
- **hetklokhuis**
- **hgtv.com:show**
- **HistoricFilms**
- **history:topic**: History.com Topic
@ -511,6 +514,7 @@
- **Nintendo**
- **njoy**: N-JOY
- **njoy:embed**
- **NJPWWorld**: 新日本プロレスワールド
- **NobelPrize**
- **Noco**
- **Normalboots**
@ -666,7 +670,6 @@
- **savefrom.net**
- **SBS**: sbs.com.au
- **schooltv**
- **SciVee**
- **screen.yahoo:search**: Yahoo screen search
- **Screencast**
- **ScreencastOMatic**

View File

@ -2760,8 +2760,10 @@ else:
compat_kwargs = lambda kwargs: kwargs
compat_numeric_types = ((int, float, long, complex) if sys.version_info[0] < 3
else (int, float, complex))
try:
compat_numeric_types = (int, float, long, complex)
except NameError: # Python 3
compat_numeric_types = (int, float, complex)
if sys.version_info < (2, 7):

View File

@ -347,7 +347,10 @@ class FileDownloader(object):
if min_sleep_interval:
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
self.to_screen(
'[download] Sleeping %s seconds...' % (
int(sleep_interval) if sleep_interval.is_integer()
else '%.2f' % sleep_interval))
time.sleep(sleep_interval)
return self.real_download(filename, info_dict)

View File

@ -10,7 +10,7 @@ from ..utils import (
class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '',
@ -44,6 +44,12 @@ class AMCNetworksIE(ThePlatformIE):
}, {
'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention',
'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@ -1,6 +1,7 @@
# coding: utf-8
from __future__ import unicode_literals
import codecs
import re
from .common import InfoExtractor
@ -96,6 +97,10 @@ class CDAIE(InfoExtractor):
if not video or 'file' not in video:
self.report_warning('Unable to extract %s version information' % version)
return
if video['file'].startswith('uggc'):
video['file'] = codecs.decode(video['file'], 'rot_13')
if video['file'].endswith('adc.mp4'):
video['file'] = video['file'].replace('adc.mp4', '.mp4')
f = {
'url': video['file'],
}

View File

@ -2010,7 +2010,7 @@ class InfoExtractor(object):
})
return formats
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None):
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
def absolute_url(video_url):
return compat_urlparse.urljoin(base_url, video_url)
@ -2032,7 +2032,8 @@ class InfoExtractor(object):
is_plain_url = False
formats = self._extract_m3u8_formats(
full_url, video_id, ext='mp4',
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id)
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id,
preference=preference)
elif ext == 'mpd':
is_plain_url = False
formats = self._extract_mpd_formats(

View File

@ -207,6 +207,21 @@ class CrunchyrollIE(CrunchyrollBaseIE):
# Just test metadata extraction
'skip_download': True,
},
}, {
# make sure we can extract an uploader name that's not a link
'url': 'http://www.crunchyroll.com/hakuoki-reimeiroku/episode-1-dawn-of-the-divine-warriors-606899',
'info_dict': {
'id': '606899',
'ext': 'mp4',
'title': 'Hakuoki Reimeiroku Episode 1 Dawn of the Divine Warriors',
'description': 'Ryunosuke was left to die, but Serizawa-san asked him a simple question "Do you want to live?"',
'uploader': 'Geneon Entertainment',
'upload_date': '20120717',
},
'params': {
# just test metadata extraction
'skip_download': True,
},
}]
_FORMAT_IDS = {
@ -388,8 +403,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
if video_upload_date:
video_upload_date = unified_strdate(video_upload_date)
video_uploader = self._html_search_regex(
r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', webpage,
'video_uploader', fatal=False)
# try looking for both an uploader that's a link and one that's not
[r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
webpage, 'video_uploader', fatal=False)
available_fmts = []
for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):

View File

@ -282,9 +282,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
}
def _check_error(self, info):
error = info.get('error')
if info.get('error') is not None:
title = error['title']
# See https://developer.dailymotion.com/api#access-error
if error.get('code') == 'DM007':
self.raise_geo_restricted(msg=title)
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, info['error']['title']), expected=True)
'%s said: %s' % (self.IE_NAME, title), expected=True)
def _get_subtitles(self, video_id, webpage):
try:

View File

@ -0,0 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class ETOnlineIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?etonline\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.etonline.com/tv/211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale/',
'info_dict': {
'id': '211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale',
'title': 'md5:a21ec7d3872ed98335cbd2a046f34ee6',
'description': 'md5:8b94484063f463cca709617c79618ccd',
},
'playlist_count': 2,
}, {
'url': 'http://www.etonline.com/media/video/here_are_the_stars_who_love_bringing_their_moms_as_dates_to_the_oscars-211359/',
'only_matching': True,
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % video_id, 'BrightcoveNew', video_id)
for video_id in re.findall(
r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)]
return self.playlist_result(
entries, playlist_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage))

View File

@ -288,6 +288,7 @@ from .espn import (
ESPNArticleIE,
)
from .esri import EsriVideoIE
from .etonline import ETOnlineIE
from .europa import EuropaIE
from .everyonesmixtape import EveryonesMixtapeIE
from .expotv import ExpoTVIE
@ -338,6 +339,7 @@ from .francetv import (
)
from .freesound import FreesoundIE
from .freespeech import FreespeechIE
from .freshlive import FreshLiveIE
from .funimation import FunimationIE
from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE
@ -637,6 +639,7 @@ from .ninecninemedia import (
from .ninegag import NineGagIE
from .ninenow import NineNowIE
from .nintendo import NintendoIE
from .njpwworld import NJPWWorldIE
from .nobelprize import NobelPrizeIE
from .noco import NocoIE
from .normalboots import NormalbootsIE
@ -666,6 +669,7 @@ from .npo import (
NPORadioIE,
NPORadioFragmentIE,
SchoolTVIE,
HetKlokhuisIE,
VPROIE,
WNLIE,
)
@ -835,7 +839,6 @@ from .safari import (
from .sapo import SapoIE
from .savefrom import SaveFromIE
from .sbs import SBSIE
from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
from .scrippsnetworks import ScrippsNetworksWatchIE

View File

@ -0,0 +1,84 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
try_get,
unified_timestamp,
)
class FreshLiveIE(InfoExtractor):
_VALID_URL = r'https?://freshlive\.tv/[^/]+/(?P<id>\d+)'
_TEST = {
'url': 'https://freshlive.tv/satotv/74712',
'md5': '9f0cf5516979c4454ce982df3d97f352',
'info_dict': {
'id': '74712',
'ext': 'mp4',
'title': 'テスト',
'description': 'テスト',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1511,
'timestamp': 1483619655,
'upload_date': '20170105',
'uploader': 'サトTV',
'uploader_id': 'satotv',
'view_count': int,
'comment_count': int,
'is_live': False,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
options = self._parse_json(
self._search_regex(
r'window\.__CONTEXT__\s*=\s*({.+?});\s*</script>',
webpage, 'initial context'),
video_id)
info = options['context']['dispatcher']['stores']['ProgramStore']['programs'][video_id]
title = info['title']
if info.get('status') == 'upcoming':
raise ExtractorError('Stream %s is upcoming' % video_id, expected=True)
stream_url = info.get('liveStreamUrl') or info['archiveStreamUrl']
is_live = info.get('liveStreamUrl') is not None
formats = self._extract_m3u8_formats(
stream_url, video_id, ext='mp4',
entry_protocol='m3u8' if is_live else 'm3u8_native',
m3u8_id='hls')
if is_live:
title = self._live_title(title)
return {
'id': video_id,
'formats': formats,
'title': title,
'description': info.get('description'),
'thumbnail': info.get('thumbnailUrl'),
'duration': int_or_none(info.get('airTime')),
'timestamp': unified_timestamp(info.get('createdAt')),
'uploader': try_get(
info, lambda x: x['channel']['title'], compat_str),
'uploader_id': try_get(
info, lambda x: x['channel']['code'], compat_str),
'uploader_url': try_get(
info, lambda x: x['channel']['permalink'], compat_str),
'view_count': int_or_none(info.get('viewCount')),
'comment_count': int_or_none(info.get('commentCount')),
'tags': info.get('tags', []),
'is_live': is_live,
}

View File

@ -16,6 +16,8 @@ class IviIE(InfoExtractor):
IE_DESC = 'ivi.ru'
IE_NAME = 'ivi'
_VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['RU']
_TESTS = [
# Single movie
@ -91,7 +93,11 @@ class IviIE(InfoExtractor):
if 'error' in video_json:
error = video_json['error']
if error['origin'] == 'NoRedisValidData':
origin = error['origin']
if origin == 'NotAllowedForLocation':
self.raise_geo_restricted(
msg=error['message'], countries=self._GEO_COUNTRIES)
elif origin == 'NoRedisValidData':
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
raise ExtractorError(
'Unable to download video %s: %s' % (video_id, error['message']),

View File

@ -14,7 +14,7 @@ from ..utils import (
class MDRIE(InfoExtractor):
IE_DESC = 'MDR.DE and KiKA'
_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z]+-?(?P<id>\d+)(?:_.+?)?\.html'
_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
_TESTS = [{
# MDR regularly deletes its videos
@ -31,6 +31,7 @@ class MDRIE(InfoExtractor):
'duration': 250,
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
},
'skip': '404 not found',
}, {
'url': 'http://www.kika.de/baumhaus/videos/video19636.html',
'md5': '4930515e36b06c111213e80d1e4aad0e',
@ -41,6 +42,7 @@ class MDRIE(InfoExtractor):
'duration': 134,
'uploader': 'KIKA',
},
'skip': '404 not found',
}, {
'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html',
'md5': '5fe9c4dd7d71e3b238f04b8fdd588357',
@ -49,11 +51,21 @@ class MDRIE(InfoExtractor):
'ext': 'mp4',
'title': 'Beutolomäus und der geheime Weihnachtswunsch',
'description': 'md5:b69d32d7b2c55cbe86945ab309d39bbd',
'timestamp': 1450950000,
'upload_date': '20151224',
'timestamp': 1482541200,
'upload_date': '20161224',
'duration': 4628,
'uploader': 'KIKA',
},
}, {
# audio with alternative playerURL pattern
'url': 'http://www.mdr.de/kultur/videos-und-audios/audio-radio/operation-mindfuck-robert-wilson100.html',
'info_dict': {
'id': '100',
'ext': 'mp4',
'title': 'Feature: Operation Mindfuck - Robert Anton Wilson',
'duration': 3239,
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
},
}, {
'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
'only_matching': True,
@ -71,7 +83,7 @@ class MDRIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
data_url = self._search_regex(
r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+/(?:video|audio)-?[0-9]+-avCustom\.xml)\1',
r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+?-avCustom\.xml)\1',
webpage, 'data url', group='url').replace(r'\/', '/')
doc = self._download_xml(

View File

@ -0,0 +1,83 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
get_element_by_class,
urlencode_postdata,
)
class NJPWWorldIE(InfoExtractor):
_VALID_URL = r'https?://njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
IE_DESC = '新日本プロレスワールド'
_NETRC_MACHINE = 'njpwworld'
_TEST = {
'url': 'http://njpwworld.com/p/s_series_00155_1_9/',
'info_dict': {
'id': 's_series_00155_1_9',
'ext': 'mp4',
'title': '第9試合 ランディ・サベージ vs リック・スタイナー',
'tags': list,
},
'params': {
'skip_download': True, # AES-encrypted m3u8
},
'skip': 'Requires login',
}
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
# No authentication to be performed
if not username:
return True
webpage, urlh = self._download_webpage_handle(
'https://njpwworld.com/auth/login', None,
note='Logging in', errnote='Unable to login',
data=urlencode_postdata({'login_id': username, 'pw': password}))
# /auth/login will return 302 for successful logins
if urlh.geturl() == 'https://njpwworld.com/auth/login':
self.report_warning('unable to login')
return False
return True
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
formats = []
for player_url, kind in re.findall(r'<a[^>]+href="(/player[^"]+)".+?<img[^>]+src="[^"]+qf_btn_([^".]+)', webpage):
player_url = compat_urlparse.urljoin(url, player_url)
player_page = self._download_webpage(
player_url, video_id, note='Downloading player page')
entries = self._parse_html5_media_entries(
player_url, player_page, video_id, m3u8_id='hls-%s' % kind,
m3u8_entry_protocol='m3u8_native',
preference=2 if 'hq' in kind else 1)
formats.extend(entries[0]['formats'])
self._sort_formats(formats)
post_content = get_element_by_class('post-content', webpage)
tags = re.findall(
r'<li[^>]+class="tag-[^"]+"><a[^>]*>([^<]+)</a></li>', post_content
) if post_content else None
return {
'id': video_id,
'title': self._og_search_title(webpage),
'formats': formats,
'tags': tags,
}

View File

@ -23,7 +23,7 @@ from ..utils import (
class NocoIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?noco\.tv/emission/|player\.noco\.tv/\?idvideo=)(?P<id>\d+)'
_LOGIN_URL = 'http://noco.tv/do.php'
_LOGIN_URL = 'https://noco.tv/do.php'
_API_URL_TEMPLATE = 'https://api.noco.tv/1.1/%s?ts=%s&tk=%s'
_SUB_LANG_TEMPLATE = '&sub_lang=%s'
_NETRC_MACHINE = 'noco'
@ -69,16 +69,17 @@ class NocoIE(InfoExtractor):
if username is None:
return
login_form = {
'a': 'login',
'cookie': '1',
'username': username,
'password': password,
}
request = sanitized_Request(self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Content-Type', 'application/x-www-form-urlencoded; charset=UTF-8')
login = self._download_json(request, None, 'Logging in as %s' % username)
login = self._download_json(
self._LOGIN_URL, None, 'Logging in as %s' % username,
data=urlencode_postdata({
'a': 'login',
'cookie': '1',
'username': username,
'password': password,
}),
headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
})
if 'erreur' in login:
raise ExtractorError('Unable to login: %s' % clean_html(login['erreur']), expected=True)

View File

@ -51,7 +51,8 @@ class NPOIE(NPOBaseIE):
(?:
npo\.nl/(?!live|radio)(?:[^/]+/){2}|
ntr\.nl/(?:[^/]+/){2,}|
omroepwnl\.nl/video/fragment/[^/]+__
omroepwnl\.nl/video/fragment/[^/]+__|
zapp\.nl/[^/]+/[^/]+/
)
)
(?P<id>[^/?#]+)
@ -140,6 +141,18 @@ class NPOIE(NPOBaseIE):
'upload_date': '20150508',
'duration': 462,
},
},
{
'url': 'http://www.zapp.nl/de-bzt-show/gemist/KN_1687547',
'only_matching': True,
},
{
'url': 'http://www.zapp.nl/de-bzt-show/filmpjes/POMS_KN_7315118',
'only_matching': True,
},
{
'url': 'http://www.zapp.nl/beste-vrienden-quiz/extra-video-s/WO_NTR_1067990',
'only_matching': True,
}
]
@ -416,7 +429,21 @@ class NPORadioFragmentIE(InfoExtractor):
}
class SchoolTVIE(InfoExtractor):
class NPODataMidEmbedIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
return {
'_type': 'url_transparent',
'ie_key': 'NPO',
'url': 'npo:%s' % video_id,
'display_id': display_id
}
class SchoolTVIE(NPODataMidEmbedIE):
IE_NAME = 'schooltv'
_VALID_URL = r'https?://(?:www\.)?schooltv\.nl/video/(?P<id>[^/?#&]+)'
@ -435,17 +462,25 @@ class SchoolTVIE(InfoExtractor):
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
return {
'_type': 'url_transparent',
'ie_key': 'NPO',
'url': 'npo:%s' % video_id,
'display_id': display_id
class HetKlokhuisIE(NPODataMidEmbedIE):
IE_NAME = 'hetklokhuis'
_VALID_URL = r'https?://(?:www\.)?hetklokhuis.nl/[^/]+/\d+/(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://hetklokhuis.nl/tv-uitzending/3471/Zwaartekrachtsgolven',
'info_dict': {
'id': 'VPWON_1260528',
'display_id': 'Zwaartekrachtsgolven',
'ext': 'm4v',
'title': 'Het Klokhuis: Zwaartekrachtsgolven',
'description': 'md5:c94f31fb930d76c2efa4a4a71651dd48',
'upload_date': '20170223',
},
'params': {
'skip_download': True
}
}
class NPOPlaylistBaseIE(NPOIE):

View File

@ -72,16 +72,21 @@ class OpenloadIE(InfoExtractor):
raise ExtractorError('File not found', expected=True)
ol_id = self._search_regex(
'<span[^>]+id="[^"]+"[^>]*>([0-9]+)</span>',
'<span[^>]+id="[^"]+"[^>]*>([0-9A-Za-z]+)</span>',
webpage, 'openload ID')
first_two_chars = int(float(ol_id[0:][:2]))
first_char = int(ol_id[0])
urlcode = []
num = 2
num = 1
while num < len(ol_id):
key = int(float(ol_id[num + 3:][:2]))
urlcode.append((key, compat_chr(int(float(ol_id[num:][:3])) - first_two_chars)))
i = ord(ol_id[num])
key = 0
if i <= 90:
key = i - 65
elif i >= 97:
key = 25 + i - 97
urlcode.append((key, compat_chr(int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char)))
num += 5
video_url = 'https://openload.co/stream/' + ''.join(

View File

@ -1,57 +0,0 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import int_or_none
class SciVeeIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?scivee\.tv/node/(?P<id>\d+)'
_TEST = {
'url': 'http://www.scivee.tv/node/62352',
'md5': 'b16699b74c9e6a120f6772a44960304f',
'info_dict': {
'id': '62352',
'ext': 'mp4',
'title': 'Adam Arkin at the 2014 DOE JGI Genomics of Energy & Environment Meeting',
'description': 'md5:81f1710638e11a481358fab1b11059d7',
},
'skip': 'Not accessible from Travis CI server',
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
# annotations XML is malformed
annotations = self._download_webpage(
'http://www.scivee.tv/assets/annotations/%s' % video_id, video_id, 'Downloading annotations')
title = self._html_search_regex(r'<title>([^<]+)</title>', annotations, 'title')
description = self._html_search_regex(r'<abstract>([^<]+)</abstract>', annotations, 'abstract', fatal=False)
filesize = int_or_none(self._html_search_regex(
r'<filesize>([^<]+)</filesize>', annotations, 'filesize', fatal=False))
formats = [
{
'url': 'http://www.scivee.tv/assets/audio/%s' % video_id,
'ext': 'mp3',
'format_id': 'audio',
},
{
'url': 'http://www.scivee.tv/assets/video/%s' % video_id,
'ext': 'mp4',
'format_id': 'video',
'filesize': filesize,
},
]
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': 'http://www.scivee.tv/assets/videothumb/%s' % video_id,
'formats': formats,
}

View File

@ -3,7 +3,10 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import qualities
from ..utils import (
int_or_none,
qualities,
)
class TheSceneIE(InfoExtractor):
@ -16,6 +19,11 @@ class TheSceneIE(InfoExtractor):
'ext': 'mp4',
'title': 'Narciso Rodriguez: Spring 2013 Ready-to-Wear',
'display_id': 'narciso-rodriguez-spring-2013-ready-to-wear',
'duration': 127,
'series': 'Style.com Fashion Shows',
'season': 'Ready To Wear Spring 2013',
'tags': list,
'categories': list,
},
}
@ -32,21 +40,29 @@ class TheSceneIE(InfoExtractor):
player = self._download_webpage(player_url, display_id)
info = self._parse_json(
self._search_regex(
r'(?m)var\s+video\s+=\s+({.+?});$', player, 'info json'),
r'(?m)video\s*:\s*({.+?}),$', player, 'info json'),
display_id)
video_id = info['id']
title = info['title']
qualities_order = qualities(('low', 'high'))
formats = [{
'format_id': '{0}-{1}'.format(f['type'].split('/')[0], f['quality']),
'url': f['src'],
'quality': qualities_order(f['quality']),
} for f in info['sources'][0]]
} for f in info['sources']]
self._sort_formats(formats)
return {
'id': info['id'],
'id': video_id,
'display_id': display_id,
'title': info['title'],
'title': title,
'formats': formats,
'thumbnail': info.get('poster_frame'),
'duration': int_or_none(info.get('duration')),
'series': info.get('series_title'),
'season': info.get('season_title'),
'tags': info.get('tags'),
'categories': info.get('categories'),
}

View File

@ -16,6 +16,7 @@ class TubiTvIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tubitv\.com/video/(?P<id>[0-9]+)'
_LOGIN_URL = 'http://tubitv.com/login'
_NETRC_MACHINE = 'tubitv'
_GEO_COUNTRIES = ['US']
_TEST = {
'url': 'http://tubitv.com/video/283829/the_comedian_at_the_friday',
'md5': '43ac06be9326f41912dc64ccf7a80320',

View File

@ -17,6 +17,9 @@ class TvigleIE(InfoExtractor):
IE_DESC = 'Интернет-телевидение Tvigle.ru'
_VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['RU']
_TESTS = [
{
'url': 'http://www.tvigle.ru/video/sokrat/',
@ -72,8 +75,13 @@ class TvigleIE(InfoExtractor):
error_message = item.get('errorMessage')
if not videos and error_message:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
if item.get('isGeoBlocked') is True:
self.raise_geo_restricted(
msg=error_message, countries=self._GEO_COUNTRIES)
else:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message),
expected=True)
title = item['title']
description = item.get('description')

View File

@ -17,12 +17,12 @@ from ..utils import (
class VevoBaseIE(InfoExtractor):
def _extract_json(self, webpage, video_id, item):
def _extract_json(self, webpage, video_id):
return self._parse_json(
self._search_regex(
r'window\.__INITIAL_STORE__\s*=\s*({.+?});\s*</script>',
webpage, 'initial store'),
video_id)['default'][item]
video_id)
class VevoIE(VevoBaseIE):
@ -139,6 +139,11 @@ class VevoIE(VevoBaseIE):
# no genres available
'url': 'http://www.vevo.com/watch/INS171400764',
'only_matching': True,
}, {
# Another case available only via the webpage; using streams/streamsV3 formats
# Geo-restricted to Netherlands/Germany
'url': 'http://www.vevo.com/watch/boostee/pop-corn-clip-officiel/FR1A91600909',
'only_matching': True,
}]
_VERSIONS = {
0: 'youtube', # only in AuthenticateVideo videoVersions
@ -193,7 +198,14 @@ class VevoIE(VevoBaseIE):
# https://github.com/rg3/youtube-dl/issues/9366)
if not video_versions:
webpage = self._download_webpage(url, video_id)
video_versions = self._extract_json(webpage, video_id, 'streams')[video_id][0]
json_data = self._extract_json(webpage, video_id)
if 'streams' in json_data.get('default', {}):
video_versions = json_data['default']['streams'][video_id][0]
else:
video_versions = [
value
for key, value in json_data['apollo']['data'].items()
if key.startswith('%s.streams' % video_id)]
uploader = None
artist = None
@ -207,7 +219,7 @@ class VevoIE(VevoBaseIE):
formats = []
for video_version in video_versions:
version = self._VERSIONS.get(video_version['version'])
version = self._VERSIONS.get(video_version.get('version'), 'generic')
version_url = video_version.get('url')
if not version_url:
continue
@ -339,7 +351,7 @@ class VevoPlaylistIE(VevoBaseIE):
if video_id:
return self.url_result('vevo:%s' % video_id, VevoIE.ie_key())
playlists = self._extract_json(webpage, playlist_id, '%ss' % playlist_kind)
playlists = self._extract_json(webpage, playlist_id)['default']['%ss' % playlist_kind]
playlist = (list(playlists.values())[0]
if playlist_kind == 'playlist' else playlists[playlist_id])

View File

@ -5,6 +5,7 @@ import re
from .common import InfoExtractor
from ..utils import (
dict_get,
ExtractorError,
int_or_none,
parse_duration,
unified_strdate,
@ -57,6 +58,10 @@ class XHamsterIE(InfoExtractor):
}, {
'url': 'https://xhamster.com/movies/2272726/amber_slayed_by_the_knight.html',
'only_matching': True,
}, {
# This video is visible for marcoalfa123456's friends only
'url': 'https://it.xhamster.com/movies/7263980/la_mia_vicina.html',
'only_matching': True,
}]
def _real_extract(self, url):
@ -78,6 +83,12 @@ class XHamsterIE(InfoExtractor):
mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo)
webpage = self._download_webpage(mrss_url, video_id)
error = self._html_search_regex(
r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>',
webpage, 'error', default=None)
if error:
raise ExtractorError(error, expected=True)
title = self._html_search_regex(
[r'<h1[^>]*>([^<]+)</h1>',
r'<meta[^>]+itemprop=".*?caption.*?"[^>]+content="(.+?)"',

View File

@ -47,7 +47,6 @@ from ..utils import (
unsmuggle_url,
uppercase_escape,
urlencode_postdata,
ISO3166Utils,
)
@ -371,6 +370,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
}
_SUBTITLE_FORMATS = ('ttml', 'vtt')
_GEO_BYPASS = False
IE_NAME = 'youtube'
_TESTS = [
{
@ -917,7 +918,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# itag 212
'url': '1t24XAntNCY',
'only_matching': True,
}
},
{
# geo restricted to JP
'url': 'sJL6WA-aGkQ',
'only_matching': True,
},
]
def __init__(self, *args, **kwargs):
@ -1376,11 +1382,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if 'token' not in video_info:
if 'reason' in video_info:
if 'The uploader has not made this video available in your country.' in video_info['reason']:
regions_allowed = self._html_search_meta('regionsAllowed', video_webpage, default=None)
if regions_allowed:
raise ExtractorError('YouTube said: This video is available in %s only' % (
', '.join(map(ISO3166Utils.short2full, regions_allowed.split(',')))),
expected=True)
regions_allowed = self._html_search_meta(
'regionsAllowed', video_webpage, default=None)
countries = regions_allowed.split(',') if regions_allowed else None
self.raise_geo_restricted(
msg=video_info['reason'][0], countries=countries)
raise ExtractorError(
'YouTube said: %s' % video_info['reason'][0],
expected=True, video_id=video_id)
@ -2226,7 +2232,7 @@ class YoutubeUserIE(YoutubeChannelIE):
'url': 'https://www.youtube.com/gametrailers',
'only_matching': True,
}, {
# This channel is not available.
# This channel is not available, geo restricted to JP
'url': 'https://www.youtube.com/user/kananishinoSMEJ/videos',
'only_matching': True,
}]

View File

@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2017.02.24'
__version__ = '2017.02.27'