Compare commits
30 Commits
2017.02.07
...
2017.02.11
Author | SHA1 | Date |
---|---|---|
 | 9b92a5917b | |
 | 3e2274c8b7 | |
 | 3d7e3aaa0e | |
 | 624c4b92ff | |
 | 2af12ad9d2 | |
 | 97eb9bd2ac | |
 | 71cdd75628 | |
 | c7d6f614f3 | |
 | 08a00eef79 | |
 | 9dd5408c99 | |
 | 9510709575 | |
 | 5abcca9060 | |
 | e01bfc19c3 | |
 | 4d32b63851 | |
 | 55d4de2283 | |
 | 61ee556aea | |
 | ff24261ba0 | |
 | fbc6dc525e | |
 | 9150d1eb69 | |
 | b7f9843bec | |
 | e64b0fca14 | |
 | 78ef214d2d | |
 | be670b8e8f | |
 | 37084f6641 | |
 | b04975733c | |
 | c8b8fb0a99 | |
 | 8298018273 | |
 | ae8d5a5c59 | |
 | b9c9cb5f79 | |
 | fdf9b959bc | |
6
.github/ISSUE_TEMPLATE.md
vendored
@ -6,8 +6,8 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.07*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.11*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.07**
|
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.11**
|
||||||
|
|
||||||
### Before submitting an *issue* make sure you have:
|
### Before submitting an *issue* make sure you have:
|
||||||
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||||
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
|
|||||||
[debug] User config: []
|
[debug] User config: []
|
||||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||||
[debug] youtube-dl version 2017.02.07
|
[debug] youtube-dl version 2017.02.11
|
||||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||||
[debug] Proxy map: {}
|
[debug] Proxy map: {}
|
||||||
|
33
ChangeLog
@ -1,3 +1,36 @@
|
|||||||
|
version 2017.02.11
|
||||||
|
|
||||||
|
Core
|
||||||
|
+ [utils] Introduce get_elements_by_class and get_elements_by_attribute
|
||||||
|
utility functions
|
||||||
|
+ [extractor/common] Skip m3u8 manifests protected with Adobe Flash Access
|
||||||
|
|
||||||
|
Extractor
|
||||||
|
* [pluralsight:course] Fix extraction (#12075)
|
||||||
|
+ [bbc] Extract m3u8 formats with 320k audio
|
||||||
|
* [facebook] Relax video id matching (#11017, #12055, #12056)
|
||||||
|
+ [corus] Add support for Corus Entertainment sites (#12060, #9164)
|
||||||
|
+ [pluralsight] Detect blocked account error message (#12070)
|
||||||
|
+ [bloomberg] Add another video id pattern (#12062)
|
||||||
|
* [extractor/commonmistakes] Restrict URL regular expression (#12050)
|
||||||
|
+ [tvplayer] Add support for tvplayer.com
|
||||||
|
|
||||||
|
|
||||||
|
version 2017.02.10
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [xtube] Fix extraction (#12023)
|
||||||
|
* [pornhub] Fix extraction (#12007, #12018)
|
||||||
|
* [facebook] Improve JS data regular expression (#12042)
|
||||||
|
* [kaltura] Improve embed partner id extraction (#12041)
|
||||||
|
+ [sprout] Add support for sproutonline.com
|
||||||
|
* [6play] Improve extraction
|
||||||
|
+ [scrippsnetworks:watch] Add support for Scripps Networks sites (#10765)
|
||||||
|
+ [go] Add support for Adobe Pass authentication (#11468, #10831)
|
||||||
|
* [6play] Fix extraction (#12011)
|
||||||
|
+ [nbc] Add support for Adobe Pass authentication (#12006)
|
||||||
|
|
||||||
|
|
||||||
version 2017.02.07
|
version 2017.02.07
|
||||||
|
|
||||||
Core
|
Core
|
||||||
|
@ -11,6 +11,7 @@
|
|||||||
- **4tube**
|
- **4tube**
|
||||||
- **56.com**
|
- **56.com**
|
||||||
- **5min**
|
- **5min**
|
||||||
|
- **6play**
|
||||||
- **8tracks**
|
- **8tracks**
|
||||||
- **91porn**
|
- **91porn**
|
||||||
- **9c9media**
|
- **9c9media**
|
||||||
@ -168,6 +169,7 @@
|
|||||||
- **ComedyCentralShortname**
|
- **ComedyCentralShortname**
|
||||||
- **ComedyCentralTV**
|
- **ComedyCentralTV**
|
||||||
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
|
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
|
||||||
|
- **Corus**
|
||||||
- **Coub**
|
- **Coub**
|
||||||
- **Cracked**
|
- **Cracked**
|
||||||
- **Crackle**
|
- **Crackle**
|
||||||
@ -308,7 +310,6 @@
|
|||||||
- **HellPorno**
|
- **HellPorno**
|
||||||
- **Helsinki**: helsinki.fi
|
- **Helsinki**: helsinki.fi
|
||||||
- **HentaiStigma**
|
- **HentaiStigma**
|
||||||
- **HGTV**
|
|
||||||
- **hgtv.com:show**
|
- **hgtv.com:show**
|
||||||
- **HistoricFilms**
|
- **HistoricFilms**
|
||||||
- **history:topic**: History.com Topic
|
- **history:topic**: History.com Topic
|
||||||
@ -667,6 +668,7 @@
|
|||||||
- **screen.yahoo:search**: Yahoo screen search
|
- **screen.yahoo:search**: Yahoo screen search
|
||||||
- **Screencast**
|
- **Screencast**
|
||||||
- **ScreencastOMatic**
|
- **ScreencastOMatic**
|
||||||
|
- **scrippsnetworks:watch**
|
||||||
- **Seeker**
|
- **Seeker**
|
||||||
- **SenateISVP**
|
- **SenateISVP**
|
||||||
- **SendtoNews**
|
- **SendtoNews**
|
||||||
@ -676,7 +678,6 @@
|
|||||||
- **Shared**: shared.sx
|
- **Shared**: shared.sx
|
||||||
- **ShowRoomLive**
|
- **ShowRoomLive**
|
||||||
- **Sina**
|
- **Sina**
|
||||||
- **SixPlay**
|
|
||||||
- **skynewsarabia:article**
|
- **skynewsarabia:article**
|
||||||
- **skynewsarabia:video**
|
- **skynewsarabia:video**
|
||||||
- **SkySports**
|
- **SkySports**
|
||||||
@ -711,6 +712,7 @@
|
|||||||
- **SportBoxEmbed**
|
- **SportBoxEmbed**
|
||||||
- **SportDeutschland**
|
- **SportDeutschland**
|
||||||
- **Sportschau**
|
- **Sportschau**
|
||||||
|
- **Sprout**
|
||||||
- **sr:mediathek**: Saarländischer Rundfunk
|
- **sr:mediathek**: Saarländischer Rundfunk
|
||||||
- **SRGSSR**
|
- **SRGSSR**
|
||||||
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
|
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
|
||||||
@ -804,6 +806,7 @@
|
|||||||
- **tvp**: Telewizja Polska
|
- **tvp**: Telewizja Polska
|
||||||
- **tvp:embed**: Telewizja Polska
|
- **tvp:embed**: Telewizja Polska
|
||||||
- **tvp:series**
|
- **tvp:series**
|
||||||
|
- **TVPlayer**
|
||||||
- **Tweakers**
|
- **Tweakers**
|
||||||
- **twitch:chapter**
|
- **twitch:chapter**
|
||||||
- **twitch:clips**
|
- **twitch:clips**
|
||||||
|
@ -34,6 +34,9 @@ from youtube_dl.utils import (
|
|||||||
find_xpath_attr,
|
find_xpath_attr,
|
||||||
fix_xml_ampersands,
|
fix_xml_ampersands,
|
||||||
get_element_by_class,
|
get_element_by_class,
|
||||||
|
get_element_by_attribute,
|
||||||
|
get_elements_by_class,
|
||||||
|
get_elements_by_attribute,
|
||||||
InAdvancePagedList,
|
InAdvancePagedList,
|
||||||
intlist_to_bytes,
|
intlist_to_bytes,
|
||||||
is_html,
|
is_html,
|
||||||
@ -1124,6 +1127,32 @@ The first line
|
|||||||
self.assertEqual(get_element_by_class('foo', html), 'nice')
|
self.assertEqual(get_element_by_class('foo', html), 'nice')
|
||||||
self.assertEqual(get_element_by_class('no-such-class', html), None)
|
self.assertEqual(get_element_by_class('no-such-class', html), None)
|
||||||
|
|
||||||
|
def test_get_element_by_attribute(self):
|
||||||
|
html = '''
|
||||||
|
<span class="foo bar">nice</span>
|
||||||
|
'''
|
||||||
|
|
||||||
|
self.assertEqual(get_element_by_attribute('class', 'foo bar', html), 'nice')
|
||||||
|
self.assertEqual(get_element_by_attribute('class', 'foo', html), None)
|
||||||
|
self.assertEqual(get_element_by_attribute('class', 'no-such-foo', html), None)
|
||||||
|
|
||||||
|
def test_get_elements_by_class(self):
|
||||||
|
html = '''
|
||||||
|
<span class="foo bar">nice</span><span class="foo bar">also nice</span>
|
||||||
|
'''
|
||||||
|
|
||||||
|
self.assertEqual(get_elements_by_class('foo', html), ['nice', 'also nice'])
|
||||||
|
self.assertEqual(get_elements_by_class('no-such-class', html), [])
|
||||||
|
|
||||||
|
def test_get_elements_by_attribute(self):
|
||||||
|
html = '''
|
||||||
|
<span class="foo bar">nice</span><span class="foo bar">also nice</span>
|
||||||
|
'''
|
||||||
|
|
||||||
|
self.assertEqual(get_elements_by_attribute('class', 'foo bar', html), ['nice', 'also nice'])
|
||||||
|
self.assertEqual(get_elements_by_attribute('class', 'foo', html), [])
|
||||||
|
self.assertEqual(get_elements_by_attribute('class', 'no-such-foo', html), [])
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
unittest.main()
|
unittest.main()
|
||||||
|
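The new tests above distinguish two lookups: matching by class treats the attribute as a list of tokens, while matching by attribute compares the whole value. The following is a minimal standalone sketch of that difference, not the code in `youtube_dl/utils.py`; the function names and the regular expressions are assumptions made for illustration, and the regex approach only handles simple, non-nested markup like the test fixtures.

```python
import re


def elements_by_attribute_sketch(attribute, value, html):
    """Inner text of elements whose attribute equals `value` exactly (hypothetical helper)."""
    pattern = re.compile(
        r'<(?P<tag>[a-zA-Z0-9]+)\s[^>]*?%s=(?P<q>["\'])%s(?P=q)[^>]*>'
        r'(?P<content>.*?)</(?P=tag)>' % (re.escape(attribute), re.escape(value)),
        re.DOTALL)
    return [m.group('content') for m in pattern.finditer(html)]


def elements_by_class_sketch(class_name, html):
    """Inner text of elements whose class list contains `class_name` (hypothetical helper)."""
    pattern = re.compile(
        r'<(?P<tag>[a-zA-Z0-9]+)\s[^>]*?class=(?P<q>["\'])(?P<classes>[^"\']*)(?P=q)[^>]*>'
        r'(?P<content>.*?)</(?P=tag)>',
        re.DOTALL)
    return [
        m.group('content')
        for m in pattern.finditer(html)
        # Token match: "foo" is found inside class="foo bar".
        if class_name in m.group('classes').split()
    ]


html = '<span class="foo bar">nice</span><span class="foo bar">also nice</span>'
print(elements_by_class_sketch('foo', html))
print(elements_by_attribute_sketch('class', 'foo', html))
```

With the fixture from the tests, the class lookup finds both spans for `foo`, whereas the attribute lookup only matches the full value `foo bar`.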
@ -275,7 +275,7 @@ class FFmpegFD(ExternalFD):
|
|||||||
args += ['-f', 'mpegts']
|
args += ['-f', 'mpegts']
|
||||||
else:
|
else:
|
||||||
args += ['-f', 'mp4']
|
args += ['-f', 'mp4']
|
||||||
if (ffpp.basename == 'ffmpeg' and is_outdated_version(ffpp._versions['ffmpeg'], '3.2')) and (not info_dict.get('acodec') or info_dict['acodec'].split('.')[0] in ('aac', 'mp4a')):
|
if (ffpp.basename == 'ffmpeg' and is_outdated_version(ffpp._versions['ffmpeg'], '3.2', False)) and (not info_dict.get('acodec') or info_dict['acodec'].split('.')[0] in ('aac', 'mp4a')):
|
||||||
args += ['-bsf:a', 'aac_adtstoasc']
|
args += ['-bsf:a', 'aac_adtstoasc']
|
||||||
elif protocol == 'rtmp':
|
elif protocol == 'rtmp':
|
||||||
args += ['-f', 'flv']
|
args += ['-f', 'flv']
|
||||||
|
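The only change in this hunk is the extra `False` passed to `is_outdated_version`. A plausible reading is that the third argument decides what to report when the version string cannot be compared, as with git builds such as `N-75573-g1d0487f` from the debug log earlier in this diff. The sketch below illustrates that reading only; the parameter name `assume_new` and both function bodies are assumptions, not code taken from this diff.

```python
def version_tuple_sketch(version):
    """Turn '3.2.1' into (3, 2, 1); raises ValueError for non-numeric parts."""
    return tuple(int(part) for part in version.split('.'))


def is_outdated_sketch(version, limit, assume_new=True):
    """Hypothetical stand-in: is `version` older than `limit`?

    When the version is missing or unparsable, fall back to `assume_new`:
    True means "treat it as new enough", False means "treat it as outdated".
    """
    if not version:
        return not assume_new
    try:
        return version_tuple_sketch(version) < version_tuple_sketch(limit)
    except ValueError:
        return not assume_new


# A git snapshot build cannot be compared numerically.
print(is_outdated_sketch('N-75573-g1d0487f', '3.2'))         # default: treated as new
print(is_outdated_sketch('N-75573-g1d0487f', '3.2', False))  # treated as outdated
```

Under this assumption, passing `False` means the `aac_adtstoasc` bitstream filter is also applied when the ffmpeg version is unknown.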
@ -225,6 +225,8 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
}
|
}
|
||||||
]
|
]
|
||||||
|
|
||||||
|
_USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'
|
||||||
|
|
||||||
class MediaSelectionError(Exception):
|
class MediaSelectionError(Exception):
|
||||||
def __init__(self, id):
|
def __init__(self, id):
|
||||||
self.id = id
|
self.id = id
|
||||||
@ -336,6 +338,15 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
formats.extend(self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
|
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
|
||||||
m3u8_id=format_id, fatal=False))
|
m3u8_id=format_id, fatal=False))
|
||||||
|
if re.search(self._USP_RE, href):
|
||||||
|
usp_formats = self._extract_m3u8_formats(
|
||||||
|
re.sub(self._USP_RE, r'/\1.ism/\1.m3u8', href),
|
||||||
|
programme_id, ext='mp4', entry_protocol='m3u8_native',
|
||||||
|
m3u8_id=format_id, fatal=False)
|
||||||
|
for f in usp_formats:
|
||||||
|
if f.get('height') and f['height'] > 720:
|
||||||
|
continue
|
||||||
|
formats.append(f)
|
||||||
elif transfer_format == 'hds':
|
elif transfer_format == 'hds':
|
||||||
formats.extend(self._extract_f4m_formats(
|
formats.extend(self._extract_f4m_formats(
|
||||||
href, programme_id, f4m_id=format_id, fatal=False))
|
href, programme_id, f4m_id=format_id, fatal=False))
|
||||||
|
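The BBC hunks above rewrite a variant playlist URL into a second manifest URL and then drop formats taller than 720 pixels. The sketch below replays just those two steps outside the extractor. The regular expression and the replacement string are copied from the diff; the helper names and the example URL are hypothetical.

```python
import re

# Copied from the diff.
USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'


def usp_manifest_url(href):
    """Return the rewritten '<name>.ism/<name>.m3u8' URL, or None if href does not match."""
    if not re.search(USP_RE, href):
        return None
    return re.sub(USP_RE, r'/\1.ism/\1.m3u8', href)


def keep_up_to_720(formats):
    """Mirror the loop in the diff: skip formats whose known height exceeds 720."""
    return [f for f in formats if not (f.get('height') and f['height'] > 720)]


# Hypothetical URL, used only to exercise the pattern.
href = 'https://example.invalid/a/b/prog.ism.hlsv2.ism/prog-audio=96000.m3u8'
print(usp_manifest_url(href))
print(keep_up_to_720([{'height': 1080}, {'height': 720}, {}]))
```

Formats without a `height` key pass the filter, which matches the `f.get('height') and ...` guard in the diff.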
@ -33,6 +33,10 @@ class BloombergIE(InfoExtractor):
|
|||||||
'params': {
|
'params': {
|
||||||
'format': 'best[format_id^=hds]',
|
'format': 'best[format_id^=hds]',
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# data-bmmrid=
|
||||||
|
'url': 'https://www.bloomberg.com/politics/articles/2017-02-08/le-pen-aide-briefed-french-central-banker-on-plan-to-print-money',
|
||||||
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
|
'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -45,9 +49,10 @@ class BloombergIE(InfoExtractor):
|
|||||||
name = self._match_id(url)
|
name = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, name)
|
webpage = self._download_webpage(url, name)
|
||||||
video_id = self._search_regex(
|
video_id = self._search_regex(
|
||||||
(r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
|
(r'["\']bmmrId["\']\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||||
r'videoId\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
|
r'videoId\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||||
webpage, 'id', group='url', default=None)
|
r'data-bmmrid=(["\'])(?P<id>(?:(?!\1).)+)\1'),
|
||||||
|
webpage, 'id', group='id', default=None)
|
||||||
if not video_id:
|
if not video_id:
|
||||||
bplayer_data = self._parse_json(self._search_regex(
|
bplayer_data = self._parse_json(self._search_regex(
|
||||||
r'BPlayer\(null,\s*({[^;]+})\);', webpage, 'id'), name)
|
r'BPlayer\(null,\s*({[^;]+})\);', webpage, 'id'), name)
|
||||||
|
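The Bloomberg change renames the capture group from `url` to `id` and adds a third pattern for `data-bmmrid=` attributes. The sketch below tries the three patterns from the diff in order using plain `re`. The helper `find_video_id` and the sample snippets are hypothetical, and the extractor's own `_search_regex` may behave differently in details such as error handling.

```python
import re

# The three patterns as they appear on the new side of the diff.
ID_PATTERNS = (
    r'["\']bmmrId["\']\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
    r'videoId\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
    r'data-bmmrid=(["\'])(?P<id>(?:(?!\1).)+)\1',
)


def find_video_id(webpage):
    """Return the first 'id' group matched by any pattern, else None."""
    for pattern in ID_PATTERNS:
        match = re.search(pattern, webpage)
        if match:
            return match.group('id')
    return None


# Hypothetical page fragments.
print(find_video_id('<div data-bmmrid="abc123"></div>'))
print(find_video_id('{"bmmrId": "qwe"}'))
print(find_video_id('<p>no player here</p>'))
```

In each pattern, group 1 captures the opening quote and `(?!\1)` stops the id at the matching closing quote, so single and double quotes both work.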
@ -1208,6 +1208,9 @@ class InfoExtractor(object):
|
|||||||
m3u8_doc, urlh = res
|
m3u8_doc, urlh = res
|
||||||
m3u8_url = urlh.geturl()
|
m3u8_url = urlh.geturl()
|
||||||
|
|
||||||
|
if '#EXT-X-FAXS-CM:' in m3u8_doc: # Adobe Flash Access
|
||||||
|
return []
|
||||||
|
|
||||||
formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
|
formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
|
||||||
|
|
||||||
format_url = lambda u: (
|
format_url = lambda u: (
|
||||||
|
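The hunk above makes manifest parsing return no formats when the playlist carries an `#EXT-X-FAXS-CM:` tag, which the inline comment attributes to Adobe Flash Access. A minimal sketch of that early exit follows; only the tag check comes from the diff, while the function name and the placeholder URI-collecting step are assumptions.

```python
def playlist_uris_sketch(m3u8_doc):
    """Return the URI lines of a playlist, or [] if it is Flash Access protected."""
    if '#EXT-X-FAXS-CM:' in m3u8_doc:  # Adobe Flash Access (check copied from the diff)
        return []
    # Placeholder for real format extraction: keep non-empty, non-tag lines.
    return [
        line.strip()
        for line in m3u8_doc.splitlines()
        if line.strip() and not line.startswith('#')
    ]


plain = '#EXTM3U\n#EXT-X-STREAM-INF:BANDWIDTH=1280000\nlow.m3u8\n'
protected = '#EXTM3U\n#EXT-X-FAXS-CM:placeholder\n#EXT-X-STREAM-INF:BANDWIDTH=1280000\nlow.m3u8\n'
print(playlist_uris_sketch(plain))
print(playlist_uris_sketch(protected))
```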
@ -7,7 +7,7 @@ from ..utils import ExtractorError
|
|||||||
class CommonMistakesIE(InfoExtractor):
|
class CommonMistakesIE(InfoExtractor):
|
||||||
IE_DESC = False # Do not list
|
IE_DESC = False # Do not list
|
||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
(?:url|URL)
|
(?:url|URL)$
|
||||||
'''
|
'''
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
|
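Adding `$` to the pattern restricts it to inputs that are exactly `url` or `URL`. The sketch below compares the old and new patterns, assuming the pattern is applied with `re.match` (anchored at the start of the string); the helper name and the sample inputs are hypothetical.

```python
import re

OLD_VALID_URL = r'''(?x)
    (?:url|URL)
'''
NEW_VALID_URL = r'''(?x)
    (?:url|URL)$
'''


def is_suitable(pattern, url):
    """True if `url` matches `pattern` from its first character."""
    return re.match(pattern, url) is not None


for candidate in ('url', 'URL', 'urlencoded.example/watch'):
    print(candidate, is_suitable(OLD_VALID_URL, candidate), is_suitable(NEW_VALID_URL, candidate))
```

Under that assumption, the old pattern also claims any input that merely begins with `url`, which is what the ChangeLog entry "Restrict URL regular expression" addresses.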
72
youtube_dl/extractor/corus.py
Normal file
@ -0,0 +1,72 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .theplatform import ThePlatformFeedIE
|
||||||
|
from ..utils import int_or_none
|
||||||
|
|
||||||
|
|
||||||
|
class CorusIE(ThePlatformFeedIE):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:globaltv|etcanada)\.com|(?:hgtv|foodnetwork|slice)\.ca)/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))(?P<id>\d+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/',
|
||||||
|
'md5': '05dcbca777bf1e58c2acbb57168ad3a6',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '870923331648',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Movie Night Popcorn with Bryan',
|
||||||
|
'description': 'Bryan whips up homemade popcorn, the old fashion way for Jojo and Lincoln.',
|
||||||
|
'uploader': 'SHWM-NEW',
|
||||||
|
'upload_date': '20170206',
|
||||||
|
'timestamp': 1486392197,
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'http://www.foodnetwork.ca/shows/chopped/video/episode/chocolate-obsession/video.html?v=872683587753',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'http://etcanada.com/video/873675331955/meet-the-survivor-game-changers-castaways-part-2/',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
_TP_FEEDS = {
|
||||||
|
'globaltv': {
|
||||||
|
'feed_id': 'ChQqrem0lNUp',
|
||||||
|
'account_id': 2269680845,
|
||||||
|
},
|
||||||
|
'etcanada': {
|
||||||
|
'feed_id': 'ChQqrem0lNUp',
|
||||||
|
'account_id': 2269680845,
|
||||||
|
},
|
||||||
|
'hgtv': {
|
||||||
|
'feed_id': 'L0BMHXi2no43',
|
||||||
|
'account_id': 2414428465,
|
||||||
|
},
|
||||||
|
'foodnetwork': {
|
||||||
|
'feed_id': 'ukK8o58zbRmJ',
|
||||||
|
'account_id': 2414429569,
|
||||||
|
},
|
||||||
|
'slice': {
|
||||||
|
'feed_id': '5tUJLgV2YNJ5',
|
||||||
|
'account_id': 2414427935,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
domain, video_id = re.match(self._VALID_URL, url).groups()
|
||||||
|
feed_info = self._TP_FEEDS[domain.split('.')[0]]
|
||||||
|
return self._extract_feed_info('dtjsEC', feed_info['feed_id'], 'byId=' + video_id, video_id, lambda e: {
|
||||||
|
'episode_number': int_or_none(e.get('pl1$episode')),
|
||||||
|
'season_number': int_or_none(e.get('pl1$season')),
|
||||||
|
'series': e.get('pl1$show'),
|
||||||
|
}, {
|
||||||
|
'HLS': {
|
||||||
|
'manifest': 'm3u',
|
||||||
|
},
|
||||||
|
'DesktopHLS Default': {
|
||||||
|
'manifest': 'm3u',
|
||||||
|
},
|
||||||
|
'MP4 MBR': {
|
||||||
|
'manifest': 'm3u',
|
||||||
|
},
|
||||||
|
}, feed_info['account_id'])
|
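`CorusIE._real_extract` above derives the feed from the domain captured by `_VALID_URL`. The sketch below replays that lookup with plain `re`, using the pattern and two of the feed entries copied from the new file; the function `resolve_feed` is a hypothetical helper, and the network call to `_extract_feed_info` is deliberately left out.

```python
import re

# Copied from the new corus.py.
VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:globaltv|etcanada)\.com|(?:hgtv|foodnetwork|slice)\.ca)/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))(?P<id>\d+)'

# Subset of _TP_FEEDS from the diff.
TP_FEEDS = {
    'etcanada': {'feed_id': 'ChQqrem0lNUp', 'account_id': 2269680845},
    'hgtv': {'feed_id': 'L0BMHXi2no43', 'account_id': 2414428465},
}


def resolve_feed(url):
    """Return (feed_id, account_id, video_id) for a supported URL, else None."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    domain, video_id = mobj.groups()
    # 'hgtv.ca' -> 'hgtv', as in feed_info = self._TP_FEEDS[domain.split('.')[0]]
    feed_info = TP_FEEDS[domain.split('.')[0]]
    return feed_info['feed_id'], feed_info['account_id'], video_id


print(resolve_feed('http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/'))
print(resolve_feed('http://etcanada.com/video/873675331955/meet-the-survivor-game-changers-castaways-part-2/'))
```

Both URLs come from the `_TESTS` list in the diff, so the printed ids can be checked against the `id` fields there.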
@ -202,6 +202,7 @@ from .commonprotocols import (
|
|||||||
RtmpIE,
|
RtmpIE,
|
||||||
)
|
)
|
||||||
from .condenast import CondeNastIE
|
from .condenast import CondeNastIE
|
||||||
|
from .corus import CorusIE
|
||||||
from .cracked import CrackedIE
|
from .cracked import CrackedIE
|
||||||
from .crackle import CrackleIE
|
from .crackle import CrackleIE
|
||||||
from .criterion import CriterionIE
|
from .criterion import CriterionIE
|
||||||
@ -381,10 +382,7 @@ from .heise import HeiseIE
|
|||||||
from .hellporno import HellPornoIE
|
from .hellporno import HellPornoIE
|
||||||
from .helsinki import HelsinkiIE
|
from .helsinki import HelsinkiIE
|
||||||
from .hentaistigma import HentaiStigmaIE
|
from .hentaistigma import HentaiStigmaIE
|
||||||
from .hgtv import (
|
from .hgtv import HGTVComShowIE
|
||||||
HGTVIE,
|
|
||||||
HGTVComShowIE,
|
|
||||||
)
|
|
||||||
from .historicfilms import HistoricFilmsIE
|
from .historicfilms import HistoricFilmsIE
|
||||||
from .hitbox import HitboxIE, HitboxLiveIE
|
from .hitbox import HitboxIE, HitboxLiveIE
|
||||||
from .hitrecord import HitRecordIE
|
from .hitrecord import HitRecordIE
|
||||||
@ -838,6 +836,7 @@ from .sbs import SBSIE
|
|||||||
from .scivee import SciVeeIE
|
from .scivee import SciVeeIE
|
||||||
from .screencast import ScreencastIE
|
from .screencast import ScreencastIE
|
||||||
from .screencastomatic import ScreencastOMaticIE
|
from .screencastomatic import ScreencastOMaticIE
|
||||||
|
from .scrippsnetworks import ScrippsNetworksWatchIE
|
||||||
from .seeker import SeekerIE
|
from .seeker import SeekerIE
|
||||||
from .senateisvp import SenateISVPIE
|
from .senateisvp import SenateISVPIE
|
||||||
from .sendtonews import SendtoNewsIE
|
from .sendtonews import SendtoNewsIE
|
||||||
@ -895,6 +894,7 @@ from .sport5 import Sport5IE
|
|||||||
from .sportbox import SportBoxEmbedIE
|
from .sportbox import SportBoxEmbedIE
|
||||||
from .sportdeutschland import SportDeutschlandIE
|
from .sportdeutschland import SportDeutschlandIE
|
||||||
from .sportschau import SportschauIE
|
from .sportschau import SportschauIE
|
||||||
|
from .sprout import SproutIE
|
||||||
from .srgssr import (
|
from .srgssr import (
|
||||||
SRGSSRIE,
|
SRGSSRIE,
|
||||||
SRGSSRPlayIE,
|
SRGSSRPlayIE,
|
||||||
@ -1017,6 +1017,7 @@ from .tvplay import (
|
|||||||
TVPlayIE,
|
TVPlayIE,
|
||||||
ViafreeIE,
|
ViafreeIE,
|
||||||
)
|
)
|
||||||
|
from .tvplayer import TVPlayerIE
|
||||||
from .tweakers import TweakersIE
|
from .tweakers import TweakersIE
|
||||||
from .twentyfourvideo import TwentyFourVideoIE
|
from .twentyfourvideo import TwentyFourVideoIE
|
||||||
from .twentymin import TwentyMinutenIE
|
from .twentymin import TwentyMinutenIE
|
||||||
|
@ -1,3 +1,4 @@
|
|||||||
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import re
|
import re
|
||||||
@ -134,6 +135,46 @@ class FacebookIE(InfoExtractor):
|
|||||||
'upload_date': '20161030',
|
'upload_date': '20161030',
|
||||||
'uploader': 'CNN',
|
'uploader': 'CNN',
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# bigPipe.onPageletArrive ... onPageletArrive pagelet_group_mall
|
||||||
|
'url': 'https://www.facebook.com/yaroslav.korpan/videos/1417995061575415/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '1417995061575415',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'md5:a7b86ca673f51800cd54687b7f4012fe',
|
||||||
|
'timestamp': 1486648217,
|
||||||
|
'upload_date': '20170209',
|
||||||
|
'uploader': 'Yaroslav Korpan',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.facebook.com/LaGuiaDelVaron/posts/1072691702860471',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '1072691702860471',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'md5:ae2d22a93fbb12dad20dc393a869739d',
|
||||||
|
'timestamp': 1477305000,
|
||||||
|
'upload_date': '20161024',
|
||||||
|
'uploader': 'La Guía Del Varón',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.facebook.com/groups/1024490957622648/permalink/1396382447100162/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '1396382447100162',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'md5:e2d2700afdf84e121f5d0f999bad13a3',
|
||||||
|
'timestamp': 1486035494,
|
||||||
|
'upload_date': '20170202',
|
||||||
|
'uploader': 'Elisabeth Ahtn',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.facebook.com/video.php?v=10204634152394104',
|
'url': 'https://www.facebook.com/video.php?v=10204634152394104',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -249,7 +290,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
for item in instances:
|
for item in instances:
|
||||||
if item[1][0] == 'VideoConfig':
|
if item[1][0] == 'VideoConfig':
|
||||||
video_item = item[2][0]
|
video_item = item[2][0]
|
||||||
if video_item.get('video_id') == video_id:
|
if video_item.get('video_id'):
|
||||||
return video_item['videoData']
|
return video_item['videoData']
|
||||||
|
|
||||||
server_js_data = self._parse_json(self._search_regex(
|
server_js_data = self._parse_json(self._search_regex(
|
||||||
@ -262,7 +303,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
if not video_data:
|
if not video_data:
|
||||||
server_js_data = self._parse_json(
|
server_js_data = self._parse_json(
|
||||||
self._search_regex(
|
self._search_regex(
|
||||||
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+stream_pagelet',
|
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+(?:stream_pagelet|pagelet_group_mall)',
|
||||||
webpage, 'js data', default='{}'),
|
webpage, 'js data', default='{}'),
|
||||||
video_id, transform_source=js_to_json, fatal=False)
|
video_id, transform_source=js_to_json, fatal=False)
|
||||||
if server_js_data:
|
if server_js_data:
|
||||||
|
@ -3,7 +3,7 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .adobepass import AdobePassIE
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
int_or_none,
|
int_or_none,
|
||||||
determine_ext,
|
determine_ext,
|
||||||
@ -13,15 +13,30 @@ from ..utils import (
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class GoIE(InfoExtractor):
|
class GoIE(AdobePassIE):
|
||||||
_BRANDS = {
|
_SITE_INFO = {
|
||||||
'abc': '001',
|
'abc': {
|
||||||
'freeform': '002',
|
'brand': '001',
|
||||||
'watchdisneychannel': '004',
|
'requestor_id': 'ABC',
|
||||||
'watchdisneyjunior': '008',
|
},
|
||||||
'watchdisneyxd': '009',
|
'freeform': {
|
||||||
|
'brand': '002',
|
||||||
|
'requestor_id': 'ABCFamily',
|
||||||
|
},
|
||||||
|
'watchdisneychannel': {
|
||||||
|
'brand': '004',
|
||||||
|
'requestor_id': 'Disney',
|
||||||
|
},
|
||||||
|
'watchdisneyjunior': {
|
||||||
|
'brand': '008',
|
||||||
|
'requestor_id': 'DisneyJunior',
|
||||||
|
},
|
||||||
|
'watchdisneyxd': {
|
||||||
|
'brand': '009',
|
||||||
|
'requestor_id': 'DisneyXD',
|
||||||
}
|
}
|
||||||
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|season-\d+/\d+-(?P<display_id>[^/?#]+))' % '|'.join(_BRANDS.keys())
|
}
|
||||||
|
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|season-\d+/\d+-(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys())
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx',
|
'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -47,7 +62,8 @@ class GoIE(InfoExtractor):
|
|||||||
# There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
|
# There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
|
||||||
# from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
|
# from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
|
||||||
r'data-video-id=["\']*VDKA(\w+)', webpage, 'video id')
|
r'data-video-id=["\']*VDKA(\w+)', webpage, 'video id')
|
||||||
brand = self._BRANDS[sub_domain]
|
site_info = self._SITE_INFO[sub_domain]
|
||||||
|
brand = site_info['brand']
|
||||||
video_data = self._download_json(
|
video_data = self._download_json(
|
||||||
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/-1/-1/%s/-1/-1.json' % (brand, video_id),
|
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/-1/-1/%s/-1/-1.json' % (brand, video_id),
|
||||||
video_id)['video'][0]
|
video_id)['video'][0]
|
||||||
@ -63,14 +79,26 @@ class GoIE(InfoExtractor):
|
|||||||
if ext == 'm3u8':
|
if ext == 'm3u8':
|
||||||
video_type = video_data.get('type')
|
video_type = video_data.get('type')
|
||||||
if video_type == 'lf':
|
if video_type == 'lf':
|
||||||
entitlement = self._download_json(
|
data = {
|
||||||
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
|
|
||||||
video_id, data=urlencode_postdata({
|
|
||||||
'video_id': video_data['id'],
|
'video_id': video_data['id'],
|
||||||
'video_type': video_type,
|
'video_type': video_type,
|
||||||
'brand': brand,
|
'brand': brand,
|
||||||
'device': '001',
|
'device': '001',
|
||||||
}))
|
}
|
||||||
|
if video_data.get('accesslevel') == '1':
|
||||||
|
requestor_id = site_info['requestor_id']
|
||||||
|
resource = self._get_mvpd_resource(
|
||||||
|
requestor_id, title, video_id, None)
|
||||||
|
auth = self._extract_mvpd_auth(
|
||||||
|
url, video_id, requestor_id, resource)
|
||||||
|
data.update({
|
||||||
|
'token': auth,
|
||||||
|
'token_type': 'ap',
|
||||||
|
'adobe_requestor_id': requestor_id,
|
||||||
|
})
|
||||||
|
entitlement = self._download_json(
|
||||||
|
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
|
||||||
|
video_id, data=urlencode_postdata(data), headers=self.geo_verification_headers())
|
||||||
errors = entitlement.get('errors', {}).get('errors', [])
|
errors = entitlement.get('errors', {}).get('errors', [])
|
||||||
if errors:
|
if errors:
|
||||||
error_message = ', '.join([error['message'] for error in errors])
|
error_message = ', '.join([error['message'] for error in errors])
|
||||||
|
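The Go hunk above builds the entitlement form data first and only adds Adobe Pass fields when the video reports `accesslevel` `'1'`. The sketch below isolates that branching. The field names and values are taken from the diff; the function name and its parameters are hypothetical, and the token is passed in rather than obtained through `_extract_mvpd_auth`.

```python
def build_entitlement_data(video_id, video_type, brand,
                           access_level=None, requestor_id=None, token=None):
    """Assemble the POST fields for the authorize request (illustrative only)."""
    data = {
        'video_id': video_id,
        'video_type': video_type,
        'brand': brand,
        'device': '001',
    }
    if access_level == '1':
        # Protected content: attach the Adobe Pass token and requestor id.
        data.update({
            'token': token,
            'token_type': 'ap',
            'adobe_requestor_id': requestor_id,
        })
    return data


print(build_entitlement_data('VDKA0000000', 'lf', '001'))
print(build_entitlement_data('VDKA0000000', 'lf', '001',
                             access_level='1', requestor_id='ABC', token='hypothetical-token'))
```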
@ -2,50 +2,6 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
|
||||||
int_or_none,
|
|
||||||
js_to_json,
|
|
||||||
smuggle_url,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class HGTVIE(InfoExtractor):
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?hgtv\.ca/[^/]+/video/(?P<id>[^/]+)/video.html'
|
|
||||||
_TEST = {
|
|
||||||
'url': 'http://www.hgtv.ca/homefree/video/overnight-success/video.html?v=738081859718&p=1&s=da#video',
|
|
||||||
'md5': '',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'aFH__I_5FBOX',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Overnight Success',
|
|
||||||
'description': 'After weeks of hard work, high stakes, breakdowns and pep talks, the final 2 contestants compete to win the ultimate dream.',
|
|
||||||
'uploader': 'SHWM-NEW',
|
|
||||||
'timestamp': 1470320034,
|
|
||||||
'upload_date': '20160804',
|
|
||||||
},
|
|
||||||
'params': {
|
|
||||||
# m3u8 download
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
}
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
display_id = self._match_id(url)
|
|
||||||
webpage = self._download_webpage(url, display_id)
|
|
||||||
embed_vars = self._parse_json(self._search_regex(
|
|
||||||
r'(?s)embed_vars\s*=\s*({.*?});',
|
|
||||||
webpage, 'embed vars'), display_id, js_to_json)
|
|
||||||
return {
|
|
||||||
'_type': 'url_transparent',
|
|
||||||
'url': smuggle_url(
|
|
||||||
'http://link.theplatform.com/s/dtjsEC/%s?mbr=true&manifest=m3u' % embed_vars['pid'], {
|
|
||||||
'force_smil_url': True
|
|
||||||
}),
|
|
||||||
'series': embed_vars.get('show'),
|
|
||||||
'season_number': int_or_none(embed_vars.get('season')),
|
|
||||||
'episode_number': int_or_none(embed_vars.get('episode')),
|
|
||||||
'ie_key': 'ThePlatform',
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class HGTVComShowIE(InfoExtractor):
|
class HGTVComShowIE(InfoExtractor):
|
||||||
|
@ -23,11 +23,11 @@ class KalturaIE(InfoExtractor):
|
|||||||
(?:
|
(?:
|
||||||
kaltura:(?P<partner_id>\d+):(?P<id>[0-9a-z_]+)|
|
kaltura:(?P<partner_id>\d+):(?P<id>[0-9a-z_]+)|
|
||||||
https?://
|
https?://
|
||||||
(:?(?:www|cdnapi(?:sec)?)\.)?kaltura\.com/
|
(:?(?:www|cdnapi(?:sec)?)\.)?kaltura\.com(?::\d+)?/
|
||||||
(?:
|
(?:
|
||||||
(?:
|
(?:
|
||||||
# flash player
|
# flash player
|
||||||
index\.php/kwidget|
|
index\.php/(?:kwidget|extwidget/preview)|
|
||||||
# html5 player
|
# html5 player
|
||||||
html5/html5lib/[^/]+/mwEmbedFrame\.php
|
html5/html5lib/[^/]+/mwEmbedFrame\.php
|
||||||
)
|
)
|
||||||
@ -94,6 +94,14 @@ class KalturaIE(InfoExtractor):
|
|||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'url': 'https://www.kaltura.com/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
|
||||||
|
'only_matching': True,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'url': 'https://www.kaltura.com:443/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
|
||||||
|
'only_matching': True,
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
|
|
||||||
@ -112,7 +120,7 @@ class KalturaIE(InfoExtractor):
|
|||||||
re.search(
|
re.search(
|
||||||
r'''(?xs)
|
r'''(?xs)
|
||||||
(?P<q1>["\'])
|
(?P<q1>["\'])
|
||||||
(?:https?:)?//cdnapi(?:sec)?\.kaltura\.com/(?:(?!(?P=q1)).)*(?:p|partner_id)/(?P<partner_id>\d+)(?:(?!(?P=q1)).)*
|
(?:https?:)?//cdnapi(?:sec)?\.kaltura\.com(?::\d+)?/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)(?:(?!(?P=q1)).)*
|
||||||
(?P=q1).*?
|
(?P=q1).*?
|
||||||
(?:
|
(?:
|
||||||
entry_?[Ii]d|
|
entry_?[Ii]d|
|
||||||
@ -209,6 +217,8 @@ class KalturaIE(InfoExtractor):
|
|||||||
partner_id = params['wid'][0][1:]
|
partner_id = params['wid'][0][1:]
|
||||||
elif 'p' in params:
|
elif 'p' in params:
|
||||||
partner_id = params['p'][0]
|
partner_id = params['p'][0]
|
||||||
|
elif 'partner_id' in params:
|
||||||
|
partner_id = params['partner_id'][0]
|
||||||
else:
|
else:
|
||||||
raise ExtractorError('Invalid URL', expected=True)
|
raise ExtractorError('Invalid URL', expected=True)
|
||||||
if 'entry_id' in params:
|
if 'entry_id' in params:
|
||||||
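Note on the `KalturaIE` hunks above: the pattern now tolerates an explicit port (`(?::\d+)?`) and accepts `extwidget/preview` alongside `kwidget`. The sketch below is a cut-down fragment, not the extractor's full `_VALID_URL`; `KALTURA_PREFIX`, `TEST_URLS`, and `matches_prefix` are illustrative names assumed here. It checks the fragment against the two URLs added to `_TESTS`.

```python
import re

# Simplified host/path fragment only (an assumption for illustration);
# the real _VALID_URL continues with partner and entry id groups.
KALTURA_PREFIX = re.compile(r'''(?x)
    https?://
    (?:(?:www|cdnapi(?:sec)?)\.)?kaltura\.com(?::\d+)?/   # optional explicit port
    index\.php/(?:kwidget|extwidget/preview)/             # flash player paths
''')

# The two URLs added to _TESTS in this diff.
TEST_URLS = [
    'https://www.kaltura.com/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
    'https://www.kaltura.com:443/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
]


def matches_prefix(url):
    """Return True if the URL starts with the simplified Kaltura prefix."""
    return KALTURA_PREFIX.match(url) is not None
```

Both entries in `TEST_URLS` should match; a path such as `index.php/other/` should not.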
|
@ -4,23 +4,26 @@ import re
|
|||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from .theplatform import ThePlatformIE
|
from .theplatform import ThePlatformIE
|
||||||
|
from .adobepass import AdobePassIE
|
||||||
|
from ..compat import compat_urllib_parse_urlparse
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
find_xpath_attr,
|
find_xpath_attr,
|
||||||
lowercase_escape,
|
lowercase_escape,
|
||||||
smuggle_url,
|
smuggle_url,
|
||||||
unescapeHTML,
|
unescapeHTML,
|
||||||
update_url_query,
|
update_url_query,
|
||||||
|
int_or_none,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class NBCIE(InfoExtractor):
|
class NBCIE(AdobePassIE):
|
||||||
_VALID_URL = r'https?://(?:www\.)?nbc\.com/(?:[^/]+/)+(?P<id>n?\d+)'
|
_VALID_URL = r'https?://(?:www\.)?nbc\.com/(?:[^/]+/)+(?P<id>n?\d+)'
|
||||||
|
|
||||||
_TESTS = [
|
_TESTS = [
|
||||||
{
|
{
|
||||||
'url': 'http://www.nbc.com/the-tonight-show/segments/112966',
|
'url': 'http://www.nbc.com/the-tonight-show/video/jimmy-fallon-surprises-fans-at-ben-jerrys/2848237',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '112966',
|
'id': '2848237',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
|
'title': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
|
||||||
'description': 'Jimmy gives out free scoops of his new "Tonight Dough" ice cream flavor by surprising customers at the Ben & Jerry\'s scoop shop.',
|
'description': 'Jimmy gives out free scoops of his new "Tonight Dough" ice cream flavor by surprising customers at the Ben & Jerry\'s scoop shop.',
|
||||||
@ -69,7 +72,7 @@ class NBCIE(InfoExtractor):
|
|||||||
# HLS streams requires the 'hdnea3' cookie
|
# HLS streams requires the 'hdnea3' cookie
|
||||||
'url': 'http://www.nbc.com/Kings/video/goliath/n1806',
|
'url': 'http://www.nbc.com/Kings/video/goliath/n1806',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'n1806',
|
'id': '101528f5a9e8127b107e98c5e6ce4638',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Goliath',
|
'title': 'Goliath',
|
||||||
'description': 'When an unknown soldier saves the life of the King\'s son in battle, he\'s thrust into the limelight and politics of the kingdom.',
|
'description': 'When an unknown soldier saves the life of the King\'s son in battle, he\'s thrust into the limelight and politics of the kingdom.',
|
||||||
@ -87,6 +90,46 @@ class NBCIE(InfoExtractor):
|
|||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, video_id)
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
info = {
|
||||||
|
'_type': 'url_transparent',
|
||||||
|
'ie_key': 'ThePlatform',
|
||||||
|
'id': video_id,
|
||||||
|
}
|
||||||
|
video_data = None
|
||||||
|
preload = self._search_regex(
|
||||||
|
r'PRELOAD\s*=\s*({.+})', webpage, 'preload data', default=None)
|
||||||
|
if preload:
|
||||||
|
preload_data = self._parse_json(preload, video_id)
|
||||||
|
path = compat_urllib_parse_urlparse(url).path.rstrip('/')
|
||||||
|
entity_id = preload_data.get('xref', {}).get(path)
|
||||||
|
video_data = preload_data.get('entities', {}).get(entity_id)
|
||||||
|
if video_data:
|
||||||
|
query = {
|
||||||
|
'mbr': 'true',
|
||||||
|
'manifest': 'm3u',
|
||||||
|
}
|
||||||
|
video_id = video_data['guid']
|
||||||
|
title = video_data['title']
|
||||||
|
if video_data.get('entitlement') == 'auth':
|
||||||
|
resource = self._get_mvpd_resource(
|
||||||
|
'nbcentertainment', title, video_id,
|
||||||
|
video_data.get('vChipRating'))
|
||||||
|
query['auth'] = self._extract_mvpd_auth(
|
||||||
|
url, video_id, 'nbcentertainment', resource)
|
||||||
|
theplatform_url = smuggle_url(update_url_query(
|
||||||
|
'http://link.theplatform.com/s/NnzsPC/media/guid/2410887629/' + video_id,
|
||||||
|
query), {'force_smil_url': True})
|
||||||
|
info.update({
|
||||||
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'url': theplatform_url,
|
||||||
|
'description': video_data.get('description'),
|
||||||
|
'keywords': video_data.get('keywords'),
|
||||||
|
'season_number': int_or_none(video_data.get('seasonNumber')),
|
||||||
|
'episode_number': int_or_none(video_data.get('episodeNumber')),
|
||||||
|
'series': video_data.get('showName'),
|
||||||
|
})
|
||||||
|
else:
|
||||||
theplatform_url = unescapeHTML(lowercase_escape(self._html_search_regex(
|
theplatform_url = unescapeHTML(lowercase_escape(self._html_search_regex(
|
||||||
[
|
[
|
||||||
r'(?:class="video-player video-player-full" data-mpx-url|class="player" src)="(.*?)"',
|
r'(?:class="video-player video-player-full" data-mpx-url|class="player" src)="(.*?)"',
|
||||||
@ -96,12 +139,8 @@ class NBCIE(InfoExtractor):
|
|||||||
webpage, 'theplatform url').replace('_no_endcard', '').replace('\\/', '/')))
|
webpage, 'theplatform url').replace('_no_endcard', '').replace('\\/', '/')))
|
||||||
if theplatform_url.startswith('//'):
|
if theplatform_url.startswith('//'):
|
||||||
theplatform_url = 'http:' + theplatform_url
|
theplatform_url = 'http:' + theplatform_url
|
||||||
return {
|
info['url'] = smuggle_url(theplatform_url, {'source_url': url})
|
||||||
'_type': 'url_transparent',
|
return info
|
||||||
'ie_key': 'ThePlatform',
|
|
||||||
'url': smuggle_url(theplatform_url, {'source_url': url}),
|
|
||||||
'id': video_id,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class NBCSportsVPlayerIE(InfoExtractor):
|
class NBCSportsVPlayerIE(InfoExtractor):
|
||||||
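Note on the `NBCIE` hunk above: the new code path reads a `PRELOAD` JSON object from the page, maps the URL path to an entity id through `xref`, and then looks that id up in `entities`. A minimal sketch of that lookup follows, assuming a made-up payload; `find_video_entity` is a hypothetical helper, and the standard-library `urlparse` stands in for `compat_urllib_parse_urlparse`.

```python
import json
from urllib.parse import urlparse


def find_video_entity(preload_json, page_url):
    """Resolve the entity for a page URL via the xref -> entities indirection."""
    preload_data = json.loads(preload_json)
    # The diff strips a trailing slash before using the path as the xref key.
    path = urlparse(page_url).path.rstrip('/')
    entity_id = preload_data.get('xref', {}).get(path)
    # Returns None when either the path or the entity id is unknown.
    return preload_data.get('entities', {}).get(entity_id)


# Made-up payload for illustration only; real pages carry many more fields.
SAMPLE_PRELOAD = json.dumps({
    'xref': {'/show/video/clip/123': 'e1'},
    'entities': {'e1': {'guid': 'abc', 'title': 'Clip'}},
})
```

When the lookup yields nothing, the extractor in the diff falls back to the older regex-based search for the ThePlatform URL.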
|
@ -18,6 +18,7 @@ from ..utils import (
|
|||||||
parse_duration,
|
parse_duration,
|
||||||
qualities,
|
qualities,
|
||||||
srt_subtitles_timecode,
|
srt_subtitles_timecode,
|
||||||
|
update_url_query,
|
||||||
urlencode_postdata,
|
urlencode_postdata,
|
||||||
)
|
)
|
||||||
|
|
||||||
@ -92,6 +93,10 @@ class PluralsightIE(PluralsightBaseIE):
|
|||||||
raise ExtractorError('Unable to login: %s' % error, expected=True)
|
raise ExtractorError('Unable to login: %s' % error, expected=True)
|
||||||
|
|
||||||
if all(p not in response for p in ('__INITIAL_STATE__', '"currentUser"')):
|
if all(p not in response for p in ('__INITIAL_STATE__', '"currentUser"')):
|
||||||
|
BLOCKED = 'Your account has been blocked due to suspicious activity'
|
||||||
|
if BLOCKED in response:
|
||||||
|
raise ExtractorError(
|
||||||
|
'Unable to login: %s' % BLOCKED, expected=True)
|
||||||
raise ExtractorError('Unable to log in')
|
raise ExtractorError('Unable to log in')
|
||||||
|
|
||||||
def _get_subtitles(self, author, clip_id, lang, name, duration, video_id):
|
def _get_subtitles(self, author, clip_id, lang, name, duration, video_id):
|
||||||
@ -327,25 +332,44 @@ class PluralsightCourseIE(PluralsightBaseIE):
|
|||||||
# TODO: PSM cookie
|
# TODO: PSM cookie
|
||||||
|
|
||||||
course = self._download_json(
|
course = self._download_json(
|
||||||
'%s/data/course/%s' % (self._API_BASE, course_id),
|
'%s/player/functions/rpc' % self._API_BASE, course_id,
|
||||||
course_id, 'Downloading course JSON')
|
'Downloading course JSON',
|
||||||
|
data=json.dumps({
|
||||||
|
'fn': 'bootstrapPlayer',
|
||||||
|
'payload': {
|
||||||
|
'courseId': course_id,
|
||||||
|
}
|
||||||
|
}).encode('utf-8'),
|
||||||
|
headers={
|
||||||
|
'Content-Type': 'application/json;charset=utf-8'
|
||||||
|
})['payload']['course']
|
||||||
|
|
||||||
title = course['title']
|
title = course['title']
|
||||||
|
course_name = course['name']
|
||||||
|
course_data = course['modules']
|
||||||
description = course.get('description') or course.get('shortDescription')
|
description = course.get('description') or course.get('shortDescription')
|
||||||
|
|
||||||
course_data = self._download_json(
|
|
||||||
'%s/data/course/content/%s' % (self._API_BASE, course_id),
|
|
||||||
course_id, 'Downloading course data JSON')
|
|
||||||
|
|
||||||
entries = []
|
entries = []
|
||||||
for num, module in enumerate(course_data, 1):
|
for num, module in enumerate(course_data, 1):
|
||||||
for clip in module.get('clips', []):
|
author = module.get('author')
|
||||||
player_parameters = clip.get('playerParameters')
|
module_name = module.get('name')
|
||||||
if not player_parameters:
|
if not author or not module_name:
|
||||||
continue
|
continue
|
||||||
|
for clip in module.get('clips', []):
|
||||||
|
clip_index = int_or_none(clip.get('index'))
|
||||||
|
if clip_index is None:
|
||||||
|
continue
|
||||||
|
clip_url = update_url_query(
|
||||||
|
'%s/player' % self._API_BASE, query={
|
||||||
|
'mode': 'live',
|
||||||
|
'course': course_name,
|
||||||
|
'author': author,
|
||||||
|
'name': module_name,
|
||||||
|
'clip': clip_index,
|
||||||
|
})
|
||||||
entries.append({
|
entries.append({
|
||||||
'_type': 'url_transparent',
|
'_type': 'url_transparent',
|
||||||
'url': '%s/training/player?%s' % (self._API_BASE, player_parameters),
|
'url': clip_url,
|
||||||
'ie_key': PluralsightIE.ie_key(),
|
'ie_key': PluralsightIE.ie_key(),
|
||||||
'chapter': module.get('title'),
|
'chapter': module.get('title'),
|
||||||
'chapter_number': num,
|
'chapter_number': num,
|
||||||
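Note on the `PluralsightCourseIE` hunk above: clip URLs are no longer taken from `playerParameters` but assembled from the course name, module author, module name, and clip index. The sketch below shows the same assembly, assuming the standard-library `urlencode` in place of `update_url_query`; `build_clip_url` and the base URL used with it are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_clip_url(api_base, course_name, author, module_name, clip_index):
    """Assemble a player URL from the fields the diff reads per clip."""
    query = {
        'mode': 'live',
        'course': course_name,
        'author': author,
        'name': module_name,
        'clip': clip_index,
    }
    return '%s/player?%s' % (api_base, urlencode(query))


def query_of(url):
    """Parse the query string back into a dict of value lists, for inspection."""
    return parse_qs(urlparse(url).query)
```

As in the diff, a module without `author` or `name`, or a clip without a numeric `index`, would have to be skipped before calling such a helper.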
|
@ -156,11 +156,17 @@ class PornHubIE(InfoExtractor):
|
|||||||
comment_count = self._extract_count(
|
comment_count = self._extract_count(
|
||||||
r'All Comments\s*<span>\(([\d,.]+)\)', webpage, 'comment')
|
r'All Comments\s*<span>\(([\d,.]+)\)', webpage, 'comment')
|
||||||
|
|
||||||
|
video_variables = {}
|
||||||
|
for video_variablename, quote, video_variable in re.findall(
|
||||||
|
r'(player_quality_[0-9]{3,4}p\w+)\s*=\s*(["\'])(.+?)\2;', webpage):
|
||||||
|
video_variables[video_variablename] = video_variable
|
||||||
|
|
||||||
video_urls = []
|
video_urls = []
|
||||||
for quote, video_url in re.findall(
|
for encoded_video_url in re.findall(
|
||||||
r'player_quality_[0-9]{3,4}p\s*=\s*(["\'])(.+?)\1;', webpage):
|
r'player_quality_[0-9]{3,4}p\s*=(.+?);', webpage):
|
||||||
video_urls.append(compat_urllib_parse_unquote(re.sub(
|
for varname, varval in video_variables.items():
|
||||||
r'{0}\s*\+\s*{0}'.format(quote), '', video_url)))
|
encoded_video_url = encoded_video_url.replace(varname, varval)
|
||||||
|
video_urls.append(re.sub(r'[\s+]', '', encoded_video_url))
|
||||||
|
|
||||||
if webpage.find('"encrypted":true') != -1:
|
if webpage.find('"encrypted":true') != -1:
|
||||||
password = compat_urllib_parse_unquote_plus(
|
password = compat_urllib_parse_unquote_plus(
|
||||||
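Note on the `PornHubIE` hunk above: the page apparently splits each URL across several `player_quality_…p_*` variables that are concatenated in JavaScript, so the new code first collects those variables and then substitutes them into the concatenation expression. The sketch below replays that logic on a made-up page fragment; `SAMPLE_PAGE` and `extract_video_urls` are assumptions for illustration.

```python
import re

# Hypothetical page fragment; real pages differ.
SAMPLE_PAGE = '''
var player_quality_720p_a = "http://exa";
var player_quality_720p_b = "mple.com/v.mp4";
var player_quality_720p = player_quality_720p_a + player_quality_720p_b;
'''


def extract_video_urls(webpage):
    """Rebuild URLs from split JavaScript string variables, as in the diff."""
    # Step 1: collect the partial string variables (name -> value).
    video_variables = {}
    for name, _quote, value in re.findall(
            r'(player_quality_[0-9]{3,4}p\w+)\s*=\s*(["\'])(.+?)\2;', webpage):
        video_variables[name] = value

    # Step 2: substitute them into each concatenation expression, then
    # drop whitespace and '+' characters left over from the expression.
    video_urls = []
    for encoded in re.findall(
            r'player_quality_[0-9]{3,4}p\s*=(.+?);', webpage):
        for varname, varval in video_variables.items():
            encoded = encoded.replace(varname, varval)
        video_urls.append(re.sub(r'[\s+]', '', encoded))
    return video_urls
```

One consequence of this approach is that any literal `+` inside a URL fragment is stripped along with the concatenation operators.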
|
60
youtube_dl/extractor/scrippsnetworks.py
Normal file
@ -0,0 +1,60 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .adobepass import AdobePassIE
|
||||||
|
from ..utils import (
|
||||||
|
int_or_none,
|
||||||
|
smuggle_url,
|
||||||
|
update_url_query,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class ScrippsNetworksWatchIE(AdobePassIE):
|
||||||
|
IE_NAME = 'scrippsnetworks:watch'
|
||||||
|
_VALID_URL = r'https?://watch\.(?:hgtv|foodnetwork|travelchannel|diynetwork|cookingchanneltv)\.com/player\.[A-Z0-9]+\.html#(?P<id>\d+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'http://watch.hgtv.com/player.HNT.html#0256538',
|
||||||
|
'md5': '26545fd676d939954c6808274bdb905a',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '0256538',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Seeking a Wow House',
|
||||||
|
'description': 'Buyers retiring in Palm Springs, California, want a modern house with major wow factor. They\'re also looking for a pool and a large, open floorplan with tall windows looking out at the views.',
|
||||||
|
'uploader': 'SCNI',
|
||||||
|
'upload_date': '20170207',
|
||||||
|
'timestamp': 1486450493,
|
||||||
|
},
|
||||||
|
'skip': 'requires TV provider authentication',
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
video_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
channel = self._parse_json(self._search_regex(
|
||||||
|
r'"channels"\s*:\s*(\[.+\])',
|
||||||
|
webpage, 'channels'), video_id)[0]
|
||||||
|
video_data = next(v for v in channel['videos'] if v.get('nlvid') == video_id)
|
||||||
|
title = video_data['title']
|
||||||
|
release_url = video_data['releaseUrl']
|
||||||
|
if video_data.get('restricted'):
|
||||||
|
requestor_id = self._search_regex(
|
||||||
|
r'requestorId\s*=\s*"([^"]+)";', webpage, 'requestor id')
|
||||||
|
resource = self._get_mvpd_resource(
|
||||||
|
requestor_id, title, video_id,
|
||||||
|
video_data.get('ratings', [{}])[0].get('rating'))
|
||||||
|
auth = self._extract_mvpd_auth(
|
||||||
|
url, video_id, requestor_id, resource)
|
||||||
|
release_url = update_url_query(release_url, {'auth': auth})
|
||||||
|
|
||||||
|
return {
|
||||||
|
'_type': 'url_transparent',
|
||||||
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'url': smuggle_url(release_url, {'force_smil_url': True}),
|
||||||
|
'description': video_data.get('description'),
|
||||||
|
'thumbnail': video_data.get('thumbnailUrl'),
|
||||||
|
'series': video_data.get('showTitle'),
|
||||||
|
'season_number': int_or_none(video_data.get('season')),
|
||||||
|
'episode_number': int_or_none(video_data.get('episodeNumber')),
|
||||||
|
'ie_key': 'ThePlatform',
|
||||||
|
}
|
@ -1,64 +1,101 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
|
from ..compat import compat_str
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
qualities,
|
|
||||||
int_or_none,
|
|
||||||
mimetype2ext,
|
|
||||||
determine_ext,
|
determine_ext,
|
||||||
|
int_or_none,
|
||||||
|
try_get,
|
||||||
|
qualities,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class SixPlayIE(InfoExtractor):
|
class SixPlayIE(InfoExtractor):
|
||||||
|
IE_NAME = '6play'
|
||||||
_VALID_URL = r'(?:6play:|https?://(?:www\.)?6play\.fr/.+?-c_)(?P<id>[0-9]+)'
|
_VALID_URL = r'(?:6play:|https?://(?:www\.)?6play\.fr/.+?-c_)(?P<id>[0-9]+)'
|
||||||
_TEST = {
|
_TEST = {
|
||||||
'url': 'http://www.6play.fr/jamel-et-ses-amis-au-marrakech-du-rire-p_1316/jamel-et-ses-amis-au-marrakech-du-rire-2015-c_11495320',
|
'url': 'http://www.6play.fr/le-meilleur-patissier-p_1807/le-meilleur-patissier-special-fetes-mercredi-a-21-00-sur-m6-c_11638450',
|
||||||
'md5': '42310bffe4ba3982db112b9cd3467328',
|
'md5': '42310bffe4ba3982db112b9cd3467328',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '11495320',
|
'id': '11638450',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Jamel et ses amis au Marrakech du rire 2015',
|
'title': 'Le Meilleur Pâtissier, spécial fêtes mercredi à 21:00 sur M6',
|
||||||
'description': 'md5:ba2149d5c321d5201b78070ee839d872',
|
'description': 'md5:308853f6a5f9e2d55a30fc0654de415f',
|
||||||
|
'duration': 39,
|
||||||
|
'series': 'Le meilleur pâtissier',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
clip_data = self._download_json(
|
|
||||||
'https://player.m6web.fr/v2/video/config/6play-auth/FR/%s.json' % video_id,
|
|
||||||
video_id)
|
|
||||||
video_data = clip_data['videoInfo']
|
|
||||||
|
|
||||||
|
data = self._download_json(
|
||||||
|
'https://pc.middleware.6play.fr/6play/v2/platforms/m6group_web/services/6play/videos/clip_%s' % video_id,
|
||||||
|
video_id, query={
|
||||||
|
'csa': 5,
|
||||||
|
'with': 'clips',
|
||||||
|
})
|
||||||
|
|
||||||
|
clip_data = data['clips'][0]
|
||||||
|
title = clip_data['title']
|
||||||
|
|
||||||
|
urls = []
|
||||||
quality_key = qualities(['lq', 'sd', 'hq', 'hd'])
|
quality_key = qualities(['lq', 'sd', 'hq', 'hd'])
|
||||||
formats = []
|
formats = []
|
||||||
for source in clip_data['sources']:
|
for asset in clip_data['assets']:
|
||||||
source_type, source_url = source.get('type'), source.get('src')
|
asset_url = asset.get('full_physical_path')
|
||||||
if not source_url or source_type == 'hls/primetime':
|
protocol = asset.get('protocol')
|
||||||
|
if not asset_url or protocol == 'primetime' or asset_url in urls:
|
||||||
continue
|
continue
|
||||||
ext = mimetype2ext(source_type) or determine_ext(source_url)
|
urls.append(asset_url)
|
||||||
if ext == 'm3u8':
|
container = asset.get('video_container')
|
||||||
|
ext = determine_ext(asset_url)
|
||||||
|
if container == 'm3u8' or ext == 'm3u8':
|
||||||
|
if protocol == 'usp':
|
||||||
|
asset_url = re.sub(r'/([^/]+)\.ism/[^/]*\.m3u8', r'/\1.ism/\1.m3u8', asset_url)
|
||||||
formats.extend(self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
source_url, video_id, 'mp4', 'm3u8_native',
|
asset_url, video_id, 'mp4', 'm3u8_native',
|
||||||
m3u8_id='hls', fatal=False))
|
m3u8_id='hls', fatal=False))
|
||||||
formats.extend(self._extract_f4m_formats(
|
formats.extend(self._extract_f4m_formats(
|
||||||
source_url.replace('.m3u8', '.f4m'),
|
asset_url.replace('.m3u8', '.f4m'),
|
||||||
video_id, f4m_id='hds', fatal=False))
|
video_id, f4m_id='hds', fatal=False))
|
||||||
elif ext == 'mp4':
|
formats.extend(self._extract_mpd_formats(
|
||||||
quality = source.get('quality')
|
asset_url.replace('.m3u8', '.mpd'),
|
||||||
|
video_id, mpd_id='dash', fatal=False))
|
||||||
|
formats.extend(self._extract_ism_formats(
|
||||||
|
re.sub(r'/[^/]+\.m3u8', '/Manifest', asset_url),
|
||||||
|
video_id, ism_id='mss', fatal=False))
|
||||||
|
else:
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
asset_url, video_id, 'mp4', 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False))
|
||||||
|
elif container == 'mp4' or ext == 'mp4':
|
||||||
|
quality = asset.get('video_quality')
|
||||||
formats.append({
|
formats.append({
|
||||||
'url': source_url,
|
'url': asset_url,
|
||||||
'format_id': quality,
|
'format_id': quality,
|
||||||
'quality': quality_key(quality),
|
'quality': quality_key(quality),
|
||||||
'ext': ext,
|
'ext': ext,
|
||||||
})
|
})
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
def get(getter):
|
||||||
|
for src in (data, clip_data):
|
||||||
|
v = try_get(src, getter, compat_str)
|
||||||
|
if v:
|
||||||
|
return v
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'title': video_data['title'].strip(),
|
'title': title,
|
||||||
'description': video_data.get('description'),
|
'description': get(lambda x: x['description']),
|
||||||
'duration': int_or_none(video_data.get('duration')),
|
'duration': int_or_none(clip_data.get('duration')),
|
||||||
'series': video_data.get('titlePgm'),
|
'series': get(lambda x: x['program']['title']),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
}
|
}
|
||||||
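Note on the `SixPlayIE` hunk above: for assets with protocol `usp`, the HLS URL is normalized so the playlist name matches the `.ism` directory name, and the HDS, DASH, and Smooth Streaming manifests are then derived from it. The sketch below applies the same substitutions to a made-up URL; `derive_manifest_urls` is a hypothetical helper and nothing is downloaded.

```python
import re


def derive_manifest_urls(asset_url):
    """Derive sibling manifest URLs from a usp-style HLS URL, as in the diff."""
    # Rename the playlist after the .ism directory: /x.ism/any.m3u8 -> /x.ism/x.m3u8
    hls_url = re.sub(
        r'/([^/]+)\.ism/[^/]*\.m3u8', r'/\1.ism/\1.m3u8', asset_url)
    return {
        'hls': hls_url,
        'hds': hls_url.replace('.m3u8', '.f4m'),
        'dash': hls_url.replace('.m3u8', '.mpd'),
        # Smooth Streaming uses a fixed "Manifest" path segment.
        'mss': re.sub(r'/[^/]+\.m3u8', '/Manifest', hls_url),
    }
```

In the extractor each derived URL is passed to the matching `_extract_*_formats` call with `fatal=False`, so a missing manifest is skipped rather than aborting extraction.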
|
52
youtube_dl/extractor/sprout.py
Normal file
@ -0,0 +1,52 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .adobepass import AdobePassIE
|
||||||
|
from ..utils import (
|
||||||
|
extract_attributes,
|
||||||
|
update_url_query,
|
||||||
|
smuggle_url,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class SproutIE(AdobePassIE):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?sproutonline\.com/watch/(?P<id>[^/?#]+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'http://www.sproutonline.com/watch/cowboy-adventure',
|
||||||
|
'md5': '74bf14128578d1e040c3ebc82088f45f',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '9dexnwtmh8_X',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'A Cowboy Adventure',
|
||||||
|
'description': 'Ruff-Ruff, Tweet and Dave get to be cowboys for the day at Six Cow Corral.',
|
||||||
|
'timestamp': 1437758640,
|
||||||
|
'upload_date': '20150724',
|
||||||
|
'uploader': 'NBCU-SPROUT-NEW',
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
video_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
video_component = self._search_regex(
|
||||||
|
r'(?s)(<div[^>]+data-component="video"[^>]*?>)',
|
||||||
|
webpage, 'video component', default=None)
|
||||||
|
if video_component:
|
||||||
|
options = self._parse_json(extract_attributes(
|
||||||
|
video_component)['data-options'], video_id)
|
||||||
|
theplatform_url = options['video']
|
||||||
|
query = {
|
||||||
|
'mbr': 'true',
|
||||||
|
'manifest': 'm3u',
|
||||||
|
}
|
||||||
|
if options.get('protected'):
|
||||||
|
query['auth'] = self._extract_mvpd_auth(url, options['pid'], 'sprout', 'sprout')
|
||||||
|
theplatform_url = smuggle_url(update_url_query(
|
||||||
|
theplatform_url, query), {'force_smil_url': True})
|
||||||
|
else:
|
||||||
|
iframe = self._search_regex(
|
||||||
|
r'(<iframe[^>]+id="sproutVideoIframe"[^>]*?>)',
|
||||||
|
webpage, 'iframe')
|
||||||
|
theplatform_url = extract_attributes(iframe)['src']
|
||||||
|
|
||||||
|
return self.url_result(theplatform_url, 'ThePlatform')
|
@ -306,9 +306,10 @@ class ThePlatformFeedIE(ThePlatformBaseIE):
|
|||||||
},
|
},
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _extract_feed_info(self, provider_id, feed_id, filter_query, video_id, custom_fields=None, asset_types_query={}):
|
def _extract_feed_info(self, provider_id, feed_id, filter_query, video_id, custom_fields=None, asset_types_query={}, account_id=None):
|
||||||
real_url = self._URL_TEMPLATE % (self.http_scheme(), provider_id, feed_id, filter_query)
|
real_url = self._URL_TEMPLATE % (self.http_scheme(), provider_id, feed_id, filter_query)
|
||||||
entry = self._download_json(real_url, video_id)['entries'][0]
|
entry = self._download_json(real_url, video_id)['entries'][0]
|
||||||
|
main_smil_url = 'http://link.theplatform.com/s/%s/media/guid/%d/%s' % (provider_id, account_id, entry['guid']) if account_id else None
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
subtitles = {}
|
subtitles = {}
|
||||||
@ -333,7 +334,7 @@ class ThePlatformFeedIE(ThePlatformBaseIE):
|
|||||||
if asset_type in asset_types_query:
|
if asset_type in asset_types_query:
|
||||||
query.update(asset_types_query[asset_type])
|
query.update(asset_types_query[asset_type])
|
||||||
cur_formats, cur_subtitles = self._extract_theplatform_smil(update_url_query(
|
cur_formats, cur_subtitles = self._extract_theplatform_smil(update_url_query(
|
||||||
smil_url, query), video_id, 'Downloading SMIL data for %s' % asset_type)
|
main_smil_url or smil_url, query), video_id, 'Downloading SMIL data for %s' % asset_type)
|
||||||
formats.extend(cur_formats)
|
formats.extend(cur_formats)
|
||||||
subtitles = self._merge_subtitles(subtitles, cur_subtitles)
|
subtitles = self._merge_subtitles(subtitles, cur_subtitles)
|
||||||
|
|
||||||
|
75
youtube_dl/extractor/tvplayer.py
Normal file
@ -0,0 +1,75 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..compat import compat_HTTPError
|
||||||
|
from ..utils import (
|
||||||
|
extract_attributes,
|
||||||
|
urlencode_postdata,
|
||||||
|
ExtractorError,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class TVPlayerIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?tvplayer\.com/watch/(?P<id>[^/?#]+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'http://tvplayer.com/watch/bbcone',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '89',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': r're:^BBC One [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
# m3u8 download
|
||||||
|
'skip_download': True,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
display_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
|
current_channel = extract_attributes(self._search_regex(
|
||||||
|
r'(<div[^>]+class="[^"]*current-channel[^"]*"[^>]*>)',
|
||||||
|
webpage, 'channel element'))
|
||||||
|
title = current_channel['data-name']
|
||||||
|
|
||||||
|
resource_id = self._search_regex(
|
||||||
|
r'resourceId\s*=\s*"(\d+)"', webpage, 'resource id')
|
||||||
|
platform = self._search_regex(
|
||||||
|
r'platform\s*=\s*"([^"]+)"', webpage, 'platform')
|
||||||
|
token = self._search_regex(
|
||||||
|
r'token\s*=\s*"([^"]+)"', webpage, 'token', default='null')
|
||||||
|
validate = self._search_regex(
|
||||||
|
r'validate\s*=\s*"([^"]+)"', webpage, 'validate', default='null')
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = self._download_json(
|
||||||
|
'http://api.tvplayer.com/api/v2/stream/live',
|
||||||
|
resource_id, headers={
|
||||||
|
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
|
||||||
|
}, data=urlencode_postdata({
|
||||||
|
'service': 1,
|
||||||
|
'platform': platform,
|
||||||
|
'id': resource_id,
|
||||||
|
'token': token,
|
||||||
|
'validate': validate,
|
||||||
|
}))['tvplayer']['response']
|
||||||
|
except ExtractorError as e:
|
||||||
|
if isinstance(e.cause, compat_HTTPError):
|
||||||
|
response = self._parse_json(
|
||||||
|
e.cause.read().decode(), resource_id)['tvplayer']['response']
|
||||||
|
raise ExtractorError(
|
||||||
|
'%s said: %s' % (self.IE_NAME, response['error']), expected=True)
|
||||||
|
raise
|
||||||
|
|
||||||
|
formats = self._extract_m3u8_formats(response['stream'], resource_id, 'mp4')
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': resource_id,
|
||||||
|
'display_id': display_id,
|
||||||
|
'title': self._live_title(title),
|
||||||
|
'formats': formats,
|
||||||
|
'is_live': True,
|
||||||
|
}
|
@ -53,14 +53,15 @@ class XTubeIE(InfoExtractor):
|
|||||||
|
|
||||||
if not display_id:
|
if not display_id:
|
||||||
display_id = video_id
|
display_id = video_id
|
||||||
url = 'http://www.xtube.com/watch.php?v=%s' % video_id
|
url = 'http://www.xtube.com/video-watch/-%s' % video_id
|
||||||
|
|
||||||
req = sanitized_Request(url)
|
req = sanitized_Request(url)
|
||||||
req.add_header('Cookie', 'age_verified=1; cookiesAccepted=1')
|
req.add_header('Cookie', 'age_verified=1; cookiesAccepted=1')
|
||||||
webpage = self._download_webpage(req, display_id)
|
webpage = self._download_webpage(req, display_id)
|
||||||
|
|
||||||
sources = self._parse_json(self._search_regex(
|
sources = self._parse_json(self._search_regex(
|
||||||
r'sources\s*:\s*({.+?}),', webpage, 'sources'), video_id)
|
r'(["\'])sources\1\s*:\s*(?P<sources>{.+?}),',
|
||||||
|
webpage, 'sources', group='sources'), video_id)
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
for format_id, format_url in sources.items():
|
for format_id, format_url in sources.items():
|
||||||
@ -81,10 +82,10 @@ class XTubeIE(InfoExtractor):
|
|||||||
r'<span[^>]+class="nickname"[^>]*>([^<]+)'),
|
r'<span[^>]+class="nickname"[^>]*>([^<]+)'),
|
||||||
webpage, 'uploader', fatal=False)
|
webpage, 'uploader', fatal=False)
|
||||||
duration = parse_duration(self._search_regex(
|
duration = parse_duration(self._search_regex(
|
||||||
r'<dt>Runtime:</dt>\s*<dd>([^<]+)</dd>',
|
r'<dt>Runtime:?</dt>\s*<dd>([^<]+)</dd>',
|
||||||
webpage, 'duration', fatal=False))
|
webpage, 'duration', fatal=False))
|
||||||
view_count = str_to_int(self._search_regex(
|
view_count = str_to_int(self._search_regex(
|
||||||
r'<dt>Views:</dt>\s*<dd>([\d,\.]+)</dd>',
|
r'<dt>Views:?</dt>\s*<dd>([\d,\.]+)</dd>',
|
||||||
webpage, 'view count', fatal=False))
|
webpage, 'view count', fatal=False))
|
||||||
comment_count = str_to_int(self._html_search_regex(
|
comment_count = str_to_int(self._html_search_regex(
|
||||||
r'>Comments? \(([\d,\.]+)\)<',
|
r'>Comments? \(([\d,\.]+)\)<',
|
||||||
|
@ -337,17 +337,30 @@ def get_element_by_id(id, html):
|
|||||||
|
|
||||||
|
|
||||||
def get_element_by_class(class_name, html):
|
def get_element_by_class(class_name, html):
|
||||||
return get_element_by_attribute(
|
"""Return the content of the first tag with the specified class in the passed HTML document"""
|
||||||
|
retval = get_elements_by_class(class_name, html)
|
||||||
|
return retval[0] if retval else None
|
||||||
|
|
||||||
|
|
||||||
|
def get_element_by_attribute(attribute, value, html, escape_value=True):
|
||||||
|
retval = get_elements_by_attribute(attribute, value, html, escape_value)
|
||||||
|
return retval[0] if retval else None
|
||||||
|
|
||||||
|
|
||||||
|
def get_elements_by_class(class_name, html):
|
||||||
|
"""Return the content of all tags with the specified class in the passed HTML document as a list"""
|
||||||
|
return get_elements_by_attribute(
|
||||||
'class', r'[^\'"]*\b%s\b[^\'"]*' % re.escape(class_name),
|
'class', r'[^\'"]*\b%s\b[^\'"]*' % re.escape(class_name),
|
||||||
html, escape_value=False)
|
html, escape_value=False)
|
||||||
|
|
||||||
|
|
||||||
def get_element_by_attribute(attribute, value, html, escape_value=True):
|
def get_elements_by_attribute(attribute, value, html, escape_value=True):
|
||||||
"""Return the content of the tag with the specified attribute in the passed HTML document"""
|
"""Return the content of the tag with the specified attribute in the passed HTML document"""
|
||||||
|
|
||||||
value = re.escape(value) if escape_value else value
|
value = re.escape(value) if escape_value else value
|
||||||
|
|
||||||
m = re.search(r'''(?xs)
|
retlist = []
|
||||||
|
for m in re.finditer(r'''(?xs)
|
||||||
<([a-zA-Z0-9:._-]+)
|
<([a-zA-Z0-9:._-]+)
|
||||||
(?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'))*?
|
(?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'))*?
|
||||||
\s+%s=['"]?%s['"]?
|
\s+%s=['"]?%s['"]?
|
||||||
@ -355,16 +368,15 @@ def get_element_by_attribute(attribute, value, html, escape_value=True):
|
|||||||
\s*>
|
\s*>
|
||||||
(?P<content>.*?)
|
(?P<content>.*?)
|
||||||
</\1>
|
</\1>
|
||||||
''' % (re.escape(attribute), value), html)
|
''' % (re.escape(attribute), value), html):
|
||||||
|
|
||||||
if not m:
|
|
||||||
return None
|
|
||||||
res = m.group('content')
|
res = m.group('content')
|
||||||
|
|
||||||
if res.startswith('"') or res.startswith("'"):
|
if res.startswith('"') or res.startswith("'"):
|
||||||
res = res[1:-1]
|
res = res[1:-1]
|
||||||
|
|
||||||
return unescapeHTML(res)
|
retlist.append(unescapeHTML(res))
|
||||||
|
|
||||||
|
return retlist
|
||||||
|
|
||||||
|
|
||||||
class HTMLAttributeParser(compat_HTMLParser):
|
class HTMLAttributeParser(compat_HTMLParser):
|
||||||
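Note on the `utils.py` hunks above: the single-result helpers become thin wrappers that return the first item of new list-returning variants (`re.search` replaced by `re.finditer`). The sketch below illustrates that wrapper pattern with a deliberately simplified regex; `find_all_by_class` and `find_first_by_class` are stand-in names, and the pattern is far less tolerant than the one in the diff.

```python
import re


def find_all_by_class(class_name, html):
    """Return the inner content of every tag whose class list contains class_name.

    Simplified stand-in: requires a quoted class attribute and does not
    unescape HTML entities.
    """
    pattern = r'''(?s)<(\w+)[^>]*\sclass=(["'])[^"']*\b%s\b[^"']*\2[^>]*>(.*?)</\1>''' % re.escape(class_name)
    return [m.group(3) for m in re.finditer(pattern, html)]


def find_first_by_class(class_name, html):
    """First-match wrapper, mirroring `retval[0] if retval else None` in the diff."""
    results = find_all_by_class(class_name, html)
    return results[0] if results else None


SAMPLE_HTML = '<span class="foo bar">nice</span><span class="foo">also</span>'
```

Keeping the first-match function as a wrapper means existing callers of `get_element_by_class` and `get_element_by_attribute` keep their `None`-on-miss behavior.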
|
@ -1,3 +1,3 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
__version__ = '2017.02.07'
|
__version__ = '2017.02.11'
|
||||||
|