Compare commits


29 Commits

Author SHA1 Message Date
Sergey M․  38d2f8325f  release 2017.04.14  2017-04-14 00:31:49 +07:00
Sergey M․  6f4dd6667b  [ChangeLog] Actualize  2017-04-14 00:29:36 +07:00
Sergey M․  95728fda70  [aenetworks] Add support for specials (closes #12723)  2017-04-13 23:39:55 +07:00
Remita Amine  3f7409f745  [hbo] extract m3u8 formats  2017-04-13 12:28:22 +01:00
Remita Amine  b2fff30817  [go90] Add new extractor(closes #10127)  2017-04-13 12:28:22 +01:00
Remita Amine  f59746372a  [downloader/hls] add basic support for EXT-X-BYTERANGE tag(#10955)  2017-04-13 12:28:22 +01:00
Yen Chi Hsuan  34d98cc411  [tv2hu] Fix invalid escape sequence on Py3.6  2017-04-13 16:52:52 +08:00
Remita Amine  40fcba5edb  improve coding style  2017-04-12 20:38:43 +01:00
Remita Amine  e4d74e2778  [tv2hu] improve extraction  2017-04-12 19:44:32 +01:00
Vukkk  3ef1d0c733  [tv2hu] Add new extractor  2017-04-12 19:36:02 +01:00
Remita Amine  3962260b7d  Credit @SkiTheSlicer for Verizon adobe pass support (#12721)  2017-04-12 16:54:53 +01:00
Remita Amine  0ee79a378a  [generic] exclude urls with xml ext from valid video urls(closes #10768)(closes #11654)  2017-04-12 16:22:08 +01:00
Remita Amine  79a51069e5  [adobepass] improve comcast and verison login code(closes #10803)  2017-04-12 15:19:58 +01:00
Daniel Twardowski  a9a346535b  [adobepass] Add Verizon support (#10803)  2017-04-12 01:18:57 -04:00
Remita Amine  89beedd31f  [youtube] improve m3u8 format extraction  2017-04-11 21:55:59 +01:00
Sergey M․  e109f1ff43  [afreecatv] Fix extraction for videos with different key layout (closes #12718)  2017-04-12 02:17:34 +07:00
Sergey M․  d23028a8fb  [youtube] Remove explicit preference for audio-only and video-only formats (in order not to break sorting when new formats appear)  2017-04-11 22:41:48 +07:00
Sergey M․  6214611a4a  [canalplus] Bypass geo restriction  2017-04-11 22:30:15 +07:00
Sergey M․  1730878167  release 2017.04.11  2017-04-11 02:17:53 +07:00
Sergey M․  689cd458a6  [ChangeLog] Actualize  2017-04-11 02:16:20 +07:00
Sergey M․  6b9466de2f  [afreecatv] Fix extraction (closes #12706)  2017-04-11 02:05:53 +07:00
John Hawkinson  61568e50cf  [generic] Add support for <object> youtube embeds (closes #12637)  2017-04-11 00:04:32 +07:00
Sergey M․  364a69e8c6  [test_download] Fix testing playlists with single video and add comments  2017-04-11 00:01:02 +07:00
Sergey M․  6240925b40  [bbccouk] Treat bitrate as audio+video bitrate in media selector  2017-04-10 22:56:22 +07:00
Sergey M․  964744af95  [bbccouk] Skip unrecognized formats in media selector (#12701)  2017-04-10 22:53:51 +07:00
Sergey M․  1af959ef9f  [bbccouk] Add support for https protocol in media selector (closes #12701)  2017-04-10 22:53:06 +07:00
Remita Amine  a206ef62df  [curiositystream] fix extraction(closes #12638)  2017-04-10 13:50:08 +01:00
Remita Amine  3f2ce6896a  [adn] update subtitle decryption key  2017-04-09 12:33:29 +01:00
Sergey M․  a6f7263cf4  [chaturbate] Fix extraction (closes #12665)  2017-04-09 01:39:40 +07:00
30 changed files with 603 additions and 167 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.04.09*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.04.09**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.04.14*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.04.14**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.04.09
[debug] youtube-dl version 2017.04.14
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -210,3 +210,4 @@ Lars Vierbergen
Juanjo Benages
Xiao Di Guan
Thomas Winant
Daniel Twardowski

View File

@@ -1,3 +1,36 @@
version 2017.04.14

Core
+ [downloader/hls] Add basic support for EXT-X-BYTERANGE tag (#10955)
+ [adobepass] Improve Comcast and Verison login code (#10803)
+ [adobepass] Add support for Verizon (#10803)

Extractors
+ [aenetworks] Add support for specials (#12723)
+ [hbo] Extract HLS formats
+ [go90] Add support for go90.com (#10127)
+ [tv2hu] Add support for tv2.hu (#10509)
+ [generic] Exclude URLs with xml ext from valid video URLs (#10768, #11654)
* [youtube] Improve HLS formats extraction
* [afreecatv] Fix extraction for videos with different key layout (#12718)
- [youtube] Remove explicit preference for audio-only and video-only formats in
  order not to break sorting when new formats appear
* [canalplus] Bypass geo restriction


version 2017.04.11

Extractors
* [afreecatv] Fix extraction (#12706)
+ [generic] Add support for <object> YouTube embeds (#12637)
* [bbccouk] Treat bitrate as audio+video bitrate in media selector
+ [bbccouk] Skip unrecognized formats in media selector (#12701)
+ [bbccouk] Add support for https protocol in media selector (#12701)
* [curiositystream] Fix extraction (#12638)
* [adn] Update subtitle decryption key
* [chaturbate] Fix extraction (#12665, #12688, #12690)


version 2017.04.09

Extractors

View File

@@ -308,6 +308,7 @@
- **Globo**
- **GloboArticle**
- **Go**
- **Go90**
- **GodTube**
- **GodTV**
- **Golem**
@@ -818,6 +819,7 @@
- **Tutv**
- **tv.dfb.de**
- **TV2**
- **tv2.hu**
- **TV2Article**
- **TV3**
- **TV4**: tv4.se and tv4play.se

View File

@@ -199,9 +199,16 @@ def generator(test_case, tname):
self.assertEqual(
test_case['playlist_duration_sum'], got_duration)
# Generalize both playlists and single videos to unified format for
# simplicity
if 'entries' not in res_dict:
res_dict['entries'] = [res_dict]
for tc_num, tc in enumerate(test_cases):
tc_res_dict = res_dict['entries'][tc_num] if is_playlist else res_dict
tc_res_dict = res_dict['entries'][tc_num]
# First, check test cases' data against extracted data alone
expect_info_dict(self, tc_res_dict, tc.get('info_dict', {}))
# Now, check downloaded file consistency
tc_filename = get_tc_filename(tc)
if not test_case.get('params', {}).get('skip_download', False):
self.assertTrue(os.path.exists(tc_filename), msg='Missing file ' + tc_filename)
@@ -219,13 +226,14 @@ def generator(test_case, tname):
if 'md5' in tc:
md5_for_file = _file_md5(tc_filename)
self.assertEqual(md5_for_file, tc['md5'])
# Finally, check test cases' data again but this time against
# extracted data from info JSON file written during processing
info_json_fn = os.path.splitext(tc_filename)[0] + '.info.json'
self.assertTrue(
os.path.exists(info_json_fn),
'Missing info file %s' % info_json_fn)
with io.open(info_json_fn, encoding='utf-8') as infof:
info_dict = json.load(infof)
expect_info_dict(self, info_dict, tc.get('info_dict', {}))
finally:
try_rm_tcs_files()
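
The per-entry checks above rely on a small normalization step: a single-video result is wrapped into a one-element entries list so playlists and single videos go through the same loop. A minimal sketch of that step (the sample dicts are invented):

    def normalize(res_dict):
        # wrap a single-video result so it looks like a one-entry playlist
        if 'entries' not in res_dict:
            res_dict['entries'] = [res_dict]
        return res_dict

    single = {'id': 'abc123', 'title': 'Some video'}
    playlist = {'id': 'pl1', 'entries': [{'id': 'v1'}, {'id': 'v2'}]}
    assert normalize(single)['entries'][0]['id'] == 'abc123'
    assert len(normalize(playlist)['entries']) == 2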

View File

@@ -855,7 +855,7 @@ class YoutubeDL(object):
return self.process_ie_result(
new_result, download=download, extra_info=extra_info)
elif result_type == 'playlist' or result_type == 'multi_video':
elif result_type in ('playlist', 'multi_video'):
# We process each entry in the playlist
playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist)

View File

@@ -2692,7 +2692,7 @@ else:
userhome = pwent.pw_dir
userhome = userhome.rstrip('/')
return (userhome + path[i:]) or '/'
elif compat_os_name == 'nt' or compat_os_name == 'ce':
elif compat_os_name in ('nt', 'ce'):
def compat_expanduser(path):
"""Expand ~ and ~user constructs.

View File

@@ -34,7 +34,7 @@ class HlsFD(FragmentFD):
def can_download(manifest, info_dict):
UNSUPPORTED_FEATURES = (
r'#EXT-X-KEY:METHOD=(?!NONE|AES-128)', # encrypted streams [1]
r'#EXT-X-BYTERANGE', # playlists composed of byte ranges of media files [2]
# r'#EXT-X-BYTERANGE', # playlists composed of byte ranges of media files [2]
# Live streams heuristic does not always work (e.g. geo restricted to Germany
# http://hls-geo.daserste.de/i/videoportal/Film/c_620000/622873/format,716451,716457,716450,716458,716459,.mp4.csmil/index_4_av.m3u8?null=0)
@@ -52,7 +52,9 @@ class HlsFD(FragmentFD):
# 4. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.5
)
check_results = [not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES]
check_results.append(can_decrypt_frag or '#EXT-X-KEY:METHOD=AES-128' not in manifest)
is_aes128_enc = '#EXT-X-KEY:METHOD=AES-128' in manifest
check_results.append(can_decrypt_frag or not is_aes128_enc)
check_results.append(not (is_aes128_enc and r'#EXT-X-BYTERANGE' in manifest))
check_results.append(not info_dict.get('is_live'))
return all(check_results)
@@ -100,6 +102,7 @@ class HlsFD(FragmentFD):
i = 0
media_sequence = 0
decrypt_info = {'METHOD': 'NONE'}
byte_range = {}
frags_filenames = []
for line in s.splitlines():
line = line.strip()
@@ -114,11 +117,14 @@ class HlsFD(FragmentFD):
if extra_query:
frag_url = update_url_query(frag_url, extra_query)
count = 0
headers = info_dict.get('http_headers', {})
if byte_range:
headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'])
while count <= fragment_retries:
try:
success = ctx['dl'].download(frag_filename, {
'url': frag_url,
'http_headers': info_dict.get('http_headers'),
'http_headers': headers,
})
if not success:
return False
@@ -167,6 +173,13 @@ class HlsFD(FragmentFD):
decrypt_info['KEY'] = self.ydl.urlopen(decrypt_info['URI']).read()
elif line.startswith('#EXT-X-MEDIA-SEQUENCE'):
media_sequence = int(line[22:])
elif line.startswith('#EXT-X-BYTERANGE'):
splitted_byte_range = line[17:].split('@')
sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range['end']
byte_range = {
'start': sub_range_start,
'end': sub_range_start + int(splitted_byte_range[0]),
}
self._finish_frag_download(ctx)
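
For context, EXT-X-BYTERANGE lines have the form #EXT-X-BYTERANGE:<length>[@<offset>]; when the offset is omitted, the sub-range starts where the previous fragment ended, and the downloader turns the result into an HTTP Range header. A minimal standalone sketch of the parsing added above (the sample playlist lines are made up):

    def parse_byte_range(line, prev_end=0):
        # strip the 17-character '#EXT-X-BYTERANGE:' prefix, then split length/offset
        length, _, offset = line[17:].partition('@')
        start = int(offset) if offset else prev_end
        return {'start': start, 'end': start + int(length)}

    byte_range = {}
    for line in ('#EXT-X-BYTERANGE:75232@0', '#EXT-X-BYTERANGE:82112'):
        byte_range = parse_byte_range(line, byte_range.get('end', 0))
        # the fragment request then carries: Range: bytes=<start>-<end>
        print('Range: bytes=%d-%d' % (byte_range['start'], byte_range['end']))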

View File

@@ -169,7 +169,7 @@ class RtmpFD(FileDownloader):
self.report_error('[rtmpdump] Could not connect to RTMP server.')
return False
while (retval == RD_INCOMPLETE or retval == RD_FAILED) and not test and not live:
while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live:
prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] %s bytes' % prevsize)
time.sleep(5.0) # This seems to be needed

View File

@@ -45,7 +45,7 @@ class ADNIE(InfoExtractor):
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
bytes_to_intlist(b'\xb5@\xcfq\xa3\x98"N\xe4\xf3\x12\x98}}\x16\xd8'),
bytes_to_intlist(b'\nd\xaf\xd2J\xd0\xfc\xe1\xfc\xdf\xb61\xe8\xe1\xf0\xcc'),
bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
))
subtitles_json = self._parse_json(

View File

@@ -41,6 +41,11 @@ MSO_INFO = {
'username_field': 'IDToken1',
'password_field': 'IDToken2',
},
'Verizon': {
'name': 'Verizon FiOS',
'username_field': 'IDToken1',
'password_field': 'IDToken2',
},
'thr030': {
'name': '3 Rivers Communications'
},
@@ -1384,40 +1389,72 @@ class AdobePassIE(InfoExtractor):
# Comcast page flow varies by video site and whether you
# are on Comcast's network.
provider_redirect_page, urlh = provider_redirect_page_res
# Check for Comcast auto login
if 'automatically signing you in' in provider_redirect_page:
oauth_redirect_url = self._html_search_regex(
r'window\.location\s*=\s*[\'"]([^\'"]+)',
provider_redirect_page, 'oauth redirect')
# Just need to process the request. No useful data comes back
self._download_webpage(
oauth_redirect_url, video_id, 'Confirming auto login')
else:
if '<form name="signin"' in provider_redirect_page:
# already have the form, just fill it
provider_login_page_res = provider_redirect_page_res
elif 'http-equiv="refresh"' in provider_redirect_page:
# redirects to the login page
oauth_redirect_url = self._html_search_regex(
r'content="0;\s*url=([^\'"]+)',
provider_redirect_page, 'meta refresh redirect')
provider_login_page_res = self._download_webpage_handle(
oauth_redirect_url,
video_id, 'Downloading Provider Login Page')
oauth_redirect_url, video_id,
'Downloading Provider Login Page')
else:
provider_login_page_res = post_form(
provider_redirect_page_res, 'Downloading Provider Login Page')
provider_redirect_page_res,
'Downloading Provider Login Page')
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
mso_info.get('username_field', 'username'): username,
mso_info.get('password_field', 'password'): password,
})
mvpd_confirm_page_res = post_form(
provider_login_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
mvpd_confirm_page, urlh = mvpd_confirm_page_res
if '<button class="submit" value="Resume">Resume</button>' in mvpd_confirm_page:
post_form(mvpd_confirm_page_res, 'Confirming Login')
elif mso_id == 'Verizon':
# In general, if you're connecting from a Verizon-assigned IP,
# you will not actually pass your credentials.
provider_redirect_page, urlh = provider_redirect_page_res
if 'Please wait ...' in provider_redirect_page:
saml_redirect_url = self._html_search_regex(
r'self\.parent\.location=(["\'])(?P<url>.+?)\1',
provider_redirect_page,
'SAML Redirect URL', group='url')
saml_login_page = self._download_webpage(
saml_redirect_url, video_id,
'Downloading SAML Login Page')
else:
saml_login_page_res = post_form(
provider_redirect_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
saml_login_page, urlh = saml_login_page_res
if 'Please try again.' in saml_login_page:
raise ExtractorError(
'We\'re sorry, but either the User ID or Password entered is not correct.')
saml_login_url = self._search_regex(
r'xmlHttp\.open\("POST"\s*,\s*(["\'])(?P<url>.+?)\1',
saml_login_page, 'SAML Login URL', group='url')
saml_response_json = self._download_json(
saml_login_url, video_id, 'Downloading SAML Response',
headers={'Content-Type': 'text/xml'})
self._download_webpage(
saml_response_json['targetValue'], video_id,
'Confirming Login', data=urlencode_postdata({
'SAMLResponse': saml_response_json['SAMLResponse'],
'RelayState': saml_response_json['RelayState']
}), headers={
'Content-Type': 'application/x-www-form-urlencoded'
})
else:
# Normal, non-Comcast flow
provider_login_page_res = post_form(
provider_redirect_page_res, 'Downloading Provider Login Page')
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {

View File

@@ -23,7 +23,19 @@ class AENetworksBaseIE(ThePlatformIE):
class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks'
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network'
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime|lifetimemovieclub)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?)'
_VALID_URL = r'''(?x)
https?://
(?:www\.)?
(?P<domain>
(?:history|aetv|mylifetime|lifetimemovieclub)\.com|
fyi\.tv
)/
(?:
shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|
movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?|
specials/(?P<special_display_id>[^/]+)/full-special
)
'''
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
@@ -65,6 +77,9 @@ class AENetworksIE(AENetworksBaseIE):
}, {
'url': 'https://www.lifetimemovieclub.com/movies/a-killer-among-us',
'only_matching': True
}, {
'url': 'http://www.history.com/specials/sniper-into-the-kill-zone/full-special',
'only_matching': True
}]
_DOMAIN_TO_REQUESTOR_ID = {
'history.com': 'HISTORY',
@@ -75,8 +90,8 @@ class AENetworksIE(AENetworksBaseIE):
}
def _real_extract(self, url):
domain, show_path, movie_display_id = re.match(self._VALID_URL, url).groups()
display_id = show_path or movie_display_id
domain, show_path, movie_display_id, special_display_id = re.match(self._VALID_URL, url).groups()
display_id = show_path or movie_display_id or special_display_id
webpage = self._download_webpage(url, display_id)
if show_path:
url_parts = show_path.split('/')

View File

@@ -6,6 +6,7 @@ import re
from .common import InfoExtractor
from ..compat import compat_xpath
from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
xpath_text,
@@ -72,13 +73,70 @@ class AfreecaTVIE(InfoExtractor):
'url': 'http://vod.afreecatv.com/PLAYER/STATION/18650793',
'info_dict': {
'id': '18650793',
'ext': 'flv',
'ext': 'mp4',
'title': '오늘은 다르다! 쏘님의 우월한 위아래~ 댄스리액션!',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '윈아디',
'uploader_id': 'badkids',
'title': '오늘은 다르다! 쏘님의 우월한 위아래~ 댄스리액션!',
'duration': 107,
},
'params': {
'skip_download': True, # requires rtmpdump
'skip_download': True,
},
}, {
'url': 'http://vod.afreecatv.com/PLAYER/STATION/10481652',
'info_dict': {
'id': '10481652',
'title': "BJ유트루와 함께하는 '팅커벨 메이크업!'",
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': 'dailyapril',
'uploader_id': 'dailyapril',
'duration': 6492,
},
'playlist_count': 2,
'playlist': [{
'md5': 'd8b7c174568da61d774ef0203159bf97',
'info_dict': {
'id': '20160502_c4c62b9d_174361386_1',
'ext': 'mp4',
'title': "BJ유트루와 함께하는 '팅커벨 메이크업!' (part 1)",
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': 'dailyapril',
'uploader_id': 'dailyapril',
'upload_date': '20160502',
'duration': 3601,
},
}, {
'md5': '58f2ce7f6044e34439ab2d50612ab02b',
'info_dict': {
'id': '20160502_39e739bb_174361386_2',
'ext': 'mp4',
'title': "BJ유트루와 함께하는 '팅커벨 메이크업!' (part 2)",
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': 'dailyapril',
'uploader_id': 'dailyapril',
'upload_date': '20160502',
'duration': 2891,
},
}],
'params': {
'skip_download': True,
},
}, {
# non standard key
'url': 'http://vod.afreecatv.com/PLAYER/STATION/20515605',
'info_dict': {
'id': '20170411_BE689A0E_190960999_1_2_h',
'ext': 'mp4',
'title': '혼자사는여자집',
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': '♥이슬이',
'uploader_id': 'dasl8121',
'upload_date': '20170411',
'duration': 213,
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
@@ -94,7 +152,7 @@ class AfreecaTVIE(InfoExtractor):
m = re.match(r'^(?P<upload_date>\d{8})_\w+_(?P<part>\d+)$', key)
if m:
video_key['upload_date'] = m.group('upload_date')
video_key['part'] = m.group('part')
video_key['part'] = int(m.group('part'))
return video_key
def _real_extract(self, url):
@@ -109,23 +167,65 @@ class AfreecaTVIE(InfoExtractor):
raise ExtractorError('Specified AfreecaTV video does not exist',
expected=True)
video_url_raw = video_element.text
app, playpath = video_url_raw.split('mp4:')
video_url = video_element.text.strip()
title = xpath_text(video_xml, './track/title', 'title', fatal=True)
uploader = xpath_text(video_xml, './track/nickname', 'uploader')
uploader_id = xpath_text(video_xml, './track/bj_id', 'uploader id')
duration = int_or_none(xpath_text(video_xml, './track/duration',
'duration'))
duration = int_or_none(xpath_text(
video_xml, './track/duration', 'duration'))
thumbnail = xpath_text(video_xml, './track/titleImage', 'thumbnail')
return {
common_entry = {
'uploader': uploader,
'uploader_id': uploader_id,
'thumbnail': thumbnail,
}
info = common_entry.copy()
info.update({
'id': video_id,
'title': title,
'duration': duration,
})
if not video_url:
entries = []
file_elements = video_element.findall(compat_xpath('./file'))
one = len(file_elements) == 1
for file_num, file_element in enumerate(file_elements, start=1):
file_url = file_element.text
if not file_url:
continue
key = file_element.get('key', '')
upload_date = self._search_regex(
r'^(\d{8})_', key, 'upload date', default=None)
file_duration = int_or_none(file_element.get('duration'))
format_id = key if key else '%s_%s' % (video_id, file_num)
formats = self._extract_m3u8_formats(
file_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls',
note='Downloading part %d m3u8 information' % file_num)
title = title if one else '%s (part %d)' % (title, file_num)
file_info = common_entry.copy()
file_info.update({
'id': format_id,
'title': title,
'upload_date': upload_date,
'duration': file_duration,
'formats': formats,
})
entries.append(file_info)
entries_info = info.copy()
entries_info.update({
'_type': 'multi_video',
'entries': entries,
})
return entries_info
info = {
'id': video_id,
'url': app,
'ext': 'flv',
'play_path': 'mp4:' + playpath,
'rtmp_live': True, # downloading won't end without this
'title': title,
'uploader': uploader,
'uploader_id': uploader_id,
@@ -133,6 +233,21 @@ class AfreecaTVIE(InfoExtractor):
'thumbnail': thumbnail,
}
if determine_ext(video_url) == 'm3u8':
info['formats'] = self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls')
else:
app, playpath = video_url.split('mp4:')
info.update({
'url': app,
'ext': 'flv',
'play_path': 'mp4:' + playpath,
'rtmp_live': True, # downloading won't end without this
})
return info
class AfreecaTVGlobalIE(AfreecaTVIE):
IE_NAME = 'afreecatv:global'
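
The fix for #12718 boils down to no longer assuming the full <date>_<hash>_<id>_<part> key layout: only the leading 8-digit date is extracted, falling back to None when even that is missing. A tiny illustration using the keys from the tests above:

    import re

    def upload_date_from_key(key):
        # both '20160502_c4c62b9d_174361386_1' and the non standard
        # '20170411_BE689A0E_190960999_1_2_h' start with an 8-digit date
        m = re.search(r'^(\d{8})_', key)
        return m.group(1) if m else None

    assert upload_date_from_key('20160502_c4c62b9d_174361386_1') == '20160502'
    assert upload_date_from_key('20170411_BE689A0E_190960999_1_2_h') == '20170411'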

View File

@@ -361,7 +361,7 @@ class BBCCoUkIE(InfoExtractor):
fmt.update({
'width': width,
'height': height,
'vbr': bitrate,
'tbr': bitrate,
'vcodec': encoding,
})
else:
@@ -370,7 +370,7 @@ class BBCCoUkIE(InfoExtractor):
'acodec': encoding,
'vcodec': 'none',
})
if protocol == 'http':
if protocol in ('http', 'https'):
# Direct link
fmt.update({
'url': href,
@@ -389,6 +389,8 @@ class BBCCoUkIE(InfoExtractor):
'rtmp_live': False,
'ext': 'flv',
})
else:
continue
formats.append(fmt)
elif kind == 'captions':
subtitles = self.extract_subtitles(media, programme_id)
@@ -407,7 +409,7 @@ class BBCCoUkIE(InfoExtractor):
description = smp_config['summary']
for item in smp_config['items']:
kind = item['kind']
if kind != 'programme' and kind != 'radioProgramme':
if kind not in ('programme', 'radioProgramme'):
continue
programme_id = item.get('vpid')
duration = int_or_none(item.get('duration'))
@@ -448,7 +450,7 @@ class BBCCoUkIE(InfoExtractor):
for item in self._extract_items(playlist):
kind = item.get('kind')
if kind != 'programme' and kind != 'radioProgramme':
if kind not in ('programme', 'radioProgramme'):
continue
title = playlist.find('./{%s}title' % self._EMP_PLAYLIST_NS).text
description_el = playlist.find('./{%s}summary' % self._EMP_PLAYLIST_NS)

View File

@@ -45,6 +45,9 @@ class CanalplusIE(InfoExtractor):
'itele': 'itele',
}
# Only works for direct mp4 URLs
_GEO_COUNTRIES = ['FR']
_TESTS = [{
'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1192814',
'info_dict': {
@@ -56,6 +59,7 @@ class CanalplusIE(InfoExtractor):
'upload_date': '20160702',
},
}, {
# geo restricted, bypassed
'url': 'http://www.piwiplus.fr/videos-piwi/pid1405-le-labyrinthe-boing-super-ranger.html?vid=1108190',
'info_dict': {
'id': '1108190',
@@ -65,19 +69,20 @@ class CanalplusIE(InfoExtractor):
'description': 'md5:4cea7a37153be42c1ba2c1d3064376ff',
'upload_date': '20140724',
},
'skip': 'Only works from France',
'expected_warnings': ['HTTP Error 403: Forbidden'],
}, {
'url': 'http://www.c8.fr/c8-divertissement/ms-touche-pas-a-mon-poste/pid6318-videos-integrales.html',
'md5': '4b47b12b4ee43002626b97fad8fb1de5',
# geo restricted, bypassed
'url': 'http://www.c8.fr/c8-divertissement/ms-touche-pas-a-mon-poste/pid6318-videos-integrales.html?vid=1443684',
'md5': 'bb6f9f343296ab7ebd88c97b660ecf8d',
'info_dict': {
'id': '1420213',
'id': '1443684',
'display_id': 'pid6318-videos-integrales',
'ext': 'mp4',
'title': 'TPMP ! Même le matin - Les 35H de Baba - 14/10/2016',
'description': 'md5:f96736c1b0ffaa96fd5b9e60ad871799',
'upload_date': '20161014',
'title': 'Guess my iep ! - TPMP - 07/04/2017',
'description': 'md5:6f005933f6e06760a9236d9b3b5f17fa',
'upload_date': '20170407',
},
'skip': 'Only works from France',
'expected_warnings': ['HTTP Error 403: Forbidden'],
}, {
'url': 'http://www.itele.fr/chroniques/invite-michael-darmon/rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
'info_dict': {
@@ -134,15 +139,15 @@ class CanalplusIE(InfoExtractor):
preference = qualities(['MOBILE', 'BAS_DEBIT', 'HAUT_DEBIT', 'HD'])
fmt_url = next(iter(media.get('VIDEOS')))
if '/geo' in fmt_url.lower():
response = self._request_webpage(
HEADRequest(fmt_url), video_id,
'Checking if the video is georestricted')
if '/blocage' in response.geturl():
raise ExtractorError(
'The video is not available in your country',
expected=True)
# _, fmt_url = next(iter(media['VIDEOS'].items()))
# if '/geo' in fmt_url.lower():
# response = self._request_webpage(
# HEADRequest(fmt_url), video_id,
# 'Checking if the video is georestricted')
# if '/blocage' in response.geturl():
# raise ExtractorError(
# 'The video is not available in your country',
# expected=True)
formats = []
for format_id, format_url in media['VIDEOS'].items():

View File

@@ -33,10 +33,17 @@ class ChaturbateIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
m3u8_formats = [(m.group('id').lower(), m.group('url')) for m in re.finditer(
r'hlsSource(?P<id>.+?)\s*=\s*(?P<q>["\'])(?P<url>http.+?)(?P=q)', webpage)]
m3u8_urls = []
if not m3u8_formats:
for m in re.finditer(
r'(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage):
m3u8_fast_url, m3u8_no_fast_url = m.group('url'), m.group(
'url').replace('_fast', '')
for m3u8_url in (m3u8_fast_url, m3u8_no_fast_url):
if m3u8_url not in m3u8_urls:
m3u8_urls.append(m3u8_url)
if not m3u8_urls:
error = self._search_regex(
[r'<span[^>]+class=(["\'])desc_span\1[^>]*>(?P<error>[^<]+)</span>',
r'<div[^>]+id=(["\'])defchat\1[^>]*>\s*<p><strong>(?P<error>[^<]+)<'],
@@ -50,7 +57,8 @@ class ChaturbateIE(InfoExtractor):
raise ExtractorError('Unable to find stream URL')
formats = []
for m3u8_id, m3u8_url in m3u8_formats:
for m3u8_url in m3u8_urls:
m3u8_id = 'fast' if '_fast' in m3u8_url else 'slow'
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, ext='mp4',
# ffmpeg skips segments for fast m3u8
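
The fallback search above keeps both the '_fast' playlist and its plain counterpart, deduplicating them and labelling the resulting formats accordingly. A small illustration of that pairing (the URL is invented):

    urls_found = ['https://edge.example.com/live-hls/amlst:room_fast.m3u8']
    m3u8_urls = []
    for url in urls_found:
        for candidate in (url, url.replace('_fast', '')):
            if candidate not in m3u8_urls:
                m3u8_urls.append(candidate)
    for m3u8_url in m3u8_urls:
        # 'fast' and 'slow' variants of the same stream
        m3u8_id = 'fast' if '_fast' in m3u8_url else 'slow'
        print(m3u8_id, m3u8_url)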

View File

@@ -1779,7 +1779,7 @@ class InfoExtractor(object):
if content_type == 'text':
# TODO implement WebVTT downloading
pass
elif content_type == 'video' or content_type == 'audio':
elif content_type in ('video', 'audio'):
base_url = ''
for element in (representation, adaptation_set, period, mpd_doc):
base_url_e = element.find(_add_ns('BaseURL'))

View File

@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
@@ -49,6 +51,48 @@ class CuriosityStreamBaseIE(InfoExtractor):
limelight_media_id = media['limelight_media_id']
title = media['title']
formats = []
for encoding in media.get('encodings', []):
m3u8_url = encoding.get('master_playlist_url')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
encoding_url = encoding.get('url')
file_url = encoding.get('file_url')
if not encoding_url and not file_url:
continue
f = {
'width': int_or_none(encoding.get('width')),
'height': int_or_none(encoding.get('height')),
'vbr': int_or_none(encoding.get('video_bitrate')),
'abr': int_or_none(encoding.get('audio_bitrate')),
'filesize': int_or_none(encoding.get('size_in_bytes')),
'vcodec': encoding.get('video_codec'),
'acodec': encoding.get('audio_codec'),
'container': encoding.get('container_type'),
}
for f_url in (encoding_url, file_url):
if not f_url:
continue
fmt = f.copy()
rtmp = re.search(r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+))/(?P<playpath>mp[34]:.+)$', f_url)
if rtmp:
fmt.update({
'url': rtmp.group('url'),
'play_path': rtmp.group('playpath'),
'app': rtmp.group('app'),
'ext': 'flv',
'format_id': 'rtmp',
})
else:
fmt.update({
'url': f_url,
'format_id': 'http',
})
formats.append(fmt)
self._sort_formats(formats)
subtitles = {}
for closed_caption in media.get('closed_captions', []):
sub_url = closed_caption.get('file')
@@ -60,16 +104,14 @@ class CuriosityStreamBaseIE(InfoExtractor):
})
return {
'_type': 'url_transparent',
'id': video_id,
'url': 'limelight:media:' + limelight_media_id,
'formats': formats,
'title': title,
'description': media.get('description'),
'thumbnail': media.get('image_large') or media.get('image_medium') or media.get('image_small'),
'duration': int_or_none(media.get('duration')),
'tags': media.get('tags'),
'subtitles': subtitles,
'ie_key': 'LimelightMedia',
}
@@ -78,14 +120,12 @@ class CuriosityStreamIE(CuriosityStreamBaseIE):
_VALID_URL = r'https?://app\.curiositystream\.com/video/(?P<id>\d+)'
_TEST = {
'url': 'https://app.curiositystream.com/video/2',
'md5': 'a0074c190e6cddaf86900b28d3e9ee7a',
'md5': '262bb2f257ff301115f1973540de8983',
'info_dict': {
'id': '2',
'ext': 'mp4',
'title': 'How Did You Develop The Internet?',
'description': 'Vint Cerf, Google\'s Chief Internet Evangelist, describes how he and Bob Kahn created the internet.',
'timestamp': 1448388615,
'upload_date': '20151124',
}
}
@@ -105,7 +145,7 @@ class CuriosityStreamCollectionIE(CuriosityStreamBaseIE):
'title': 'Curious Minds: The Internet',
'description': 'How is the internet shaping our lives in the 21st Century?',
},
'playlist_mincount': 17,
'playlist_mincount': 12,
}
def _real_extract(self, url):
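
Each encoding URL is checked against an RTMP pattern and, when it matches, split into the connection URL, app and play path expected by rtmpdump. A self-contained illustration of that split (the sample URL is invented):

    import re

    RTMP_RE = r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+))/(?P<playpath>mp[34]:.+)$'
    m = re.search(RTMP_RE, 'rtmpe://cdn.example.com/curiosity/vod/mp4:videos/2/720.mp4')
    print(m.group('url'))       # rtmpe://cdn.example.com/curiosity/vod
    print(m.group('app'))       # curiosity/vod
    print(m.group('playpath'))  # mp4:videos/2/720.mp4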

View File

@@ -384,6 +384,7 @@ from .globo import (
GloboArticleIE,
)
from .go import GoIE
from .go90 import Go90IE
from .godtube import GodTubeIE
from .godtv import GodTVIE
from .golem import GolemIE
@@ -1031,6 +1032,7 @@ from .tv2 import (
TV2IE,
TV2ArticleIE,
)
from .tv2hu import TV2HuIE
from .tv3 import TV3IE
from .tv4 import TV4IE
from .tv5mondeplus import TV5MondePlusIE

View File

@@ -730,6 +730,21 @@ class GenericIE(InfoExtractor):
'skip_download': True,
}
},
# YouTube <object> embed
{
'url': 'http://www.improbable.com/2017/04/03/untrained-modern-youths-and-ancient-masters-in-selfie-portraits/',
'md5': '516718101ec834f74318df76259fb3cc',
'info_dict': {
'id': 'msN87y-iEx0',
'ext': 'webm',
'title': 'Feynman: Mirrors FUN TO IMAGINE 6',
'upload_date': '20080526',
'description': 'md5:0ffc78ea3f01b2e2c247d5f8d1d3c18d',
'uploader': 'Christopher Sykes',
'uploader_id': 'ChristopherJSykes',
},
'add_ie': ['Youtube'],
},
# Camtasia studio
{
'url': 'http://www.ll.mit.edu/workshops/education/videocourses/antennas/lecture1/video/',
@@ -1938,6 +1953,7 @@ class GenericIE(InfoExtractor):
data-video-url=|
<embed[^>]+?src=|
embedSWF\(?:\s*|
<object[^>]+data=|
new\s+SWFObject\(
)
(["\'])
@@ -2595,7 +2611,7 @@ class GenericIE(InfoExtractor):
return True
vpath = compat_urlparse.urlparse(vurl).path
vext = determine_ext(vpath)
return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml', 'js')
return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml', 'js', 'xml')
def filter_video(urls):
return list(filter(check_video, urls))

View File

@@ -0,0 +1,92 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
)
class Go90IE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?go90\.com/videos/(?P<id>[0-9a-zA-Z]+)'
_TEST = {
'url': 'https://www.go90.com/videos/84BUqjLpf9D',
'md5': 'efa7670dbbbf21a7b07b360652b24a32',
'info_dict': {
'id': '84BUqjLpf9D',
'ext': 'mp4',
'title': 'Inside The Utah Coalition Against Pornography Convention',
'description': 'VICE\'s Karley Sciortino meets with activists who discuss the state\'s strong anti-porn stance. Then, VICE Sports explains NFL contracts.',
'timestamp': 1491868800,
'upload_date': '20170411',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(
'https://www.go90.com/api/view/items/' + video_id,
video_id, headers={
'Content-Type': 'application/json; charset=utf-8',
}, data=b'{"client":"web","device_type":"pc"}')
title = video_data['title']
main_video_asset = video_data['main_video_asset']
thumbnails = []
formats = []
for asset in video_data.get('assets'):
if asset.get('id') == main_video_asset:
for source in asset.get('sources', []):
source_location = source.get('location')
if not source_location:
continue
source_type = source.get('type')
if source_type == 'hls':
m3u8_formats = self._extract_m3u8_formats(
source_location, video_id, 'mp4',
'm3u8_native', m3u8_id='hls', fatal=False)
for f in m3u8_formats:
mobj = re.search(r'/hls-(\d+)-(\d+)K', f['url'])
if mobj:
height, tbr = mobj.groups()
height = int_or_none(height)
f.update({
'height': f.get('height') or height,
'width': f.get('width') or int_or_none(height / 9.0 * 16.0 if height else None),
'tbr': f.get('tbr') or int_or_none(tbr),
})
formats.extend(m3u8_formats)
elif source_type == 'dash':
formats.extend(self._extract_mpd_formats(
source_location, video_id, mpd_id='dash', fatal=False))
else:
formats.append({
'format_id': source.get('name'),
'url': source_location,
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('bitrate')),
})
elif asset.get('type') == 'image':
asset_location = asset.get('location')
if not asset_location:
continue
thumbnails.append({
'url': asset_location,
'width': int_or_none(asset.get('width')),
'height': int_or_none(asset.get('height')),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'formats': formats,
'thumbnails': thumbnails,
'description': video_data.get('short_description'),
'like_count': int_or_none(video_data.get('like_count')),
'timestamp': parse_iso8601(video_data.get('released_at')),
}

View File

@@ -92,12 +92,14 @@ class HBOBaseIE(InfoExtractor):
video_url.replace('.tar', '/base_index_w8.m3u8'),
video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
elif source.tag == 'hls':
# #EXT-X-BYTERANGE is not supported by native hls downloader
# and ffmpeg (#10955)
# formats.extend(self._extract_m3u8_formats(
# video_url.replace('.tar', '/base_index.m3u8'),
# video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
continue
m3u8_formats = self._extract_m3u8_formats(
video_url.replace('.tar', '/base_index.m3u8'),
video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
for f in m3u8_formats:
if f.get('vcodec') == 'none' and not f.get('tbr'):
f['tbr'] = int_or_none(self._search_regex(
r'-(\d+)k/', f['url'], 'tbr', default=None))
formats.extend(m3u8_formats)
elif source.tag == 'dash':
formats.extend(self._extract_mpd_formats(
video_url.replace('.tar', '/manifest.mpd'),
@@ -110,7 +112,7 @@ class HBOBaseIE(InfoExtractor):
'width': format_info.get('width'),
'height': format_info.get('height'),
})
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
self._sort_formats(formats)
thumbnails = []
card_sizes = xpath_element(video_data, 'titleCardSizes')

View File

@@ -26,7 +26,7 @@ class RudoIE(InfoExtractor):
}
@classmethod
def _extract_url(self, webpage):
def _extract_url(cls, webpage):
mobj = re.search(
r'<iframe[^>]+src=(?P<q1>[\'"])(?P<url>(?:https?:)?//rudo\.video/vod/[0-9a-zA-Z]+)(?P=q1)',
webpage)

View File

@@ -0,0 +1,62 @@
# encoding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
class TV2HuIE(InfoExtractor):
IE_NAME = 'tv2.hu'
_VALID_URL = r'https?://(?:www\.)?tv2\.hu/(?:[^/]+/)+(?P<id>\d+)_[^/?#]+?\.html'
_TESTS = [{
'url': 'http://tv2.hu/ezek_megorultek/217679_ezek-megorultek---1.-adas-1.-resz.html',
'md5': '585e58e2e090f34603804bb2c48e98d8',
'info_dict': {
'id': '217679',
'ext': 'mp4',
'title': 'Ezek megőrültek! - 1. adás 1. rész',
'upload_date': '20160826',
'thumbnail': r're:^https?://.*\.jpg$'
}
}, {
'url': 'http://tv2.hu/ezek_megorultek/teljes_adasok/217677_ezek-megorultek---1.-adas-2.-resz.html',
'only_matching': True
}, {
'url': 'http://tv2.hu/musoraink/aktiv/aktiv_teljes_adas/217963_aktiv-teljes-adas---2016.08.30..html',
'only_matching': True
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
json_url = self._search_regex(
r'jsonUrl\s*=\s*"([^"]+)"', webpage, 'json url')
json_data = self._download_json(json_url, video_id)
formats = []
for b in ('bitrates', 'backupBitrates'):
bitrates = json_data.get(b, {})
m3u8_url = bitrates.get('hls')
if m3u8_url:
formats.extend(self._extract_wowza_formats(
m3u8_url, video_id, skip_protocols=['rtmp', 'rtsp']))
for mp4_url in bitrates.get('mp4', []):
height = int_or_none(self._search_regex(
r'\.(\d+)p\.mp4', mp4_url, 'height', default=None))
formats.append({
'format_id': 'http' + ('-%d' % height if height else ''),
'url': mp4_url,
'height': height,
'width': int_or_none(height / 9.0 * 16.0 if height else None),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': self._og_search_title(webpage).strip(),
'thumbnail': self._og_search_thumbnail(webpage),
'upload_date': self._search_regex(
r'/vod/(\d{8})/', json_url, 'upload_date', default=None),
'formats': formats,
}

View File

@@ -68,7 +68,7 @@ class ViewLiftEmbedIE(ViewLiftBaseIE):
type_ = source.get('type')
ext = determine_ext(file_)
format_id = source.get('label') or ext
if all(v == 'm3u8' or v == 'hls' for v in (type_, ext)):
if all(v in ('m3u8', 'hls') for v in (type_, ext)):
formats.extend(self._extract_m3u8_formats(
file_, video_id, 'mp4', m3u8_id='hls'))
else:

View File

@@ -70,9 +70,9 @@ class VLiveIE(InfoExtractor):
status, long_video_id, key = params[2], params[5], params[6]
status = remove_start(status, 'PRODUCT_')
if status == 'LIVE_ON_AIR' or status == 'BIG_EVENT_ON_AIR':
if status in ('LIVE_ON_AIR', 'BIG_EVENT_ON_AIR'):
return self._live(video_id, webpage)
elif status == 'VOD_ON_AIR' or status == 'BIG_EVENT_INTRO':
elif status in ('VOD_ON_AIR', 'BIG_EVENT_INTRO'):
if long_video_id and key:
return self._replay(video_id, webpage, long_video_id, key)
else:

View File

@@ -317,60 +317,60 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'151': {'ext': 'mp4', 'height': 72, 'format_note': 'HLS', 'acodec': 'aac', 'abr': 24, 'vcodec': 'h264', 'preference': -10},
# DASH mp4 video
'133': {'ext': 'mp4', 'height': 240, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'134': {'ext': 'mp4', 'height': 360, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'135': {'ext': 'mp4', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'136': {'ext': 'mp4', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'137': {'ext': 'mp4', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'138': {'ext': 'mp4', 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40}, # Height can vary (https://github.com/rg3/youtube-dl/issues/4559)
'160': {'ext': 'mp4', 'height': 144, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'212': {'ext': 'mp4', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'264': {'ext': 'mp4', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'298': {'ext': 'mp4', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'h264', 'fps': 60, 'preference': -40},
'299': {'ext': 'mp4', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'h264', 'fps': 60, 'preference': -40},
'266': {'ext': 'mp4', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'133': {'ext': 'mp4', 'height': 240, 'format_note': 'DASH video', 'vcodec': 'h264'},
'134': {'ext': 'mp4', 'height': 360, 'format_note': 'DASH video', 'vcodec': 'h264'},
'135': {'ext': 'mp4', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'h264'},
'136': {'ext': 'mp4', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'h264'},
'137': {'ext': 'mp4', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'h264'},
'138': {'ext': 'mp4', 'format_note': 'DASH video', 'vcodec': 'h264'}, # Height can vary (https://github.com/rg3/youtube-dl/issues/4559)
'160': {'ext': 'mp4', 'height': 144, 'format_note': 'DASH video', 'vcodec': 'h264'},
'212': {'ext': 'mp4', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'h264'},
'264': {'ext': 'mp4', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'h264'},
'298': {'ext': 'mp4', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'h264', 'fps': 60},
'299': {'ext': 'mp4', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'h264', 'fps': 60},
'266': {'ext': 'mp4', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'h264'},
# Dash mp4 audio
'139': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 48, 'preference': -50, 'container': 'm4a_dash'},
'140': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 128, 'preference': -50, 'container': 'm4a_dash'},
'141': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 256, 'preference': -50, 'container': 'm4a_dash'},
'256': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'preference': -50, 'container': 'm4a_dash'},
'258': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'preference': -50, 'container': 'm4a_dash'},
'325': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'dtse', 'preference': -50, 'container': 'm4a_dash'},
'328': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'ec-3', 'preference': -50, 'container': 'm4a_dash'},
'139': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 48, 'container': 'm4a_dash'},
'140': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 128, 'container': 'm4a_dash'},
'141': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 256, 'container': 'm4a_dash'},
'256': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'container': 'm4a_dash'},
'258': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'container': 'm4a_dash'},
'325': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'dtse', 'container': 'm4a_dash'},
'328': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'ec-3', 'container': 'm4a_dash'},
# Dash webm
'167': {'ext': 'webm', 'height': 360, 'width': 640, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'168': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'169': {'ext': 'webm', 'height': 720, 'width': 1280, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'170': {'ext': 'webm', 'height': 1080, 'width': 1920, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'218': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'219': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'278': {'ext': 'webm', 'height': 144, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp9', 'preference': -40},
'242': {'ext': 'webm', 'height': 240, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'243': {'ext': 'webm', 'height': 360, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'244': {'ext': 'webm', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'245': {'ext': 'webm', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'246': {'ext': 'webm', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'247': {'ext': 'webm', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'248': {'ext': 'webm', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'271': {'ext': 'webm', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'167': {'ext': 'webm', 'height': 360, 'width': 640, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8'},
'168': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8'},
'169': {'ext': 'webm', 'height': 720, 'width': 1280, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8'},
'170': {'ext': 'webm', 'height': 1080, 'width': 1920, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8'},
'218': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8'},
'219': {'ext': 'webm', 'height': 480, 'width': 854, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8'},
'278': {'ext': 'webm', 'height': 144, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp9'},
'242': {'ext': 'webm', 'height': 240, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'243': {'ext': 'webm', 'height': 360, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'244': {'ext': 'webm', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'245': {'ext': 'webm', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'246': {'ext': 'webm', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'247': {'ext': 'webm', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'248': {'ext': 'webm', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'271': {'ext': 'webm', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'vp9'},
# itag 272 videos are either 3840x2160 (e.g. RtoitU2A-3E) or 7680x4320 (sLprVF6d7Ug)
'272': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'302': {'ext': 'webm', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60, 'preference': -40},
'303': {'ext': 'webm', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60, 'preference': -40},
'308': {'ext': 'webm', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60, 'preference': -40},
'313': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'vp9', 'preference': -40},
'315': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60, 'preference': -40},
'272': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'302': {'ext': 'webm', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60},
'303': {'ext': 'webm', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60},
'308': {'ext': 'webm', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60},
'313': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'vp9'},
'315': {'ext': 'webm', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'vp9', 'fps': 60},
# Dash webm audio
'171': {'ext': 'webm', 'acodec': 'vorbis', 'format_note': 'DASH audio', 'abr': 128, 'preference': -50},
'172': {'ext': 'webm', 'acodec': 'vorbis', 'format_note': 'DASH audio', 'abr': 256, 'preference': -50},
'171': {'ext': 'webm', 'acodec': 'vorbis', 'format_note': 'DASH audio', 'abr': 128},
'172': {'ext': 'webm', 'acodec': 'vorbis', 'format_note': 'DASH audio', 'abr': 256},
# Dash webm audio with opus inside
'249': {'ext': 'webm', 'format_note': 'DASH audio', 'acodec': 'opus', 'abr': 50, 'preference': -50},
'250': {'ext': 'webm', 'format_note': 'DASH audio', 'acodec': 'opus', 'abr': 70, 'preference': -50},
'251': {'ext': 'webm', 'format_note': 'DASH audio', 'acodec': 'opus', 'abr': 160, 'preference': -50},
'249': {'ext': 'webm', 'format_note': 'DASH audio', 'acodec': 'opus', 'abr': 50},
'250': {'ext': 'webm', 'format_note': 'DASH audio', 'acodec': 'opus', 'abr': 70},
'251': {'ext': 'webm', 'format_note': 'DASH audio', 'acodec': 'opus', 'abr': 160},
# RTMP (unnamed)
'_rtmp': {'protocol': 'rtmp'},
@@ -1253,21 +1253,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
video_id = mobj.group(2)
return video_id
def _extract_from_m3u8(self, manifest_url, video_id):
url_map = {}
def _get_urls(_manifest):
lines = _manifest.split('\n')
urls = filter(lambda l: l and not l.startswith('#'),
lines)
return urls
manifest = self._download_webpage(manifest_url, video_id, 'Downloading formats manifest')
formats_urls = _get_urls(manifest)
for format_url in formats_urls:
itag = self._search_regex(r'itag/(\d+?)/', format_url, 'itag')
url_map[itag] = format_url
return url_map
def _extract_annotations(self, video_id):
url = 'https://www.youtube.com/annotations_invideo?features=1&legacy=1&video_id=%s' % video_id
return self._download_webpage(url, video_id, note='Searching for annotations.', errnote='Unable to download video annotations.')
@@ -1573,19 +1558,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if self._downloader.params.get('writeannotations', False):
video_annotations = self._extract_annotations(video_id)
def _map_to_format_list(urlmap):
formats = []
for itag, video_real_url in urlmap.items():
dct = {
'format_id': itag,
'url': video_real_url,
'player_url': player_url,
}
if itag in self._formats:
dct.update(self._formats[itag])
formats.append(dct)
return formats
if 'conn' in video_info and video_info['conn'][0].startswith('rtmp'):
self.report_rtmp_download()
formats = [{
@@ -1718,11 +1690,22 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
formats.append(dct)
elif video_info.get('hlsvp'):
manifest_url = video_info['hlsvp'][0]
url_map = self._extract_from_m3u8(manifest_url, video_id)
formats = _map_to_format_list(url_map)
# Accept-Encoding header causes failures in live streams on Youtube and Youtube Gaming
for a_format in formats:
formats = []
m3u8_formats = self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', fatal=False)
for a_format in m3u8_formats:
itag = self._search_regex(
r'/itag/(\d+)/', a_format['url'], 'itag', default=None)
if itag:
a_format['format_id'] = itag
if itag in self._formats:
dct = self._formats[itag].copy()
dct.update(a_format)
a_format = dct
a_format['player_url'] = player_url
# Accept-Encoding header causes failures in live streams on Youtube and Youtube Gaming
a_format.setdefault('http_headers', {})['Youtubedl-no-compression'] = 'True'
formats.append(a_format)
else:
unavailable_message = self._html_search_regex(
r'(?s)<h1[^>]+id="unavailable-message"[^>]*>(.+?)</h1>',
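
The rewritten HLS branch drops the old _extract_from_m3u8/_map_to_format_list pair: _extract_m3u8_formats parses the manifest, and the itag pulled from each format URL is used to merge in the static metadata from the formats table. An illustrative sketch of that merge (the table entry and URL are invented; the real metadata lives in the _formats dict above):

    import re

    KNOWN_FORMATS = {'96': {'ext': 'mp4', 'height': 1080, 'format_note': 'HLS'}}  # sample entry

    def enrich_hls_format(a_format, known=KNOWN_FORMATS, player_url=None):
        itag = None
        m = re.search(r'/itag/(\d+)/', a_format['url'])
        if m:
            itag = m.group(1)
            a_format['format_id'] = itag
        if itag in known:
            dct = known[itag].copy()
            dct.update(a_format)  # values parsed from the manifest win over the table
            a_format = dct
        a_format['player_url'] = player_url
        # compressed responses break live streams on Youtube and Youtube Gaming
        a_format.setdefault('http_headers', {})['Youtubedl-no-compression'] = 'True'
        return a_format

    print(enrich_hls_format({'url': 'https://example.com/hls/itag/96/index.m3u8', 'tbr': 4500}))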

View File

@@ -552,7 +552,7 @@ class FFmpegSubtitlesConvertorPP(FFmpegPostProcessor):
sub_filenames.append(old_file)
new_file = subtitles_filename(filename, lang, new_ext)
if ext == 'dfxp' or ext == 'ttml' or ext == 'tt':
if ext in ('dfxp', 'ttml', 'tt'):
self._downloader.report_warning(
'You have requested to convert dfxp (TTML) subtitles into another format, '
'which results in style information loss')

View File

@@ -26,7 +26,7 @@ class MetadataFromTitlePP(PostProcessor):
regex += r'(?P<' + match.group(1) + '>.+)'
lastpos = match.end()
if lastpos < len(fmt):
regex += re.escape(fmt[lastpos:len(fmt)])
regex += re.escape(fmt[lastpos:])
return regex
def run(self, info):

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2017.04.09'
__version__ = '2017.04.14'