Compare commits

534 Commits

Author SHA1 Message Date
Sergey M․
16393d6535 release 2017.08.13 2017-08-13 08:58:30 +07:00
Sergey M․
4f049e4aa8 [ChangeLog] Actualize 2017-08-13 08:00:15 +07:00
Sergey M․
475bcb225f [pornhub:playlistbase] Skip videos from drop-down menu for all playlists (closes #12819, closes #13902) 2017-08-13 07:53:02 +07:00
Sergey M․
b3c6515365 [fourtube] Add support for other sites (closes #6022, closes #7859, closes #13901) 2017-08-13 07:23:29 +07:00
Sergey M․
eb02940cc7 [generic] Add test for #13895 2017-08-13 01:11:27 +07:00
Sergey M․
4ef9152428 [limelight] Improve embeds detection (closes #13895) 2017-08-13 00:58:39 +07:00
Sergey M․
0c43a481b9 [reddit] Add extractors (closes #13847) 2017-08-12 23:43:51 +07:00
Sergey M․
868f79db41 [extractor/common] Fix _media_formats 2017-08-12 19:24:26 +07:00
Sergey M․
70851a95c3 [aparat] Extract all formats (closes #13887) 2017-08-12 17:18:23 +07:00
Sergey M․
e74e3b63e3 [YoutubeDL] Make sure format id is not empty 2017-08-12 17:14:11 +07:00
Sergey M․
ac8491fcca [extractor/common] Make _family_friendly_search optional 2017-08-12 17:11:35 +07:00
Sergey M․
82889d4ae5 [extractor/common] Respect source's type attribute for HTML5 media (closes #13892) 2017-08-12 16:48:11 +07:00
Sergey M․
92a5c41532 [mixcloud] Fix play info decryption (closes #13885) 2017-08-12 16:30:50 +07:00
Sergey M․
1663bd6e1c [generic] Replace vzaar embed test 2017-08-11 22:02:00 +07:00
tetra-eder
41918eaa5c [generic] Add support for vzaar embeds 2017-08-11 22:00:39 +07:00
Sergey M․
6ed99754bb release 2017.08.09 2017-08-09 23:52:22 +07:00
Sergey M․
0e7dfa7d16 [ChangeLog] Actualize 2017-08-09 23:49:53 +07:00
Sergey M․
baba5f4d1d [xxxymovies] Fix title extraction (closes #13868) 2017-08-09 23:46:49 +07:00
Sergey M․
dee04d24a4 [nick] Add support for nick.com.pl (closes #13860) 2017-08-09 23:12:02 +07:00
Sergey M․
5b3ddadcc3 [mixcloud] Fix play info decryption (closes #13867) 2017-08-09 22:55:13 +07:00
Sergey M․
5b232f46dc [utils] Skip missing params in cli_bool_option (closes #13865) 2017-08-09 22:28:19 +07:00
Alex Seiler
4bf22f7a10 [20min] Fix embeds extraction 2017-08-08 05:41:38 +07:00
Sergey M․
15d1e8a23d [dplayit] Fix extraction (closes #13851) 2017-08-07 22:43:42 +07:00
Yen Chi Hsuan
ee6a611665 [niconico] Support videos with multiple formats (closes #13522) 2017-08-07 00:19:46 +08:00
Yen Chi Hsuan
463e7216c8 [niconico] Support HTML5-only videos (closes #13806) 2017-08-06 23:07:28 +08:00
Sergey M․
903a183b6a release 2017.08.06 2017-08-06 09:05:36 +07:00
Sergey M․
92740e4241 [ChangeLog] Actualize 2017-08-06 09:02:14 +07:00
Sergey M․
fac188c695 [pluralsight] Fix format selection 2017-08-06 08:44:28 +07:00
Sergey M․
16afce174e [mpora] Remove extractor (closes #13826) 2017-08-06 08:18:16 +07:00
Sergey M․
e2b4808fd8 [voot] Improve extraction (#10255, closes #11814) 2017-08-06 08:05:29 +07:00
Ashutosh Chaudhary
daaaf5f594 [voot] Add extractor 2017-08-06 08:05:24 +07:00
Sergey M․
f172c86dcd [vlive:channel] Limit number of videos per page to 100 (closes #13830) 2017-08-05 21:17:55 +07:00
Sergey M․
1d5472290f [podomatic] Extend _VALID_URL (closes #13827) 2017-08-05 08:28:12 +07:00
Sergey M․
c983cc3b71 [cinchcast] Extend _VALID_URL 2017-08-05 08:17:01 +07:00
Sergey M․
1141e9104b Use relative paths for DASH fragments (closes #12990)
10x reduced JSON size
refs #13810
2017-08-05 07:40:29 +07:00
Sergey M․
8519b88f67 [yandexdisk] Relax _VALID_URL (closes #13824) 2017-08-05 00:59:07 +07:00
Sergey M․
bbbe1cebfc [mlb] Update test (closes #13777) 2017-08-05 00:09:36 +07:00
Sergey M․
f31fd0693b [vidme] Extract DASH and HLS formats 2017-08-05 00:00:21 +07:00
Sergey M․
799802f368 [teamfour] Remove extractor (closes #13782)
Now covered with generic extractor
2017-08-04 23:54:28 +07:00
Sergey M․
b3b5870cba [pornhd] Fix extraction (closes #13783) 2017-08-04 23:51:03 +07:00
Sergey M․
57a38a38c3 [udemy] Fix subtitles extraction (closes #13812) 2017-08-04 23:45:13 +07:00
Matt Crupi
11a6793f80 [mlb] Extend _VALID_URL (closes #13740) 2017-08-04 22:46:54 +07:00
Justin Quan
1f03fef994 [README.md] Improve grammar 2017-08-04 22:43:44 +07:00
Sergey M․
183062a4ab [pbs] Add support for new URL schema (closes #13801) 2017-08-03 23:19:59 +07:00
Sergey M․
8cda78ef72 [test_YoutubeDL] Add a test for #10083 2017-08-02 23:12:34 +07:00
Sergey M․
9118c9f18a [nrktv] Update API host (closes #13796) 2017-08-01 05:21:00 +07:00
Sergey M․
5c9ea67bc0 release 2017.07.30.1 2017-07-30 20:47:31 +07:00
Sergey M․
f701827e31 [ChangeLog] Actualize 2017-07-30 19:43:09 +07:00
Sergey M․
8b9f50d7cb [watchbox] Add extractor (#13739) 2017-07-30 19:09:44 +07:00
Sergey M․
0ed4758023 [clipfish] Remove extractor 2017-07-30 19:08:44 +07:00
Sergey M․
a0a477b885 [youjizz] Fix extraction (closes #13744) 2017-07-30 15:48:22 +07:00
Grzegorz Ruciński
198d4cb40c [generic] Add support for another ooyala embed pattern (closes #13727) 2017-07-30 01:30:04 +07:00
Sergey M․
ca127ab2c1 [ard] Add support for lives (closes #13771) 2017-07-29 23:07:28 +07:00
Sergey M․
e445850e69 [soundcloud] Update client id 2017-07-29 18:45:57 +07:00
Sergey M․
836ef26486 [soundcloud:trackstation] Add extractor (closes #13733) 2017-07-29 18:41:42 +07:00
Sergey M․
c04017519d [svtplay] Use geo verification proxy for API request 2017-07-29 15:30:53 +07:00
Sergey M․
2a7a823211 [svtplay] Update API URL (closes #13767) 2017-07-29 15:25:32 +07:00
Sergey M․
95908ce453 [extractor/generic] PEP 8 2017-07-29 15:13:12 +07:00
Sergey M․
cbbe66635f [yandexdisk] Add extractor (closes #13755) 2017-07-29 15:10:19 +07:00
Sergey M․
c5a49ff084 [downloader/hls] Use redirect URL as manifest base (#13755) 2017-07-29 15:02:41 +07:00
Philipp Hagemeister
24e966e8da [megaphone] Add extractor 2017-07-28 12:13:19 +02:00
Sergey M․
9682666bda [amcnetworks] Make rating optional (closes #12453) 2017-07-27 02:04:51 +07:00
Sergey M․
f9c48d895b [cloudy] Fix extraction (closes #13737) 2017-07-26 23:12:43 +07:00
Sergey M․
c99d6890cb [nickru] Add extractor 2017-07-23 21:02:06 +07:00
Sergey M․
70bfab0e9a [mtv] Improve thumbnail extraction 2017-07-23 21:02:06 +07:00
nyuszika7h
f0e31e32c9 [nick] Automate geo-restriction bypass (#13711) 2017-07-23 20:40:04 +07:00
nyuszika7h
3150976669 [ISSUE_TEMPLATE_tmpl.md] Minor improvements 2017-07-23 20:33:18 +07:00
Yen Chi Hsuan
e3ce912c3d [niconico] improve error reporting (#13696) 2017-07-23 16:25:30 +08:00
Yen Chi Hsuan
73095e013f [options] Typo 2017-07-23 16:24:18 +08:00
Yen Chi Hsuan
905d18a7aa [options] Correctly hide login info from debug outputs (#13696)
Iterate over opts instead of PRIVATE_OPTS for both performance and
correctness
2017-07-23 16:22:14 +08:00
Sergey M․
0db492c02a release 2017.07.23 2017-07-23 01:09:09 +07:00
Sergey M․
425f41319a [ChangeLog] Actualize 2017-07-23 01:06:08 +07:00
Sergey M․
71dde5eecf [itv] Fix production id extraction (closes #13671) 2017-07-23 00:59:07 +07:00
Sergey M․
935d6c20c0 [vidio] Make duration non fatal and fix typo 2017-07-23 00:44:50 +07:00
Sergey M․
e0f1fb0a27 [mtv] Skip missing video parts (closes #13690) 2017-07-23 00:25:23 +07:00
Sergey M․
0017d9ad6d [YoutubeDL] Improve default format specification (closes #13704) 2017-07-23 00:12:01 +07:00
Sergey M․
327c8364f1 [sportbox:embed] Fix extraction 2017-07-22 21:35:14 +07:00
dubber0
359aa2fdd1 [npo] Add support for npo3.nl URLs 2017-07-22 19:15:55 +07:00
Sergey M․
f76c02c87b [dramafever] Fix tests 2017-07-22 11:41:40 +07:00
Sergey M․
7d9a1db111 [dramafever] Remove video id from title (closes #13699) 2017-07-22 11:40:46 +07:00
Sergey M․
0396806f67 [YoutubeDL] Do not override id, extractor and extractor_key in url_transparent
All these meta fields must be borrowed from final extractor that actually performs extraction.
This commit fixes extractor id in download archives for url_transparent downloads. Previously, 'transparent' extractor was erroneously
used for extractor archive id, e.g. 'eggheadlesson 4n8ugwwj5t' instead of 'wistia 4n8ugwwj5t'.
2017-07-21 00:13:32 +07:00
Sergey M․
dc6520aa3d [egghead:lesson] Add extractor (#6635) 2017-07-20 23:22:36 +07:00
Sergey M․
c653326a14 [funnyordie] Extract more metadata (closes #13677) 2017-07-20 22:50:56 +07:00
Yen Chi Hsuan
3fcf346ac1 [youku:show] Refine playlist extraction
Handle playlists whose initial page is not the first page
2017-07-20 23:20:46 +08:00
Yen Chi Hsuan
fa63cf6c23 [youku:show] Fix playlist extraction (closes #13248) 2017-07-20 22:57:51 +08:00
Yen Chi Hsuan
85f5a74b6c [tbs] Mark as broken and skip invalid tests 2017-07-20 21:19:09 +08:00
Yen Chi Hsuan
d20b1c6725 [dispeak] Recognize sevt subdomain (closes #13276) 2017-07-20 18:14:14 +08:00
Sergey M․
bb176df3bb [spiegel:article] Move test 2017-07-17 22:19:40 +07:00
Sergey M․
83d00044c1 [adn] Improve error reporting (#13663) 2017-07-16 20:50:32 +07:00
Sergey M․
7abed4e06c [crunchyroll] Relax series and season regex (closes #13659) 2017-07-16 12:40:45 +07:00
Sergey M․
13eb526f11 [nexx:embed] PEP 8 2017-07-16 05:23:19 +07:00
Sergey M․
00d06e3cfc [spiegel:article] Add support for nexx iframe embeds (closes #13029) 2017-07-16 04:38:20 +07:00
Sergey M․
749ca5eced [extractor/common] Fix playlist_from_matches 2017-07-16 04:33:14 +07:00
Sergey M․
3f59b0154a [nexx:embed] Add extractor for iframe embeds 2017-07-16 04:32:37 +07:00
Sergey M․
089b97cfee [nexx] Improve JS embed extraction 2017-07-16 04:30:48 +07:00
Sergey M․
decf86044d [pearvideo] Improve (closes #13031) 2017-07-16 03:06:04 +07:00
troywith77
94b817edeb [pearvideo] Add extractor 2017-07-16 03:02:31 +07:00
Sergey M․
cea931a9e5 release 2017.07.15 2017-07-15 07:36:05 +07:00
Sergey M․
ef78563e9c [ChangeLog] Actualize 2017-07-15 07:33:26 +07:00
Sergey M․
961ea474b6 [YoutubeDL] PEP 8 2017-07-15 07:02:57 +07:00
Sergey M․
ea3f20494f [youtube] PEP 8 2017-07-15 07:02:57 +07:00
Sergey M․
c7604d79e9 [spiegeltv] Delegate extraction to nexx (closes #13159) 2017-07-15 07:02:57 +07:00
Sergey M․
4e826cd9ae [nexx] Add extractor (closes #10807, closes #13465) 2017-07-15 07:02:57 +07:00
Robin Neatherway
2583c0b54e Fix bugs caused by typos 2017-07-14 23:08:32 +07:00
Sergey M․
7d02dcfaa2 [youtube] Don't capture YouTube Red ad for creator meta field (closes #13621) 2017-07-14 22:37:04 +07:00
satunnainen
00dbdfc1f7 [slideshare] Fix extraction 2017-07-14 22:11:07 +07:00
rrooij
f354d84807 [5tv] Add another video URL pattern (closes #13354) 2017-07-14 22:10:17 +07:00
Sergey M․
15da37c7dc [YoutubeDL] Don't expand env variables in meta fields (closes #13637) 2017-07-14 00:42:12 +07:00
Sergey M․
9a0942ad55 [drtv] Make HLS and HDS extraction non fatal 2017-07-11 22:59:56 +07:00
Sergey M․
f2bb33a986 [ted] Fix subtitles extraction (closes #13628, closes #13629) 2017-07-11 21:36:45 +07:00
Yen Chi Hsuan
3615bfe1b4 [twitter] Fix remaining tests 2017-07-11 16:46:37 +08:00
Yen Chi Hsuan
e8f20ffa03 [vine] Make sure the title won't be empty
And fix a relevant TwitterCard test case
2017-07-11 16:05:15 +08:00
Yen Chi Hsuan
9be31e771c [twitter] Support HLS streams in vmap URLs 2017-07-11 15:48:48 +08:00
Yen Chi Hsuan
7f176ac477 [periscope] Support pscp.tv URLs in embedded frames
And fix a relevant twitter test
2017-07-11 15:35:19 +08:00
Yen Chi Hsuan
2edfd745df [twitter] Extract mp4 urls via mobile API (closes #12726) 2017-07-11 15:19:36 +08:00
Yen Chi Hsuan
708f6f511e [niconico] Fix authentication error handling (closes #12486) 2017-07-11 15:04:45 +08:00
Yen Chi Hsuan
bb13949197 [niconico] Check login errors (#12486) 2017-07-11 15:03:11 +08:00
Yen Chi Hsuan
c3c94ca4a4 [giantbomb] Extract m3u8 formats (closes #13626) 2017-07-10 21:34:27 +08:00
Sergey M․
e3cd1fcdd1 [vlive:playlist] Relax and simplify 2017-07-10 04:32:24 +07:00
coreynicholson
b71c18b434 [vlive:playlist] Add extractor 2017-07-10 04:24:04 +07:00
Sergey M․
7bf539edcc [eagleplatform] Fix test 2017-07-10 00:14:41 +07:00
Sergey M․
65c416dda8 release 2017.07.09 2017-07-09 20:16:38 +07:00
Sergey M․
207acd8465 [ChangeLog] Actualize 2017-07-09 20:15:15 +07:00
Sergey M․
71a1db8919 [dailymail] Add support for embeds 2017-07-09 20:06:24 +07:00
Sergey M․
6e925598d6 [cjsw] Add coding cookie 2017-07-09 19:18:12 +07:00
Sergey M․
73cf76a93f [joj] Rewrite and add support for generic embeds (closes #13268) 2017-07-09 19:17:54 +07:00
luboss
256a746d21 [joj] Add extractor 2017-07-09 19:17:38 +07:00
Sergey M․
58179eb7d9 [abc.net.au:iview] Extract more formats (closes #13492, closes #13489) 2017-07-09 17:55:40 +07:00
Sergey M․
485cb37576 [egghead:course] Improve (closes #13370) 2017-07-09 17:30:49 +07:00
Santiago Calcagno
ed84454d35 [egghead:course] Fix extraction 2017-07-09 17:30:25 +07:00
Sergey M․
a02682fd13 Keep in sync with ffmpeg's current malformed AAC bitstream wording (closes #13587) 2017-07-09 17:09:44 +07:00
Sergey M․
0d2f0b0357 [cjsw] Make description optional 2017-07-09 17:05:11 +07:00
Sergey M․
c319d1c483 [cjsw] Fix issues and improve extraction (closes #13525) 2017-07-09 17:01:05 +07:00
Christopher Smith
d2b9f362fa [cjsw] Add extractor 2017-07-09 17:01:00 +07:00
Sergey M․
4328ddf82b [extractor/common] Add support for AMP tags in _parse_html5_media_entries 2017-07-09 16:29:52 +07:00
Sergey M․
250b042c7e [generic] Add tests for #13557 2017-07-09 16:02:38 +07:00
Sergey M․
665e945246 [eagleplatform] Add support for referrer protected videos (closes #13557) 2017-07-09 15:57:58 +07:00
Sergey M․
5af2fd7fa0 [eagleplatform] Add support for another embed pattern (#13557) 2017-07-09 15:55:04 +07:00
mlindner
15237fcd51 [veoh] Extend _VALID_URL 2017-07-09 14:54:52 +07:00
rrooij
7a57730907 [npo:live] Fix live stream id extraction (closes #13568) 2017-07-09 14:21:40 +07:00
Sergey M․
8b347a389e [googledrive] Fix height extraction (closes #13603) 2017-07-09 00:26:13 +07:00
Sergey M․
a49804816c [dailymotion] Add support for new layout (closes #13580) 2017-07-08 18:12:15 +07:00
Yen Chi Hsuan
eadd313321 [yam] Remove extractor
mymedia.yam.com is dead. A Wikipedia user also pointed out that Yam's
blog service is no longer available. [1]

[1] https://zh.wikipedia.org/zh-tw/%E5%A4%A9%E7%A9%BA%E9%83%A8%E8%90%BD
2017-07-08 15:48:05 +08:00
Sergey M․
d852c6bc59 [xhamster] Extract all formats and fix duration extraction (#13593) 2017-07-07 22:49:11 +07:00
Sergey M․
00e5c36315 [xhamster] Add support for new URL schema (closes #13593) 2017-07-07 22:27:34 +07:00
Sergey M․
8a04ade86b Credit @parmjitv for #13322, #13503, #13541, #13549 2017-07-06 23:15:23 +07:00
Sergey M․
ab328411d5 Credit @orng for ruv (#13396) 2017-07-06 23:15:16 +07:00
Sergey M․
ddeff4be3f Credit @gfabiano for #13382, #13385, #13415 2017-07-06 23:15:09 +07:00
Parmjit Virk
60d4401c5e [espn] Extend _VALID_URL (fixes #13244) 2017-07-06 22:55:59 +07:00
Sergey M․
dee2ff1d81 [test_utils] Fix tests under Windows 2017-07-06 00:25:37 +07:00
Sergey M․
6554708252 [kaltura] Fix typo in subtitles extraction (closes #13569) 2017-07-05 23:20:50 +07:00
Sergey M․
0a2e1b2e30 [vier] Adapt extraction to redesign (#13575) 2017-07-05 22:52:47 +07:00
Yen Chi Hsuan
babbc04d45 [xuite] Move to the new HTML5 API and reduce # of requests 2017-07-05 23:27:12 +08:00
Yen Chi Hsuan
609ff8ca19 [utils] Support attributes with no values in get_elements_by_attribute() 2017-07-05 23:27:12 +08:00
Sergey M․
b6c9fe4162 release 2017.07.02 2017-07-02 20:17:10 +07:00
Sergey M․
4d9ba27bba [ChangeLog] Actualize 2017-07-02 20:12:40 +07:00
Sergey M․
50ae3f646e [thisoldhouse] Add more fallbacks for video id (closes #13541) 2017-07-02 20:06:15 +07:00
Parmjit Virk
99a7e76240 [thisoldhouse] Update test 2017-07-02 20:05:11 +07:00
Parmjit Virk
a3a6d01a96 [thisoldhouse] Fix video id extraction (closes #13540) 2017-07-02 20:04:51 +07:00
Sergey M․
02d61a65e2 [xfileshare] Extend format regex (closes #13536) 2017-07-02 08:00:22 +07:00
Sergey M․
9b35297be1 [extractors] Add import for tastytrade 2017-07-01 18:39:29 +07:00
Sergey M․
4917478803 [ted] Fix extraction (closes #13535) 2017-07-01 18:39:01 +07:00
Sergey M․
54faac2235 [tastytrade] Add extractor (closes #13521) 2017-06-30 22:20:30 +07:00
Sergey M․
c69701c6ab [extractor/common] Improve _json_ld 2017-06-30 22:19:06 +07:00
Sergey M․
d4f8ce6e91 [dplayit] Relax video id regex (closes #13524) 2017-06-30 21:55:45 +07:00
Sergey M․
b311b0ead2 [generic] Extract more generic metadata (closes #13527) 2017-06-30 21:42:04 +07:00
Sergey M․
72d256c434 [bbccouk] Extend _VALID_URL 2017-06-29 22:29:28 +07:00
Sergey M․
b2ed954fc6 [bbccouk] Capture and output error message (closes #13518) 2017-06-29 22:27:53 +07:00
Sergey M․
a919ca0ad6 [cbsnews] Actualize test 2017-06-28 22:30:12 +07:00
Parmjit Virk
88d6b7c2bd [cbsnews] Relax video info regex (fixes #13284) 2017-06-28 22:21:35 +07:00
Sergey M․
fd1c5fba6b [facebook] Add test for plugin video embed (#13493) 2017-06-27 22:38:59 +07:00
Sergey M․
0646e34c7d [facebook] Add support for plugin video embeds and multiple embeds (closes #13493) 2017-06-27 22:38:54 +07:00
Sergey M․
bf2dc9cc6e [soundcloud] Fix tests 2017-06-27 21:26:46 +07:00
Viktor Szakats
f1c051009b [soundcloud] Switch to https for API requests 2017-06-27 21:20:18 +07:00
Sergey M․
33ffb645a6 [pandatv] Switch to https for API and download URLs 2017-06-26 22:11:09 +07:00
Xuan Hu (Sean)
35544690e4 [pandatv] Add support for https URLs 2017-06-26 22:00:31 +07:00
Yen Chi Hsuan
136503e302 [ChangeLog] Update after #13494 2017-06-26 19:56:07 +08:00
Luca Steeb
4a87de72df [niconico] fix sp subdomain links 2017-06-25 21:30:05 +02:00
Sergey M․
a7ce8f16c4 release 2017.06.25 2017-06-25 05:16:06 +07:00
Sergey M․
a5aea53fc8 [ChangeLog] Actualize 2017-06-25 05:13:12 +07:00
Sergey M․
0c7a631b61 [adobepass] Add support for ATTOTT MSO (DIRECTV NOW) (closes #13472) 2017-06-25 05:03:17 +07:00
Sergey M․
fd9ee4de8c [wsj] Add support for barrons.com (closes #13470) 2017-06-25 02:15:35 +07:00
Argn0
5744cf6c03 [ign] Add another video id pattern (closes #13328) 2017-06-25 01:59:15 +07:00
Sergey M․
9c48b5a193 [raiplay:live] Improve and add test (closes #13414) 2017-06-25 01:49:27 +07:00
james
449c665776 [raiplay:live] Add extractor 2017-06-25 01:48:54 +07:00
Sergey M․
23aec3d623 [redbulltv] Restore hls format prefix 2017-06-25 01:10:31 +07:00
Sergey M․
27449ad894 [redbulltv] Add support for lives and segments (closes #13486) 2017-06-25 01:09:12 +07:00
Sergey M․
bd65f18153 [onetpl] Add support for videos embedded via pulsembed (closes #13482) 2017-06-24 18:33:31 +07:00
Sergey M․
73af5cc817 [YoutubeDL] Skip malformed formats for better extraction robustness 2017-06-23 21:18:33 +07:00
Sergey M․
b5f523ed62 [ooyala] Add test for missing stream['url']['data'] 2017-06-23 20:56:48 +07:00
Sergey M․
4f4dd8d797 [ooyala] Make more robust 2017-06-23 20:56:21 +07:00
Sergey M․
4cb18ab1b9 [ooyala] Skip empty format URLs (closes #13471, closes #13476) 2017-06-23 20:50:48 +07:00
Sergey M․
ac7409eec5 [hgtv.com:show] Fix typo 2017-06-23 02:54:12 +07:00
Sergey M․
170719414d release 2017.06.23 2017-06-23 02:13:21 +07:00
Sergey M․
38dad4737f [ChangeLog] Actualize 2017-06-23 02:10:54 +07:00
Sergey M․
ddbb4c5c3e [youtube] Adapt to new automatic captions rendition (closes #13467) 2017-06-23 02:00:19 +07:00
Sergey M․
fa3ea7223a [hgtv.com:show] Relax video config regex and update test (closes #13279, closes #13461) 2017-06-23 00:42:42 +07:00
Parmjit Virk
0f4a5a73e7 [drtuber] Fix formats extraction (fixes #12058) 2017-06-23 00:08:36 +07:00
Sergey M․
18166bb8e8 [youporn] Fix upload date extraction 2017-06-22 00:47:02 +07:00
Sergey M․
d4893e764b [youporn] Improve formats extraction 2017-06-22 00:40:15 +07:00
Sergey M․
97b6e30113 [youporn] Fix title extraction (closes #13456) 2017-06-22 00:20:45 +07:00
Sergey M․
9be9ec5980 [googledrive] Fix formats' sorting (closes #13443) 2017-06-20 22:58:33 +07:00
Giuseppe Fabiano
048b55804d [watchindianporn] Fix extraction (closes #13411) 2017-06-20 04:30:45 +07:00
Giuseppe Fabiano
6ce79d7ac0 [abcotvs] Fix test md5 2017-06-20 04:07:00 +07:00
Sergey M․
1641ca402d [vimeo] Add fallback mp4 extension for original format 2017-06-20 01:27:59 +07:00
Sergey M․
85cbcede5b [ruv] Improve, extract all formats and metadata (closes #13396) 2017-06-19 23:46:03 +07:00
Orn
a1de83e5f0 [ruv] Add extractor 2017-06-19 23:45:45 +07:00
Sergey M․
fee00b3884 [viu] Fix extraction on older python 2.6 2017-06-19 22:57:37 +07:00
Sergey M․
2d2132ac6e [adobepass] Fix extraction on older python 2.6 2017-06-19 22:54:53 +07:00
Yen Chi Hsuan
cc2ffe5afe [pandora.tv] Fix upload_date extraction (closes #12846) 2017-06-19 16:20:36 +08:00
Sergey M․
560050669b [asiancrush] Add extractor (closes #13420) 2017-06-18 20:18:51 +07:00
Sergey M․
eaa006d1bd release 2017.06.18 2017-06-18 00:16:49 +07:00
Sergey M․
a6f29820c6 [ChangeLog] Actualize 2017-06-18 00:15:43 +07:00
Sergey M․
1433734c35 [downloader/common] Use utils.shell_quote for debug command line 2017-06-17 23:50:21 +07:00
Sergey M․
aefce8e6dc [utils] Use compat_shlex_quote in shell_quote 2017-06-17 23:48:58 +07:00
Sergey M․
8b6ac49ecc [postprocessor/execafterdownload] Encode command line (closes #13407) 2017-06-17 23:16:53 +07:00
Sergey M․
b08e235f09 [compat] Fix compat_shlex_quote on Windows (closes #5889, closes #10254) 2017-06-17 23:14:24 +07:00
Sergey M․
be80986ed9 [postprocessor/metadatafromtitle] Fix missing optional meta fields (closes #13408) 2017-06-17 19:05:10 +07:00
Yen Chi Hsuan
473e87064b [devscripts/prepare_manpage] Fix deprecated escape sequence on py36 2017-06-17 17:37:25 +08:00
Yen Chi Hsuan
4f90d2aeac [Makefile] Excluding __pycache__ correctly (#13400) 2017-06-17 17:09:24 +08:00
Jakub Adam Wieczorek
b230fefc3c [polskieradio] Fix extraction 2017-06-16 04:57:56 +07:00
Sergey M․
96a2daa1ee [extractor/common] Improve jwplayer subtitles extraction 2017-06-15 23:40:39 +07:00
gfabiano
0ea6efbb7a [xfileshare] Add support for fastvideo.me 2017-06-15 21:41:49 +07:00
Yen Chi Hsuan
6a9cb29509 [extractor/common] Fix json dumping with --geo-bypass
The line "[debug] Using fake IP %s (%s) as X-Forwarded-For." was printed
to stdout even with -j/-J, which breaks the resultant JSON.
2017-06-15 13:04:36 +08:00
Yen Chi Hsuan
ca27037171 [bilibili] Fix extraction of videos with double quotes in titles
Closes #13387
2017-06-15 11:19:03 +08:00
gfabiano
0bf4b71b75 [4tube] Fix extraction (closes #13381) 2017-06-15 04:16:50 +07:00
Marcin Cieślak
5215f45327 [disney] Add support for disneychannel.de 2017-06-15 04:13:04 +07:00
Sergey M․
0a268c6e11 [extractor/common] Improve jwplayer formats extraction (closes #13379) 2017-06-14 22:02:15 +07:00
Sergey M․
7dd5415cd0 [npo] Improve _VALID_URL (closes #13376) 2017-06-14 21:33:40 +07:00
Sergey M․
b5dc33daa9 [corus] Add support for showcase.ca 2017-06-13 23:27:27 +07:00
Sergey M․
97fa1f8dc4 [corus] Add support for history.ca (closes #13359) 2017-06-13 23:16:21 +07:00
Sergey M․
b081f53b08 [compat] Add compat_HTMLParseError to __all__ 2017-06-12 02:36:43 +07:00
Sergey M․
cb1e6d8985 release 2017.06.12 2017-06-12 02:23:17 +07:00
Sergey M․
9932ac5c58 [ChangeLog] Actualize 2017-06-12 02:01:15 +07:00
Sergey M․
bf87c36c93 [xfileshare] PEP 8 2017-06-12 02:01:12 +07:00
Sergey M․
b4a3d461e4 [utils] Handle HTMLParseError in extract_attributes (closes #13349) 2017-06-12 01:52:24 +07:00
Sergey M․
72b409559c [compat] Introduce compat_HTMLParseError 2017-06-12 01:50:32 +07:00
Sergey M․
534863e057 [xfileshare] Add support for rapidvideo (closes #13348) 2017-06-12 00:16:47 +07:00
Sergey M․
16bc958287 [xfileshare] Modernize and pass referrer 2017-06-12 00:14:04 +07:00
Sergey M․
624bd0104c [rutv] Add support for testplayer.vgtrk.com (closes #13347) 2017-06-11 21:36:19 +07:00
Sergey M․
28a4d6cce8 [newgrounds] Extract more metadata (closes #13232) 2017-06-11 21:30:06 +07:00
Sergey M․
2ae2ffda5e [utils] Improve unified_timestamp 2017-06-11 21:27:22 +07:00
Sergey M․
70e7967202 [newgrounds:playlist] Add extractor (closes #10611) 2017-06-11 20:50:55 +07:00
Sergey M․
6e999fbc12 [newgrounds] Improve formats and uploader extraction (closes #13346) 2017-06-11 19:44:44 +07:00
Sergey M․
7409af9eb3 [msn] Fix formats extraction 2017-06-11 08:56:53 +07:00
Sergey M․
4e3637034c [extractor/generic] Ensure format id is unicode string 2017-06-10 23:56:20 +07:00
Sergey M․
1afd0b0da7 [extractor/common] Return unicode string from _match_id 2017-06-09 00:40:03 +07:00
Sergey M․
7515830422 [turbo] Ensure format id is string 2017-06-09 00:31:56 +07:00
Sergey M․
f5521ea209 [sexu] Ensure height is int 2017-06-09 00:30:23 +07:00
Sergey M․
34646967ba [jove] Ensure comment count is int 2017-06-09 00:29:20 +07:00
Sergey M․
e4d2e76d8e [golem] Ensure format id is string 2017-06-09 00:27:11 +07:00
Sergey M․
87f5646937 [gfycat] Ensure filesize is int 2017-06-09 00:24:23 +07:00
Sergey M․
cc69a3de1b [foxgay] Ensure height is int 2017-06-09 00:22:14 +07:00
Sergey M․
15aeeb1188 [flickr] Ensure format id is string 2017-06-09 00:20:07 +07:00
Sergey M․
1693bebe4d [sohu] Fix numeric fields 2017-06-09 00:16:42 +07:00
Sergey M․
4244a13a1d [safari] Improve authentication detection (closes #13319) 2017-06-08 23:20:48 +07:00
Sergey M․
931adf8cc1 [liveleak] Ensure height is int (closes #13313) 2017-06-08 22:54:30 +07:00
Sergey M․
c996943418 [YoutubeDL] Sanitize more fields (#13313) 2017-06-08 22:53:14 +07:00
Sergey M․
76e6378358 [README.md] Improve man page formatting 2017-06-08 22:02:42 +07:00
Sergey M․
a355b57f58 [README.md] Clarify output template references (closes #13316) 2017-06-08 21:52:19 +07:00
Sergey M․
1508da30c2 [streamango] Skip download for test (closes #13292) 2017-06-07 21:53:40 +07:00
Luca Steeb
eb703e5380 [streamango] Make title optional 2017-06-07 21:53:33 +07:00
Sergey M․
0a3924e746 [rtlnl] Improve _VALID_URL (closes #13295) 2017-06-06 21:21:44 +07:00
Sergey M․
e1db730d86 [tvplayer] Fix extraction (closes #13291) 2017-06-06 00:13:57 +07:00
Sergey M․
537191826f release 2017.06.05 2017-06-05 00:48:07 +07:00
Sergey M․
130880ba48 [ChangeLog] Actualize 2017-06-05 00:43:38 +07:00
Sergey M․
f8ba3fda4d Credit @jktjkt for dvtv formats (#13063) 2017-06-05 00:38:44 +07:00
Sergey M․
e1b90cc3db Credit @mikf for beam:vod (#13032) 2017-06-05 00:35:41 +07:00
Sergey M․
43e6579558 Credit @adamvoss for bandcamp:weekly (#12758) 2017-06-04 23:22:19 +07:00
Sergey M․
6d923aab35 [bandcamp:weekly] Improve and extract more metadata (closes #12758) 2017-06-04 23:21:30 +07:00
Adam Voss
62bafabc09 [bandcamp:weekly] Add extractor 2017-06-04 23:21:07 +07:00
Sergey M․
9edcdac90c [pornhub:uservideos] Add missing raise 2017-06-04 20:39:55 +07:00
Sergey M․
cd138d8bd4 [pornhub:playlist] Fix extraction (closes #13281) 2017-06-04 15:54:19 +07:00
Sergey M․
cd750b731c [godtv] Remove extractor (closes #13175) 2017-06-03 22:08:12 +07:00
CeruleanSky
4bede0d8f5 [YoutubeDL] Don't emit ANSI escape codes on Windows 2017-06-03 19:14:23 +07:00
Sergey M․
f129c3f349 [safari] Fix typo (closes #13252) 2017-06-02 01:03:51 +07:00
Sergey M․
39d4c1be4d [youtube] Improve chapters extraction (closes #13247) 2017-06-01 23:29:45 +07:00
Sergey M․
f7a747ce59 [1tv] Lower preference for http formats (closes #13246) 2017-06-01 22:19:52 +07:00
Sergey M․
4489d41816 [francetv] Relax _VALID_URL 2017-06-01 00:15:15 +07:00
Sergey M․
87b5184a0d [drbonanza] Fix extraction (closes #13231) 2017-05-31 23:56:32 +07:00
Remita Amine
c56ad5c975 [packtpub] Fix authentication (closes #13240) 2017-05-31 15:44:29 +01:00
Sergey M․
6b7ce85cdc [README.md] Mention http_dash_segments protocol 2017-05-30 23:50:48 +07:00
Yen Chi Hsuan
d10d0e3cf8 [README.md] Add an example for how to use .netrc on Windows
That's a Python bug: http://bugs.python.org/issue28334
Most likely it will be fixed in Python 3.7: https://github.com/python/cpython/pull/123
2017-05-29 14:58:07 +08:00
Sergey M․
941ea38ef5 release 2017.05.29 2017-05-29 00:42:18 +07:00
Sergey M․
99bea8d298 [ChangeLog] Actualize 2017-05-29 00:33:56 +07:00
Yen Chi Hsuan
a49eccdfa7 [youtube] Parse player_url if format URLs are encrypted or DASH MPDs are requested
Fixes #13211
2017-05-28 20:20:20 +08:00
Sergey M․
a846173d93 [xhamster] Simplify (closes #13216) 2017-05-28 07:55:56 +07:00
fiocfun
78e210dea5 [xhamster] Fix author and like/dislike count extraction 2017-05-28 07:55:07 +07:00
Sergey M․
8555204274 [xhamster] Extract categories (closes #11728) 2017-05-28 07:50:15 +07:00
Sergey M․
164fcbfeb7 [abcnews] Improve and remove duplicate test (closes #12851) 2017-05-28 07:06:56 +07:00
Tithen-Firion
bc22df29c4 [abcnews] Add support for embed URLs 2017-05-28 07:06:29 +07:00
Sergey M․
7e688d2f6a [gaskrank] Improve (closes #12493) 2017-05-28 06:47:38 +07:00
motophil
5a6d1da442 [gaskrank] Fix extraction 2017-05-28 06:47:30 +07:00
Sergey M․
703751add4 [medialaan] PEP 8 (closes #12774) 2017-05-28 06:27:57 +07:00
midas02
4050be78e5 [medialaan] Fix videos with missing videoUrl
A rough trick to get around the two different json styles medialaan seems to be using.
Fix for these example videos:
https://vtmkzoom.be/video?aid=45724
https://vtmkzoom.be/video?aid=45425
2017-05-28 06:27:52 +07:00
Sergey M․
4d9fc40100 [dvtv] Improve and fix playlists support (closes #13063) 2017-05-28 06:19:54 +07:00
Jan Kundrát
765522345f [dvtv] Parse adaptive formats as well
The old code hit an error when it attempted to parse the string
"adaptive" for video height. Actually parsing the returned playlists is
a good idea because it adds more output formats, including some
audio-only ones.
2017-05-28 06:19:46 +07:00
Sergey M․
6bceb36b99 [beam] Improve and add support for mixer.com (closes #13032) 2017-05-28 05:43:04 +07:00
Mike Fährmann
1e0d65f0bd [beam:vod] Add extractor 2017-05-28 05:42:23 +07:00
Sergey M․
03327bc9a6 [cbsinteractive] Relax _VALID_URL (closes #13213) 2017-05-27 22:37:24 +07:00
Yen Chi Hsuan
b407d8533d [utils] Drop a compatibility wrapper for Python < 2.6
addinfourl.getcode has been available since Python 2.6a1. As youtube-dl now
requires 2.6+, this is no longer necessary.

See 9b0d46db11
2017-05-27 23:05:02 +08:00
Remita Amine
20e2c9de04 [adn] fix formats extraction 2017-05-26 20:00:44 +01:00
Yen Chi Hsuan
d16c0121b9 [youku] Extract more metadata (closes #10433) 2017-05-27 00:08:37 +08:00
Sergey M․
7f4c3a7439 [cbsnews] Fix extraction (closes #13205) 2017-05-26 22:42:27 +07:00
Sergey M․
28dbde9cc3 release 2017.05.26 2017-05-26 22:29:16 +07:00
Sergey M․
cc304ce588 [ChangeLog] Actualize 2017-05-26 22:27:56 +07:00
Yen Chi Hsuan
98a0618941 [ChangeLog] Update after the fix for #11381 2017-05-26 23:22:54 +08:00
Yen Chi Hsuan
fd545fc6d1 Revert "[youtube] Don't use the DASH manifest from 'get_video_info' if 'use_cipher_signature' is True (#5118)"
This reverts commit 87dc451108.
2017-05-26 23:22:54 +08:00
Sergey M․
97067db2ae [bbc] Add support for authentication 2017-05-26 22:12:24 +07:00
Yen Chi Hsuan
c130f0a37b [tudou] Merge into youku extractor (fixes #12214)
Also, there are no tudou playlists anymore. All playlist URLs point to youku
playlists.
2017-05-26 23:04:42 +08:00
Yen Chi Hsuan
d3d4ba7f24 [youku:show] Fix extraction 2017-05-26 21:59:16 +08:00
Yen Chi Hsuan
5552c9eb0f [utils] Recognize more patterns in strip_jsonp()
Used in Youku Show pages
2017-05-26 21:58:18 +08:00
Yen Chi Hsuan
59ed87cbd9 [youku] Fix extraction (closes #13191) 2017-05-26 19:16:52 +08:00
Sergey M․
b7f8749304 [udemy] Fix extraction for outputs' format entries without URL (closes #13192) 2017-05-25 22:28:26 +07:00
Yen Chi Hsuan
5192ee17e7 [postprocessor/ffmpeg] Fix metadata filename handling on Python 2
Fixes #13182
2017-05-25 22:07:03 +08:00
Sergey M․
e834f04400 [vimeo] Fix formats' sorting (closes #13189) 2017-05-24 22:58:16 +07:00
Remita Amine
884d09f330 [cbsnews] fix extraction for 60 Minutes videos 2017-05-24 09:44:41 +01:00
remitamine
9e35298f97 Merge pull request #12861 from Tithen-Firion/cbsinteractive-fix
[cbsinteractive] update extractor and test cases
2017-05-24 10:21:33 +02:00
Sergey M․
0551f1b07b Credit @gritstub for vevo fix (#12879) 2017-05-23 21:47:40 +07:00
Sergey M․
de53511201 Credit @timendum for rai (#11790) and mediaset (#12964) 2017-05-23 00:41:53 +07:00
Sergey M․
2570e85167 release 2017.05.23 2017-05-23 00:17:48 +07:00
Sergey M․
9dc5ab041f [ChangeLog] Actualize 2017-05-23 00:15:44 +07:00
Sergey M․
01f3c8e290 Credit @fredbourni for noovo (#12792) 2017-05-23 00:08:14 +07:00
Sergey M․
06c1b3ce07 Credit @mphe for streamango (#12643) 2017-05-23 00:07:31 +07:00
Sergey M․
0b75e42dfb Credit @zurfyx for atresplayer improvements (#12548) 2017-05-23 00:00:49 +07:00
Sergey M․
a609e61a90 [downloader/external] Pass -loglevel to ffmpeg downloader (closes #13183) 2017-05-22 23:40:07 +07:00
Ondřej Caletka
afdb387cd8 [streamcz] Add support for subtitles 2017-05-21 15:41:52 +07:00
Sergey M․
dc4e4f90a2 [youtube] Modernize 2017-05-21 01:18:56 +07:00
Protuhj
fdc20f87a6 [youtube] Fix DASH manifest signature decryption (closes #8944) 2017-05-21 01:11:37 +07:00
Sergey M․
35a2d221a3 [toggle] Relax _VALID_URL (closes #13172) 2017-05-20 23:06:30 +07:00
Nii-90
daa4e9ff90 [adobepass] Add support for Brighthouse MSO 2017-05-20 20:50:46 +07:00
Sergey M․
2ca29f1aaf [toypics] Improve and modernize 2017-05-20 01:29:33 +07:00
vobe
77d682da9d [toypics] Fix extraction 2017-05-20 01:18:03 +07:00
Sergey M․
8fffac6927 [njpwworld] Fix extraction (closes #13162) 2017-05-19 23:11:02 +07:00
Sergey M․
5f6fbcea08 [hitbox] Add support for smashcast.tv (closes #13154) 2017-05-19 22:34:00 +07:00
Logan B
00cb0faca8 [mitele] Update app key regex 2017-05-19 21:54:57 +07:00
Sergey M․
bfdf6fcc66 release 2017.05.18.1 2017-05-18 23:00:03 +07:00
Sergey M․
bcaa1dd060 [ChangeLog] Actualize 2017-05-18 22:58:14 +07:00
Sergey M․
0e2d626ddd [jsinterp] Fix typo and cleanup regexes (closes #13134) 2017-05-18 22:57:38 +07:00
Sergey M․
9221d5d7a8 [ChangeLog] Fix typo 2017-05-18 22:37:32 +07:00
Sergey M․
9d63e57d1f release 2017.05.18 2017-05-18 22:30:37 +07:00
Sergey M․
3bc1eea0d8 [ChangeLog] Actualize 2017-05-18 22:29:25 +07:00
Sergey M․
7769f83701 [jsinterp] Add support for quoted names and indexers (closes #13123, closes #13130) 2017-05-18 22:18:33 +07:00
Sergey M․
650bd94716 [vier] Relax regexes and extract more metadata (closes #12539) 2017-05-17 23:39:01 +07:00
mrBliss
36b226d48f [vier] Extract more info
Extract the `episode_number` and `upload_date`. Also extract the real
`description`.
2017-05-17 23:38:50 +07:00
Sergey M․
f2e2f0c777 [extractor/common] Fix rtmp and rtsp formats' URLs in _extract_wowza_formats 2017-05-17 22:20:25 +07:00
Sergey M․
6f76679804 [extractor/common] Add support for schemeless URLs in _extract_wowza_formats (closes #13088, closes #13092) 2017-05-16 22:11:34 +07:00
Sergey M․
7073015a23 [vier] PEP 8 and cleanup 2017-05-15 22:00:53 +07:00
mrBliss
89fd03079b [vier] Improve extraction
+ Add support for authentication
* Bypass authentication when no credentials provided
* Improve extraction robustness
2017-05-15 21:46:55 +07:00
Sergey M․
1c45b7a8a9 [dailymail] Fix sources extraction (closes #13057) 2017-05-14 12:47:19 +07:00
Sergey M․
60f5c9fb19 [utils] Recognize more audio codecs (#13081) 2017-05-14 12:33:33 +07:00
Sergey M․
c360e641e9 [dailymotion] Extend _VALID_URL (closes #13079) 2017-05-14 09:55:40 +07:00
Sergey M․
6f3c632c24 release 2017.05.14 2017-05-14 07:38:40 +07:00
Sergey M․
09b866e171 [ChangeLog] Actualize 2017-05-14 07:37:31 +07:00
Sergey M․
166d12b00c [options] PEP 8 2017-05-14 07:33:24 +07:00
Sergey M․
2b8e6a68f8 [extractor/generic] Add test for mediaset embed 2017-05-14 06:40:19 +07:00
Sergey M․
d105a7edc6 [mediaset] Fix upload date 2017-05-14 06:39:47 +07:00
Sergey M․
5d29af3d15 [extractor/generic] Add support for mediaset embeds 2017-05-14 06:29:16 +07:00
Sergey M․
ca04de463d [mediaset] Add support for shortcut 2017-05-14 06:28:40 +07:00
Sergey M․
946826eec7 [extractor/generic] Remove duplicate limelight code 2017-05-14 06:17:34 +07:00
Sergey M․
76d5a36391 [extractor/common] Respect Width and Height attributes in ISM manifests 2017-05-14 06:11:45 +07:00
Sergey M․
56f9c77f0e [mediaset] Improve extraction (closes #12708, closes #12964) 2017-05-14 05:30:13 +07:00
Timendum
0de136341a [mediaset] Add extractor 2017-05-14 05:30:02 +07:00
Sergey M․
1339ecb2f8 [orf:radio] Cleanup _VALID_URLs (closes #11643) 2017-05-14 04:31:20 +07:00
phaer
efe9316703 [orf:radio] Fix extraction
Since oe1.orf.at has been updated, both ORF radios supported by youtube_dl
use the same API. This commit honors this fact by merging both extractors
into one.
2017-05-14 04:31:14 +07:00
Luca Steeb
851a01aed6 [aljazeera] Extend _VALID_URL 2017-05-14 00:57:02 +07:00
Sergey M․
b845766597 [imdb] Relax _VALID_URL (closes #13056) 2017-05-14 00:32:50 +07:00
Sergey M․
fa26734e07 [postprocessor/metadatafromtitle] Add support regex syntax for --metadata-from-title (closes #13065) 2017-05-14 00:03:15 +07:00
Sergey M․
12f01118b0 [francetv] Add support for mobile.france.tv (closes #13068) 2017-05-13 21:57:00 +07:00
Sergey M․
7fc60f4ee9 [upskill] Add extractor (closes #13043) 2017-05-13 21:52:59 +07:00
Sergey M․
58bb440283 [extractor/generic] Extract wistia embed code into separate method 2017-05-13 21:51:58 +07:00
Remita Amine
7ad4362357 [thescene] fix extraction(closes #13061) 2017-05-12 16:37:09 +01:00
Remita Amine
6c52477f59 [condenast] improve embed support 2017-05-12 16:37:09 +01:00
Yen Chi Hsuan
116283ff64 [liveleak] Fix extraction (#12053) 2017-05-12 19:15:33 +08:00
Yen Chi Hsuan
7274f3d0e9 [douyu] Support Douyu shows (closes #12228) 2017-05-12 18:44:24 +08:00
Sergey M․
3166b1f0ac [myspace] Improve _VALID_URL (closes #13040) 2017-05-10 22:35:46 +07:00
Remita Amine
39ee263819 use platform=desktop in assets url(closes #13041) 2017-05-10 08:50:30 +01:00
Sergey M․
a7ed6b341c release 2017.05.09 2017-05-09 04:20:13 +07:00
Sergey M․
cbd84b5817 [ChangeLog] Actualize 2017-05-09 23:17:22 +07:00
Sergey M․
6d1ded7502 [francetv] Adapt to site redesign (closes #13034) 2017-05-09 23:07:01 +07:00
Remita Amine
5d0968f0af [packtpub] add support for authentication(closes #12622) 2017-05-09 11:15:14 +01:00
Sergey M․
8d65880e24 [drtv] Improve extraction and update tests (closes #13013, closes #13016) 2017-05-09 15:37:09 +07:00
Rasmus Rendal
b972fb037b [drtv] Lower preference for SignLanguage formats (closes #13013) 2017-05-09 15:36:02 +07:00
Remita Amine
5996d21aea [cspan] add support for brightcove live embeds(closes #13028) 2017-05-09 00:51:12 +01:00
Remita Amine
afa0200bf0 [vrv] extract dash formats and subtitles 2017-05-08 20:04:40 +01:00
Sergey M․
e9137224b3 [YoutubeDL] Force restrict filenames when no locale is set for python 2 as well (#13027) 2017-05-09 01:14:02 +07:00
Remita Amine
804181dda9 [funimation] remove codes related to old login method and update test 2017-05-08 18:58:13 +01:00
Remita Amine
8fa17117df [funimation] fix authentication(closes #13021) 2017-05-08 18:13:58 +01:00
Remita Amine
3b859145c2 [adultswim] Fix Extraction(closes #8640)(closes #10950)(closes #11042)(closes #12121)
- add support for adobe pass authentication
- add support for live streams
- add support for show pages
2017-05-08 15:07:40 +01:00
Remita Amine
04c09f1961 [turner] extract thumbnail and is_live and strip description 2017-05-08 15:07:40 +01:00
Sergey M․
bf82b87323 [nonktube] Use econfig nuevo URL 2017-05-08 20:13:22 +07:00
Sergey M․
b6eb74e340 [nonktube] Add extractor (closes #8647, closes #13024) 2017-05-08 20:10:39 +07:00
Sergey M․
3d40084b83 [nuevo] Pass headers to _extract_nuevo 2017-05-08 20:03:38 +07:00
Remita Amine
52294cdda7 [nbc] remove unused imports and extract permalink from modified urls 2017-05-07 09:31:14 +01:00
Remita Amine
2eeb588efe [nbc] improve extraction(closes #12364) 2017-05-07 08:59:31 +01:00
Sergey M․
4ac0f573ef release 2017.05.07 2017-05-07 04:51:34 +07:00
Sergey M․
3892a9f4ab [ChangeLog] Actualize 2017-05-07 04:44:54 +07:00
Sergey M․
3995d37da5 [youtube] Fix TFA (#12927) 2017-05-07 04:19:11 +07:00
Sergey M․
e4a75d7932 [test_youtube_chapters] PEP 8 2017-05-07 00:00:11 +07:00
Sergey M․
e00eb564e9 [youtube] Fix authentication (closes #12927) 2017-05-06 23:58:47 +07:00
Yen Chi Hsuan
10c87c151b [utils] Rename try_multipart_encode to _multipart_encode_impl
To state that this is an internal function and people should be careful
when using it outside youtube-dl.
2017-05-06 19:06:18 +08:00
Yen Chi Hsuan
228cd9bb90 [bilibili] Fix video downloading (closes #13001) 2017-05-06 18:58:38 +08:00
Sergey M․
566fbbaefd [rmcdecouverte] Improve (closes #12937) 2017-05-06 17:56:10 +07:00
midas02
74c09c852a [rmcdecouverte] Fix extraction 2017-05-06 17:56:10 +07:00
Remita Amine
fd178b8748 [theplatform] extract chapters 2017-05-06 07:19:07 +01:00
Sergey M․
a57a8e9918 [test_youtube_chapters] Add coding cookie 2017-05-06 05:30:56 +07:00
Tithen-Firion
1f9fefe7f5 [crackle] Update test 2017-05-06 03:39:14 +07:00
Luca Steeb
8b4774dcac [bandcamp] Fix thumbnail extraction 2017-05-06 03:35:42 +07:00
Sergey M․
a99cc4ca16 [pornhub] Extend _VALID_URL (closes #12996) 2017-05-06 02:46:37 +07:00
Sergey M․
9cafc3fd8b [youtube] Extract chapters 2017-05-06 02:27:06 +07:00
Sergey M․
329e3dd5ad [nrk] Extract chapters 2017-05-05 22:59:15 +07:00
Remita Amine
1d9e0a4f40 [vice] update tests and add support for ooyala embeds in article pages 2017-05-05 16:13:12 +01:00
Sergey M․
7ad53cb7ff [laola1tv] PEP 8 2017-05-05 21:59:23 +07:00
Yen Chi Hsuan
b2ad479d17 [utils] Fix multipart_encode for Python < 3.5 2017-05-05 20:51:59 +08:00
Yen Chi Hsuan
4ac6dc3732 [vice] Support Vice articles (closes #12968) 2017-05-05 20:26:51 +08:00
Yen Chi Hsuan
cc7bda4fff [vice] Fix extraction for non en_us videos (closes #12967) 2017-05-05 20:01:02 +08:00
Yen Chi Hsuan
50ad078b7b [gdcvault] Fix extraction for videos with gdc-player.html
Closes #12733
2017-05-05 15:13:40 +08:00
Sergey M․
4947f13cd0 [pbs] Improve multipart video support (closes #12981) 2017-05-04 22:42:49 +07:00
Sergey M․
7f09e523e8 [laola1tv:embed] Fix tests 2017-05-04 22:41:47 +07:00
Remita Amine
4fe14732a2 [laola1tv] fix extraction(closes #12880) 2017-05-04 16:07:08 +01:00
Remita Amine
ff6f9a6704 [extractor/common] fix typo in _extract_akamai_formats 2017-05-04 16:07:08 +01:00
Yen Chi Hsuan
0c26548601 [cda] Implement birthday verification (closes #12789) 2017-05-04 16:26:17 +08:00
Yen Chi Hsuan
5401bea27f [leeco] Fix extraction (closes #12974)
It seems a similar API is used on mobile devices, but I always get an ad
when mimicking that API.
2017-05-04 03:18:56 +08:00
remitamine
7a6d33a9a5 [pbs] extract chapters information 2017-05-02 20:41:48 +01:00
remitamine
fa2a36d9bc [ffmpeg] add support for chapters field postprocessing 2017-05-02 20:41:48 +01:00
remitamine
55949fede6 [common] introduce chapters field 2017-05-02 20:41:48 +01:00
Remita Amine
7fc875195f [amp] improve thumbnail and subtitle extraction 2017-05-02 00:06:58 +01:00
Tithen-Firion
c6fe5a7e12 [douyutv] Update test 2017-05-02 03:45:27 +08:00
Tithen-Firion
ae21d2fd94 [dotsub] Update test 2017-05-02 02:56:44 +08:00
Tithen-Firion
77481f1386 [democracynow] Update test 2017-05-02 01:38:31 +07:00
Tithen-Firion
d86d169dd5 [dailymotion] Add working test 2017-05-02 01:37:23 +07:00
Tithen-Firion
b9f9f361fa [crunchyroll] Update test 2017-05-02 00:56:51 +07:00
Remita Amine
ab39a25c75 [foxsports] fix extraction(closes #12945) 2017-05-01 09:02:41 +01:00
Tithen-Firion
a146fa1c68 [coub] Update test and remove comment count extraction 2017-05-01 05:54:44 +07:00
Sergey M․
e0c1e9a98c release 2017.05.01 2017-05-01 01:39:52 +07:00
Sergey M․
086041e2f8 [ChangeLog] Actualize 2017-05-01 01:34:51 +07:00
Sergey M․
74da856544 [infoq] Make audio format extraction non fatal (closes #12938) 2017-05-01 01:23:05 +07:00
Sergey M․
9edf47df7b [brightcove] Allow whitespace around attribute names in embedded code 2017-05-01 01:03:47 +07:00
Sergey M․
238cec17ae [extractor/anvato] PEP 8 2017-04-30 22:04:21 +07:00
Sergey M․
50534b7158 [downloader/fragment] PEP 8 2017-04-30 22:04:01 +07:00
Sergey M․
9cd4209724 [zaq1] Improve extraction (closes #12693) 2017-04-30 21:46:05 +07:00
Sergey M․
33a81c2c6f [extractor/common] Extract view count from JSON-LD 2017-04-30 21:45:59 +07:00
Sergey M․
deef31955b [utils] Improve unified_timestamp
Seen at http://zaq1.pl/video/xev0e
2017-04-30 21:45:53 +07:00
slocum
9dac2cec2d [zaq1] Add new extractor 2017-04-30 21:45:47 +07:00
Sergey M․
6ec371cd9e [xvideos] Extract og:duration (closes #12828) 2017-04-30 18:14:01 +07:00
Sander
13081db1f5 [xvideos] Add video duration 2017-04-30 18:10:49 +07:00
Sergey M․
b07ea5eaec [vevo] Modernize 2017-04-30 17:58:22 +07:00
gritstub
5599253009 [vevo] Fix extraction (config.token.key) 2017-04-30 17:56:10 +07:00
Remita Amine
98ce1a3fd3 [utils] add video/mp2t to mimetype2ext 2017-04-30 09:03:10 +01:00
Yen Chi Hsuan
ba5c3caf88 [washingtonpost] Fix invalid escape sequence on Python 3.6 2017-04-30 02:15:28 +08:00
Sergey M․
b5c39537be [noovo] Improve extraction (closes #12792) 2017-04-30 00:24:25 +07:00
Frederic Bournival
1c7c76e4fb [noovo] Add extractor 2017-04-30 00:24:19 +07:00
John Hawkinson
557194591a [washingtonpost] Add support for embeds (closes #12699) 2017-04-29 23:07:26 +07:00
Yen Chi Hsuan
27e70a8f6c Merge pull request #12869 from Tithen-Firion/cbc-update-tests
[cbc] update test cases
2017-04-29 21:34:18 +08:00
Sergey M․
a4c81e4968 [yandexmusic:playlist] Fix extraction for python 3 (closes #12888) 2017-04-29 20:23:26 +07:00
Sergey M․
7986c3abcd [anvato] Improve extraction (closes #12913)
* Promote to regular shortcut based extractor
* Add mcp to access key mapping table
* Add support for embeds extraction
* Add support for anvato embeds in generic extractor
2017-04-29 19:49:04 +07:00
Yen Chi Hsuan
a1ebfd4494 Merge pull request #12854 from Tithen-Firion/appletrailer-test-fix
[appletrailers] update test cases
2017-04-29 19:24:38 +08:00
Yen Chi Hsuan
d19093bd50 Merge pull request #12906 from Tithen-Firion/clean-html-fix
[utils] Fix inconsistent output of clean_html
2017-04-29 15:58:45 +08:00
Yen Chi Hsuan
24eb7c2578 [xtube] Fix extraction with non-standard JSON 'sources'
Closes #12734

Thanks @paulguy for the fix!
2017-04-29 15:55:08 +08:00
Sergey M․
e7db6759e4 [downloader/external] Properly handle live stream downloading cancellation (closes #8932) 2017-04-29 04:33:35 +07:00
Sergey M․
b364c87c42 [tvplayer] Fix extraction (closes #12908) 2017-04-29 03:46:08 +07:00
Tithen-Firion
9222d94510 [test_utils] Add one more clean_html test 2017-04-28 18:05:14 +02:00
Tithen-Firion
edd9221cd2 [utils] Fix inconsistent output of clean_html
`\s` in Python 2.x doesn't match unicode whitespace characters by
default
2017-04-28 17:34:27 +02:00
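The `\s` behaviour described in the commit above can be reproduced with a minimal sketch. This is not the actual `clean_html` code: the sample string and variable names are assumptions made for illustration. It runs on Python 3, where `re.ASCII` is used to mimic the Python 2 default and `re.UNICODE` shows the Unicode-aware matching.

```python
import re

# Sample text with a no-break space (U+00A0) between the words; the string is
# an illustrative assumption, not taken from the test suite.
text = u'foo\u00a0bar'

# re.ASCII restricts \s to ASCII whitespace, which mirrors how Python 2
# treats unicode patterns when no flag is given.
ascii_only = re.sub(r'\s+', ' ', text, flags=re.ASCII)

# re.UNICODE makes \s match Unicode whitespace as well.
unicode_aware = re.sub(r'\s+', ' ', text, flags=re.UNICODE)

print(repr(ascii_only))
print(repr(unicode_aware))
```

Under these assumptions the first substitution leaves the no-break space untouched while the second collapses it to a plain space, which is the kind of inconsistency between interpreters that the commit message describes.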
Sergey M․
bc8a2ea071 release 2017.04.28 2017-04-28 18:30:03 +07:00
Sergey M․
7527923371 [ChangeLog] Actualize 2017-04-28 18:27:29 +07:00
Remita Amine
20783b8b50 [aenetworks] fix extraction for shows with single season 2017-04-28 12:04:56 +01:00
Remita Amine
bf2a5555c0 [go] add support for Disney, DisneyJunior and DisneyXD show pages 2017-04-28 09:48:52 +01:00
Remita Amine
fb8e8b2d16 [adobepass] use geo verification headers for all requests 2017-04-28 09:48:52 +01:00
Yen Chi Hsuan
b62985a9a5 [youtube] Recognize another HTML5 player URL (#12885) 2017-04-28 16:25:04 +08:00
Yen Chi Hsuan
e31fed95b4 [youtube] Recognize new locale-based player URLs (fixes #12885) 2017-04-28 15:48:30 +08:00
Tithen-Firion
3fd0f70f6a [cbslocal] Update test 2017-04-28 04:26:59 +07:00
Tithen-Firion
33c62efc32 [collegerama] Update tests 2017-04-28 04:00:49 +07:00
Tithen-Firion
6b4ddd336c [afreecatv] Fix title extraction 2017-04-28 04:00:15 +07:00
Tithen-Firion
c12b4b80f8 [archiveorg] Update test 2017-04-28 03:48:32 +07:00
Tithen-Firion
064fafe932 [appleconnect] Update test 2017-04-28 03:47:25 +07:00
Tithen-Firion
ac1a5b9a12 [audioboom] Update test 2017-04-28 03:36:28 +07:00
Tithen-Firion
a15777491a [atresplayer] Update test 2017-04-28 03:32:25 +07:00
Tithen-Firion
d8571dd6bf [bleacherreport] Update tests 2017-04-28 03:28:26 +07:00
Sergey M․
c0fa4245ce [downloader/fragment] Remove assert for resume_len when no fragments downloaded
This may be incorrect due to some header (e.g. flv header in f4m downloader)
2017-04-28 03:26:19 +07:00
Tithen-Firion
8814ae42bc [beeg] Update test 2017-04-28 03:14:11 +07:00
Tithen-Firion
0f63dc2402 [bandcamp] Update test 2017-04-28 03:13:12 +07:00
Tithen-Firion
dde97ea8da [canalc2] Update test 2017-04-28 03:07:42 +07:00
Sergey M․
30bb6ce1a4 [test_InfoExtractor] Fix test_parse_m3u8_formats 2017-04-28 03:01:43 +07:00
Sergey M․
c89b49f743 [extractor/common] Add manifest_url for explicit group rendition formats 2017-04-28 03:00:14 +07:00
Tithen-Firion
6f4a888416 [br] Update test 2017-04-28 02:53:11 +07:00
Tithen-Firion
f5edd7ae51 [clipfish] Update test 2017-04-28 02:51:30 +07:00
Tithen-Firion
96820c1c6b [cbsinteractive] extract formats with CBSIE 2017-04-27 20:23:52 +02:00
Tithen-Firion
c95e2b5911 [cbc] update test cases 2017-04-27 18:07:07 +02:00
Tithen-Firion
374560f018 [test_download] Fix order when testing file's md5 2017-04-27 22:27:34 +07:00
Sergey M․
ff99fe529e Don't list master m3u8 playlists in format list (closes #12832) 2017-04-27 21:53:17 +07:00
Tithen-Firion
e095109da1 [cbsinteractive] update test cases 2017-04-27 15:40:17 +02:00
Tithen-Firion
d68afc5bc9 [cbsinteractive] fix extractor 2017-04-27 15:27:01 +02:00
Tithen-Firion
76c1951036 [appletrailers] update test cases 2017-04-27 10:04:21 +02:00
Lucas M
e8bfe2a946 [streamable] Add support for new embedded URL schema 2017-04-26 23:39:53 +07:00
Sergey M․
3dc8b61b7f [arte:+7] Relax _VALID_URL (closes #12837) 2017-04-26 01:55:29 +07:00
Sergey M․
a82f41841d release 2017.04.26 2017-04-26 00:06:12 +07:00
Sergey M․
30a4ab191a [ChangeLog] Actualize 2017-04-26 00:03:13 +07:00
Sergey M․
ac9c69ace7 [extractor/common] Improve jwplayer regex 2017-04-25 23:46:05 +07:00
Sergey M․
85f6de25e4 [downloader/fragment] Clarify current_fragment's index and mark as experimental 2017-04-25 23:33:35 +07:00
Sergey M․
538eee7b6a Add missing test m3u8 file 2017-04-25 22:26:30 +07:00
Yen Chi Hsuan
9f54ae2873 Ignore and clean *.ytdl files 2017-04-25 22:42:55 +08:00
Yen Chi Hsuan
01cb57016f [iqiyi] Fix extraction of Yule videos 2017-04-25 22:23:57 +08:00
Sergey M․
290f64dbaa [downloader/fragment] Improve .ytdl format and start documenting 2017-04-24 23:50:20 +07:00
Sergey M․
adb4b03cd5 [downloader/fragment] Don't process ytdl file when it's not needed yet 2017-04-24 23:05:56 +07:00
Sergey M․
0eee52f34b Introduce --keep-fragments 2017-04-24 03:09:08 +07:00
Sergey M․
d3f0687cf7 [downloader/fragment] Use temp file for current fragment 2017-04-24 02:54:17 +07:00
Sergey M․
a4d6cf970c [YoutubeDL] Fix output template for missing timestamp (closes #12796) 2017-04-24 00:50:39 +07:00
Sergey M․
3019cb0c99 [extractor/common] Rephrase comment 2017-04-23 11:52:07 +07:00
Sergey M․
ddd258f922 [test_InfoExtractor] Add m3u8 parsing test for NAME attribute in EXT-X-STREAM-INF tag 2017-04-23 11:49:57 +07:00
Sergey M․
07ad0cf34f [vidio] Improve and sort formats 2017-04-23 11:48:51 +07:00
Sergey M․
9c99bef704 [extractor/common] Use float for scaled tbr 2017-04-23 11:33:49 +07:00
Remita Amine
ffbc8386b9 [brightcove] match only video elements with data-video-id attribute 2017-04-22 22:26:20 +01:00
Remita Amine
4abdba643c [downloader/fragment] remove unused code 2017-04-22 18:19:47 +01:00
Remita Amine
3e0304fe6e [downloader/fragment] use the documented names for fragment progress_hooks fields 2017-04-22 16:42:24 +01:00
Yen Chi Hsuan
fbf56be213 [iqiyi] Fix playlist detection (#12504) 2017-04-22 22:11:37 +08:00
Yen Chi Hsuan
54f54fcca7 [socks] Report errors elegantly when credentials are required but missing
In some non-standard implementations, the server may respond AUTH_USER_PASS
even if it's not listed in available authentication methods. (it should
respond AUTH_NO_ACCEPTABLE per standards)
2017-04-22 21:48:41 +08:00
Yen Chi Hsuan
facfd79f9a [azubu] Remove extractor as the site is gone (closes #12813) 2017-04-22 21:20:25 +08:00
Yen Chi Hsuan
3110bb937d [porn91] Fix extraction (closes #12814) 2017-04-22 21:16:36 +08:00
Sergey M․
cb2520802d [extractor/common] Improve m3u8 extraction (closes #12211)
* Extract m3u8 parsing to separate method
* Improve rendition groups extraction
* Build stream name according to stream GROUP-ID
* Ignore reference to AUDIO group without URI when stream has no CODECS
+ Add test coverage for parsing m3u8 from #11507, #11995, #12211 and twitch vod
2017-04-22 07:01:00 +07:00
Sergey M․
f779958250 [vidzi] Fix extraction (closes #12793) 2017-04-21 23:37:06 +07:00
Remita Amine
8abc7dca39 [amp] extract error message(closes #12795) 2017-04-20 05:16:41 +01:00
Remita Amine
ea0c2f219c [downloader/fragment] use a general file to store fragment download context 2017-04-19 18:53:15 +01:00
Sergey M․
481ef51e23 [brightcove] PEP 8 2017-04-19 21:47:03 +07:00
Remita Amine
5b995f713b [utils] add support for ttml styles 2017-04-19 14:38:40 +01:00
Remita Amine
75a2485407 [fragment,hls,f4m,dash,ism] improve fragment downloading
- resume immediately
- no need to concatenate segments and decrypt them on every resume
- no need to save temp files for segments

and for hls downloader:
- no need to download keys for segments that are already downloaded
2017-04-19 11:46:07 +01:00
Remita Amine
58f6ab72ed [odnoklassniki] update tests 2017-04-19 00:16:55 +01:00
Sergey M․
2dc48df5bc [xfileshare] Add support for gorillavid.com and daclips.com (closes #12776) 2017-04-18 23:58:37 +07:00
Sergey M․
18848d226a [instagram] Fix extraction (closes #12777) 2017-04-18 22:40:26 +07:00
Sergey M․
a32a9a7ef5 [extractor/common] Add support multiple getters in try_get 2017-04-18 22:39:58 +07:00
Sergey M․
bae1404893 [extractor/common] Add support for video of WebPage context in _json_ld (closes #12778) 2017-04-18 22:21:38 +07:00
Yen Chi Hsuan
06d0ad9a4e [brightcove] Support URLs with bcpid instead of playerID
Fixes #12482
2017-04-18 23:04:22 +08:00
Sergey M․
f631b55791 [brightcove] Fix _extract_url (closes #12782) 2017-04-18 21:46:25 +07:00
Remita Amine
bf1b87cd91 [common] Relax JWPlayer regex and remove duplicate urls(#12768) 2017-04-17 08:48:24 +01:00
Remita Amine
1c35b3da44 [odnoklassniki] extract m3u8 formats 2017-04-16 21:27:08 +01:00
256 changed files with 9431 additions and 3908 deletions

View File

@@ -1,16 +1,16 @@
## Please follow the guide below ## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly - You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x]) - Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use *Preview* tab to see how your issue will actually look like - Use the *Preview* tab to see what your issue will actually look like
--- ---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.04.17*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. ### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.08.13*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.04.17** - [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.08.13**
### Before submitting an *issue* make sure you have: ### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones - [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*? ### What is the purpose of your *issue*?
@@ -28,14 +28,14 @@
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows: ### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```): Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
``` ```
$ youtube-dl -v <your command line>
[debug] System config: [] [debug] System config: []
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.04.17 [debug] youtube-dl version 2017.08.13
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}

View File

@@ -1,16 +1,16 @@
## Please follow the guide below ## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly - You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x]) - Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use *Preview* tab to see how your issue will actually look like - Use the *Preview* tab to see what your issue will actually look like
--- ---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. ### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s** - [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have: ### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones - [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*? ### What is the purpose of your *issue*?
@@ -28,9 +28,9 @@
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows: ### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```): Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
``` ```
$ youtube-dl -v <your command line>
[debug] System config: [] [debug] System config: []
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']

2
.gitignore vendored
View File

@@ -35,8 +35,8 @@ updates_key.pem
*.mkv *.mkv
*.swf *.swf
*.part *.part
*.ytdl
*.swp *.swp
test/testdata
test/local_parameters.json test/local_parameters.json
.tox .tox
youtube-dl.zsh youtube-dl.zsh

11
AUTHORS
View File

@@ -212,3 +212,14 @@ Xiao Di Guan
Thomas Winant Thomas Winant
Daniel Twardowski Daniel Twardowski
Jeremie Jarosh Jeremie Jarosh
Gerard Rovira
Marvin Ewald
Frédéric Bournival
Timendum
gritstub
Adam Voss
Mike Fährmann
Jan Kundrát
Giuseppe Fabiano
Örn Guðjónsson
Parmjit Virk

536
ChangeLog
View File

@@ -1,3 +1,539 @@
version 2017.08.13
Core
* [YoutubeDL] Make sure format id is not empty
* [extractor/common] Make _family_friendly_search optional
* [extractor/common] Respect source's type attribute for HTML5 media (#13892)
Extractors
* [pornhub:playlistbase] Skip videos from drop-down menu (#12819, #13902)
+ [fourtube] Add support for pornerbros.com (#6022)
+ [fourtube] Add support for porntube.com (#7859, #13901)
+ [fourtube] Add support for fux.com
* [limelight] Improve embeds detection (#13895)
+ [reddit] Add support for v.redd.it and reddit.com (#13847)
* [aparat] Extract all formats (#13887)
* [mixcloud] Fix play info decryption (#13885)
+ [generic] Add support for vzaar embeds (#13876)
version 2017.08.09
Core
* [utils] Skip missing params in cli_bool_option (#13865)
Extractors
* [xxxymovies] Fix title extraction (#13868)
+ [nick] Add support for nick.com.pl (#13860)
* [mixcloud] Fix play info decryption (#13867)
* [20min] Fix embeds extraction (#13852)
* [dplayit] Fix extraction (#13851)
+ [niconico] Support videos with multiple formats (#13522)
+ [niconico] Support HTML5-only videos (#13806)
version 2017.08.06
Core
* Use relative paths for DASH fragments (#12990)
Extractors
* [pluralsight] Fix format selection
- [mpora] Remove extractor (#13826)
+ [voot] Add support for voot.com (#10255, #11644, #11814, #12350, #13218)
* [vlive:channel] Limit number of videos per page to 100 (#13830)
* [podomatic] Extend URL regular expression (#13827)
* [cinchcast] Extend URL regular expression
* [yandexdisk] Relax URL regular expression (#13824)
* [vidme] Extract DASH and HLS formats
- [teamfour] Remove extractor (#13782)
* [pornhd] Fix extraction (#13783)
* [udemy] Fix subtitles extraction (#13812)
* [mlb] Extend URL regular expression (#13740, #13773)
+ [pbs] Add support for new URL schema (#13801)
* [nrktv] Update API host (#13796)
version 2017.07.30.1
Core
* [downloader/hls] Use redirect URL as manifest base (#13755)
* [options] Correctly hide login info from debug outputs (#13696)
Extractors
+ [watchbox] Add support for watchbox.de (#13739)
- [clipfish] Remove extractor
+ [youjizz] Fix extraction (#13744)
+ [generic] Add support for another ooyala embed pattern (#13727)
+ [ard] Add support for lives (#13771)
* [soundcloud] Update client id
+ [soundcloud:trackstation] Add support for track stations (#13733)
* [svtplay] Use geo verification proxy for API request
* [svtplay] Update API URL (#13767)
+ [yandexdisk] Add support for yadi.sk (#13755)
+ [megaphone] Add support for megaphone.fm
* [amcnetworks] Make rating optional (#12453)
* [cloudy] Fix extraction (#13737)
+ [nickru] Add support for nickelodeon.ru
* [mtv] Improve thumbnail extraction
* [nick] Automate geo-restriction bypass (#13711)
* [niconico] Improve error reporting (#13696)
version 2017.07.23
Core
* [YoutubeDL] Improve default format specification (#13704)
* [YoutubeDL] Do not override id, extractor and extractor_key for
url_transparent entities
* [extractor/common] Fix playlist_from_matches
Extractors
* [itv] Fix production id extraction (#13671, #13703)
* [vidio] Make duration non fatal and fix typo
* [mtv] Skip missing video parts (#13690)
* [sportbox:embed] Fix extraction
+ [npo] Add support for npo3.nl URLs (#13695)
* [dramafever] Remove video id from title (#13699)
+ [egghead:lesson] Add support for lessons (#6635)
* [funnyordie] Extract more metadata (#13677)
* [youku:show] Fix playlist extraction (#13248)
+ [dispeak] Recognize sevt subdomain (#13276)
* [adn] Improve error reporting (#13663)
* [crunchyroll] Relax series and season regex (#13659)
+ [spiegel:article] Add support for nexx iframe embeds (#13029)
+ [nexx:embed] Add support for iframe embeds
* [nexx] Improve JS embed extraction
+ [pearvideo] Add support for pearvideo.com (#13031)
version 2017.07.15
Core
* [YoutubeDL] Don't expand environment variables in meta fields (#13637)
Extractors
* [spiegeltv] Delegate extraction to nexx extractor (#13159)
+ [nexx] Add support for nexx.cloud (#10807, #13465)
* [generic] Fix rutube embeds extraction (#13641)
* [karrierevideos] Fix title extraction (#13641)
* [youtube] Don't capture YouTube Red ad for creator meta field (#13621)
* [slideshare] Fix extraction (#13617)
+ [5tv] Add another video URL pattern (#13354, #13606)
* [drtv] Make HLS and HDS extraction non fatal
* [ted] Fix subtitles extraction (#13628, #13629)
* [vine] Make sure the title won't be empty
+ [twitter] Support HLS streams in vmap URLs
+ [periscope] Support pscp.tv URLs in embedded frames
* [twitter] Extract mp4 urls via mobile API (#12726)
* [niconico] Fix authentication error handling (#12486)
* [giantbomb] Extract m3u8 formats (#13626)
+ [vlive:playlist] Add support for playlists (#13613)
version 2017.07.09
Core
+ [extractor/common] Add support for AMP tags in _parse_html5_media_entries
+ [utils] Support attributes with no values in get_elements_by_attribute
Extractors
+ [dailymail] Add support for embeds
+ [joj] Add support for joj.sk (#13268)
* [abc.net.au:iview] Extract more formats (#13492, #13489)
* [egghead:course] Fix extraction (#6635, #13370)
+ [cjsw] Add support for cjsw.com (#13525)
+ [eagleplatform] Add support for referrer protected videos (#13557)
+ [eagleplatform] Add support for another embed pattern (#13557)
* [veoh] Extend URL regular expression (#13601)
* [npo:live] Fix live stream id extraction (#13568, #13605)
* [googledrive] Fix height extraction (#13603)
+ [dailymotion] Add support for new layout (#13580)
- [yam] Remove extractor
* [xhamster] Extract all formats and fix duration extraction (#13593)
+ [xhamster] Add support for new URL schema (#13593)
* [espn] Extend URL regular expression (#13244, #13549)
* [kaltura] Fix typo in subtitles extraction (#13569)
* [vier] Adapt extraction to redesign (#13575)
version 2017.07.02
Core
* [extractor/common] Improve _json_ld
Extractors
+ [thisoldhouse] Add more fallbacks for video id
* [thisoldhouse] Fix video id extraction (#13540, #13541)
* [xfileshare] Extend format regular expression (#13536)
* [ted] Fix extraction (#13535)
+ [tastytrade] Add support for tastytrade.com (#13521)
* [dplayit] Relax video id regular expression (#13524)
+ [generic] Extract more generic metadata (#13527)
+ [bbccouk] Capture and output error message (#13501, #13518)
* [cbsnews] Relax video info regular expression (#13284, #13503)
+ [facebook] Add support for plugin video embeds and multiple embeds (#13493)
* [soundcloud] Switch to https for API requests (#13502)
* [pandatv] Switch to https for API and download URLs
+ [pandatv] Add support for https URLs (#13491)
+ [niconico] Support sp subdomain (#13494)
version 2017.06.25
Core
+ [adobepass] Add support for DIRECTV NOW (mso ATTOTT) (#13472)
* [YoutubeDL] Skip malformed formats for better extraction robustness
Extractors
+ [wsj] Add support for barrons.com (#13470)
+ [ign] Add another video id pattern (#13328)
+ [raiplay:live] Add support for live streams (#13414)
+ [redbulltv] Add support for live videos and segments (#13486)
+ [onetpl] Add support for videos embedded via pulsembed (#13482)
* [ooyala] Make more robust
* [ooyala] Skip empty format URLs (#13471, #13476)
* [hgtv.com:show] Fix typo
version 2017.06.23
Core
* [adobepass] Fix extraction on older python 2.6
Extractors
* [youtube] Adapt to new automatic captions rendition (#13467)
* [hgtv.com:show] Relax video config regular expression (#13279, #13461)
* [drtuber] Fix formats extraction (#12058)
* [youporn] Fix upload date extraction
* [youporn] Improve formats extraction
* [youporn] Fix title extraction (#13456)
* [googledrive] Fix formats sorting (#13443)
* [watchindianporn] Fix extraction (#13411, #13415)
+ [vimeo] Add fallback mp4 extension for original format
+ [ruv] Add support for ruv.is (#13396)
* [viu] Fix extraction on older python 2.6
* [pandora.tv] Fix upload_date extraction (#12846)
+ [asiancrush] Add support for asiancrush.com (#13420)
version 2017.06.18
Core
* [downloader/common] Use utils.shell_quote for debug command line
* [utils] Use compat_shlex_quote in shell_quote
* [postprocessor/execafterdownload] Encode command line (#13407)
* [compat] Fix compat_shlex_quote on Windows (#5889, #10254)
* [postprocessor/metadatafromtitle] Fix missing optional meta fields processing
in --metadata-from-title (#13408)
* [extractor/common] Fix json dumping with --geo-bypass
+ [extractor/common] Improve jwplayer subtitles extraction
+ [extractor/common] Improve jwplayer formats extraction (#13379)
Extractors
* [polskieradio] Fix extraction (#13392)
+ [xfileshare] Add support for fastvideo.me (#13385)
* [bilibili] Fix extraction of videos with double quotes in titles (#13387)
* [4tube] Fix extraction (#13381, #13382)
+ [disney] Add support for disneychannel.de (#13383)
* [npo] Improve URL regular expression (#13376)
+ [corus] Add support for showcase.ca
+ [corus] Add support for history.ca (#13359)
version 2017.06.12
Core
* [utils] Handle compat_HTMLParseError in extract_attributes (#13349)
+ [compat] Introduce compat_HTMLParseError
* [utils] Improve unified_timestamp
* [extractor/generic] Ensure format id is unicode string
* [extractor/common] Return unicode string from _match_id
+ [YoutubeDL] Sanitize more fields (#13313)
Extractors
+ [xfileshare] Add support for rapidvideo.tv (#13348)
* [xfileshare] Modernize and pass Referer
+ [rutv] Add support for testplayer.vgtrk.com (#13347)
+ [newgrounds] Extract more metadata (#13232)
+ [newgrounds:playlist] Add support for playlists (#10611)
* [newgrounds] Improve formats and uploader extraction (#13346)
* [msn] Fix formats extraction
* [turbo] Ensure format id is string
* [sexu] Ensure height is int
* [jove] Ensure comment count is int
* [golem] Ensure format id is string
* [gfycat] Ensure filesize is int
* [foxgay] Ensure height is int
* [flickr] Ensure format id is string
* [sohu] Fix numeric fields
* [safari] Improve authentication detection (#13319)
* [liveleak] Ensure height is int (#13313)
* [streamango] Make title optional (#13292)
* [rtlnl] Improve URL regular expression (#13295)
* [tvplayer] Fix extraction (#13291)
version 2017.06.05
Core
* [YoutubeDL] Don't emit ANSI escape codes on Windows (#13270)
Extractors
+ [bandcamp:weekly] Add support for bandcamp weekly (#12758)
* [pornhub:playlist] Fix extraction (#13281)
- [godtv] Remove extractor (#13175)
* [safari] Fix typo (#13252)
* [youtube] Improve chapters extraction (#13247)
* [1tv] Lower preference for HTTP formats (#13246)
* [francetv] Relax URL regular expression
* [drbonanza] Fix extraction (#13231)
* [packtpub] Fix authentication (#13240)
version 2017.05.29
Extractors
* [youtube] Fix DASH MPD extraction for videos with non-encrypted format URLs
(#13211)
* [xhamster] Fix uploader and like/dislike count extraction (#13216)
+ [xhamster] Extract categories (#11728)
+ [abcnews] Add support for embed URLs (#12851)
* [gaskrank] Fix extraction (#12493)
* [medialaan] Fix videos with missing videoUrl (#12774)
* [dvtv] Fix playlist support
+ [dvtv] Add support for DASH and HLS formats (#3063)
+ [beam:vod] Add support for beam.pro/mixer.com VODs (#13032)
* [cbsinteractive] Relax URL regular expression (#13213)
* [adn] Fix formats extraction
+ [youku] Extract more metadata (#10433)
* [cbsnews] Fix extraction (#13205)
version 2017.05.26
Core
+ [utils] strip_jsonp() can recognize more patterns
* [postprocessor/ffmpeg] Fix metadata filename handling on Python 2 (#13182)
Extractors
+ [youtube] DASH MPDs with cipher signatures are recognized now (#11381)
+ [bbc] Add support for authentication
* [tudou] Merge into youku extractor (#12214)
* [youku:show] Fix extraction
* [youku] Fix extraction (#13191)
* [udemy] Fix extraction for outputs' format entries without URL (#13192)
* [vimeo] Fix formats' sorting (#13189)
* [cbsnews] Fix extraction for 60 Minutes videos (#12861)
version 2017.05.23
Core
+ [downloader/external] Pass -loglevel to ffmpeg downloader (#13183)
+ [adobepass] Add support for Bright House Networks (#13149)
Extractors
+ [streamcz] Add support for subtitles (#13174)
* [youtube] Fix DASH manifest signature decryption (#8944, #13156)
* [toggle] Relax URL regular expression (#13172)
* [toypics] Fix extraction (#13077)
* [njpwworld] Fix extraction (#13162, #13169)
+ [hitbox] Add support for smashcast.tv (#13154)
* [mitele] Update app key regular expression (#13158)
version 2017.05.18.1
Core
* [jsinterp] Fix typo and cleanup regular expressions (#13134)
version 2017.05.18
Core
+ [jsinterp] Add support for quoted names and indexers (#13123, #13124, #13125,
#13126, #13128, #13129, #13130, #13131, #13132)
+ [extractor/common] Add support for schemeless URLs in _extract_wowza_formats
(#13088, #13092)
+ [utils] Recognize more audio codecs (#13081)
Extractors
+ [vier] Extract more metadata (#12539)
* [vier] Improve extraction (#12801)
+ Add support for authentication
* Bypass authentication when no credentials provided
* Improve extraction robustness
* [dailymail] Fix sources extraction (#13057)
* [dailymotion] Extend URL regular expression (#13079)
version 2017.05.14
Core
+ [extractor/common] Respect Width and Height attributes in ISM manifests
+ [postprocessor/metadatafromtitle] Add support for regular expression syntax in
--metadata-from-title (#13065)
Extractors
+ [mediaset] Add support for video.mediaset.it (#12708, #12964)
* [orf:radio] Fix extraction (#11643, #12926)
* [aljazeera] Extend URL regular expression (#13053)
* [imdb] Relax URL regular expression (#13056)
+ [francetv] Add support for mobile.france.tv (#13068)
+ [upskill] Add support for upskillcourses.com (#13043)
* [thescene] Fix extraction (#13061)
* [condenast] Improve embed support
* [liveleak] Fix extraction (#12053)
+ [douyu] Support Douyu shows (#12228)
* [myspace] Improve URL regular expression (#13040)
* [adultswim] Use desktop platform in assets URL (#13041)
version 2017.05.09
Core
* [YoutubeDL] Force --restrict-filenames when no locale is set on all python
versions (#13027)
Extractors
* [francetv] Adapt to site redesign (#13034)
+ [packtpub] Add support for authentication (#12622)
* [drtv] Lower preference for SignLanguage formats (#13013, #13016)
+ [cspan] Add support for brightcove live embeds (#13028)
* [vrv] Extract DASH formats and subtitles
* [funimation] Fix authentication (#13021)
* [adultswim] Fix extraction (#8640, #10950, #11042, #12121)
+ Add support for Adobe Pass authentication
+ Add support for live streams
+ Add support for show pages
* [turner] Extract thumbnail, is_live and strip description
+ [nonktube] Add support for nonktube.com (#8647, #13024)
+ [nuevo] Pass headers to _extract_nuevo
* [nbc] Improve extraction (#12364)
version 2017.05.07
Common
* [extractor/common] Fix typo in _extract_akamai_formats
+ [postprocessor/ffmpeg] Embed chapters into media file with --add-metadata
+ [extractor/common] Introduce chapters meta field
Extractors
* [youtube] Fix authentication (#12820, #12927, #12973, #12992, #12993, #12995,
#13003)
* [bilibili] Fix video downloading (#13001)
* [rmcdecouverte] Fix extraction (#12937)
* [theplatform] Extract chapters
* [bandcamp] Fix thumbnail extraction (#12980)
* [pornhub] Extend URL regular expression (#12996)
+ [youtube] Extract chapters
+ [nrk] Extract chapters
+ [vice] Add support for ooyala embeds in article pages
+ [vice] Support vice articles (#12968)
* [vice] Fix extraction for non en_us videos (#12967)
* [gdcvault] Fix extraction for some videos (#12733)
* [pbs] Improve multipart video support (#12981)
* [laola1tv] Fix extraction (#12880)
+ [cda] Support birthday verification (#12789)
* [leeco] Fix extraction (#12974)
+ [pbs] Extract chapters
* [amp] Improve thumbnail and subtitles extraction
* [foxsports] Fix extraction (#12945)
- [coub] Remove comment count extraction (#12941)
version 2017.05.01
Core
+ [extractor/common] Extract view count from JSON-LD
* [utils] Improve unified_timestamp
+ [utils] Add video/mp2t to mimetype2ext
* [downloader/external] Properly handle live stream downloading cancellation
(#8932)
+ [utils] Add support for unicode whitespace in clean_html on python 2 (#12906)
Extractors
* [infoq] Make audio format extraction non fatal (#12938)
* [brightcove] Allow whitespace around attribute names in embedded code
+ [zaq1] Add support for zaq1.pl (#12693)
+ [xvideos] Extract duration (#12828)
* [vevo] Fix extraction (#12879)
+ [noovo] Add support for noovo.ca (#12792)
+ [washingtonpost] Add support for embeds (#12699)
* [yandexmusic:playlist] Fix extraction for python 3 (#12888)
* [anvato] Improve extraction (#12913)
* Promote to regular shortcut based extractor
* Add mcp to access key mapping table
* Add support for embeds extraction
* Add support for anvato embeds in generic extractor
* [xtube] Fix extraction for older FLV videos (#12734)
* [tvplayer] Fix extraction (#12908)
version 2017.04.28
Core
+ [adobepass] Use geo verification headers for all requests
- [downloader/fragment] Remove assert for resume_len when no fragments
downloaded
+ [extractor/common] Add manifest_url for explicit group rendition formats
* [extractor/common] Fix manifest_url for m3u8 formats
- [extractor/common] Don't list master m3u8 playlists in format list (#12832)
Extractors
* [aenetworks] Fix extraction for shows with single season
+ [go] Add support for Disney, DisneyJunior and DisneyXD show pages
* [youtube] Recognize new locale-based player URLs (#12885)
+ [streamable] Add support for new embedded URL schema (#12844)
* [arte:+7] Relax URL regular expression (#12837)
version 2017.04.26
Core
* Introduce --keep-fragments for keeping fragments of fragmented download
on disk after download is finished
* [YoutubeDL] Fix output template for missing timestamp (#12796)
* [socks] Handle cases where credentials are required but missing
* [extractor/common] Improve HLS extraction (#12211)
* Extract m3u8 parsing to separate method
* Improve rendition groups extraction
* Build stream name according to stream GROUP-ID
* Ignore reference to AUDIO group without URI when stream has no CODECS
* Use float for scaled tbr in _parse_m3u8_formats
* [utils] Add support for TTML styles in dfxp2srt
* [downloader/hls] No need to download keys for fragments that have been
already downloaded
* [downloader/fragment] Improve fragment downloading
* Resume immediately
* Don't concatenate fragments and decrypt them on every resume
* Optimize disk storage usage, don't store intermediate fragments on disk
* Store bookkeeping download state file
+ [extractor/common] Add support for multiple getters in try_get
+ [extractor/common] Add support for video of WebPage context in _json_ld
(#12778)
+ [extractor/common] Relax JWPlayer regular expression and remove
duplicate URLs (#12768)
Extractors
* [iqiyi] Fix extraction of Yule videos
* [vidio] Improve extraction and sort formats
+ [brightcove] Match only video elements with data-video-id attribute
* [iqiyi] Fix playlist detection (#12504)
- [azubu] Remove extractor (#12813)
* [porn91] Fix extraction (#12814)
* [vidzi] Fix extraction (#12793)
+ [amp] Extract error message (#12795)
+ [xfileshare] Add support for gorillavid.com and daclips.com (#12776)
* [instagram] Fix extraction (#12777)
+ [generic] Support Brightcove videos in <iframe> (#12482)
+ [brightcove] Support URLs with bcpid instead of playerID (#12482)
* [brightcove] Fix _extract_url (#12782)
+ [odnoklassniki] Extract HLS formats
version 2017.04.17 version 2017.04.17
Extractors Extractors

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean: clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete find . -name "*.pyc" -delete
find . -name "*.class" -delete find . -name "*.class" -delete
@@ -101,7 +101,7 @@ youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-
--exclude '*.pyc' \ --exclude '*.pyc' \
--exclude '*.pyo' \ --exclude '*.pyo' \
--exclude '*~' \ --exclude '*~' \
--exclude '__pycache' \ --exclude '__pycache__' \
--exclude '.git' \ --exclude '.git' \
--exclude 'testdata' \ --exclude 'testdata' \
--exclude 'docs/_build' \ --exclude 'docs/_build' \

@@ -145,18 +145,18 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--max-views COUNT Do not download any videos with more than --max-views COUNT Do not download any videos with more than
COUNT views COUNT views
--match-filter FILTER Generic video filter. Specify any key (see --match-filter FILTER Generic video filter. Specify any key (see
help for -o for a list of available keys) the "OUTPUT TEMPLATE" for a list of
to match if the key is present, !key to available keys) to match if the key is
check if the key is not present, key > present, !key to check if the key is not
NUMBER (like "comment_count > 12", also present, key > NUMBER (like "comment_count
works with >=, <, <=, !=, =) to compare > 12", also works with >=, <, <=, !=, =) to
against a number, key = 'LITERAL' (like compare against a number, key = 'LITERAL'
"uploader = 'Mike Smith'", also works with (like "uploader = 'Mike Smith'", also works
!=) to match against a string literal and & with !=) to match against a string literal
to require multiple matches. Values which and & to require multiple matches. Values
are not known are excluded unless you put a which are not known are excluded unless you
question mark (?) after the operator. For put a question mark (?) after the operator.
example, to only match videos that have For example, to only match videos that have
been liked more than 100 times and disliked been liked more than 100 times and disliked
less than 50 times (or the dislike less than 50 times (or the dislike
functionality is not available at the given functionality is not available at the given
@@ -187,6 +187,9 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
and ISM) and ISM)
--abort-on-unavailable-fragment Abort downloading when some fragment is not --abort-on-unavailable-fragment Abort downloading when some fragment is not
available available
--keep-fragments Keep downloaded fragments on disk after
downloading is finished; fragments are
erased by default
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K) --buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
(default is 1024) (default is 1024)
--no-resize-buffer Do not automatically adjust the buffer --no-resize-buffer Do not automatically adjust the buffer
@@ -274,8 +277,8 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--get-filename Simulate, quiet but print output filename --get-filename Simulate, quiet but print output filename
--get-format Simulate, quiet but print output format --get-format Simulate, quiet but print output format
-j, --dump-json Simulate, quiet but print JSON information. -j, --dump-json Simulate, quiet but print JSON information.
See --output for a description of available See the "OUTPUT TEMPLATE" for a description
keys. of available keys.
-J, --dump-single-json Simulate, quiet but print JSON information -J, --dump-single-json Simulate, quiet but print JSON information
for each command-line argument. If the URL for each command-line argument. If the URL
refers to a playlist, dump the whole refers to a playlist, dump the whole
@@ -397,12 +400,14 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--add-metadata Write metadata to the video file --add-metadata Write metadata to the video file
--metadata-from-title FORMAT Parse additional metadata like song title / --metadata-from-title FORMAT Parse additional metadata like song title /
artist from the video title. The format artist from the video title. The format
syntax is the same as --output, the parsed syntax is the same as --output. Regular
parameters replace existing values. expression with named capture groups may
Additional templates: %(album)s, also be used. The parsed parameters replace
%(artist)s. Example: --metadata-from-title existing values. Example: --metadata-from-
"%(artist)s - %(title)s" matches a title title "%(artist)s - %(title)s" matches a
like "Coldplay - Paradise" title like "Coldplay - Paradise". Example
(regex): --metadata-from-title
"(?P<artist>.+?) - (?P<title>.+)"
--xattrs Write metadata to the video file's xattrs --xattrs Write metadata to the video file's xattrs
(using dublin core and xdg standards) (using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the --fixup POLICY Automatically correct known faults of the
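
The regex form of `--metadata-from-title` described in the hunk above can be illustrated with Python's standard `re` module. This is a minimal sketch, not youtube-dl's own code: the pattern and the sample title are copied from the help text, while the variable names are hypothetical and chosen only for illustration.

```python
import re

# Pattern and sample title come from the help text above;
# the variable names are illustrative, not part of youtube-dl.
pattern = r"(?P<artist>.+?) - (?P<title>.+)"
video_title = "Coldplay - Paradise"

# Each named capture group becomes a metadata field.
match = re.match(pattern, video_title)
metadata = match.groupdict() if match else {}
print(metadata)
```

A title that does not match the pattern leaves `metadata` empty in this sketch; how youtube-dl itself reports a non-matching title is not shown here.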
@@ -469,7 +474,10 @@ machine twitch login my_twitch_account_name password my_twitch_password
``` ```
To activate authentication with the `.netrc` file you should pass `--netrc` to youtube-dl or place it in the [configuration file](#configuration). To activate authentication with the `.netrc` file you should pass `--netrc` to youtube-dl or place it in the [configuration file](#configuration).
On Windows you may also need to setup the `%HOME%` environment variable manually. On Windows you may also need to setup the `%HOME%` environment variable manually. For example:
```
set HOME=%USERPROFILE%
```
# OUTPUT TEMPLATE # OUTPUT TEMPLATE
@@ -527,13 +535,14 @@ The basic usage is not to set any template arguments when downloading a single f
- `playlist_id` (string): Playlist identifier - `playlist_id` (string): Playlist identifier
- `playlist_title` (string): Playlist title - `playlist_title` (string): Playlist title
Available for the video that belongs to some logical chapter or section: Available for the video that belongs to some logical chapter or section:
- `chapter` (string): Name or title of the chapter the video belongs to - `chapter` (string): Name or title of the chapter the video belongs to
- `chapter_number` (numeric): Number of the chapter the video belongs to - `chapter_number` (numeric): Number of the chapter the video belongs to
- `chapter_id` (string): Id of the chapter the video belongs to - `chapter_id` (string): Id of the chapter the video belongs to
Available for the video that is an episode of some series or programme: Available for the video that is an episode of some series or programme:
- `series` (string): Title of the series or programme the video episode belongs to - `series` (string): Title of the series or programme the video episode belongs to
- `season` (string): Title of the season the video episode belongs to - `season` (string): Title of the season the video episode belongs to
- `season_number` (numeric): Number of the season the video episode belongs to - `season_number` (numeric): Number of the season the video episode belongs to
@@ -543,6 +552,7 @@ Available for the video that is an episode of some series or programme:
- `episode_id` (string): Id of the video episode - `episode_id` (string): Id of the video episode
Available for the media that is a track or a part of a music album: Available for the media that is a track or a part of a music album:
- `track` (string): Title of the track - `track` (string): Title of the track
- `track_number` (numeric): Number of the track within an album or a disc - `track_number` (numeric): Number of the track within an album or a disc
- `track_id` (string): Id of the track - `track_id` (string): Id of the track
@@ -574,7 +584,7 @@ If you are using an output template inside a Windows batch file then you must es
#### Output template examples #### Output template examples
Note on Windows you may need to use double quotes instead of single. Note that on Windows you may need to use double quotes instead of single.
```bash ```bash
$ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc $ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
@@ -644,7 +654,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
- `acodec`: Name of the audio codec in use - `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use - `vcodec`: Name of the video codec in use
- `container`: Name of the container format - `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `m3u8`, or `m3u8_native`) - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format - `format_id`: A short description of the format
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster. Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
@@ -661,7 +671,7 @@ If you want to preserve the old format selection behavior (prior to youtube-dl 2
#### Format selection examples #### Format selection examples
Note on Windows you may need to use double quotes instead of single. Note that on Windows you may need to use double quotes instead of single.
```bash ```bash
# Download best mp4 format available or any other best if no mp4 available # Download best mp4 format available or any other best if no mp4 available

@@ -8,7 +8,7 @@ import re
ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
README_FILE = os.path.join(ROOT_DIR, 'README.md') README_FILE = os.path.join(ROOT_DIR, 'README.md')
PREFIX = '''%YOUTUBE-DL(1) PREFIX = r'''%YOUTUBE-DL(1)
# NAME # NAME

@@ -42,9 +42,10 @@
- **Allocine** - **Allocine**
- **AlphaPorno** - **AlphaPorno**
- **AMCNetworks** - **AMCNetworks**
- **anderetijden**: npo.nl and ntr.nl - **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeOnDemand** - **AnimeOnDemand**
- **anitube.se** - **anitube.se**
- **Anvato**
- **AnySex** - **AnySex**
- **Aparat** - **Aparat**
- **AppleConnect** - **AppleConnect**
@@ -66,6 +67,8 @@
- **arte.tv:info** - **arte.tv:info**
- **arte.tv:magazine** - **arte.tv:magazine**
- **arte.tv:playlist** - **arte.tv:playlist**
- **AsianCrush**
- **AsianCrushPlaylist**
- **AtresPlayer** - **AtresPlayer**
- **ATTTechChannel** - **ATTTechChannel**
- **ATVAt** - **ATVAt**
@@ -81,20 +84,18 @@
- **AZMedien**: AZ Medien videos - **AZMedien**: AZ Medien videos
- **AZMedienPlaylist**: AZ Medien playlists - **AZMedienPlaylist**: AZ Medien playlists
- **AZMedienShowPlaylist**: AZ Medien show playlists - **AZMedienShowPlaylist**: AZ Medien show playlists
- **Azubu**
- **AzubuLive**
- **BaiduVideo**: 百度视频 - **BaiduVideo**: 百度视频
- **bambuser** - **bambuser**
- **bambuser:channel** - **bambuser:channel**
- **Bandcamp** - **Bandcamp**
- **Bandcamp:album** - **Bandcamp:album**
- **Bandcamp:weekly**
- **bangumi.bilibili.com**: BiliBili番剧 - **bangumi.bilibili.com**: BiliBili番剧
- **bbc**: BBC - **bbc**: BBC
- **bbc.co.uk**: BBC iPlayer - **bbc.co.uk**: BBC iPlayer
- **bbc.co.uk:article**: BBC articles - **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:playlist** - **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:playlist** - **bbc.co.uk:playlist**
- **Beam:live**
- **Beatport** - **Beatport**
- **Beeg** - **Beeg**
- **BehindKink** - **BehindKink**
@@ -153,7 +154,7 @@
- **chirbit** - **chirbit**
- **chirbit:profile** - **chirbit:profile**
- **Cinchcast** - **Cinchcast**
- **Clipfish** - **CJSW**
- **cliphunter** - **cliphunter**
- **ClipRs** - **ClipRs**
- **Clipsyndicate** - **Clipsyndicate**
@@ -217,6 +218,7 @@
- **DiscoveryVR** - **DiscoveryVR**
- **Disney** - **Disney**
- **Dotsub** - **Dotsub**
- **DouyuShow**
- **DouyuTV**: 斗鱼 - **DouyuTV**: 斗鱼
- **DPlay** - **DPlay**
- **DPlayIt** - **DPlayIt**
@@ -235,6 +237,7 @@
- **EbaumsWorld** - **EbaumsWorld**
- **EchoMsk** - **EchoMsk**
- **egghead:course**: egghead.io course - **egghead:course**: egghead.io course
- **egghead:lesson**: egghead.io lesson
- **eHow** - **eHow**
- **Einthusan** - **Einthusan**
- **eitb.tv** - **eitb.tv**
@@ -282,7 +285,8 @@
- **france2.fr:generation-quoi** - **france2.fr:generation-quoi**
- **FranceCulture** - **FranceCulture**
- **FranceInter** - **FranceInter**
- **francetv**: France 2, 3, 4, 5 and Ô - **FranceTV**
- **FranceTVEmbed**
- **francetvinfo.fr** - **francetvinfo.fr**
- **Freesound** - **Freesound**
- **freespeech.org** - **freespeech.org**
@@ -290,6 +294,7 @@
- **Funimation** - **Funimation**
- **FunnyOrDie** - **FunnyOrDie**
- **Fusion** - **Fusion**
- **Fux**
- **FXNetworks** - **FXNetworks**
- **GameInformer** - **GameInformer**
- **GameOne** - **GameOne**
@@ -310,7 +315,6 @@
- **Go** - **Go**
- **Go90** - **Go90**
- **GodTube** - **GodTube**
- **GodTV**
- **Golem** - **Golem**
- **GoogleDrive** - **GoogleDrive**
- **Goshgay** - **Goshgay**
@@ -367,6 +371,7 @@
- **Jamendo** - **Jamendo**
- **JamendoAlbum** - **JamendoAlbum**
- **JeuxVideo** - **JeuxVideo**
- **Joj**
- **Jove** - **Jove**
- **jpopsuki.tv** - **jpopsuki.tv**
- **JWPlatform** - **JWPlatform**
@@ -433,7 +438,9 @@
- **MDR**: MDR.DE and KiKA - **MDR**: MDR.DE and KiKA
- **media.ccc.de** - **media.ccc.de**
- **Medialaan** - **Medialaan**
- **Mediaset**
- **Medici** - **Medici**
- **megaphone.fm**: megaphone.fm embedded players
- **Meipai**: 美拍 - **Meipai**: 美拍
- **MelonVOD** - **MelonVOD**
- **META** - **META**
@@ -451,6 +458,8 @@
- **mixcloud:playlist** - **mixcloud:playlist**
- **mixcloud:stream** - **mixcloud:stream**
- **mixcloud:user** - **mixcloud:user**
- **Mixer:live**
- **Mixer:vod**
- **MLB** - **MLB**
- **Mnet** - **Mnet**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
@@ -464,7 +473,6 @@
- **MovieFap** - **MovieFap**
- **Moviezine** - **Moviezine**
- **MovingImage** - **MovingImage**
- **MPORA**
- **MSN** - **MSN**
- **mtg**: MTG services - **mtg**: MTG services
- **mtv** - **mtv**
@@ -509,10 +517,13 @@
- **netease:song**: 网易云音乐 - **netease:song**: 网易云音乐
- **Netzkino** - **Netzkino**
- **Newgrounds** - **Newgrounds**
- **NewgroundsPlaylist**
- **Newstube** - **Newstube**
- **NextMedia**: 蘋果日報 - **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞 - **NextMediaActionNews**: 蘋果日報 - 動新聞
- **NextTV**: 壹電視 - **NextTV**: 壹電視
- **Nexx**
- **NexxEmbed**
- **nfb**: National Film Board of Canada - **nfb**: National Film Board of Canada
- **nfl.com** - **nfl.com**
- **NhkVod** - **NhkVod**
@@ -522,6 +533,7 @@
- **nhl.com:videocenter:category**: NHL videocenter category - **nhl.com:videocenter:category**: NHL videocenter category
- **nick.com** - **nick.com**
- **nick.de** - **nick.de**
- **nickelodeonru**
- **nicknight** - **nicknight**
- **niconico**: ニコニコ動画 - **niconico**: ニコニコ動画
- **NiconicoPlaylist** - **NiconicoPlaylist**
@@ -531,6 +543,8 @@
- **NJPWWorld**: 新日本プロレスワールド - **NJPWWorld**: 新日本プロレスワールド
- **NobelPrize** - **NobelPrize**
- **Noco** - **Noco**
- **NonkTube**
- **Noovo**
- **Normalboots** - **Normalboots**
- **NosVideo** - **NosVideo**
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz - **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
@@ -541,7 +555,7 @@
- **NowTVList** - **NowTVList**
- **nowvideo**: NowVideo - **nowvideo**: NowVideo
- **Noz** - **Noz**
- **npo**: npo.nl and ntr.nl - **npo**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **npo.nl:live** - **npo.nl:live**
- **npo.nl:radio** - **npo.nl:radio**
- **npo.nl:radio:fragment** - **npo.nl:radio:fragment**
@@ -585,6 +599,7 @@
- **Patreon** - **Patreon**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC) - **pbs**: Public 
Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **pcmag** - **pcmag**
- **PearVideo**
- **People** - **People**
- **periscope**: Periscope - **periscope**: Periscope
- **periscope:user**: Periscope user videos - **periscope:user**: Periscope user videos
@@ -602,12 +617,12 @@
- **pluralsight** - **pluralsight**
- **pluralsight:course** - **pluralsight:course**
- **plus.google**: Google Plus - **plus.google**: Google Plus
- **pluzz.francetv.fr**
- **podomatic** - **podomatic**
- **Pokemon** - **Pokemon**
- **PolskieRadio** - **PolskieRadio**
- **PolskieRadioCategory** - **PolskieRadioCategory**
- **PornCom** - **PornCom**
- **PornerBros**
- **PornFlip** - **PornFlip**
- **PornHd** - **PornHd**
- **PornHub**: PornHub and Thumbzilla - **PornHub**: PornHub and Thumbzilla
@@ -616,6 +631,7 @@
- **Pornotube** - **Pornotube**
- **PornoVoisines** - **PornoVoisines**
- **PornoXO** - **PornoXO**
- **PornTube**
- **PressTV** - **PressTV**
- **PrimeShareTV** - **PrimeShareTV**
- **PromptFile** - **PromptFile**
@@ -637,9 +653,12 @@
- **RadioJavan** - **RadioJavan**
- **Rai** - **Rai**
- **RaiPlay** - **RaiPlay**
- **RaiPlayLive**
- **RBMARadio** - **RBMARadio**
- **RDS**: RDS.ca - **RDS**: RDS.ca
- **RedBullTV** - **RedBullTV**
- **Reddit**
- **RedditR**
- **RedTube** - **RedTube**
- **RegioTV** - **RegioTV**
- **RENTV** - **RENTV**
@@ -681,6 +700,7 @@
- **rutube:person**: Rutube person videos - **rutube:person**: Rutube person videos
- **RUTV**: RUTV.RU - **RUTV**: RUTV.RU
- **Ruutu** - **Ruutu**
- **Ruv**
- **safari**: safaribooksonline.com online video - **safari**: safaribooksonline.com online video
- **safari:api** - **safari:api**
- **safari:course**: safaribooksonline.com online courses - **safari:course**: safaribooksonline.com online courses
@@ -719,6 +739,7 @@
- **soundcloud:playlist** - **soundcloud:playlist**
- **soundcloud:search**: Soundcloud search - **soundcloud:search**: Soundcloud search
- **soundcloud:set** - **soundcloud:set**
- **soundcloud:trackstation**
- **soundcloud:user** - **soundcloud:user**
- **soundgasm** - **soundgasm**
- **soundgasm:profile** - **soundgasm:profile**
@@ -759,13 +780,13 @@
- **Tagesschau** - **Tagesschau**
- **tagesschau:player** - **tagesschau:player**
- **Tass** - **Tass**
- **TBS** - **TastyTrade**
- **TBS** (Currently broken)
- **TDSLifeway** - **TDSLifeway**
- **teachertube**: teachertube.com videos - **teachertube**: teachertube.com videos
- **teachertube:user:collection**: teachertube.com user and collection videos - **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel** - **TeachingChannel**
- **Teamcoco** - **Teamcoco**
- **TeamFourStar**
- **TechTalks** - **TechTalks**
- **techtv.mit.edu** - **techtv.mit.edu**
- **ted** - **ted**
@@ -800,16 +821,13 @@
- **ToonGoggles** - **ToonGoggles**
- **Tosh**: Tosh.0 - **Tosh**: Tosh.0
- **tou.tv** - **tou.tv**
- **Toypics**: Toypics user profile - **Toypics**: Toypics video
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken) - **TrailerAddict** (Currently broken)
- **Trilulilu** - **Trilulilu**
- **TruTV** - **TruTV**
- **Tube8** - **Tube8**
- **TubiTv** - **TubiTv**
- **tudou**
- **tudou:album**
- **tudou:playlist**
- **Tumblr** - **Tumblr**
- **tunein:clip** - **tunein:clip**
- **tunein:program** - **tunein:program**
@@ -860,6 +878,8 @@
- **uol.com.br** - **uol.com.br**
- **uplynk** - **uplynk**
- **uplynk:preplay** - **uplynk:preplay**
- **Upskill**
- **UpskillCourse**
- **Urort**: NRK P3 Urørt - **Urort**: NRK P3 Urørt
- **URPlay** - **URPlay**
- **USANetwork** - **USANetwork**
@@ -879,9 +899,10 @@
- **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet - **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
- **vh1.com** - **vh1.com**
- **Viafree** - **Viafree**
- **Vice** - **vice**
- **vice:article**
- **vice:show**
- **Viceland** - **Viceland**
- **ViceShow**
- **Vidbit** - **Vidbit**
- **Viddler** - **Viddler**
- **Videa** - **Videa**
@@ -930,13 +951,15 @@
- **vk:wallpost** - **vk:wallpost**
- **vlive** - **vlive**
- **vlive:channel** - **vlive:channel**
- **vlive:playlist**
- **Vodlocker** - **Vodlocker**
- **VODPl** - **VODPl**
- **VODPlatform** - **VODPlatform**
- **VoiceRepublic** - **VoiceRepublic**
- **Voot**
- **VoxMedia** - **VoxMedia**
- **Vporn** - **Vporn**
- **vpro**: npo.nl and ntr.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **Vrak** - **Vrak**
- **VRT**: deredactie.be, sporza.be, cobra.be and cobra.canvas.be - **VRT**: deredactie.be, sporza.be, cobra.be and cobra.canvas.be
- **vrv** - **vrv**
@@ -951,6 +974,7 @@
- **washingtonpost** - **washingtonpost**
- **washingtonpost:article** - **washingtonpost:article**
- **wat.tv** - **wat.tv**
- **WatchBox**
- **WatchIndianPorn**: Watch Indian Porn - **WatchIndianPorn**: Watch Indian Porn
- **WDR** - **WDR**
- **wdr:mobile** - **wdr:mobile**
@@ -962,7 +986,7 @@
- **wholecloud**: WholeCloud - **wholecloud**: WholeCloud
- **Wimp** - **Wimp**
- **Wistia** - **Wistia**
- **wnl**: npo.nl and ntr.nl - **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **WorldStarHipHop** - **WorldStarHipHop**
- **wrzuta.pl** - **wrzuta.pl**
- **wrzuta.pl:playlist** - **wrzuta.pl:playlist**
@@ -970,7 +994,7 @@
- **WSJArticle** - **WSJArticle**
- **XBef** - **XBef**
- **XboxClips** - **XboxClips**
- **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo, RapidVideo.TV, FastVideo.me
- **XHamster** - **XHamster**
- **XHamsterEmbed** - **XHamsterEmbed**
- **xiami:album**: 虾米音乐 - 专辑 - **xiami:album**: 虾米音乐 - 专辑
@@ -986,7 +1010,7 @@
- **XVideos** - **XVideos**
- **XXXYMovies** - **XXXYMovies**
- **Yahoo**: Yahoo screen and movies - **Yahoo**: Yahoo screen and movies
- **Yam**: 蕃薯藤yam天空部落 - **YandexDisk**
- **yandexmusic:album**: Яндекс.Музыка - Альбом - **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист - **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
- **yandexmusic:track**: Яндекс.Музыка - Трек - **yandexmusic:track**: Яндекс.Музыка - Трек
@@ -1015,6 +1039,7 @@
- **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword) - **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication) - **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **Zapiks** - **Zapiks**
- **Zaq1**
- **ZDF** - **ZDF**
- **ZDFChannel** - **ZDFChannel**
- **zingmp3**: mp3.zing.vn - **zingmp3**: mp3.zing.vn

@@ -3,12 +3,13 @@
from __future__ import unicode_literals from __future__ import unicode_literals
# Allow direct execution # Allow direct execution
import io
import os import os
import sys import sys
import unittest import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL, expect_dict from test.helper import FakeYDL, expect_dict, expect_value
from youtube_dl.extractor.common import InfoExtractor from youtube_dl.extractor.common import InfoExtractor
from youtube_dl.extractor import YoutubeIE, get_info_extractor from youtube_dl.extractor import YoutubeIE, get_info_extractor
from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError
@@ -175,6 +176,318 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
}] }]
}) })
def test_parse_m3u8_formats(self):
_TEST_CASES = [
(
# https://github.com/rg3/youtube-dl/issues/11507
# http://pluzz.francetv.fr/videos/le_ministere.html
'pluzz_francetv_11507',
'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
[{
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_0_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '180',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.66.30',
'tbr': 180,
'width': 256,
'height': 144,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_1_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '303',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.66.30',
'tbr': 303,
'width': 320,
'height': 180,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_2_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '575',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.66.30',
'tbr': 575,
'width': 512,
'height': 288,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_3_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '831',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.77.30',
'tbr': 831,
'width': 704,
'height': 396,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_4_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'protocol': 'm3u8',
'format_id': '1467',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.77.30',
'tbr': 1467,
'width': 1024,
'height': 576,
}]
),
(
# https://github.com/rg3/youtube-dl/issues/11995
# http://teamcoco.com/video/clueless-gamer-super-bowl-for-honor
'teamcoco_11995',
'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
[{
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-160k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': 'audio-0-Default',
'protocol': 'm3u8',
'vcodec': 'none',
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': 'audio-1-Default',
'protocol': 'm3u8',
'vcodec': 'none',
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '71',
'protocol': 'm3u8',
'acodec': 'mp4a.40.5',
'vcodec': 'none',
'tbr': 71,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-400k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '413',
'protocol': 'm3u8',
'acodec': 'none',
'vcodec': 'avc1.42001e',
'tbr': 413,
'width': 400,
'height': 224,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-400k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '522',
'protocol': 'm3u8',
'acodec': 'none',
'vcodec': 'avc1.42001e',
'tbr': 522,
'width': 400,
'height': 224,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-1m_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '1205',
'protocol': 'm3u8',
'acodec': 'none',
'vcodec': 'avc1.4d001e',
'tbr': 1205,
'width': 640,
'height': 360,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-2m_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '2374',
'protocol': 'm3u8',
'acodec': 'none',
'vcodec': 'avc1.4d001f',
'tbr': 2374,
'width': 1024,
'height': 576,
}]
),
(
# https://github.com/rg3/youtube-dl/issues/12211
# http://video.toggle.sg/en/series/whoopie-s-world/ep3/478601
'toggle_mobile_12211',
'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
[{
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_sa2ntrdg/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': 'audio-English',
'protocol': 'm3u8',
'language': 'eng',
'vcodec': 'none',
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_r7y0nitg/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': 'audio-Undefined',
'protocol': 'm3u8',
'language': 'und',
'vcodec': 'none',
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_qlk9hlzr/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '155',
'protocol': 'm3u8',
'tbr': 155.648,
'width': 320,
'height': 180,
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_oefackmi/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '502',
'protocol': 'm3u8',
'tbr': 502.784,
'width': 480,
'height': 270,
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_vyg9pj7k/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '827',
'protocol': 'm3u8',
'tbr': 827.392,
'width': 640,
'height': 360,
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_50n4psvx/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '1396',
'protocol': 'm3u8',
'tbr': 1396.736,
'width': 854,
'height': 480,
}]
),
(
# http://www.twitch.tv/riotgames/v/6528877
'twitch_vod',
'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
[{
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/audio_only/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Audio Only',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'none',
'tbr': 182.725,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/mobile/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Mobile',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.42C00D',
'tbr': 280.474,
'width': 400,
'height': 226,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/low/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Low',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.42C01E',
'tbr': 628.347,
'width': 640,
'height': 360,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/medium/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Medium',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.42C01E',
'tbr': 893.387,
'width': 852,
'height': 480,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/high/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'High',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.42C01F',
'tbr': 1603.789,
'width': 1280,
'height': 720,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/chunked/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Source',
'protocol': 'm3u8',
'acodec': 'mp4a.40.2',
'vcodec': 'avc1.100.31',
'tbr': 3214.134,
'width': 1280,
'height': 720,
}]
),
(
# http://www.vidio.com/watch/165683-dj_ambred-booyah-live-2015
# EXT-X-STREAM-INF tag with NAME attribute that is not defined
# in HLS specification
'vidio',
'https://www.vidio.com/videos/165683/playlist.m3u8',
[{
'url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b300.mp4.m3u8',
'manifest_url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': '270p 3G',
'protocol': 'm3u8',
'tbr': 300,
'width': 480,
'height': 270,
}, {
'url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b600.mp4.m3u8',
'manifest_url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': '360p SD',
'protocol': 'm3u8',
'tbr': 600,
'width': 640,
'height': 360,
}, {
'url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b1200.mp4.m3u8',
'manifest_url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': '720p HD',
'protocol': 'm3u8',
'tbr': 1200,
'width': 1280,
'height': 720,
}]
)
]
for m3u8_file, m3u8_url, expected_formats in _TEST_CASES:
with io.open('./test/testdata/m3u8/%s.m3u8' % m3u8_file,
mode='r', encoding='utf-8') as f:
formats = self.ie._parse_m3u8_formats(
f.read(), m3u8_url, ext='mp4')
self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
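
Note on the expected formats in test_parse_m3u8_formats: each entry pairs one variant stream of a master playlist with fields such as tbr, width, height, vcodec and acodec. The following is a minimal sketch of how EXT-X-STREAM-INF attributes could map to those fields. It is not youtube-dl's `_parse_m3u8_formats`; the names `parse_attributes` and `parse_master_playlist` are hypothetical, and so is the sample playlist. It assumes tbr is BANDWIDTH divided by 1000 (consistent with fractional values such as 155.648 in the expected data) and it does not resolve variant URLs against the manifest URL or handle EXT-X-MEDIA renditions.

```python
import re

# Hypothetical helper names; this is not youtube-dl's implementation.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^",]+)')


def parse_attributes(attr_text):
    """Parse an HLS attribute list such as 'BANDWIDTH=1,CODECS="a,b"'."""
    return {key: value.strip('"') for key, value in ATTR_RE.findall(attr_text)}


def parse_master_playlist(text):
    """Return one dict per EXT-X-STREAM-INF variant in a master playlist."""
    formats = []
    pending = None  # attributes of the most recent EXT-X-STREAM-INF tag
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('#EXT-X-STREAM-INF:'):
            pending = parse_attributes(line.split(':', 1)[1])
        elif line and not line.startswith('#') and pending is not None:
            # The first URI line after the tag belongs to that variant.
            # Assumption: tbr is BANDWIDTH (bits/s) divided by 1000.
            fmt = {'url': line, 'tbr': int(pending['BANDWIDTH']) / 1000.0}
            if 'RESOLUTION' in pending:
                width, height = pending['RESOLUTION'].split('x')
                fmt['width'], fmt['height'] = int(width), int(height)
            for codec in pending.get('CODECS', '').split(','):
                codec = codec.strip()
                # Assumption: avc1.* is video and mp4a.* is audio.
                if codec.startswith('avc1'):
                    fmt['vcodec'] = codec
                elif codec.startswith('mp4a'):
                    fmt['acodec'] = codec
            formats.append(fmt)
            pending = None
    return formats


# Made-up playlist for illustration only.
SAMPLE = '''#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=155648,RESOLUTION=320x180
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1467000,RESOLUTION=1024x576,CODECS="avc1.77.30, mp4a.40.2"
high/index.m3u8
'''

formats = parse_master_playlist(SAMPLE)
```

When CODECS is absent, as in the first sample variant, the sketch leaves vcodec and acodec unset, which mirrors the toggle_mobile_12211 expectations where those keys are missing for the video variants.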

@@ -41,6 +41,7 @@ def _make_result(formats, **kwargs):
'id': 'testid', 'id': 'testid',
'title': 'testttitle', 'title': 'testttitle',
'extractor': 'testex', 'extractor': 'testex',
'extractor_key': 'TestEx',
} }
res.update(**kwargs) res.update(**kwargs)
return res return res
@@ -370,6 +371,19 @@ class TestFormatSelection(unittest.TestCase):
ydl = YDL({'format': 'best[height>360]'}) ydl = YDL({'format': 'best[height>360]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy()) self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
def test_format_selection_issue_10083(self):
# See https://github.com/rg3/youtube-dl/issues/10083
formats = [
{'format_id': 'regular', 'height': 360, 'url': TEST_URL},
{'format_id': 'video', 'height': 720, 'acodec': 'none', 'url': TEST_URL},
{'format_id': 'audio', 'vcodec': 'none', 'url': TEST_URL},
]
info_dict = _make_result(formats)
ydl = YDL({'format': 'best[height>360]/bestvideo[height>360]+bestaudio'})
ydl.process_ie_result(info_dict.copy())
self.assertEqual(ydl.downloaded_info_dicts[0]['format_id'], 'video+audio')
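
Note on test_format_selection_issue_10083: the spec `best[height>360]/bestvideo[height>360]+bestaudio` is expected to fall through to the second alternative, because the only format with both audio and video has height 360. The following is a rough sketch of that fallback behaviour, not the YoutubeDL selector. The function `select_format_id` and its helpers are hypothetical, only the `[height>N]` filter is understood, and the input list is assumed to be ordered worst to best.

```python
import re

# Hypothetical sketch; the real selector supports far more syntax.
TERM_RE = re.compile(r'^(best|bestvideo|bestaudio)(?:\[height>(\d+)\])?$')


def _has_video(fmt):
    return fmt.get('vcodec') != 'none'


def _has_audio(fmt):
    return fmt.get('acodec') != 'none'


def _pick(kind, formats, min_height=None):
    if kind == 'best':
        pool = [f for f in formats if _has_video(f) and _has_audio(f)]
    elif kind == 'bestvideo':
        pool = [f for f in formats if _has_video(f) and not _has_audio(f)]
    else:  # 'bestaudio'
        pool = [f for f in formats if _has_audio(f) and not _has_video(f)]
    if min_height is not None:
        pool = [f for f in pool
                if f.get('height') is not None and f['height'] > min_height]
    # Assumption: formats are ordered worst to best, so the last match wins.
    return pool[-1] if pool else None


def select_format_id(formats, spec):
    """Try '/'-separated alternatives in order; '+' merges two picks."""
    for alternative in spec.split('/'):
        picked = []
        for term in alternative.split('+'):
            match = TERM_RE.match(term)
            if match is None:
                raise ValueError('unsupported term: %s' % term)
            kind, height = match.group(1), match.group(2)
            picked.append(_pick(kind, formats, int(height) if height else None))
        if all(p is not None for p in picked):
            return '+'.join(p['format_id'] for p in picked)
    return None


formats = [
    {'format_id': 'regular', 'height': 360},
    {'format_id': 'video', 'height': 720, 'acodec': 'none'},
    {'format_id': 'audio', 'vcodec': 'none'},
]
selected = select_format_id(
    formats, 'best[height>360]/bestvideo[height>360]+bestaudio')
```
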
def test_invalid_format_specs(self): def test_invalid_format_specs(self):
def assert_syntax_error(format_spec): def assert_syntax_error(format_spec):
ydl = YDL({'format': format_spec}) ydl = YDL({'format': format_spec})
@@ -448,6 +462,17 @@ class TestFormatSelection(unittest.TestCase):
             pass
         self.assertEqual(ydl.downloaded_info_dicts, [])

+    def test_default_format_spec(self):
+        ydl = YDL({'simulate': True})
+        self.assertEqual(ydl._default_format_spec({}), 'bestvideo+bestaudio/best')
+        ydl = YDL({'outtmpl': '-'})
+        self.assertEqual(ydl._default_format_spec({}), 'best')
+        ydl = YDL({})
+        self.assertEqual(ydl._default_format_spec({}, download=False), 'bestvideo+bestaudio/best')
+        self.assertEqual(ydl._default_format_spec({'is_live': True}), 'best')
+

 class TestYoutubeDL(unittest.TestCase):
     def test_subtitles(self):
@@ -527,6 +552,8 @@ class TestYoutubeDL(unittest.TestCase):
             'ext': 'mp4',
             'width': None,
             'height': 1080,
+            'title1': '$PATH',
+            'title2': '%PATH%',
         }

         def fname(templ):
@@ -545,10 +572,14 @@ class TestYoutubeDL(unittest.TestCase):
         self.assertEqual(fname('%(height)0 6d.%(ext)s'), ' 01080.mp4')
         self.assertEqual(fname('%(height)0 6d.%(ext)s'), ' 01080.mp4')
         self.assertEqual(fname('%(height) 0 6d.%(ext)s'), ' 01080.mp4')
+        self.assertEqual(fname('%%'), '%')
+        self.assertEqual(fname('%%%%'), '%%')
         self.assertEqual(fname('%%(height)06d.%(ext)s'), '%(height)06d.mp4')
         self.assertEqual(fname('%(width)06d.%(ext)s'), 'NA.mp4')
         self.assertEqual(fname('%(width)06d.%%(ext)s'), 'NA.%(ext)s')
         self.assertEqual(fname('%%(width)06d.%(ext)s'), '%(width)06d.mp4')
+        self.assertEqual(fname('Hello %(title1)s'), 'Hello $PATH')
+        self.assertEqual(fname('Hello %(title2)s'), 'Hello %PATH%')

     def test_format_note(self):
         ydl = YoutubeDL()
@@ -755,7 +786,8 @@ class TestYoutubeDL(unittest.TestCase):
                     '_type': 'url_transparent',
                     'url': 'foo2:',
                     'ie_key': 'Foo2',
-                    'title': 'foo1 title'
+                    'title': 'foo1 title',
+                    'id': 'foo1_id',
                 }

         class Foo2IE(InfoExtractor):
@@ -781,6 +813,9 @@ class TestYoutubeDL(unittest.TestCase):
         downloaded = ydl.downloaded_info_dicts[0]
         self.assertEqual(downloaded['url'], TEST_URL)
         self.assertEqual(downloaded['title'], 'foo1 title')
+        self.assertEqual(downloaded['id'], 'testid')
+        self.assertEqual(downloaded['extractor'], 'testex')
+        self.assertEqual(downloaded['extractor_key'], 'TestEx')


 if __name__ == '__main__':


@@ -225,7 +225,7 @@ def generator(test_case, tname):
                                 format_bytes(got_fsize)))
                     if 'md5' in tc:
                         md5_for_file = _file_md5(tc_filename)
-                        self.assertEqual(md5_for_file, tc['md5'])
+                        self.assertEqual(tc['md5'], md5_for_file)
                 # Finally, check test cases' data again but this time against
                 # extracted data from info JSON file written during processing
                 info_json_fn = os.path.splitext(tc_filename)[0] + '.info.json'

test/test_options.py (new file, 26 lines added)

@@ -0,0 +1,26 @@
# coding: utf-8

from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from youtube_dl.options import _hide_login_info


class TestOptions(unittest.TestCase):
    def test_hide_login_info(self):
        self.assertEqual(_hide_login_info(['-u', 'foo', '-p', 'bar']),
                         ['-u', 'PRIVATE', '-p', 'PRIVATE'])
        self.assertEqual(_hide_login_info(['-u']), ['-u'])
        self.assertEqual(_hide_login_info(['-u', 'foo', '-u', 'bar']),
                         ['-u', 'PRIVATE', '-u', 'PRIVATE'])
        self.assertEqual(_hide_login_info(['--username=foo']),
                         ['--username=PRIVATE'])


if __name__ == '__main__':
    unittest.main()
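
Reviewer note: the four assertions in `test_hide_login_info` pin down a small contract: a value following a private flag is masked, a trailing flag with no value is left alone, and the `--flag=value` form is masked in place. The sketch below is one function that satisfies those assertions. It is not the code in `youtube_dl/options.py`; the name `hide_login_info_sketch` and the four-entry option list are assumptions made for illustration.

```python
import re

# Hypothetical list of private options; the real list is not shown in this diff.
PRIVATE_OPTS = ('-u', '--username', '-p', '--password')
_EQ_RE = re.compile(
    r'^(?P<key>%s)=.+$' % '|'.join(re.escape(o) for o in PRIVATE_OPTS))


def hide_login_info_sketch(opts):
    """Return a copy of opts with credential values replaced by 'PRIVATE'."""
    scrubbed = []
    for opt in opts:
        m = _EQ_RE.match(opt)
        # '--username=foo' style: keep the key, mask the value.
        scrubbed.append(m.group('key') + '=PRIVATE' if m else opt)
    for idx, opt in enumerate(scrubbed):
        # '-u foo' style: mask the following argument, if there is one.
        if opt in PRIVATE_OPTS and idx + 1 < len(scrubbed):
            scrubbed[idx + 1] = 'PRIVATE'
    return scrubbed
```

The input list is copied rather than modified, so the caller's argument vector stays usable after a masked copy has been logged.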


@@ -44,6 +44,7 @@ from youtube_dl.utils import (
     limit_length,
     mimetype2ext,
     month_by_name,
+    multipart_encode,
     ohdave_rsa_encrypt,
     OnDemandPagedList,
     orderedSet,
@@ -97,6 +98,7 @@ from youtube_dl.compat import (
     compat_chr,
     compat_etree_fromstring,
     compat_getenv,
+    compat_os_name,
     compat_setenv,
     compat_urlparse,
     compat_parse_qs,
@@ -338,6 +340,8 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(unified_timestamp('UNKNOWN DATE FORMAT'), None)
         self.assertEqual(unified_timestamp('May 16, 2016 11:15 PM'), 1463440500)
         self.assertEqual(unified_timestamp('Feb 7, 2016 at 6:35 pm'), 1454870100)
+        self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
+        self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)

     def test_determine_ext(self):
         self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@@ -445,7 +449,9 @@ class TestUtil(unittest.TestCase):
     def test_shell_quote(self):
         args = ['ffmpeg', '-i', encodeFilename('ñ€ß\'.mp4')]
-        self.assertEqual(shell_quote(args), """ffmpeg -i 'ñ€ß'"'"'.mp4'""")
+        self.assertEqual(
+            shell_quote(args),
+            """ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')

     def test_str_to_int(self):
         self.assertEqual(str_to_int('123,456'), 123456)
@@ -619,6 +625,16 @@ class TestUtil(unittest.TestCase):
             'http://example.com/path', {'test': '第二行тест'})),
             query_dict('http://example.com/path?test=%E7%AC%AC%E4%BA%8C%E8%A1%8C%D1%82%D0%B5%D1%81%D1%82'))

+    def test_multipart_encode(self):
+        self.assertEqual(
+            multipart_encode({b'field': b'value'}, boundary='AAAAAA')[0],
+            b'--AAAAAA\r\nContent-Disposition: form-data; name="field"\r\n\r\nvalue\r\n--AAAAAA--\r\n')
+        self.assertEqual(
+            multipart_encode({'欄位'.encode('utf-8'): '值'.encode('utf-8')}, boundary='AAAAAA')[0],
+            b'--AAAAAA\r\nContent-Disposition: form-data; name="\xe6\xac\x84\xe4\xbd\x8d"\r\n\r\n\xe5\x80\xbc\r\n--AAAAAA--\r\n')
+        self.assertRaises(
+            ValueError, multipart_encode, {b'field': b'value'}, boundary='value')
+
     def test_dict_get(self):
         FALSE_VALUES = {
             'none': None,
@@ -666,6 +682,14 @@ class TestUtil(unittest.TestCase):
         d = json.loads(stripped)
         self.assertEqual(d, {'status': 'success'})

+        stripped = strip_jsonp('window.cb && window.cb({"status": "success"});')
+        d = json.loads(stripped)
+        self.assertEqual(d, {'status': 'success'})
+
+        stripped = strip_jsonp('window.cb && cb({"status": "success"});')
+        d = json.loads(stripped)
+        self.assertEqual(d, {'status': 'success'})
+
     def test_uppercase_escape(self):
         self.assertEqual(uppercase_escape(''), '')
         self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐')
@@ -895,10 +919,13 @@ class TestUtil(unittest.TestCase):
             supports_outside_bmp = False
         if supports_outside_bmp:
             self.assertEqual(extract_attributes('<e x="Smile &#128512;!">'), {'x': 'Smile \U0001f600!'})
+        # Malformed HTML should not break attributes extraction on older Python
+        self.assertEqual(extract_attributes('<mal"formed/>'), {})

     def test_clean_html(self):
         self.assertEqual(clean_html('a:\nb'), 'a: b')
         self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
+        self.assertEqual(clean_html('a<br>\xa0b'), 'a\nb')

     def test_intlist_to_bytes(self):
         self.assertEqual(
@@ -908,7 +935,7 @@ class TestUtil(unittest.TestCase):
     def test_args_to_str(self):
         self.assertEqual(
             args_to_str(['foo', 'ba/r', '-baz', '2 be', '']),
-            'foo ba/r -baz \'2 be\' \'\''
+            'foo ba/r -baz \'2 be\' \'\'' if compat_os_name != 'nt' else 'foo ba/r -baz "2 be" ""'
         )

     def test_parse_filesize(self):
@@ -1069,6 +1096,47 @@ The first line
 '''
         self.assertEqual(dfxp2srt(dfxp_data_no_default_namespace), srt_data)

+        dfxp_data_with_style = '''<?xml version="1.0" encoding="utf-8"?>
+<tt xmlns="http://www.w3.org/2006/10/ttaf1" xmlns:ttp="http://www.w3.org/2006/10/ttaf1#parameter" ttp:timeBase="media" xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xml:lang="en" xmlns:ttm="http://www.w3.org/2006/10/ttaf1#metadata">
+<head>
+<styling>
+<style id="s2" style="s0" tts:color="cyan" tts:fontWeight="bold" />
+<style id="s1" style="s0" tts:color="yellow" tts:fontStyle="italic" />
+<style id="s3" style="s0" tts:color="lime" tts:textDecoration="underline" />
+<style id="s0" tts:backgroundColor="black" tts:fontStyle="normal" tts:fontSize="16" tts:fontFamily="sansSerif" tts:color="white" />
+</styling>
+</head>
+<body tts:textAlign="center" style="s0">
+<div>
+<p begin="00:00:02.08" id="p0" end="00:00:05.84">default style<span tts:color="red">custom style</span></p>
+<p style="s2" begin="00:00:02.08" id="p0" end="00:00:05.84"><span tts:color="lime">part 1<br /></span><span tts:color="cyan">part 2</span></p>
+<p style="s3" begin="00:00:05.84" id="p1" end="00:00:09.56">line 3<br />part 3</p>
+<p style="s1" tts:textDecoration="underline" begin="00:00:09.56" id="p2" end="00:00:12.36"><span style="s2" tts:color="lime">inner<br /> </span>style</p>
+</div>
+</body>
+</tt>'''
+        srt_data = '''1
+00:00:02,080 --> 00:00:05,839
+<font color="white" face="sansSerif" size="16">default style<font color="red">custom style</font></font>
+
+2
+00:00:02,080 --> 00:00:05,839
+<b><font color="cyan" face="sansSerif" size="16"><font color="lime">part 1
+</font>part 2</font></b>
+
+3
+00:00:05,839 --> 00:00:09,560
+<u><font color="lime">line 3
+part 3</font></u>
+
+4
+00:00:09,560 --> 00:00:12,359
+<i><u><font color="yellow"><font color="lime">inner
+ </font>style</font></u></i>
+
+'''
+        self.assertEqual(dfxp2srt(dfxp_data_with_style), srt_data)
+
     def test_cli_option(self):
         self.assertEqual(cli_option({'proxy': '127.0.0.1:3128'}, '--proxy', 'proxy'), ['--proxy', '127.0.0.1:3128'])
         self.assertEqual(cli_option({'proxy': None}, '--proxy', 'proxy'), [])
@@ -1114,6 +1182,10 @@ The first line
             cli_bool_option(
                 {'nocheckcertificate': False}, '--check-certificate', 'nocheckcertificate', 'false', 'true', '='),
             ['--check-certificate=true'])
+        self.assertEqual(
+            cli_bool_option(
+                {}, '--check-certificate', 'nocheckcertificate', 'false', 'true', '='),
+            [])

     def test_ohdave_rsa_encrypt(self):
         N = 0xab86b6371b5318aaa1d3c9e612a9f1264f372323c8c0f19875b5fc3b3fd3afcc1e5bec527aa94bfa85bffc157e4245aebda05389a5357b75115ac94f074aefcd
@@ -1163,6 +1235,12 @@ The first line
         self.assertEqual(get_element_by_attribute('class', 'foo', html), None)
         self.assertEqual(get_element_by_attribute('class', 'no-such-foo', html), None)

+        html = '''
+            <div itemprop="author" itemscope>foo</div>
+        '''
+
+        self.assertEqual(get_element_by_attribute('itemprop', 'author', html), 'foo')
+
     def test_get_elements_by_class(self):
         html = '''
             <span class="foo bar">nice</span><span class="foo bar">also nice</span>
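
Reviewer note: `test_multipart_encode` above fixes the wire format the helper must produce: each field is framed by `--<boundary>`, the body ends with `--<boundary>--`, and a boundary that collides with the payload is rejected with `ValueError`. The sketch below reproduces that framing for a dict of bytes. It is an illustration only: `multipart_encode_sketch` is a hypothetical name, and unlike the tested helper (whose result is indexed with `[0]`) it returns just the body bytes.

```python
def multipart_encode_sketch(data, boundary):
    """Encode a dict of bytes -> bytes as a multipart/form-data body."""
    boundary_bytes = boundary.encode('ascii')
    out = b''
    for name, value in data.items():
        part = (b'Content-Disposition: form-data; name="' + name + b'"\r\n\r\n'
                + value + b'\r\n')
        # A boundary that occurs inside a part would make the body ambiguous.
        if boundary_bytes in part:
            raise ValueError('Boundary overlaps with data')
        out += b'--' + boundary_bytes + b'\r\n' + part
    # Closing delimiter: the boundary followed by two extra dashes.
    out += b'--' + boundary_bytes + b'--\r\n'
    return out
```

Checking for the collision per part keeps the error close to the offending field, which is the case the `boundary='value'` assertion exercises.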


@@ -0,0 +1,275 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from test.helper import expect_value
from youtube_dl.extractor import YoutubeIE


class TestYoutubeChapters(unittest.TestCase):

    _TEST_CASES = [
(
# https://www.youtube.com/watch?v=A22oy8dFjqc
# pattern: 00:00 - <title>
'''This is the absolute ULTIMATE experience of Queen's set at LIVE AID, this is the best video mixed to the absolutely superior stereo radio broadcast. This vastly superior audio mix takes a huge dump on all of the official mixes. Best viewed in 1080p. ENJOY! ***MAKE SURE TO READ THE DESCRIPTION***<br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+36);return false;">00:36</a> - Bohemian Rhapsody<br /><a href="#" onclick="yt.www.watch.player.seekTo(02*60+42);return false;">02:42</a> - Radio Ga Ga<br /><a href="#" onclick="yt.www.watch.player.seekTo(06*60+53);return false;">06:53</a> - Ay Oh!<br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+34);return false;">07:34</a> - Hammer To Fall<br /><a href="#" onclick="yt.www.watch.player.seekTo(12*60+08);return false;">12:08</a> - Crazy Little Thing Called Love<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+03);return false;">16:03</a> - We Will Rock You<br /><a href="#" onclick="yt.www.watch.player.seekTo(17*60+18);return false;">17:18</a> - We Are The Champions<br /><a href="#" onclick="yt.www.watch.player.seekTo(21*60+12);return false;">21:12</a> - Is This The World We Created...?<br /><br />Short song analysis:<br /><br />- "Bohemian Rhapsody": Although it's a short medley version, it's one of the best performances of the ballad section, with Freddie nailing the Bb4s with the correct studio phrasing (for the first time ever!).<br /><br />- "Radio Ga Ga": Although it's missing one chorus, this is one of - if not the best - the best versions ever, Freddie nails all the Bb4s and sounds very clean! Spike Edney's Roland Jupiter 8 also really shines through on this mix, compared to the DVD releases!<br /><br />- "Audience Improv": A great improv, Freddie sounds strong and confident. You gotta love when he sustains that A4 for 4 seconds!<br /><br />- "Hammer To Fall": Despite missing a verse and a chorus, it's a strong version (possibly the best ever). 
Freddie sings the song amazingly, and even ad-libs a C#5 and a C5! Also notice how heavy Brian's guitar sounds compared to the thin DVD mixes - it roars!<br /><br />- "Crazy Little Thing Called Love": A great version, the crowd loves the song, the jam is great as well! Only downside to this is the slight feedback issues.<br /><br />- "We Will Rock You": Although cut down to the 1st verse and chorus, Freddie sounds strong. He nails the A4, and the solo from Dr. May is brilliant!<br /><br />- "We Are the Champions": Perhaps the high-light of the performance - Freddie is very daring on this version, he sustains the pre-chorus Bb4s, nails the 1st C5, belts great A4s, but most importantly: He nails the chorus Bb4s, in all 3 choruses! This is the only time he has ever done so! It has to be said though, the last one sounds a bit rough, but that's a side effect of belting high notes for the past 18 minutes, with nodules AND laryngitis!<br /><br />- "Is This The World We Created... ?": Freddie and Brian perform a beautiful version of this, and it is one of the best versions ever. It's both sad and hilarious that a couple of BBC engineers are talking over the song, one of them being completely oblivious of the fact that he is interrupting the performance, on live television... Which was being televised to almost 2 billion homes.<br /><br /><br />All rights go to their respective owners!<br />-----Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use''',
1477,
[{
'start_time': 36,
'end_time': 162,
'title': 'Bohemian Rhapsody',
}, {
'start_time': 162,
'end_time': 413,
'title': 'Radio Ga Ga',
}, {
'start_time': 413,
'end_time': 454,
'title': 'Ay Oh!',
}, {
'start_time': 454,
'end_time': 728,
'title': 'Hammer To Fall',
}, {
'start_time': 728,
'end_time': 963,
'title': 'Crazy Little Thing Called Love',
}, {
'start_time': 963,
'end_time': 1038,
'title': 'We Will Rock You',
}, {
'start_time': 1038,
'end_time': 1272,
'title': 'We Are The Champions',
}, {
'start_time': 1272,
'end_time': 1477,
'title': 'Is This The World We Created...?',
}]
),
(
# https://www.youtube.com/watch?v=ekYlRhALiRQ
# pattern: <num>. <title> 0:00
'1. Those Beaten Paths of Confusion <a href="#" onclick="yt.www.watch.player.seekTo(0*60+00);return false;">0:00</a><br />2. Beyond the Shadows of Emptiness & Nothingness <a href="#" onclick="yt.www.watch.player.seekTo(11*60+47);return false;">11:47</a><br />3. Poison Yourself...With Thought <a href="#" onclick="yt.www.watch.player.seekTo(26*60+30);return false;">26:30</a><br />4. The Agents of Transformation <a href="#" onclick="yt.www.watch.player.seekTo(35*60+57);return false;">35:57</a><br />5. Drowning in the Pain of Consciousness <a href="#" onclick="yt.www.watch.player.seekTo(44*60+32);return false;">44:32</a><br />6. Deny the Disease of Life <a href="#" onclick="yt.www.watch.player.seekTo(53*60+07);return false;">53:07</a><br /><br />More info/Buy: http://crepusculonegro.storenvy.com/products/257645-cn-03-arizmenda-within-the-vacuum-of-infinity<br /><br />No copyright is intended. The rights to this video are assumed by the owner and its affiliates.',
4009,
[{
'start_time': 0,
'end_time': 707,
'title': '1. Those Beaten Paths of Confusion',
}, {
'start_time': 707,
'end_time': 1590,
'title': '2. Beyond the Shadows of Emptiness & Nothingness',
}, {
'start_time': 1590,
'end_time': 2157,
'title': '3. Poison Yourself...With Thought',
}, {
'start_time': 2157,
'end_time': 2672,
'title': '4. The Agents of Transformation',
}, {
'start_time': 2672,
'end_time': 3187,
'title': '5. Drowning in the Pain of Consciousness',
}, {
'start_time': 3187,
'end_time': 4009,
'title': '6. Deny the Disease of Life',
}]
),
(
# https://www.youtube.com/watch?v=WjL4pSzog9w
# pattern: 00:00 <title>
'<a href="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" class="yt-uix-servicelink " data-target-new-window="True" data-servicelink="CDAQ6TgYACITCNf1raqT2dMCFdRjGAod_o0CBSj4HQ" data-url="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" rel="nofollow noopener" target="_blank">https://arizmenda.bandcamp.com/merch/...</a><br /><br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+00);return false;">00:00</a> Christening Unborn Deformities <br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+08);return false;">07:08</a> Taste of Purity<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+16);return false;">16:16</a> Sculpting Sins of a Universal Tongue<br /><a href="#" onclick="yt.www.watch.player.seekTo(24*60+45);return false;">24:45</a> Birth<br /><a href="#" onclick="yt.www.watch.player.seekTo(31*60+24);return false;">31:24</a> Neves<br /><a href="#" onclick="yt.www.watch.player.seekTo(37*60+55);return false;">37:55</a> Libations in Limbo',
2705,
[{
'start_time': 0,
'end_time': 428,
'title': 'Christening Unborn Deformities',
}, {
'start_time': 428,
'end_time': 976,
'title': 'Taste of Purity',
}, {
'start_time': 976,
'end_time': 1485,
'title': 'Sculpting Sins of a Universal Tongue',
}, {
'start_time': 1485,
'end_time': 1884,
'title': 'Birth',
}, {
'start_time': 1884,
'end_time': 2275,
'title': 'Neves',
}, {
'start_time': 2275,
'end_time': 2705,
'title': 'Libations in Limbo',
}]
),
(
# https://www.youtube.com/watch?v=o3r1sn-t3is
# pattern: <title> 00:00 <note>
'Download this show in MP3: <a href="http://sh.st/njZKK" class="yt-uix-servicelink " data-url="http://sh.st/njZKK" data-target-new-window="True" data-servicelink="CDAQ6TgYACITCK3j8_6o2dMCFVDCGAoduVAKKij4HQ" rel="nofollow noopener" target="_blank">http://sh.st/njZKK</a><br /><br />Setlist:<br />I-E-A-I-A-I-O <a href="#" onclick="yt.www.watch.player.seekTo(00*60+45);return false;">00:45</a><br />Suite-Pee <a href="#" onclick="yt.www.watch.player.seekTo(4*60+26);return false;">4:26</a> (Incomplete)<br />Attack <a href="#" onclick="yt.www.watch.player.seekTo(5*60+31);return false;">5:31</a> (First live performance since 2011)<br />Prison Song <a href="#" onclick="yt.www.watch.player.seekTo(8*60+42);return false;">8:42</a><br />Know <a href="#" onclick="yt.www.watch.player.seekTo(12*60+32);return false;">12:32</a> (First live performance since 2011)<br />Aerials <a href="#" onclick="yt.www.watch.player.seekTo(15*60+32);return false;">15:32</a><br />Soldier Side - Intro <a href="#" onclick="yt.www.watch.player.seekTo(19*60+13);return false;">19:13</a><br />B.Y.O.B. 
<a href="#" onclick="yt.www.watch.player.seekTo(20*60+09);return false;">20:09</a><br />Soil <a href="#" onclick="yt.www.watch.player.seekTo(24*60+32);return false;">24:32</a><br />Darts <a href="#" onclick="yt.www.watch.player.seekTo(27*60+48);return false;">27:48</a><br />Radio/Video <a href="#" onclick="yt.www.watch.player.seekTo(30*60+38);return false;">30:38</a><br />Hypnotize <a href="#" onclick="yt.www.watch.player.seekTo(35*60+05);return false;">35:05</a><br />Temper <a href="#" onclick="yt.www.watch.player.seekTo(38*60+08);return false;">38:08</a> (First live performance since 1999)<br />CUBErt <a href="#" onclick="yt.www.watch.player.seekTo(41*60+00);return false;">41:00</a><br />Needles <a href="#" onclick="yt.www.watch.player.seekTo(42*60+57);return false;">42:57</a><br />Deer Dance <a href="#" onclick="yt.www.watch.player.seekTo(46*60+27);return false;">46:27</a><br />Bounce <a href="#" onclick="yt.www.watch.player.seekTo(49*60+38);return false;">49:38</a><br />Suggestions <a href="#" onclick="yt.www.watch.player.seekTo(51*60+25);return false;">51:25</a><br />Psycho <a href="#" onclick="yt.www.watch.player.seekTo(53*60+52);return false;">53:52</a><br />Chop Suey! <a href="#" onclick="yt.www.watch.player.seekTo(58*60+13);return false;">58:13</a><br />Lonely Day <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+01*60+15);return false;">1:01:15</a><br />Question! 
<a href="#" onclick="yt.www.watch.player.seekTo(1*3600+04*60+14);return false;">1:04:14</a><br />Lost in Hollywood <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+08*60+10);return false;">1:08:10</a><br />Vicinity of Obscenity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+13*60+40);return false;">1:13:40</a>(First live performance since 2012)<br />Forest <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+16*60+17);return false;">1:16:17</a><br />Cigaro <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+20*60+02);return false;">1:20:02</a><br />Toxicity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+23*60+57);return false;">1:23:57</a>(with Chino Moreno)<br />Sugar <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+27*60+53);return false;">1:27:53</a>',
5640,
[{
'start_time': 45,
'end_time': 266,
'title': 'I-E-A-I-A-I-O',
}, {
'start_time': 266,
'end_time': 331,
'title': 'Suite-Pee (Incomplete)',
}, {
'start_time': 331,
'end_time': 522,
'title': 'Attack (First live performance since 2011)',
}, {
'start_time': 522,
'end_time': 752,
'title': 'Prison Song',
}, {
'start_time': 752,
'end_time': 932,
'title': 'Know (First live performance since 2011)',
}, {
'start_time': 932,
'end_time': 1153,
'title': 'Aerials',
}, {
'start_time': 1153,
'end_time': 1209,
'title': 'Soldier Side - Intro',
}, {
'start_time': 1209,
'end_time': 1472,
'title': 'B.Y.O.B.',
}, {
'start_time': 1472,
'end_time': 1668,
'title': 'Soil',
}, {
'start_time': 1668,
'end_time': 1838,
'title': 'Darts',
}, {
'start_time': 1838,
'end_time': 2105,
'title': 'Radio/Video',
}, {
'start_time': 2105,
'end_time': 2288,
'title': 'Hypnotize',
}, {
'start_time': 2288,
'end_time': 2460,
'title': 'Temper (First live performance since 1999)',
}, {
'start_time': 2460,
'end_time': 2577,
'title': 'CUBErt',
}, {
'start_time': 2577,
'end_time': 2787,
'title': 'Needles',
}, {
'start_time': 2787,
'end_time': 2978,
'title': 'Deer Dance',
}, {
'start_time': 2978,
'end_time': 3085,
'title': 'Bounce',
}, {
'start_time': 3085,
'end_time': 3232,
'title': 'Suggestions',
}, {
'start_time': 3232,
'end_time': 3493,
'title': 'Psycho',
}, {
'start_time': 3493,
'end_time': 3675,
'title': 'Chop Suey!',
}, {
'start_time': 3675,
'end_time': 3854,
'title': 'Lonely Day',
}, {
'start_time': 3854,
'end_time': 4090,
'title': 'Question!',
}, {
'start_time': 4090,
'end_time': 4420,
'title': 'Lost in Hollywood',
}, {
'start_time': 4420,
'end_time': 4577,
'title': 'Vicinity of Obscenity (First live performance since 2012)',
}, {
'start_time': 4577,
'end_time': 4802,
'title': 'Forest',
}, {
'start_time': 4802,
'end_time': 5037,
'title': 'Cigaro',
}, {
'start_time': 5037,
'end_time': 5273,
'title': 'Toxicity (with Chino Moreno)',
}, {
'start_time': 5273,
'end_time': 5640,
'title': 'Sugar',
}]
),
(
# https://www.youtube.com/watch?v=PkYLQbsqCE8
# pattern: <num> - <title> [<latinized title>] 0:00:00
'''Затемно (Zatemno) is an Obscure Black Metal Band from Russia.<br /><br />"Во прах (Vo prakh)'' Into The Ashes", Debut mini-album released may 6, 2016, by Death Knell Productions<br />Released on 6 panel digipak CD, limited to 100 copies only<br />And digital format on Bandcamp<br /><br />Tracklist<br /><br />1 - Во прах [Vo prakh] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+00*60+00);return false;">0:00:00</a><br />2 - Искупление [Iskupleniye] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+08*60+10);return false;">0:08:10</a><br />3 - Из серпов луны...[Iz serpov luny] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+14*60+30);return false;">0:14:30</a><br /><br />Links:<br /><a href="https://deathknellprod.bandcamp.com/album/--2" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://deathknellprod.bandcamp.com/album/--2" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://deathknellprod.bandcamp.com/a...</a><br /><a href="https://www.facebook.com/DeathKnellProd/" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://www.facebook.com/DeathKnellProd/" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://www.facebook.com/DeathKnellProd/</a><br /><br /><br />I don't have any right about this artifact, my only intention is to spread the music of the band, all rights are reserved to the Затемно (Zatemno) and his producers, Death Knell Productions.<br /><br />------------------------------------------------------------------<br /><br />Subscribe for more videos like this.<br />My link: <a href="https://web.facebook.com/AttackOfTheDragons" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://web.facebook.com/AttackOfTheDragons" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow 
noopener">https://web.facebook.com/AttackOfTheD...</a>''',
1138,
[{
'start_time': 0,
'end_time': 490,
'title': '1 - Во прах [Vo prakh]',
}, {
'start_time': 490,
'end_time': 870,
'title': '2 - Искупление [Iskupleniye]',
}, {
'start_time': 870,
'end_time': 1138,
'title': '3 - Из серпов луны...[Iz serpov luny]',
}]
),
(
# https://www.youtube.com/watch?v=xZW70zEasOk
# time point more than duration
'''● LCS Spring finals: Saturday and Sunday from <a href="#" onclick="yt.www.watch.player.seekTo(13*60+30);return false;">13:30</a> outside the venue! <br />● PAX East: Fri, Sat & Sun - more info in tomorrows video on the main channel!''',
283,
[]
),
]

    def test_youtube_chapters(self):
        for description, duration, expected_chapters in self._TEST_CASES:
            ie = YoutubeIE()
            expect_value(
                self, ie._extract_chapters(description, duration),
                expected_chapters, None)


if __name__ == '__main__':
    unittest.main()
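
Reviewer note: the expected values in `_TEST_CASES` follow one rule throughout: each chapter ends where the next one starts, the last chapter ends at the video duration, and a time point beyond the duration yields no chapter (the final case expects `[]`). The sketch below encodes only that rule, taking already-parsed `(start_seconds, title)` pairs. It is an assumption-laden illustration: `chapters_from_time_points` is a hypothetical helper, not `YoutubeIE._extract_chapters`, and it does no HTML parsing.

```python
def chapters_from_time_points(time_points, duration):
    """Build chapter dicts from (start_seconds, title) pairs in description order."""
    chapters = []
    for idx, (start, title) in enumerate(time_points):
        if start > duration:
            break  # a time point past the end is not a chapter marker
        is_last = idx == len(time_points) - 1
        # A chapter runs until the next start, clamped to the video duration.
        end = duration if is_last else min(time_points[idx + 1][0], duration)
        if start > end:
            break  # time points out of order: stop rather than guess
        chapters.append({'start_time': start, 'end_time': end, 'title': title})
    return chapters
```

For the third test case, the first two time points (0 and 428 seconds) with a duration of 2705 produce chapters ending at 428 and 2705 respectively.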


@@ -0,0 +1,14 @@
#EXTM3U
#EXT-X-VERSION:5
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Francais",DEFAULT=NO,FORCED=NO,URI="http://replayftv-pmd.francetv.fr/subtitles/2017/16/156589847-1492488987.m3u8",LANGUAGE="fra"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac",LANGUAGE="fra",NAME="Francais",DEFAULT=YES, AUTOSELECT=YES
#EXT-X-STREAM-INF:SUBTITLES="subs",AUDIO="aac",PROGRAM-ID=1,BANDWIDTH=180000,RESOLUTION=256x144,CODECS="avc1.66.30, mp4a.40.2"
http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_0_av.m3u8?null=0
#EXT-X-STREAM-INF:SUBTITLES="subs",AUDIO="aac",PROGRAM-ID=1,BANDWIDTH=303000,RESOLUTION=320x180,CODECS="avc1.66.30, mp4a.40.2"
http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_1_av.m3u8?null=0
#EXT-X-STREAM-INF:SUBTITLES="subs",AUDIO="aac",PROGRAM-ID=1,BANDWIDTH=575000,RESOLUTION=512x288,CODECS="avc1.66.30, mp4a.40.2"
http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_2_av.m3u8?null=0
#EXT-X-STREAM-INF:SUBTITLES="subs",AUDIO="aac",PROGRAM-ID=1,BANDWIDTH=831000,RESOLUTION=704x396,CODECS="avc1.77.30, mp4a.40.2"

test/testdata/m3u8/teamcoco_11995.m3u8 (new file, 16 lines added)

@@ -0,0 +1,16 @@
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio-0",NAME="Default",AUTOSELECT=YES,DEFAULT=YES,URI="hls/CONAN_020217_Highlight_show-audio-160k_v4.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio-1",NAME="Default",AUTOSELECT=YES,DEFAULT=YES,URI="hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8"
#EXT-X-I-FRAME-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=37862000,CODECS="avc1.4d001f",URI="hls/CONAN_020217_Highlight_show-2m_iframe.m3u8"
#EXT-X-I-FRAME-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=18750000,CODECS="avc1.4d001e",URI="hls/CONAN_020217_Highlight_show-1m_iframe.m3u8"
#EXT-X-I-FRAME-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=6535000,CODECS="avc1.42001e",URI="hls/CONAN_020217_Highlight_show-400k_iframe.m3u8"
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2374000,RESOLUTION=1024x576,CODECS="avc1.4d001f,mp4a.40.2",AUDIO="audio-0"
hls/CONAN_020217_Highlight_show-2m_v4.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1205000,RESOLUTION=640x360,CODECS="avc1.4d001e,mp4a.40.2",AUDIO="audio-0"
hls/CONAN_020217_Highlight_show-1m_v4.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=522000,RESOLUTION=400x224,CODECS="avc1.42001e,mp4a.40.2",AUDIO="audio-0"
hls/CONAN_020217_Highlight_show-400k_v4.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=413000,RESOLUTION=400x224,CODECS="avc1.42001e,mp4a.40.5",AUDIO="audio-1"
hls/CONAN_020217_Highlight_show-400k_v4.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=71000,CODECS="mp4a.40.5",AUDIO="audio-1"
hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8


@@ -0,0 +1,13 @@
#EXTM3U
#EXT-X-VERSION:4
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="eng",NAME="English",URI="http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_sa2ntrdg/name/a.mp4/index.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="und",NAME="Undefined",URI="http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_r7y0nitg/name/a.mp4/index.m3u8"
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=155648,RESOLUTION=320x180,AUDIO="audio"
http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_qlk9hlzr/name/a.mp4/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=502784,RESOLUTION=480x270,AUDIO="audio"
http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_oefackmi/name/a.mp4/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=827392,RESOLUTION=640x360,AUDIO="audio"
http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_vyg9pj7k/name/a.mp4/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1396736,RESOLUTION=854x480,AUDIO="audio"
http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_50n4psvx/name/a.mp4/index.m3u8

test/testdata/m3u8/twitch_vod.m3u8

@@ -0,0 +1,20 @@
#EXTM3U
#EXT-X-TWITCH-INFO:ORIGIN="s3",CLUSTER="edgecast_vod",REGION="EU",MANIFEST-CLUSTER="edgecast_vod",USER-IP="109.171.17.81"
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="chunked",NAME="Source",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=3214134,CODECS="avc1.100.31,mp4a.40.2",RESOLUTION="1280x720",VIDEO="chunked"
https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/chunked/index-muted-HM49I092CC.m3u8
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="high",NAME="High",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1603789,CODECS="avc1.42C01F,mp4a.40.2",RESOLUTION="1280x720",VIDEO="high"
https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/high/index-muted-HM49I092CC.m3u8
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="medium",NAME="Medium",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=893387,CODECS="avc1.42C01E,mp4a.40.2",RESOLUTION="852x480",VIDEO="medium"
https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/medium/index-muted-HM49I092CC.m3u8
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="low",NAME="Low",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=628347,CODECS="avc1.42C01E,mp4a.40.2",RESOLUTION="640x360",VIDEO="low"
https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/low/index-muted-HM49I092CC.m3u8
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="mobile",NAME="Mobile",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=280474,CODECS="avc1.42C00D,mp4a.40.2",RESOLUTION="400x226",VIDEO="mobile"
https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/mobile/index-muted-HM49I092CC.m3u8
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="audio_only",NAME="Audio Only",AUTOSELECT=NO,DEFAULT=NO
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=182725,CODECS="mp4a.40.2",VIDEO="audio_only"
https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/audio_only/index-muted-HM49I092CC.m3u8

test/testdata/m3u8/vidio.m3u8

@@ -0,0 +1,10 @@
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=300000,RESOLUTION=480x270,NAME="270p 3G"
https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b300.mp4.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=600000,RESOLUTION=640x360,NAME="360p SD"
https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b600.mp4.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1200000,RESOLUTION=1280x720,NAME="720p HD"
https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b1200.mp4.m3u8
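
The fixtures above are HLS master playlists: each `#EXT-X-STREAM-INF` line carries a comma-separated attribute list, and the next non-comment line holds the variant URI. As a rough illustration of how such a fixture could be read in a test, here is a minimal sketch. `parse_attribute_list` and `parse_master_playlist` are hypothetical helper names, not youtube-dl functions, and the regex assumes only plain or double-quoted attribute values.

```python
import re

# Matches KEY=value pairs; quoted values may contain commas (e.g. CODECS).
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attribute_list(attrs):
    """Turn 'A=1,B="x, y"' into {'A': '1', 'B': 'x, y'} (hypothetical helper)."""
    return dict((key, value.strip('"')) for key, value in ATTR_RE.findall(attrs))


def parse_master_playlist(text):
    """Collect one dict per variant stream, with the URI stored under 'URI'."""
    variants = []
    pending = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('#EXT-X-STREAM-INF:'):
            pending = parse_attribute_list(line.split(':', 1)[1])
        elif line and not line.startswith('#') and pending is not None:
            # The first non-comment line after the tag is the variant URI.
            pending['URI'] = line
            variants.append(pending)
            pending = None
    return variants
```

All values are kept as strings, so a caller that needs numeric bandwidths would still have to convert `BANDWIDTH` itself.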


@@ -26,6 +26,8 @@ import tokenize
import traceback import traceback
import random import random
from string import ascii_letters
from .compat import ( from .compat import (
compat_basestring, compat_basestring,
compat_cookiejar, compat_cookiejar,
@@ -58,6 +60,7 @@ from .utils import (
format_bytes, format_bytes,
formatSeconds, formatSeconds,
GeoRestrictedError, GeoRestrictedError,
int_or_none,
ISO3166Utils, ISO3166Utils,
locked_file, locked_file,
make_HTTPS_handler, make_HTTPS_handler,
@@ -302,6 +305,17 @@ class YoutubeDL(object):
postprocessor. postprocessor.
""" """
_NUMERIC_FIELDS = set((
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
'timestamp', 'upload_year', 'upload_month', 'upload_day',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time',
'chapter_number', 'season_number', 'episode_number',
'track_number', 'disc_number', 'release_year',
'playlist_index',
))
params = None params = None
_ies = [] _ies = []
_pps = [] _pps = []
@@ -370,10 +384,10 @@ class YoutubeDL(object):
else: else:
raise raise
if (sys.version_info >= (3,) and sys.platform != 'win32' and if (sys.platform != 'win32' and
sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] and sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] and
not params.get('restrictfilenames', False)): not params.get('restrictfilenames', False)):
# On Python 3, the Unicode filesystem API will throw errors (#1474) # Unicode filesystem API will throw errors (#1474, #13027)
self.report_warning( self.report_warning(
'Assuming --restrict-filenames since file system encoding ' 'Assuming --restrict-filenames since file system encoding '
'cannot encode all characters. ' 'cannot encode all characters. '
@@ -498,24 +512,25 @@ class YoutubeDL(object):
def to_console_title(self, message): def to_console_title(self, message):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if compat_os_name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow(): if compat_os_name == 'nt':
# c_wchar_p() might not be necessary if `message` is if ctypes.windll.kernel32.GetConsoleWindow():
# already of type unicode() # c_wchar_p() might not be necessary if `message` is
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message)) # already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
elif 'TERM' in os.environ: elif 'TERM' in os.environ:
self._write_string('\033]0;%s\007' % message, self._screen_file) self._write_string('\033]0;%s\007' % message, self._screen_file)
def save_console_title(self): def save_console_title(self):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if 'TERM' in os.environ: if compat_os_name != 'nt' and 'TERM' in os.environ:
# Save the title on stack # Save the title on stack
self._write_string('\033[22;0t', self._screen_file) self._write_string('\033[22;0t', self._screen_file)
def restore_console_title(self): def restore_console_title(self):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if 'TERM' in os.environ: if compat_os_name != 'nt' and 'TERM' in os.environ:
# Restore the title from stack # Restore the title from stack
self._write_string('\033[23;0t', self._screen_file) self._write_string('\033[23;0t', self._screen_file)
@@ -638,22 +653,11 @@ class YoutubeDL(object):
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')], r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
outtmpl) outtmpl)
NUMERIC_FIELDS = set((
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
'upload_year', 'upload_month', 'upload_day',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time',
'chapter_number', 'season_number', 'episode_number',
'track_number', 'disc_number', 'release_year',
'playlist_index',
))
# Missing numeric fields used together with integer presentation types # Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since # in format specification will break the argument substitution since
# string 'NA' is returned for missing fields. We will patch output # string 'NA' is returned for missing fields. We will patch output
# template for missing fields to meet string presentation type. # template for missing fields to meet string presentation type.
for numeric_field in NUMERIC_FIELDS: for numeric_field in self._NUMERIC_FIELDS:
if numeric_field not in template_dict: if numeric_field not in template_dict:
# As of [1] format syntax is: # As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type # %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
@@ -672,7 +676,19 @@ class YoutubeDL(object):
FORMAT_RE.format(numeric_field), FORMAT_RE.format(numeric_field),
r'%({0})s'.format(numeric_field), outtmpl) r'%({0})s'.format(numeric_field), outtmpl)
filename = expand_path(outtmpl % template_dict) # expand_path translates '%%' into '%' and '$$' into '$'
# respectively, which is not what we want since we need to keep
# '%%' intact for the template dict substitution step. Working around
# this with a boundary-like separator hack.
sep = ''.join([random.choice(ascii_letters) for _ in range(32)])
outtmpl = outtmpl.replace('%%', '%{0}%'.format(sep)).replace('$$', '${0}$'.format(sep))
# outtmpl should be expand_path'ed before template dict substitution
# because meta fields may contain env variables we don't want to
# be expanded. For example, for outtmpl "%(title)s.%(ext)s" and
# title "Hello $PATH", we don't want `$PATH` to be expanded.
filename = expand_path(outtmpl).replace(sep, '') % template_dict
# Temporary fix for #4787 # Temporary fix for #4787
# 'Treat' all problem characters by passing filename through preferredencoding # 'Treat' all problem characters by passing filename through preferredencoding
# to workaround encoding issues with subprocess on python2 @ Windows # to workaround encoding issues with subprocess on python2 @ Windows
@@ -844,7 +860,7 @@ class YoutubeDL(object):
force_properties = dict( force_properties = dict(
(k, v) for k, v in ie_result.items() if v is not None) (k, v) for k, v in ie_result.items() if v is not None)
for f in ('_type', 'url', 'ie_key'): for f in ('_type', 'url', 'id', 'extractor', 'extractor_key', 'ie_key'):
if f in force_properties: if f in force_properties:
del force_properties[f] del force_properties[f]
new_result = info.copy() new_result = info.copy()
@@ -1048,6 +1064,25 @@ class YoutubeDL(object):
return op(actual_value, comparison_value) return op(actual_value, comparison_value)
return _filter return _filter
def _default_format_spec(self, info_dict, download=True):
req_format_list = []
def can_have_partial_formats():
if self.params.get('simulate', False):
return True
if not download:
return True
if self.params.get('outtmpl', DEFAULT_OUTTMPL) == '-':
return False
if info_dict.get('is_live'):
return False
merger = FFmpegMergerPP(self)
return merger.available and merger.can_merge()
if can_have_partial_formats():
req_format_list.append('bestvideo+bestaudio')
req_format_list.append('best')
return '/'.join(req_format_list)
def build_format_selector(self, format_spec): def build_format_selector(self, format_spec):
def syntax_error(note, start): def syntax_error(note, start):
message = ( message = (
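
The decision made by `_default_format_spec` can be restated as a pure function, which makes the branches easier to check. This is a simplified sketch: the function name and its boolean parameters are assumptions that replace the `self.params` lookups and the `FFmpegMergerPP` availability check in the hunk above.

```python
def default_format_spec(simulate, download, outtmpl, is_live, can_merge):
    """Return the default format selector string for the given situation."""
    def can_have_partial_formats():
        # When nothing is written to disk, separate streams are harmless.
        if simulate or not download:
            return True
        # Writing to stdout or downloading a live stream rules out merging.
        if outtmpl == '-' or is_live:
            return False
        # Otherwise it depends on whether a merger (e.g. ffmpeg) is usable.
        return can_merge

    specs = []
    if can_have_partial_formats():
        specs.append('bestvideo+bestaudio')
    specs.append('best')
    return '/'.join(specs)
```

For example, `default_format_spec(False, True, '-', False, True)` yields `'best'`, since output to stdout cannot be merged afterwards.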
@@ -1344,9 +1379,28 @@ class YoutubeDL(object):
if 'title' not in info_dict: if 'title' not in info_dict:
raise ExtractorError('Missing "title" field in extractor result') raise ExtractorError('Missing "title" field in extractor result')
if not isinstance(info_dict['id'], compat_str): def report_force_conversion(field, field_not, conversion):
self.report_warning('"id" field is not a string - forcing string conversion') self.report_warning(
info_dict['id'] = compat_str(info_dict['id']) '"%s" field is not %s - forcing %s conversion, there is an error in extractor'
% (field, field_not, conversion))
def sanitize_string_field(info, string_field):
field = info.get(string_field)
if field is None or isinstance(field, compat_str):
return
report_force_conversion(string_field, 'a string', 'string')
info[string_field] = compat_str(field)
def sanitize_numeric_fields(info):
for numeric_field in self._NUMERIC_FIELDS:
field = info.get(numeric_field)
if field is None or isinstance(field, compat_numeric_types):
continue
report_force_conversion(numeric_field, 'numeric', 'int')
info[numeric_field] = int_or_none(field)
sanitize_string_field(info_dict, 'id')
sanitize_numeric_fields(info_dict)
if 'playlist' not in info_dict: if 'playlist' not in info_dict:
# It isn't part of a playlist # It isn't part of a playlist
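
The numeric sanitization added in this hunk can be sketched outside the class as follows. The field tuple is shortened for brevity, `numbers.Number` is used here in place of `compat_numeric_types`, and this `int_or_none` is a simplified local stand-in for the utility of the same name, so treat all three as assumptions.

```python
import numbers

# Shortened, illustrative subset of the numeric field names.
NUMERIC_FIELDS = ('width', 'height', 'duration', 'view_count')


def int_or_none(value):
    """Simplified stand-in: convert to int, or return None on failure."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return None


def sanitize_numeric_fields(info, warn=print):
    """Force non-numeric values of known numeric fields to int (or None)."""
    for field in NUMERIC_FIELDS:
        value = info.get(field)
        if value is None or isinstance(value, numbers.Number):
            continue
        warn('"%s" field is not numeric - forcing int conversion' % field)
        info[field] = int_or_none(value)
    return info
```

A value such as `'1280'` becomes `1280`, while an unparseable string becomes `None` rather than raising.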
@@ -1427,16 +1481,26 @@ class YoutubeDL(object):
if not formats: if not formats:
raise ExtractorError('No video formats found!') raise ExtractorError('No video formats found!')
def is_wellformed(f):
url = f.get('url')
valid_url = url and isinstance(url, compat_str)
if not valid_url:
self.report_warning(
'"url" field is missing or empty - skipping format, '
'there is an error in extractor')
return valid_url
# Filter out malformed formats for better extraction robustness
formats = list(filter(is_wellformed, formats))
formats_dict = {} formats_dict = {}
# We check that all the formats have the format and format_id fields # We check that all the formats have the format and format_id fields
for i, format in enumerate(formats): for i, format in enumerate(formats):
if 'url' not in format: sanitize_string_field(format, 'format_id')
raise ExtractorError('Missing "url" key in result (index %d)' % i) sanitize_numeric_fields(format)
format['url'] = sanitize_url(format['url']) format['url'] = sanitize_url(format['url'])
if not format.get('format_id'):
if format.get('format_id') is None:
format['format_id'] = compat_str(i) format['format_id'] = compat_str(i)
else: else:
# Sanitize format_id from characters used in format selector expression # Sanitize format_id from characters used in format selector expression
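
The effect of the new `is_wellformed` filter is that a format without a usable `url` is dropped with a warning instead of aborting extraction. A minimal sketch, assuming Python 3 `str` in place of `compat_str` and a hypothetical `filter_formats` wrapper:

```python
def is_wellformed(fmt, warn=print):
    """A format is usable only if it has a non-empty string 'url'."""
    url = fmt.get('url')
    valid_url = bool(url) and isinstance(url, str)
    if not valid_url:
        warn('"url" field is missing or empty - skipping format')
    return valid_url


def filter_formats(formats, warn=print):
    """Keep only well-formed formats, preserving their order."""
    return [fmt for fmt in formats if is_wellformed(fmt, warn)]
```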
@@ -1489,14 +1553,10 @@ class YoutubeDL(object):
req_format = self.params.get('format') req_format = self.params.get('format')
if req_format is None: if req_format is None:
req_format_list = [] req_format = self._default_format_spec(info_dict, download=download)
if (self.params.get('outtmpl', DEFAULT_OUTTMPL) != '-' and if self.params.get('verbose'):
not info_dict.get('is_live')): self.to_stdout('[debug] Default format spec: %s' % req_format)
merger = FFmpegMergerPP(self)
if merger.available and merger.can_merge():
req_format_list.append('bestvideo+bestaudio')
req_format_list.append('best')
req_format = '/'.join(req_format_list)
format_selector = self.build_format_selector(req_format) format_selector = self.build_format_selector(req_format)
# While in format selection we may need to have an access to the original # While in format selection we may need to have an access to the original
@@ -1859,7 +1919,7 @@ class YoutubeDL(object):
info_dict.get('protocol') == 'm3u8' and info_dict.get('protocol') == 'm3u8' and
self.params.get('hls_prefer_native')): self.params.get('hls_prefer_native')):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning('%s: malformated aac bitstream.' % ( self.report_warning('%s: malformed AAC bitstream detected.' % (
info_dict['id'])) info_dict['id']))
elif fixup_policy == 'detect_or_warn': elif fixup_policy == 'detect_or_warn':
fixup_pp = FFmpegFixupM3u8PP(self) fixup_pp = FFmpegFixupM3u8PP(self)
@@ -1868,7 +1928,7 @@ class YoutubeDL(object):
info_dict['__postprocessors'].append(fixup_pp) info_dict['__postprocessors'].append(fixup_pp)
else: else:
self.report_warning( self.report_warning(
'%s: malformated aac bitstream. %s' '%s: malformed AAC bitstream detected. %s'
% (info_dict['id'], INSTALL_FFMPEG_MESSAGE)) % (info_dict['id'], INSTALL_FFMPEG_MESSAGE))
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')


@@ -343,6 +343,7 @@ def _real_main(argv=None):
'retries': opts.retries, 'retries': opts.retries,
'fragment_retries': opts.fragment_retries, 'fragment_retries': opts.fragment_retries,
'skip_unavailable_fragments': opts.skip_unavailable_fragments, 'skip_unavailable_fragments': opts.skip_unavailable_fragments,
'keep_fragments': opts.keep_fragments,
'buffersize': opts.buffersize, 'buffersize': opts.buffersize,
'noresizebuffer': opts.noresizebuffer, 'noresizebuffer': opts.noresizebuffer,
'continuedl': opts.continue_dl, 'continuedl': opts.continue_dl,


@@ -2322,6 +2322,19 @@ try:
except ImportError: # Python 2 except ImportError: # Python 2
from HTMLParser import HTMLParser as compat_HTMLParser from HTMLParser import HTMLParser as compat_HTMLParser
try: # Python 2
from HTMLParser import HTMLParseError as compat_HTMLParseError
except ImportError: # Python <3.4
try:
from html.parser import HTMLParseError as compat_HTMLParseError
except ImportError: # Python >3.4
# HTMLParseError has been deprecated in Python 3.3 and removed in
# Python 3.5. Introducing dummy exception for Python >3.5 for compatible
# and uniform cross-version exception handling
class compat_HTMLParseError(Exception):
pass
try: try:
from subprocess import DEVNULL from subprocess import DEVNULL
compat_subprocess_get_DEVNULL = lambda: DEVNULL compat_subprocess_get_DEVNULL = lambda: DEVNULL
@@ -2604,14 +2617,22 @@ except ImportError: # Python 2
parsed_result[name] = [value] parsed_result[name] = [value]
return parsed_result return parsed_result
try:
from shlex import quote as compat_shlex_quote compat_os_name = os._name if os.name == 'java' else os.name
except ImportError: # Python < 3.3
if compat_os_name == 'nt':
def compat_shlex_quote(s): def compat_shlex_quote(s):
if re.match(r'^[-_\w./]+$', s): return s if re.match(r'^[-_\w./]+$', s) else '"%s"' % s.replace('"', '\\"')
return s else:
else: try:
return "'" + s.replace("'", "'\"'\"'") + "'" from shlex import quote as compat_shlex_quote
except ImportError: # Python < 3.3
def compat_shlex_quote(s):
if re.match(r'^[-_\w./]+$', s):
return s
else:
return "'" + s.replace("'", "'\"'\"'") + "'"
try: try:
@@ -2636,9 +2657,6 @@ def compat_ord(c):
return ord(c) return ord(c)
compat_os_name = os._name if os.name == 'java' else os.name
if sys.version_info >= (3, 0): if sys.version_info >= (3, 0):
compat_getenv = os.getenv compat_getenv = os.getenv
compat_expanduser = os.path.expanduser compat_expanduser = os.path.expanduser
@@ -2882,6 +2900,7 @@ else:
__all__ = [ __all__ = [
'compat_HTMLParseError',
'compat_HTMLParser', 'compat_HTMLParser',
'compat_HTTPError', 'compat_HTTPError',
'compat_basestring', 'compat_basestring',
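
The reworked `compat_shlex_quote` picks a quoting style per platform: double quotes on Windows, single quotes elsewhere. The sketch below takes the OS name as a parameter so that both branches can be exercised on one machine. `quote_arg` is a hypothetical name, and the POSIX branch reproduces only the fallback shown in the hunk, not `shlex.quote` itself.

```python
import re

# Arguments made only of these characters need no quoting at all.
SAFE_RE = re.compile(r'^[-_\w./]+$')


def quote_arg(s, os_name):
    """Quote a single command-line argument for display purposes."""
    if SAFE_RE.match(s):
        return s
    if os_name == 'nt':
        # Windows: wrap in double quotes and backslash-escape inner ones.
        return '"%s"' % s.replace('"', '\\"')
    # POSIX fallback: wrap in single quotes; an embedded single quote is
    # closed, emitted inside double quotes, and reopened.
    return "'" + s.replace("'", "'\"'\"'") + "'"
```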


@@ -8,10 +8,11 @@ import random
from ..compat import compat_os_name from ..compat import compat_os_name
from ..utils import ( from ..utils import (
decodeArgument,
encodeFilename, encodeFilename,
error_to_compat_str, error_to_compat_str,
decodeArgument,
format_bytes, format_bytes,
shell_quote,
timeconvert, timeconvert,
) )
@@ -187,6 +188,9 @@ class FileDownloader(object):
return filename[:-len('.part')] return filename[:-len('.part')]
return filename return filename
def ytdl_filename(self, filename):
return filename + '.ytdl'
def try_rename(self, old_filename, new_filename): def try_rename(self, old_filename, new_filename):
try: try:
if old_filename == new_filename: if old_filename == new_filename:
@@ -327,21 +331,22 @@ class FileDownloader(object):
os.path.exists(encodeFilename(filename)) os.path.exists(encodeFilename(filename))
) )
continuedl_and_exists = ( if not hasattr(filename, 'write'):
self.params.get('continuedl', True) and continuedl_and_exists = (
os.path.isfile(encodeFilename(filename)) and self.params.get('continuedl', True) and
not self.params.get('nopart', False) os.path.isfile(encodeFilename(filename)) and
) not self.params.get('nopart', False)
)
# Check file already present # Check file already present
if filename != '-' and (nooverwrites_and_exists or continuedl_and_exists): if filename != '-' and (nooverwrites_and_exists or continuedl_and_exists):
self.report_file_already_downloaded(filename) self.report_file_already_downloaded(filename)
self._hook_progress({ self._hook_progress({
'filename': filename, 'filename': filename,
'status': 'finished', 'status': 'finished',
'total_bytes': os.path.getsize(encodeFilename(filename)), 'total_bytes': os.path.getsize(encodeFilename(filename)),
}) })
return True return True
min_sleep_interval = self.params.get('sleep_interval') min_sleep_interval = self.params.get('sleep_interval')
if min_sleep_interval: if min_sleep_interval:
@@ -377,10 +382,5 @@ class FileDownloader(object):
if exe is None: if exe is None:
exe = os.path.basename(str_args[0]) exe = os.path.basename(str_args[0])
try:
import pipes
shell_quote = lambda args: ' '.join(map(pipes.quote, str_args))
except ImportError:
shell_quote = repr
self.to_screen('[debug] %s command line: %s' % ( self.to_screen('[debug] %s command line: %s' % (
exe, shell_quote(str_args))) exe, shell_quote(str_args)))


@@ -1,13 +1,8 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import os
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import compat_urllib_error from ..compat import compat_urllib_error
from ..utils import ( from ..utils import urljoin
sanitize_open,
encodeFilename,
)
class DashSegmentsFD(FragmentFD): class DashSegmentsFD(FragmentFD):
@@ -18,41 +13,39 @@ class DashSegmentsFD(FragmentFD):
FD_NAME = 'dashsegments' FD_NAME = 'dashsegments'
def real_download(self, filename, info_dict): def real_download(self, filename, info_dict):
segments = info_dict['fragments'][:1] if self.params.get( fragment_base_url = info_dict.get('fragment_base_url')
fragments = info_dict['fragments'][:1] if self.params.get(
'test', False) else info_dict['fragments'] 'test', False) else info_dict['fragments']
ctx = { ctx = {
'filename': filename, 'filename': filename,
'total_frags': len(segments), 'total_frags': len(fragments),
} }
self._prepare_and_start_frag_download(ctx) self._prepare_and_start_frag_download(ctx)
segments_filenames = []
fragment_retries = self.params.get('fragment_retries', 0) fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True) skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
def process_segment(segment, tmp_filename, num): frag_index = 0
segment_url = segment['url'] for i, fragment in enumerate(fragments):
segment_name = 'Frag%d' % num frag_index += 1
target_filename = '%s-%s' % (tmp_filename, segment_name) if frag_index <= ctx['fragment_index']:
continue
# In DASH, the first segment contains necessary headers to # In DASH, the first segment contains necessary headers to
# generate a valid MP4 file, so always abort for the first segment # generate a valid MP4 file, so always abort for the first segment
fatal = num == 0 or not skip_unavailable_fragments fatal = i == 0 or not skip_unavailable_fragments
count = 0 count = 0
while count <= fragment_retries: while count <= fragment_retries:
try: try:
success = ctx['dl'].download(target_filename, { fragment_url = fragment.get('url')
'url': segment_url, if not fragment_url:
'http_headers': info_dict.get('http_headers'), assert fragment_base_url
}) fragment_url = urljoin(fragment_base_url, fragment['path'])
success, frag_content = self._download_fragment(ctx, fragment_url, info_dict)
if not success: if not success:
return False return False
down, target_sanitized = sanitize_open(target_filename, 'rb') self._append_fragment(ctx, frag_content)
ctx['dest_stream'].write(down.read())
down.close()
segments_filenames.append(target_sanitized)
break break
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# YouTube may often return 404 HTTP error for a fragment causing the # YouTube may often return 404 HTTP error for a fragment causing the
@@ -63,22 +56,14 @@ class DashSegmentsFD(FragmentFD):
# HTTP error. # HTTP error.
count += 1 count += 1
if count <= fragment_retries: if count <= fragment_retries:
self.report_retry_fragment(err, segment_name, count, fragment_retries) self.report_retry_fragment(err, frag_index, count, fragment_retries)
if count > fragment_retries: if count > fragment_retries:
if not fatal: if not fatal:
self.report_skip_fragment(segment_name) self.report_skip_fragment(frag_index)
return True continue
self.report_error('giving up after %s fragment retries' % fragment_retries) self.report_error('giving up after %s fragment retries' % fragment_retries)
return False return False
return True
for i, segment in enumerate(segments):
if not process_segment(segment, ctx['tmpfilename'], i):
return False
self._finish_frag_download(ctx) self._finish_frag_download(ctx)
for segment_file in segments_filenames:
os.remove(encodeFilename(segment_file))
return True return True
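
The control flow of the rewritten `real_download` loop (resume skip, per-fragment retries, fatal first fragment) can be modelled without any network access. In this sketch `download_fragments` and the `fetch` callable are hypothetical, `IOError` stands in for the HTTP error caught in the diff, and fragments are collected in a list instead of being appended to a destination stream.

```python
def download_fragments(fragments, fetch, fragment_retries=0,
                       skip_unavailable_fragments=True, resume_index=0):
    """Return the fetched fragment bodies, or None on a fatal failure."""
    downloaded = []
    for i, fragment in enumerate(fragments):
        frag_index = i + 1
        # Fragments up to the resume point are assumed to be on disk already.
        if frag_index <= resume_index:
            continue
        # The first fragment carries the headers, so it may never be skipped.
        fatal = i == 0 or not skip_unavailable_fragments
        count = 0
        while count <= fragment_retries:
            try:
                downloaded.append(fetch(fragment))
                break
            except IOError:
                count += 1
        if count > fragment_retries:
            if not fatal:
                continue  # skip this fragment and move on to the next one
            return None   # give up on the whole download
    return downloaded
```

Note that skipping uses `continue` rather than returning, which mirrors the change from `return True` to `continue` in the hunk above.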


@@ -29,7 +29,17 @@ class ExternalFD(FileDownloader):
self.report_destination(filename) self.report_destination(filename)
tmpfilename = self.temp_name(filename) tmpfilename = self.temp_name(filename)
retval = self._call_downloader(tmpfilename, info_dict) try:
retval = self._call_downloader(tmpfilename, info_dict)
except KeyboardInterrupt:
if not info_dict.get('is_live'):
raise
# Cancelling a live stream download should be treated as a
# correct and expected termination, so all postprocessing
# should still take place
retval = 0
self.to_screen('[%s] Interrupted by user' % self.get_basename())
if retval == 0: if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize)) self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
@@ -202,6 +212,11 @@ class FFmpegFD(ExternalFD):
args = [ffpp.executable, '-y'] args = [ffpp.executable, '-y']
for log_level in ('quiet', 'verbose'):
if self.params.get(log_level, False):
args += ['-loglevel', log_level]
break
seekable = info_dict.get('_seekable') seekable = info_dict.get('_seekable')
if seekable is not None: if seekable is not None:
# setting -seekable prevents ffmpeg from guessing if the server # setting -seekable prevents ffmpeg from guessing if the server
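
The `-loglevel` selection added to `FFmpegFD` gives `quiet` precedence over `verbose` when both are set and adds nothing when neither is. A minimal sketch of just that choice, using a hypothetical `ffmpeg_loglevel_args` helper and a plain dict in place of `self.params`:

```python
def ffmpeg_loglevel_args(params):
    """Return the extra ffmpeg arguments implied by the quiet/verbose params."""
    for log_level in ('quiet', 'verbose'):
        if params.get(log_level, False):
            # The first matching level wins, so 'quiet' beats 'verbose'.
            return ['-loglevel', log_level]
    return []
```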


@@ -3,7 +3,6 @@ from __future__ import division, unicode_literals
import base64 import base64
import io import io
import itertools import itertools
import os
import time import time
from .fragment import FragmentFD from .fragment import FragmentFD
@@ -16,9 +15,7 @@ from ..compat import (
compat_struct_unpack, compat_struct_unpack,
) )
from ..utils import ( from ..utils import (
encodeFilename,
fix_xml_ampersands, fix_xml_ampersands,
sanitize_open,
xpath_text, xpath_text,
) )
@@ -366,17 +363,21 @@ class F4mFD(FragmentFD):
dest_stream = ctx['dest_stream'] dest_stream = ctx['dest_stream']
write_flv_header(dest_stream) if ctx['complete_frags_downloaded_bytes'] == 0:
if not live: write_flv_header(dest_stream)
write_metadata_tag(dest_stream, metadata) if not live:
write_metadata_tag(dest_stream, metadata)
base_url_parsed = compat_urllib_parse_urlparse(base_url) base_url_parsed = compat_urllib_parse_urlparse(base_url)
self._start_frag_download(ctx) self._start_frag_download(ctx)
frags_filenames = [] frag_index = 0
while fragments_list: while fragments_list:
seg_i, frag_i = fragments_list.pop(0) seg_i, frag_i = fragments_list.pop(0)
frag_index += 1
if frag_index <= ctx['fragment_index']:
continue
name = 'Seg%d-Frag%d' % (seg_i, frag_i) name = 'Seg%d-Frag%d' % (seg_i, frag_i)
query = [] query = []
if base_url_parsed.query: if base_url_parsed.query:
@@ -386,17 +387,10 @@ class F4mFD(FragmentFD):
if info_dict.get('extra_param_to_segment_url'): if info_dict.get('extra_param_to_segment_url'):
query.append(info_dict['extra_param_to_segment_url']) query.append(info_dict['extra_param_to_segment_url'])
url_parsed = base_url_parsed._replace(path=base_url_parsed.path + name, query='&'.join(query)) url_parsed = base_url_parsed._replace(path=base_url_parsed.path + name, query='&'.join(query))
frag_filename = '%s-%s' % (ctx['tmpfilename'], name)
try: try:
success = ctx['dl'].download(frag_filename, { success, down_data = self._download_fragment(ctx, url_parsed.geturl(), info_dict)
'url': url_parsed.geturl(),
'http_headers': info_dict.get('http_headers'),
})
if not success: if not success:
return False return False
(down, frag_sanitized) = sanitize_open(frag_filename, 'rb')
down_data = down.read()
down.close()
reader = FlvReader(down_data) reader = FlvReader(down_data)
while True: while True:
try: try:
@@ -411,12 +405,8 @@ class F4mFD(FragmentFD):
break break
raise raise
if box_type == b'mdat': if box_type == b'mdat':
dest_stream.write(box_data) self._append_fragment(ctx, box_data)
break break
if live:
os.remove(encodeFilename(frag_sanitized))
else:
frags_filenames.append(frag_sanitized)
except (compat_urllib_error.HTTPError, ) as err: except (compat_urllib_error.HTTPError, ) as err:
if live and (err.code == 404 or err.code == 410): if live and (err.code == 404 or err.code == 410):
# We didn't keep up with the live window. Continue # We didn't keep up with the live window. Continue
@@ -436,7 +426,4 @@ class F4mFD(FragmentFD):
self._finish_frag_download(ctx) self._finish_frag_download(ctx)
for frag_file in frags_filenames:
os.remove(encodeFilename(frag_file))
return True return True


@@ -2,6 +2,7 @@ from __future__ import division, unicode_literals
import os import os
import time import time
import json
from .common import FileDownloader from .common import FileDownloader
from .http import HttpFD from .http import HttpFD
@@ -28,15 +29,37 @@ class FragmentFD(FileDownloader):
and hlsnative only) and hlsnative only)
skip_unavailable_fragments: skip_unavailable_fragments:
Skip unavailable fragments (DASH and hlsnative only) Skip unavailable fragments (DASH and hlsnative only)
keep_fragments: Keep downloaded fragments on disk after downloading is
finished
For each incomplete fragment download, youtube-dl keeps a special
bookkeeping file on disk with download state and metadata (in the future such
files will be used for any incomplete download handled by youtube-dl). This
file is used to properly handle resuming, check download file consistency and
detect potential errors. The file has a .ytdl extension and is a standard
JSON file of the following format:
extractor:
Dictionary of extractor related data. TBD.
downloader:
Dictionary of downloader related data. May contain following data:
current_fragment:
Dictionary with current (being downloaded) fragment data:
index: 0-based index of current fragment among all fragments
fragment_count:
Total count of fragments
This feature is experimental and the file format may change in the future.
""" """
def report_retry_fragment(self, err, fragment_name, count, retries): def report_retry_fragment(self, err, frag_index, count, retries):
self.to_screen( self.to_screen(
'[download] Got server HTTP error: %s. Retrying fragment %s (attempt %d of %s)...' '[download] Got server HTTP error: %s. Retrying fragment %d (attempt %d of %s)...'
% (error_to_compat_str(err), fragment_name, count, self.format_retries(retries))) % (error_to_compat_str(err), frag_index, count, self.format_retries(retries)))
def report_skip_fragment(self, fragment_name): def report_skip_fragment(self, frag_index):
self.to_screen('[download] Skipping fragment %s...' % fragment_name) self.to_screen('[download] Skipping fragment %d...' % frag_index)
def _prepare_url(self, info_dict, url): def _prepare_url(self, info_dict, url):
headers = info_dict.get('http_headers') headers = info_dict.get('http_headers')
@@ -46,6 +69,51 @@ class FragmentFD(FileDownloader):
self._prepare_frag_download(ctx) self._prepare_frag_download(ctx)
self._start_frag_download(ctx) self._start_frag_download(ctx)
@staticmethod
def __do_ytdl_file(ctx):
return not ctx['live'] and not ctx['tmpfilename'] == '-'
def _read_ytdl_file(self, ctx):
stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'r')
ctx['fragment_index'] = json.loads(stream.read())['downloader']['current_fragment']['index']
stream.close()
def _write_ytdl_file(self, ctx):
frag_index_stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'w')
downloader = {
'current_fragment': {
'index': ctx['fragment_index'],
},
}
if ctx.get('fragment_count') is not None:
downloader['fragment_count'] = ctx['fragment_count']
frag_index_stream.write(json.dumps({'downloader': downloader}))
frag_index_stream.close()
def _download_fragment(self, ctx, frag_url, info_dict, headers=None):
fragment_filename = '%s-Frag%d' % (ctx['tmpfilename'], ctx['fragment_index'])
success = ctx['dl'].download(fragment_filename, {
'url': frag_url,
'http_headers': headers or info_dict.get('http_headers'),
})
if not success:
return False, None
down, frag_sanitized = sanitize_open(fragment_filename, 'rb')
ctx['fragment_filename_sanitized'] = frag_sanitized
frag_content = down.read()
down.close()
return True, frag_content
def _append_fragment(self, ctx, frag_content):
try:
ctx['dest_stream'].write(frag_content)
finally:
if self.__do_ytdl_file(ctx):
self._write_ytdl_file(ctx)
if not self.params.get('keep_fragments', False):
os.remove(ctx['fragment_filename_sanitized'])
del ctx['fragment_filename_sanitized']
def _prepare_frag_download(self, ctx): def _prepare_frag_download(self, ctx):
if 'live' not in ctx: if 'live' not in ctx:
ctx['live'] = False ctx['live'] = False
@@ -66,11 +134,36 @@ class FragmentFD(FileDownloader):
} }
) )
tmpfilename = self.temp_name(ctx['filename']) tmpfilename = self.temp_name(ctx['filename'])
dest_stream, tmpfilename = sanitize_open(tmpfilename, 'wb') open_mode = 'wb'
resume_len = 0
# Establish possible resume length
if os.path.isfile(encodeFilename(tmpfilename)):
open_mode = 'ab'
resume_len = os.path.getsize(encodeFilename(tmpfilename))
# Should be initialized before ytdl file check
ctx.update({
'tmpfilename': tmpfilename,
'fragment_index': 0,
})
if self.__do_ytdl_file(ctx):
if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))):
self._read_ytdl_file(ctx)
else:
self._write_ytdl_file(ctx)
if ctx['fragment_index'] > 0:
assert resume_len > 0
dest_stream, tmpfilename = sanitize_open(tmpfilename, open_mode)
ctx.update({ ctx.update({
'dl': dl, 'dl': dl,
'dest_stream': dest_stream, 'dest_stream': dest_stream,
'tmpfilename': tmpfilename, 'tmpfilename': tmpfilename,
# Total complete fragments downloaded so far in bytes
'complete_frags_downloaded_bytes': resume_len,
}) })
def _start_frag_download(self, ctx): def _start_frag_download(self, ctx):
@@ -79,9 +172,9 @@ class FragmentFD(FileDownloader):
# hook # hook
state = { state = {
'status': 'downloading', 'status': 'downloading',
'downloaded_bytes': 0, 'downloaded_bytes': ctx['complete_frags_downloaded_bytes'],
'frag_index': 0, 'fragment_index': ctx['fragment_index'],
'frag_count': total_frags, 'fragment_count': total_frags,
'filename': ctx['filename'], 'filename': ctx['filename'],
'tmpfilename': ctx['tmpfilename'], 'tmpfilename': ctx['tmpfilename'],
} }
@@ -89,8 +182,6 @@ class FragmentFD(FileDownloader):
start = time.time() start = time.time()
ctx.update({ ctx.update({
'started': start, 'started': start,
# Total complete fragments downloaded so far in bytes
'complete_frags_downloaded_bytes': 0,
# Amount of fragment's bytes downloaded by the time of the previous # Amount of fragment's bytes downloaded by the time of the previous
# frag progress hook invocation # frag progress hook invocation
'prev_frag_downloaded_bytes': 0, 'prev_frag_downloaded_bytes': 0,
@@ -106,11 +197,12 @@ class FragmentFD(FileDownloader):
if not ctx['live']: if not ctx['live']:
estimated_size = ( estimated_size = (
(ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) / (ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) /
(state['frag_index'] + 1) * total_frags) (state['fragment_index'] + 1) * total_frags)
state['total_bytes_estimate'] = estimated_size state['total_bytes_estimate'] = estimated_size
if s['status'] == 'finished': if s['status'] == 'finished':
state['frag_index'] += 1 state['fragment_index'] += 1
ctx['fragment_index'] = state['fragment_index']
state['downloaded_bytes'] += frag_total_bytes - ctx['prev_frag_downloaded_bytes'] state['downloaded_bytes'] += frag_total_bytes - ctx['prev_frag_downloaded_bytes']
ctx['complete_frags_downloaded_bytes'] = state['downloaded_bytes'] ctx['complete_frags_downloaded_bytes'] = state['downloaded_bytes']
ctx['prev_frag_downloaded_bytes'] = 0 ctx['prev_frag_downloaded_bytes'] = 0
@@ -132,6 +224,10 @@ class FragmentFD(FileDownloader):
def _finish_frag_download(self, ctx): def _finish_frag_download(self, ctx):
ctx['dest_stream'].close() ctx['dest_stream'].close()
if self.__do_ytdl_file(ctx):
ytdl_filename = encodeFilename(self.ytdl_filename(ctx['filename']))
if os.path.isfile(ytdl_filename):
os.remove(ytdl_filename)
elapsed = time.time() - ctx['started'] elapsed = time.time() - ctx['started']
self.try_rename(ctx['tmpfilename'], ctx['filename']) self.try_rename(ctx['tmpfilename'], ctx['filename'])
fsize = os.path.getsize(encodeFilename(ctx['filename'])) fsize = os.path.getsize(encodeFilename(ctx['filename']))
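
The `.ytdl` bookkeeping file described in the `FragmentFD` docstring can be illustrated with a minimal standalone sketch. The helper names `write_state` and `read_state` and the temporary path are hypothetical and only mirror the JSON layout shown in the diff; they are not part of youtube-dl.

```python
import json
import os
import tempfile


def write_state(path, fragment_index, fragment_count=None):
    """Write a .ytdl-style state file (illustrative helper, not youtube-dl API)."""
    downloader = {'current_fragment': {'index': fragment_index}}
    # fragment_count is optional, as in _write_ytdl_file
    if fragment_count is not None:
        downloader['fragment_count'] = fragment_count
    with open(path, 'w') as f:
        json.dump({'downloader': downloader}, f)


def read_state(path):
    """Return the saved fragment index, or 0 when no state file exists."""
    if not os.path.isfile(path):
        return 0
    with open(path) as f:
        return json.load(f)['downloader']['current_fragment']['index']


# Hypothetical file name; the diff derives the real one via ytdl_filename()
state_path = os.path.join(tempfile.mkdtemp(), 'video.mp4.ytdl')
write_state(state_path, fragment_index=3, fragment_count=10)
resumed_index = read_state(state_path)
```

Unlike this sketch, the diff rewrites the state file after every appended fragment and deletes it in `_finish_frag_download`, so a leftover file implies an interrupted download.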


@@ -1,6 +1,5 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import os.path
import re import re
import binascii import binascii
try: try:
@@ -18,8 +17,6 @@ from ..compat import (
compat_struct_pack, compat_struct_pack,
) )
from ..utils import ( from ..utils import (
encodeFilename,
sanitize_open,
parse_m3u8_attributes, parse_m3u8_attributes,
update_url_query, update_url_query,
) )
@@ -62,9 +59,9 @@ class HlsFD(FragmentFD):
man_url = info_dict['url'] man_url = info_dict['url']
self.to_screen('[%s] Downloading m3u8 manifest' % self.FD_NAME) self.to_screen('[%s] Downloading m3u8 manifest' % self.FD_NAME)
manifest = self.ydl.urlopen(self._prepare_url(info_dict, man_url)).read() urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
man_url = urlh.geturl()
s = manifest.decode('utf-8', 'ignore') s = urlh.read().decode('utf-8', 'ignore')
if not self.can_download(s, info_dict): if not self.can_download(s, info_dict):
if info_dict.get('extra_param_to_segment_url'): if info_dict.get('extra_param_to_segment_url'):
@@ -103,17 +100,18 @@ class HlsFD(FragmentFD):
media_sequence = 0 media_sequence = 0
decrypt_info = {'METHOD': 'NONE'} decrypt_info = {'METHOD': 'NONE'}
byte_range = {} byte_range = {}
frags_filenames = [] frag_index = 0
for line in s.splitlines(): for line in s.splitlines():
line = line.strip() line = line.strip()
if line: if line:
if not line.startswith('#'): if not line.startswith('#'):
frag_index += 1
if frag_index <= ctx['fragment_index']:
continue
frag_url = ( frag_url = (
line line
if re.match(r'^https?://', line) if re.match(r'^https?://', line)
else compat_urlparse.urljoin(man_url, line)) else compat_urlparse.urljoin(man_url, line))
frag_name = 'Frag%d' % i
frag_filename = '%s-%s' % (ctx['tmpfilename'], frag_name)
if extra_query: if extra_query:
frag_url = update_url_query(frag_url, extra_query) frag_url = update_url_query(frag_url, extra_query)
count = 0 count = 0
@@ -122,15 +120,10 @@ class HlsFD(FragmentFD):
headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end']) headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'])
while count <= fragment_retries: while count <= fragment_retries:
try: try:
success = ctx['dl'].download(frag_filename, { success, frag_content = self._download_fragment(
'url': frag_url, ctx, frag_url, info_dict, headers)
'http_headers': headers,
})
if not success: if not success:
return False return False
down, frag_sanitized = sanitize_open(frag_filename, 'rb')
frag_content = down.read()
down.close()
break break
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# Unavailable (possibly temporary) fragments may be served. # Unavailable (possibly temporary) fragments may be served.
@@ -139,28 +132,29 @@ class HlsFD(FragmentFD):
# https://github.com/rg3/youtube-dl/issues/10448). # https://github.com/rg3/youtube-dl/issues/10448).
count += 1 count += 1
if count <= fragment_retries: if count <= fragment_retries:
self.report_retry_fragment(err, frag_name, count, fragment_retries) self.report_retry_fragment(err, frag_index, count, fragment_retries)
if count > fragment_retries: if count > fragment_retries:
if skip_unavailable_fragments: if skip_unavailable_fragments:
i += 1 i += 1
media_sequence += 1 media_sequence += 1
self.report_skip_fragment(frag_name) self.report_skip_fragment(frag_index)
continue continue
self.report_error( self.report_error(
'giving up after %s fragment retries' % fragment_retries) 'giving up after %s fragment retries' % fragment_retries)
return False return False
if decrypt_info['METHOD'] == 'AES-128': if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence) iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(decrypt_info['URI']).read()
frag_content = AES.new( frag_content = AES.new(
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content) decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
ctx['dest_stream'].write(frag_content) self._append_fragment(ctx, frag_content)
frags_filenames.append(frag_sanitized)
# We only download the first fragment during the test # We only download the first fragment during the test
if test: if test:
break break
i += 1 i += 1
media_sequence += 1 media_sequence += 1
elif line.startswith('#EXT-X-KEY'): elif line.startswith('#EXT-X-KEY'):
decrypt_url = decrypt_info.get('URI')
decrypt_info = parse_m3u8_attributes(line[11:]) decrypt_info = parse_m3u8_attributes(line[11:])
if decrypt_info['METHOD'] == 'AES-128': if decrypt_info['METHOD'] == 'AES-128':
if 'IV' in decrypt_info: if 'IV' in decrypt_info:
@@ -170,7 +164,8 @@ class HlsFD(FragmentFD):
man_url, decrypt_info['URI']) man_url, decrypt_info['URI'])
if extra_query: if extra_query:
decrypt_info['URI'] = update_url_query(decrypt_info['URI'], extra_query) decrypt_info['URI'] = update_url_query(decrypt_info['URI'], extra_query)
decrypt_info['KEY'] = self.ydl.urlopen(decrypt_info['URI']).read() if decrypt_url != decrypt_info['URI']:
decrypt_info['KEY'] = None
elif line.startswith('#EXT-X-MEDIA-SEQUENCE'): elif line.startswith('#EXT-X-MEDIA-SEQUENCE'):
media_sequence = int(line[22:]) media_sequence = int(line[22:])
elif line.startswith('#EXT-X-BYTERANGE'): elif line.startswith('#EXT-X-BYTERANGE'):
@@ -183,7 +178,4 @@ class HlsFD(FragmentFD):
self._finish_frag_download(ctx) self._finish_frag_download(ctx)
for frag_file in frags_filenames:
os.remove(encodeFilename(frag_file))
return True return True
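
The resume logic added to the HLS loop (count every media line, skip those at or below the saved `fragment_index`) can be sketched in isolation. The generator `fragments_to_download` and the sample manifest are hypothetical and cover only the counting and skipping, not key handling, byte ranges, or retries.

```python
def fragments_to_download(manifest_text, completed):
    """Yield (frag_index, line) for media lines that still need downloading.

    `completed` plays the role of ctx['fragment_index'] in the diff.
    """
    frag_index = 0
    for line in manifest_text.splitlines():
        line = line.strip()
        # Tags and blank lines are not fragments
        if not line or line.startswith('#'):
            continue
        frag_index += 1
        if frag_index <= completed:
            continue  # already appended to the output in an earlier run
        yield frag_index, line


manifest = (
    '#EXTM3U\n'
    '#EXTINF:10,\nseg1.ts\n'
    '#EXTINF:10,\nseg2.ts\n'
    '#EXTINF:10,\nseg3.ts\n'
)
remaining = list(fragments_to_download(manifest, completed=2))
```

With two fragments already completed, only the third media line is yielded, numbered with the same 1-based index that the diff passes to `report_retry_fragment` and `report_skip_fragment`.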


@@ -1,6 +1,5 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import os
import time import time
import struct import struct
import binascii import binascii
@@ -8,10 +7,6 @@ import io
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import compat_urllib_error from ..compat import compat_urllib_error
from ..utils import (
sanitize_open,
encodeFilename,
)
u8 = struct.Struct(b'>B') u8 = struct.Struct(b'>B')
@@ -103,7 +98,7 @@ def write_piff_header(stream, params):
if is_audio: if is_audio:
smhd_payload = s88.pack(0) # balance smhd_payload = s88.pack(0) # balance
smhd_payload = u16.pack(0) # reserved smhd_payload += u16.pack(0) # reserved
media_header_box = full_box(b'smhd', 0, 0, smhd_payload) # Sound Media Header media_header_box = full_box(b'smhd', 0, 0, smhd_payload) # Sound Media Header
else: else:
vmhd_payload = u16.pack(0) # graphics mode vmhd_payload = u16.pack(0) # graphics mode
@@ -131,7 +126,6 @@ def write_piff_header(stream, params):
if fourcc == 'AACL': if fourcc == 'AACL':
sample_entry_box = box(b'mp4a', sample_entry_payload) sample_entry_box = box(b'mp4a', sample_entry_payload)
else: else:
sample_entry_payload = sample_entry_payload
sample_entry_payload += u16.pack(0) # pre defined sample_entry_payload += u16.pack(0) # pre defined
sample_entry_payload += u16.pack(0) # reserved sample_entry_payload += u16.pack(0) # reserved
sample_entry_payload += u32.pack(0) * 3 # pre defined sample_entry_payload += u32.pack(0) * 3 # pre defined
@@ -225,50 +219,39 @@ class IsmFD(FragmentFD):
self._prepare_and_start_frag_download(ctx) self._prepare_and_start_frag_download(ctx)
segments_filenames = []
fragment_retries = self.params.get('fragment_retries', 0) fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True) skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
track_written = False track_written = False
frag_index = 0
for i, segment in enumerate(segments): for i, segment in enumerate(segments):
segment_url = segment['url'] frag_index += 1
segment_name = 'Frag%d' % i if frag_index <= ctx['fragment_index']:
target_filename = '%s-%s' % (ctx['tmpfilename'], segment_name) continue
count = 0 count = 0
while count <= fragment_retries: while count <= fragment_retries:
try: try:
success = ctx['dl'].download(target_filename, { success, frag_content = self._download_fragment(ctx, segment['url'], info_dict)
'url': segment_url,
'http_headers': info_dict.get('http_headers'),
})
if not success: if not success:
return False return False
down, target_sanitized = sanitize_open(target_filename, 'rb')
down_data = down.read()
if not track_written: if not track_written:
tfhd_data = extract_box_data(down_data, [b'moof', b'traf', b'tfhd']) tfhd_data = extract_box_data(frag_content, [b'moof', b'traf', b'tfhd'])
info_dict['_download_params']['track_id'] = u32.unpack(tfhd_data[4:8])[0] info_dict['_download_params']['track_id'] = u32.unpack(tfhd_data[4:8])[0]
write_piff_header(ctx['dest_stream'], info_dict['_download_params']) write_piff_header(ctx['dest_stream'], info_dict['_download_params'])
track_written = True track_written = True
ctx['dest_stream'].write(down_data) self._append_fragment(ctx, frag_content)
down.close()
segments_filenames.append(target_sanitized)
break break
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
count += 1 count += 1
if count <= fragment_retries: if count <= fragment_retries:
self.report_retry_fragment(err, segment_name, count, fragment_retries) self.report_retry_fragment(err, frag_index, count, fragment_retries)
if count > fragment_retries: if count > fragment_retries:
if skip_unavailable_fragments: if skip_unavailable_fragments:
self.report_skip_fragment(segment_name) self.report_skip_fragment(frag_index)
continue continue
self.report_error('giving up after %s fragment retries' % fragment_retries) self.report_error('giving up after %s fragment retries' % fragment_retries)
return False return False
self._finish_frag_download(ctx) self._finish_frag_download(ctx)
for segment_file in segments_filenames:
os.remove(encodeFilename(segment_file))
return True return True
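
The one-character fix in `write_piff_header` (`=` changed to `+=` for `smhd_payload`) is easier to see with the payload sizes side by side. The struct formats below are an assumption, since the definitions of `s88` and `u16` are not shown in this diff; the sketch only shows how plain assignment drops the balance field.

```python
import struct

# Assumed definitions: a signed 8.8 fixed-point field and an unsigned 16-bit field
s88 = struct.Struct('>bx')
u16 = struct.Struct('>H')

# Before the fix: the second assignment overwrites the balance field
buggy_payload = s88.pack(0)   # balance
buggy_payload = u16.pack(0)   # reserved (replaces balance instead of appending)

# After the fix: both fields end up in the payload
fixed_payload = s88.pack(0)   # balance
fixed_payload += u16.pack(0)  # reserved
```

Under these assumptions the old payload is two bytes short of the four bytes produced by the corrected code.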


@@ -3,11 +3,13 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
js_to_json, js_to_json,
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
try_get,
) )
@@ -124,7 +126,20 @@ class ABCIViewIE(InfoExtractor):
title = video_params.get('title') or video_params['seriesTitle'] title = video_params.get('title') or video_params['seriesTitle']
stream = next(s for s in video_params['playlist'] if s.get('type') == 'program') stream = next(s for s in video_params['playlist'] if s.get('type') == 'program')
formats = self._extract_akamai_formats(stream['hds-unmetered'], video_id) format_urls = [
try_get(stream, lambda x: x['hds-unmetered'], compat_str)]
# May have higher quality video
sd_url = try_get(
stream, lambda x: x['streams']['hds']['sd'], compat_str)
if sd_url:
format_urls.append(sd_url.replace('metered', 'um'))
formats = []
for format_url in format_urls:
if format_url:
formats.extend(
self._extract_akamai_formats(format_url, video_id))
self._sort_formats(formats) self._sort_formats(formats)
subtitles = {} subtitles = {}
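
The ABC iview change collects candidate manifest URLs defensively and rewrites the metered SD URL to its unmetered form. A minimal sketch follows, assuming a simplified stand-in for `try_get`, Python 3's `str` in place of `compat_str`, and made-up example URLs; `collect_format_urls` is a hypothetical name.

```python
def try_get(src, getter, expected_type=None):
    """Simplified stand-in for the utils helper: return getter(src) or None."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def collect_format_urls(stream):
    """Return the usable manifest URLs for one stream dict."""
    format_urls = [try_get(stream, lambda x: x['hds-unmetered'], str)]
    # The SD stream may offer higher quality; switch it to the unmetered path
    sd_url = try_get(stream, lambda x: x['streams']['hds']['sd'], str)
    if sd_url:
        format_urls.append(sd_url.replace('metered', 'um'))
    return [url for url in format_urls if url]


stream = {
    'hds-unmetered': 'http://example.com/um/master.f4m',
    'streams': {'hds': {'sd': 'http://example.com/metered/sd.f4m'}},
}
format_urls = collect_format_urls(stream)
```

A stream with neither key yields an empty list rather than raising, which is the practical difference from the old `stream['hds-unmetered']` lookup.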


@@ -12,7 +12,15 @@ from ..compat import compat_urlparse
class AbcNewsVideoIE(AMPIE): class AbcNewsVideoIE(AMPIE):
IE_NAME = 'abcnews:video' IE_NAME = 'abcnews:video'
_VALID_URL = r'https?://abcnews\.go\.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)' _VALID_URL = r'''(?x)
https?://
abcnews\.go\.com/
(?:
[^/]+/video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid=
)
(?P<id>\d+)
'''
_TESTS = [{ _TESTS = [{
'url': 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932', 'url': 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932',
@@ -29,6 +37,9 @@ class AbcNewsVideoIE(AMPIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
}, {
'url': 'http://abcnews.go.com/video/embed?id=46979033',
'only_matching': True,
}, { }, {
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478', 'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
'only_matching': True, 'only_matching': True,
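
The new verbose `_VALID_URL` accepts both the regular video pages and the embed form. The pattern below is copied from the diff and exercised with plain `re.match` against two of the test URLs; the variable names are illustrative only.

```python
import re

VALID_URL = r'''(?x)
    https?://
        abcnews\.go\.com/
        (?:
            [^/]+/video/(?P<display_id>[0-9a-z-]+)-|
            video/embed\?.*?\bid=
        )
        (?P<id>\d+)
'''

page = re.match(
    VALID_URL,
    'http://abcnews.go.com/ThisWeek/video/'
    'week-exclusive-irans-foreign-minister-zarif-20411932')
embed = re.match(VALID_URL, 'http://abcnews.go.com/video/embed?id=46979033')
```

For the embed URL the `display_id` group does not participate in the match, so code reading it has to tolerate `None`.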


@@ -22,7 +22,7 @@ class ABCOTVSIE(InfoExtractor):
'display_id': 'east-bay-museum-celebrates-vintage-synthesizers', 'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
'ext': 'mp4', 'ext': 'mp4',
'title': 'East Bay museum celebrates vintage synthesizers', 'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10', 'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421123075, 'timestamp': 1421123075,
'upload_date': '20150113', 'upload_date': '20150113',


@@ -15,6 +15,7 @@ from ..utils import (
intlist_to_bytes, intlist_to_bytes,
srt_subtitles_timecode, srt_subtitles_timecode,
strip_or_none, strip_or_none,
urljoin,
) )
@@ -31,25 +32,28 @@ class ADNIE(InfoExtractor):
'description': 'md5:2f7b5aa76edbc1a7a92cedcda8a528d5', 'description': 'md5:2f7b5aa76edbc1a7a92cedcda8a528d5',
} }
} }
_BASE_URL = 'http://animedigitalnetwork.fr'
def _get_subtitles(self, sub_path, video_id): def _get_subtitles(self, sub_path, video_id):
if not sub_path: if not sub_path:
return None return None
enc_subtitles = self._download_webpage( enc_subtitles = self._download_webpage(
'http://animedigitalnetwork.fr/' + sub_path, urljoin(self._BASE_URL, sub_path),
video_id, fatal=False) video_id, fatal=False, headers={
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0',
})
if not enc_subtitles: if not enc_subtitles:
return None return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt( dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(base64.b64decode(enc_subtitles[24:])), bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
bytes_to_intlist(b'\nd\xaf\xd2J\xd0\xfc\xe1\xfc\xdf\xb61\xe8\xe1\xf0\xcc'), bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
bytes_to_intlist(base64.b64decode(enc_subtitles[:24])) bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
)) ))
subtitles_json = self._parse_json( subtitles_json = self._parse_json(
dec_subtitles[:-compat_ord(dec_subtitles[-1])], dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(),
None, fatal=False) None, fatal=False)
if not subtitles_json: if not subtitles_json:
return None return None
@@ -103,9 +107,18 @@ class ADNIE(InfoExtractor):
metas = options.get('metas') or {} metas = options.get('metas') or {}
title = metas.get('title') or video_info['title'] title = metas.get('title') or video_info['title']
links = player_config.get('links') or {} links = player_config.get('links') or {}
error = None
if not links:
links_url = player_config['linksurl']
links_data = self._download_json(urljoin(
self._BASE_URL, links_url), video_id)
links = links_data.get('links') or {}
error = links_data.get('error')
formats = [] formats = []
for format_id, qualities in links.items(): for format_id, qualities in links.items():
if not isinstance(qualities, dict):
continue
for load_balancer_url in qualities.values(): for load_balancer_url in qualities.values():
load_balancer_data = self._download_json( load_balancer_data = self._download_json(
load_balancer_url, video_id, fatal=False) or {} load_balancer_url, video_id, fatal=False) or {}
@@ -119,7 +132,8 @@ class ADNIE(InfoExtractor):
for f in m3u8_formats: for f in m3u8_formats:
f['language'] = 'fr' f['language'] = 'fr'
formats.extend(m3u8_formats) formats.extend(m3u8_formats)
error = options.get('error') if not error:
error = options.get('error')
if not formats and error: if not formats and error:
raise ExtractorError('%s said: %s' % (self.IE_NAME, error), expected=True) raise ExtractorError('%s said: %s' % (self.IE_NAME, error), expected=True)
self._sort_formats(formats) self._sort_formats(formats)


@@ -6,12 +6,16 @@ import time
import xml.etree.ElementTree as etree import xml.etree.ElementTree as etree
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_urlparse from ..compat import (
compat_kwargs,
compat_urlparse,
)
from ..utils import ( from ..utils import (
unescapeHTML, unescapeHTML,
urlencode_postdata, urlencode_postdata,
unified_timestamp, unified_timestamp,
ExtractorError, ExtractorError,
NO_DEFAULT,
) )
@@ -21,6 +25,11 @@ MSO_INFO = {
'username_field': 'username', 'username_field': 'username',
'password_field': 'password', 'password_field': 'password',
}, },
'ATTOTT': {
'name': 'DIRECTV NOW',
'username_field': 'email',
'password_field': 'loginpassword',
},
'Rogers': { 'Rogers': {
'name': 'Rogers', 'name': 'Rogers',
'username_field': 'UserName', 'username_field': 'UserName',
@@ -36,6 +45,11 @@ MSO_INFO = {
'username_field': 'Ecom_User_ID', 'username_field': 'Ecom_User_ID',
'password_field': 'Ecom_Password', 'password_field': 'Ecom_Password',
}, },
'Brighthouse': {
'name': 'Bright House Networks | Spectrum',
'username_field': 'j_username',
'password_field': 'j_password',
},
'Charter_Direct': { 'Charter_Direct': {
'name': 'Charter Spectrum', 'name': 'Charter Spectrum',
'username_field': 'IDToken1', 'username_field': 'IDToken1',
@@ -1308,6 +1322,15 @@ class AdobePassIE(InfoExtractor):
_USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0' _USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0'
_MVPD_CACHE = 'ap-mvpd' _MVPD_CACHE = 'ap-mvpd'
_DOWNLOADING_LOGIN_PAGE = 'Downloading Provider Login Page'
def _download_webpage_handle(self, *args, **kwargs):
headers = kwargs.get('headers', {})
headers.update(self.geo_verification_headers())
kwargs['headers'] = headers
return super(AdobePassIE, self)._download_webpage_handle(
*args, **compat_kwargs(kwargs))
@staticmethod @staticmethod
def _get_mvpd_resource(provider_id, title, guid, rating): def _get_mvpd_resource(provider_id, title, guid, rating):
channel = etree.Element('channel') channel = etree.Element('channel')
@@ -1350,6 +1373,21 @@ class AdobePassIE(InfoExtractor):
'Use --ap-mso to specify Adobe Pass Multiple-system operator Identifier ' 'Use --ap-mso to specify Adobe Pass Multiple-system operator Identifier '
'and --ap-username and --ap-password or --netrc to provide account credentials.', expected=True) 'and --ap-username and --ap-password or --netrc to provide account credentials.', expected=True)
def extract_redirect_url(html, url=None, fatal=False):
# TODO: eliminate code duplication with generic extractor and move
# redirection code into _download_webpage_handle
REDIRECT_REGEX = r'[0-9]{,2};\s*(?:URL|url)=\'?([^\'"]+)'
redirect_url = self._search_regex(
r'(?i)<meta\s+(?=(?:[a-z-]+="[^"]+"\s+)*http-equiv="refresh")'
r'(?:[a-z-]+="[^"]+"\s+)*?content="%s' % REDIRECT_REGEX,
html, 'meta refresh redirect',
default=NO_DEFAULT if fatal else None, fatal=fatal)
if not redirect_url:
return None
if url:
redirect_url = compat_urlparse.urljoin(url, unescapeHTML(redirect_url))
return redirect_url
mvpd_headers = { mvpd_headers = {
'ap_42': 'anonymous', 'ap_42': 'anonymous',
'ap_11': 'Linux i686', 'ap_11': 'Linux i686',
@@ -1399,16 +1437,15 @@ class AdobePassIE(InfoExtractor):
if '<form name="signin"' in provider_redirect_page: if '<form name="signin"' in provider_redirect_page:
provider_login_page_res = provider_redirect_page_res provider_login_page_res = provider_redirect_page_res
elif 'http-equiv="refresh"' in provider_redirect_page: elif 'http-equiv="refresh"' in provider_redirect_page:
oauth_redirect_url = self._html_search_regex( oauth_redirect_url = extract_redirect_url(
r'content="0;\s*url=([^\'"]+)', provider_redirect_page, fatal=True)
provider_redirect_page, 'meta refresh redirect')
provider_login_page_res = self._download_webpage_handle( provider_login_page_res = self._download_webpage_handle(
oauth_redirect_url, video_id, oauth_redirect_url, video_id,
'Downloading Provider Login Page') self._DOWNLOADING_LOGIN_PAGE)
else: else:
provider_login_page_res = post_form( provider_login_page_res = post_form(
provider_redirect_page_res, provider_redirect_page_res,
'Downloading Provider Login Page') self._DOWNLOADING_LOGIN_PAGE)
mvpd_confirm_page_res = post_form( mvpd_confirm_page_res = post_form(
provider_login_page_res, 'Logging in', { provider_login_page_res, 'Logging in', {
@@ -1455,8 +1492,17 @@ class AdobePassIE(InfoExtractor):
'Content-Type': 'application/x-www-form-urlencoded' 'Content-Type': 'application/x-www-form-urlencoded'
}) })
else: else:
# Some providers (e.g. DIRECTV NOW) have another meta refresh
# based redirect that should be followed.
provider_redirect_page, urlh = provider_redirect_page_res
provider_refresh_redirect_url = extract_redirect_url(
provider_redirect_page, url=urlh.geturl())
if provider_refresh_redirect_url:
provider_redirect_page_res = self._download_webpage_handle(
provider_refresh_redirect_url, video_id,
'Downloading Provider Redirect Page (meta refresh)')
provider_login_page_res = post_form( provider_login_page_res = post_form(
provider_redirect_page_res, 'Downloading Provider Login Page') provider_redirect_page_res, self._DOWNLOADING_LOGIN_PAGE)
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', { mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
mso_info.get('username_field', 'username'): username, mso_info.get('username_field', 'username'): username,
mso_info.get('password_field', 'password'): password, mso_info.get('password_field', 'password'): password,
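
The meta refresh handling introduced by `extract_redirect_url` can be tried outside the extractor. This sketch reuses the regular expressions from the diff but substitutes `re.search`, `html.unescape`, and `urllib.parse.urljoin` for `_search_regex`, `unescapeHTML`, and `compat_urlparse.urljoin`; the sample page and the `provider.example` host are invented.

```python
import re
from html import unescape
from urllib.parse import urljoin

REDIRECT_REGEX = r'[0-9]{,2};\s*(?:URL|url)=\'?([^\'"]+)'
META_REFRESH_RE = (
    r'(?i)<meta\s+(?=(?:[a-z-]+="[^"]+"\s+)*http-equiv="refresh")'
    r'(?:[a-z-]+="[^"]+"\s+)*?content="%s' % REDIRECT_REGEX)


def extract_redirect_url(html, url=None):
    """Return the meta refresh target, resolved against `url` when given."""
    match = re.search(META_REFRESH_RE, html)
    if not match:
        return None
    redirect_url = match.group(1)
    if url:
        # Relative targets are resolved against the page that served them
        redirect_url = urljoin(url, unescape(redirect_url))
    return redirect_url


html_page = '<meta http-equiv="refresh" content="0; url=/login?next=%2Fhome&amp;x=1">'
target = extract_redirect_url(html_page, url='https://provider.example/auth/start')
```

Passing the final URL of the response, as the diff does with `urlh.geturl()`, matters when the provider itself redirected before serving the refresh page.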


@@ -5,91 +5,52 @@ import re
from .turner import TurnerBaseIE from .turner import TurnerBaseIE
from ..utils import ( from ..utils import (
ExtractorError,
int_or_none, int_or_none,
strip_or_none,
) )
class AdultSwimIE(TurnerBaseIE): class AdultSwimIE(TurnerBaseIE):
_VALID_URL = r'https?://(?:www\.)?adultswim\.com/videos/(?P<is_playlist>playlists/)?(?P<show_path>[^/]+)/(?P<episode_path>[^/?#]+)/?' _VALID_URL = r'https?://(?:www\.)?adultswim\.com/videos/(?P<show_path>[^/?#]+)(?:/(?P<episode_path>[^/?#]+))?'
_TESTS = [{ _TESTS = [{
'url': 'http://adultswim.com/videos/rick-and-morty/pilot', 'url': 'http://adultswim.com/videos/rick-and-morty/pilot',
'playlist': [
{
'md5': '247572debc75c7652f253c8daa51a14d',
'info_dict': {
'id': 'rQxZvXQ4ROaSOqq-or2Mow-0',
'ext': 'flv',
'title': 'Rick and Morty - Pilot Part 1',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
},
},
{
'md5': '77b0e037a4b20ec6b98671c4c379f48d',
'info_dict': {
'id': 'rQxZvXQ4ROaSOqq-or2Mow-3',
'ext': 'flv',
'title': 'Rick and Morty - Pilot Part 4',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
},
},
],
'info_dict': { 'info_dict': {
'id': 'rQxZvXQ4ROaSOqq-or2Mow', 'id': 'rQxZvXQ4ROaSOqq-or2Mow',
'ext': 'mp4',
'title': 'Rick and Morty - Pilot', 'title': 'Rick and Morty - Pilot',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. " 'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.',
}, 'timestamp': 1493267400,
'skip': 'This video is only available for registered users', 'upload_date': '20170427',
}, {
'url': 'http://www.adultswim.com/videos/playlists/american-parenting/putting-francine-out-of-business/',
'playlist': [
{
'md5': '2eb5c06d0f9a1539da3718d897f13ec5',
'info_dict': {
'id': '-t8CamQlQ2aYZ49ItZCFog-0',
'ext': 'flv',
'title': 'American Dad - Putting Francine Out of Business',
'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
},
}
],
'info_dict': {
'id': '-t8CamQlQ2aYZ49ItZCFog',
'title': 'American Dad - Putting Francine Out of Business',
'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
},
}, {
'url': 'http://www.adultswim.com/videos/tim-and-eric-awesome-show-great-job/dr-steve-brule-for-your-wine/',
'playlist': [
{
'md5': '3e346a2ab0087d687a05e1e7f3b3e529',
'info_dict': {
'id': 'sY3cMUR_TbuE4YmdjzbIcQ-0',
'ext': 'mp4',
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \r\nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.\r\n\r\n',
},
}
],
'info_dict': {
'id': 'sY3cMUR_TbuE4YmdjzbIcQ',
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \r\nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.\r\n\r\n',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} },
'expected_warnings': ['Unable to download f4m manifest'],
}, {
'url': 'http://www.adultswim.com/videos/tim-and-eric-awesome-show-great-job/dr-steve-brule-for-your-wine/',
'info_dict': {
'id': 'sY3cMUR_TbuE4YmdjzbIcQ',
'ext': 'mp4',
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.',
'upload_date': '20080124',
'timestamp': 1201150800,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, { }, {
# heroMetadata.trailer
'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/', 'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/',
'info_dict': { 'info_dict': {
'id': 'I0LQFQkaSUaFp8PnAWHhoQ', 'id': 'I0LQFQkaSUaFp8PnAWHhoQ',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Decker - Inside Decker: A New Hero', 'title': 'Decker - Inside Decker: A New Hero',
'description': 'md5:c916df071d425d62d70c86d4399d3ee0', 'description': 'The guys recap the conclusion of the season. They announce a new hero, take a peek into the Victorville Film Archive and welcome back the talented James Dean.',
'duration': 249.008, 'timestamp': 1469480460,
'upload_date': '20160725',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@@ -97,136 +58,102 @@ class AdultSwimIE(TurnerBaseIE):
}, },
'expected_warnings': ['Unable to download f4m manifest'], 'expected_warnings': ['Unable to download f4m manifest'],
}, { }, {
'url': 'http://www.adultswim.com/videos/toonami/friday-october-14th-2016/', 'url': 'http://www.adultswim.com/videos/attack-on-titan',
'info_dict': { 'info_dict': {
'id': 'eYiLsKVgQ6qTC6agD67Sig', 'id': 'b7A69dzfRzuaXIECdxW8XQ',
'title': 'Toonami - Friday, October 14th, 2016', 'title': 'Attack on Titan',
'description': 'md5:99892c96ffc85e159a428de85c30acde', 'description': 'md5:6c8e003ea0777b47013e894767f5e114',
},
'playlist_mincount': 12,
}, {
'url': 'http://www.adultswim.com/videos/streams/williams-stream',
'info_dict': {
'id': 'd8DEBj7QRfetLsRgFnGEyg',
'ext': 'mp4',
'title': r're:^Williams Stream \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'description': 'original programming',
}, },
'playlist': [{
'md5': '',
'info_dict': {
'id': 'eYiLsKVgQ6qTC6agD67Sig',
'ext': 'mp4',
'title': 'Toonami - Friday, October 14th, 2016',
'description': 'md5:99892c96ffc85e159a428de85c30acde',
},
}],
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['Unable to download f4m manifest'],
}] }]
@staticmethod
def find_video_info(collection, slug):
for video in collection.get('videos'):
if video.get('slug') == slug:
return video
@staticmethod
def find_collection_by_linkURL(collections, linkURL):
for collection in collections:
if collection.get('linkURL') == linkURL:
return collection
@staticmethod
def find_collection_containing_video(collections, slug):
for collection in collections:
for video in collection.get('videos'):
if video.get('slug') == slug:
return collection, video
return None, None
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) show_path, episode_path = re.match(self._VALID_URL, url).groups()
show_path = mobj.group('show_path') display_id = episode_path or show_path
episode_path = mobj.group('episode_path') webpage = self._download_webpage(url, display_id)
is_playlist = True if mobj.group('is_playlist') else False initial_data = self._parse_json(self._search_regex(
r'AS_INITIAL_DATA(?:__)?\s*=\s*({.+?});',
webpage, 'initial data'), display_id)
webpage = self._download_webpage(url, episode_path) is_stream = show_path == 'streams'
if is_stream:
if not episode_path:
episode_path = 'live-stream'
# Extract the value of `bootstrappedData` from the Javascript in the page. video_data = next(stream for stream_path, stream in initial_data['streams'].items() if stream_path == episode_path)
bootstrapped_data = self._parse_json(self._search_regex( video_id = video_data.get('stream')
r'var bootstrappedData = ({.*});', webpage, 'bootstraped data'), episode_path)
# Downloading videos from a /videos/playlist/ URL needs to be handled differently. if not video_id:
# NOTE: We are only downloading one video (the current one) not the playlist entries = []
if is_playlist: for episode in video_data.get('archiveEpisodes', []):
collections = bootstrapped_data['playlists']['collections'] episode_url = episode.get('url')
collection = self.find_collection_by_linkURL(collections, show_path) if not episode_url:
video_info = self.find_video_info(collection, episode_path) continue
entries.append(self.url_result(
show_title = video_info['showTitle'] episode_url, 'AdultSwim', episode.get('id')))
segment_ids = [video_info['videoPlaybackID']] return self.playlist_result(
entries, video_data.get('id'), video_data.get('title'),
strip_or_none(video_data.get('description')))
else: else:
collections = bootstrapped_data['show']['collections'] show_data = initial_data['show']
collection, video_info = self.find_collection_containing_video(collections, episode_path)
# Video wasn't found in the collections, let's try `slugged_video`.
if video_info is None:
if bootstrapped_data.get('slugged_video', {}).get('slug') == episode_path:
video_info = bootstrapped_data['slugged_video']
if not video_info:
video_info = bootstrapped_data.get(
'heroMetadata', {}).get('trailer', {}).get('video')
if not video_info:
video_info = bootstrapped_data.get('onlineOriginals', [None])[0]
if not video_info:
raise ExtractorError('Unable to find video info')
show = bootstrapped_data['show'] if not episode_path:
show_title = show['title'] entries = []
stream = video_info.get('stream') for video in show_data.get('videos', []):
if stream and stream.get('videoPlaybackID'): slug = video.get('slug')
segment_ids = [stream['videoPlaybackID']] if not slug:
elif video_info.get('clips'): continue
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']] entries.append(self.url_result(
elif video_info.get('videoPlaybackID'): 'http://adultswim.com/videos/%s/%s' % (show_path, slug),
segment_ids = [video_info['videoPlaybackID']] 'AdultSwim', video.get('id')))
elif video_info.get('id'): return self.playlist_result(
segment_ids = [video_info['id']] entries, show_data.get('id'), show_data.get('title'),
else: strip_or_none(show_data.get('metadata', {}).get('description')))
if video_info.get('auth') is True:
raise ExtractorError(
'This video is only available via cable service provider subscription that'
' is not currently supported. You may want to use --cookies.', expected=True)
else:
raise ExtractorError('Unable to find stream or clips')
episode_id = video_info['id'] video_data = show_data['sluggedVideo']
episode_title = video_info['title'] video_id = video_data['id']
episode_description = video_info.get('description')
episode_duration = int_or_none(video_info.get('duration'))
view_count = int_or_none(video_info.get('views'))
entries = [] info = self._extract_cvp_info(
for part_num, segment_id in enumerate(segment_ids): 'http://www.adultswim.com/videos/api/v0/assets?platform=desktop&id=' + video_id,
segement_info = self._extract_cvp_info( video_id, {
'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=desktop' % segment_id, 'secure': {
segment_id, { 'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big',
'secure': { 'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do',
'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big', },
'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do', }, {
}, 'url': url,
}) 'site_name': 'AdultSwim',
segment_title = '%s - %s' % (show_title, episode_title) 'auth_required': video_data.get('auth'),
if len(segment_ids) > 1:
segment_title += ' Part %d' % (part_num + 1)
segement_info.update({
'id': segment_id,
'title': segment_title,
'description': episode_description,
}) })
entries.append(segement_info)
return { info.update({
'_type': 'playlist', 'id': video_id,
'id': episode_id, 'display_id': display_id,
'display_id': episode_path, 'description': info.get('description') or strip_or_none(video_data.get('description')),
'entries': entries, })
'title': '%s - %s' % (show_title, episode_title), if not is_stream:
'description': episode_description, info.update({
'duration': episode_duration, 'duration': info.get('duration') or int_or_none(video_data.get('duration')),
'view_count': view_count, 'timestamp': info.get('timestamp') or int_or_none(video_data.get('launch_date')),
} 'season_number': info.get('season_number') or int_or_none(video_data.get('season_number')),
'episode': info['title'],
'episode_number': info.get('episode_number') or int_or_none(video_data.get('episode_number')),
})
info['series'] = video_data.get('collection_title') or info.get('series')
if info['series'] and info['series'] != info['title']:
info['title'] = '%s - %s' % (info['series'], info['title'])
return info
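
The rewritten `_real_extract` begins by pulling the `AS_INITIAL_DATA` JSON blob out of the page. A minimal standalone sketch of that step, assuming plain `re` and `json` in place of the extractor helpers `_search_regex` and `_parse_json`; `parse_initial_data` and the sample page are hypothetical and only illustrate the regex from the diff:

```python
import json
import re

# Same pattern as in the diff: optional trailing "__", then a JSON object
# terminated by the first "};" the non-greedy match can reach.
INITIAL_DATA_RE = r'AS_INITIAL_DATA(?:__)?\s*=\s*({.+?});'


def parse_initial_data(webpage):
    """Return the decoded initial-data dict, or None if it is absent."""
    mobj = re.search(INITIAL_DATA_RE, webpage)
    if not mobj:
        return None
    return json.loads(mobj.group(1))


# Hypothetical page fragment, not taken from the real site.
sample_page = (
    '<script>var AS_INITIAL_DATA__ = '
    '{"show": {"id": "abc", "title": "Demo"}};</script>')
initial_data = parse_initial_data(sample_page)
```

Because the match is non-greedy, a literal `};` inside a string value would cut the JSON short; the sketch does not guard against that case.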


@@ -101,10 +101,14 @@ class AENetworksIE(AENetworksBaseIE):
for season_url_path in re.findall(r'(?s)<li[^>]+data-href="(/shows/%s/season-\d+)"' % url_parts[0], webpage): for season_url_path in re.findall(r'(?s)<li[^>]+data-href="(/shows/%s/season-\d+)"' % url_parts[0], webpage):
entries.append(self.url_result( entries.append(self.url_result(
compat_urlparse.urljoin(url, season_url_path), 'AENetworks')) compat_urlparse.urljoin(url, season_url_path), 'AENetworks'))
return self.playlist_result( if entries:
entries, self._html_search_meta('aetn:SeriesId', webpage), return self.playlist_result(
self._html_search_meta('aetn:SeriesTitle', webpage)) entries, self._html_search_meta('aetn:SeriesId', webpage),
elif url_parts_len == 2: self._html_search_meta('aetn:SeriesTitle', webpage))
else:
# single season
url_parts_len = 2
if url_parts_len == 2:
entries = [] entries = []
for episode_item in re.findall(r'(?s)<[^>]+class="[^"]*(?:episode|program)-item[^"]*"[^>]*>', webpage): for episode_item in re.findall(r'(?s)<[^>]+class="[^"]*(?:episode|program)-item[^"]*"[^>]*>', webpage):
episode_attributes = extract_attributes(episode_item) episode_attributes = extract_attributes(episode_item)
@@ -112,7 +116,7 @@ class AENetworksIE(AENetworksBaseIE):
url, episode_attributes['data-canonical']) url, episode_attributes['data-canonical'])
entries.append(self.url_result( entries.append(self.url_result(
episode_url, 'AENetworks', episode_url, 'AENetworks',
episode_attributes['data-videoid'])) episode_attributes.get('data-videoid') or episode_attributes.get('data-video-id')))
return self.playlist_result( return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage)) entries, self._html_search_meta('aetn:SeasonId', webpage))


@@ -207,11 +207,10 @@ class AfreecaTVIE(InfoExtractor):
file_url, video_id, 'mp4', entry_protocol='m3u8_native', file_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', m3u8_id='hls',
note='Downloading part %d m3u8 information' % file_num) note='Downloading part %d m3u8 information' % file_num)
title = title if one else '%s (part %d)' % (title, file_num)
file_info = common_entry.copy() file_info = common_entry.copy()
file_info.update({ file_info.update({
'id': format_id, 'id': format_id,
'title': title, 'title': title if one else '%s (part %d)' % (title, file_num),
'upload_date': upload_date, 'upload_date': upload_date,
'duration': file_duration, 'duration': file_duration,
'formats': formats, 'formats': formats,
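
The removed line reassigned `title` on every pass. Assuming it runs once per part inside a loop, as the `file_num` counter suggests, each suffix would be appended to the previous one. A small sketch of the difference; both function names are hypothetical:

```python
def part_titles_old(title, count):
    """Mimic the removed line: `title` is overwritten on every iteration."""
    one = count == 1
    titles = []
    for file_num in range(1, count + 1):
        title = title if one else '%s (part %d)' % (title, file_num)
        titles.append(title)
    return titles


def part_titles_new(title, count):
    """Mimic the added line: the base title is never modified."""
    one = count == 1
    return [
        title if one else '%s (part %d)' % (title, file_num)
        for file_num in range(1, count + 1)]
```

With two parts, the old form yields `Show (part 1)` and then `Show (part 1) (part 2)`, while the new form yields `Show (part 1)` and `Show (part 2)`.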


@@ -4,9 +4,9 @@ from .common import InfoExtractor
class AlJazeeraIE(InfoExtractor): class AlJazeeraIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html' _VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?:programmes|video)/.*?/(?P<id>[^/]+)\.html'
_TEST = { _TESTS = [{
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html', 'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
'info_dict': { 'info_dict': {
'id': '3792260579001', 'id': '3792260579001',
@@ -19,7 +19,10 @@ class AlJazeeraIE(InfoExtractor):
}, },
'add_ie': ['BrightcoveNew'], 'add_ie': ['BrightcoveNew'],
'skip': 'Not accessible from Travis CI server', 'skip': 'Not accessible from Travis CI server',
} }, {
'url': 'http://www.aljazeera.com/video/news/2017/05/sierra-leone-709-carat-diamond-auctioned-170511100111930.html',
'only_matching': True,
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s' BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s'
def _real_extract(self, url): def _real_extract(self, url):


@@ -3,9 +3,10 @@ from __future__ import unicode_literals
from .theplatform import ThePlatformIE from .theplatform import ThePlatformIE
from ..utils import ( from ..utils import (
update_url_query,
parse_age_limit,
int_or_none, int_or_none,
parse_age_limit,
try_get,
update_url_query,
) )
@@ -68,7 +69,8 @@ class AMCNetworksIE(ThePlatformIE):
info = self._parse_theplatform_metadata(theplatform_metadata) info = self._parse_theplatform_metadata(theplatform_metadata)
video_id = theplatform_metadata['pid'] video_id = theplatform_metadata['pid']
title = theplatform_metadata['title'] title = theplatform_metadata['title']
rating = theplatform_metadata['ratings'][0]['rating'] rating = try_get(
theplatform_metadata, lambda x: x['ratings'][0]['rating'])
auth_required = self._search_regex( auth_required = self._search_regex(
r'window\.authRequired\s*=\s*(true|false);', r'window\.authRequired\s*=\s*(true|false);',
webpage, 'auth required') webpage, 'auth required')
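
The switch to `try_get` makes a missing or empty `ratings` list non-fatal. The function below is a simplified standalone re-implementation written for illustration, not a copy of the helper in `..utils`; it assumes the helper swallows lookup errors and returns `None`:

```python
def try_get(src, getter, expected_type=None):
    """Apply getter to src; return None on lookup errors or a type mismatch."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


# Hypothetical metadata dicts standing in for theplatform_metadata.
with_rating = {'ratings': [{'rating': 'TV-14'}]}
without_rating = {'ratings': []}

rating_present = try_get(with_rating, lambda x: x['ratings'][0]['rating'])
rating_missing = try_get(without_rating, lambda x: x['ratings'][0]['rating'])
```

The direct indexing on the old side would raise `IndexError` for `without_rating`; the guarded form returns `None` instead.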


@@ -7,15 +7,19 @@ from ..utils import (
parse_iso8601, parse_iso8601,
mimetype2ext, mimetype2ext,
determine_ext, determine_ext,
ExtractorError,
) )
class AMPIE(InfoExtractor): class AMPIE(InfoExtractor):
# parse Akamai Adaptive Media Player feed # parse Akamai Adaptive Media Player feed
def _extract_feed_info(self, url): def _extract_feed_info(self, url):
item = self._download_json( feed = self._download_json(
url, None, 'Downloading Akamai AMP feed', url, None, 'Downloading Akamai AMP feed',
'Unable to download Akamai AMP feed')['channel']['item'] 'Unable to download Akamai AMP feed')
item = feed.get('channel', {}).get('item')
if not item:
raise ExtractorError('%s said: %s' % (self.IE_NAME, feed['error']))
video_id = item['guid'] video_id = item['guid']
@@ -30,9 +34,12 @@ class AMPIE(InfoExtractor):
if isinstance(media_thumbnail, dict): if isinstance(media_thumbnail, dict):
media_thumbnail = [media_thumbnail] media_thumbnail = [media_thumbnail]
for thumbnail_data in media_thumbnail: for thumbnail_data in media_thumbnail:
thumbnail = thumbnail_data['@attributes'] thumbnail = thumbnail_data.get('@attributes', {})
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({ thumbnails.append({
'url': self._proto_relative_url(thumbnail['url'], 'http:'), 'url': self._proto_relative_url(thumbnail_url, 'http:'),
'width': int_or_none(thumbnail.get('width')), 'width': int_or_none(thumbnail.get('width')),
'height': int_or_none(thumbnail.get('height')), 'height': int_or_none(thumbnail.get('height')),
}) })
@@ -43,9 +50,14 @@ class AMPIE(InfoExtractor):
if isinstance(media_subtitle, dict): if isinstance(media_subtitle, dict):
media_subtitle = [media_subtitle] media_subtitle = [media_subtitle]
for subtitle_data in media_subtitle: for subtitle_data in media_subtitle:
subtitle = subtitle_data['@attributes'] subtitle = subtitle_data.get('@attributes', {})
lang = subtitle.get('lang') or 'en' subtitle_href = subtitle.get('href')
subtitles[lang] = [{'url': subtitle['href']}] if not subtitle_href:
continue
subtitles.setdefault(subtitle.get('lang') or 'en', []).append({
'url': subtitle_href,
'ext': mimetype2ext(subtitle.get('type')) or determine_ext(subtitle_href),
})
formats = [] formats = []
media_content = get_media_node('content') media_content = get_media_node('content')
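
The subtitle hunk above changes two things: entries without an `href` are skipped, and several tracks in the same language are accumulated instead of overwriting each other. A simplified sketch of that logic; `collect_subtitles` is a hypothetical name, and the `ext` detection via `mimetype2ext` / `determine_ext` is left out:

```python
def collect_subtitles(media_subtitle):
    """Group subtitle URLs by language, skipping entries without an href."""
    if isinstance(media_subtitle, dict):
        media_subtitle = [media_subtitle]
    subtitles = {}
    for subtitle_data in media_subtitle or []:
        subtitle = subtitle_data.get('@attributes', {})
        subtitle_href = subtitle.get('href')
        if not subtitle_href:
            continue
        # setdefault keeps earlier tracks for the same language.
        subtitles.setdefault(subtitle.get('lang') or 'en', []).append({
            'url': subtitle_href,
        })
    return subtitles


# Hypothetical feed fragment.
feed_subtitles = [
    {'@attributes': {'href': 'http://example.com/a.vtt', 'lang': 'en'}},
    {'@attributes': {'href': 'http://example.com/a.srt', 'lang': 'en'}},
    {'@attributes': {'lang': 'fr'}},  # no href: skipped
    {},                               # no attributes at all: skipped
]
grouped = collect_subtitles(feed_subtitles)
```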


@@ -5,6 +5,7 @@ import base64
import hashlib import hashlib
import json import json
import random import random
import re
import time import time
from .common import InfoExtractor from .common import InfoExtractor
@@ -16,6 +17,7 @@ from ..utils import (
intlist_to_bytes, intlist_to_bytes,
int_or_none, int_or_none,
strip_jsonp, strip_jsonp,
unescapeHTML,
) )
@@ -26,6 +28,8 @@ def md5_text(s):
class AnvatoIE(InfoExtractor): class AnvatoIE(InfoExtractor):
_VALID_URL = r'anvato:(?P<access_key_or_mcp>[^:]+):(?P<id>\d+)'
# Copied from anvplayer.min.js # Copied from anvplayer.min.js
_ANVACK_TABLE = { _ANVACK_TABLE = {
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ', 'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ',
@@ -114,6 +118,22 @@ class AnvatoIE(InfoExtractor):
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6_secure': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ' 'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6_secure': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ'
} }
_MCP_TO_ACCESS_KEY_TABLE = {
'qa': 'anvato_mcpqa_demo_web_stage_18b55e00db5a13faa8d03ae6e41f6f5bcb15b922',
'lin': 'anvato_mcp_lin_web_prod_4c36fbfd4d8d8ecae6488656e21ac6d1ac972749',
'univison': 'anvato_mcp_univision_web_prod_37fe34850c99a3b5cdb71dab10a417dd5cdecafa',
'uni': 'anvato_mcp_univision_web_prod_37fe34850c99a3b5cdb71dab10a417dd5cdecafa',
'dev': 'anvato_mcp_fs2go_web_prod_c7b90a93e171469cdca00a931211a2f556370d0a',
'sps': 'anvato_mcp_sps_web_prod_54bdc90dd6ba21710e9f7074338365bba28da336',
'spsstg': 'anvato_mcp_sps_web_prod_54bdc90dd6ba21710e9f7074338365bba28da336',
'anv': 'anvato_mcp_anv_web_prod_791407490f4c1ef2a4bcb21103e0cb1bcb3352b3',
'gray': 'anvato_mcp_gray_web_prod_4c10f067c393ed8fc453d3930f8ab2b159973900',
'hearst': 'anvato_mcp_hearst_web_prod_5356c3de0fc7c90a3727b4863ca7fec3a4524a99',
'cbs': 'anvato_mcp_cbs_web_prod_02f26581ff80e5bda7aad28226a8d369037f2cbe',
'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582'
}
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
_AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce' _AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce'
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
@@ -178,12 +198,7 @@ class AnvatoIE(InfoExtractor):
} }
if ext == 'm3u8' or media_format in ('m3u8', 'm3u8-variant'): if ext == 'm3u8' or media_format in ('m3u8', 'm3u8-variant'):
# Not using _extract_m3u8_formats here as individual media if tbr is not None:
# playlists are also included in published_urls.
if tbr is None:
formats.append(self._m3u8_meta_format(video_url, ext='mp4', m3u8_id='hls'))
continue
else:
a_format.update({ a_format.update({
'format_id': '-'.join(filter(None, ['hls', compat_str(tbr)])), 'format_id': '-'.join(filter(None, ['hls', compat_str(tbr)])),
'ext': 'mp4', 'ext': 'mp4',
@@ -222,9 +237,42 @@ class AnvatoIE(InfoExtractor):
'subtitles': subtitles, 'subtitles': subtitles,
} }
@staticmethod
def _extract_urls(ie, webpage, video_id):
entries = []
for mobj in re.finditer(AnvatoIE._ANVP_RE, webpage):
anvplayer_data = ie._parse_json(
mobj.group('anvp'), video_id, transform_source=unescapeHTML,
fatal=False)
if not anvplayer_data:
continue
video = anvplayer_data.get('video')
if not isinstance(video, compat_str) or not video.isdigit():
continue
access_key = anvplayer_data.get('accessKey')
if not access_key:
mcp = anvplayer_data.get('mcp')
if mcp:
access_key = AnvatoIE._MCP_TO_ACCESS_KEY_TABLE.get(
mcp.lower())
if not access_key:
continue
entries.append(ie.url_result(
'anvato:%s:%s' % (access_key, video), ie=AnvatoIE.ie_key(),
video_id=video))
return entries
def _extract_anvato_videos(self, webpage, video_id): def _extract_anvato_videos(self, webpage, video_id):
anvplayer_data = self._parse_json(self._html_search_regex( anvplayer_data = self._parse_json(
r'<script[^>]+data-anvp=\'([^\']+)\'', webpage, self._html_search_regex(
'Anvato player data'), video_id) self._ANVP_RE, webpage, 'Anvato player data', group='anvp'),
video_id)
return self._get_anvato_videos( return self._get_anvato_videos(
anvplayer_data['accessKey'], anvplayer_data['video']) anvplayer_data['accessKey'], anvplayer_data['video'])
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
access_key, video_id = mobj.group('access_key_or_mcp', 'id')
if access_key not in self._ANVACK_TABLE:
access_key = self._MCP_TO_ACCESS_KEY_TABLE[access_key]
return self._get_anvato_videos(access_key, video_id)
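
The embed detection added in `_extract_urls` can be sketched outside the extractor class. This is an illustration under assumptions: it uses Python 3's `html.unescape` and `json.loads` in place of `unescapeHTML` and `_parse_json`, returns bare `anvato:` URLs rather than `url_result` dicts, and the access key in the sample table is a placeholder, not a real key:

```python
import html
import json
import re

# Same pattern as _ANVP_RE in the diff: the backreference \1 makes the
# closing quote match whichever quote character opened the attribute.
ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'


def find_anvato_urls(webpage, mcp_to_access_key):
    """Return anvato:<access_key>:<video_id> URLs for each embedded player."""
    urls = []
    for mobj in re.finditer(ANVP_RE, webpage):
        try:
            data = json.loads(html.unescape(mobj.group('anvp')))
        except ValueError:
            continue  # malformed JSON: ignore this embed
        video = data.get('video')
        if not isinstance(video, str) or not video.isdigit():
            continue
        access_key = data.get('accessKey')
        if not access_key:
            mcp = data.get('mcp')
            if mcp:
                access_key = mcp_to_access_key.get(mcp.lower())
        if not access_key:
            continue
        urls.append('anvato:%s:%s' % (access_key, video))
    return urls


# Hypothetical page and table, for illustration only.
page = ('<script data-anvp="{&quot;mcp&quot;:&quot;DEMO&quot;,'
        '&quot;video&quot;:&quot;12345&quot;}"></script>')
found = find_anvato_urls(page, {'demo': 'placeholder_access_key'})
```

The resulting URL has the shape that the new `_VALID_URL` (`anvato:<access_key_or_mcp>:<id>`) is written to accept.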


@@ -3,13 +3,13 @@ from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, int_or_none,
HEADRequest, mimetype2ext,
) )
class AparatIE(InfoExtractor): class AparatIE(InfoExtractor):
_VALID_URL = r'^https?://(?:www\.)?aparat\.com/(?:v/|video/video/embed/videohash/)(?P<id>[a-zA-Z0-9]+)' _VALID_URL = r'https?://(?:www\.)?aparat\.com/(?:v/|video/video/embed/videohash/)(?P<id>[a-zA-Z0-9]+)'
_TEST = { _TEST = {
'url': 'http://www.aparat.com/v/wP8On', 'url': 'http://www.aparat.com/v/wP8On',
@@ -29,30 +29,41 @@ class AparatIE(InfoExtractor):
# Note: There is an easier-to-parse configuration at # Note: There is an easier-to-parse configuration at
# http://www.aparat.com/video/video/config/videohash/%video_id # http://www.aparat.com/video/video/config/videohash/%video_id
# but the URL in there does not work # but the URL in there does not work
embed_url = 'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id webpage = self._download_webpage(
webpage = self._download_webpage(embed_url, video_id) 'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id,
video_id)
file_list = self._parse_json(self._search_regex(
r'fileList\s*=\s*JSON\.parse\(\'([^\']+)\'\)', webpage, 'file list'), video_id)
for i, item in enumerate(file_list[0]):
video_url = item['file']
req = HEADRequest(video_url)
res = self._request_webpage(
req, video_id, note='Testing video URL %d' % i, errnote=False)
if res:
break
else:
raise ExtractorError('No working video URLs found')
title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title') title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
file_list = self._parse_json(
self._search_regex(
r'fileList\s*=\s*JSON\.parse\(\'([^\']+)\'\)', webpage,
'file list'),
video_id)
formats = []
for item in file_list[0]:
file_url = item.get('file')
if not file_url:
continue
ext = mimetype2ext(item.get('type'))
label = item.get('label')
formats.append({
'url': file_url,
'ext': ext,
'format_id': label or ext,
'height': int_or_none(self._search_regex(
r'(\d+)[pP]', label or '', 'height', default=None)),
})
self._sort_formats(formats)
thumbnail = self._search_regex( thumbnail = self._search_regex(
r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False) r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'url': video_url,
'ext': 'mp4',
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'age_limit': self._family_friendly_search(webpage), 'age_limit': self._family_friendly_search(webpage),
'formats': formats,
} }
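
Instead of probing each URL with a `HEAD` request and keeping the first one that works, the new code turns every entry of the file list into a format. A reduced sketch of that loop, assuming the same `file` / `label` keys as in the diff; `build_formats` is a hypothetical name, and the `ext` field and `_sort_formats` call are omitted:

```python
import re


def build_formats(file_list):
    """Build one format dict per usable entry in file_list[0]."""
    formats = []
    for item in file_list[0]:
        file_url = item.get('file')
        if not file_url:
            continue
        label = item.get('label')
        # A label such as "720p" carries the height; anything else gives None.
        mobj = re.search(r'(\d+)[pP]', label or '')
        formats.append({
            'url': file_url,
            'format_id': label,
            'height': int(mobj.group(1)) if mobj else None,
        })
    return formats


# Hypothetical file list.
sample_file_list = [[
    {'file': 'http://example.com/a.mp4', 'label': '720p'},
    {'label': '360p'},                     # no URL: skipped
    {'file': 'http://example.com/b.mp4'},  # no label: height stays None
]]
sample_formats = build_formats(sample_file_list)
```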


@@ -12,13 +12,13 @@ class AppleConnectIE(InfoExtractor):
_VALID_URL = r'https?://itunes\.apple\.com/\w{0,2}/?post/idsa\.(?P<id>[\w-]+)' _VALID_URL = r'https?://itunes\.apple\.com/\w{0,2}/?post/idsa\.(?P<id>[\w-]+)'
_TEST = { _TEST = {
'url': 'https://itunes.apple.com/us/post/idsa.4ab17a39-2720-11e5-96c5-a5b38f6c42d3', 'url': 'https://itunes.apple.com/us/post/idsa.4ab17a39-2720-11e5-96c5-a5b38f6c42d3',
'md5': '10d0f2799111df4cb1c924520ca78f98', 'md5': 'e7c38568a01ea45402570e6029206723',
'info_dict': { 'info_dict': {
'id': '4ab17a39-2720-11e5-96c5-a5b38f6c42d3', 'id': '4ab17a39-2720-11e5-96c5-a5b38f6c42d3',
'ext': 'm4v', 'ext': 'm4v',
'title': 'Energy', 'title': 'Energy',
'uploader': 'Drake', 'uploader': 'Drake',
'thumbnail': 'http://is5.mzstatic.com/image/thumb/Video5/v4/78/61/c5/7861c5fa-ad6d-294b-1464-cf7605b911d6/source/1920x1080sr.jpg', 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150710', 'upload_date': '20150710',
'timestamp': 1436545535, 'timestamp': 1436545535,
}, },


@@ -70,7 +70,8 @@ class AppleTrailersIE(InfoExtractor):
}, { }, {
'url': 'http://trailers.apple.com/trailers/magnolia/blackthorn/', 'url': 'http://trailers.apple.com/trailers/magnolia/blackthorn/',
'info_dict': { 'info_dict': {
'id': 'blackthorn', 'id': '4489',
'title': 'Blackthorn',
}, },
'playlist_mincount': 2, 'playlist_mincount': 2,
'expected_warnings': ['Unable to download JSON metadata'], 'expected_warnings': ['Unable to download JSON metadata'],
@@ -261,7 +262,7 @@ class AppleTrailersSectionIE(InfoExtractor):
'title': 'Most Popular', 'title': 'Most Popular',
'id': 'mostpopular', 'id': 'mostpopular',
}, },
'playlist_mincount': 80, 'playlist_mincount': 30,
}, { }, {
'url': 'http://trailers.apple.com/#section=moviestudios', 'url': 'http://trailers.apple.com/#section=moviestudios',
'info_dict': { 'info_dict': {


@@ -24,12 +24,12 @@ class ArchiveOrgIE(InfoExtractor):
} }
}, { }, {
'url': 'https://archive.org/details/Cops1922', 'url': 'https://archive.org/details/Cops1922',
'md5': 'bc73c8ab3838b5a8fc6c6651fa7b58ba', 'md5': '0869000b4ce265e8ca62738b336b268a',
'info_dict': { 'info_dict': {
'id': 'Cops1922', 'id': 'Cops1922',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Buster Keaton\'s "Cops" (1922)', 'title': 'Buster Keaton\'s "Cops" (1922)',
'description': 'md5:b4544662605877edd99df22f9620d858', 'description': 'md5:89e7c77bf5d965dd5c0372cfb49470f6',
} }
}, { }, {
'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect', 'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect',


@@ -93,6 +93,7 @@ class ARDMediathekIE(InfoExtractor):
duration = int_or_none(media_info.get('_duration')) duration = int_or_none(media_info.get('_duration'))
thumbnail = media_info.get('_previewImage') thumbnail = media_info.get('_previewImage')
is_live = media_info.get('_isLive') is True
subtitles = {} subtitles = {}
subtitle_url = media_info.get('_subtitleUrl') subtitle_url = media_info.get('_subtitleUrl')
@@ -106,6 +107,7 @@ class ARDMediathekIE(InfoExtractor):
'id': video_id, 'id': video_id,
'duration': duration, 'duration': duration,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'is_live': is_live,
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
} }
@@ -166,9 +168,11 @@ class ARDMediathekIE(InfoExtractor):
# determine video id from url # determine video id from url
m = re.match(self._VALID_URL, url) m = re.match(self._VALID_URL, url)
document_id = None
numid = re.search(r'documentId=([0-9]+)', url) numid = re.search(r'documentId=([0-9]+)', url)
if numid: if numid:
video_id = numid.group(1) document_id = video_id = numid.group(1)
else: else:
video_id = m.group('video_id') video_id = m.group('video_id')
@@ -228,12 +232,16 @@ class ARDMediathekIE(InfoExtractor):
'formats': formats, 'formats': formats,
} }
else: # request JSON file else: # request JSON file
if not document_id:
video_id = self._search_regex(
r'/play/(?:config|media)/(\d+)', webpage, 'media id')
info = self._extract_media_info( info = self._extract_media_info(
'http://www.ardmediathek.de/play/media/%s' % video_id, webpage, video_id) 'http://www.ardmediathek.de/play/media/%s' % video_id,
webpage, video_id)
info.update({ info.update({
'id': video_id, 'id': video_id,
'title': title, 'title': self._live_title(title) if info.get('is_live') else title,
'description': description, 'description': description,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
}) })


@@ -180,7 +180,7 @@ class ArteTVBaseIE(InfoExtractor):
class ArteTVPlus7IE(ArteTVBaseIE): class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7' IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/[^/]+/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/(?:[^/]+/)?(?P<lang>fr|de|en|es)/(?:videos/)?(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D', 'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
@@ -188,6 +188,9 @@ class ArteTVPlus7IE(ArteTVBaseIE):
}, { }, {
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22', 'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://www.arte.tv/de/videos/048696-000-A/der-kluge-bauch-unser-zweites-gehirn',
'only_matching': True,
}] }]
@classmethod @classmethod
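
The widened `_VALID_URL` can be checked against the newly added test URL with plain `re`. A quick sketch; `match_arte` is a hypothetical helper, and the extractor's own `suitable` classmethod may apply further checks that are not reproduced here:

```python
import re

# New pattern from the diff: the path segment before the language is now
# optional, and an optional "videos/" segment is allowed after it.
NEW_VALID_URL = (
    r'https?://(?:(?:www|sites)\.)?arte\.tv/(?:[^/]+/)?'
    r'(?P<lang>fr|de|en|es)/(?:videos/)?(?:[^/]+/)*(?P<id>[^/?#&]+)')


def match_arte(url):
    """Return (lang, id) if the URL matches the new pattern, else None."""
    mobj = re.match(NEW_VALID_URL, url)
    return mobj.group('lang', 'id') if mobj else None


new_style = match_arte(
    'http://www.arte.tv/de/videos/048696-000-A/'
    'der-kluge-bauch-unser-zweites-gehirn')
old_style = match_arte(
    'http://sites.arte.tv/karambolage/de/video/karambolage-22')
```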


@@ -0,0 +1,93 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .kaltura import KalturaIE
from ..utils import (
extract_attributes,
remove_end,
urlencode_postdata,
)
class AsianCrushIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'
_TESTS = [{
'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/',
'md5': 'c3b740e48d0ba002a42c0b72857beae6',
'info_dict': {
'id': '1_y4tmjm5r',
'ext': 'mp4',
'title': 'Women Who Flirt',
'description': 'md5:3db14e9186197857e7063522cb89a805',
'timestamp': 1496936429,
'upload_date': '20170608',
'uploader_id': 'craig@crifkin.com',
},
}, {
'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'https://www.asiancrush.com/wp-admin/admin-ajax.php', video_id,
data=urlencode_postdata({
'postid': video_id,
'action': 'get_channel_kaltura_vars',
}))
entry_id = data['entry_id']
return self.url_result(
'kaltura:%s:%s' % (data['partner_id'], entry_id),
ie=KalturaIE.ie_key(), video_id=entry_id,
video_title=data.get('vid_label'))
class AsianCrushPlaylistIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?asiancrush\.com/series/0+(?P<id>\d+)s\b'
_TEST = {
'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/',
'info_dict': {
'id': '12481',
'title': 'Scholar Who Walks the Night',
'description': 'md5:7addd7c5132a09fd4741152d96cce886',
},
'playlist_count': 20,
}
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = []
for mobj in re.finditer(
r'<a[^>]+href=(["\'])(?P<url>%s.*?)\1[^>]*>' % AsianCrushIE._VALID_URL,
webpage):
attrs = extract_attributes(mobj.group(0))
if attrs.get('class') == 'clearfix':
entries.append(self.url_result(
mobj.group('url'), ie=AsianCrushIE.ie_key()))
title = remove_end(
self._html_search_regex(
r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
'title', default=None) or self._og_search_title(
webpage, default=None) or self._html_search_meta(
'twitter:title', webpage, 'title',
default=None) or self._search_regex(
r'<title>([^<]+)</title>', webpage, 'title', fatal=False),
' | AsianCrush')
description = self._og_search_description(
webpage, default=None) or self._html_search_meta(
'twitter:description', webpage, 'description', fatal=False)
return self.playlist_result(entries, playlist_id, title, description)
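
Both AsianCrush patterns strip the leading zeros from the numeric part of the slug, which is why the playlist test expects the id `12481` for `012481s`. A short check of the video pattern against the two test URLs above, using plain `re.match` as a stand-in for `_match_id`; `asiancrush_video_id` is a hypothetical helper:

```python
import re

# Video pattern from the new extractor: optional series segment, then
# zero-padded digits followed by a literal "v".
VIDEO_URL_RE = (
    r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b')


def asiancrush_video_id(url):
    """Return the numeric id without leading zeros, or None if no match."""
    mobj = re.match(VIDEO_URL_RE, url)
    return mobj.group('id') if mobj else None


movie_id = asiancrush_video_id(
    'https://www.asiancrush.com/video/012869v/women-who-flirt/')
episode_id = asiancrush_video_id(
    'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/')
```

The bare number is what `_real_extract` then posts as `postid` to the `admin-ajax.php` endpoint.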


@@ -36,7 +36,7 @@ class AtresPlayerIE(InfoExtractor):
}, },
{ {
'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html', 'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html',
'md5': '0d0e918533bbd4b263f2de4d197d4aac', 'md5': '6e52cbb513c405e403dbacb7aacf8747',
'info_dict': { 'info_dict': {
'id': 'capitulo-112-david-bustamante', 'id': 'capitulo-112-david-bustamante',
'ext': 'flv', 'ext': 'flv',


@@ -16,7 +16,7 @@ class AudioBoomIE(InfoExtractor):
'title': '3/09/2016 Czaban Hour 3', 'title': '3/09/2016 Czaban Hour 3',
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans', 'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 2245.72, 'duration': 2245.72,
'uploader': 'Steve Czaban', 'uploader': 'SB Nation A.M.',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio', 'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
} }
}, { }, {
@@ -43,7 +43,7 @@ class AudioBoomIE(InfoExtractor):
def from_clip(field): def from_clip(field):
if clip: if clip:
clip.get(field) return clip.get(field)
audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property( audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
'audio', webpage, 'audio url') 'audio', webpage, 'audio url')
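
The one-word fix in `from_clip` matters because a Python function without a `return` statement always yields `None`, so every `from_clip(...) or fallback` expression silently used the fallback. A minimal sketch of both variants; the factory functions and the sample clip are hypothetical:

```python
def make_from_clip_old(clip):
    def from_clip(field):
        if clip:
            clip.get(field)  # value is computed and then discarded
    return from_clip


def make_from_clip_new(clip):
    def from_clip(field):
        if clip:
            return clip.get(field)
    return from_clip


# Hypothetical clip data.
sample_clip = {'clipURLPriorToLoading': 'http://example.com/audio.mp3'}

old_value = make_from_clip_old(sample_clip)('clipURLPriorToLoading')
new_value = make_from_clip_new(sample_clip)('clipURLPriorToLoading')
```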


@@ -1,140 +0,0 @@
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
sanitized_Request,
)
class AzubuIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/[^/]+#!/play/(?P<id>\d+)'
_TESTS = [
{
'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
'md5': 'a88b42fcf844f29ad6035054bd9ecaf4',
'info_dict': {
'id': '15575',
'ext': 'mp4',
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1417523507.334,
'upload_date': '20141202',
'duration': 9988.7,
'uploader': 'GSL',
'uploader_id': 414310,
'view_count': int,
},
},
{
'url': 'http://www.azubu.tv/FnaticTV#!/play/9344/-fnatic-at-worlds-2014:-toyz---%22i-love-rekkles,-he-has-amazing-mechanics%22-',
'md5': 'b72a871fe1d9f70bd7673769cdb3b925',
'info_dict': {
'id': '9344',
'ext': 'mp4',
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1410530893.320,
'upload_date': '20140912',
'duration': 172.385,
'uploader': 'FnaticTV',
'uploader_id': 272749,
'view_count': int,
},
'skip': 'Channel offline',
},
]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
title = data['title'].strip()
description = data.get('description')
thumbnail = data.get('thumbnail')
view_count = data.get('view_count')
user = data.get('user', {})
uploader = user.get('username')
uploader_id = user.get('id')
stream_params = json.loads(data['stream_params'])
timestamp = float_or_none(stream_params.get('creationDate'), 1000)
duration = float_or_none(stream_params.get('length'), 1000)
renditions = stream_params.get('renditions') or []
video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
if video:
renditions.append(video)
if not renditions and not user.get('channel', {}).get('is_live', True):
raise ExtractorError('%s said: channel is offline.' % self.IE_NAME, expected=True)
formats = [{
'url': fmt['url'],
'width': fmt['frameWidth'],
'height': fmt['frameHeight'],
'vbr': float_or_none(fmt['encodingRate'], 1000),
'filesize': fmt['size'],
'vcodec': fmt['videoCodec'],
'container': fmt['videoContainer'],
} for fmt in renditions if fmt['url']]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'duration': duration,
'uploader': uploader,
'uploader_id': uploader_id,
'view_count': view_count,
'formats': formats,
}
class AzubuLiveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/(?P<id>[^/]+)$'
_TESTS = [{
'url': 'http://www.azubu.tv/MarsTVMDLen',
'only_matching': True,
}, {
'url': 'http://azubu.uol.com.br/adolfz',
'only_matching': True,
}]
def _real_extract(self, url):
user = self._match_id(url)
info = self._download_json(
'http://api.azubu.tv/public/modules/last-video/{0}/info'.format(user),
user)['data']
if info['type'] != 'STREAM':
raise ExtractorError('{0} is not streaming live'.format(user), expected=True)
req = sanitized_Request(
'https://edge-elb.api.brightcove.com/playback/v1/accounts/3361910549001/videos/ref:' + info['reference_id'])
req.add_header('Accept', 'application/json;pk=BCpkADawqM1gvI0oGWg8dxQHlgT8HkdE2LnAlWAZkOlznO39bSZX726u4JqnDsK3MDXcO01JxXK2tZtJbgQChxgaFzEVdHRjaDoxaOu8hHOO8NYhwdxw9BzvgkvLUlpbDNUuDoc4E4wxDToV')
bc_info = self._download_json(req, user)
m3u8_url = next(source['src'] for source in bc_info['sources'] if source['container'] == 'M2TS')
formats = self._extract_m3u8_formats(m3u8_url, user, ext='mp4')
self._sort_formats(formats)
return {
'id': info['id'],
'title': self._live_title(info['title']),
'uploader_id': user,
'formats': formats,
'is_live': True,
'thumbnail': bc_info['poster'],
}


@@ -14,14 +14,16 @@ from ..utils import (
ExtractorError, ExtractorError,
float_or_none, float_or_none,
int_or_none, int_or_none,
KNOWN_EXTENSIONS,
parse_filesize, parse_filesize,
unescapeHTML, unescapeHTML,
update_url_query, update_url_query,
unified_strdate,
) )
class BandcampIE(InfoExtractor): class BandcampIE(InfoExtractor):
_VALID_URL = r'https?://.*?\.bandcamp\.com/track/(?P<title>.*)' _VALID_URL = r'https?://.*?\.bandcamp\.com/track/(?P<title>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song', 'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song',
'md5': 'c557841d5e50261777a6585648adf439', 'md5': 'c557841d5e50261777a6585648adf439',
@@ -34,12 +36,12 @@ class BandcampIE(InfoExtractor):
'_skip': 'There is a limit of 200 free downloads / month for the test song' '_skip': 'There is a limit of 200 free downloads / month for the test song'
}, { }, {
'url': 'http://benprunty.bandcamp.com/track/lanius-battle', 'url': 'http://benprunty.bandcamp.com/track/lanius-battle',
'md5': '73d0b3171568232574e45652f8720b5c', 'md5': '0369ace6b939f0927e62c67a1a8d9fa7',
'info_dict': { 'info_dict': {
'id': '2650410135', 'id': '2650410135',
'ext': 'mp3', 'ext': 'aiff',
'title': 'Lanius (Battle)', 'title': 'Ben Prunty - Lanius (Battle)',
'uploader': 'Ben Prunty Music', 'uploader': 'Ben Prunty',
}, },
}] }]
@@ -47,6 +49,7 @@ class BandcampIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
title = mobj.group('title') title = mobj.group('title')
webpage = self._download_webpage(url, title) webpage = self._download_webpage(url, title)
thumbnail = self._html_search_meta('og:image', webpage, default=None)
m_download = re.search(r'freeDownloadPage: "(.*?)"', webpage) m_download = re.search(r'freeDownloadPage: "(.*?)"', webpage)
if not m_download: if not m_download:
m_trackinfo = re.search(r'trackinfo: (.+),\s*?\n', webpage) m_trackinfo = re.search(r'trackinfo: (.+),\s*?\n', webpage)
@@ -75,6 +78,7 @@ class BandcampIE(InfoExtractor):
return { return {
'id': track_id, 'id': track_id,
'title': data['title'], 'title': data['title'],
'thumbnail': thumbnail,
'formats': formats, 'formats': formats,
'duration': float_or_none(data.get('duration')), 'duration': float_or_none(data.get('duration')),
} }
@@ -143,7 +147,7 @@ class BandcampIE(InfoExtractor):
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'thumbnail': info.get('thumb_url'), 'thumbnail': info.get('thumb_url') or thumbnail,
'uploader': info.get('artist'), 'uploader': info.get('artist'),
'artist': artist, 'artist': artist,
'track': track, 'track': track,
@@ -153,7 +157,7 @@ class BandcampIE(InfoExtractor):
class BandcampAlbumIE(InfoExtractor): class BandcampAlbumIE(InfoExtractor):
IE_NAME = 'Bandcamp:album' IE_NAME = 'Bandcamp:album'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^?#]+)|/?(?:$|[?#]))' _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^/?#&]+))?'
_TESTS = [{ _TESTS = [{
'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1', 'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -220,6 +224,12 @@ class BandcampAlbumIE(InfoExtractor):
'playlist_count': 2, 'playlist_count': 2,
}] }]
@classmethod
def suitable(cls, url):
return (False
if BandcampWeeklyIE.suitable(url) or BandcampIE.suitable(url)
else super(BandcampAlbumIE, cls).suitable(url))
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
uploader_id = mobj.group('subdomain') uploader_id = mobj.group('subdomain')
@@ -248,3 +258,92 @@ class BandcampAlbumIE(InfoExtractor):
'title': title, 'title': title,
'entries': entries, 'entries': entries,
} }
class BandcampWeeklyIE(InfoExtractor):
IE_NAME = 'Bandcamp:weekly'
_VALID_URL = r'https?://(?:www\.)?bandcamp\.com/?\?(?:.*?&)?show=(?P<id>\d+)'
_TESTS = [{
'url': 'https://bandcamp.com/?show=224',
'md5': 'b00df799c733cf7e0c567ed187dea0fd',
'info_dict': {
'id': '224',
'ext': 'opus',
'title': 'BC Weekly April 4th 2017 - Magic Moments',
'description': 'md5:5d48150916e8e02d030623a48512c874',
'duration': 5829.77,
'release_date': '20170404',
'series': 'Bandcamp Weekly',
'episode': 'Magic Moments',
'episode_number': 208,
'episode_id': '224',
}
}, {
'url': 'https://bandcamp.com/?blah/blah@&show=228',
'only_matching': True
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
blob = self._parse_json(
self._search_regex(
r'data-blob=(["\'])(?P<blob>{.+?})\1', webpage,
'blob', group='blob'),
video_id, transform_source=unescapeHTML)
show = blob['bcw_show']
# This is desired because any invalid show id redirects to `bandcamp.com`
# which happens to expose the latest Bandcamp Weekly episode.
show_id = int_or_none(show.get('show_id')) or int_or_none(video_id)
formats = []
for format_id, format_url in show['audio_stream'].items():
if not isinstance(format_url, compat_str):
continue
for known_ext in KNOWN_EXTENSIONS:
if known_ext in format_id:
ext = known_ext
break
else:
ext = None
formats.append({
'format_id': format_id,
'url': format_url,
'ext': ext,
'vcodec': 'none',
})
self._sort_formats(formats)
title = show.get('audio_title') or 'Bandcamp Weekly'
subtitle = show.get('subtitle')
if subtitle:
title += ' - %s' % subtitle
episode_number = None
seq = blob.get('bcw_seq')
if seq and isinstance(seq, list):
try:
episode_number = next(
int_or_none(e.get('episode_number'))
for e in seq
if isinstance(e, dict) and int_or_none(e.get('id')) == show_id)
except StopIteration:
pass
return {
'id': video_id,
'title': title,
'description': show.get('desc') or show.get('short_desc'),
'duration': float_or_none(show.get('audio_duration')),
'is_live': False,
'release_date': unified_strdate(show.get('published_date')),
'series': 'Bandcamp Weekly',
'episode': show.get('subtitle'),
'episode_number': episode_number,
'episode_id': compat_str(video_id),
'formats': formats
}
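
The extension guess in BandcampWeeklyIE relies on Python's for/else: the else branch runs only when the loop finishes without hitting break. A minimal standalone sketch of that step follows. `guess_ext` is a hypothetical helper name, and the short `KNOWN_EXTENSIONS` tuple is an illustrative stand-in for the one imported from `..utils`, not its actual contents.

```python
# Illustrative subset only (assumption); the real tuple is imported from ..utils.
KNOWN_EXTENSIONS = ('mp3', 'opus', 'flac', 'ogg')


def guess_ext(format_id):
    """Return the first known extension contained in format_id, else None."""
    for known_ext in KNOWN_EXTENSIONS:
        if known_ext in format_id:
            ext = known_ext
            break
    else:
        # Reached only when the loop ended without `break`,
        # i.e. no known extension occurs in format_id.
        ext = None
    return ext
```

Because the check is a substring test, the order of the tuple decides which extension wins when a format id contains more than one candidate.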


@@ -6,14 +6,18 @@ import itertools
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
clean_html,
dict_get, dict_get,
ExtractorError, ExtractorError,
float_or_none, float_or_none,
get_element_by_class,
int_or_none, int_or_none,
parse_duration, parse_duration,
parse_iso8601, parse_iso8601,
try_get, try_get,
unescapeHTML, unescapeHTML,
urlencode_postdata,
urljoin,
) )
from ..compat import ( from ..compat import (
compat_etree_fromstring, compat_etree_fromstring,
@@ -32,12 +36,15 @@ class BBCCoUkIE(InfoExtractor):
(?: (?:
programmes/(?!articles/)| programmes/(?!articles/)|
iplayer(?:/[^/]+)?/(?:episode/|playlist/)| iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
music/clips[/#]| music/(?:clips|audiovideo/popular)[/#]|
radio/player/ radio/player/
) )
(?P<id>%s)(?!/(?:episodes|broadcasts|clips)) (?P<id>%s)(?!/(?:episodes|broadcasts|clips))
''' % _ID_REGEX ''' % _ID_REGEX
_LOGIN_URL = 'https://account.bbc.com/signin'
_NETRC_MACHINE = 'bbc'
_MEDIASELECTOR_URLS = [ _MEDIASELECTOR_URLS = [
# Provides HQ HLS streams with even better quality than pc mediaset but fails # Provides HQ HLS streams with even better quality than pc mediaset but fails
# with geolocation in some cases when it's even not geo restricted at all (e.g. # with geolocation in some cases when it's even not geo restricted at all (e.g.
@@ -222,11 +229,46 @@ class BBCCoUkIE(InfoExtractor):
}, { }, {
'url': 'http://www.bbc.co.uk/radio/player/p03cchwf', 'url': 'http://www.bbc.co.uk/radio/player/p03cchwf',
'only_matching': True, 'only_matching': True,
- }
- ]
+ }, {
+ 'url': 'https://www.bbc.co.uk/music/audiovideo/popular#p055bc55',
+ 'only_matching': True,
+ }]
_USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8' _USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading signin page')
login_form = self._hidden_inputs(login_page)
login_form.update({
'username': username,
'password': password,
})
post_url = urljoin(self._LOGIN_URL, self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page,
'post url', default=self._LOGIN_URL, group='url'))
response, urlh = self._download_webpage_handle(
post_url, None, 'Logging in', data=urlencode_postdata(login_form),
headers={'Referer': self._LOGIN_URL})
if self._LOGIN_URL in urlh.geturl():
error = clean_html(get_element_by_class('form-message', response))
if error:
raise ExtractorError(
'Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_initialize(self):
self._login()
class MediaSelectionError(Exception): class MediaSelectionError(Exception):
def __init__(self, id): def __init__(self, id):
self.id = id self.id = id
@@ -483,6 +525,12 @@ class BBCCoUkIE(InfoExtractor):
webpage = self._download_webpage(url, group_id, 'Downloading video page') webpage = self._download_webpage(url, group_id, 'Downloading video page')
error = self._search_regex(
r'<div\b[^>]+\bclass=["\']smp__message delta["\'][^>]*>([^<]+)<',
webpage, 'error', default=None)
if error:
raise ExtractorError(error, expected=True)
programme_id = None programme_id = None
duration = None duration = None


@@ -6,18 +6,33 @@ from ..utils import (
ExtractorError, ExtractorError,
clean_html, clean_html,
compat_str, compat_str,
float_or_none,
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
try_get, try_get,
urljoin,
) )
class BeamProLiveIE(InfoExtractor): class BeamProBaseIE(InfoExtractor):
IE_NAME = 'Beam:live' _API_BASE = 'https://mixer.com/api/v1'
_VALID_URL = r'https?://(?:\w+\.)?beam\.pro/(?P<id>[^/?#&]+)'
_RATINGS = {'family': 0, 'teen': 13, '18+': 18} _RATINGS = {'family': 0, 'teen': 13, '18+': 18}
def _extract_channel_info(self, chan):
user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
return {
'uploader': chan.get('token') or try_get(
chan, lambda x: x['user']['username'], compat_str),
'uploader_id': compat_str(user_id) if user_id else None,
'age_limit': self._RATINGS.get(chan.get('audience')),
}
class BeamProLiveIE(BeamProBaseIE):
IE_NAME = 'Mixer:live'
_VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/(?P<id>[^/?#&]+)'
_TEST = { _TEST = {
'url': 'http://www.beam.pro/niterhayven', 'url': 'http://mixer.com/niterhayven',
'info_dict': { 'info_dict': {
'id': '261562', 'id': '261562',
'ext': 'mp4', 'ext': 'mp4',
@@ -38,11 +53,17 @@ class BeamProLiveIE(InfoExtractor):
}, },
} }
_MANIFEST_URL_TEMPLATE = '%s/channels/%%s/manifest.%%s' % BeamProBaseIE._API_BASE
@classmethod
def suitable(cls, url):
return False if BeamProVodIE.suitable(url) else super(BeamProLiveIE, cls).suitable(url)
def _real_extract(self, url): def _real_extract(self, url):
channel_name = self._match_id(url) channel_name = self._match_id(url)
chan = self._download_json( chan = self._download_json(
'https://beam.pro/api/v1/channels/%s' % channel_name, channel_name) '%s/channels/%s' % (self._API_BASE, channel_name), channel_name)
if chan.get('online') is False: if chan.get('online') is False:
raise ExtractorError( raise ExtractorError(
@@ -50,24 +71,118 @@ class BeamProLiveIE(InfoExtractor):
channel_id = chan['id'] channel_id = chan['id']
def manifest_url(kind):
return self._MANIFEST_URL_TEMPLATE % (channel_id, kind)
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
- 'https://beam.pro/api/v1/channels/%s/manifest.m3u8' % channel_id,
- channel_name, ext='mp4', m3u8_id='hls', fatal=False)
+ manifest_url('m3u8'), channel_name, ext='mp4', m3u8_id='hls',
+ fatal=False)
+ formats.extend(self._extract_smil_formats(
+ manifest_url('smil'), channel_name, fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id']) info = {
return {
'id': compat_str(chan.get('id') or channel_name), 'id': compat_str(chan.get('id') or channel_name),
'title': self._live_title(chan.get('name') or channel_name), 'title': self._live_title(chan.get('name') or channel_name),
'description': clean_html(chan.get('description')), 'description': clean_html(chan.get('description')),
'thumbnail': try_get(chan, lambda x: x['thumbnail']['url'], compat_str), 'thumbnail': try_get(
chan, lambda x: x['thumbnail']['url'], compat_str),
'timestamp': parse_iso8601(chan.get('updatedAt')), 'timestamp': parse_iso8601(chan.get('updatedAt')),
'uploader': chan.get('token') or try_get(
chan, lambda x: x['user']['username'], compat_str),
'uploader_id': compat_str(user_id) if user_id else None,
'age_limit': self._RATINGS.get(chan.get('audience')),
'is_live': True, 'is_live': True,
'view_count': int_or_none(chan.get('viewersTotal')), 'view_count': int_or_none(chan.get('viewersTotal')),
'formats': formats, 'formats': formats,
} }
info.update(self._extract_channel_info(chan))
return info
class BeamProVodIE(BeamProBaseIE):
IE_NAME = 'Mixer:vod'
_VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>\d+)'
_TEST = {
'url': 'https://mixer.com/willow8714?vod=2259830',
'md5': 'b2431e6e8347dc92ebafb565d368b76b',
'info_dict': {
'id': '2259830',
'ext': 'mp4',
'title': 'willow8714\'s Channel',
'duration': 6828.15,
'thumbnail': r're:https://.*source\.png$',
'timestamp': 1494046474,
'upload_date': '20170506',
'uploader': 'willow8714',
'uploader_id': '6085379',
'age_limit': 13,
'view_count': int,
},
'params': {
'skip_download': True,
},
}
@staticmethod
def _extract_format(vod, vod_type):
if not vod.get('baseUrl'):
return []
if vod_type == 'hls':
filename, protocol = 'manifest.m3u8', 'm3u8_native'
elif vod_type == 'raw':
filename, protocol = 'source.mp4', 'https'
else:
assert False
data = vod.get('data') if isinstance(vod.get('data'), dict) else {}
format_id = [vod_type]
if isinstance(data.get('Height'), compat_str):
format_id.append('%sp' % data['Height'])
return [{
'url': urljoin(vod['baseUrl'], filename),
'format_id': '-'.join(format_id),
'ext': 'mp4',
'protocol': protocol,
'width': int_or_none(data.get('Width')),
'height': int_or_none(data.get('Height')),
'fps': int_or_none(data.get('Fps')),
'tbr': int_or_none(data.get('Bitrate'), 1000),
}]
def _real_extract(self, url):
vod_id = self._match_id(url)
vod_info = self._download_json(
'%s/recordings/%s' % (self._API_BASE, vod_id), vod_id)
state = vod_info.get('state')
if state != 'AVAILABLE':
raise ExtractorError(
'VOD %s is not available (state: %s)' % (vod_id, state),
expected=True)
formats = []
thumbnail_url = None
for vod in vod_info['vods']:
vod_type = vod.get('format')
if vod_type in ('hls', 'raw'):
formats.extend(self._extract_format(vod, vod_type))
elif vod_type == 'thumbnail':
thumbnail_url = urljoin(vod.get('baseUrl'), 'source.png')
self._sort_formats(formats)
info = {
'id': vod_id,
'title': vod_info.get('name') or vod_id,
'duration': float_or_none(vod_info.get('duration')),
'thumbnail': thumbnail_url,
'timestamp': parse_iso8601(vod_info.get('createdAt')),
'view_count': int_or_none(vod_info.get('viewsTotal')),
'formats': formats,
}
info.update(self._extract_channel_info(vod_info.get('channel') or {}))
return info
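
The live pattern also matches VOD URLs, because `[^/?#&]+` stops right before the query string. That is why BeamProLiveIE overrides `suitable` to defer to BeamProVodIE. The sketch below reproduces the delegation outside youtube-dl. `BaseIE`, `LiveIE` and `VodIE` are hypothetical stand-ins, and the base `suitable` is a simplified assumption (a plain `re.match`), not the library's implementation.

```python
import re


class BaseIE(object):
    """Simplified stand-in for an extractor base class (assumption)."""
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        return re.match(cls._VALID_URL, url) is not None


class VodIE(BaseIE):
    _VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>\d+)'


class LiveIE(BaseIE):
    _VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/(?P<id>[^/?#&]+)'

    @classmethod
    def suitable(cls, url):
        # Give way to the more specific VOD extractor when it matches.
        return False if VodIE.suitable(url) else super(LiveIE, cls).suitable(url)
```

With this arrangement a URL such as `https://mixer.com/willow8714?vod=2259830` is claimed by the VOD class only, while a bare channel URL still goes to the live class.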


@@ -16,7 +16,7 @@ class BeegIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beeg\.com/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?beeg\.com/(?P<id>\d+)'
_TEST = { _TEST = {
'url': 'http://beeg.com/5416503', 'url': 'http://beeg.com/5416503',
'md5': '46c384def73b33dbc581262e5ee67cef', 'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820',
'info_dict': { 'info_dict': {
'id': '5416503', 'id': '5416503',
'ext': 'mp4', 'ext': 'mp4',


@@ -54,6 +54,22 @@ class BiliBiliIE(InfoExtractor):
'description': '如果你是神明并且能够让妄想成为现实。那你会进行怎么样的妄想是淫靡的世界独裁社会毁灭性的制裁还是……2015年涩谷。从6年前发生的大灾害“涩谷地震”之后复兴了的这个街区里新设立的私立高中...', 'description': '如果你是神明并且能够让妄想成为现实。那你会进行怎么样的妄想是淫靡的世界独裁社会毁灭性的制裁还是……2015年涩谷。从6年前发生的大灾害“涩谷地震”之后复兴了的这个街区里新设立的私立高中...',
}, },
'skip': 'Geo-restricted to China', 'skip': 'Geo-restricted to China',
}, {
# Title with double quotes
'url': 'http://www.bilibili.com/video/av8903802/',
'info_dict': {
'id': '8903802',
'ext': 'mp4',
'title': '阿滴英文|英文歌分享#6 "Closer',
'description': '滴妹今天唱Closer給你聽! 有史以来,被推最多次也是最久的歌曲,其实歌词跟我原本想像差蛮多的,不过还是好听! 微博@阿滴英文',
'uploader': '阿滴英文',
'uploader_id': '65880958',
'timestamp': 1488382620,
'upload_date': '20170301',
},
'params': {
'skip_download': True, # Test metadata only
},
}] }]
_APP_KEY = '84956560bc028eb7' _APP_KEY = '84956560bc028eb7'
@@ -122,6 +138,11 @@ class BiliBiliIE(InfoExtractor):
'preference': -2 if 'hd.mp4' in backup_url else -3, 'preference': -2 if 'hd.mp4' in backup_url else -3,
}) })
for a_format in formats:
a_format.setdefault('http_headers', {}).update({
'Referer': url,
})
self._sort_formats(formats) self._sort_formats(formats)
entries.append({ entries.append({
@@ -130,7 +151,7 @@ class BiliBiliIE(InfoExtractor):
'formats': formats, 'formats': formats,
}) })
title = self._html_search_regex('<h1[^>]+title="([^"]+)">', webpage, 'title') title = self._html_search_regex('<h1[^>]*>([^<]+)</h1>', webpage, 'title')
description = self._html_search_meta('description', webpage) description = self._html_search_meta('description', webpage)
timestamp = unified_timestamp(self._html_search_regex( timestamp = unified_timestamp(self._html_search_regex(
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None)) r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None))
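
The new loop in BiliBiliIE attaches the page URL as a `Referer` header to every format without discarding headers a format may already carry. A small sketch of that `setdefault(...).update(...)` idiom, using made-up format dicts and a hypothetical helper name `add_referer`:

```python
def add_referer(formats, page_url):
    """Attach a Referer header to every format dict, keeping existing headers."""
    for a_format in formats:
        # setdefault creates the header dict only if it is missing,
        # so headers set earlier survive the update.
        a_format.setdefault('http_headers', {}).update({'Referer': page_url})
    return formats


# Example data (invented for illustration).
formats = [
    {'url': 'http://example.com/a.mp4'},
    {'url': 'http://example.com/b.mp4', 'http_headers': {'User-Agent': 'demo'}},
]
add_referer(formats, 'http://www.bilibili.com/video/av8903802/')
```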


@@ -35,7 +35,7 @@ class BleacherReportIE(InfoExtractor):
'title': 'Aussie Golfers Get Fright of Their Lives After Being Chased by Angry Kangaroo', 'title': 'Aussie Golfers Get Fright of Their Lives After Being Chased by Angry Kangaroo',
'timestamp': 1446839961, 'timestamp': 1446839961,
'uploader': 'Sean Fay', 'uploader': 'Sean Fay',
'description': 'md5:825e94e0f3521df52fa83b2ed198fa20', 'description': 'md5:b1601e2314c4d8eec23b6eafe086a757',
'uploader_id': 6466954, 'uploader_id': 6466954,
'upload_date': '20151011', 'upload_date': '20151011',
}, },
@@ -90,17 +90,13 @@ class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36})' _VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36})'
_TESTS = [{ _TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1', 'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'md5': '8c2c12e3af7805152675446c905d159b', 'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
'info_dict': { 'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1', 'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'ext': 'mp4', 'ext': 'flv',
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division', 'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e', 'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
}, },
'params': {
# m3u8 download
'skip_download': True,
},
}] }]
def _real_extract(self, url): def _real_extract(self, url):


@@ -77,7 +77,7 @@ class BRIE(InfoExtractor):
'description': 'md5:bb659990e9e59905c3d41e369db1fbe3', 'description': 'md5:bb659990e9e59905c3d41e369db1fbe3',
'duration': 893, 'duration': 893,
'uploader': 'Eva Maria Steimle', 'uploader': 'Eva Maria Steimle',
'upload_date': '20140117', 'upload_date': '20170208',
} }
}, },
] ]


@@ -5,6 +5,7 @@ import re
import json import json
from .common import InfoExtractor from .common import InfoExtractor
from .adobepass import AdobePassIE
from ..compat import ( from ..compat import (
compat_etree_fromstring, compat_etree_fromstring,
compat_parse_qs, compat_parse_qs,
@@ -131,6 +132,12 @@ class BrightcoveLegacyIE(InfoExtractor):
}, },
'playlist_mincount': 10, 'playlist_mincount': 10,
}, },
{
# playerID inferred from bcpid
# from http://www.un.org/chinese/News/story.asp?NewsID=27724
'url': 'https://link.brightcove.com/services/player/bcpid1722935254001/?bctid=5360463607001&autoStart=false&secureConnections=true&width=650&height=350',
'only_matching': True, # Tested in GenericIE
}
] ]
FLV_VCODECS = { FLV_VCODECS = {
1: 'SORENSON', 1: 'SORENSON',
@@ -266,9 +273,13 @@ class BrightcoveLegacyIE(InfoExtractor):
if matches: if matches:
return list(filter(None, [cls._build_brighcove_url(m) for m in matches])) return list(filter(None, [cls._build_brighcove_url(m) for m in matches]))
- return list(filter(None, [
- cls._build_brighcove_url_from_js(custom_bc)
- for custom_bc in re.findall(r'(customBC\.createVideo\(.+?\);)', webpage)]))
+ matches = re.findall(r'(customBC\.createVideo\(.+?\);)', webpage)
+ if matches:
+ return list(filter(None, [
+ cls._build_brighcove_url_from_js(custom_bc)
+ for custom_bc in matches]))
+ return [src for _, src in re.findall(
+ r'<iframe[^>]+src=([\'"])((?:https?:)?//link\.brightcove\.com/services/player/(?!\1).+)\1', webpage)]
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
@@ -285,6 +296,10 @@ class BrightcoveLegacyIE(InfoExtractor):
if videoPlayer: if videoPlayer:
# We set the original url as the default 'Referer' header # We set the original url as the default 'Referer' header
referer = smuggled_data.get('Referer', url) referer = smuggled_data.get('Referer', url)
if 'playerID' not in query:
mobj = re.search(r'/bcpid(\d+)', url)
if mobj is not None:
query['playerID'] = [mobj.group(1)]
return self._get_video_info( return self._get_video_info(
videoPlayer[0], query, referer=referer) videoPlayer[0], query, referer=referer)
elif 'playerKey' in query: elif 'playerKey' in query:
@@ -434,7 +449,7 @@ class BrightcoveLegacyIE(InfoExtractor):
return info return info
class BrightcoveNewIE(InfoExtractor): class BrightcoveNewIE(AdobePassIE):
IE_NAME = 'brightcove:new' IE_NAME = 'brightcove:new'
_VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>\d+|ref:[^&]+)' _VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>\d+|ref:[^&]+)'
_TESTS = [{ _TESTS = [{
@@ -484,8 +499,8 @@ class BrightcoveNewIE(InfoExtractor):
}] }]
@staticmethod @staticmethod
def _extract_url(webpage): def _extract_url(ie, webpage):
urls = BrightcoveNewIE._extract_urls(webpage) urls = BrightcoveNewIE._extract_urls(ie, webpage)
return urls[0] if urls else None return urls[0] if urls else None
@staticmethod @staticmethod
@@ -508,7 +523,7 @@ class BrightcoveNewIE(InfoExtractor):
# [2] looks like: # [2] looks like:
for video, script_tag, account_id, player_id, embed in re.findall( for video, script_tag, account_id, player_id, embed in re.findall(
r'''(?isx) r'''(?isx)
(<video\s+[^>]+>) (<video\s+[^>]*\bdata-video-id\s*=\s*['"]?[^>]+>)
(?:.*? (?:.*?
(<script[^>]+ (<script[^>]+
src=["\'](?:https?:)?//players\.brightcove\.net/ src=["\'](?:https?:)?//players\.brightcove\.net/
@@ -588,6 +603,20 @@ class BrightcoveNewIE(InfoExtractor):
raise ExtractorError(message, expected=True) raise ExtractorError(message, expected=True)
raise raise
errors = json_data.get('errors')
if errors and errors[0].get('error_subcode') == 'TVE_AUTH':
custom_fields = json_data['custom_fields']
tve_token = self._extract_mvpd_auth(
smuggled_data['source_url'], video_id,
custom_fields['bcadobepassrequestorid'],
custom_fields['bcadobepassresourceid'])
json_data = self._download_json(
api_url, video_id, headers={
'Accept': 'application/json;pk=%s' % policy_key
}, query={
'tveToken': tve_token,
})
title = json_data['name'].strip() title = json_data['name'].strip()
formats = [] formats = []
@@ -653,7 +682,6 @@ class BrightcoveNewIE(InfoExtractor):
}) })
formats.append(f) formats.append(f)
errors = json_data.get('errors')
if not formats and errors: if not formats and errors:
error = errors[0] error = errors[0]
raise ExtractorError( raise ExtractorError(
@@ -670,7 +698,7 @@ class BrightcoveNewIE(InfoExtractor):
is_live = False is_live = False
duration = float_or_none(json_data.get('duration'), 1000) duration = float_or_none(json_data.get('duration'), 1000)
if duration and duration < 0: if duration is not None and duration <= 0:
is_live = True is_live = True
return { return {
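
The change to the live check matters for a reported duration of exactly 0: the old test `duration and duration < 0` is falsy for `0.0`, while the new one treats any non-positive duration as live and still ignores a missing value. A sketch of the difference follows. `float_or_none` here is a simplified stand-in written for this example (assumption), not the helper from `..utils`, and `looks_live_old` / `looks_live_new` are hypothetical names.

```python
def float_or_none(v, scale=1):
    """Simplified stand-in: convert to float and divide by scale, or return None."""
    if v is None:
        return None
    try:
        return float(v) / scale
    except (TypeError, ValueError):
        return None


def looks_live_old(raw_duration_ms):
    duration = float_or_none(raw_duration_ms, 1000)
    # A duration of 0.0 is falsy, so it never counts as live here.
    return bool(duration and duration < 0)


def looks_live_new(raw_duration_ms):
    duration = float_or_none(raw_duration_ms, 1000)
    # None still means "unknown"; zero and negative values mean live.
    return duration is not None and duration <= 0
```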


@@ -84,9 +84,10 @@ class BuzzFeedIE(InfoExtractor):
continue continue
entries.append(self.url_result(video['url'])) entries.append(self.url_result(video['url']))
- facebook_url = FacebookIE._extract_url(webpage)
- if facebook_url:
- entries.append(self.url_result(facebook_url))
+ facebook_urls = FacebookIE._extract_urls(webpage)
+ entries.extend([
+ self.url_result(facebook_url)
+ for facebook_url in facebook_urls])
return { return {
'_type': 'playlist', '_type': 'playlist',


@@ -16,13 +16,10 @@ class Canalc2IE(InfoExtractor):
'md5': '060158428b650f896c542dfbb3d6487f', 'md5': '060158428b650f896c542dfbb3d6487f',
'info_dict': { 'info_dict': {
'id': '12163', 'id': '12163',
'ext': 'flv', 'ext': 'mp4',
'title': 'Terrasses du Numérique', 'title': 'Terrasses du Numérique',
'duration': 122, 'duration': 122,
}, },
'params': {
'skip_download': True, # Requires rtmpdump
}
}, { }, {
'url': 'http://archives-canalc2.u-strasbg.fr/video.asp?idVideo=11427&voir=oui', 'url': 'http://archives-canalc2.u-strasbg.fr/video.asp?idVideo=11427&voir=oui',
'only_matching': True, 'only_matching': True,


@@ -96,6 +96,7 @@ class CBCIE(InfoExtractor):
'info_dict': { 'info_dict': {
'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks', 'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
'id': 'dog-indoor-exercise-winter-1.3928238', 'id': 'dog-indoor-exercise-winter-1.3928238',
'description': 'md5:c18552e41726ee95bd75210d1ca9194c',
}, },
'playlist_mincount': 6, 'playlist_mincount': 6,
}] }]
@@ -165,12 +166,11 @@ class CBCPlayerIE(InfoExtractor):
'uploader': 'CBCC-NEW', 'uploader': 'CBCC-NEW',
}, },
}, { }, {
# available only when we add `formats=MPEG4,FLV,MP3` to theplatform url
'url': 'http://www.cbc.ca/player/play/2164402062', 'url': 'http://www.cbc.ca/player/play/2164402062',
'md5': '17a61eb813539abea40618d6323a7f82', 'md5': '33fcd8f6719b9dd60a5e73adcb83b9f6',
'info_dict': { 'info_dict': {
'id': '2164402062', 'id': '2164402062',
'ext': 'flv', 'ext': 'mp4',
'title': 'Cancer survivor four times over', 'title': 'Cancer survivor four times over',
'description': 'Tim Mayer has beaten three different forms of cancer four times in five years.', 'description': 'Tim Mayer has beaten three different forms of cancer four times in five years.',
'timestamp': 1320410746, 'timestamp': 1320410746,


@@ -49,13 +49,13 @@ class CBSIE(CBSBaseIE):
'only_matching': True, 'only_matching': True,
}] }]
def _extract_video_info(self, content_id): def _extract_video_info(self, content_id, site='cbs', mpx_acc=2198311517):
items_data = self._download_xml( items_data = self._download_xml(
'http://can.cbs.com/thunder/player/videoPlayerService.php', 'http://can.cbs.com/thunder/player/videoPlayerService.php',
content_id, query={'partner': 'cbs', 'contentId': content_id}) content_id, query={'partner': site, 'contentId': content_id})
video_data = xpath_element(items_data, './/item') video_data = xpath_element(items_data, './/item')
title = xpath_text(video_data, 'videoTitle', 'title', True) title = xpath_text(video_data, 'videoTitle', 'title', True)
tp_path = 'dJ5BDC/media/guid/2198311517/%s' % content_id tp_path = 'dJ5BDC/media/guid/%d/%s' % (mpx_acc, content_id)
tp_release_url = 'http://link.theplatform.com/s/' + tp_path tp_release_url = 'http://link.theplatform.com/s/' + tp_path
asset_types = [] asset_types = []


@@ -3,17 +3,18 @@ from __future__ import unicode_literals
import re import re
from .theplatform import ThePlatformIE from .cbs import CBSIE
from ..utils import int_or_none from ..utils import int_or_none
class CBSInteractiveIE(ThePlatformIE): class CBSInteractiveIE(CBSIE):
_VALID_URL = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video/share)/(?P<id>[^/?]+)' _VALID_URL = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video(?:/share)?)/(?P<id>[^/?]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/', 'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
'info_dict': { 'info_dict': {
'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60', 'id': 'R49SYt__yAfmlXR85z4f7gNmCBDcN_00',
'ext': 'flv', 'display_id': 'hands-on-with-microsofts-windows-8-1-update',
'ext': 'mp4',
'title': 'Hands-on with Microsoft Windows 8.1 Update', 'title': 'Hands-on with Microsoft Windows 8.1 Update',
'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.', 'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.',
'uploader_id': '6085384d-619e-11e3-b231-14feb5ca9861', 'uploader_id': '6085384d-619e-11e3-b231-14feb5ca9861',
@@ -22,13 +23,19 @@ class CBSInteractiveIE(ThePlatformIE):
'timestamp': 1396479627, 'timestamp': 1396479627,
'upload_date': '20140402', 'upload_date': '20140402',
}, },
'params': {
# m3u8 download
'skip_download': True,
},
}, { }, {
'url': 'http://www.cnet.com/videos/whiny-pothole-tweets-at-local-government-when-hit-by-cars-tomorrow-daily-187/', 'url': 'http://www.cnet.com/videos/whiny-pothole-tweets-at-local-government-when-hit-by-cars-tomorrow-daily-187/',
'md5': 'f11d27b2fa18597fbf92444d2a9ed386',
'info_dict': { 'info_dict': {
'id': '56527b93-d25d-44e3-b738-f989ce2e49ba', 'id': 'kjOJd_OoVJqbg_ZD8MZCOk8Wekb9QccK',
'ext': 'flv', 'display_id': 'whiny-pothole-tweets-at-local-government-when-hit-by-cars-tomorrow-daily-187',
'ext': 'mp4',
'title': 'Whiny potholes tweet at local government when hit by cars (Tomorrow Daily 187)', 'title': 'Whiny potholes tweet at local government when hit by cars (Tomorrow Daily 187)',
'description': 'Khail and Ashley wonder what other civic woes can be solved by self-tweeting objects, investigate a new kind of VR camera and watch an origami robot self-assemble, walk, climb, dig and dissolve. #TDPothole', 'description': 'md5:d2b9a95a5ffe978ae6fbd4cf944d618f',
'uploader_id': 'b163284d-6b73-44fc-b3e6-3da66c392d40', 'uploader_id': 'b163284d-6b73-44fc-b3e6-3da66c392d40',
'uploader': 'Ashley Esqueda', 'uploader': 'Ashley Esqueda',
'duration': 1482, 'duration': 1482,
@@ -38,23 +45,28 @@ class CBSInteractiveIE(ThePlatformIE):
}, { }, {
'url': 'http://www.zdnet.com/video/share/video-keeping-android-smartphones-and-tablets-secure/', 'url': 'http://www.zdnet.com/video/share/video-keeping-android-smartphones-and-tablets-secure/',
'info_dict': { 'info_dict': {
'id': 'bc1af9f0-a2b5-4e54-880d-0d95525781c0', 'id': 'k0r4T_ehht4xW_hAOqiVQPuBDPZ8SRjt',
'display_id': 'video-keeping-android-smartphones-and-tablets-secure',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Video: Keeping Android smartphones and tablets secure', 'title': 'Video: Keeping Android smartphones and tablets secure',
'description': 'Here\'s the best way to keep Android devices secure, and what you do when they\'ve come to the end of their lives.', 'description': 'Here\'s the best way to keep Android devices secure, and what you do when they\'ve come to the end of their lives.',
'uploader_id': 'f2d97ea2-8175-11e2-9d12-0018fe8a00b0', 'uploader_id': 'f2d97ea2-8175-11e2-9d12-0018fe8a00b0',
'uploader': 'Adrian Kingsley-Hughes', 'uploader': 'Adrian Kingsley-Hughes',
'timestamp': 1448961720, 'duration': 731,
'upload_date': '20151201', 'timestamp': 1449129925,
'upload_date': '20151203',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} },
}, {
'url': 'http://www.zdnet.com/video/huawei-matebook-x-video/',
'only_matching': True,
}] }]
TP_RELEASE_URL_TEMPLATE = 'http://link.theplatform.com/s/kYEXFC/%s?mbr=true'
MPX_ACCOUNTS = { MPX_ACCOUNTS = {
'cnet': 2288573011, 'cnet': 2198311517,
'zdnet': 2387448114, 'zdnet': 2387448114,
} }
@@ -68,7 +80,8 @@ class CBSInteractiveIE(ThePlatformIE):
data = self._parse_json(data_json, display_id) data = self._parse_json(data_json, display_id)
vdata = data.get('video') or data['videos'][0] vdata = data.get('video') or data['videos'][0]
video_id = vdata['id'] video_id = vdata['mpxRefId']
title = vdata['title'] title = vdata['title']
author = vdata.get('author') author = vdata.get('author')
if author: if author:
@@ -78,20 +91,7 @@ class CBSInteractiveIE(ThePlatformIE):
uploader = None uploader = None
uploader_id = None uploader_id = None
media_guid_path = 'media/guid/%d/%s' % (self.MPX_ACCOUNTS[site], vdata['mpxRefId']) info = self._extract_video_info(video_id, site, self.MPX_ACCOUNTS[site])
formats, subtitles = [], {}
for (fkey, vid) in vdata['files'].items():
if fkey == 'hls_phone' and 'hls_tablet' in vdata['files']:
continue
release_url = self.TP_RELEASE_URL_TEMPLATE % vid
if fkey == 'hds':
release_url += '&manifest=f4m'
tp_formats, tp_subtitles = self._extract_theplatform_smil(release_url, video_id, 'Downloading %s SMIL data' % fkey)
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
self._sort_formats(formats)
info = self._extract_theplatform_metadata('kYEXFC/%s' % media_guid_path, video_id)
info.update({ info.update({
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
@@ -99,7 +99,5 @@ class CBSInteractiveIE(ThePlatformIE):
'duration': int_or_none(vdata.get('duration')), 'duration': int_or_none(vdata.get('duration')),
'uploader': uploader, 'uploader': uploader,
'uploader_id': uploader_id, 'uploader_id': uploader_id,
'subtitles': subtitles,
'formats': formats,
}) })
return info return info

@@ -60,8 +60,8 @@ class CBSLocalIE(AnvatoIE):
'title': 'A Very Blue Anniversary', 'title': 'A Very Blue Anniversary',
'description': 'CBS2s Cindy Hsu has more.', 'description': 'CBS2s Cindy Hsu has more.',
'thumbnail': 're:^https?://.*', 'thumbnail': 're:^https?://.*',
'timestamp': 1479962220, 'timestamp': int,
'upload_date': '20161124', 'upload_date': r're:^\d{8}$',
'uploader': 'CBS', 'uploader': 'CBS',
'subtitles': { 'subtitles': {
'en': 'mincount:5', 'en': 'mincount:5',

@@ -15,19 +15,23 @@ class CBSNewsIE(CBSIE):
_TESTS = [ _TESTS = [
{ {
'url': 'http://www.cbsnews.com/news/tesla-and-spacex-elon-musks-industrial-empire/', # 60 minutes
'url': 'http://www.cbsnews.com/news/artificial-intelligence-positioned-to-be-a-game-changer/',
'info_dict': { 'info_dict': {
'id': 'tesla-and-spacex-elon-musks-industrial-empire', 'id': '_B6Ga3VJrI4iQNKsir_cdFo9Re_YJHE_',
'ext': 'flv', 'ext': 'mp4',
'title': 'Tesla and SpaceX: Elon Musk\'s industrial empire', 'title': 'Artificial Intelligence',
'thumbnail': 'http://beta.img.cbsnews.com/i/2014/03/30/60147937-2f53-4565-ad64-1bdd6eb64679/60-0330-pelley-640x360.jpg', 'description': 'md5:8818145f9974431e0fb58a1b8d69613c',
'duration': 791, 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1606,
'uploader': 'CBSI-NEW',
'timestamp': 1498431900,
'upload_date': '20170625',
}, },
'params': { 'params': {
# rtmp download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'skip': 'Subscribers only',
}, },
{ {
'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/', 'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
@@ -52,6 +56,22 @@ class CBSNewsIE(CBSIE):
'skip_download': True, 'skip_download': True,
}, },
}, },
{
# 48 hours
'url': 'http://www.cbsnews.com/news/maria-ridulph-murder-will-the-nations-oldest-cold-case-to-go-to-trial-ever-get-solved/',
'info_dict': {
'id': 'QpM5BJjBVEAUFi7ydR9LusS69DPLqPJ1',
'ext': 'mp4',
'title': 'Cold as Ice',
'description': 'Can a childhood memory of a friend\'s murder solve a 1957 cold case? "48 Hours" correspondent Erin Moriarty has the latest.',
'upload_date': '20170604',
'timestamp': 1496538000,
'uploader': 'CBSI-NEW',
},
'params': {
'skip_download': True,
},
},
] ]
def _real_extract(self, url): def _real_extract(self, url):
@@ -60,12 +80,18 @@ class CBSNewsIE(CBSIE):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
video_info = self._parse_json(self._html_search_regex( video_info = self._parse_json(self._html_search_regex(
r'(?:<ul class="media-list items" id="media-related-items"><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'', r'(?:<ul class="media-list items" id="media-related-items"[^>]*><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
webpage, 'video JSON info'), video_id) webpage, 'video JSON info', default='{}'), video_id, fatal=False)
item = video_info['item'] if 'item' in video_info else video_info if video_info:
guid = item['mpxRefId'] item = video_info['item'] if 'item' in video_info else video_info
return self._extract_video_info(guid) else:
state = self._parse_json(self._search_regex(
r'data-cbsvideoui-options=(["\'])(?P<json>{.+?})\1', webpage,
'playlist JSON info', group='json'), video_id)['state']
item = state['playlist'][state['pid']]
return self._extract_video_info(item['mpxRefId'], 'cbsnews')
class CBSNewsLiveVideoIE(InfoExtractor): class CBSNewsLiveVideoIE(InfoExtractor):

@@ -9,7 +9,10 @@ from ..utils import (
ExtractorError, ExtractorError,
float_or_none, float_or_none,
int_or_none, int_or_none,
multipart_encode,
parse_duration, parse_duration,
random_birthday,
urljoin,
) )
@@ -27,7 +30,8 @@ class CDAIE(InfoExtractor):
'description': 'md5:269ccd135d550da90d1662651fcb9772', 'description': 'md5:269ccd135d550da90d1662651fcb9772',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'average_rating': float, 'average_rating': float,
'duration': 39 'duration': 39,
'age_limit': 0,
} }
}, { }, {
'url': 'http://www.cda.pl/video/57413289', 'url': 'http://www.cda.pl/video/57413289',
@@ -41,13 +45,41 @@ class CDAIE(InfoExtractor):
'uploader': 'crash404', 'uploader': 'crash404',
'view_count': int, 'view_count': int,
'average_rating': float, 'average_rating': float,
'duration': 137 'duration': 137,
'age_limit': 0,
} }
}, {
# Age-restricted
'url': 'http://www.cda.pl/video/1273454c4',
'info_dict': {
'id': '1273454c4',
'ext': 'mp4',
'title': 'Bronson (2008) napisy HD 1080p',
'description': 'md5:1b6cb18508daf2dc4e0fa4db77fec24c',
'height': 1080,
'uploader': 'boniek61',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 5554,
'age_limit': 18,
'view_count': int,
'average_rating': float,
},
}, { }, {
'url': 'http://ebd.cda.pl/0x0/5749950c', 'url': 'http://ebd.cda.pl/0x0/5749950c',
'only_matching': True, 'only_matching': True,
}] }]
def _download_age_confirm_page(self, url, video_id, *args, **kwargs):
form_data = random_birthday('rok', 'miesiac', 'dzien')
form_data.update({'return': url, 'module': 'video', 'module_id': video_id})
data, content_type = multipart_encode(form_data)
return self._download_webpage(
urljoin(url, '/a/validatebirth'), video_id, *args,
data=data, headers={
'Referer': url,
'Content-Type': content_type,
}, **kwargs)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
self._set_cookie('cda.pl', 'cda.player', 'html5') self._set_cookie('cda.pl', 'cda.player', 'html5')
@@ -57,6 +89,13 @@ class CDAIE(InfoExtractor):
if 'Ten film jest dostępny dla użytkowników premium' in webpage: if 'Ten film jest dostępny dla użytkowników premium' in webpage:
raise ExtractorError('This video is only available for premium users.', expected=True) raise ExtractorError('This video is only available for premium users.', expected=True)
need_confirm_age = False
if self._html_search_regex(r'(<form[^>]+action="/a/validatebirth")',
webpage, 'birthday validate form', default=None):
webpage = self._download_age_confirm_page(
url, video_id, note='Confirming age')
need_confirm_age = True
formats = [] formats = []
uploader = self._search_regex(r'''(?x) uploader = self._search_regex(r'''(?x)
@@ -81,6 +120,7 @@ class CDAIE(InfoExtractor):
'thumbnail': self._og_search_thumbnail(webpage), 'thumbnail': self._og_search_thumbnail(webpage),
'formats': formats, 'formats': formats,
'duration': None, 'duration': None,
'age_limit': 18 if need_confirm_age else 0,
} }
def extract_format(page, version): def extract_format(page, version):
@@ -121,7 +161,12 @@ class CDAIE(InfoExtractor):
for href, resolution in re.findall( for href, resolution in re.findall(
r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)', r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)',
webpage): webpage):
webpage = self._download_webpage( if need_confirm_age:
handler = self._download_age_confirm_page
else:
handler = self._download_webpage
webpage = handler(
self._BASE_URL + href, video_id, self._BASE_URL + href, video_id,
'Downloading %s version information' % resolution, fatal=False) 'Downloading %s version information' % resolution, fatal=False)
if not webpage: if not webpage:
@@ -129,6 +174,7 @@ class CDAIE(InfoExtractor):
# invalid version is requested. # invalid version is requested.
self.report_warning('Unable to download %s version information' % resolution) self.report_warning('Unable to download %s version information' % resolution)
continue continue
extract_format(webpage, resolution) extract_format(webpage, resolution)
self._sort_formats(formats) self._sort_formats(formats)
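
The age-confirmation step added to CDAIE above posts a birthday form to `/a/validatebirth`. A minimal sketch of how that form payload is assembled follows. It assumes a stand-in for `random_birthday` (the real helper lives in youtube-dl's `utils`; the year range and the day cap of 28 here are illustrative assumptions, and `random_birthday_form` is a hypothetical name), and it does not perform the multipart encoding or the HTTP request.

```python
import random


def random_birthday_form(url, video_id, rng=random):
    """Build the form fields for the age-confirmation POST.

    The field names 'rok', 'miesiac', 'dzien', 'return', 'module' and
    'module_id' come from the diff. The date ranges are assumptions made
    for this sketch, not the behaviour of the real random_birthday helper.
    """
    form = {
        'rok': str(rng.randint(1950, 1995)),    # year (assumed range)
        'miesiac': str(rng.randint(1, 12)),     # month
        'dzien': str(rng.randint(1, 28)),       # day, capped so any month is valid
    }
    form.update({'return': url, 'module': 'video', 'module_id': video_id})
    return form


if __name__ == '__main__':
    print(random_birthday_form('http://www.cda.pl/video/1273454c4', '1273454c4'))
```

In the extractor itself the resulting dictionary is passed through `multipart_encode`, and the request carries the returned content type plus a `Referer` header.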

@@ -9,12 +9,20 @@ from ..utils import (
class CinchcastIE(InfoExtractor): class CinchcastIE(InfoExtractor):
_VALID_URL = r'https?://player\.cinchcast\.com/.*?assetId=(?P<id>[0-9]+)' _VALID_URL = r'https?://player\.cinchcast\.com/.*?(?:assetId|show_id)=(?P<id>[0-9]+)'
_TEST = { _TESTS = [{
'url': 'http://player.cinchcast.com/?show_id=5258197&platformId=1&assetType=single',
'info_dict': {
'id': '5258197',
'ext': 'mp3',
'title': 'Train Your Brain to Up Your Game with Coach Mandy',
'upload_date': '20130816',
},
}, {
# Actual test is run in generic, look for undergroundwellness # Actual test is run in generic, look for undergroundwellness
'url': 'http://player.cinchcast.com/?platformId=1&#038;assetType=single&#038;assetId=7141703', 'url': 'http://player.cinchcast.com/?platformId=1&#038;assetType=single&#038;assetId=7141703',
'only_matching': True, 'only_matching': True,
} }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
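
The `_VALID_URL` change above widens the query-parameter alternation so that `show_id` is accepted alongside `assetId`. The following sketch checks the new pattern against the two test URLs from the diff using only the standard `re` module. `match_id` is a hypothetical stand-in for the extractor's `_match_id`, not the library method.

```python
import re

# New pattern from the diff: either assetId or show_id carries the numeric id.
VALID_URL = r'https?://player\.cinchcast\.com/.*?(?:assetId|show_id)=(?P<id>[0-9]+)'
# Old pattern, kept for comparison.
OLD_VALID_URL = r'https?://player\.cinchcast\.com/.*?assetId=(?P<id>[0-9]+)'


def match_id(pattern, url):
    """Return the captured id, or None when the URL does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None


if __name__ == '__main__':
    show_url = 'http://player.cinchcast.com/?show_id=5258197&platformId=1&assetType=single'
    asset_url = 'http://player.cinchcast.com/?platformId=1&#038;assetType=single&#038;assetId=7141703'
    print(match_id(VALID_URL, show_url))
    print(match_id(VALID_URL, asset_url))
    print(match_id(OLD_VALID_URL, show_url))
```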

@@ -0,0 +1,72 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
unescapeHTML,
)
class CJSWIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cjsw\.com/program/(?P<program>[^/]+)/episode/(?P<id>\d+)'
_TESTS = [{
'url': 'http://cjsw.com/program/freshly-squeezed/episode/20170620',
'md5': 'cee14d40f1e9433632c56e3d14977120',
'info_dict': {
'id': '91d9f016-a2e7-46c5-8dcb-7cbcd7437c41',
'ext': 'mp3',
'title': 'Freshly Squeezed Episode June 20, 2017',
'description': 'md5:c967d63366c3898a80d0c7b0ff337202',
'series': 'Freshly Squeezed',
'episode_id': '20170620',
},
}, {
# no description
'url': 'http://cjsw.com/program/road-pops/episode/20170707/',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
program, episode_id = mobj.group('program', 'id')
audio_id = '%s/%s' % (program, episode_id)
webpage = self._download_webpage(url, episode_id)
title = unescapeHTML(self._search_regex(
(r'<h1[^>]+class=["\']episode-header__title["\'][^>]*>(?P<title>[^<]+)',
r'data-audio-title=(["\'])(?P<title>(?:(?!\1).)+)\1'),
webpage, 'title', group='title'))
audio_url = self._search_regex(
r'<button[^>]+data-audio-src=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'audio url', group='url')
audio_id = self._search_regex(
r'/([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})\.mp3',
audio_url, 'audio id', default=audio_id)
formats = [{
'url': audio_url,
'ext': determine_ext(audio_url, 'mp3'),
'vcodec': 'none',
}]
description = self._html_search_regex(
r'<p>(?P<description>.+?)</p>', webpage, 'description',
default=None)
series = self._search_regex(
r'data-showname=(["\'])(?P<name>(?:(?!\1).)+)\1', webpage,
'series', default=program, group='name')
return {
'id': audio_id,
'title': title,
'description': description,
'formats': formats,
'series': series,
'episode_id': episode_id,
}
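
The new CJSWIE extractor above derives its `id` from a UUID embedded in the audio URL and falls back to `program/episode_id` when no UUID is present. A small standalone sketch of that fallback follows. The regular expression is the one from the diff, while `audio_id_from_url` and the `example.com` URLs are hypothetical and used only for illustration.

```python
import re

# UUID-shaped file name followed by .mp3, as in the extractor above.
UUID_MP3_RE = r'/([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})\.mp3'


def audio_id_from_url(audio_url, program, episode_id):
    """Prefer the UUID from the audio URL; otherwise use 'program/episode_id'."""
    fallback = '%s/%s' % (program, episode_id)
    m = re.search(UUID_MP3_RE, audio_url)
    return m.group(1) if m else fallback


if __name__ == '__main__':
    # Hypothetical URLs; the real site layout is not shown in the diff.
    print(audio_id_from_url(
        'https://example.com/audio/91d9f016-a2e7-46c5-8dcb-7cbcd7437c41.mp3',
        'freshly-squeezed', '20170620'))
    print(audio_id_from_url(
        'https://example.com/audio/episode.mp3', 'road-pops', '20170707'))
```

The fallback keeps the extractor usable even if the site stops naming files by UUID, at the cost of a less stable identifier.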

@@ -1,67 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_strdate,
)
class ClipfishIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?clipfish\.de/(?:[^/]+/)+video/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.clipfish.de/special/ugly-americans/video/4343170/s01-e01-ugly-americans-date-in-der-hoelle/',
'md5': '720563e467b86374c194bdead08d207d',
'info_dict': {
'id': '4343170',
'ext': 'mp4',
'title': 'S01 E01 - Ugly Americans - Date in der Hölle',
'description': 'Mark Lilly arbeitet im Sozialdienst der Stadt New York und soll Immigranten bei ihrer Einbürgerung in die USA zur Seite stehen.',
'upload_date': '20161005',
'duration': 1291,
'view_count': int,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_info = self._download_json(
'http://www.clipfish.de/devapi/id/%s?format=json&apikey=hbbtv' % video_id,
video_id)['items'][0]
formats = []
m3u8_url = video_info.get('media_videourl_hls')
if m3u8_url:
formats.append({
'url': m3u8_url.replace('de.hls.fra.clipfish.de', 'hls.fra.clipfish.de'),
'ext': 'mp4',
'format_id': 'hls',
})
mp4_url = video_info.get('media_videourl')
if mp4_url:
formats.append({
'url': mp4_url,
'format_id': 'mp4',
'width': int_or_none(video_info.get('width')),
'height': int_or_none(video_info.get('height')),
'tbr': int_or_none(video_info.get('bitrate')),
})
descr = video_info.get('descr')
if descr:
descr = descr.strip()
return {
'id': video_id,
'title': video_info['title'],
'description': descr,
'formats': formats,
'thumbnail': video_info.get('media_content_thumbnail_large') or video_info.get('media_thumbnail'),
'duration': int_or_none(video_info.get('media_length')),
'upload_date': unified_strdate(video_info.get('pubDate')),
'view_count': int_or_none(video_info.get('media_views'))
}

@@ -30,7 +30,11 @@ class CloudyIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage( webpage = self._download_webpage(
'http://www.cloudy.ec/embed.php?id=%s' % video_id, video_id) 'https://www.cloudy.ec/embed.php', video_id, query={
'id': video_id,
'playerPage': 1,
'autoplay': 1,
})
info = self._parse_html5_media_entries(url, webpage, video_id)[0] info = self._parse_html5_media_entries(url, webpage, video_id)[0]

@@ -21,7 +21,7 @@ class CollegeRamaIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.', 'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
'description': '', 'description': '',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg(?:\?.*?)?$',
'duration': 7713.088, 'duration': 7713.088,
'timestamp': 1413309600, 'timestamp': 1413309600,
'upload_date': '20141014', 'upload_date': '20141014',
@@ -35,6 +35,7 @@ class CollegeRamaIE(InfoExtractor):
'ext': 'wmv', 'ext': 'wmv',
'title': '64ste Vakantiecursus: Afvalwater', 'title': '64ste Vakantiecursus: Afvalwater',
'description': 'md5:7fd774865cc69d972f542b157c328305', 'description': 'md5:7fd774865cc69d972f542b157c328305',
'thumbnail': r're:^https?://.*\.jpg(?:\?.*?)?$',
'duration': 10853, 'duration': 10853,
'timestamp': 1326446400, 'timestamp': 1326446400,
'upload_date': '20120113', 'upload_date': '20120113',

@@ -245,6 +245,10 @@ class InfoExtractor(object):
specified in the URL. specified in the URL.
end_time: Time in seconds where the reproduction should end, as end_time: Time in seconds where the reproduction should end, as
specified in the URL. specified in the URL.
chapters: A list of dictionaries, with the following entries:
* "start_time" - The start time of the chapter in seconds
* "end_time" - The end time of the chapter in seconds
* "title" (optional, string)
The following fields should only be used when the video belongs to some logical The following fields should only be used when the video belongs to some logical
chapter or section: chapter or section:
@@ -372,7 +376,7 @@ class InfoExtractor(object):
cls._VALID_URL_RE = re.compile(cls._VALID_URL) cls._VALID_URL_RE = re.compile(cls._VALID_URL)
m = cls._VALID_URL_RE.match(url) m = cls._VALID_URL_RE.match(url)
assert m assert m
return m.group('id') return compat_str(m.group('id'))
@classmethod @classmethod
def working(cls): def working(cls):
@@ -416,7 +420,7 @@ class InfoExtractor(object):
if country_code: if country_code:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code) self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
if self._downloader.params.get('verbose', False): if self._downloader.params.get('verbose', False):
self._downloader.to_stdout( self._downloader.to_screen(
'[debug] Using fake IP %s (%s) as X-Forwarded-For.' '[debug] Using fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country_code.upper())) % (self._x_forwarded_for_ip, country_code.upper()))
@@ -726,12 +730,12 @@ class InfoExtractor(object):
video_info['title'] = video_title video_info['title'] = video_title
return video_info return video_info
def playlist_from_matches(self, matches, video_id, video_title, getter=None, ie=None): def playlist_from_matches(self, matches, playlist_id=None, playlist_title=None, getter=None, ie=None):
urlrs = orderedSet( urls = orderedSet(
self.url_result(self._proto_relative_url(getter(m) if getter else m), ie) self.url_result(self._proto_relative_url(getter(m) if getter else m), ie)
for m in matches) for m in matches)
return self.playlist_result( return self.playlist_result(
urlrs, playlist_id=video_id, playlist_title=video_title) urls, playlist_id=playlist_id, playlist_title=playlist_title)
@staticmethod @staticmethod
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None): def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
@@ -936,7 +940,8 @@ class InfoExtractor(object):
def _family_friendly_search(self, html): def _family_friendly_search(self, html):
# See http://schema.org/VideoObject # See http://schema.org/VideoObject
family_friendly = self._html_search_meta('isFamilyFriendly', html) family_friendly = self._html_search_meta(
'isFamilyFriendly', html, default=None)
if not family_friendly: if not family_friendly:
return None return None
@@ -976,22 +981,39 @@ class InfoExtractor(object):
return info return info
if isinstance(json_ld, dict): if isinstance(json_ld, dict):
json_ld = [json_ld] json_ld = [json_ld]
def extract_video_object(e):
assert e['@type'] == 'VideoObject'
info.update({
'url': e.get('contentUrl'),
'title': unescapeHTML(e.get('name')),
'description': unescapeHTML(e.get('description')),
'thumbnail': e.get('thumbnailUrl') or e.get('thumbnailURL'),
'duration': parse_duration(e.get('duration')),
'timestamp': unified_timestamp(e.get('uploadDate')),
'filesize': float_or_none(e.get('contentSize')),
'tbr': int_or_none(e.get('bitrate')),
'width': int_or_none(e.get('width')),
'height': int_or_none(e.get('height')),
'view_count': int_or_none(e.get('interactionCount')),
})
for e in json_ld: for e in json_ld:
if e.get('@context') == 'http://schema.org': if e.get('@context') == 'http://schema.org':
item_type = e.get('@type') item_type = e.get('@type')
if expected_type is not None and expected_type != item_type: if expected_type is not None and expected_type != item_type:
return info return info
if item_type == 'TVEpisode': if item_type in ('TVEpisode', 'Episode'):
info.update({ info.update({
'episode': unescapeHTML(e.get('name')), 'episode': unescapeHTML(e.get('name')),
'episode_number': int_or_none(e.get('episodeNumber')), 'episode_number': int_or_none(e.get('episodeNumber')),
'description': unescapeHTML(e.get('description')), 'description': unescapeHTML(e.get('description')),
}) })
part_of_season = e.get('partOfSeason') part_of_season = e.get('partOfSeason')
if isinstance(part_of_season, dict) and part_of_season.get('@type') == 'TVSeason': if isinstance(part_of_season, dict) and part_of_season.get('@type') in ('TVSeason', 'Season', 'CreativeWorkSeason'):
info['season_number'] = int_or_none(part_of_season.get('seasonNumber')) info['season_number'] = int_or_none(part_of_season.get('seasonNumber'))
part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries') part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
if isinstance(part_of_series, dict) and part_of_series.get('@type') == 'TVSeries': if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
info['series'] = unescapeHTML(part_of_series.get('name')) info['series'] = unescapeHTML(part_of_series.get('name'))
elif item_type == 'Article': elif item_type == 'Article':
info.update({ info.update({
@@ -1000,18 +1022,11 @@ class InfoExtractor(object):
'description': unescapeHTML(e.get('articleBody')), 'description': unescapeHTML(e.get('articleBody')),
}) })
elif item_type == 'VideoObject': elif item_type == 'VideoObject':
info.update({ extract_video_object(e)
'url': e.get('contentUrl'), continue
'title': unescapeHTML(e.get('name')), video = e.get('video')
'description': unescapeHTML(e.get('description')), if isinstance(video, dict) and video.get('@type') == 'VideoObject':
'thumbnail': e.get('thumbnailUrl') or e.get('thumbnailURL'), extract_video_object(video)
'duration': parse_duration(e.get('duration')),
'timestamp': unified_timestamp(e.get('uploadDate')),
'filesize': float_or_none(e.get('contentSize')),
'tbr': int_or_none(e.get('bitrate')),
'width': int_or_none(e.get('width')),
'height': int_or_none(e.get('height')),
})
break break
return dict((k, v) for k, v in info.items() if v is not None) return dict((k, v) for k, v in info.items() if v is not None)
@@ -1303,40 +1318,50 @@ class InfoExtractor(object):
entry_protocol='m3u8', preference=None, entry_protocol='m3u8', preference=None,
m3u8_id=None, note=None, errnote=None, m3u8_id=None, note=None, errnote=None,
fatal=True, live=False): fatal=True, live=False):
res = self._download_webpage_handle( res = self._download_webpage_handle(
m3u8_url, video_id, m3u8_url, video_id,
note=note or 'Downloading m3u8 information', note=note or 'Downloading m3u8 information',
errnote=errnote or 'Failed to download m3u8 information', errnote=errnote or 'Failed to download m3u8 information',
fatal=fatal) fatal=fatal)
if res is False: if res is False:
return [] return []
m3u8_doc, urlh = res m3u8_doc, urlh = res
m3u8_url = urlh.geturl() m3u8_url = urlh.geturl()
return self._parse_m3u8_formats(
m3u8_doc, m3u8_url, ext=ext, entry_protocol=entry_protocol,
preference=preference, m3u8_id=m3u8_id, live=live)
def _parse_m3u8_formats(self, m3u8_doc, m3u8_url, ext=None,
entry_protocol='m3u8', preference=None,
m3u8_id=None, live=False):
if '#EXT-X-FAXS-CM:' in m3u8_doc: # Adobe Flash Access if '#EXT-X-FAXS-CM:' in m3u8_doc: # Adobe Flash Access
return [] return []
formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)] formats = []
format_url = lambda u: ( format_url = lambda u: (
u u
if re.match(r'^https?://', u) if re.match(r'^https?://', u)
else compat_urlparse.urljoin(m3u8_url, u)) else compat_urlparse.urljoin(m3u8_url, u))
# We should try extracting formats only from master playlists [1], i.e. # References:
# playlists that describe available qualities. On the other hand media # 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-21
# playlists [2] should be returned as is since they contain just the media # 2. https://github.com/rg3/youtube-dl/issues/12211
# without qualities renditions.
# We should try extracting formats only from master playlists [1, 4.3.4],
# i.e. playlists that describe available qualities. On the other hand
# media playlists [1, 4.3.3] should be returned as is since they contain
# just the media without qualities renditions.
# Fortunately, master playlist can be easily distinguished from media # Fortunately, master playlist can be easily distinguished from media
# playlist based on particular tags availability. As of [1, 2] master # playlist based on particular tags availability. As of [1, 4.3.3, 4.3.4]
# playlist tags MUST NOT appear in a media playlist and vice versa. # master playlist tags MUST NOT appear in a media playlist and vice versa.
# As of [3] #EXT-X-TARGETDURATION tag is REQUIRED for every media playlist # As of [1, 4.3.3.1] #EXT-X-TARGETDURATION tag is REQUIRED for every
# and MUST NOT appear in master playlist thus we can clearly detect media # media playlist and MUST NOT appear in master playlist thus we can
# playlist with this criterion. # clearly detect media playlist with this criterion.
# 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.4
# 2. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3
# 3. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.1
if '#EXT-X-TARGETDURATION' in m3u8_doc: # media playlist, return as is if '#EXT-X-TARGETDURATION' in m3u8_doc: # media playlist, return as is
return [{ return [{
'url': m3u8_url, 'url': m3u8_url,
@@ -1345,52 +1370,72 @@ class InfoExtractor(object):
'protocol': entry_protocol, 'protocol': entry_protocol,
'preference': preference, 'preference': preference,
}] }]
audio_in_video_stream = {}
last_info = {} groups = {}
last_media = {} last_stream_inf = {}
def extract_media(x_media_line):
media = parse_m3u8_attributes(x_media_line)
# As per [1, 4.3.4.1] TYPE, GROUP-ID and NAME are REQUIRED
media_type, group_id, name = media.get('TYPE'), media.get('GROUP-ID'), media.get('NAME')
if not (media_type and group_id and name):
return
groups.setdefault(group_id, []).append(media)
if media_type not in ('VIDEO', 'AUDIO'):
return
media_url = media.get('URI')
if media_url:
format_id = []
for v in (group_id, name):
if v:
format_id.append(v)
f = {
'format_id': '-'.join(format_id),
'url': format_url(media_url),
'manifest_url': m3u8_url,
'language': media.get('LANGUAGE'),
'ext': ext,
'protocol': entry_protocol,
'preference': preference,
}
if media_type == 'AUDIO':
f['vcodec'] = 'none'
formats.append(f)
def build_stream_name():
# Although the specification does not mention the NAME attribute for
# the EXT-X-STREAM-INF tag, it may still sometimes be present (see [1]
# or vidio test in TestInfoExtractor.test_parse_m3u8_formats)
# 1. http://www.vidio.com/watch/165683-dj_ambred-booyah-live-2015
stream_name = last_stream_inf.get('NAME')
if stream_name:
return stream_name
# If there is no NAME in EXT-X-STREAM-INF it will be obtained
# from corresponding rendition group
stream_group_id = last_stream_inf.get('VIDEO')
if not stream_group_id:
return
stream_group = groups.get(stream_group_id)
if not stream_group:
return stream_group_id
rendition = stream_group[0]
return rendition.get('NAME') or stream_group_id
for line in m3u8_doc.splitlines(): for line in m3u8_doc.splitlines():
if line.startswith('#EXT-X-STREAM-INF:'): if line.startswith('#EXT-X-STREAM-INF:'):
last_info = parse_m3u8_attributes(line) last_stream_inf = parse_m3u8_attributes(line)
elif line.startswith('#EXT-X-MEDIA:'): elif line.startswith('#EXT-X-MEDIA:'):
media = parse_m3u8_attributes(line) extract_media(line)
media_type = media.get('TYPE')
if media_type in ('VIDEO', 'AUDIO'):
group_id = media.get('GROUP-ID')
media_url = media.get('URI')
if media_url:
format_id = []
for v in (group_id, media.get('NAME')):
if v:
format_id.append(v)
f = {
'format_id': '-'.join(format_id),
'url': format_url(media_url),
'language': media.get('LANGUAGE'),
'ext': ext,
'protocol': entry_protocol,
'preference': preference,
}
if media_type == 'AUDIO':
f['vcodec'] = 'none'
if group_id and not audio_in_video_stream.get(group_id):
audio_in_video_stream[group_id] = False
formats.append(f)
else:
# When there is no URI in EXT-X-MEDIA let this tag's
# data be used by regular URI lines below
last_media = media
if media_type == 'AUDIO' and group_id:
audio_in_video_stream[group_id] = True
elif line.startswith('#') or not line.strip(): elif line.startswith('#') or not line.strip():
continue continue
else: else:
tbr = int_or_none(last_info.get('AVERAGE-BANDWIDTH') or last_info.get('BANDWIDTH'), scale=1000) tbr = float_or_none(
last_stream_inf.get('AVERAGE-BANDWIDTH') or
last_stream_inf.get('BANDWIDTH'), scale=1000)
format_id = [] format_id = []
if m3u8_id: if m3u8_id:
format_id.append(m3u8_id) format_id.append(m3u8_id)
# Despite specification does not mention NAME attribute for stream_name = build_stream_name()
# EXT-X-STREAM-INF it still sometimes may be present
stream_name = last_info.get('NAME') or last_media.get('NAME')
# Bandwidth of live streams may differ over time thus making # Bandwidth of live streams may differ over time thus making
# format_id unpredictable. So it's better to keep provided # format_id unpredictable. So it's better to keep provided
# format_id intact. # format_id intact.
@@ -1400,14 +1445,14 @@ class InfoExtractor(object):
f = { f = {
'format_id': '-'.join(format_id), 'format_id': '-'.join(format_id),
'url': manifest_url, 'url': manifest_url,
'manifest_url': manifest_url, 'manifest_url': m3u8_url,
'tbr': tbr, 'tbr': tbr,
'ext': ext, 'ext': ext,
'fps': float_or_none(last_info.get('FRAME-RATE')), 'fps': float_or_none(last_stream_inf.get('FRAME-RATE')),
'protocol': entry_protocol, 'protocol': entry_protocol,
'preference': preference, 'preference': preference,
} }
resolution = last_info.get('RESOLUTION') resolution = last_stream_inf.get('RESOLUTION')
if resolution: if resolution:
mobj = re.search(r'(?P<width>\d+)[xX](?P<height>\d+)', resolution) mobj = re.search(r'(?P<width>\d+)[xX](?P<height>\d+)', resolution)
if mobj: if mobj:
@@ -1423,13 +1468,26 @@ class InfoExtractor(object):
'vbr': vbr, 'vbr': vbr,
'abr': abr, 'abr': abr,
}) })
f.update(parse_codecs(last_info.get('CODECS'))) codecs = parse_codecs(last_stream_inf.get('CODECS'))
if audio_in_video_stream.get(last_info.get('AUDIO')) is False and f['vcodec'] != 'none': f.update(codecs)
# TODO: update acodec for audio only formats with the same GROUP-ID audio_group_id = last_stream_inf.get('AUDIO')
f['acodec'] = 'none' # As per [1, 4.3.4.1.1] any EXT-X-STREAM-INF tag which
# references a rendition group MUST have a CODECS attribute.
# However, this is not always respected, for example, [2]
# contains EXT-X-STREAM-INF tag which references AUDIO
# rendition group but does not have CODECS and despite
# referencing an audio group, it represents
# a complete (with audio and video) format. So, for such cases
# we will ignore references to rendition groups and treat them
# as complete formats.
if audio_group_id and codecs and f.get('vcodec') != 'none':
audio_group = groups.get(audio_group_id)
if audio_group and audio_group[0].get('URI'):
# TODO: update acodec for audio only formats with
# the same GROUP-ID
f['acodec'] = 'none'
formats.append(f) formats.append(f)
last_info = {} last_stream_inf = {}
last_media = {}
return formats return formats
@staticmethod @staticmethod
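Review note: the lookup order used by build_stream_name in the hunk above can be tried in isolation. The following is a minimal sketch, not the extractor's code: resolve_stream_name is a hypothetical standalone function, and plain dicts stand in for the attributes that parse_m3u8_attributes would return.

```python
def resolve_stream_name(stream_inf, groups):
    """Pick a display name for a variant stream.

    stream_inf: dict of EXT-X-STREAM-INF attributes.
    groups: dict mapping GROUP-ID to a list of EXT-X-MEDIA attribute dicts.
    """
    # 1. A NAME on EXT-X-STREAM-INF itself takes priority.
    name = stream_inf.get('NAME')
    if name:
        return name
    # 2. Otherwise look at the VIDEO rendition group the stream references.
    group_id = stream_inf.get('VIDEO')
    if not group_id:
        return None
    group = groups.get(group_id)
    if not group:
        # The referenced group was never declared: fall back to its id.
        return group_id
    # 3. Use the first rendition's NAME, or the group id if it has none.
    return group[0].get('NAME') or group_id


# Sample data invented for illustration.
groups = {'vid': [{'TYPE': 'VIDEO', 'GROUP-ID': 'vid', 'NAME': '720p'}]}
for attrs in ({'NAME': 'main'}, {'VIDEO': 'vid'}, {'VIDEO': 'missing'}, {}):
    print(attrs, '->', resolve_stream_name(attrs, groups))
```

Keeping the groups in a dict keyed by GROUP-ID is what lets the stream name be resolved regardless of whether EXT-X-MEDIA appears before or after the EXT-X-STREAM-INF tag that references it, provided the whole playlist has been scanned first.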
@@ -1803,7 +1861,7 @@ class InfoExtractor(object):
'ext': mimetype2ext(mime_type), 'ext': mimetype2ext(mime_type),
'width': int_or_none(representation_attrib.get('width')), 'width': int_or_none(representation_attrib.get('width')),
'height': int_or_none(representation_attrib.get('height')), 'height': int_or_none(representation_attrib.get('height')),
'tbr': int_or_none(bandwidth, 1000), 'tbr': float_or_none(bandwidth, 1000),
'asr': int_or_none(representation_attrib.get('audioSamplingRate')), 'asr': int_or_none(representation_attrib.get('audioSamplingRate')),
'fps': int_or_none(representation_attrib.get('frameRate')), 'fps': int_or_none(representation_attrib.get('frameRate')),
'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None, 'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None,
@@ -1835,9 +1893,13 @@ class InfoExtractor(object):
'Bandwidth': bandwidth, 'Bandwidth': bandwidth,
} }
def location_key(location):
return 'url' if re.match(r'^https?://', location) else 'path'
if 'segment_urls' not in representation_ms_info and 'media' in representation_ms_info: if 'segment_urls' not in representation_ms_info and 'media' in representation_ms_info:
media_template = prepare_template('media', ('Number', 'Bandwidth', 'Time')) media_template = prepare_template('media', ('Number', 'Bandwidth', 'Time'))
media_location_key = location_key(media_template)
# As per [1, 5.3.9.4.4, Table 16, page 55] $Number$ and $Time$ # As per [1, 5.3.9.4.4, Table 16, page 55] $Number$ and $Time$
# can't be used at the same time # can't be used at the same time
@@ -1847,7 +1909,7 @@ class InfoExtractor(object):
segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale']) segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale'])
representation_ms_info['total_number'] = int(math.ceil(float(period_duration) / segment_duration)) representation_ms_info['total_number'] = int(math.ceil(float(period_duration) / segment_duration))
representation_ms_info['fragments'] = [{ representation_ms_info['fragments'] = [{
'url': media_template % { media_location_key: media_template % {
'Number': segment_number, 'Number': segment_number,
'Bandwidth': bandwidth, 'Bandwidth': bandwidth,
}, },
@@ -1871,7 +1933,7 @@ class InfoExtractor(object):
'Number': segment_number, 'Number': segment_number,
} }
representation_ms_info['fragments'].append({ representation_ms_info['fragments'].append({
'url': segment_url, media_location_key: segment_url,
'duration': float_or_none(segment_d, representation_ms_info['timescale']), 'duration': float_or_none(segment_d, representation_ms_info['timescale']),
}) })
@@ -1895,8 +1957,9 @@ class InfoExtractor(object):
for s in representation_ms_info['s']: for s in representation_ms_info['s']:
duration = float_or_none(s['d'], timescale) duration = float_or_none(s['d'], timescale)
for r in range(s.get('r', 0) + 1): for r in range(s.get('r', 0) + 1):
segment_uri = representation_ms_info['segment_urls'][segment_index]
fragments.append({ fragments.append({
'url': representation_ms_info['segment_urls'][segment_index], location_key(segment_uri): segment_uri,
'duration': duration, 'duration': duration,
}) })
segment_index += 1 segment_index += 1
@@ -1905,6 +1968,7 @@ class InfoExtractor(object):
# No fragments key is present in this case. # No fragments key is present in this case.
if 'fragments' in representation_ms_info: if 'fragments' in representation_ms_info:
f.update({ f.update({
'fragment_base_url': base_url,
'fragments': [], 'fragments': [],
'protocol': 'http_dash_segments', 'protocol': 'http_dash_segments',
}) })
@@ -1912,10 +1976,8 @@ class InfoExtractor(object):
initialization_url = representation_ms_info['initialization_url'] initialization_url = representation_ms_info['initialization_url']
if not f.get('url'): if not f.get('url'):
f['url'] = initialization_url f['url'] = initialization_url
f['fragments'].append({'url': initialization_url}) f['fragments'].append({location_key(initialization_url): initialization_url})
f['fragments'].extend(representation_ms_info['fragments']) f['fragments'].extend(representation_ms_info['fragments'])
for fragment in f['fragments']:
fragment['url'] = urljoin(base_url, fragment['url'])
try: try:
existing_format = next( existing_format = next(
fo for fo in formats fo for fo in formats
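Review note: the DASH hunks above stop joining every fragment with base_url up front and instead record either an absolute 'url' or a relative 'path', alongside a 'fragment_base_url' on the format. The sketch below shows how such fragments might be resolved later. location_key is copied from the diff; fragment_url is a hypothetical consumer written for illustration (the downloader side is not part of this section), and the example hosts are invented. It targets Python 3 and uses urllib directly rather than the project's compat layer.

```python
import re
from urllib.parse import urljoin


def location_key(location):
    # Absolute http(s) locations go under 'url'; anything else is kept
    # as a 'path' relative to the format's fragment_base_url.
    return 'url' if re.match(r'^https?://', location) else 'path'


def fragment_url(fragment, fragment_base_url):
    # Hypothetical consumer: turn a fragment dict into a full URL.
    if 'url' in fragment:
        return fragment['url']
    return urljoin(fragment_base_url, fragment['path'])


base = 'https://cdn.example.com/video/'
fragments = [
    {location_key(loc): loc}
    for loc in ('init.mp4', 'https://other.example.com/seg1.m4s')
]
for fragment in fragments:
    print(fragment, '->', fragment_url(fragment, base))
```

Deferring the join keeps the fragment list compact and lets the base URL be changed in one place.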
@@ -1944,6 +2006,12 @@ class InfoExtractor(object):
compat_etree_fromstring(ism.encode('utf-8')), urlh.geturl(), ism_id) compat_etree_fromstring(ism.encode('utf-8')), urlh.geturl(), ism_id)
def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None): def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None):
"""
Parse formats from ISM manifest.
References:
1. [MS-SSTR]: Smooth Streaming Protocol,
https://msdn.microsoft.com/en-us/library/ff469518.aspx
"""
if ism_doc.get('IsLive') == 'TRUE' or ism_doc.find('Protection') is not None: if ism_doc.get('IsLive') == 'TRUE' or ism_doc.find('Protection') is not None:
return [] return []
@@ -1965,8 +2033,11 @@ class InfoExtractor(object):
self.report_warning('%s is not a supported codec' % fourcc) self.report_warning('%s is not a supported codec' % fourcc)
continue continue
tbr = int(track.attrib['Bitrate']) // 1000 tbr = int(track.attrib['Bitrate']) // 1000
width = int_or_none(track.get('MaxWidth')) # [1] does not mention Width and Height attributes. However,
height = int_or_none(track.get('MaxHeight')) # they're often present while MaxWidth and MaxHeight are
# missing, so should be used as fallbacks
width = int_or_none(track.get('MaxWidth') or track.get('Width'))
height = int_or_none(track.get('MaxHeight') or track.get('Height'))
sampling_rate = int_or_none(track.get('SamplingRate')) sampling_rate = int_or_none(track.get('SamplingRate'))
track_url_pattern = re.sub(r'{[Bb]itrate}', track.attrib['Bitrate'], url_pattern) track_url_pattern = re.sub(r'{[Bb]itrate}', track.attrib['Bitrate'], url_pattern)
@@ -2044,9 +2115,9 @@ class InfoExtractor(object):
return f return f
return {} return {}
def _media_formats(src, cur_media_type): def _media_formats(src, cur_media_type, type_info={}):
full_url = absolute_url(src) full_url = absolute_url(src)
ext = determine_ext(full_url) ext = type_info.get('ext') or determine_ext(full_url)
if ext == 'm3u8': if ext == 'm3u8':
is_plain_url = False is_plain_url = False
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
@@ -2066,15 +2137,18 @@ class InfoExtractor(object):
return is_plain_url, formats return is_plain_url, formats
entries = [] entries = []
# amp-video and amp-audio are very similar to their HTML5 counterparts
# so we will include them right here (see
# https://www.ampproject.org/docs/reference/components/amp-video)
media_tags = [(media_tag, media_type, '') media_tags = [(media_tag, media_type, '')
for media_tag, media_type for media_tag, media_type
in re.findall(r'(?s)(<(video|audio)[^>]*/>)', webpage)] in re.findall(r'(?s)(<(?:amp-)?(video|audio)[^>]*/>)', webpage)]
media_tags.extend(re.findall( media_tags.extend(re.findall(
# We only allow video|audio followed by a whitespace or '>'. # We only allow video|audio followed by a whitespace or '>'.
# Allowing more characters may end up in significant slow down (see # Allowing more characters may end up in significant slow down (see
# https://github.com/rg3/youtube-dl/issues/11979, example URL: # https://github.com/rg3/youtube-dl/issues/11979, example URL:
# http://www.porntrex.com/maps/videositemap.xml). # http://www.porntrex.com/maps/videositemap.xml).
r'(?s)(<(?P<tag>video|audio)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage)) r'(?s)(<(?P<tag>(?:amp-)?(?:video|audio))(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
for media_tag, media_type, media_content in media_tags: for media_tag, media_type, media_content in media_tags:
media_info = { media_info = {
'formats': [], 'formats': [],
@@ -2092,9 +2166,9 @@ class InfoExtractor(object):
src = source_attributes.get('src') src = source_attributes.get('src')
if not src: if not src:
continue continue
is_plain_url, formats = _media_formats(src, media_type) f = parse_content_type(source_attributes.get('type'))
is_plain_url, formats = _media_formats(src, media_type, f)
if is_plain_url: if is_plain_url:
f = parse_content_type(source_attributes.get('type'))
f.update(formats[0]) f.update(formats[0])
media_info['formats'].append(f) media_info['formats'].append(f)
else: else:
@@ -2117,7 +2191,7 @@ class InfoExtractor(object):
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}): def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
formats = [] formats = []
hdcore_sign = 'hdcore=3.7.0' hdcore_sign = 'hdcore=3.7.0'
f4m_url = re.sub(r'(https?://[^/+])/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m') f4m_url = re.sub(r'(https?://[^/]+)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
hds_host = hosts.get('hds') hds_host = hosts.get('hds')
if hds_host: if hds_host:
f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url) f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)
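Review note: the one-character move of the plus sign in the f4m_url pattern above changes its meaning. With [^/+] the host may only be a single character, so the /i/ to /z/ rewrite silently never fires for real hosts; with [^/]+ the host may be any non-empty run of non-slash characters. The sketch below compares both patterns. to_f4m_url is a hypothetical helper and the host name is invented for illustration.

```python
import re

OLD_PATTERN = r'(https?://[^/+])/i/'  # host limited to one character
NEW_PATTERN = r'(https?://[^/]+)/i/'  # host of one or more characters


def to_f4m_url(manifest_url, pattern):
    # Same two steps as in the diff: swap /i/ for /z/ after the host,
    # then swap the HLS master playlist name for the HDS manifest name.
    return re.sub(pattern, r'\1/z/', manifest_url).replace(
        '/master.m3u8', '/manifest.f4m')


url = 'https://example-vh.akamaihd.net/i/path/clip/master.m3u8'
old_result = to_f4m_url(url, OLD_PATTERN)
new_result = to_f4m_url(url, NEW_PATTERN)
print(old_result)
print(new_result)
```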
@@ -2139,8 +2213,9 @@ class InfoExtractor(object):
def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]): def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]):
url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url) url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url)
url_base = self._search_regex(r'(?:https?|rtmp|rtsp)(://[^?]+)', url, 'format url') url_base = self._search_regex(
http_base_url = 'http' + url_base r'(?:(?:https?|rtmp|rtsp):)?(//[^?]+)', url, 'format url')
http_base_url = '%s:%s' % ('http', url_base)
formats = [] formats = []
if 'm3u8' not in skip_protocols: if 'm3u8' not in skip_protocols:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
@@ -2174,7 +2249,7 @@ class InfoExtractor(object):
for protocol in ('rtmp', 'rtsp'): for protocol in ('rtmp', 'rtsp'):
if protocol not in skip_protocols: if protocol not in skip_protocols:
formats.append({ formats.append({
'url': protocol + url_base, 'url': '%s:%s' % (protocol, url_base),
'format_id': protocol, 'format_id': protocol,
'protocol': protocol, 'protocol': protocol,
}) })
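Review note: the Wowza hunks above make the scheme optional so that protocol-relative URLs (starting with //) are accepted, and they rebuild each URL as scheme, colon, then the //host/path part. A minimal sketch of that behaviour follows. wowza_base_urls is a hypothetical helper, re.search stands in for the extractor's _search_regex, and the sample URLs are invented.

```python
import re


def wowza_base_urls(url):
    # Drop a trailing manifest file name, as the extractor does.
    url = re.sub(
        r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url)
    # The scheme is optional; the capture starts at the double slash.
    url_base = re.search(
        r'(?:(?:https?|rtmp|rtsp):)?(//[^?]+)', url).group(1)
    return dict(
        (protocol, '%s:%s' % (protocol, url_base))
        for protocol in ('http', 'rtmp', 'rtsp'))


relative = wowza_base_urls('//example.com/vod/mp4:clip.mp4/playlist.m3u8?x=1')
absolute = wowza_base_urls('rtmp://example.com/vod/clip/manifest.f4m')
print(relative['http'])
print(absolute['rtsp'])
```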
@@ -2182,7 +2257,7 @@ class InfoExtractor(object):
def _find_jwplayer_data(self, webpage, video_id=None, transform_source=js_to_json): def _find_jwplayer_data(self, webpage, video_id=None, transform_source=js_to_json):
mobj = re.search( mobj = re.search(
r'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)\.setup\s*\((?P<options>[^)]+)\)', r'(?s)jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)(?!</script>).*?\.setup\s*\((?P<options>[^)]+)\)',
webpage) webpage)
if mobj: if mobj:
try: try:
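Review note: the relaxed pattern in _find_jwplayer_data above no longer requires .setup( to follow jwplayer("...") immediately. The sketch below compares the two patterns on a page where the player instance is stored in a variable first. The page snippet and variable names are invented for illustration, and json.loads stands in for the extractor's _parse_json with js_to_json.

```python
import json
import re

OLD_PATTERN = (
    r'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)'
    r'\.setup\s*\((?P<options>[^)]+)\)')
NEW_PATTERN = (
    r'(?s)jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)'
    r'(?!</script>).*?\.setup\s*\((?P<options>[^)]+)\)')

webpage = '''<script>
var playerInstance = jwplayer("myElement");
playerInstance.setup({"file": "https://example.com/v.mp4"});
</script>'''

old_match = re.search(OLD_PATTERN, webpage)
new_match = re.search(NEW_PATTERN, webpage)
options = json.loads(new_match.group('options'))
print(old_match)
print(options)
```

Because the options group is still [^)]+, a setup call whose options contain a closing parenthesis would be cut short by either pattern.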
@@ -2232,6 +2307,8 @@ class InfoExtractor(object):
tracks = video_data.get('tracks') tracks = video_data.get('tracks')
if tracks and isinstance(tracks, list): if tracks and isinstance(tracks, list):
for track in tracks: for track in tracks:
if not isinstance(track, dict):
continue
if track.get('kind') != 'captions': if track.get('kind') != 'captions':
continue continue
track_url = urljoin(base_url, track.get('file')) track_url = urljoin(base_url, track.get('file'))
@@ -2258,11 +2335,19 @@ class InfoExtractor(object):
def _parse_jwplayer_formats(self, jwplayer_sources_data, video_id=None, def _parse_jwplayer_formats(self, jwplayer_sources_data, video_id=None,
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None): m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
urls = []
formats = [] formats = []
for source in jwplayer_sources_data: for source in jwplayer_sources_data:
source_url = self._proto_relative_url(source['file']) if not isinstance(source, dict):
continue
source_url = self._proto_relative_url(source.get('file'))
if not source_url:
continue
if base_url: if base_url:
source_url = compat_urlparse.urljoin(base_url, source_url) source_url = compat_urlparse.urljoin(base_url, source_url)
if source_url in urls:
continue
urls.append(source_url)
source_type = source.get('type') or '' source_type = source.get('type') or ''
ext = mimetype2ext(source_type) or determine_ext(source_url) ext = mimetype2ext(source_type) or determine_ext(source_url)
if source_type == 'hls' or ext == 'm3u8': if source_type == 'hls' or ext == 'm3u8':
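Review note: the guards added to _parse_jwplayer_formats above skip non-dict entries, entries without a file, and URLs that were already seen. The sketch below isolates just that filtering. unique_source_urls is a hypothetical helper, and it leaves out the _proto_relative_url and base_url joining steps of the real method.

```python
def unique_source_urls(sources):
    urls = []
    for source in sources:
        # Malformed entries (strings, None, ...) are ignored.
        if not isinstance(source, dict):
            continue
        source_url = source.get('file')
        # Entries without a usable 'file' are ignored.
        if not source_url:
            continue
        # Each URL is only kept once, in first-seen order.
        if source_url in urls:
            continue
        urls.append(source_url)
    return urls


sources = [
    {'file': 'https://example.com/a.mp4'},
    'not-a-dict',
    {'file': None},
    {'file': 'https://example.com/a.mp4'},
    {'file': 'https://example.com/b.m3u8', 'type': 'hls'},
]
print(unique_source_urls(sources))
```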


@@ -16,7 +16,6 @@ from ..utils import (
mimetype2ext, mimetype2ext,
orderedSet, orderedSet,
parse_iso8601, parse_iso8601,
remove_end,
) )
@@ -50,10 +49,17 @@ class CondeNastIE(InfoExtractor):
'wmagazine': 'W Magazine', 'wmagazine': 'W Magazine',
} }
_VALID_URL = r'https?://(?:video|www|player)\.(?P<site>%s)\.com/(?P<type>watch|series|video|embed(?:js)?)/(?P<id>[^/?#]+)' % '|'.join(_SITES.keys()) _VALID_URL = r'''(?x)https?://(?:video|www|player(?:-backend)?)\.(?:%s)\.com/
(?:
(?:
embed(?:js)?|
(?:script|inline)/video
)/(?P<id>[0-9a-f]{24})(?:/(?P<player_id>[0-9a-f]{24}))?(?:.+?\btarget=(?P<target>[^&]+))?|
(?P<type>watch|series|video)/(?P<display_id>[^/?#]+)
)''' % '|'.join(_SITES.keys())
IE_DESC = 'Condé Nast media group: %s' % ', '.join(sorted(_SITES.values())) IE_DESC = 'Condé Nast media group: %s' % ', '.join(sorted(_SITES.values()))
EMBED_URL = r'(?:https?:)?//player\.(?P<site>%s)\.com/(?P<type>embed(?:js)?)/.+?' % '|'.join(_SITES.keys()) EMBED_URL = r'(?:https?:)?//player(?:-backend)?\.(?:%s)\.com/(?:embed(?:js)?|(?:script|inline)/video)/.+?' % '|'.join(_SITES.keys())
_TESTS = [{ _TESTS = [{
'url': 'http://video.wired.com/watch/3d-printed-speakers-lit-with-led', 'url': 'http://video.wired.com/watch/3d-printed-speakers-lit-with-led',
@@ -89,6 +95,12 @@ class CondeNastIE(InfoExtractor):
'upload_date': '20150916', 'upload_date': '20150916',
'timestamp': 1442434955, 'timestamp': 1442434955,
} }
}, {
'url': 'https://player.cnevids.com/inline/video/59138decb57ac36b83000005.js?target=js-cne-player',
'only_matching': True,
}, {
'url': 'http://player-backend.cnevids.com/script/video/59138decb57ac36b83000005.js',
'only_matching': True,
}] }]
def _extract_series(self, url, webpage): def _extract_series(self, url, webpage):
@@ -104,7 +116,7 @@ class CondeNastIE(InfoExtractor):
entries = [self.url_result(build_url(path), 'CondeNast') for path in paths] entries = [self.url_result(build_url(path), 'CondeNast') for path in paths]
return self.playlist_result(entries, playlist_title=title) return self.playlist_result(entries, playlist_title=title)
def _extract_video(self, webpage, url_type): def _extract_video_params(self, webpage):
query = {} query = {}
params = self._search_regex( params = self._search_regex(
r'(?s)var params = {(.+?)}[;,]', webpage, 'player params', default=None) r'(?s)var params = {(.+?)}[;,]', webpage, 'player params', default=None)
@@ -123,17 +135,30 @@ class CondeNastIE(InfoExtractor):
'playerId': params['data-player'], 'playerId': params['data-player'],
'target': params['id'], 'target': params['id'],
}) })
video_id = query['videoId'] return query
def _extract_video(self, params):
video_id = params['videoId']
video_info = None video_info = None
info_page = self._download_json( if params.get('playerId'):
'http://player.cnevids.com/player/video.js', info_page = self._download_json(
video_id, 'Downloading video info', fatal=False, query=query) 'http://player.cnevids.com/player/video.js',
if info_page: video_id, 'Downloading video info', fatal=False, query=params)
video_info = info_page.get('video') if info_page:
if not video_info: video_info = info_page.get('video')
if not video_info:
info_page = self._download_webpage(
'http://player.cnevids.com/player/loader.js',
video_id, 'Downloading loader info', query=params)
else:
info_page = self._download_webpage( info_page = self._download_webpage(
'http://player.cnevids.com/player/loader.js', 'https://player.cnevids.com/inline/video/%s.js' % video_id,
video_id, 'Downloading loader info', query=query) video_id, 'Downloading inline info', query={
'target': params.get('target', 'embedplayer')
})
if not video_info:
video_info = self._parse_json( video_info = self._parse_json(
self._search_regex( self._search_regex(
r'(?s)var\s+config\s*=\s*({.+?});', info_page, 'config'), r'(?s)var\s+config\s*=\s*({.+?});', info_page, 'config'),
@@ -161,9 +186,7 @@ class CondeNastIE(InfoExtractor):
}) })
self._sort_formats(formats) self._sort_formats(formats)
info = self._search_json_ld( return {
webpage, video_id, fatal=False) if url_type != 'embed' else {}
info.update({
'id': video_id, 'id': video_id,
'formats': formats, 'formats': formats,
'title': title, 'title': title,
@@ -174,22 +197,26 @@ class CondeNastIE(InfoExtractor):
'series': video_info.get('series_title'), 'series': video_info.get('series_title'),
'season': video_info.get('season_title'), 'season': video_info.get('season_title'),
'timestamp': parse_iso8601(video_info.get('premiere_date')), 'timestamp': parse_iso8601(video_info.get('premiere_date')),
}) 'categories': video_info.get('categories'),
return info }
def _real_extract(self, url): def _real_extract(self, url):
site, url_type, item_id = re.match(self._VALID_URL, url).groups() video_id, player_id, target, url_type, display_id = re.match(self._VALID_URL, url).groups()
# Convert JS embed to regular embed if video_id:
if url_type == 'embedjs': return self._extract_video({
parsed_url = compat_urlparse.urlparse(url) 'videoId': video_id,
url = compat_urlparse.urlunparse(parsed_url._replace( 'playerId': player_id,
path=remove_end(parsed_url.path, '.js').replace('/embedjs/', '/embed/'))) 'target': target,
url_type = 'embed' })
webpage = self._download_webpage(url, item_id) webpage = self._download_webpage(url, display_id)
if url_type == 'series': if url_type == 'series':
return self._extract_series(url, webpage) return self._extract_series(url, webpage)
else: else:
return self._extract_video(webpage, url_type) params = self._extract_video_params(webpage)
info = self._search_json_ld(
webpage, display_id, fatal=False)
info.update(self._extract_video(params))
return info
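Review note: the new CondeNast _VALID_URL above yields five groups in a fixed order (id, player_id, target, type, display_id), which is the order _real_extract unpacks them in. The sketch below checks that against two of the test URLs from the diff. SITE_KEYS is an assumed two-entry subset of the extractor's _SITES keys, used only so the pattern can be built standalone.

```python
import re

# Assumed subset of _SITES.keys(), for illustration only.
SITE_KEYS = ['cnevids', 'wired']

VALID_URL = r'''(?x)https?://(?:video|www|player(?:-backend)?)\.(?:%s)\.com/
    (?:
        (?:
            embed(?:js)?|
            (?:script|inline)/video
        )/(?P<id>[0-9a-f]{24})(?:/(?P<player_id>[0-9a-f]{24}))?(?:.+?\btarget=(?P<target>[^&]+))?|
        (?P<type>watch|series|video)/(?P<display_id>[^/?#]+)
    )''' % '|'.join(SITE_KEYS)

inline = re.match(
    VALID_URL,
    'https://player.cnevids.com/inline/video/59138decb57ac36b83000005.js?target=js-cne-player')
watch = re.match(
    VALID_URL,
    'http://video.wired.com/watch/3d-printed-speakers-lit-with-led')
print(inline.groups())
print(watch.groups())
```

For player URLs only the first three groups can be set, and for page URLs only the last two, which is why _real_extract can branch on video_id alone.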


@@ -8,7 +8,16 @@ from ..utils import int_or_none
class CorusIE(ThePlatformFeedIE): class CorusIE(ThePlatformFeedIE):
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:globaltv|etcanada)\.com|(?:hgtv|foodnetwork|slice)\.ca)/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))(?P<id>\d+)' _VALID_URL = r'''(?x)
https?://
(?:www\.)?
(?P<domain>
(?:globaltv|etcanada)\.com|
(?:hgtv|foodnetwork|slice|history|showcase)\.ca
)
/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))
(?P<id>\d+)
'''
_TESTS = [{ _TESTS = [{
'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/', 'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/',
'md5': '05dcbca777bf1e58c2acbb57168ad3a6', 'md5': '05dcbca777bf1e58c2acbb57168ad3a6',
@@ -27,6 +36,12 @@ class CorusIE(ThePlatformFeedIE):
}, { }, {
'url': 'http://etcanada.com/video/873675331955/meet-the-survivor-game-changers-castaways-part-2/', 'url': 'http://etcanada.com/video/873675331955/meet-the-survivor-game-changers-castaways-part-2/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://www.history.ca/the-world-without-canada/video/full-episodes/natural-resources/video.html?v=955054659646#video',
'only_matching': True,
}, {
'url': 'http://www.showcase.ca/eyewitness/video/eyewitness++106/video.html?v=955070531919&p=1&s=da#video',
'only_matching': True,
}] }]
_TP_FEEDS = { _TP_FEEDS = {
@@ -50,6 +65,14 @@ class CorusIE(ThePlatformFeedIE):
'feed_id': '5tUJLgV2YNJ5', 'feed_id': '5tUJLgV2YNJ5',
'account_id': 2414427935, 'account_id': 2414427935,
}, },
'history': {
'feed_id': 'tQFx_TyyEq4J',
'account_id': 2369613659,
},
'showcase': {
'feed_id': '9H6qyshBZU3E',
'account_id': 2414426607,
},
} }
def _real_extract(self, url): def _real_extract(self, url):
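Review note: the verbose-mode rewrite of the Corus _VALID_URL above can be checked against the two URLs added to _TESTS. The sketch below copies the pattern and the URLs from the diff. Only the variable names are invented.

```python
import re

VALID_URL = r'''(?x)
    https?://
        (?:www\.)?
        (?P<domain>
            (?:globaltv|etcanada)\.com|
            (?:hgtv|foodnetwork|slice|history|showcase)\.ca
        )
        /(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))
        (?P<id>\d+)
'''

history = re.match(
    VALID_URL,
    'http://www.history.ca/the-world-without-canada/video/full-episodes/natural-resources/video.html?v=955054659646#video')
showcase = re.match(
    VALID_URL,
    'http://www.showcase.ca/eyewitness/video/eyewitness++106/video.html?v=955070531919&p=1&s=da#video')
print(history.group('domain'), history.group('id'))
print(showcase.group('domain'), showcase.group('id'))
```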


@@ -24,12 +24,11 @@ class CoubIE(InfoExtractor):
'duration': 4.6, 'duration': 4.6,
'timestamp': 1428527772, 'timestamp': 1428527772,
'upload_date': '20150408', 'upload_date': '20150408',
'uploader': 'Артём Лоскутников', 'uploader': 'Artyom Loskutnikov',
'uploader_id': 'artyom.loskutnikov', 'uploader_id': 'artyom.loskutnikov',
'view_count': int, 'view_count': int,
'like_count': int, 'like_count': int,
'repost_count': int, 'repost_count': int,
'comment_count': int,
'age_limit': 0, 'age_limit': 0,
}, },
}, { }, {
@@ -118,7 +117,6 @@ class CoubIE(InfoExtractor):
view_count = int_or_none(coub.get('views_count') or coub.get('views_increase_count')) view_count = int_or_none(coub.get('views_count') or coub.get('views_increase_count'))
like_count = int_or_none(coub.get('likes_count')) like_count = int_or_none(coub.get('likes_count'))
repost_count = int_or_none(coub.get('recoubs_count')) repost_count = int_or_none(coub.get('recoubs_count'))
comment_count = int_or_none(coub.get('comments_count'))
age_restricted = coub.get('age_restricted', coub.get('age_restricted_by_admin')) age_restricted = coub.get('age_restricted', coub.get('age_restricted_by_admin'))
if age_restricted is not None: if age_restricted is not None:
@@ -137,7 +135,6 @@ class CoubIE(InfoExtractor):
'view_count': view_count, 'view_count': view_count,
'like_count': like_count, 'like_count': like_count,
'repost_count': repost_count, 'repost_count': repost_count,
'comment_count': comment_count,
'age_limit': age_limit, 'age_limit': age_limit,
'formats': formats, 'formats': formats,
} }


@@ -21,9 +21,10 @@ class CrackleIE(InfoExtractor):
'season_number': 8, 'season_number': 8,
'episode_number': 4, 'episode_number': 4,
'subtitles': { 'subtitles': {
'en-US': [{ 'en-US': [
'ext': 'ttml', {'ext': 'vtt'},
}] {'ext': 'tt'},
]
}, },
}, },
'params': { 'params': {


@@ -171,7 +171,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'info_dict': { 'info_dict': {
'id': '727589', 'id': '727589',
'ext': 'mp4', 'ext': 'mp4',
'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 Give Me Deliverance from this Judicial Injustice!", 'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 Give Me Deliverance From This Judicial Injustice!",
'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d', 'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Kadokawa Pictures Inc.', 'uploader': 'Kadokawa Pictures Inc.',
@@ -179,7 +179,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'series': "KONOSUBA -God's blessing on this wonderful world!", 'series': "KONOSUBA -God's blessing on this wonderful world!",
'season': "KONOSUBA -God's blessing on this wonderful world! 2", 'season': "KONOSUBA -God's blessing on this wonderful world! 2",
'season_number': 2, 'season_number': 2,
'episode': 'Give Me Deliverance from this Judicial Injustice!', 'episode': 'Give Me Deliverance From This Judicial Injustice!',
'episode_number': 1, 'episode_number': 1,
}, },
'params': { 'params': {
@@ -510,7 +510,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
# webpage provide more accurate data than series_title from XML # webpage provide more accurate data than series_title from XML
series = self._html_search_regex( series = self._html_search_regex(
r'id=["\']showmedia_about_episode_num[^>]+>\s*<a[^>]+>([^<]+)', r'(?s)<h\d[^>]+\bid=["\']showmedia_about_episode_num[^>]+>(.+?)</h\d',
webpage, 'series', fatal=False) webpage, 'series', fatal=False)
season = xpath_text(metadata, 'series_title') season = xpath_text(metadata, 'series_title')
@@ -518,7 +518,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
episode_number = int_or_none(xpath_text(metadata, 'episode_number')) episode_number = int_or_none(xpath_text(metadata, 'episode_number'))
season_number = int_or_none(self._search_regex( season_number = int_or_none(self._search_regex(
r'(?s)<h4[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h4>\s*<h4>\s*Season (\d+)', r'(?s)<h\d[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h\d>\s*<h4>\s*Season (\d+)',
webpage, 'season number', default=None)) webpage, 'season number', default=None))
return { return {


@@ -10,6 +10,7 @@ from ..utils import (
smuggle_url, smuggle_url,
determine_ext, determine_ext,
ExtractorError, ExtractorError,
extract_attributes,
) )
from .senateisvp import SenateISVPIE from .senateisvp import SenateISVPIE
from .ustream import UstreamIE from .ustream import UstreamIE
@@ -68,6 +69,7 @@ class CSpanIE(InfoExtractor):
'uploader_id': '12987475', 'uploader_id': '12987475',
}, },
}] }]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s'
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
@@ -78,6 +80,19 @@ class CSpanIE(InfoExtractor):
if ustream_url: if ustream_url:
return self.url_result(ustream_url, UstreamIE.ie_key()) return self.url_result(ustream_url, UstreamIE.ie_key())
if '&vod' not in url:
bc = self._search_regex(
r"(<[^>]+id='brightcove-player-embed'[^>]+>)",
webpage, 'brightcove embed', default=None)
if bc:
bc_attr = extract_attributes(bc)
bc_url = self.BRIGHTCOVE_URL_TEMPLATE % (
bc_attr.get('data-bcaccountid', '3162030207001'),
bc_attr.get('data-noprebcplayerid', 'SyGGpuJy3g'),
bc_attr.get('data-newbcplayerid', 'default'),
bc_attr['data-bcid'])
return self.url_result(smuggle_url(bc_url, {'source_url': url}))
# We first look for clipid, because clipprog always appears before # We first look for clipid, because clipprog always appears before
patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')] patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')]
results = list(filter(None, (re.search(p, webpage) for p in patterns))) results = list(filter(None, (re.search(p, webpage) for p in patterns)))
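Review note: the Brightcove fallback added to CSpanIE above fills a four-slot URL template from the embed tag's data attributes, with defaults for everything except data-bcid. The sketch below reproduces that assembly. The template and default values are taken from the diff. build_brightcove_url is a hypothetical helper, and a plain dict stands in for the result of extract_attributes.

```python
BRIGHTCOVE_URL_TEMPLATE = (
    'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s')


def build_brightcove_url(bc_attr):
    return BRIGHTCOVE_URL_TEMPLATE % (
        bc_attr.get('data-bcaccountid', '3162030207001'),
        bc_attr.get('data-noprebcplayerid', 'SyGGpuJy3g'),
        bc_attr.get('data-newbcplayerid', 'default'),
        # The video id has no default: a missing one raises KeyError.
        bc_attr['data-bcid'])


# Attribute values invented for illustration.
print(build_brightcove_url({'data-bcid': '123'}))
print(build_brightcove_url({
    'data-bcaccountid': '42',
    'data-noprebcplayerid': 'abc',
    'data-newbcplayerid': 'xyz',
    'data-bcid': '456',
}))
```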


@@ -1,17 +1,21 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
determine_protocol, determine_protocol,
try_get,
unescapeHTML, unescapeHTML,
) )
class DailyMailIE(InfoExtractor): class DailyMailIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dailymail\.co\.uk/video/[^/]+/video-(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?dailymail\.co\.uk/(?:video/[^/]+/video-|embed/video/)(?P<id>[0-9]+)'
_TEST = { _TESTS = [{
'url': 'http://www.dailymail.co.uk/video/tvshowbiz/video-1295863/The-Mountain-appears-sparkling-water-ad-Heavy-Bubbles.html', 'url': 'http://www.dailymail.co.uk/video/tvshowbiz/video-1295863/The-Mountain-appears-sparkling-water-ad-Heavy-Bubbles.html',
'md5': 'f6129624562251f628296c3a9ffde124', 'md5': 'f6129624562251f628296c3a9ffde124',
'info_dict': { 'info_dict': {
@@ -20,7 +24,16 @@ class DailyMailIE(InfoExtractor):
'title': 'The Mountain appears in sparkling water ad for \'Heavy Bubbles\'', 'title': 'The Mountain appears in sparkling water ad for \'Heavy Bubbles\'',
'description': 'md5:a93d74b6da172dd5dc4d973e0b766a84', 'description': 'md5:a93d74b6da172dd5dc4d973e0b766a84',
} }
} }, {
'url': 'http://www.dailymail.co.uk/embed/video/1295863.html',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return re.findall(
r'<iframe\b[^>]+\bsrc=["\'](?P<url>(?:https?:)?//(?:www\.)?dailymail\.co\.uk/embed/video/\d+\.html)',
webpage)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
@@ -28,8 +41,14 @@ class DailyMailIE(InfoExtractor):
video_data = self._parse_json(self._search_regex( video_data = self._parse_json(self._search_regex(
r"data-opts='({.+?})'", webpage, 'video data'), video_id) r"data-opts='({.+?})'", webpage, 'video data'), video_id)
title = unescapeHTML(video_data['title']) title = unescapeHTML(video_data['title'])
video_sources = self._download_json(video_data.get(
'sources', {}).get('url') or 'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id, video_id) sources_url = (try_get(
video_data,
(lambda x: x['plugins']['sources']['url'],
lambda x: x['sources']['url']), compat_str) or
'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id)
video_sources = self._download_json(sources_url, video_id)
formats = [] formats = []
for rendition in video_sources['renditions']: for rendition in video_sources['renditions']:
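Review note: the sources_url lookup in the DailyMail hunk above tries two nested locations in order and falls back to a constructed API URL. The sketch below shows that fallback chain. The try_get defined here is a simplified stand-in written for this note (an approximation of the utility's behaviour, not a copy of youtube-dl's implementation), str replaces compat_str on the assumption of Python 3, and the sample data is invented.

```python
def try_get(src, getters, expected_type=None):
    # Simplified stand-in: return the first getter result that does not
    # raise and that has the expected type.
    if not isinstance(getters, (list, tuple)):
        getters = [getters]
    for getter in getters:
        try:
            value = getter(src)
        except (AttributeError, KeyError, TypeError, IndexError):
            continue
        if expected_type is None or isinstance(value, expected_type):
            return value
    return None


def sources_url(video_data, video_id):
    return (try_get(
        video_data,
        (lambda x: x['plugins']['sources']['url'],
         lambda x: x['sources']['url']), str) or
        'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id)


print(sources_url({'plugins': {'sources': {'url': 'http://example.com/p.json'}}}, '1'))
print(sources_url({'sources': {'url': 'http://example.com/s.json'}}, '1'))
print(sources_url({}, '1295863'))
```

Compared with the chained video_data.get('sources', {}).get('url') it replaces, this form also tolerates a value of the wrong type at any level of the nesting.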


@@ -38,7 +38,7 @@ class DailymotionBaseInfoExtractor(InfoExtractor):
class DailymotionIE(DailymotionBaseInfoExtractor): class DailymotionIE(DailymotionBaseInfoExtractor):
_VALID_URL = r'(?i)(?:https?://)?(?:(www|touch)\.)?dailymotion\.[a-z]{2,3}/(?:(?:embed|swf|#)/)?video/(?P<id>[^/?_]+)' _VALID_URL = r'(?i)https?://(?:(www|touch)\.)?dailymotion\.[a-z]{2,3}/(?:(?:(?:embed|swf|#)/)?video|swf)/(?P<id>[^/?_]+)'
IE_NAME = 'dailymotion' IE_NAME = 'dailymotion'
_FORMATS = [ _FORMATS = [
@@ -49,68 +49,82 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
('stream_h264_hd1080_url', 'hd180'), ('stream_h264_hd1080_url', 'hd180'),
] ]
_TESTS = [ _TESTS = [{
{ 'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames', 'md5': '074b95bdee76b9e3654137aee9c79dfe',
'md5': '2137c41a8e78554bb09225b8eb322406', 'info_dict': {
'info_dict': { 'id': 'x5kesuj',
'id': 'x2iuewm', 'ext': 'mp4',
'ext': 'mp4', 'title': 'Office Christmas Party Review Jason Bateman, Olivia Munn, T.J. Miller',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News', 'description': 'Office Christmas Party Review - Jason Bateman, Olivia Munn, T.J. Miller',
'description': 'Several come bundled with the Steam Controller.', 'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'thumbnail': r're:^https?:.*\.(?:jpg|png)$', 'duration': 187,
'duration': 74, 'timestamp': 1493651285,
'timestamp': 1425657362, 'upload_date': '20170501',
'upload_date': '20150306', 'uploader': 'Deadline',
'uploader': 'IGN', 'uploader_id': 'x1xm8ri',
'uploader_id': 'xijv66', 'age_limit': 0,
'age_limit': 0, 'view_count': int,
'view_count': int,
}
}, },
}, {
'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames',
'md5': '2137c41a8e78554bb09225b8eb322406',
'info_dict': {
'id': 'x2iuewm',
'ext': 'mp4',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
'description': 'Several come bundled with the Steam Controller.',
'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'duration': 74,
'timestamp': 1425657362,
'upload_date': '20150306',
'uploader': 'IGN',
'uploader_id': 'xijv66',
'age_limit': 0,
'view_count': int,
},
'skip': 'video gone',
}, {
# Vevo video # Vevo video
{ 'url': 'http://www.dailymotion.com/video/x149uew_katy-perry-roar-official_musi',
'url': 'http://www.dailymotion.com/video/x149uew_katy-perry-roar-official_musi', 'info_dict': {
'info_dict': { 'title': 'Roar (Official)',
'title': 'Roar (Official)', 'id': 'USUV71301934',
'id': 'USUV71301934', 'ext': 'mp4',
'ext': 'mp4', 'uploader': 'Katy Perry',
'uploader': 'Katy Perry', 'upload_date': '20130905',
'upload_date': '20130905',
},
'params': {
'skip_download': True,
},
'skip': 'VEVO is only available in some countries',
}, },
'params': {
'skip_download': True,
},
'skip': 'VEVO is only available in some countries',
}, {
# age-restricted video # age-restricted video
{ 'url': 'http://www.dailymotion.com/video/xyh2zz_leanna-decker-cyber-girl-of-the-year-desires-nude-playboy-plus_redband',
'url': 'http://www.dailymotion.com/video/xyh2zz_leanna-decker-cyber-girl-of-the-year-desires-nude-playboy-plus_redband', 'md5': '0d667a7b9cebecc3c89ee93099c4159d',
'md5': '0d667a7b9cebecc3c89ee93099c4159d', 'info_dict': {
'info_dict': { 'id': 'xyh2zz',
'id': 'xyh2zz', 'ext': 'mp4',
'ext': 'mp4', 'title': 'Leanna Decker - Cyber Girl Of The Year Desires Nude [Playboy Plus]',
'title': 'Leanna Decker - Cyber Girl Of The Year Desires Nude [Playboy Plus]', 'uploader': 'HotWaves1012',
'uploader': 'HotWaves1012', 'age_limit': 18,
'age_limit': 18,
},
'skip': 'video gone',
}, },
'skip': 'video gone',
}, {
# geo-restricted, player v5 # geo-restricted, player v5
{ 'url': 'http://www.dailymotion.com/video/xhza0o',
'url': 'http://www.dailymotion.com/video/xhza0o', 'only_matching': True,
'only_matching': True, }, {
},
# with subtitles # with subtitles
{ 'url': 'http://www.dailymotion.com/video/x20su5f_the-power-of-nightmares-1-the-rise-of-the-politics-of-fear-bbc-2004_news',
'url': 'http://www.dailymotion.com/video/x20su5f_the-power-of-nightmares-1-the-rise-of-the-politics-of-fear-bbc-2004_news', 'only_matching': True,
'only_matching': True, }, {
}, 'url': 'http://www.dailymotion.com/swf/video/x3n92nf',
{ 'only_matching': True,
'url': 'http://www.dailymotion.com/swf/video/x3n92nf', }, {
'only_matching': True, 'url': 'http://www.dailymotion.com/swf/x3ss1m_funny-magic-trick-barry-and-stuart_fun',
} 'only_matching': True,
] }]
@staticmethod @staticmethod
def _extract_urls(webpage): def _extract_urls(webpage):
@@ -133,7 +147,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
view_count_str = self._search_regex( view_count_str = self._search_regex(
(r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:([\s\d,.]+)"', (r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:([\s\d,.]+)"',
r'video_views_count[^>]+>\s+([\s\d\,.]+)'), r'video_views_count[^>]+>\s+([\s\d\,.]+)'),
webpage, 'view count', fatal=False) webpage, 'view count', default=None)
if view_count_str: if view_count_str:
view_count_str = re.sub(r'\s', '', view_count_str) view_count_str = re.sub(r'\s', '', view_count_str)
view_count = str_to_int(view_count_str) view_count = str_to_int(view_count_str)
@@ -145,7 +159,9 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
[r'buildPlayer\(({.+?})\);\n', # See https://github.com/rg3/youtube-dl/issues/7826 [r'buildPlayer\(({.+?})\);\n', # See https://github.com/rg3/youtube-dl/issues/7826
r'playerV5\s*=\s*dmp\.create\([^,]+?,\s*({.+?})\);', r'playerV5\s*=\s*dmp\.create\([^,]+?,\s*({.+?})\);',
r'buildPlayer\(({.+?})\);', r'buildPlayer\(({.+?})\);',
r'var\s+config\s*=\s*({.+?});'], r'var\s+config\s*=\s*({.+?});',
# New layout regex (see https://github.com/rg3/youtube-dl/issues/13580)
r'__PLAYER_CONFIG__\s*=\s*({.+?});'],
webpage, 'player v5', default=None) webpage, 'player v5', default=None)
if player_v5: if player_v5:
player = self._parse_json(player_v5, video_id) player = self._parse_json(player_v5, video_id)


@@ -21,7 +21,8 @@ class DemocracynowIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '2015-0703-001', 'id': '2015-0703-001',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Daily Show', 'title': 'Daily Show for July 03, 2015',
'description': 'md5:80eb927244d6749900de6072c7cc2c86',
}, },
}, { }, {
'url': 'http://www.democracynow.org/2015/7/3/this_flag_comes_down_today_bree', 'url': 'http://www.democracynow.org/2015/7/3/this_flag_comes_down_today_bree',


@@ -15,7 +15,7 @@ from ..utils import (
class DisneyIE(InfoExtractor): class DisneyIE(InfoExtractor):
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?P<domain>(?:[^/]+\.)?(?:disney\.[a-z]{2,3}(?:\.[a-z]{2})?|disney(?:(?:me|latino)\.com|turkiye\.com\.tr)|(?:starwars|marvelkids)\.com))/(?:(?:embed/|(?:[^/]+/)+[\w-]+-)(?P<id>[a-z0-9]{24})|(?:[^/]+/)?(?P<display_id>[^/?#]+))''' https?://(?P<domain>(?:[^/]+\.)?(?:disney\.[a-z]{2,3}(?:\.[a-z]{2})?|disney(?:(?:me|latino)\.com|turkiye\.com\.tr|channel\.de)|(?:starwars|marvelkids)\.com))/(?:(?:embed/|(?:[^/]+/)+[\w-]+-)(?P<id>[a-z0-9]{24})|(?:[^/]+/)?(?P<display_id>[^/?#]+))'''
_TESTS = [{ _TESTS = [{
# Disney.EmbedVideo # Disney.EmbedVideo
'url': 'http://video.disney.com/watch/moana-trailer-545ed1857afee5a0ec239977', 'url': 'http://video.disney.com/watch/moana-trailer-545ed1857afee5a0ec239977',
@@ -68,6 +68,9 @@ class DisneyIE(InfoExtractor):
}, { }, {
'url': 'http://disneyjunior.en.disneyme.com/dj/watch-my-friends-tigger-and-pooh-promo', 'url': 'http://disneyjunior.en.disneyme.com/dj/watch-my-friends-tigger-and-pooh-promo',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://disneychannel.de/sehen/soy-luna-folge-118-5518518987ba27f3cc729268',
'only_matching': True,
}, { }, {
'url': 'http://disneyjunior.disney.com/galactech-the-galactech-grab-galactech-an-admiral-rescue', 'url': 'http://disneyjunior.disney.com/galactech-the-galactech-grab-galactech-an-admiral-rescue',
'only_matching': True, 'only_matching': True,


@@ -13,7 +13,7 @@ from ..utils import (
class DigitallySpeakingIE(InfoExtractor): class DigitallySpeakingIE(InfoExtractor):
_VALID_URL = r'https?://(?:evt\.dispeak|events\.digitallyspeaking)\.com/(?:[^/]+/)+xml/(?P<id>[^.]+)\.xml' _VALID_URL = r'https?://(?:s?evt\.dispeak|events\.digitallyspeaking)\.com/(?:[^/]+/)+xml/(?P<id>[^.]+)\.xml'
_TESTS = [{ _TESTS = [{
# From http://gdcvault.com/play/1023460/Tenacious-Design-and-The-Interface # From http://gdcvault.com/play/1023460/Tenacious-Design-and-The-Interface
@@ -28,6 +28,10 @@ class DigitallySpeakingIE(InfoExtractor):
# From http://www.gdcvault.com/play/1014631/Classic-Game-Postmortem-PAC # From http://www.gdcvault.com/play/1014631/Classic-Game-Postmortem-PAC
'url': 'http://events.digitallyspeaking.com/gdc/sf11/xml/12396_1299111843500GMPX.xml', 'url': 'http://events.digitallyspeaking.com/gdc/sf11/xml/12396_1299111843500GMPX.xml',
'only_matching': True, 'only_matching': True,
}, {
# From http://www.gdcvault.com/play/1013700/Advanced-Material
'url': 'http://sevt.dispeak.com/ubm/gdc/eur10/xml/11256_1282118587281VNIT.xml',
'only_matching': True,
}] }]
def _parse_mp4(self, metadata): def _parse_mp4(self, metadata):


@@ -35,7 +35,7 @@ class DotsubIE(InfoExtractor):
'thumbnail': 're:^https?://dotsub.com/media/747bcf58-bd59-45b7-8c8c-ac312d084ee6/p', 'thumbnail': 're:^https?://dotsub.com/media/747bcf58-bd59-45b7-8c8c-ac312d084ee6/p',
'duration': 290, 'duration': 290,
'timestamp': 1476767794.2809999, 'timestamp': 1476767794.2809999,
'upload_date': '20160525', 'upload_date': '20161018',
'uploader': 'parthivi001', 'uploader': 'parthivi001',
'uploader_id': 'user52596202', 'uploader_id': 'user52596202',
'view_count': int, 'view_count': int,


@@ -3,11 +3,14 @@ from __future__ import unicode_literals
import time import time
import hashlib import hashlib
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
unescapeHTML, unescapeHTML,
unified_strdate,
urljoin,
) )
@@ -20,7 +23,7 @@ class DouyuTVIE(InfoExtractor):
'id': '17732', 'id': '17732',
'display_id': 'iseven', 'display_id': 'iseven',
'ext': 'flv', 'ext': 'flv',
'title': 're:^清晨醒脑!T-ARA根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$', 'title': 're:^清晨醒脑!根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*', 'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅', 'uploader': '7师傅',
@@ -51,7 +54,7 @@ class DouyuTVIE(InfoExtractor):
'id': '17732', 'id': '17732',
'display_id': '17732', 'display_id': '17732',
'ext': 'flv', 'ext': 'flv',
'title': 're:^清晨醒脑!T-ARA根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$', 'title': 're:^清晨醒脑!根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*', 'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅', 'uploader': '7师傅',
@@ -117,3 +120,82 @@ class DouyuTVIE(InfoExtractor):
'uploader': uploader, 'uploader': uploader,
'is_live': True, 'is_live': True,
} }
class DouyuShowIE(InfoExtractor):
_VALID_URL = r'https?://v(?:mobile)?\.douyu\.com/show/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'https://v.douyu.com/show/rjNBdvnVXNzvE2yw',
'md5': '0c2cfd068ee2afe657801269b2d86214',
'info_dict': {
'id': 'rjNBdvnVXNzvE2yw',
'ext': 'mp4',
'title': '陈一发儿:砒霜 我有个室友系列04-01 22点场',
'duration': 7150.08,
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '陈一发儿',
'uploader_id': 'XrZwYelr5wbK',
'uploader_url': 'https://v.douyu.com/author/XrZwYelr5wbK',
'upload_date': '20170402',
},
}, {
'url': 'https://vmobile.douyu.com/show/rjNBdvnVXNzvE2yw',
'only_matching': True,
}]
def _real_extract(self, url):
url = url.replace('vmobile.', 'v.')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
room_info = self._parse_json(self._search_regex(
r'var\s+\$ROOM\s*=\s*({.+});', webpage, 'room info'), video_id)
video_info = None
for trial in range(5):
# Sometimes Douyu rejects our request. Let's try it more times
try:
video_info = self._download_json(
'https://vmobile.douyu.com/video/getInfo', video_id,
query={'vid': video_id},
headers={
'Referer': url,
'x-requested-with': 'XMLHttpRequest',
})
break
except ExtractorError:
self._sleep(1, video_id)
if not video_info:
raise ExtractorError('Can\'t fetch video info')
formats = self._extract_m3u8_formats(
video_info['data']['video_url'], video_id,
entry_protocol='m3u8_native', ext='mp4')
upload_date = unified_strdate(self._html_search_regex(
r'<em>上传时间:</em><span>([^<]+)</span>', webpage,
'upload date', fatal=False))
uploader = uploader_id = uploader_url = None
mobj = re.search(
r'(?m)<a[^>]+href="/author/([0-9a-zA-Z]+)".+?<strong[^>]+title="([^"]+)"',
webpage)
if mobj:
uploader_id, uploader = mobj.groups()
uploader_url = urljoin(url, '/author/' + uploader_id)
return {
'id': video_id,
'title': room_info['name'],
'formats': formats,
'duration': room_info.get('duration'),
'thumbnail': room_info.get('pic'),
'upload_date': upload_date,
'uploader': uploader,
'uploader_id': uploader_id,
'uploader_url': uploader_url,
}
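Note on the retry loop in `DouyuShowIE._real_extract` above: the request is attempted up to five times, with a one-second pause after each failure, and an error is raised only if no attempt produced data. The following is a minimal standalone sketch of the same control flow. `FetchError` and `fetch_with_retries` are hypothetical names introduced here, and the `sleep` parameter is an assumption added so the pause can be replaced in tests.

```python
import time


class FetchError(Exception):
    """Hypothetical error type standing in for ExtractorError."""


def fetch_with_retries(fetch, attempts=5, delay=1.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times, pausing after each failure."""
    result = None
    for _ in range(attempts):
        try:
            result = fetch()
            break  # success: stop retrying
        except FetchError:
            sleep(delay)  # request rejected: wait, then try again
    if not result:
        raise FetchError("Can't fetch video info")
    return result
```

As in the diff, an empty result is treated like a failure after the loop, so callers either receive usable data or a single clear error.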


@@ -7,16 +7,18 @@ import time
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import (
compat_urlparse,
compat_HTTPError, compat_HTTPError,
compat_str,
compat_urlparse,
) )
from ..utils import ( from ..utils import (
USER_AGENTS,
ExtractorError, ExtractorError,
int_or_none, int_or_none,
unified_strdate,
remove_end, remove_end,
try_get,
unified_strdate,
update_url_query, update_url_query,
USER_AGENTS,
) )
@@ -183,28 +185,44 @@ class DPlayItIE(InfoExtractor):
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
info_url = self._search_regex(
r'url\s*:\s*["\']((?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)',
webpage, 'video id')
title = remove_end(self._og_search_title(webpage), ' | Dplay') title = remove_end(self._og_search_title(webpage), ' | Dplay')
try: video_id = None
info = self._download_json(
info_url, display_id, headers={ info = self._search_regex(
'Authorization': 'Bearer %s' % self._get_cookies(url).get( r'playback_json\s*:\s*JSON\.parse\s*\(\s*("(?:\\.|[^"\\])+?")',
'dplayit_token').value, webpage, 'playback JSON', default=None)
'Referer': url, if info:
}) for _ in range(2):
except ExtractorError as e: info = self._parse_json(info, display_id, fatal=False)
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 403): if not info:
info = self._parse_json(e.cause.read().decode('utf-8'), display_id) break
error = info['errors'][0] else:
if error.get('code') == 'access.denied.geoblocked': video_id = try_get(info, lambda x: x['data']['id'])
self.raise_geo_restricted(
msg=error.get('detail'), countries=self._GEO_COUNTRIES) if not info:
raise ExtractorError(info['errors'][0]['detail'], expected=True) info_url = self._search_regex(
raise r'url\s*[:=]\s*["\']((?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)',
webpage, 'info url')
video_id = info_url.rpartition('/')[-1]
try:
info = self._download_json(
info_url, display_id, headers={
'Authorization': 'Bearer %s' % self._get_cookies(url).get(
'dplayit_token').value,
'Referer': url,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 403):
info = self._parse_json(e.cause.read().decode('utf-8'), display_id)
error = info['errors'][0]
if error.get('code') == 'access.denied.geoblocked':
self.raise_geo_restricted(
msg=error.get('detail'), countries=self._GEO_COUNTRIES)
raise ExtractorError(info['errors'][0]['detail'], expected=True)
raise
hls_url = info['data']['attributes']['streaming']['hls']['url'] hls_url = info['data']['attributes']['streaming']['hls']['url']
@@ -230,7 +248,7 @@ class DPlayItIE(InfoExtractor):
season_number = episode_number = upload_date = None season_number = episode_number = upload_date = None
return { return {
'id': info_url.rpartition('/')[-1], 'id': compat_str(video_id or display_id),
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'description': self._og_search_description(webpage), 'description': self._og_search_description(webpage),
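Note on the `playback_json` handling above: the page embeds the data as `JSON.parse("...")`, that is, a JSON document wrapped in a JSON string literal, which is why the new code parses twice and gives up if either pass fails. The following is a minimal sketch of that idea using only the standard library. `parse_double_encoded` is a hypothetical helper written for this note, not part of youtube-dl.

```python
import json


def parse_double_encoded(raw):
    """Decode a JSON string literal whose content is itself JSON.

    Returns None if either decoding pass fails.
    """
    value = raw
    for _ in range(2):
        try:
            value = json.loads(value)
        except (TypeError, ValueError):
            # Pass 1 fails on invalid JSON; pass 2 fails if the first
            # pass did not yield a string containing JSON.
            return None
    return value
```

With this shape, a caller can fall back to another source of the data (in the diff, the `videoPlaybackInfo` URL) whenever the helper returns `None`.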


@@ -12,6 +12,7 @@ from ..utils import (
ExtractorError, ExtractorError,
clean_html, clean_html,
int_or_none, int_or_none,
remove_end,
sanitized_Request, sanitized_Request,
urlencode_postdata urlencode_postdata
) )
@@ -72,15 +73,15 @@ class DramaFeverIE(DramaFeverBaseIE):
'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/', 'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'info_dict': { 'info_dict': {
'id': '4512.1', 'id': '4512.1',
'ext': 'mp4', 'ext': 'flv',
'title': 'Cooking with Shin 4512.1', 'title': 'Cooking with Shin',
'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0', 'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
'episode': 'Episode 1', 'episode': 'Episode 1',
'episode_number': 1, 'episode_number': 1,
'thumbnail': r're:^https?://.*\.jpg', 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1404336058, 'timestamp': 1404336058,
'upload_date': '20140702', 'upload_date': '20140702',
'duration': 343, 'duration': 344,
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@@ -90,15 +91,15 @@ class DramaFeverIE(DramaFeverBaseIE):
'url': 'http://www.dramafever.com/drama/4826/4/Mnet_Asian_Music_Awards_2015/?ap=1', 'url': 'http://www.dramafever.com/drama/4826/4/Mnet_Asian_Music_Awards_2015/?ap=1',
'info_dict': { 'info_dict': {
'id': '4826.4', 'id': '4826.4',
'ext': 'mp4', 'ext': 'flv',
'title': 'Mnet Asian Music Awards 2015 4826.4', 'title': 'Mnet Asian Music Awards 2015',
'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91', 'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91',
'episode': 'Mnet Asian Music Awards 2015 - Part 3', 'episode': 'Mnet Asian Music Awards 2015 - Part 3',
'episode_number': 4, 'episode_number': 4,
'thumbnail': r're:^https?://.*\.jpg', 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1450213200, 'timestamp': 1450213200,
'upload_date': '20151215', 'upload_date': '20151215',
'duration': 5602, 'duration': 5359,
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@@ -122,6 +123,10 @@ class DramaFeverIE(DramaFeverBaseIE):
countries=self._GEO_COUNTRIES) countries=self._GEO_COUNTRIES)
raise raise
# title is postfixed with video id for some reason, removing
if info.get('title'):
info['title'] = remove_end(info['title'], video_id).strip()
series_id, episode_number = video_id.split('.') series_id, episode_number = video_id.split('.')
episode_info = self._download_json( episode_info = self._download_json(
# We only need a single episode info, so restricting page size to one episode # We only need a single episode info, so restricting page size to one episode
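Note on the title cleanup above: the API title ends with the video id, so the new code strips that suffix and then trims the remaining whitespace, which matches the updated test titles. The following is a minimal standalone sketch. `remove_end_sketch` and `clean_title` are hypothetical names for illustration; the sketch also guards against an empty suffix, which is an assumption of this note rather than something shown in the diff.

```python
def remove_end_sketch(s, end):
    """Strip `end` from the tail of `s` when present (hypothetical stand-in)."""
    if end and s.endswith(end):
        return s[:-len(end)]
    return s


def clean_title(title, video_id):
    # Drop the trailing video id, then trim the space left before it.
    return remove_end_sketch(title, video_id).strip()
```

Titles that do not end with the id pass through unchanged apart from whitespace trimming.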


@@ -1,135 +1,59 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none, js_to_json,
parse_iso8601, parse_duration,
unescapeHTML,
) )
class DRBonanzaIE(InfoExtractor): class DRBonanzaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dr\.dk/bonanza/(?:[^/]+/)+(?:[^/])+?(?:assetId=(?P<id>\d+))?(?:[#&]|$)' _VALID_URL = r'https?://(?:www\.)?dr\.dk/bonanza/[^/]+/\d+/[^/]+/(?P<id>\d+)/(?P<display_id>[^/?#&]+)'
_TEST = {
_TESTS = [{ 'url': 'http://www.dr.dk/bonanza/serie/154/matador/40312/matador---0824-komme-fremmede-',
'url': 'http://www.dr.dk/bonanza/serie/portraetter/Talkshowet.htm?assetId=65517',
'info_dict': { 'info_dict': {
'id': '65517', 'id': '40312',
'display_id': 'matador---0824-komme-fremmede-',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Talkshowet - Leonard Cohen', 'title': 'MATADOR - 08:24. "Komme fremmede".',
'description': 'md5:8f34194fb30cd8c8a30ad8b27b70c0ca', 'description': 'md5:77b4c1ac4d4c1b9d610ab4395212ff84',
'thumbnail': r're:^https?://.*\.(?:gif|jpg)$', 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
'timestamp': 1295537932, 'duration': 4613,
'upload_date': '20110120',
'duration': 3664,
}, },
'params': { }
'skip_download': True, # requires rtmp
},
}, {
'url': 'http://www.dr.dk/bonanza/radio/serie/sport/fodbold.htm?assetId=59410',
'md5': '6dfe039417e76795fb783c52da3de11d',
'info_dict': {
'id': '59410',
'ext': 'mp3',
'title': 'EM fodbold 1992 Danmark - Tyskland finale Transmission',
'description': 'md5:501e5a195749480552e214fbbed16c4e',
'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
'timestamp': 1223274900,
'upload_date': '20081006',
'duration': 7369,
},
}]
def _real_extract(self, url): def _real_extract(self, url):
url_id = self._match_id(url) mobj = re.match(self._VALID_URL, url)
webpage = self._download_webpage(url, url_id) video_id, display_id = mobj.group('id', 'display_id')
if url_id: webpage = self._download_webpage(url, display_id)
info = json.loads(self._html_search_regex(r'({.*?%s.*})' % url_id, webpage, 'json'))
else:
# Just fetch the first video on that page
info = json.loads(self._html_search_regex(r'bonanzaFunctions.newPlaylist\(({.*})\)', webpage, 'json'))
asset_id = str(info['AssetId']) info = self._parse_html5_media_entries(
title = info['Title'].rstrip(' \'\"-,.:;!?') url, webpage, display_id, m3u8_id='hls',
duration = int_or_none(info.get('Duration'), scale=1000) m3u8_entry_protocol='m3u8_native')[0]
# First published online. "FirstPublished" contains the date for original airing. self._sort_formats(info['formats'])
timestamp = parse_iso8601(
re.sub(r'\.\d+$', '', info['Created']))
def parse_filename_info(url): asset = self._parse_json(
match = re.search(r'/\d+_(?P<width>\d+)x(?P<height>\d+)x(?P<bitrate>\d+)K\.(?P<ext>\w+)$', url) self._search_regex(
if match: r'(?s)currentAsset\s*=\s*({.+?})\s*</script', webpage, 'asset'),
return { display_id, transform_source=js_to_json)
'width': int(match.group('width')),
'height': int(match.group('height')),
'vbr': int(match.group('bitrate')),
'ext': match.group('ext')
}
match = re.search(r'/\d+_(?P<bitrate>\d+)K\.(?P<ext>\w+)$', url)
if match:
return {
'vbr': int(match.group('bitrate')),
'ext': match.group(2)
}
return {}
video_types = ['VideoHigh', 'VideoMid', 'VideoLow'] title = unescapeHTML(asset['AssetTitle']).strip()
preferencemap = {
'VideoHigh': -1,
'VideoMid': -2,
'VideoLow': -3,
'Audio': -4,
}
formats = [] def extract(field):
for file in info['Files']: return self._search_regex(
if info['Type'] == 'Video': r'<div[^>]+>\s*<p>%s:<p>\s*</div>\s*<div[^>]+>\s*<p>([^<]+)</p>' % field,
if file['Type'] in video_types: webpage, field, default=None)
format = parse_filename_info(file['Location'])
format.update({
'url': file['Location'],
'format_id': file['Type'].replace('Video', ''),
'preference': preferencemap.get(file['Type'], -10),
})
if format['url'].startswith('rtmp'):
rtmp_url = format['url']
format['rtmp_live'] = True # --resume does not work
if '/bonanza/' in rtmp_url:
format['play_path'] = rtmp_url.split('/bonanza/')[1]
formats.append(format)
elif file['Type'] == 'Thumb':
thumbnail = file['Location']
elif info['Type'] == 'Audio':
if file['Type'] == 'Audio':
format = parse_filename_info(file['Location'])
format.update({
'url': file['Location'],
'format_id': file['Type'],
'vcodec': 'none',
})
formats.append(format)
elif file['Type'] == 'Thumb':
thumbnail = file['Location']
description = '%s\n%s\n%s\n' % ( info.update({
info['Description'], info['Actors'], info['Colophon']) 'id': asset.get('AssetId') or video_id,
self._sort_formats(formats)
display_id = re.sub(r'[^\w\d-]', '', re.sub(r' ', '-', title.lower())) + '-' + asset_id
display_id = re.sub(r'-+', '-', display_id)
return {
'id': asset_id,
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'formats': formats, 'description': extract('Programinfo'),
'description': description, 'duration': parse_duration(extract('Tid')),
'thumbnail': thumbnail, 'thumbnail': asset.get('AssetImageUrl'),
'timestamp': timestamp, })
'duration': duration, return info
}


@@ -44,8 +44,23 @@ class DrTuberIE(InfoExtractor):
webpage = self._download_webpage( webpage = self._download_webpage(
'http://www.drtuber.com/video/%s' % video_id, display_id) 'http://www.drtuber.com/video/%s' % video_id, display_id)
video_url = self._html_search_regex( video_data = self._download_json(
r'<source src="([^"]+)"', webpage, 'video URL') 'http://www.drtuber.com/player_config_json/', video_id, query={
'vid': video_id,
'embed': 0,
'aid': 0,
'domain_id': 0,
})
formats = []
for format_id, video_url in video_data['files'].items():
if video_url:
formats.append({
'format_id': format_id,
'quality': 2 if format_id == 'hq' else 1,
'url': video_url
})
self._sort_formats(formats)
title = self._html_search_regex( title = self._html_search_regex(
(r'class="title_watch"[^>]*><(?:p|h\d+)[^>]*>([^<]+)<', (r'class="title_watch"[^>]*><(?:p|h\d+)[^>]*>([^<]+)<',
@@ -75,7 +90,7 @@ class DrTuberIE(InfoExtractor):
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'url': video_url, 'formats': formats,
'title': title, 'title': title,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'like_count': like_count, 'like_count': like_count,


@@ -20,7 +20,7 @@ class DRTVIE(InfoExtractor):
IE_NAME = 'drtv' IE_NAME = 'drtv'
_TESTS = [{ _TESTS = [{
'url': 'https://www.dr.dk/tv/se/boern/ultra/klassen-ultra/klassen-darlig-taber-10', 'url': 'https://www.dr.dk/tv/se/boern/ultra/klassen-ultra/klassen-darlig-taber-10',
'md5': '25e659cccc9a2ed956110a299fdf5983', 'md5': '7ae17b4e18eb5d29212f424a7511c184',
'info_dict': { 'info_dict': {
'id': 'klassen-darlig-taber-10', 'id': 'klassen-darlig-taber-10',
'ext': 'mp4', 'ext': 'mp4',
@@ -30,21 +30,37 @@ class DRTVIE(InfoExtractor):
'upload_date': '20160823', 'upload_date': '20160823',
'duration': 606.84, 'duration': 606.84,
}, },
'params': {
'skip_download': True,
},
}, { }, {
# embed
'url': 'https://www.dr.dk/nyheder/indland/live-christianias-rydning-af-pusher-street-er-i-gang', 'url': 'https://www.dr.dk/nyheder/indland/live-christianias-rydning-af-pusher-street-er-i-gang',
'md5': '2c37175c718155930f939ef59952474a',
'info_dict': { 'info_dict': {
'id': 'christiania-pusher-street-ryddes-drdkrjpo', 'id': 'christiania-pusher-street-ryddes-drdkrjpo',
'ext': 'mp4', 'ext': 'mp4',
'title': 'LIVE Christianias rydning af Pusher Street er i gang', 'title': 'LIVE Christianias rydning af Pusher Street er i gang',
'description': '- Det er det fedeste, der er sket i 20 år, fortæller christianit til DR Nyheder.', 'description': 'md5:2a71898b15057e9b97334f61d04e6eb5',
'timestamp': 1472800279, 'timestamp': 1472800279,
'upload_date': '20160902', 'upload_date': '20160902',
'duration': 131.4, 'duration': 131.4,
}, },
'params': {
'skip_download': True,
},
}, {
# with SignLanguage formats
'url': 'https://www.dr.dk/tv/se/historien-om-danmark/-/historien-om-danmark-stenalder',
'info_dict': {
'id': 'historien-om-danmark-stenalder',
'ext': 'mp4',
'title': 'Historien om Danmark: Stenalder (1)',
'description': 'md5:8c66dcbc1669bbc6f873879880f37f2a',
'timestamp': 1490401996,
'upload_date': '20170325',
'duration': 3502.04,
'formats': 'mincount:20',
},
'params': {
'skip_download': True,
},
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -88,7 +104,7 @@ class DRTVIE(InfoExtractor):
elif kind in ('VideoResource', 'AudioResource'): elif kind in ('VideoResource', 'AudioResource'):
duration = float_or_none(asset.get('DurationInMilliseconds'), 1000) duration = float_or_none(asset.get('DurationInMilliseconds'), 1000)
restricted_to_denmark = asset.get('RestrictedToDenmark') restricted_to_denmark = asset.get('RestrictedToDenmark')
spoken_subtitles = asset.get('Target') == 'SpokenSubtitles' asset_target = asset.get('Target')
for link in asset.get('Links', []): for link in asset.get('Links', []):
uri = link.get('Uri') uri = link.get('Uri')
if not uri: if not uri:
@@ -96,13 +112,13 @@ class DRTVIE(InfoExtractor):
target = link.get('Target') target = link.get('Target')
format_id = target or '' format_id = target or ''
preference = None preference = None
if spoken_subtitles: if asset_target in ('SpokenSubtitles', 'SignLanguage'):
preference = -1 preference = -1
format_id += '-spoken-subtitles' format_id += '-%s' % asset_target
if target == 'HDS': if target == 'HDS':
f4m_formats = self._extract_f4m_formats( f4m_formats = self._extract_f4m_formats(
uri + '?hdcore=3.3.0&plugin=aasp-3.3.0.99.43', uri + '?hdcore=3.3.0&plugin=aasp-3.3.0.99.43',
video_id, preference, f4m_id=format_id) video_id, preference, f4m_id=format_id, fatal=False)
if kind == 'AudioResource': if kind == 'AudioResource':
for f in f4m_formats: for f in f4m_formats:
f['vcodec'] = 'none' f['vcodec'] = 'none'
@@ -110,7 +126,8 @@ class DRTVIE(InfoExtractor):
elif target == 'HLS': elif target == 'HLS':
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
uri, video_id, 'mp4', entry_protocol='m3u8_native', uri, video_id, 'mp4', entry_protocol='m3u8_native',
preference=preference, m3u8_id=format_id)) preference=preference, m3u8_id=format_id,
fatal=False))
else: else:
bitrate = link.get('Bitrate') bitrate = link.get('Bitrate')
if bitrate: if bitrate:


@@ -5,9 +5,12 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
js_to_json, determine_ext,
unescapeHTML,
ExtractorError, ExtractorError,
int_or_none,
js_to_json,
mimetype2ext,
unescapeHTML,
) )
@@ -24,14 +27,7 @@ class DVTVIE(InfoExtractor):
'id': 'dc0768de855511e49e4b0025900fea04', 'id': 'dc0768de855511e49e4b0025900fea04',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Vondra o Českém století: Při pohledu na Havla mi bylo trapně', 'title': 'Vondra o Českém století: Při pohledu na Havla mi bylo trapně',
} 'duration': 1484,
}, {
'url': 'http://video.aktualne.cz/dvtv/stropnicky-policie-vrbetice-preventivne-nekontrolovala/r~82ed4322849211e4a10c0025900fea04/',
'md5': '6388f1941b48537dbd28791f712af8bf',
'info_dict': {
'id': '72c02230849211e49f60002590604f2e',
'ext': 'mp4',
'title': 'Stropnický: Policie Vrbětice preventivně nekontrolovala',
} }
}, { }, {
'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/', 'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/',
@@ -44,55 +40,100 @@ class DVTVIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': 'b0b40906854d11e4bdad0025900fea04', 'id': 'b0b40906854d11e4bdad0025900fea04',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Drtinová Veselovský TV 16. 12. 2014: Témata dne' 'title': 'Drtinová Veselovský TV 16. 12. 2014: Témata dne',
'description': 'md5:0916925dea8e30fe84222582280b47a0',
'timestamp': 1418760010,
'upload_date': '20141216',
} }
}, { }, {
'md5': '5f7652a08b05009c1292317b449ffea2', 'md5': '5f7652a08b05009c1292317b449ffea2',
'info_dict': { 'info_dict': {
'id': '420ad9ec854a11e4bdad0025900fea04', 'id': '420ad9ec854a11e4bdad0025900fea04',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Školní masakr možná změní boj s Talibanem, říká novinářka' 'title': 'Školní masakr možná změní boj s Talibanem, říká novinářka',
'description': 'md5:ff2f9f6de73c73d7cef4f756c1c1af42',
'timestamp': 1418760010,
'upload_date': '20141216',
} }
}, { }, {
'md5': '498eb9dfa97169f409126c617e2a3d64', 'md5': '498eb9dfa97169f409126c617e2a3d64',
'info_dict': { 'info_dict': {
'id': '95d35580846a11e4b6d20025900fea04', 'id': '95d35580846a11e4b6d20025900fea04',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Boj o kliniku: Veřejný zájem, nebo právo na majetek?' 'title': 'Boj o kliniku: Veřejný zájem, nebo právo na majetek?',
'description': 'md5:889fe610a70fee5511dc3326a089188e',
'timestamp': 1418760010,
'upload_date': '20141216',
} }
}, { }, {
'md5': 'b8dc6b744844032dab6ba3781a7274b9', 'md5': 'b8dc6b744844032dab6ba3781a7274b9',
'info_dict': { 'info_dict': {
'id': '6fe14d66853511e4833a0025900fea04', 'id': '6fe14d66853511e4833a0025900fea04',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Pánek: Odmítání syrských uprchlíků je ostudou české vlády' 'title': 'Pánek: Odmítání syrských uprchlíků je ostudou české vlády',
'description': 'md5:544f86de6d20c4815bea11bf2ac3004f',
'timestamp': 1418760010,
'upload_date': '20141216',
} }
}], }],
}, {
'url': 'https://video.aktualne.cz/dvtv/zeman-si-jen-leci-mindraky-sobotku-nenavidi-a-babis-se-mu-te/r~960cdb3a365a11e7a83b0025900fea04/',
'md5': 'f8efe9656017da948369aa099788c8ea',
'info_dict': {
'id': '3c496fec365911e7a6500025900fea04',
'ext': 'mp4',
'title': 'Zeman si jen léčí mindráky, Sobotku nenávidí a Babiš se mu teď hodí, tvrdí Kmenta',
'duration': 1103,
},
'params': {
'skip_download': True,
},
}, { }, {
'url': 'http://video.aktualne.cz/v-cechach-poprve-zazni-zelenkova-zrestaurovana-mse/r~45b4b00483ec11e4883b002590604f2e/', 'url': 'http://video.aktualne.cz/v-cechach-poprve-zazni-zelenkova-zrestaurovana-mse/r~45b4b00483ec11e4883b002590604f2e/',
'only_matching': True, 'only_matching': True,
}] }]
def _parse_video_metadata(self, js, video_id): def _parse_video_metadata(self, js, video_id):
metadata = self._parse_json(js, video_id, transform_source=js_to_json) data = self._parse_json(js, video_id, transform_source=js_to_json)
title = unescapeHTML(data['title'])
formats = [] formats = []
for video in metadata['sources']: for video in data['sources']:
ext = video['type'][6:] video_url = video.get('file')
formats.append({ if not video_url:
'url': video['file'], continue
'ext': ext, video_type = video.get('type')
'format_id': '%s-%s' % (ext, video['label']), ext = determine_ext(video_url, mimetype2ext(video_type))
'height': int(video['label'].rstrip('p')), if video_type == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
'fps': 25, formats.extend(self._extract_m3u8_formats(
}) video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif video_type == 'application/dash+xml' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
label = video.get('label')
height = self._search_regex(
r'^(\d+)[pP]', label or '', 'height', default=None)
format_id = ['http']
for f in (ext, label):
if f:
format_id.append(f)
formats.append({
'url': video_url,
'format_id': '-'.join(format_id),
'height': int_or_none(height),
})
self._sort_formats(formats) self._sort_formats(formats)
return { return {
'id': metadata['mediaid'], 'id': data.get('mediaid') or video_id,
'title': unescapeHTML(metadata['title']), 'title': title,
'thumbnail': self._proto_relative_url(metadata['image'], 'http:'), 'description': data.get('description'),
'thumbnail': data.get('image'),
'duration': int_or_none(data.get('duration')),
'timestamp': int_or_none(data.get('pubtime')),
'formats': formats 'formats': formats
} }
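
Review note: the new fallback branch for plain HTTP sources in `_parse_video_metadata` can be sketched outside youtube-dl as below. This is a minimal illustration, not the extractor's code. `build_http_format` is a hypothetical helper, and plain `re` stands in for the `_search_regex` and `int_or_none` utilities used in the diff.

```python
import re


def build_http_format(video_url, ext=None, label=None):
    """Hypothetical stand-in for the `else:` branch of the new code."""
    # Height is taken from labels such as '720p'; anything else yields None.
    mobj = re.search(r'^(\d+)[pP]', label or '')
    height = int(mobj.group(1)) if mobj else None
    # format_id is 'http' plus whichever of ext/label are present.
    format_id = ['http'] + [f for f in (ext, label) if f]
    return {
        'url': video_url,
        'format_id': '-'.join(format_id),
        'height': height,
    }


with_label = build_http_format('http://example.com/v.mp4', 'mp4', '720p')
without_label = build_http_format('http://example.com/v.mp4', 'mp4')
print(with_label['format_id'], with_label['height'])
print(without_label['format_id'], without_label['height'])
```

Unlike the removed `int(video['label'].rstrip('p'))`, a missing or non-numeric label does not raise here; it only leaves `height` unset.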
@@ -103,7 +144,7 @@ class DVTVIE(InfoExtractor):
# single video # single video
item = self._search_regex( item = self._search_regex(
r"(?s)embedData[0-9a-f]{32}\['asset'\]\s*=\s*(\{.+?\});", r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});',
webpage, 'video', default=None, fatal=False) webpage, 'video', default=None, fatal=False)
if item: if item:
@@ -113,6 +154,8 @@ class DVTVIE(InfoExtractor):
items = re.findall( items = re.findall(
r"(?s)BBX\.context\.assets\['[0-9a-f]{32}'\]\.push\(({.+?})\);", r"(?s)BBX\.context\.assets\['[0-9a-f]{32}'\]\.push\(({.+?})\);",
webpage) webpage)
if not items:
items = re.findall(r'(?s)var\s+asset\s*=\s*({.+?});\n', webpage)
if items: if items:
return { return {


@@ -11,6 +11,7 @@ from ..compat import (
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
unsmuggle_url,
) )
@@ -50,6 +51,10 @@ class EaglePlatformIE(InfoExtractor):
'view_count': int, 'view_count': int,
}, },
'skip': 'Georestricted', 'skip': 'Georestricted',
}, {
# referrer protected video (https://tvrain.ru/lite/teleshow/kak_vse_nachinalos/namin-418921/)
'url': 'eagleplatform:tvrainru.media.eagleplatform.com:582306',
'only_matching': True,
}] }]
@staticmethod @staticmethod
@@ -60,16 +65,40 @@ class EaglePlatformIE(InfoExtractor):
webpage) webpage)
if mobj is not None: if mobj is not None:
return mobj.group('url') return mobj.group('url')
# Basic usage embedding (see http://dultonmedia.github.io/eplayer/) PLAYER_JS_RE = r'''
<script[^>]+
src=(?P<qjs>["\'])(?:https?:)?//(?P<host>(?:(?!(?P=qjs)).)+\.media\.eagleplatform\.com)/player/player\.js(?P=qjs)
.+?
'''
# "Basic usage" embedding (see http://dultonmedia.github.io/eplayer/)
mobj = re.search( mobj = re.search(
r'''(?xs) r'''(?xs)
<script[^>]+ %s
src=(?P<q1>["\'])(?:https?:)?//(?P<host>.+?\.media\.eagleplatform\.com)/player/player\.js(?P=q1)
.+?
<div[^>]+ <div[^>]+
class=(?P<q2>["\'])eagleplayer(?P=q2)[^>]+ class=(?P<qclass>["\'])eagleplayer(?P=qclass)[^>]+
data-id=["\'](?P<id>\d+) data-id=["\'](?P<id>\d+)
''', webpage) ''' % PLAYER_JS_RE, webpage)
if mobj is not None:
return 'eagleplatform:%(host)s:%(id)s' % mobj.groupdict()
# Generalization of "Javascript code usage", "Combined usage" and
# "Usage without attaching to DOM" embeddings (see
# http://dultonmedia.github.io/eplayer/)
mobj = re.search(
r'''(?xs)
%s
<script>
.+?
new\s+EaglePlayer\(
(?:[^,]+\s*,\s*)?
{
.+?
\bid\s*:\s*["\']?(?P<id>\d+)
.+?
}
\s*\)
.+?
</script>
''' % PLAYER_JS_RE, webpage)
if mobj is not None: if mobj is not None:
return 'eagleplatform:%(host)s:%(id)s' % mobj.groupdict() return 'eagleplatform:%(host)s:%(id)s' % mobj.groupdict()
@@ -79,9 +108,10 @@ class EaglePlatformIE(InfoExtractor):
if status != 200: if status != 200:
raise ExtractorError(' '.join(response['errors']), expected=True) raise ExtractorError(' '.join(response['errors']), expected=True)
def _download_json(self, url_or_request, video_id, note='Downloading JSON metadata', *args, **kwargs): def _download_json(self, url_or_request, video_id, *args, **kwargs):
try: try:
response = super(EaglePlatformIE, self)._download_json(url_or_request, video_id, note) response = super(EaglePlatformIE, self)._download_json(
url_or_request, video_id, *args, **kwargs)
except ExtractorError as ee: except ExtractorError as ee:
if isinstance(ee.cause, compat_HTTPError): if isinstance(ee.cause, compat_HTTPError):
response = self._parse_json(ee.cause.read().decode('utf-8'), video_id) response = self._parse_json(ee.cause.read().decode('utf-8'), video_id)
@@ -93,11 +123,24 @@ class EaglePlatformIE(InfoExtractor):
return self._download_json(url_or_request, video_id, note)['data'][0] return self._download_json(url_or_request, video_id, note)['data'][0]
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
host, video_id = mobj.group('custom_host') or mobj.group('host'), mobj.group('id') host, video_id = mobj.group('custom_host') or mobj.group('host'), mobj.group('id')
headers = {}
query = {
'id': video_id,
}
referrer = smuggled_data.get('referrer')
if referrer:
headers['Referer'] = referrer
query['referrer'] = referrer
player_data = self._download_json( player_data = self._download_json(
'http://%s/api/player_data?id=%s' % (host, video_id), video_id) 'http://%s/api/player_data' % host, video_id,
headers=headers, query=query)
media = player_data['data']['playlist']['viewports'][0]['medialist'][0] media = player_data['data']['playlist']['viewports'][0]['medialist'][0]
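
Review note: the referrer handling added to `_real_extract` amounts to sending the smuggled referrer twice, once as the `Referer` header and once as a `referrer` query parameter. A rough standalone sketch, assuming Python 3 and a hypothetical helper name `build_player_data_request` (the extractor itself passes `headers=` and `query=` to `_download_json`):

```python
from urllib.parse import urlencode


def build_player_data_request(host, video_id, referrer=None):
    """Hypothetical helper mirroring the headers/query logic in the diff."""
    headers = {}
    query = {'id': video_id}
    if referrer:
        # Header uses the HTTP spelling 'Referer'; the API parameter is 'referrer'.
        headers['Referer'] = referrer
        query['referrer'] = referrer
    url = 'http://%s/api/player_data?%s' % (host, urlencode(query))
    return url, headers


plain_url, plain_headers = build_player_data_request(
    'tvrainru.media.eagleplatform.com', '582306')
ref_url, ref_headers = build_player_data_request(
    'tvrainru.media.eagleplatform.com', '582306', 'https://tvrain.ru/')
print(plain_url)
print(ref_url, ref_headers)
```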


@@ -1,15 +1,18 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import (
int_or_none,
try_get,
unified_timestamp,
)
class EggheadCourseIE(InfoExtractor): class EggheadCourseIE(InfoExtractor):
IE_DESC = 'egghead.io course' IE_DESC = 'egghead.io course'
IE_NAME = 'egghead:course' IE_NAME = 'egghead:course'
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[a-zA-Z_0-9-]+)' _VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
_TEST = { _TEST = {
'url': 'https://egghead.io/courses/professor-frisby-introduces-composable-functional-javascript', 'url': 'https://egghead.io/courses/professor-frisby-introduces-composable-functional-javascript',
'playlist_count': 29, 'playlist_count': 29,
@@ -22,18 +25,60 @@ class EggheadCourseIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
title = self._html_search_regex(r'<h1 class="title">([^<]+)</h1>', webpage, 'title') course = self._download_json(
ul = self._search_regex(r'(?s)<ul class="series-lessons-list">(.*?)</ul>', webpage, 'session list') 'https://egghead.io/api/v1/series/%s' % playlist_id, playlist_id)
found = re.findall(r'(?s)<a class="[^"]*"\s*href="([^"]+)">\s*<li class="item', ul) entries = [
entries = [self.url_result(m) for m in found] self.url_result(
'wistia:%s' % lesson['wistia_id'], ie='Wistia',
video_id=lesson['wistia_id'], video_title=lesson.get('title'))
for lesson in course['lessons'] if lesson.get('wistia_id')]
return self.playlist_result(
entries, playlist_id, course.get('title'),
course.get('description'))
class EggheadLessonIE(InfoExtractor):
IE_DESC = 'egghead.io lesson'
IE_NAME = 'egghead:lesson'
_VALID_URL = r'https://egghead\.io/lessons/(?P<id>[^/?#&]+)'
_TEST = {
'url': 'https://egghead.io/lessons/javascript-linear-data-flow-with-container-style-types-box',
'info_dict': {
'id': 'fv5yotjxcg',
'ext': 'mp4',
'title': 'Create linear data flow with container style types (Box)',
'description': 'md5:9aa2cdb6f9878ed4c39ec09e85a8150e',
'thumbnail': r're:^https?:.*\.jpg$',
'timestamp': 1481296768,
'upload_date': '20161209',
'duration': 304,
'view_count': 0,
'tags': ['javascript', 'free'],
},
'params': {
'skip_download': True,
},
}
def _real_extract(self, url):
lesson_id = self._match_id(url)
lesson = self._download_json(
'https://egghead.io/api/v1/lessons/%s' % lesson_id, lesson_id)
return { return {
'_type': 'playlist', '_type': 'url_transparent',
'id': playlist_id, 'ie_key': 'Wistia',
'title': title, 'url': 'wistia:%s' % lesson['wistia_id'],
'description': self._og_search_description(webpage), 'id': lesson['wistia_id'],
'entries': entries, 'title': lesson.get('title'),
'description': lesson.get('summary'),
'thumbnail': lesson.get('thumb_nail'),
'timestamp': unified_timestamp(lesson.get('published_at')),
'duration': int_or_none(lesson.get('duration')),
'view_count': int_or_none(lesson.get('plays_count')),
'tags': try_get(lesson, lambda x: x['tag_list'], list),
} }


@@ -10,7 +10,25 @@ from ..utils import (
class ESPNIE(InfoExtractor): class ESPNIE(InfoExtractor):
_VALID_URL = r'https?://(?:espn\.go|(?:www\.)?espn)\.com/video/clip(?:\?.*?\bid=|/_/id/)(?P<id>\d+)' _VALID_URL = r'''(?x)
https?://
(?:
(?:(?:\w+\.)+)?espn\.go|
(?:www\.)?espn
)\.com/
(?:
(?:
video/clip|
watch/player
)
(?:
\?.*?\bid=|
/_/id/
)
)
(?P<id>\d+)
'''
_TESTS = [{ _TESTS = [{
'url': 'http://espn.go.com/video/clip?id=10365079', 'url': 'http://espn.go.com/video/clip?id=10365079',
'info_dict': { 'info_dict': {
@@ -25,20 +43,34 @@ class ESPNIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
}, { }, {
# intl video, from http://www.espnfc.us/video/mls-highlights/150/video/2743663/must-see-moments-best-of-the-mls-season 'url': 'https://broadband.espn.go.com/video/clip?id=18910086',
'url': 'http://espn.go.com/video/clip?id=2743663',
'info_dict': { 'info_dict': {
'id': '2743663', 'id': '18910086',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Must-See Moments: Best of the MLS season', 'title': 'Kyrie spins around defender for two',
'description': 'md5:4c2d7232beaea572632bec41004f0aeb', 'description': 'md5:2b0f5bae9616d26fba8808350f0d2b9b',
'timestamp': 1449446454, 'timestamp': 1489539155,
'upload_date': '20151207', 'upload_date': '20170315',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['Unable to download f4m manifest'], 'expected_warnings': ['Unable to download f4m manifest'],
}, {
'url': 'http://nonredline.sports.espn.go.com/video/clip?id=19744672',
'only_matching': True,
}, {
'url': 'https://cdn.espn.go.com/video/clip/_/id/19771774',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player?id=19141491',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player?bucketId=257&id=19505875',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player/_/id/19141491',
'only_matching': True,
}, { }, {
'url': 'http://www.espn.com/video/clip?id=10365079', 'url': 'http://www.espn.com/video/clip?id=10365079',
'only_matching': True, 'only_matching': True,
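
Review note: the widened `_VALID_URL` can be checked against the test URLs from this hunk with plain `re`. The sketch below copies both patterns from the diff; the names `OLD_VALID_URL`, `NEW_VALID_URL` and `CASES` are only for illustration.

```python
import re

OLD_VALID_URL = r'https?://(?:espn\.go|(?:www\.)?espn)\.com/video/clip(?:\?.*?\bid=|/_/id/)(?P<id>\d+)'

NEW_VALID_URL = r'''(?x)
    https?://
        (?:
            (?:(?:\w+\.)+)?espn\.go|
            (?:www\.)?espn
        )\.com/
        (?:
            (?:
                video/clip|
                watch/player
            )
            (?:
                \?.*?\bid=|
                /_/id/
            )
        )
        (?P<id>\d+)
'''

# URL -> expected video id, all taken from the _TESTS entries above.
CASES = {
    'http://espn.go.com/video/clip?id=10365079': '10365079',
    'https://broadband.espn.go.com/video/clip?id=18910086': '18910086',
    'http://nonredline.sports.espn.go.com/video/clip?id=19744672': '19744672',
    'https://cdn.espn.go.com/video/clip/_/id/19771774': '19771774',
    'http://www.espn.com/watch/player?id=19141491': '19141491',
    'http://www.espn.com/watch/player?bucketId=257&id=19505875': '19505875',
    'http://www.espn.com/watch/player/_/id/19141491': '19141491',
}

for url, expected_id in CASES.items():
    mobj = re.match(NEW_VALID_URL, url)
    print(url, '->', mobj.group('id') if mobj else None)
```

The old pattern accepts neither subdomains of `espn.go.com` nor the `watch/player` path, which is what the new alternatives add.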


@@ -41,6 +41,7 @@ from .alphaporno import AlphaPornoIE
from .amcnetworks import AMCNetworksIE from .amcnetworks import AMCNetworksIE
from .animeondemand import AnimeOnDemandIE from .animeondemand import AnimeOnDemandIE
from .anitube import AnitubeIE from .anitube import AnitubeIE
from .anvato import AnvatoIE
from .anysex import AnySexIE from .anysex import AnySexIE
from .aol import AolIE from .aol import AolIE
from .allocine import AllocineIE from .allocine import AllocineIE
@@ -70,6 +71,10 @@ from .arte import (
TheOperaPlatformIE, TheOperaPlatformIE,
ArteTVPlaylistIE, ArteTVPlaylistIE,
) )
from .asiancrush import (
AsianCrushIE,
AsianCrushPlaylistIE,
)
from .atresplayer import AtresPlayerIE from .atresplayer import AtresPlayerIE
from .atttechchannel import ATTTechChannelIE from .atttechchannel import ATTTechChannelIE
from .atvat import ATVAtIE from .atvat import ATVAtIE
@@ -87,10 +92,9 @@ from .azmedien import (
AZMedienPlaylistIE, AZMedienPlaylistIE,
AZMedienShowPlaylistIE, AZMedienShowPlaylistIE,
) )
from .azubu import AzubuIE, AzubuLiveIE
from .baidu import BaiduVideoIE from .baidu import BaiduVideoIE
from .bambuser import BambuserIE, BambuserChannelIE from .bambuser import BambuserIE, BambuserChannelIE
from .bandcamp import BandcampIE, BandcampAlbumIE from .bandcamp import BandcampIE, BandcampAlbumIE, BandcampWeeklyIE
from .bbc import ( from .bbc import (
BBCCoUkIE, BBCCoUkIE,
BBCCoUkArticleIE, BBCCoUkArticleIE,
@@ -98,7 +102,10 @@ from .bbc import (
BBCCoUkPlaylistIE, BBCCoUkPlaylistIE,
BBCIE, BBCIE,
) )
from .beampro import BeamProLiveIE from .beampro import (
BeamProLiveIE,
BeamProVodIE,
)
from .beeg import BeegIE from .beeg import BeegIE
from .behindkink import BehindKinkIE from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE from .bellmedia import BellMediaIE
@@ -178,7 +185,7 @@ from .chirbit import (
ChirbitProfileIE, ChirbitProfileIE,
) )
from .cinchcast import CinchcastIE from .cinchcast import CinchcastIE
from .clipfish import ClipfishIE from .cjsw import CJSWIE
from .cliphunter import CliphunterIE from .cliphunter import CliphunterIE
from .cliprs import ClipRsIE from .cliprs import ClipRsIE
from .clipsyndicate import ClipsyndicateIE from .clipsyndicate import ClipsyndicateIE
@@ -251,7 +258,10 @@ from .democracynow import DemocracynowIE
from .dfb import DFBIE from .dfb import DFBIE
from .dhm import DHMIE from .dhm import DHMIE
from .dotsub import DotsubIE from .dotsub import DotsubIE
from .douyutv import DouyuTVIE from .douyutv import (
DouyuShowIE,
DouyuTVIE,
)
from .dplay import ( from .dplay import (
DPlayIE, DPlayIE,
DPlayItIE, DPlayItIE,
@@ -287,7 +297,10 @@ from .dw import (
from .eagleplatform import EaglePlatformIE from .eagleplatform import EaglePlatformIE
from .ebaumsworld import EbaumsWorldIE from .ebaumsworld import EbaumsWorldIE
from .echomsk import EchoMskIE from .echomsk import EchoMskIE
from .egghead import EggheadCourseIE from .egghead import (
EggheadCourseIE,
EggheadLessonIE,
)
from .ehow import EHowIE from .ehow import EHowIE
from .eighttracks import EightTracksIE from .eighttracks import EightTracksIE
from .einthusan import EinthusanIE from .einthusan import EinthusanIE
@@ -337,7 +350,12 @@ from .flipagram import FlipagramIE
from .folketinget import FolketingetIE from .folketinget import FolketingetIE
from .footyroom import FootyRoomIE from .footyroom import FootyRoomIE
from .formula1 import Formula1IE from .formula1 import Formula1IE
from .fourtube import FourTubeIE from .fourtube import (
FourTubeIE,
PornTubeIE,
PornerBrosIE,
FuxIE,
)
from .fox import FOXIE from .fox import FOXIE
from .fox9 import FOX9IE from .fox9 import FOX9IE
from .foxgay import FoxgayIE from .foxgay import FoxgayIE
@@ -350,9 +368,9 @@ from .foxsports import FoxSportsIE
from .franceculture import FranceCultureIE from .franceculture import FranceCultureIE
from .franceinter import FranceInterIE from .franceinter import FranceInterIE
from .francetv import ( from .francetv import (
PluzzIE,
FranceTvInfoIE,
FranceTVIE, FranceTVIE,
FranceTVEmbedIE,
FranceTVInfoIE,
GenerationQuoiIE, GenerationQuoiIE,
CultureboxIE, CultureboxIE,
) )
@@ -386,7 +404,6 @@ from .globo import (
from .go import GoIE from .go import GoIE
from .go90 import Go90IE from .go90 import Go90IE
from .godtube import GodTubeIE from .godtube import GodTubeIE
from .godtv import GodTVIE
from .golem import GolemIE from .golem import GolemIE
from .googledrive import GoogleDriveIE from .googledrive import GoogleDriveIE
from .googleplus import GooglePlusIE from .googleplus import GooglePlusIE
@@ -460,6 +477,7 @@ from .jamendo import (
) )
from .jeuxvideo import JeuxVideoIE from .jeuxvideo import JeuxVideoIE
from .jove import JoveIE from .jove import JoveIE
from .joj import JojIE
from .jwplatform import JWPlatformIE from .jwplatform import JWPlatformIE
from .jpopsukitv import JpopsukiIE from .jpopsukitv import JpopsukiIE
from .kaltura import KalturaIE from .kaltura import KalturaIE
@@ -542,7 +560,9 @@ from .mangomolo import (
) )
from .matchtv import MatchTVIE from .matchtv import MatchTVIE
from .mdr import MDRIE from .mdr import MDRIE
from .mediaset import MediasetIE
from .medici import MediciIE from .medici import MediciIE
from .megaphone import MegaphoneIE
from .meipai import MeipaiIE from .meipai import MeipaiIE
from .melonvod import MelonVODIE from .melonvod import MelonVODIE
from .meta import METAIE from .meta import METAIE
@@ -569,7 +589,6 @@ from .mixcloud import (
) )
from .mlb import MLBIE from .mlb import MLBIE
from .mnet import MnetIE from .mnet import MnetIE
from .mpora import MporaIE
from .moevideo import MoeVideoIE from .moevideo import MoeVideoIE
from .mofosex import MofosexIE from .mofosex import MofosexIE
from .mojvideo import MojvideoIE from .mojvideo import MojvideoIE
@@ -630,7 +649,10 @@ from .neteasemusic import (
NetEaseMusicProgramIE, NetEaseMusicProgramIE,
NetEaseMusicDjRadioIE, NetEaseMusicDjRadioIE,
) )
from .newgrounds import NewgroundsIE from .newgrounds import (
NewgroundsIE,
NewgroundsPlaylistIE,
)
from .newstube import NewstubeIE from .newstube import NewstubeIE
from .nextmedia import ( from .nextmedia import (
NextMediaIE, NextMediaIE,
@@ -638,6 +660,10 @@ from .nextmedia import (
AppleDailyIE, AppleDailyIE,
NextTVIE, NextTVIE,
) )
from .nexx import (
NexxIE,
NexxEmbedIE,
)
from .nfb import NFBIE from .nfb import NFBIE
from .nfl import NFLIE from .nfl import NFLIE
from .nhk import NhkVodIE from .nhk import NhkVodIE
@@ -651,6 +677,7 @@ from .nick import (
NickIE, NickIE,
NickDeIE, NickDeIE,
NickNightIE, NickNightIE,
NickRuIE,
) )
from .niconico import NiconicoIE, NiconicoPlaylistIE from .niconico import NiconicoIE, NiconicoPlaylistIE
from .ninecninemedia import ( from .ninecninemedia import (
@@ -663,6 +690,8 @@ from .nintendo import NintendoIE
from .njpwworld import NJPWWorldIE from .njpwworld import NJPWWorldIE
from .nobelprize import NobelPrizeIE from .nobelprize import NobelPrizeIE
from .noco import NocoIE from .noco import NocoIE
from .nonktube import NonkTubeIE
from .noovo import NoovoIE
from .normalboots import NormalbootsIE from .normalboots import NormalbootsIE
from .nosvideo import NosVideoIE from .nosvideo import NosVideoIE
from .nova import NovaIE from .nova import NovaIE
@@ -731,8 +760,8 @@ from .openload import OpenloadIE
from .ora import OraTVIE from .ora import OraTVIE
from .orf import ( from .orf import (
ORFTVthekIE, ORFTVthekIE,
ORFOE1IE,
ORFFM4IE, ORFFM4IE,
ORFOE1IE,
ORFIPTVIE, ORFIPTVIE,
) )
from .packtpub import ( from .packtpub import (
@@ -744,6 +773,7 @@ from .pandoratv import PandoraTVIE
from .parliamentliveuk import ParliamentLiveUKIE from .parliamentliveuk import ParliamentLiveUKIE
from .patreon import PatreonIE from .patreon import PatreonIE
from .pbs import PBSIE from .pbs import PBSIE
from .pearvideo import PearVideoIE
from .people import PeopleIE from .people import PeopleIE
from .periscope import ( from .periscope import (
PeriscopeIE, PeriscopeIE,
@@ -809,11 +839,16 @@ from .radiobremen import RadioBremenIE
from .radiofrance import RadioFranceIE from .radiofrance import RadioFranceIE
from .rai import ( from .rai import (
RaiPlayIE, RaiPlayIE,
RaiPlayLiveIE,
RaiIE, RaiIE,
) )
from .rbmaradio import RBMARadioIE from .rbmaradio import RBMARadioIE
from .rds import RDSIE from .rds import RDSIE
from .redbulltv import RedBullTVIE from .redbulltv import RedBullTVIE
from .reddit import (
RedditIE,
RedditRIE,
)
from .redtube import RedTubeIE from .redtube import RedTubeIE
from .regiotv import RegioTVIE from .regiotv import RegioTVIE
from .rentv import ( from .rentv import (
@@ -860,6 +895,7 @@ from .rutube import (
) )
from .rutv import RUTVIE from .rutv import RUTVIE
from .ruutu import RuutuIE from .ruutu import RuutuIE
from .ruv import RuvIE
from .sandia import SandiaIE from .sandia import SandiaIE
from .safari import ( from .safari import (
SafariIE, SafariIE,
@@ -906,8 +942,9 @@ from .soundcloud import (
SoundcloudIE, SoundcloudIE,
SoundcloudSetIE, SoundcloudSetIE,
SoundcloudUserIE, SoundcloudUserIE,
SoundcloudTrackStationIE,
SoundcloudPlaylistIE, SoundcloudPlaylistIE,
SoundcloudSearchIE SoundcloudSearchIE,
) )
from .soundgasm import ( from .soundgasm import (
SoundgasmIE, SoundgasmIE,
@@ -956,6 +993,7 @@ from .tagesschau import (
TagesschauIE, TagesschauIE,
) )
from .tass import TassIE from .tass import TassIE
from .tastytrade import TastyTradeIE
from .tbs import TBSIE from .tbs import TBSIE
from .tdslifeway import TDSLifewayIE from .tdslifeway import TDSLifewayIE
from .teachertube import ( from .teachertube import (
@@ -964,7 +1002,6 @@ from .teachertube import (
) )
from .teachingchannel import TeachingChannelIE from .teachingchannel import TeachingChannelIE
from .teamcoco import TeamcocoIE from .teamcoco import TeamcocoIE
from .teamfourstar import TeamFourStarIE
from .techtalks import TechTalksIE from .techtalks import TechTalksIE
from .ted import TEDIE from .ted import TEDIE
from .tele13 import Tele13IE from .tele13 import Tele13IE
@@ -1013,11 +1050,6 @@ from .trilulilu import TriluliluIE
from .trutv import TruTVIE from .trutv import TruTVIE
from .tube8 import Tube8IE from .tube8 import Tube8IE
from .tubitv import TubiTvIE from .tubitv import TubiTvIE
from .tudou import (
TudouIE,
TudouPlaylistIE,
TudouAlbumIE,
)
from .tumblr import TumblrIE from .tumblr import TumblrIE
from .tunein import ( from .tunein import (
TuneInClipIE, TuneInClipIE,
@@ -1097,6 +1129,10 @@ from .uplynk import (
UplynkIE, UplynkIE,
UplynkPreplayIE, UplynkPreplayIE,
) )
from .upskill import (
UpskillIE,
UpskillCourseIE,
)
from .urort import UrortIE from .urort import UrortIE
from .urplay import URPlayIE from .urplay import URPlayIE
from .usanetwork import USANetworkIE from .usanetwork import USANetworkIE
@@ -1124,6 +1160,7 @@ from .vgtv import (
from .vh1 import VH1IE from .vh1 import VH1IE
from .vice import ( from .vice import (
ViceIE, ViceIE,
ViceArticleIE,
ViceShowIE, ViceShowIE,
) )
from .viceland import VicelandIE from .viceland import VicelandIE
@@ -1186,12 +1223,14 @@ from .vk import (
) )
from .vlive import ( from .vlive import (
VLiveIE, VLiveIE,
VLiveChannelIE VLiveChannelIE,
VLivePlaylistIE
) )
from .vodlocker import VodlockerIE from .vodlocker import VodlockerIE
from .vodpl import VODPlIE from .vodpl import VODPlIE
from .vodplatform import VODPlatformIE from .vodplatform import VODPlatformIE
from .voicerepublic import VoiceRepublicIE from .voicerepublic import VoiceRepublicIE
from .voot import VootIE
from .voxmedia import VoxMediaIE from .voxmedia import VoxMediaIE
from .vporn import VpornIE from .vporn import VpornIE
from .vrt import VRTIE from .vrt import VRTIE
@@ -1213,6 +1252,7 @@ from .washingtonpost import (
WashingtonPostArticleIE, WashingtonPostArticleIE,
) )
from .wat import WatIE from .wat import WatIE
from .watchbox import WatchBoxIE
from .watchindianporn import WatchIndianPornIE from .watchindianporn import WatchIndianPornIE
from .wdr import ( from .wdr import (
WDRIE, WDRIE,
@@ -1262,12 +1302,12 @@ from .yahoo import (
YahooIE, YahooIE,
YahooSearchIE, YahooSearchIE,
) )
from .yam import YamIE
from .yandexmusic import ( from .yandexmusic import (
YandexMusicTrackIE, YandexMusicTrackIE,
YandexMusicAlbumIE, YandexMusicAlbumIE,
YandexMusicPlaylistIE, YandexMusicPlaylistIE,
) )
from .yandexdisk import YandexDiskIE
from .yesjapan import YesJapanIE from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE from .yinyuetai import YinYueTaiIE
from .ynet import YnetIE from .ynet import YnetIE
@@ -1299,5 +1339,6 @@ from .youtube import (
YoutubeWatchLaterIE, YoutubeWatchLaterIE,
) )
from .zapiks import ZapiksIE from .zapiks import ZapiksIE
from .zaq1 import Zaq1IE
from .zdf import ZDFIE, ZDFChannelIE from .zdf import ZDFIE, ZDFChannelIE
from .zingmp3 import ZingMp3IE from .zingmp3 import ZingMp3IE


@@ -203,19 +203,19 @@ class FacebookIE(InfoExtractor):
}] }]
@staticmethod @staticmethod
def _extract_url(webpage): def _extract_urls(webpage):
mobj = re.search( urls = []
r'<iframe[^>]+?src=(["\'])(?P<url>https://www\.facebook\.com/video/embed.+?)\1', webpage) for mobj in re.finditer(
if mobj is not None: r'<iframe[^>]+?src=(["\'])(?P<url>https?://www\.facebook\.com/(?:video/embed|plugins/video\.php).+?)\1',
return mobj.group('url') webpage):
urls.append(mobj.group('url'))
# Facebook API embed # Facebook API embed
# see https://developers.facebook.com/docs/plugins/embedded-video-player # see https://developers.facebook.com/docs/plugins/embedded-video-player
mobj = re.search(r'''(?x)<div[^>]+ for mobj in re.finditer(r'''(?x)<div[^>]+
class=(?P<q1>[\'"])[^\'"]*\bfb-(?:video|post)\b[^\'"]*(?P=q1)[^>]+ class=(?P<q1>[\'"])[^\'"]*\bfb-(?:video|post)\b[^\'"]*(?P=q1)[^>]+
data-href=(?P<q2>[\'"])(?P<url>(?:https?:)?//(?:www\.)?facebook.com/.+?)(?P=q2)''', webpage) data-href=(?P<q2>[\'"])(?P<url>(?:https?:)?//(?:www\.)?facebook.com/.+?)(?P=q2)''', webpage):
if mobj is not None: urls.append(mobj.group('url'))
return mobj.group('url') return urls
def _login(self): def _login(self):
(useremail, password) = self._get_login_info() (useremail, password) = self._get_login_info()
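
Review note: the switch from `_extract_url` (first match only) to `_extract_urls` (all matches) can be tried in isolation. The sketch below reuses the two regular expressions from the diff in a free function; `extract_urls` and the sample HTML are assumptions made for illustration.

```python
import re


def extract_urls(webpage):
    """Standalone version of the new _extract_urls logic."""
    urls = []
    # iframe embeds: both /video/embed and /plugins/video.php
    for mobj in re.finditer(
            r'<iframe[^>]+?src=(["\'])(?P<url>https?://www\.facebook\.com/(?:video/embed|plugins/video\.php).+?)\1',
            webpage):
        urls.append(mobj.group('url'))
    # Facebook API embeds (div with class fb-video or fb-post)
    for mobj in re.finditer(r'''(?x)<div[^>]+
            class=(?P<q1>[\'"])[^\'"]*\bfb-(?:video|post)\b[^\'"]*(?P=q1)[^>]+
            data-href=(?P<q2>[\'"])(?P<url>(?:https?:)?//(?:www\.)?facebook.com/.+?)(?P=q2)''', webpage):
        urls.append(mobj.group('url'))
    return urls


SAMPLE = (
    '<div class="fb-video" data-href="https://www.facebook.com/facebook/videos/1/"></div>'
    '<iframe src="https://www.facebook.com/plugins/video.php?href=abc" width="500"></iframe>'
)
print(extract_urls(SAMPLE))
```

Because the iframe pass runs first, iframe URLs precede `div` embeds in the result regardless of their order in the page.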


@@ -102,6 +102,8 @@ class FirstTVIE(InfoExtractor):
'format_id': f.get('name'), 'format_id': f.get('name'),
'tbr': tbr, 'tbr': tbr,
'source_preference': quality(f.get('name')), 'source_preference': quality(f.get('name')),
# quality metadata of http formats may be incorrect
'preference': -1,
}) })
# m3u8 URL format is reverse engineered from [1] (search for # m3u8 URL format is reverse engineered from [1] (search for
# master.m3u8). dashEdges (that is currently balancer-vod.1tv.ru) # master.m3u8). dashEdges (that is currently balancer-vod.1tv.ru)


@@ -43,7 +43,7 @@ class FiveTVIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': 'glavnoe', 'id': 'glavnoe',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Итоги недели с 8 по 14 июня 2015 года', 'title': r're:^Итоги недели с \d+ по \d+ \w+ \d{4} года$',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, { }, {
@@ -70,7 +70,8 @@ class FiveTVIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
video_url = self._search_regex( video_url = self._search_regex(
r'<a[^>]+?href="([^"]+)"[^>]+?class="videoplayer"', [r'<div[^>]+?class="flowplayer[^>]+?data-href="([^"]+)"',
r'<a[^>]+?href="([^"]+)"[^>]+?class="videoplayer"'],
webpage, 'video url') webpage, 'video url')
title = self._og_search_title(webpage, default=None) or self._search_regex( title = self._og_search_title(webpage, default=None) or self._search_regex(

Some files were not shown because too many files have changed in this diff.