Compare commits

...

1870 Commits

Author SHA1 Message Date
Sergey M․
e6332059ac release 2016.09.24 2016-09-24 02:16:47 +07:00
Sergey M․
8eec691e8a [ChangeLog] Actualize 2016-09-24 02:12:49 +07:00
Sergey M․
24628cf7db [soundcloud:playlist] Provide video id for playlist entries (Closes #10733) 2016-09-24 02:01:01 +07:00
Sergey M․
71ad00c09f [prosiebensat1] Add support for kabeleinsdoku (Closes #10732) 2016-09-23 21:08:16 +07:00
Remita Amine
45cae3b021 [cbs] extract info from thunder videoPlayerService(closes #10728) 2016-09-22 19:28:22 +01:00
Yen Chi Hsuan
4ddcb5999d [openload] Fix extraction (closes #10408, closes #10727)
Thanks to @daniel100097 for providing a working version
2016-09-23 01:47:51 +08:00
Yen Chi Hsuan
628406db96 [Makefile] Cleanup files from fragment-based downloaders 2016-09-23 01:13:56 +08:00
Yen Chi Hsuan
e3d6bdc8fc [ustream] Support HLS streams (closes #10698) 2016-09-23 01:11:13 +08:00
Sergey M․
0a439c5c4c [udemy] Stringify video id 2016-09-22 21:48:53 +07:00
Remita Amine
1978540a51 [ooyala] extract all hls formats 2016-09-21 21:49:52 +01:00
Sergey M․
12f211d0cb [videomore] Fix embed regex 2016-09-21 22:51:36 +07:00
Remita Amine
3a5a18705f [adobepass] add support MSO that depend on watchTVeverywhere(closes #10709) 2016-09-21 15:57:27 +01:00
Remita Amine
1ae0ae5db0 [cartoonnetwork] add support Adobe Pass auth 2016-09-20 18:52:00 +01:00
Sergey M․
f62a77b99a [soundcloud] Modernize 2016-09-20 21:56:57 +07:00
coolsa
4bfd294e2f [soundcloud] Extract license metadata 2016-09-20 21:56:57 +07:00
Remita Amine
e33a7253b2 [fox] add support for Adobe Pass auth(closes #8584) 2016-09-20 15:52:23 +01:00
Remita Amine
c38f06818d add support for Adobe Pass auth in tbs,tnt and trutv extractors(fixes #10642)(closes #10222)(closes #10519) 2016-09-20 11:55:30 +01:00
Sergey M․
cb57386873 release 2016.09.19 2016-09-19 02:58:32 +07:00
Sergey M․
59fd8f931d [ChangeLog] Actualize 2016-09-19 02:57:14 +07:00
Sergey M․
70b4cf9b1b [crunchyroll] Check if already logged in (Closes #10700) 2016-09-19 02:50:06 +07:00
Sergey M․
cc764a6da8 [twitch:stream] Remove fallback to profile extraction when stream is offline
Main page does not contain profile videos anymore
2016-09-18 19:10:18 +07:00
Yen Chi Hsuan
d8dbf8707d [thisav] Improve title extraction (closes #10682)
I didn't add a test case as the one in #10682 looks like a copyrighted
product.
2016-09-18 18:35:38 +08:00
Sergey M․
a1da888d0c [vyborymos] Improve station info extraction 2016-09-18 17:30:55 +07:00
Sergey M․
3acff9423d release 2016.09.18 2016-09-18 17:16:55 +07:00
Sergey M․
9ca93b99d1 [ChangeLog] Actualize 2016-09-18 17:15:22 +07:00
Sergey M․
14ae11efab [vyborymos] Add extractor (Closes #10692) 2016-09-18 16:56:40 +07:00
Sergey M․
190d2027d0 [xfileshare] Add title regex for streamin.to and fallback to video id (Closes #10646) 2016-09-18 07:22:06 +07:00
Sergey M․
26394d021d [globo:article] Add support for multiple videos (Closes #10653) 2016-09-17 23:34:10 +07:00
Sergey M․
30d0b549be [extractor/common] Add manifest_url for hls and hds formats 2016-09-17 21:33:38 +07:00
Sergey M․
86f4d14f81 Refactor fragments interface and dash segments downloader
- Eliminate segment_urls and initialization_url
+ Introduce manifest_url (manifest may contain unfragmented data in this case url will be used for direct media URL and manifest_url for manifest itself correspondingly)
* Rewrite dashsegments downloader to use fragments data
* Improve generic mpd extraction
2016-09-17 20:35:22 +07:00
Sergey M․
21d21b0c72 [svt] Fix DASH formats extraction 2016-09-17 19:25:31 +07:00
Sergey M․
b4c1d6e800 [extractor/common] Expose fragments interface for dashsegments formats 2016-09-17 18:31:18 +07:00
Sergey M․
a0d5077c8d [extractor/common] Introduce fragments interface 2016-09-17 18:31:09 +07:00
Yen Chi Hsuan
584d6f3457 [thisav] Recognize jwplayers (closes #10447) 2016-09-17 18:46:43 +08:00
Yen Chi Hsuan
e14c82bd6b [jwplatform] Use js_to_json to detect more JWPlayers 2016-09-17 18:45:08 +08:00
Sergey M․
c51a7f0b2f [franceinter] Fix upload date extraction 2016-09-17 15:44:37 +07:00
Remita Amine
d05ef09d9d [mangomolo] fix domain regex 2016-09-17 08:11:01 +01:00
Remita Amine
30d9e20938 [postprocessor/ffmpeg] apply FFmpegFixupM3u8PP only for videos with aac codec(#5591) 2016-09-16 22:06:55 +01:00
Remita Amine
fc86d4eed0 [mangomolo] fix typo 2016-09-16 20:10:47 +01:00
Remita Amine
7d273a387a [mangomolo] add support for Mangomolo embeds 2016-09-16 19:31:39 +01:00
Remita Amine
6ad0219556 [common] add helper method for Wowza Streaming Engine format extraction 2016-09-16 19:30:38 +01:00
Remita Amine
98b7506e96 [toutv] add support for authentication(closes #10669) 2016-09-16 17:40:15 +01:00
Sergey M․
52dc8a9b3f [franceinter] Fix upload date extraction 2016-09-16 22:02:59 +07:00
Sergey M․
9d8985a165 [tv4] Fix hls and hds formats (Closes #10659) 2016-09-16 00:54:34 +07:00
Sergey M․
f5e008d134 release 2016.09.15 2016-09-15 23:46:11 +07:00
Sergey M․
e6bf3621e7 [ChangeLog] Actualize 2016-09-15 23:31:16 +07:00
stepshal
490b755769 Improve some id regexes 2016-09-15 23:12:58 +07:00
Sergey M․
1dec2c8a0e [adobepass] Change mvpd cache section name
In order to better emphasize it's relation to Adobe Pass
2016-09-15 22:47:45 +07:00
Sergey M․
dcce092e0a [extractor/common] Simplify _get_netrc_login_info and carry long lines 2016-09-15 22:35:12 +07:00
Sergey M․
32443dd346 [extractor/common] Update _get_login_info's comment 2016-09-15 22:34:29 +07:00
Sergey M․
2133565cec [extractor/common] Simplify _get_login_info 2016-09-15 22:26:37 +07:00
Sergey M․
1da50aa34e [YoutubeDL] Improve Adobe Pass options' wording 2016-09-15 22:24:55 +07:00
Sergey M․
d2522b86ac [options] Actually print Adobe Pass options sections in --help 2016-09-15 22:18:31 +07:00
Sergey M․
537f753399 [options] Improve Adobe Pass wording 2016-09-15 22:17:17 +07:00
Sergey M․
c849836854 [utils] Improve _hidden_inputs 2016-09-15 21:54:48 +07:00
Sergey M․
eb5b1fc021 [crunchyroll] Fix authentication (Closes #10655) 2016-09-15 21:53:35 +07:00
Sergey M․
95be29e1c6 [twitch] Fix api calls (Closes #10654, closes #10660) 2016-09-15 20:58:02 +07:00
Remita Amine
c035dba19e [bellmedia] add support for more sites 2016-09-15 08:12:12 +01:00
Remita Amine
87148bb711 [adobepass] rename --ap-mso-list option to --ap-list-mso 2016-09-14 20:21:09 +01:00
Remita Amine
797c636bcb [ap] improve adobe pass names and parse error handling 2016-09-14 18:58:47 +01:00
Sergey M․
0002962f3f [franceinter] Improve extraction (Closes #10538) 2016-09-14 23:59:38 +07:00
Sergey M․
3e4185c396 [utils] Use native french month names 2016-09-14 23:59:38 +07:00
Sergey M․
f6717dec8a [utils] Improve month_by_name and add tests 2016-09-14 23:59:38 +07:00
renalid
a942d6cb48 [utils,franceinter] Add french months' names and fix extraction
Update of the "FranceInter" radio extractor : webpages HTML structure
had changed, the extractor didn't work. So I updated this extractor to
get the mp3 URL and all details.
2016-09-14 23:59:38 +07:00
Yen Chi Hsuan
961516bfd1 [kwuo:song] Improve error detection (closes #10650) 2016-09-15 00:56:15 +08:00
Yen Chi Hsuan
6db354a9f4 [kuwo] Update _TESTS 2016-09-15 00:53:04 +08:00
Remita Amine
353f340e11 [go] fix typo 2016-09-14 17:22:42 +01:00
Remita Amine
014b7e6b25 [go] add support for free full episodes(#10439) 2016-09-14 17:08:25 +01:00
stepshal
925194022c Improve some _VALID_URLs 2016-09-14 22:47:21 +07:00
Sergey M․
b690ea15eb [viafree] Fix test 2016-09-14 22:45:23 +07:00
Remita Amine
5712c0f426 [adobepass] remove unnecessary option 2016-09-14 16:37:21 +01:00
Yen Chi Hsuan
86d68f906e [bilibili] Fix extraction for videos without backup_url (#10647) 2016-09-14 22:11:49 +08:00
Yen Chi Hsuan
4875ff6847 [bilibili] Remove copyrighted test cases
I can't find any English or Chinese material that claims BiliBili has
bought legal redistribution permissions for copyrighted products from
copyrighted holders.

References for removed test cases:
"刀语": https://en.wikipedia.org/wiki/Katanagatari, by White Fox
"哆啦A梦": https://en.wikipedia.org/wiki/Doraemon, by Shin-Ei Animation
"岳父岳母真难当": https://en.wikipedia.org/wiki/Serial_(Bad)_Weddings, by Les films du 24
"混沌武士": https://en.wikipedia.org/wiki/Samurai_Champloo, by Manglobe

I shouldn't have added them to _TESTS
2016-09-14 22:09:43 +08:00
Remita Amine
1b6712ab23 [adobepass] add specific options for adobe pass authentication
- add --ap-username and --ap-password option to specify
TV provider username and password in the cmd line
- add --ap-retries option to limit the number of retries
- add --list-ap-msi-ids to list the supported TV Providers
2016-09-13 22:16:01 +01:00
Sergey M․
8414c2da31 [adobepass] PEP 8 2016-09-13 23:22:16 +07:00
Sergey M․
45396dd2ed [nhk] Fix extraction (Closes #10633) 2016-09-13 23:20:25 +07:00
Remita Amine
7a7309219c [adobepass] add an option to specify mso_id and support for ROGERS TV Provider(closes #10606) 2016-09-12 23:39:35 +01:00
Sergey M․
fcba157e80 [ISSUE_TEMPLATE_tmpl.md] Fix typo 2016-09-12 23:29:43 +07:00
Sergey M․
a6ccc3e518 [safari] Improve ids regexes (#10617) 2016-09-12 23:05:52 +07:00
Sergey M․
1d16035bb4 [kaltura] Improve audio detection 2016-09-12 22:43:45 +07:00
Sergey M․
e8bcd982cc [kaltura] Skip chun format 2016-09-12 22:33:00 +07:00
Sergey M․
a5ff05df1a [extractor/generic] Add vimeo embed that requires Referer passed 2016-09-12 21:49:31 +07:00
Sergey M․
d002e91986 [vimeo:ondemand] Pass Referer along with embed URL (#10624) 2016-09-12 21:48:45 +07:00
Sergey M․
546edb2efa [ISSUE_TEMPLATE_tmpl.md] Fix typo 2016-09-12 21:01:31 +07:00
Yen Chi Hsuan
be45730226 [nbc] Add new extractor for NBC Olympics (#10295, #10361) 2016-09-12 02:55:15 +08:00
Sergey M․
ee7e672eb0 [tube8] Remove proxy settings from test 2016-09-11 23:46:50 +07:00
Sergey M․
0307d6fba6 release 2016.09.11.1 2016-09-11 23:33:20 +07:00
Sergey M․
fc150cba1d [devscripts/release.sh] Add missing fi 2016-09-11 23:32:01 +07:00
Sergey M․
d667ab7fad [ChangeLog] Actualize 2016-09-11 23:30:18 +07:00
Sergey M․
eb87d4545a [devscripts/release.sh] Add ChangeLog reminder prompt 2016-09-11 23:29:25 +07:00
Sergey M․
1c81476cbb release 2016.09.11 2016-09-11 23:20:09 +07:00
Sergey M․
bc9186c882 [tvplay] Remove unused import 2016-09-11 22:51:12 +07:00
Sergey M․
6599c72527 [tube8] Extract categories and tags (Closes #10579) 2016-09-11 22:50:36 +07:00
Yen Chi Hsuan
6bb05b32a9 [pornhub] Extract categories and tags (closes #10499) 2016-09-11 19:22:51 +08:00
Yen Chi Hsuan
fea74acad8 [foxnews] Revert to old extractor names 2016-09-11 18:54:24 +08:00
Yen Chi Hsuan
f01115c933 [openload] Temporary fix (#10408) 2016-09-11 18:36:59 +08:00
Yen Chi Hsuan
2cdbc06a1f [foxnews] Support Fox News Articles (closes #10598) 2016-09-11 18:32:45 +08:00
Sergey M․
2cb93afcd8 [viafree] Improve video id extraction (Closes #10615) 2016-09-11 14:59:14 +07:00
Yen Chi Hsuan
bfcda07a27 [abc:iview] Skip the test. They are removed soon 2016-09-11 04:06:00 +08:00
Yen Chi Hsuan
001a5fd3d7 [iwara] Fix extraction after relaunch
Closes #10462, closes #3215
2016-09-11 03:02:00 +08:00
Remita Amine
1e35999c1e [tfo] Add new extractor 2016-09-10 19:43:31 +01:00
Sergey M․
2512b17493 [lrt] Fix audio extraction (Closes #10566) 2016-09-11 01:27:20 +07:00
Sergey M․
56c0ead4d3 [9now] Improve video data extraction (Closes #10561) 2016-09-11 00:42:13 +07:00
Scott Leggett
7324243750 [9now] Fix extraction 2016-09-11 00:16:29 +07:00
Sergey M․
84a18e9b90 [polskieradio:category] Improve extraction 2016-09-10 22:01:49 +07:00
Sergey M․
b29f842e0e [canalplus] Add support for c8.fr (Closes #10577) 2016-09-10 20:46:45 +07:00
Sergey M․
f009fcac0d Merge branch 'master' of github.com:rg3/youtube-dl 2016-09-10 19:21:03 +07:00
Yen Chi Hsuan
6c3affcb18 [newgrounds] Fix uploader extraction
Closes #10584

Also change test URLs to HTTPS, as proposed by
@stepshal in #10593.

Closes #10593
2016-09-10 20:09:09 +08:00
Sergey M․
1e19ff2984 Merge branch 'polskie-radio-programme' of https://github.com/JakubAdamWieczorek/youtube-dl 2016-09-10 00:42:36 +07:00
Sergey M․
c6129feb7f [ketnet] Add extractor (Closes #10343) 2016-09-09 23:20:45 +07:00
Sergey M․
bb5ebd4453 [canvas] Add support for een.be (Closes #10605) 2016-09-09 22:16:21 +07:00
Remita Amine
cb9cbd84ed [extractors] add import for TeleQuebecIE 2016-09-08 22:55:27 +01:00
Remita Amine
4d5726b0d7 [telequebec] Add new extractor(closes #1999) 2016-09-08 22:53:44 +01:00
Remita Amine
4614ad7b59 [parliamentliveuk] fix extraction(closes #9137) 2016-09-08 20:46:12 +01:00
Sergey M․
b717837190 release 2016.09.08 2016-09-08 23:46:14 +07:00
Sergey M․
2abad67e52 [ChangeLog] Actualize 2016-09-08 23:32:16 +07:00
Sergey M․
ad0e2b3359 [abcotvs] Add support for ABC Owned Television Stations 2016-09-08 23:15:58 +07:00
Sergey M․
37720844f6 [jwplatform] Extract height from label 2016-09-08 22:53:20 +07:00
Sergey M․
6cfcb8ac36 [tvnoe] Do not capture unused groups in _VALID_URL 2016-09-08 22:53:20 +07:00
Remita Amine
7a979da8cb [yahoo] Look for Brightcove Legacy Studio embeds(closes #9345) 2016-09-08 16:44:22 +01:00
Sergey M․
2fdc7b0e04 [viafree] PEP 8 2016-09-08 22:40:02 +07:00
Sergey M․
010d034fca [videomore] Fix extraction (Closes #10592) 2016-09-08 22:38:49 +07:00
Yen Chi Hsuan
02e552886f Merge pull request #10596 from stepshal/r_prefix
Add missing r prefix for _VALID_URLs
2016-09-08 18:31:09 +08:00
stepshal
25042f7372 Add missing r prefix for _VALID_URLs 2016-09-08 17:04:57 +07:00
Yen Chi Hsuan
3f612f0767 Fix _VALID_URLs further (#10594) 2016-09-08 17:39:29 +08:00
Yen Chi Hsuan
17bf6e71cc Merge pull request #10594 from stepshal/https_support
Add support for https for rest of the exctractors.
2016-09-08 17:28:46 +08:00
Yen Chi Hsuan
881f35479d Credit @xyb for miaopai extractor (#10556) 2016-09-08 17:22:43 +08:00
stepshal
89f257d6e5 Add support for https for rest of the exctractors. 2016-09-08 13:52:22 +07:00
Yen Chi Hsuan
e78a5428b6 [foxgay] Fix extraction (closes #10480) 2016-09-08 02:01:09 +08:00
Remita Amine
6656a82481 [rmcdecouverte] Add new extractor(closes #9709) 2016-09-07 17:33:22 +01:00
Remita Amine
d7e794928d [tlc] fix query string parsing 2016-09-07 17:33:22 +01:00
Yen Chi Hsuan
9c27188988 Merge branch 'xyb-miaopai' 2016-09-08 00:31:06 +08:00
Yen Chi Hsuan
b84d311d53 [ChangeLog] Update for #10556 2016-09-08 00:29:55 +08:00
Yen Chi Hsuan
f87feb4b68 [miaopai] Coding style (#10556) 2016-09-08 00:28:33 +08:00
Yen Chi Hsuan
2841bdcebb Merge branch 'miaopai' of https://github.com/xyb/youtube-dl into xyb-miaopai 2016-09-08 00:08:02 +08:00
Yen Chi Hsuan
84b91dd4e3 [gamestar] Fix metadata extraction (closes #10479) 2016-09-07 23:07:50 +08:00
Yen Chi Hsuan
92c9c2a88b [moevideo] Skip another removed test (#10474) 2016-09-07 22:21:59 +08:00
Remita Amine
9d54b02bae [puls4] fix extraction(closes #10583) 2016-09-07 14:43:20 +01:00
Remita Amine
846d8b76a0 [cctv] Add new extractor(closes #8153) 2016-09-07 10:11:09 +01:00
Philipp Hagemeister
aa3f9fe695 Explain why and why not to specify --hls-prefer-native
This has been asked at http://stackoverflow.com/questions/39357037/what-does-youtube-dl-option-hls-prefer-native-do-any-downside-adding-to-youtu
2016-09-07 10:38:59 +02:00
Remita Amine
8258f4457c [lci] Add new extractor(closes #10573) 2016-09-06 20:47:42 +01:00
Remita Amine
948cd5b72d [wat] extract dash formats 2016-09-06 20:44:45 +01:00
Jakub Adam Wieczorek
8d3737cda7 [polskieradio] Add support for downloading whole programmes.
This extends the Polskie Radio (the Polish national radio) extractor to
enable the user to download all the broadcasts of a single programme.
2016-09-06 21:34:44 +02:00
Sergey M․
155bc674c4 [viafree] Improve video id detection (Closes #10569) 2016-09-07 00:41:31 +07:00
Remita Amine
c33c962adf [trutv] Add new extractor(#10519) 2016-09-06 15:56:17 +01:00
Remita Amine
bdcc046d12 [turner] use android secure hls host and catch token extraction errors 2016-09-06 15:53:03 +01:00
Xie Yanbo
a493f10208 using _parse_html5_media_entries to parse video tag 2016-09-05 23:08:33 +08:00
Sergey M․
f3eeaacb4e [nick] Add test for #10559 2016-09-05 21:42:41 +07:00
Sergey M․
b4d6a85d60 [nick] Add support for nickelodeon.nl (Closes #10559) 2016-09-05 21:33:14 +07:00
Remita Amine
0b36a96212 [abcotvs] extend _VALID_URL and add support for clips.abcotvs.com(closes #9551) 2016-09-05 13:41:21 +01:00
Yen Chi Hsuan
bc22a79694 Credit @mcepl for #10524 2016-09-05 16:44:06 +08:00
Yen Chi Hsuan
340e31ca74 Merge branch 'PeterDing-bilibili' 2016-09-05 13:55:07 +08:00
Yen Chi Hsuan
973dee491f [ChangeLog] Update for #10190 2016-09-05 13:54:35 +08:00
Yen Chi Hsuan
1f85029d82 [bilibili] Simplify 2016-09-05 13:53:58 +08:00
Xie Yanbo
95be19d436 [miaopai] Add new extractor 2016-09-05 13:53:09 +08:00
Yen Chi Hsuan
95843da529 Merge branch 'bilibili' of https://github.com/PeterDing/youtube-dl into PeterDing-bilibili 2016-09-05 13:47:24 +08:00
Yen Chi Hsuan
abf2c79f95 Merge branch 'mcepl-tvnoe' 2016-09-05 13:39:51 +08:00
Yen Chi Hsuan
b49ad71ce1 [ChangeLog] Update for #10524 2016-09-05 13:38:55 +08:00
Yen Chi Hsuan
9127e1533d [tvnoe] PEP8 and coding style 2016-09-05 13:37:36 +08:00
Matěj Cepl
78e762d23c Add new extractor for TV Noe (Czech Christian TV).
Fixes #10520
2016-09-04 19:06:40 +02:00
Sergey M․
4809490108 release 2016.09.04.1 2016-09-04 20:58:28 +07:00
Sergey M․
8112bfeaba [ChangeLog] Actualize 2016-09-04 20:57:18 +07:00
Sergey M․
d9606d9b6c release 2016.09.04 2016-09-04 20:51:48 +07:00
Remita Amine
433af6ad30 [theplatform] fix player regex(closes #10546) 2016-09-04 14:24:41 +01:00
Sergey M․
feaa5ad787 [youtube:playlist] Extend _VALID_URL 2016-09-04 20:12:34 +07:00
Remita Amine
100bd86a68 [rottentomatoes] delegate extraction to InternetVideoArchiveIE 2016-09-04 11:45:29 +01:00
Remita Amine
0def758782 [internetvideoarchive] extract all formats 2016-09-04 11:45:29 +01:00
Yen Chi Hsuan
919cf1a62f [downloader/dash] Abort if the first segment fails
Closes #10497, Closes #10542
2016-09-04 17:32:29 +08:00
Yen Chi Hsuan
b29cd56591 [pornovoisines] Fix extraction (closes #10469) 2016-09-04 17:01:39 +08:00
Yen Chi Hsuan
622638512b [rottentomatoes] Fix extraction
Closes #10467
2016-09-04 16:25:59 +08:00
Sergey M․
37c7490ac6 [espn] Extend _VALID_URL (Closes #10549) 2016-09-04 04:59:46 +07:00
Sergey M․
091624f9da [vimple] Extend _VALID_URL (Closes #10547) 2016-09-04 03:39:13 +07:00
Sergey M․
7e5dc339de [youtube:watchlater] Fix extraction (Closes #10544) 2016-09-04 00:29:01 +07:00
Sergey M․
4a69fa04e0 [downloader/dash] Abort download immediately after giving up on some fragment 2016-09-03 17:51:48 +07:00
Sergey M․
2e99cd30c3 [downloader/dash:hls] Report exact fragment error on retry 2016-09-03 17:51:48 +07:00
Sergey M․
25afc2a783 [downloader/dash:hls] Respect --fragment-retries and --skip-unavailable-fragments (Closes #10165, closes #10448) 2016-09-03 17:51:48 +07:00
Sergey M․
9603b66012 Introduce --skip-unavailable-fragments 2016-09-03 17:51:48 +07:00
Yen Chi Hsuan
45aab4d30b [youjizz] Fix extraction. The site has moved to HTML5
Closes #10437
2016-09-03 18:37:36 +08:00
Yen Chi Hsuan
ed2bfe93aa [fc2:embed] Add ie_key 2016-09-03 18:22:00 +08:00
Yen Chi Hsuan
cdc783510b [foxnews:insider] Add new extractor
Closes #10445
2016-09-03 18:16:19 +08:00
Yen Chi Hsuan
cf0efe9636 [fc2:embed] New extractor for Flash player URLs
Closes #10512
2016-09-03 17:25:03 +08:00
Christian Pointner
dedb177029 Fix parsing of HTML5 media elements
This fixes an error in _parse_html5_media_entries in case
an audio or video tag directly uses a src attribute insted
of <source> elements in it's body.
2016-09-03 16:09:35 +07:00
Sergey M․
86c3bbbced release 2016.09.03 2016-09-03 01:46:41 +07:00
Sergey M․
4b3a607658 [ChangeLog] Actualize 2016-09-03 01:45:17 +07:00
Sergey M․
3a7d35b982 Credit @C4K3 for #10536 2016-09-03 01:42:33 +07:00
Sergey M․
6496ccb413 [youtube] Add support for rental videos' previews (Closes #10532) 2016-09-03 01:17:15 +07:00
Sergey M․
3fcce30289 [drtv] Update tests 2016-09-02 23:53:17 +07:00
Sergey M․
c2b2c7e138 [utils] Add quicktime to mimetype2ext 2016-09-02 23:50:42 +07:00
Sergey M․
dacb3a864a [youtube:playlist] Fallback to video extraction for video/playlist URLs when playlist is broken (Closes #10537) 2016-09-02 23:43:20 +07:00
Sergey M․
6066d03db0 [drtv] Modernize and make more robust 2016-09-02 23:02:15 +07:00
Sergey M․
6562d34a8c [utils] Improve mimetype2ext 2016-09-02 22:57:48 +07:00
Sebastian Blunt
5e9e3d0f6b [drtv] Add support for dr.dk/nyheder
It's the same video player, the only difference is that the video player
is loaded differently, and certain metadata (title and description) is
not available under dr.dk/mu, so make it by default get that from some
of the html meta tags.

Skip the dr.dk/tv test

dr.dk/tv videos are only available for between 7 and 90 days due to
Danish law, and in certain cases may be readded. Skip this test as it is
no longer available.
2016-09-02 22:20:36 +07:00
Sergey M․
349fc5c705 [facebook:plugins:video] Add extractor (Closes #10530) 2016-09-02 21:13:50 +07:00
Remita Amine
2c3e0af93e [go] Add new extractor 2016-09-02 09:53:04 +01:00
Remita Amine
6150502e47 [adobepass] check for authz_token expiration(#10527) 2016-09-01 22:29:20 +01:00
Remita Amine
b207d5ebd4 [curiositystream] don't cache auth token 2016-09-01 19:46:58 +01:00
Remita Amine
4191779dcd [nytimes] improve extraction 2016-09-01 19:08:29 +01:00
Sergey M․
f97ec8bcb9 [glide] Remove unused import 2016-09-01 23:46:58 +07:00
Sergey M․
8276d3b87a [thestar] Fix extraction (Closes #10465) 2016-09-01 23:46:15 +07:00
Sergey M․
af95ee94b4 [glide] Fix extraction (Closes #10478) 2016-09-01 23:38:49 +07:00
Sergey M․
8fb6af6bba [exfm] Remove extractor (Closes #10482) 2016-09-01 23:32:28 +07:00
Sergey M․
f6af0f888b [youporn] Fix categories and tags extraction (Closes #10521) 2016-09-01 23:15:01 +07:00
Sergey M․
e816c9d158 [extractor/common] Simplify _extract_m3u8_formats 2016-09-01 22:18:16 +07:00
Sergey M․
9250181f37 [extractor/common] Restore NAME usage from EXT-X-MEDIA tag for formats codes in _extract_m3u8_formats (Closes #10522) 2016-09-01 21:37:25 +07:00
Remita Amine
f096ec2625 [curiositystream] Add new extractor 2016-09-01 13:37:09 +01:00
Yen Chi Hsuan
4c8ab6fd71 [thvideo] Remove extractor. Website down.
Closes #10464

According to a screenshot in http://tieba.baidu.com/p/4691302183,
thvideo.tv is shut down "temporarily". I see no clues that it will be up
again, so I remove it here.
2016-09-01 17:04:41 +08:00
Yen Chi Hsuan
05d4612947 [movingimage] Adapt to the new domain name and fix extraction
Closes #10466
2016-09-01 16:58:16 +08:00
Yen Chi Hsuan
746a695b36 [myvidster] Update _TESTS (closes #10473) 2016-09-01 16:42:35 +08:00
Yen Chi Hsuan
165c54e97d [southpark.cc.com:español] Skip geo-restricted _TESTS
Breaks https://travis-ci.org/rg3/youtube-dl/jobs/156728175
2016-09-01 16:28:03 +08:00
Remita Amine
2896dd73bc [cbs] extract once formats(closes #10515) 2016-09-01 08:00:13 +01:00
Remita Amine
f8fd510eb4 [limelight] skip ism manifests and reduce requests 2016-08-31 18:32:15 +01:00
Sergey M․
7a3e849f6e [porncom] Extract categories and tags (Closes #10510) 2016-08-31 22:23:55 +07:00
Sergey M․
196c6ba067 [facebook] Extract timestamp (Closes #10508) 2016-08-31 22:12:37 +07:00
Remita Amine
165620e320 [yahoo] extract more and better formats 2016-08-30 21:49:28 +01:00
Sergey M․
4fd350611c release 2016.08.31 2016-08-31 02:39:39 +07:00
Sergey M․
263fef43de [ChangeLog] Actualize 2016-08-31 02:37:40 +07:00
Sergey M․
a249ab83cb [pyvideo] Remove debugging code 2016-08-31 01:56:58 +07:00
Sergey M․
f7043ef39c [soundcloud] Fix _VALID_URL clashes with sets (Closes #10505) 2016-08-31 01:56:15 +07:00
Sergey M․
64fc49aba0 [bandcamp:album] Fix title extraction (Closes #10455) 2016-08-31 00:29:49 +07:00
Sergey M․
245023a861 [pyvideo] Fix extraction (Closes #10468) 2016-08-30 23:51:18 +07:00
Remita Amine
3c77a54d5d [turner] keep video id intact 2016-08-30 10:46:48 +01:00
Remita Amine
da30a20a4d [turner,cnn] move a check for wrong timestamp to CNNIE 2016-08-29 19:26:53 +01:00
Remita Amine
1fe48afea5 [cnn] update _TEST for CNNBlogsIE and CNNArticleIE(closes #10489) 2016-08-29 18:24:16 +01:00
Remita Amine
42e05be867 [ctv] add support for (tsn,bnn,thecomedynetwork).ca websites(#10016) 2016-08-29 18:24:16 +01:00
Remita Amine
fe45b0e060 [9c9media] fix multiple stacks extraction and extract more metadata(#10016) 2016-08-29 18:24:16 +01:00
Sergey M․
a06e1498aa [kusi] Update test 2016-08-29 22:54:33 +07:00
Sergey M․
5a80e7b43a [turner] Skip invalid subtitles' URLs 2016-08-29 22:44:15 +07:00
Sergey M․
3fb2a23029 [adultswim] Extract video info from onlineOriginals (Closes #10492) 2016-08-29 22:40:35 +07:00
PeterDing
7be15d4097 [bilibili] Support episodes
[extractor/bilibili] add md5 for testing

[extractor/bilibili] remove unnecessary headers

[extractor/bilibili] correct _TESTS; find thumbnail for episode

[extractor/bilibili] [Fix] restore removed tests
2016-08-29 23:31:08 +08:00
Sergey M․
cd10b3ea63 [turner] Extract all formats 2016-08-29 22:13:49 +07:00
Sergey M․
547993dcd0 [turner] Fix subtitles extraction 2016-08-29 21:52:41 +07:00
Yen Chi Hsuan
6c9b71bc08 [downloader/external] Recommend --hls-prefer-native for SOCKS users
Related: #10490
2016-08-29 19:05:38 +08:00
Remita Amine
93b8404599 [generic,vodplatform] improve embed regex 2016-08-29 07:57:20 +01:00
Sergey M․
9ba1e1dcc0 [played] Remove extractor (Closes #10470) 2016-08-29 08:26:07 +07:00
Remita Amine
b8079a40bc [turner] fix secure m3u8 formats downloading 2016-08-28 17:51:53 +01:00
Remita Amine
5bc8a73af6 [cartoonnetwork] make extraction work for more videos in the website
some videos require `networkName=CN2` to be present in the feed url
2016-08-28 17:08:26 +01:00
Remita Amine
b3eaeded12 [tbs] Add new extractor(#10222) 2016-08-28 16:51:09 +01:00
Remita Amine
ec65b391cb [cartoonnetwork] Add new extractor(#10110) 2016-08-28 16:51:09 +01:00
Remita Amine
2982514072 [turner,nba,cnn,adultswim] add base extractor to parse cvp feeds 2016-08-28 16:51:09 +01:00
Yen Chi Hsuan
98908bcf7c [openload] Update algorithm again (#10408) 2016-08-28 22:49:46 +08:00
Yen Chi Hsuan
04b32c8f96 [bilibili] Fix extraction (closes #10375)
Thanks @gdkchan for the algorithm
2016-08-28 22:06:31 +08:00
Yen Chi Hsuan
40eec6b15c [openload] Fix extraction (closes #10408)
Thanks to @yokrysty again!
2016-08-28 20:27:52 +08:00
Yen Chi Hsuan
39efc6e3e0 [generic] Update some _TESTS 2016-08-28 15:46:11 +08:00
Sergey M․
1198fe14a1 release 2016.08.28 2016-08-28 07:24:08 +07:00
Sergey M․
71e90766b5 [README.md] Fix typo in download archive FAQ entry 2016-08-28 07:09:03 +07:00
Sergey M․
d7aae610f6 [ChangeLog] Actualize 2016-08-28 07:00:15 +07:00
Sergey M․
92c27a0dbf [periscope:user] Fix extraction (Closes #10453) 2016-08-28 02:35:49 +07:00
Yen Chi Hsuan
d181cff685 Merge branch 'steven7851-patch-2' 2016-08-27 01:17:12 +08:00
Yen Chi Hsuan
3b4b82d4ce [douyutv] Simplify 2016-08-27 01:16:39 +08:00
Yen Chi Hsuan
545ef4f531 Merge branch 'patch-2' of https://github.com/steven7851/youtube-dl into steven7851-patch-2 2016-08-26 22:29:46 +08:00
Yen Chi Hsuan
906b87cf5f [crackle] Revert to template-based thumbnail extraction
To reduce to number of HTTP requests
2016-08-26 19:58:47 +08:00
steven7851
b281aad2dc [douyutv] Use new api
use lapi for flv info, and html5 api for room info
#10153 #10318
2016-08-26 07:32:54 +08:00
Sergey M․
6b18a24e6e [tnaflix] Fix extraction (Closes #10434) 2016-08-26 05:57:52 +07:00
Sergey M․
c9de980106 Credit @Xender for nhk:vod (#10424) 2016-08-26 04:49:52 +07:00
Sergey M․
f9b373afda [nhk:vod] Improve extraction (Closes #10424) 2016-08-26 04:48:40 +07:00
Aleksander Nitecki
298a120ab7 [nhk] Add extractor for VoD. 2016-08-26 04:15:51 +07:00
Sergey M․
e3faecde30 [trutube] Remove extractor (Closes #10438) 2016-08-26 03:43:13 +07:00
Remita Amine
a0f071a50d [usanetwork] Add new extractor 2016-08-25 19:41:31 +01:00
Yen Chi Hsuan
20bad91d76 [downloader/external] Clarify that ffmpeg doesn't support SOCKS
Ref: #10304
2016-08-25 22:38:06 +08:00
Yen Chi Hsuan
b54a2da433 [crackle] Fix extraction and update _TESTS (closes #10333) 2016-08-25 22:22:31 +08:00
Yen Chi Hsuan
dc2c37f316 [spankbang] Fix description and uploader (closes #10339) 2016-08-25 20:47:35 +08:00
Philipp Hagemeister
c1f62dd338 [README] Clean up grammar in --download-archive paragraph 2016-08-25 14:45:01 +02:00
Sergey M․
5a3efcd27c [README.md] Add FAQ entry for download archive 2016-08-25 18:57:31 +07:00
Sergey M․
4c8f9c2577 [README.md] Add comments in sample configuration for clarity 2016-08-25 18:27:15 +07:00
Sergey M․
f26a298247 [README.md] Use en-US URL in cookies FAQ entry 2016-08-25 18:19:41 +07:00
Sergey M․
ea01cdbf61 [README.md] Clarify how to export cookies from browser for cookies FAQ entry 2016-08-25 18:17:45 +07:00
Sergey M․
6a76b53355 [README.md] Quote URL in streaming to player FAQ entry 2016-08-25 18:05:01 +07:00
Remita Amine
d37708fc86 [YoutubeDL] check only for None Value in thumbnails sorting 2016-08-25 11:53:47 +01:00
Remita Amine
5c13c28566 raise unexpected error when no stream found 2016-08-25 09:55:23 +01:00
Remita Amine
f70e9229e6 [discoverygo] detect when video needs authentication(closes #10425) 2016-08-25 09:11:23 +01:00
Remita Amine
30afe4aeb2 [cbc] Add support for watch.cbc.ca 2016-08-25 08:49:44 +01:00
Remita Amine
75fa990dc6 [YoutubeDL] add fallback value for thumbnails values in thumbnails sorting 2016-08-25 08:49:44 +01:00
Remita Amine
f39ffc5877 [common] extract formats from #EXT-X-MEDIA tags 2016-08-25 08:49:44 +01:00
Remita Amine
07ea9c9b05 [downloader/hls] fill IV with zeros for IVs shorter than 16-octet 2016-08-25 08:49:44 +01:00
Remita Amine
073ac1225f [utils] add ac-3 to the list of audio codecs in parse_codecs 2016-08-25 08:49:44 +01:00
Sergey M․
0c6422cdd6 [README.md] Add FAQ entry for streaming to player 2016-08-25 07:34:55 +07:00
Yen Chi Hsuan
08773689f3 [kickstarter] Silent the warning for og:description
Closes #10415
2016-08-25 01:29:32 +08:00
Yen Chi Hsuan
0c75abbb7b [mtvservices:embedded] Use another endpoint to get feed URL
Closes #10363

In the original mtvservices:embedded test case, config.xml is still used
to get the feed URL. Some other examples, including test_Generic_40
(http://www.vulture.com/2016/06/new-key-peele-sketches-released.html),
and the video mentioned in #10363, use another endpoint to get the feed
URL. The 'index.html' approach works for the original test case, too. So
I didn't keep the old approach.
2016-08-24 23:58:22 +08:00
Yen Chi Hsuan
97653f81b2 [bilibili] Mark as broken
Bilibili now uses emscripten, which is very difficult for reverse
engineering. I don't expect it to be fixed in near future, so I mark
it as broken.

Ref: #10375
2016-08-24 21:28:00 +08:00
Sergey M․
d38b27dd9b release 2016.08.24.1 2016-08-24 10:11:04 +07:00
Sergey M․
6d94cbd2f4 [ChangeLog] Actualize 2016-08-24 10:07:06 +07:00
Sergey M․
30317f4887 [pluralsight] Modernize and make more robust 2016-08-24 08:52:12 +07:00
Sergey M․
8c3e35dd44 [pluralsight] Add support for subtitles (Closes #9681) 2016-08-24 08:41:52 +07:00
Sergey M․
c86f51ee38 release 2016.08.24 2016-08-24 01:38:46 +07:00
Sergey M․
6e52bbb413 [ChangeLog] Actualize 2016-08-24 01:36:27 +07:00
Sergey M․
05bddcc512 [youtube] Fix authentication (2) (Closes #10392) 2016-08-24 01:29:50 +07:00
Sergey M․
1212e9972f [youtube] Fix authentication (#10392) 2016-08-24 00:25:21 +07:00
Remita Amine
ccb6570e9e [syfy,bravotv] restrict drupal settings regex 2016-08-23 17:31:35 +01:00
Yen Chi Hsuan
18b6216150 [openload] Fix extraction (closes #10408)
Thanks @yokrysty for the algorithm
2016-08-23 21:55:58 +08:00
Remita Amine
fb009b7f53 [bravotv] correct clip info extraction and add support for adobe pass auth(closes #10407) 2016-08-23 10:29:52 +01:00
Sergey M․
3083e4dc07 [eagleplatform] Improve detection of embedded videos (Closes #10409) 2016-08-23 07:22:14 +07:00
Remita Amine
7367bdef23 [awaan] fix extraction, modernize, rename the extractors and add test for live stream 2016-08-22 23:10:06 +01:00
Remita Amine
ad31642584 [nrk,abc:iview] use _extract_akamai_formats 2016-08-22 07:54:08 +01:00
Remita Amine
c7c43a93ba [common] add helper method to extract akamai m3u8 and f4m formats 2016-08-22 07:49:34 +01:00
Yen Chi Hsuan
96229e5f95 [mtvservices:embedded] Update config URL
All starts from #10363. The test case in mtvservices:embedded uses
config.xml, while the video from #10363 and the test case in generic.py
is broken. Both uses index.html for fetching the feed URL.
2016-08-22 13:56:09 +08:00
Remita Amine
55d119e2a1 [abc:iview] Add new extractor(closes #6148) 2016-08-22 00:07:17 +01:00
Sergey M․
6d2679ee26 release 2016.08.22 2016-08-22 04:17:34 +07:00
Sergey M․
afbab5688e [ChangeLog] Actualize 2016-08-22 04:15:46 +07:00
Sergey M․
3d897cc791 [ivi] Fix episode number extraction 2016-08-22 03:34:27 +07:00
Sergey M․
cf143c4d97 [ivi] Add support for 720p and 1080p 2016-08-22 03:31:33 +07:00
Yen Chi Hsuan
ad120ae1c5 [extractor/common] Change the default m3u8 protocol in HTML5
Helper functions should have consistent default values
2016-08-22 02:26:07 +08:00
Remita Amine
d0fa172e5f [firsttv] keep a test videos with multiple formats 2016-08-21 19:13:43 +01:00
Yen Chi Hsuan
f97f9f71e5 Merge branch 'TRox1972-charlierose' 2016-08-22 02:11:43 +08:00
Yen Chi Hsuan
526656726b [charlierose] Simplify and improve 2016-08-22 02:06:47 +08:00
Remita Amine
9b8c554ea7 [firsttv] fix extraction(closes #9249) 2016-08-21 17:56:25 +01:00
Yen Chi Hsuan
d13bfc07b7 Merge branch 'charlierose' of https://github.com/TRox1972/youtube-dl into TRox1972-charlierose 2016-08-22 00:48:35 +08:00
Sergey M․
efe470e261 [twitch] Renew authentication 2016-08-21 22:45:50 +07:00
Sergey M․
e3f6b56909 [twitch] Refactor API calls 2016-08-21 22:09:29 +07:00
Sergey M․
b1e676fde8 [twitch] Modernize 2016-08-21 21:28:02 +07:00
Sergey M․
92d4cfa358 [kaltura] Fallback ext calculation on caption's format 2016-08-21 21:01:01 +07:00
Remita Amine
3d47ee0a9e [zingmp3] fix extraction and add support for video clips(closes #10041) 2016-08-21 14:09:48 +01:00
Yen Chi Hsuan
d164a0d41b [README.md] Add a format selection example using comma
Ref: #10399
2016-08-21 20:00:48 +08:00
Déstin Reed
db29af6d36 [charlierose] Add new extractor 2016-08-21 11:29:48 +02:00
Sergey M․
2c6acdfd2d [kaltura] Add test for #10279 2016-08-21 08:37:01 +07:00
Sergey M․
fddaa76a59 [kaltura] Assume ttml to be default subtitles' extension 2016-08-21 08:28:36 +07:00
Sergey M․
a809446750 [kaltura] Add subtitles support when entry_id is unknown beforehand (Closes #10279) 2016-08-21 08:28:36 +07:00
Sergey M․
d8f30a7e66 [kaltura] Remove unused code 2016-08-21 08:28:36 +07:00
Sergey M․
5b1d85754e [YoutubeDL] Autocalculate ext when ext is None 2016-08-21 08:28:36 +07:00
Remita Amine
e25586e471 [cultureunplugged] fix extraction(closes #10330) 2016-08-20 20:02:49 +01:00
Remita Amine
292a2301bf [cnn] add support for money.cnn.com videos(closes #2797) 2016-08-20 19:00:25 +01:00
Remita Amine
dabe15701b [cbs, cbsnews] fix extraction(fixes #10393) 2016-08-20 13:25:32 +01:00
Sergey M․
4245f55880 [dotsub] Replace test (Closes #10386) 2016-08-20 06:18:20 +07:00
Déstin Reed
5b9d187cc6 [imdb] Improve title extraction and make thumbnail non-fatal 2016-08-20 04:50:39 +07:00
Yen Chi Hsuan
39e1c4f08c [litv] Support 'promo' URLs (closes #10385) 2016-08-20 00:52:37 +08:00
Yen Chi Hsuan
19f35402c5 [snotr] Fix extraction (closes #10338) 2016-08-20 00:18:22 +08:00
Yen Chi Hsuan
70852b47ca [utils] Recognize units with full names in parse_filename
Reference: https://en.wikipedia.org/wiki/Template:Quantities_of_bytes
2016-08-20 00:17:26 +08:00
Yen Chi Hsuan
a9a3b4a081 [miomio] Adapt to the new API and update _TESTS
The test case is from #9680
2016-08-20 00:08:23 +08:00
Yen Chi Hsuan
ecc90093f9 [vuclip] Adapt to the new API and update _TEST 2016-08-19 23:56:09 +08:00
Yen Chi Hsuan
520251c093 [extractor/common] Recognize m3u8 manifests in HTML5 multimedia tags 2016-08-19 23:53:47 +08:00
Yen Chi Hsuan
55af45fcab [radiobremen] Update _TEST (closes #10337) 2016-08-19 23:12:30 +08:00
Yen Chi Hsuan
b82232036a [n-tv.de] Fix extraction (closes #10331) 2016-08-19 20:39:28 +08:00
Yen Chi Hsuan
e4659b4547 [utils] Correct octal/hexadecimal number detection in js_to_json 2016-08-19 20:37:17 +08:00
Sergey M․
9e5751b9fe [globo:article] Relax _VALID_URL and video id regex (Closes #10379) 2016-08-19 01:13:45 +07:00
Sergey M․
bd1bcd3ea0 release 2016.08.19 2016-08-19 00:15:12 +07:00
Sergey M․
93a63b36f1 [ChangeLog] Actualize 2016-08-19 00:13:24 +07:00
Sergey M․
8b2dc4c328 [options] Remove output template description from --help
Same reasons as for --format
2016-08-18 23:59:13 +07:00
Sergey M․
850837b67a [porncom] Add extractor (Closes #2251, closes #10251) 2016-08-18 23:52:41 +07:00
Sergey M․
13585d7682 [utils] Recognize lowercase units in parse_filesize 2016-08-18 23:32:00 +07:00
Sergey M․
fd3ec986a4 [generic] Fix dbtv test (Closes #10364) 2016-08-18 21:35:41 +07:00
Sergey M․
b0d578ff7b [dbtv] Relax embed regex 2016-08-18 21:30:55 +07:00
Déstin Reed
b0c8f2e9c8 [DBTV:generic] Add support for embeds 2016-08-18 21:29:27 +07:00
Sergey M․
51815886a9 [vk:wallpost] Fix audio extraction 2016-08-18 06:14:05 +07:00
Sergey M․
08a42f9c74 [vk] Fix authentication on python3 2016-08-18 05:22:23 +07:00
Sergey M․
e15ad9ef09 [keezmovies] PEP 8 2016-08-18 04:39:31 +07:00
Sergey M․
4e9fee1015 [hgtvcom:show] Add extractor (Closes #10365) 2016-08-18 04:37:14 +07:00
Remita Amine
7273e5849b [discoverygo] extend _VALID_URL to support other networks 2016-08-17 11:03:09 +01:00
Sergey M․
b505e98784 [extremetube] Revert display_id 2016-08-17 07:02:13 +07:00
Sergey M․
92cd9fd565 [keezmovies] Make display_id optional 2016-08-17 07:01:32 +07:00
Sergey M․
b3d7dce429 release 2016.08.17 2016-08-17 06:21:21 +07:00
Sergey M․
a44694ab4e [ChangeLog] Actualize 2016-08-17 06:19:22 +07:00
Sergey M․
ab19b46b88 [extremetube] Modernize 2016-08-17 06:02:12 +07:00
Sergey M․
8804f10e6b [tube8] Modernize 2016-08-17 05:46:45 +07:00
Sergey M․
6be17c0870 [mofosex] Extract all formats and modernize (Closes #10335) 2016-08-17 05:45:49 +07:00
Sergey M․
8652770bd2 [keezmovies] Improve and modernize 2016-08-17 05:44:46 +07:00
Sergey M․
2a1321a272 [vbox7:generic] Add support for vbox7 embeds 2016-08-17 01:02:59 +07:00
Sergey M․
9c0fa60bf3 [vbox7] Add support for embed URLs 2016-08-17 00:42:02 +07:00
Sergey M․
502d87c546 [mtg] Improve view count extraction 2016-08-17 00:32:28 +07:00
Sergey M․
b35b0d73d8 [viafree] Add extractor (Closes #10358) 2016-08-17 00:21:30 +07:00
Sergey M․
6e7e4a6edf [mtg] Add support for viafree URLs (#10358) 2016-08-17 00:19:43 +07:00
Remita Amine
53fef319f1 [fxnetworks] extend _VALID_URL to support simpsonsworld.com 2016-08-16 16:22:34 +01:00
Remita Amine
2cabee2a7d [amcnetworks] fix typo 2016-08-16 16:22:34 +01:00
Remita Amine
11f502fac1 [theplatform] extract subtitles with multiple formats from the metadata 2016-08-16 16:22:34 +01:00
Sergey M․
98affc1a48 [xvideos] Fix test 2016-08-16 21:20:15 +07:00
Sergey M․
70a2829fee [xvideos] Fix HLS extraction (Closes #10356) 2016-08-16 21:17:52 +07:00
Remita Amine
837e56c8ee [amcnetworks] extract episode metadata 2016-08-16 14:49:32 +01:00
Remita Amine
b5ddee8c77 [amcnetworks] Add new extractor 2016-08-16 13:44:01 +01:00
Sergey M․
fb64adcbd3 [adobepass] PEP 8 2016-08-16 04:45:21 +07:00
Sergey M․
4f640f2890 [bbc:playlist] Fix tests 2016-08-16 04:43:10 +07:00
Sergey M․
254e64a20a [bbc:playlist] Add support for pagination (Closes #10349) 2016-08-16 04:36:23 +07:00
Remita Amine
818ac213eb [adobepass] add IE suffix to the extractor and remove duplicate constant 2016-08-15 21:36:34 +01:00
Remita Amine
cbef4d5c9f [fxnetworks] add test and check geo restriction 2016-08-15 17:10:45 +01:00
Remita Amine
bf90c46790 [fxnetworks] Add new extractor(closes #9462) 2016-08-15 16:34:32 +01:00
Yen Chi Hsuan
69eb4d699f [cbsnews] Remove invalid tests. CBS Live videos gets deleted soon. 2016-08-15 20:29:22 +08:00
Yen Chi Hsuan
6d8ec8c3b7 [ChangeLog] Update for CBSLocal and related changes 2016-08-15 13:39:43 +08:00
Yen Chi Hsuan
760845ce99 [cbslocal] Adapt to SendtoNewsIE 2016-08-15 13:37:37 +08:00
Yen Chi Hsuan
5c2d087221 [sendtonews] Fix extraction 2016-08-15 13:31:08 +08:00
Yen Chi Hsuan
b6c4e36728 [jwplatform] Parse video_id from JWPlayer data
And remove a mysterious comma from 115c65793a
2016-08-15 13:29:01 +08:00
Sergey M․
1a57b8c18c [zippcast] Remove extractor (Closes #10332)
ZippCast is shut down
2016-08-15 08:25:24 +07:00
Remita Amine
24eb13b1c6 [uplynk,viceland] update tests and change uplynk extractors names 2016-08-14 22:45:43 +01:00
Remita Amine
525e0316c0 [adobepass] fix check for pendingLogout errors 2016-08-14 21:25:43 +01:00
Remita Amine
7e60ce9cf7 [adobepass] clear cache in case of pendingLogout errors 2016-08-14 21:24:33 +01:00
Remita Amine
e811bcf8f8 [viceland] raise ExtractorError for errors other than HTTP 400 2016-08-14 20:13:35 +01:00
Remita Amine
6103f59095 [viceland] remove outdated comment 2016-08-14 19:08:35 +01:00
Remita Amine
9fa5789279 [viceland] fix info extraction(closes #8799) 2016-08-14 19:04:23 +01:00
Remita Amine
d2ac04674d [viceland] Add new extractor(#8799) 2016-08-14 18:04:50 +01:00
Remita Amine
1fd6e30988 [adobepass] create separate class for adobe pass authentication 2016-08-14 18:04:50 +01:00
Sergey M․
884cdb6cd9 [life:embed] Improve extraction 2016-08-14 20:49:11 +07:00
Remita Amine
9771b1f901 [theplatform] use _get_netrc_login_info and fix session expiration check(#10345) 2016-08-14 11:55:28 +01:00
Remita Amine
2118fdd1a9 [common] add separate method for getting netrc ligin info 2016-08-14 11:55:28 +01:00
Sergey M․
320d597c21 [vgtv] Detect geo restricted videos (#10348) 2016-08-14 16:25:14 +07:00
Remita Amine
aaf44a2f47 [uplynk] Add new extractor 2016-08-13 22:53:41 +01:00
Yen Chi Hsuan
fafabc0712 Update ChangeLog for #10342
[skip ci]
2016-08-14 02:33:15 +08:00
Yen Chi Hsuan
409760a932 Merge pull request #10342 from muphil/patch-1
[xiami] bug fix for extractor xiami.py
2016-08-14 02:30:50 +08:00
phi
097eba019d bug fix for extractor xiami.py
Before applying this patch, when downloading resources from xiami.com, it crashes with these:
Traceback (most recent call last):
  File "/home/phi/.local/bin/youtube-dl", line 11, in <module>
    sys.exit(main())
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/__init__.py", line 433, in main
    _real_main(argv)
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/__init__.py", line 423, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 1786, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 691, in extract_info
    ie_result = ie.extract(url)
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/extractor/common.py", line 347, in extract
    return self._real_extract(url)
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/extractor/xiami.py", line 116, in _real_extract
    return self._extract_tracks(self._match_id(url))[0]
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/extractor/xiami.py", line 43, in _extract_tracks
    '%s/%s%s' % (self._API_BASE_URL, item_id, '/type/%s' % typ if typ else ''), item_id)
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/extractor/common.py", line 562, in _download_json
    json_string, video_id, transform_source=transform_source, fatal=fatal)
  File "/home/phi/.local/lib/python3.5/site-packages/youtube_dl/extractor/common.py", line 568, in _parse_json
    return json.loads(json_string)
  File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'NoneType'

This patch solves exactly this problem.
2016-08-14 02:18:59 +08:00
Sergey M․
73a85620ee release 2016.08.13 2016-08-13 23:17:11 +07:00
Sergey M․
a560f28c98 [ChangeLog] Actualize 2016-08-13 23:01:35 +07:00
Sergey M․
5ec5461e1a [pbs] Clarify comment on http formats 2016-08-13 22:50:18 +07:00
Sergey M․
542130a5d9 [pbs] Fix description extraction and update tests 2016-08-13 21:59:29 +07:00
Sergey M․
82997dad57 [franceculture] Fix extraction (Closes #10324) 2016-08-13 21:00:34 +07:00
Sergey M․
647a7bf5e8 [pornotube] Fix extraction (Closes #10322) 2016-08-13 20:49:16 +07:00
Sergey M․
77afa008dd [4tube] Fix metadata extraction (Closes #10321) 2016-08-13 19:55:09 +07:00
Yen Chi Hsuan
db535435b3 [bigflix] Remove an invalid test
There's no video anymore
2016-08-13 18:02:11 +08:00
Sergey M․
c2a453b461 [imgur] Fix width and height extraction (Closes #10325) 2016-08-13 16:46:07 +07:00
Sergey M․
cd29eaab95 [vbox7] Remove unused imports 2016-08-13 16:45:34 +07:00
Yen Chi Hsuan
52aa7e7476 [test_verbose_output] Fix tests under Python 3 2016-08-13 17:36:14 +08:00
Sergey M․
e97c55ee6a [expotv] Improve extraction and update test 2016-08-13 16:29:05 +07:00
Remita Amine
acfccacad5 [downloader/external:curl] Clarify why CurlFD should not capture stderr 2016-08-13 10:26:02 +01:00
Remita Amine
5f2c2b7936 [test_utils] add test for option with not str value 2016-08-13 09:54:12 +01:00
Sergey M․
cb55908e51 [vbox7] Fix extraction (Closes #10309) 2016-08-13 15:47:20 +07:00
Yen Chi Hsuan
e581224843 [tapely] Remove extractor. It's shut down
Closes #10323
2016-08-13 16:32:07 +08:00
Remita Amine
f50365e91c [pbs] add test for videos with undocumented http formats and remove unused import 2016-08-13 09:10:09 +01:00
Sergey M․
c366f8d30a [24video] Add support for me and xxx TLDs 2016-08-13 14:47:51 +07:00
Sergey M․
6a26c5f9d5 [muenchentv] Fix extraction (Closes #10313) 2016-08-13 14:28:44 +07:00
Sergey M․
bd6fb007de [24video] Fix comment count extraction 2016-08-13 14:22:47 +07:00
Sergey M․
b69b2ff736 [sunporno] Add support for embed URLs 2016-08-13 14:13:49 +07:00
Sergey M․
794e5dcd7e [sunporno] Fix metadata extraction (Closes #10316) 2016-08-13 14:09:35 +07:00
Remita Amine
f0d3669437 [hgtv] Add new extractor(closes #3999) 2016-08-12 18:05:49 +01:00
Remita Amine
98e698f1ff [external/curl] respect more downloader options and display progress 2016-08-12 12:30:02 +01:00
Remita Amine
3cddb8d6a7 [pbs] check all http formats and remove unnecessary request
- some of the quality that not reported in the documentation
are available(4500k, 6500k)
- the videoInfo request doesn't work for a long time
2016-08-12 08:38:06 +01:00
Sergey M․
990d533ee4 [crunchyroll] Add support for HLS (Closes #10301) 2016-08-12 00:56:16 +07:00
Sergey M․
b0081562d2 release 2016.08.12 2016-08-12 00:22:22 +07:00
Sergey M․
fff37cfd4f [ChangeLog] Actualize 2016-08-12 00:18:28 +07:00
Sergey M․
a3be69b7f0 [viu] Remove from extractors 2016-08-12 00:14:51 +07:00
Sergey M․
0fd1b1624c [goldenmoustache] Remove extractor (Closes #10298)
Now uses dailymotion
2016-08-11 23:52:17 +07:00
Sergey M․
367976d49f [drtuber] Improve title extraction 2016-08-11 23:47:52 +07:00
Sergey M․
0aef0771f8 [drtuber] Make dislike count optional (Closes #10297) 2016-08-11 23:47:27 +07:00
Sergey M․
0c070681c5 [chirbit] Fix extraction (Closes #10296) 2016-08-11 23:37:56 +07:00
Sergey M․
30b25d382d [francetvinfo] Relax _VALID_URL 2016-08-11 21:42:55 +07:00
Yen Chi Hsuan
e5f878c205 [ChangeLog] Add change log for #10269
[skip ci]
2016-08-11 19:13:41 +08:00
Yen Chi Hsuan
e2e84aed7e Merge branch 'lkho-pr/#10268' 2016-08-11 19:09:18 +08:00
Yen Chi Hsuan
b1927f4e8a [YoutubeDL] Disable newline conversion when writing subtitles
By default io.open() convert all '\n' occurrences to '\r\n' when writing
files. If the content already contains '\r\n', it will be converted to
'\r\r\n', breaking some video players.
2016-08-11 19:04:23 +08:00
Yen Chi Hsuan
3b9323d96e Merge branch 'pr/#10268' of https://github.com/lkho/youtube-dl into lkho-pr/#10268 2016-08-11 19:03:08 +08:00
lkho
7f832413d6 Preserve line endings for downloaded subtitle files 2016-08-10 23:40:50 +08:00
Sergey M․
7f2ed47595 [rtlnl] Relax _VALID_URL (Closes #10282) 2016-08-10 21:07:43 +07:00
Sergey M․
c3fa77bdef [formula1] Relax _VALID_URL (Closes #10283) 2016-08-10 21:00:40 +07:00
Remita Amine
57ce8a6d08 [wat] improve extraction(#10281)
add alternative method to extract http formats
works even if the video is geo-restricted or removed
from public access(most of the cases)
2016-08-10 14:20:28 +01:00
Yen Chi Hsuan
69d8eeeec5 [ctsnews] Fix extraction 2016-08-10 11:38:38 +08:00
Yen Chi Hsuan
81c13222c6 [utils] Recognize more formats in unified_timestamp
Used in CtsNews
2016-08-10 11:37:23 +08:00
Sergey M․
b1ce2ba197 release 2016.08.10 2016-08-10 00:20:44 +07:00
Sergey M․
5c8411e968 [ChangeLog] Actualize 2016-08-10 00:18:28 +07:00
Sergey M․
cc9c8ce5df [devscripts/prepare_manpage] Fix description strings starting with dash (Closes #10273) 2016-08-09 22:24:58 +07:00
Remita Amine
20ef4123b9 [uol] remove unused import 2016-08-09 15:13:15 +01:00
Remita Amine
4e62d26aa2 [uol] Add new extractor(#4263) 2016-08-09 15:09:08 +01:00
Sergey M․
b657816684 Credit @singh-pratyush96 for #10223 2016-08-09 04:04:45 +07:00
Sergey M․
9778b3e7ee Credit @zvonicek for #10242 and #10253 2016-08-09 04:03:52 +07:00
Sergey M․
25dd58ca6a [metadatafromtitle] Remove unused exception class 2016-08-09 04:01:05 +07:00
nyorain
5e42f8a0ad Make --metadata-from-title non fatal
Output a warning if the metadata can't be parsed from the title (and don't write any metadata) instead of raising a critical error.
2016-08-09 03:56:22 +07:00
Sergey M․
1ad6b891b2 Add more checks for --min/max-sleep-interval arguments and use more idiomatic naming 2016-08-09 03:47:56 +07:00
Sergey M․
7aa589a5e1 Fix --min/max-sleep-interval wording 2016-08-09 03:46:52 +07:00
singh-pratyush96
065bc35489 Add --max-sleep-interval (Closes #9930) 2016-08-09 03:32:42 +07:00
Sergey M․
3a380766d1 [rbmaradio] Improve, simplify and extract all formats (Closes #10242) 2016-08-09 02:46:29 +07:00
Petr Zvoníček
affaea0688 [rbmaradio] Fixed extractor 2016-08-09 02:18:33 +07:00
Sergey M․
77426a087b [sonyliv] Improve (Closes #10258) 2016-08-09 02:16:28 +07:00
Sukhbir Singh
8991844ea2 [sonyliv] Add new extractor 2016-08-09 02:09:13 +07:00
Sergey M․
082395d0a0 [extractor/generic] Add proper default to _search_json_ld call 2016-08-08 22:48:33 +07:00
Sergey M․
e8ed7354e6 [flipagram] Add proper default to _search_json_ld call 2016-08-08 22:46:19 +07:00
Sergey M․
1e7f602e2a [condenast] Make _search_json_ld call non fatal 2016-08-08 22:45:49 +07:00
Sergey M․
522f6c066d [bbc] Add proper default to _search_json_ld call 2016-08-08 22:44:36 +07:00
Sergey M․
321b5e082a [extractor/common] Respect default in _search_json_ld 2016-08-08 22:36:18 +07:00
Sergey M․
3711fa1eb2 Revert "[flipagram] Make _search_json_ld non fatal"
This reverts commit d34995a9e3.
2016-08-08 21:49:45 +07:00
Sergey M․
395c74615c Revert "[extractor/generic] Make _search_json_ld non fatal"
This reverts commit 958849275f.
2016-08-08 21:49:27 +07:00
Yen Chi Hsuan
3dc240e8c6 [sohu] Update _TESTS (closes #10260) 2016-08-08 18:48:21 +08:00
Yen Chi Hsuan
a41a6c5094 [chaturbate] Skip the invalid test 2016-08-08 13:06:02 +08:00
Yen Chi Hsuan
d71207121d [biqle] Skip an invalid test 2016-08-08 12:59:55 +08:00
Yen Chi Hsuan
b1c6f21c74 [aparat] Fix extraction 2016-08-08 12:59:07 +08:00
Yen Chi Hsuan
412abb8760 [bilibili] Update _TESTS 2016-08-08 12:57:17 +08:00
Yen Chi Hsuan
f17d5f6d14 [features.aol.com] Fix _TESTS 2016-08-08 12:52:36 +08:00
Remita Amine
6bb801cfaf [cwtv] extract http formats 2016-08-07 22:58:12 +01:00
Sergey M․
de02d1f4e9 [rozhlas] Fix regexes and improve extraction (Closes #10253) 2016-08-08 04:58:02 +07:00
Petr Zvoníček
e1f93a0a76 [rozhlas] Add new extractor 2016-08-08 04:41:45 +07:00
Charlie Le
d21a661bb4 [README.md] Update Options Link
The link references a bad anchor. The updated link now references the correct anchor.
2016-08-08 03:46:42 +07:00
Yen Chi Hsuan
b2bd968f4b [kuwo:singer] Fix extraction 2016-08-07 22:59:34 +08:00
Sergey M․
4a01befb34 release 2016.08.07 2016-08-07 21:12:41 +07:00
Sergey M․
845dfcdc40 [ChangeLog] Actualize 2016-08-07 21:10:48 +07:00
Sergey M․
d92cb46305 [discoverygo] Add extractor (Closes #10245) 2016-08-07 20:57:05 +07:00
Sergey M․
a8795327ca [utils] Add support TV Parental Guidelines ratings in parse_age_limit 2016-08-07 20:45:18 +07:00
Sergey M․
d34995a9e3 [flipagram] Make _search_json_ld non fatal 2016-08-07 19:06:55 +07:00
Sergey M․
958849275f [extractor/generic] Make _search_json_ld non fatal 2016-08-07 19:04:22 +07:00
Sergey M․
998f094452 [bbc] Remove proxy from test 2016-08-07 18:13:05 +07:00
Sergey M․
aaa42cf0cf [bbc] PEP 8 2016-08-07 18:05:13 +07:00
Sergey M․
9fb64c04cd [bbc] Add support for morph embeds (Closes #10239) 2016-08-07 18:01:50 +07:00
Remita Amine
f9622868e7 [bbc] preserve format_id backward compatibility 2016-08-07 11:14:15 +01:00
Remita Amine
37768f9242 [common] correctly lower the preference of m3u8 master manifest format 2016-08-07 10:59:09 +01:00
Sergey M․
a1aadd09a4 [tnaflixnetworkbase] Improve title extraction 2016-08-07 16:00:09 +07:00
Sergey M․
b47a75017b [tnaflix] Fix metadata extraction (Closes #10249) 2016-08-07 16:00:03 +07:00
Remita Amine
e37b54b140 [fox] fix theplatform release url query 2016-08-06 20:53:39 +01:00
Yen Chi Hsuan
c1decda58c [openload] Fix extraction (closes #9706) 2016-08-07 02:44:15 +08:00
Yen Chi Hsuan
d3f8e038fe [utils] Add decode_png for openload (#9706) 2016-08-07 02:42:58 +08:00
Remita Amine
ad152e2d95 [bbc] fix test 2016-08-06 19:36:12 +01:00
Remita Amine
b0af12154e [bbc] reduce requests and improve format_id 2016-08-06 19:24:59 +01:00
Remita Amine
d16b3c6677 [common] extract partOfTVSeries info in json-ld 2016-08-06 18:58:38 +01:00
Remita Amine
c57244cdb1 [common] lower the preference of m3u8 master manifest format 2016-08-06 18:55:05 +01:00
Remita Amine
a7e5f27412 [bbc] improve extraction
- extract f4m and dash formats
- improve format sorting and listing
- improve extraction of articles with `otherSettings.playlist`
2016-08-06 18:48:09 +01:00
Remita Amine
089a40955c [pokemon] improve _VALID_URL 2016-08-06 12:08:14 +01:00
Remita Amine
d73ebac100 [pokemon] Add new extractor(closes #10093) 2016-08-06 11:18:14 +01:00
Remita Amine
e563c0d73b [condenast] fallback to loader.js if video.js fail 2016-08-05 21:01:16 +01:00
Sergey M․
491c42e690 release 2016.08.06 2016-08-06 01:23:48 +07:00
Sergey M․
7f2339c617 [ChangeLog] Actualize 2016-08-06 01:19:47 +07:00
Sergey M․
8122e79fef [gamekings] Remove remnants 2016-08-06 00:12:37 +07:00
Sergey M․
fe3ad1d456 [adultswim] Remove superfluous md5 from test 2016-08-06 00:02:05 +07:00
Sergey M․
038a5e1a65 [adultswim] Add support for trailers (Closes #10235) 2016-08-06 00:00:05 +07:00
Sergey M․
84bc23b41b [archiveorg] PEP 8 2016-08-05 23:16:19 +07:00
Sergey M․
46933a15d6 [extractor/common] Support root JSON-LD lists (Closes #10203) 2016-08-05 23:14:32 +07:00
Sergey M․
3859ebeee6 [tvplay] Capture and output native error message 2016-08-05 22:50:42 +07:00
Remita Amine
d50aca41f8 [archiveorg] improve format extraction(closes #10219) 2016-08-05 16:42:15 +01:00
Remita Amine
0ca057b965 [jwplatform] add support for playlist extraction and relative urls and improve audio detection 2016-08-05 16:42:15 +01:00
Sergey M․
5ca968d0a6 [tvplay] Extract series metadata 2016-08-05 22:37:38 +07:00
Sergey M․
f0d31c624e [tvplay] Add support for subtitles (Closes #10194) 2016-08-05 22:17:32 +07:00
Remita Amine
08c655906c [5min] fix _VALID_URL(closes #10228) 2016-08-05 10:22:33 +01:00
Remita Amine
5a993e1692 [natgeo] fix tests(closes #10229) 2016-08-05 10:13:26 +01:00
Remita Amine
a7d2953073 [extractors] add tvp:embed import 2016-08-05 10:11:59 +01:00
Remita Amine
fdd0b8f8e0 [tvp] extract video id from the webpage(fixes #7799) 2016-08-05 09:44:15 +01:00
Remita Amine
f65dc41b72 [naver] extract upload date 2016-08-05 08:12:25 +01:00
Yen Chi Hsuan
962250f7ea [cbslocal] Fix timestamp parsing (closes #10213) 2016-08-05 11:44:50 +08:00
Yen Chi Hsuan
7dc2a74e0a [utils] Fix unified_timestamp for formats parsed by parsedate_tz() 2016-08-05 11:41:55 +08:00
Remita Amine
b02b960c6b [naver] improve extraction(closes #8096) 2016-08-04 21:42:22 +01:00
Remita Amine
4f427c4be8 [condenast] improve extraction 2016-08-04 18:30:56 +01:00
Sergey M․
8a00ea567b [natgeo:episodeguide] Do not shadow url from outer scope 2016-08-04 23:21:04 +07:00
Remita Amine
8895be01fc [5min] fix _VALID_URL 2016-08-04 16:55:12 +01:00
Remita Amine
52e7fcfeb7 [engadget] Relax _VALID_URL 2016-08-04 16:34:47 +01:00
Remita Amine
2396062c74 [5min] delegate extraction to AolIE
recently the 5min SenseHandler request return
HTTP Error 503: Service Unavailable error
2016-08-04 16:21:27 +01:00
Remita Amine
14704aeff6 [kaltura] remove debugging line 2016-08-04 14:54:34 +01:00
Remita Amine
3c2c3af059 [extractors] change imports for national geographic extractors 2016-08-04 12:20:56 +01:00
Remita Amine
1891ea2d76 [nationalgeographic] Add support for National Geographic Episode Guide 2016-08-04 12:18:10 +01:00
Remita Amine
1094074c04 [kaltura] extract subtitles and reduce requests 2016-08-04 09:39:06 +01:00
Remita Amine
217d5ae013 [vodplatform] Add new extractor 2016-08-04 09:39:06 +01:00
Remita Amine
8b40854529 [common] lower proto_preference of rtsp formats
Most of the time the RtspFD fail to download videos but it report
success of the download with this output:
[mpv] 0 bytes
[download] 100% of 0.00B
2016-08-04 09:39:06 +01:00
Yen Chi Hsuan
6bb0fbf9fb Revert "[README.md] Use full paths for all configuration files (#8863)"
This reverts commit 899d2bea63.
2016-08-04 09:54:28 +08:00
Sergey M․
8d3b226b83 [gamekings] Remove extractor
Now covered by generic jwplayer
2016-08-03 22:06:10 +07:00
Remita Amine
42b7a5afe0 [limelight] extract http formats 2016-08-03 13:12:51 +01:00
Yen Chi Hsuan
899d2bea63 [README.md] Use full paths for all configuration files (#8863) 2016-08-03 11:15:27 +08:00
Sergey M․
9cb0e65d7e [ntvru] Fix extraction 2016-08-02 22:56:48 +07:00
Sergey M․
b070564efb [extractor/common] Support multiple properties in _og_search_property 2016-08-02 22:55:14 +07:00
Philipp Hagemeister
ce28252c48 [options] Add test that checks that --password=secret is hidden in verbose output 2016-08-02 17:03:46 +02:00
Philipp Hagemeister
3aa9a73554 [options] Hide --password=secret in verbose output 2016-08-02 17:03:26 +02:00
Philipp Hagemeister
6a9b3b61ea [comedycentral] Re-add shortnames
In cc99d4f826, the shortname feature got deleted by accident. Re-add it as a separate IE.
2016-08-02 14:02:31 +02:00
Sergey M․
45408eb075 release 2016.08.01 2016-08-01 22:59:23 +07:00
Sergey M․
eafc66855d [ChangeLog] Add recent changes 2016-08-01 22:56:01 +07:00
Sergey M․
e03d3e6453 [cwtv] Add support for cwtvpr.com (Closes #10196) 2016-08-01 22:51:01 +07:00
Remita Amine
a70e45f80a [limelight] keep videos marked as previewStream
e382b953f0 (commitcomment-18472915)
2016-08-01 16:25:41 +01:00
Sergey M․
697655a7c0 [safari] Relax url regexes (Closes #10202) 2016-08-01 21:48:48 +07:00
Remita Amine
e382b953f0 [limelight] skip preview and drm protected videos 2016-08-01 00:33:30 +01:00
Yen Chi Hsuan
116e7e0d04 [bloomberg] Support BPlayer() players (closes #10187) 2016-07-31 14:47:19 +08:00
Sergey M․
cf03e34ad3 [yandexmusic:track] Fix extraction (Closes #10193) 2016-07-31 07:56:18 +07:00
Sergey M․
2903137292 release 2016.07.30 2016-07-30 14:45:07 +07:00
Sergey M․
9361f2169c [ChangeLog] Make extractor improvements' descriptions more concrete 2016-07-30 14:43:28 +07:00
Yen Chi Hsuan
35aa6c538f Add ChangeLog 2016-07-30 12:33:09 +08:00
Sergey M․
fa9f1d16b8 [dailymotion:playlist] Carry long line 2016-07-29 22:47:34 +07:00
Dave
485fedf6fd [dailymotion:playlist] Optimize download archive processing 2016-07-29 22:45:41 +07:00
Jaime Marquínez Ferrándiz
da0baba5c8 [rtve] Fix extraction for some videos
For example http://www.rtve.es/alacarta/videos/documentos-tv/documentos-tv-descredito/3574098/.
2016-07-29 17:20:27 +02:00
Jaime Marquínez Ferrándiz
bb9f3bfedf Revert "[rtve] Fix extraction (#10076)"
This reverts commit c39b2ed990.

Apparently outside of Spain using 'auth/resources' is required (#10097).
2016-07-29 17:14:04 +02:00
Sergey M․
dbc0b39b91 [tv2] Improve extraction 2016-07-29 22:01:34 +07:00
Sergey M․
481c5c5137 [tv2:article] Fix extraction (Closes #10188) 2016-07-29 21:43:17 +07:00
Sergey M․
0cacae2807 [twitch:clips] Sort formats 2016-07-29 09:01:53 +07:00
Sergey M․
d9d56deadf release 2016.07.28 2016-07-28 02:42:57 +07:00
Sergey M․
74ba450a81 [twitch:clips] Fix extraction (Closes #9767) 2016-07-28 22:30:09 +07:00
Sergey M․
db19df6ca0 [extractor/generic] Add test for #10179 2016-07-28 22:20:08 +07:00
Sergey M․
fbdf8d15d1 [soundcloud] Add _extract_urls (#10179) 2016-07-28 22:16:05 +07:00
Sergey M․
94aae01548 [extractor/generic] Extract all soundcloud embeds (Closes #10179) 2016-07-28 22:15:15 +07:00
Sergey M․
39eef54cf0 [ard:mediathek] Skip unavailable test 2016-07-28 21:38:23 +07:00
Sergey M․
05c8268c81 [shared] Modernize and make more robust 2016-07-27 23:39:02 +07:00
Sergey M․
289a16b4f3 [shared] Respect redirect URL (Closes #10170) 2016-07-27 23:28:01 +07:00
Sergey M․
7935926baa [devscripts/show-downloads-statistics] Add support for paging 2016-07-27 00:14:40 +07:00
Sergey M․
dcbb07c35a release 2016.07.26.2 2016-07-26 23:56:53 +07:00
Sergey M․
40090e8d51 [extractor/common] Improve is_suitable
In order to fix breakage introduced by a3aa814b77
2016-07-26 23:54:06 +07:00
Sergey M․
3e050d51d4 [orf:oe1] Relax _VALID_URL 2016-07-26 23:14:04 +07:00
Sergey M․
ced70c8640 [cbc] PEP 8 2016-07-26 23:08:08 +07:00
Sergey M․
9a700deea4 [instagram] Remove duplicate field in test 2016-07-26 23:07:16 +07:00
Sergey M․
dc35ba0eba [mgtv] Fix typo 2016-07-26 23:06:21 +07:00
Sergey M․
88bd486b9a [cbc] Improve extraction for videos embedded with clipId 2016-07-26 22:58:50 +07:00
Sergey M․
7f8b92e3cf [bigflix] Update tests 2016-07-26 21:44:53 +07:00
Yen Chi Hsuan
35f6e0ff36 [mtv.de] Skip 2 geo-restricted tests 2016-07-26 13:19:47 +08:00
Yen Chi Hsuan
326fa4e6e5 [generic] Skip an invalid test 2016-07-26 13:16:04 +08:00
Yen Chi Hsuan
c74299a72c [cmt] Detect unavailable videos and update _TESTS 2016-07-26 13:13:14 +08:00
Yen Chi Hsuan
10a1bb3a78 [mtv] Fix for videos with missing bitrates 2016-07-26 13:12:24 +08:00
Yen Chi Hsuan
4d3e543c73 Update extractors.py 2016-07-26 11:17:28 +08:00
Yen Chi Hsuan
05d1e7aaa9 [generic] Fix an MTV test and another test that breaks nosetests 2016-07-26 11:11:36 +08:00
Yen Chi Hsuan
a3aa814b77 Update _TESTS for MTV sites 2016-07-26 11:10:41 +08:00
Yen Chi Hsuan
5c32a77cad [nextmovie] Remove extractor
This domain name now redirects to mtv.com
2016-07-26 11:08:55 +08:00
Yen Chi Hsuan
14a28e705b [test/test_all_urls] Remove *.cc.com tests 2016-07-26 11:08:09 +08:00
Yen Chi Hsuan
cc99d4f826 [comedycentral] Remove IEs for *.cc.com except tosh.cc.com
All other subdomains now redirects to cc.com/* URLs
2016-07-26 11:06:50 +08:00
Yen Chi Hsuan
712c7530ff [mtv] Extract more metadata and more
1. Remove MTVIggyIE. All www.mtviggy.com URLs now redirects to
   www.mtv.com
2. Fix MTVDEIE
3. Return multiple URLs from _transform_rtmp_url. This is for
   tosh.cc.com
2016-07-26 11:03:43 +08:00
Sergey M․
0a147785e8 [camdemy] Extract duration properly 2016-07-25 23:03:58 +07:00
Sergey M․
59eaf69e33 [camdemy] Fix camdemy 2016-07-25 23:03:43 +07:00
Sergey M․
e8be2943a7 [smotri] Modernize, make more robust and fix tests 2016-07-24 18:38:18 +07:00
Sergey M․
8fdc538b46 release 2016.07.24 2016-07-24 11:39:50 +07:00
Sergey M․
9513c1eb17 [tvp] Update dash format comment 2016-07-24 11:03:39 +07:00
Sergey M․
ae6fff4e64 [onet] Enable dash formats 2016-07-24 10:43:05 +07:00
Sergey M․
5a65668e25 [dcn] Enable dash formats 2016-07-24 10:35:55 +07:00
Sergey M․
f75e6890db [telegraaf] Make hls non fatal 2016-07-24 10:29:26 +07:00
Sergey M․
d9cb92c840 [telegraaf] Enable dash formats 2016-07-24 10:29:09 +07:00
Sergey M․
94c04a3c79 [arkena] Enable dash formats 2016-07-24 10:28:11 +07:00
Sergey M․
f094834857 [extractor/common] Add support for $ in SegmentTemplate in MPD manifests 2016-07-24 10:27:16 +07:00
Déstin Reed
111de00289 [DailyMail] Improve title and description extraction 2016-07-24 05:37:13 +07:00
Sergey M․
b4a131e1a5 [facebook] Relax _VALID_URL (Closes #10151) 2016-07-24 04:36:49 +07:00
Sergey M․
f1991ce928 [arkena] Skip dash formats 2016-07-23 18:07:55 +07:00
Sergey M․
6548030a17 Credit @rvanbekkum for arkena (#8682) 2016-07-23 18:00:19 +07:00
Sergey M․
3a8947650b [arkenaplay] Remove extractor 2016-07-23 17:57:55 +07:00
Sergey M․
1979969f91 [extractor/generic] Add support for arkena embeds 2016-07-23 17:56:48 +07:00
Sergey M․
0673741af3 [extractors] Add imports for arkena and lcp 2016-07-23 17:56:29 +07:00
Sergey M․
c8e170b209 [lcp] Improve extraction 2016-07-23 17:56:11 +07:00
Sergey M․
bbe1f3634a [arkena] Improve extraction (Closes #8682) 2016-07-23 17:55:54 +07:00
Rob van Bekkum
4671dd41b2 [arkena:lcp] Add extractors 2016-07-23 17:01:09 +07:00
Sergey M․
f164b97123 [utils] Add another f4m mimetype to mimetype2ext 2016-07-23 16:48:59 +07:00
Sergey M․
5275efe30d release 2016.07.22 2016-07-22 23:11:28 +07:00
Sergey M․
b13647cf3c [eporner] Fix extraction (Closes #10139) 2016-07-22 23:04:13 +07:00
Sergey M․
add7d2a0e2 [pornhub] Make error regex less ambiguous (Closes #10138) 2016-07-22 21:24:09 +07:00
Sergey M․
e298d3a08c [youtube] Fix authentication (Closes #10140) 2016-07-22 21:05:39 +07:00
Sergey M․
fd8c8c7dcd [youtube:shared] Relax _VALID_URL 2016-07-21 22:58:34 +07:00
Sergey M․
9158af16cc [bbc.co.uk:iplayer:playlist] Add support for group URLs 2016-07-21 22:37:36 +07:00
Sergey M․
c6668e4ad1 [bbc.co.uk:iplayer:playlist] Skip unavailable test 2016-07-21 22:34:55 +07:00
Sergey M․
84e8cca48b [youjizz] Relax _VALID_URL (Closes #10131) 2016-07-20 22:41:13 +07:00
Sergey M․
790b06b7d4 [odatv] Improve (Closes #9285) 2016-07-20 21:43:22 +07:00
skacurt
740d7c49c2 [odatv] Add extractor 2016-07-20 21:42:05 +07:00
Sergey M․
4e51ec5f57 [extractors] Add import for comedycentral.tv 2016-07-19 22:50:37 +07:00
Sergey M․
05087d1b4c [bbc] Improve extraction from sxml playlists 2016-07-19 22:49:38 +07:00
Sergey M․
a66a73ee90 [ard] Add test for rbb-online 2016-07-18 02:25:31 +07:00
Sergey M․
8188b923db release 2016.07.17 2016-07-17 19:04:29 +07:00
Sergey M․
d993a1354d [README.md] Make download URLs consistent 2016-07-17 18:58:47 +07:00
Sergey M․
e8882e7043 [spike] Relax _VALID_URL and improve extraction (Closes #10106) 2016-07-17 18:34:25 +07:00
Sergey M․
1056821799 [viki] Fix tests (Closes #10098) 2016-07-17 18:13:54 +07:00
Sergey M․
890e6d3309 [viki] Lower m3u8 preference
http URLs are always provde the same or better quality
2016-07-17 18:12:03 +07:00
Sergey M․
246080d378 [viki] Override m3u8 formats acodec 2016-07-17 18:10:16 +07:00
Sergey M․
b1ea680270 Revert "[bbc] extract more and better qulities from Unified Streaming Platform m3u8 manifests"
This reverts commit 0385aa6199.
2016-07-17 17:29:36 +07:00
Sergey M․
45550d1039 [comedycentraltv] Add extractor (Closes #10101) 2016-07-17 16:58:58 +07:00
Sergey M․
7cdfc4c90f [mtvservices] Strip description 2016-07-17 16:56:39 +07:00
Sergey M․
af21f56f98 [ard] Add support for rbb-online (Closes #10095) 2016-07-17 03:40:58 +07:00
Sergey M․
1a8f0773b6 [streamable] Fix title extraction and improve (Closes #9122) 2016-07-17 02:01:00 +07:00
Zach Bruggeman
59cc5bd8bf [streamable] Add extractor 2016-07-17 01:35:09 +07:00
Sergey M․
49bc16b95e [nintendo] Improve playlist extraction (Closes #9986) 2016-07-17 00:01:25 +07:00
TRox1972
a2f9ca1e67 [nintendo] Add extractor 2016-07-16 23:58:53 +07:00
Sergey M․
371ddb14fe [extractor/generic] Change twitter:player embeds priority to lowest (Closes #10090) 2016-07-16 15:59:43 +07:00
Yen Chi Hsuan
998895dffa [cloudy] Drop videoraj.to
videoraj.ch is now a shoe-selling website, and videoraj.to domain name
is gone.
2016-07-16 15:37:54 +08:00
Yen Chi Hsuan
aadd3ce21f [cliphunter] Update _TESTS 2016-07-16 15:37:54 +08:00
Yen Chi Hsuan
ae7b846203 [cbsnews] Update _TESTS of CBSNewsLiveVideoIE 2016-07-16 15:37:54 +08:00
Yen Chi Hsuan
21ba7d0981 [cbc] Skip geo-restricted test case 2016-07-16 15:37:54 +08:00
Sergey M․
691fbe7f98 release 2016.07.16 2016-07-16 02:20:00 +07:00
Sergey M․
2e221ca3a8 [YoutubeDL] Fix incomplete formats check 2016-07-16 01:18:05 +07:00
Sergey M․
317f7ab634 [YoutubeDL] Fix format selection with filters (Closes #10083) 2016-07-16 00:55:43 +07:00
Yen Chi Hsuan
23495d6a39 Revert "[ffmpeg] Fix embedding subtitles (#9063)"
This reverts commit ccff2c404d.

Fixes #10081.

The new approach breaks embedding subtitles into video-only or
audio-only files. FFMpeg provides a trick: add '?' after the argument of
'-map' so that a missing stream is ignored. For example:

opts = [
    '-map', '0:v?',
    '-c:v', 'copy',
    '-map', '0:a?',
    '-c:a', 'copy',
    # other options...
]

Unfortunately, such a format is not implemented in avconv, either.
I guess adding '-ignore_unknown' if self.basename == 'ffmpeg' is the
best solution. However, the example mentioned in #9063 no longer serves
problematic files, so I can't test it. I'll reopen #9063 and wait for
another example so that I can test '-ignore_unknown'.
2016-07-15 20:02:36 +08:00
Remita Amine
224db034ab [syfy] fix extraction(closes #9087)(closes #3820)(closes #2388) 2016-07-14 23:59:47 +01:00
Sergey M․
ad27649be3 [3qsdn] Restrict src JS regex 2016-07-15 03:36:50 +07:00
Sergey M․
84571be645 [orf:tvthek] Remove test md5 2016-07-15 03:17:29 +07:00
Nehal Patel
7b0d333a7e Fix unit tests for m3u8 and RTSP extractors that require ffmpeg or mplayer 2016-07-15 03:06:23 +07:00
Remita Amine
342f0c3682 [ninenow] correct test url 2016-07-14 14:19:18 +01:00
Remita Amine
38e0f16a94 [ninenow] Add new extractor(closes #5181) 2016-07-14 14:16:11 +01:00
Remita Amine
e910fe2fe4 [brightcove] skip ism manifests 2016-07-14 14:13:57 +01:00
Jaime Marquínez Ferrándiz
233b58dec7 Add extractor for rtve.es/television (fixes #10076) 2016-07-13 21:02:34 +02:00
Jaime Marquínez Ferrándiz
c39b2ed990 [rtve] Fix extraction (#10076)
For http://www.rtve.es/alacarta/videos/documentos-tv/documentos-tv-revolucion-del-movil/3069778/ using 'auth/resources' fails, and other URLs seem to work fine.
2016-07-13 20:23:27 +02:00
Remita Amine
35ec86689c [bbc] extract only the original Unified Streaming Platform m3u8 manifests
0385aa6199 (commitcomment-18233275)
manifests with higher birate require more time to check formats
2016-07-13 18:01:14 +01:00
Sergey M․
c485959034 release 2016.07.13 2016-07-13 23:58:01 +07:00
Sergey M․
a0560d8ab8 [ellentv] Improve extraction (Closes #10067) 2016-07-13 22:42:53 +07:00
Remita Amine
0385aa6199 [bbc] extract more and better qulities from Unified Streaming Platform m3u8 manifests 2016-07-13 15:58:24 +01:00
Remita Amine
00f4764cb7 [common] extract vbr, abr and fps for Unified Streaming Platform m3u8 manifests 2016-07-13 15:58:24 +01:00
Sergey M․
51c2cd0b83 [extractors] Add vk:wallpost extractor import 2016-07-13 21:53:23 +07:00
Sergey M․
5f5a9d6158 [vk] Improve login 2016-07-13 21:52:52 +07:00
Sergey M․
2d19fb5072 [vk:wallpost] Add extractor 2016-07-13 21:51:44 +07:00
Yen Chi Hsuan
9d865a1af6 [travis] Skip downloading srelay
SOCKS tests never run on Travis CI due to unknown reasons, and
downloading them broke some tests (e.g.
https://travis-ci.org/rg3/youtube-dl/builds/144306425)
2016-07-13 14:27:14 +08:00
Remita Amine
41aa44259d [shahid] try to bypass geo restriction and extract more metadata(closes #10062) 2016-07-12 23:15:38 +01:00
Philipp Hagemeister
381ff44756 [devscripts/generate-download] Remove MD5 and SHA1 2016-07-12 09:09:54 +02:00
Sergey M․
7f29cf545a [youtube] Add YouTube Red paid video reference test (#10059) 2016-07-12 02:10:35 +07:00
Remita Amine
7d1219f3e0 [tmz] delegate extraction to KalturaIE 2016-07-11 19:08:22 +01:00
Remita Amine
f1b4af7d79 [beightcove:new] remove html tags from description 2016-07-11 19:06:50 +01:00
Remita Amine
8a8590a617 [dbtv] delegate extraction to BrightcoveNewIE 2016-07-11 16:30:24 +01:00
Remita Amine
4a7a5e41f7 [tvplay] improve extraction 2016-07-11 14:51:44 +01:00
Yen Chi Hsuan
2a49d01600 [playvid] Update _TESTS
Blocks https://travis-ci.org/rg3/youtube-dl/jobs/143809100
2016-07-11 15:15:28 +08:00
Yen Chi Hsuan
b99af8a51c [biobiochiletv] Fix extraction and update _TESTS 2016-07-11 13:23:57 +08:00
Yen Chi Hsuan
8e7020daef [rudo] Add new extractor
Used in biobiochile.tv
2016-07-11 13:19:25 +08:00
Sergey M․
a26bcc61c1 release 2016.07.11 2016-07-11 03:17:12 +07:00
Sergey M․
5c4dcf8172 [vidzi] Add support for embed URLs (Closes #10058) 2016-07-11 03:14:39 +07:00
Sergey M․
e9fb6a4bbe [youtube] Relax TFA regexes 2016-07-11 03:08:38 +07:00
Yen Chi Hsuan
e2dbcaa1bf [vuclip] Fix extraction 2016-07-11 00:52:25 +08:00
Yen Chi Hsuan
ae01850165 [miomio] Fix _TESTS 2016-07-11 00:03:24 +08:00
Yen Chi Hsuan
c3baaedfc8 [miomio] Support new 'h5' player (closes #9605)
Depends on #8876
2016-07-10 23:46:48 +08:00
Yen Chi Hsuan
0b68de3cc1 Merge pull request #8876 from remitamine/html5_media
[extractor/common] add helper method to extract html5 media entries
2016-07-10 23:40:45 +08:00
Sergey M․
39e9d524e5 Credit @nehalvpatel for roosterteeth (#9864) 2016-07-10 01:30:12 +07:00
Sergey M․
865b087224 [roosterteeth] Improve (Closes #9864) 2016-07-10 01:30:12 +07:00
Nehal Patel
3121b25639 [roosterteeth] Add extractor 2016-07-10 01:30:12 +07:00
Sergey M․
0286b85c79 release 2016.07.09.2 2016-07-09 22:22:24 +07:00
Sergey M․
ab52bb5137 [animeondemand] Fix typo 2016-07-09 22:20:34 +07:00
Sergey M․
61a98b8623 [lynda] Remove md5 from test (Closes #10047) 2016-07-09 21:29:11 +07:00
Sergey M․
6daf34a045 [facebook] Fix typo and break when found video_data (Closes #10048) 2016-07-09 21:25:07 +07:00
Yen Chi Hsuan
c03adf90bd [generic] Add the test. Closes #1638 2016-07-09 14:39:01 +08:00
Yen Chi Hsuan
0ece114b7b [vimeo] Recognize non-standard embeds (#1638) 2016-07-09 14:38:27 +08:00
Yen Chi Hsuan
5b6a74856b Merge pull request #9288 from reyyed/issue#9063fix
[ffmpeg] Fix embedding subtitles (#9063)
2016-07-09 14:29:53 +08:00
Sergey M․
ce43100a01 release 2016.07.09.1 2016-07-09 10:06:40 +07:00
Remita Amine
8cc9b4016d [srmediathek] extend _VALID_URL(closes #9373) 2016-07-09 03:22:09 +01:00
Remita Amine
31eeab9f41 [ard] fix f4m extraction and skip tests with 404 errors 2016-07-09 03:22:09 +01:00
Sergey M․
9558dcec9c [youtube:user] Preserve user/c path segment 2016-07-09 08:37:19 +07:00
Sergey M․
6e6b70d65f [extractor/generic] Properly comment out a test 2016-07-09 08:37:19 +07:00
Sergey M․
d417fd88d0 release 2016.07.09 2016-07-09 07:16:47 +07:00
Sergey M․
9e4f5dc1e9 [animeondemand] Pass num for episode based videos 2016-07-09 07:13:32 +07:00
Sergey M․
1251565ee0 [options] Rollback old behavior for configuratio files' encoding
Until agreed with some solution
2016-07-09 07:12:52 +07:00
Sergey M․
1f7258a367 [animeondemand] Add support for full length films (Closes #10031) 2016-07-09 06:57:04 +07:00
Sergey M․
0af985069b [flipagram] Improve extraction (Closes #9898) 2016-07-09 03:31:17 +07:00
Sergey M․
0de168f7ed [extractor/generic] Detect schema.org/VideoObject embeds 2016-07-09 03:29:07 +07:00
Sergey M․
95b31e266b [extractor/common] Add expected_type in json ld routines 2016-07-09 03:28:04 +07:00
Sergey M․
6b3a3098b5 [extractor/common] Extract more metadata for VideoObject in _json_ld 2016-07-09 03:27:11 +07:00
Sergey M․
2de624fdd5 [extractor/common] Introduce filesize metafield for thumbnails 2016-07-09 03:24:36 +07:00
Déstin Reed
3fee7f636c [flipagram] Add extractor 2016-07-09 03:23:32 +07:00
Remita Amine
89e2fff2b7 [mgtv] pass geo verification headers for api request 2016-07-08 20:18:25 +01:00
Sergey M․
cedc70b292 [facebook] Fix invalid video being extracted (Closes #9851) 2016-07-09 00:28:07 +07:00
Remita Amine
07d7689f2e [le] extract http formats 2016-07-08 15:35:20 +01:00
Yen Chi Hsuan
ae8cb5328d Merge branch 'JakubAdamWieczorek-polskie-radio' 2016-07-08 19:35:21 +08:00
Yen Chi Hsuan
2e32ac0b9a [polskieradio] Fix regex in _TESTS 2016-07-08 19:34:53 +08:00
Yen Chi Hsuan
672f01c370 Merge branch 'polskie-radio' of https://github.com/JakubAdamWieczorek/youtube-dl into JakubAdamWieczorek-polskie-radio 2016-07-08 19:33:28 +08:00
Jakub Adam Wieczorek
e2d616dd30 [polskieradio] Add thumbnails. 2016-07-08 13:23:00 +02:00
Yen Chi Hsuan
0ab7f4fe2b [nick] support nickjr.com (closes #7542) 2016-07-08 15:11:28 +08:00
Sergey M․
29c4a07776 [lynda] Fix test 2016-07-08 03:33:53 +07:00
Philipp Hagemeister
826e911e41 Merge branch 'master' of github.com:rg3/youtube-dl 2016-07-07 19:42:22 +02:00
Philipp Hagemeister
30d22dae8e [options] Do not decode Unicode on Python 2.x
The configuration file contents are being returned as unicode now, so decoding them is no longer necessary.
(Run python2 with -3 to see the warning before this commit)
2016-07-07 19:41:00 +02:00
Yen Chi Hsuan
ec3518725b [compat] Fix test_cmdline_umlauts on Python 2.6
The original statement raises uncaught UnicodeWarning on Python 2.6
2016-07-07 22:30:58 +08:00
Remita Amine
5f87d845eb [tweakers] fix info extraction(closes #9516) 2016-07-07 12:51:42 +01:00
Philipp Hagemeister
571808a7aa document comments in configuration file (fixes #10024) 2016-07-07 12:12:21 +02:00
Yen Chi Hsuan
dfe5fa49ae [compat] Fix compat_shlex_split for non-ASCII input
Closes #9871
2016-07-07 17:37:29 +08:00
Remita Amine
01a0c511eb [radiocanada] extract more formats 2016-07-07 03:46:12 +01:00
remitamine
b3d30315ce Merge pull request #9597 from remitamine/toutv
[toutv] fix info extraction(closes #1792)(closes #2082)
2016-07-07 01:51:01 +01:00
remitamine
882af14d7d [toutv] fix info extraction(closes #1792)(closes #2082) 2016-07-07 01:47:28 +01:00
Remita Amine
47335a0efa [telecinco] fix info extraction 2016-07-06 23:09:13 +01:00
Sergey M․
34bc2d9dfd release 2016.07.07 2016-07-07 01:54:29 +07:00
Sergey M․
08c7af4afa [kamcord] Add extractor (Closes #10001) 2016-07-07 01:50:39 +07:00
Yen Chi Hsuan
f7291a0b7c [daum.net] Fix extraction for specific examples
Closes #9972
2016-07-07 01:26:14 +08:00
Yen Chi Hsuan
c65aa4e9e1 [brightcove:legacy] Support 'playlistTabs' and skip a dead test
Closes #9965
2016-07-07 01:13:37 +08:00
Yen Chi Hsuan
ad213a1d74 [francetv] Recognize more Dailymotion embedded videos
Closes #9955
2016-07-06 23:37:54 +08:00
Yen Chi Hsuan
43f1e4e41e [onet] Add MD5 checksum 2016-07-06 20:32:03 +08:00
Yen Chi Hsuan
54b0e909d5 [amp] Fix a typo 2016-07-06 20:10:47 +08:00
Yen Chi Hsuan
f8752b86ac [Onet,ClipRs] Add new extractor for onet.tv and use it for clip.rs
Closes #9950
2016-07-06 20:09:05 +08:00
Yen Chi Hsuan
84c237fb8a [utils] Add get_element_by_class
For #9950
2016-07-06 20:02:52 +08:00
Remita Amine
ab49d7a9fa use mimetype2ext to determine manifest ext in multiple extractors 2016-07-06 09:11:46 +01:00
Remita Amine
b4173f1551 [utils] add mimetypes to determine manifest ext(m3u8, f4m, mpd) 2016-07-06 09:06:28 +01:00
Remita Amine
2817b99cf2 [metacafe] fix info extraction(closes #8539)(closes #3253) 2016-07-06 02:19:55 +01:00
Remita Amine
001fffd004 [spiegel:article] update test(closes #10018) 2016-07-06 00:16:41 +01:00
Sergey M․
0e94b4713d release 2016.07.06 2016-07-06 00:54:23 +07:00
Sergey M․
a6d3b89feb [prosiebensat1] Make downloading urls JSON non fatal 2016-07-06 00:52:48 +07:00
Remita Amine
6c26815d63 [onionstudios] fix info extraction 2016-07-05 18:05:07 +01:00
Sergey M․
73c4ac2c95 [youtube:channel] Improve channel id extraction and detect unavailable channels (Closes #10009) 2016-07-05 23:30:44 +07:00
Remita Amine
84f214d840 [prosiebensat1] extract all formats 2016-07-05 17:11:45 +01:00
Remita Amine
e3f88be7a9 [rtvnh] extract all formats 2016-07-05 14:45:39 +01:00
Remita Amine
31af3e35e0 [sandia] remove unused imports 2016-07-05 13:39:24 +01:00
Remita Amine
94a5cff91d [sendia] fix info extraction 2016-07-05 13:37:46 +01:00
Remita Amine
77082c7b9e [slideshare] fix description extraction 2016-07-05 12:01:04 +01:00
Remita Amine
252a1f75d2 [spiegel] improve info extraction 2016-07-05 11:46:25 +01:00
Remita Amine
5abf513cf8 [stitcher] fix episode config extraction 2016-07-05 10:44:16 +01:00
Yen Chi Hsuan
c6054e3201 [xuite] Support videos with already encoded media id 2016-07-05 14:26:42 +08:00
Yen Chi Hsuan
4080530624 [youtube:shared] Recognize the new 'shared' URLs
Closes #10007
2016-07-05 13:15:05 +08:00
Sergey M․
c25f1a9b63 release 2016.07.05 2016-07-05 06:32:46 +07:00
Remita Amine
dfaa86b75e [test_utils] add test for smuggling a smuggled url 2016-07-04 21:36:32 +01:00
Remita Amine
d9163ae3b6 [kaltura] fix extraction error for videos from multiple kaltura servers 2016-07-04 21:34:27 +01:00
Remita Amine
dafafe7cf1 [la7] extract more info from a kaltura custom server 2016-07-04 17:59:58 +01:00
Remita Amine
81953d1ae5 [kaltura] add support videos stored on custom kaltura servers(closes #5557) 2016-07-04 17:59:58 +01:00
Yen Chi Hsuan
3a212ed62e [iqiyi] Skip an unstable MD5 checksum 2016-07-04 11:25:46 +08:00
Sergey M․
195f084542 [pornhub] Detect private videos (Closes #9987) 2016-07-04 03:27:00 +07:00
Sergey M․
aa7a455b2e [README.md] Clarify configuration file may not exist by default 2016-07-04 01:24:33 +07:00
Sergey M․
6a4e659c93 [yahoo] Recognize brightcove embed (Closes #9995) 2016-07-03 23:00:36 +07:00
Yen Chi Hsuan
40f3666f6b [test/test_http] Update tests for 38cce791c7 2016-07-03 23:50:55 +08:00
Remita Amine
dd801bbe18 [brightcove] improve error detection 2016-07-03 16:37:22 +01:00
Yen Chi Hsuan
38cce791c7 Rename --cn-verfication-proxy to --geo-verification-proxy
And deprecate the former one

Since commit f138873900, this option is
not limited to China websites, so rename it.
2016-07-03 23:29:56 +08:00
Sergey M․
bf3ae6a543 [devscripts/show-downloads-statictics] Add script for displaying downloads statistics 2016-07-03 22:20:14 +07:00
Sergey M․
bff98341d5 release 2016.07.03.1 2016-07-03 21:28:55 +07:00
Yen Chi Hsuan
2644e911be [iqiyi] Fix extraction
See https://github.com/soimort/you-get/issues/1211#issuecomment-229011559
2016-07-03 22:19:56 +08:00
Remita Amine
a5f67895d3 [nationalgeographic] restore http formats
there was a misunderstanding about the reason of 403 response
the problem happen only when the user use aria2c as a downloader
a1f6f5c768 (commitcomment-18107559)
2016-07-03 14:10:25 +01:00
Yen Chi Hsuan
15e4b6b758 [rai] Support an alternative form of embedded relinker URL
Closes #8551
2016-07-03 19:52:11 +08:00
Yen Chi Hsuan
2b28b892d8 [rai] Support videos with embedded content item ID (#8551) 2016-07-03 19:52:11 +08:00
Sergey M․
7507fc98cb [README.md] Fix somes typo in coding conventions section 2016-07-03 18:35:28 +07:00
Yen Chi Hsuan
477b7a8474 [downloader/f4m] Fix for Rai live streams 2016-07-03 19:26:39 +08:00
Yen Chi Hsuan
034a884957 [rai] Support direct relinker URLs (closes #8552) 2016-07-03 19:26:39 +08:00
Remita Amine
64436cb1a4 [nationalgeographic] skip download for national geographic channel tests(closes #9991) 2016-07-03 10:43:36 +01:00
Yen Chi Hsuan
f138873900 [rai] Fix extraction and update _TESTS
Closes #8617
Closes #9157
Closes #9232
2016-07-03 15:49:35 +08:00
Yen Chi Hsuan
e793338c88 [buzzfeed] Detect Facebook embed and update _TESTS
Closes #5701
2016-07-03 14:12:02 +08:00
Yen Chi Hsuan
369bb06206 [facebook] Improve embed detection (#5701) 2016-07-03 14:11:29 +08:00
Sergey M․
2cb31d288e [history:topic] Relax _VALID_URL 2016-07-03 13:01:04 +07:00
Sergey M․
c723d1cd8d [README.md] Update some codebase links 2016-07-03 11:35:13 +07:00
Sergey M․
1f55234057 Add PULL_REQUEST_TEMPLATE.md 2016-07-03 11:31:49 +07:00
Sergey M․
04006fae8d [README.md] Start writing youtube-dl coding conventions 2016-07-03 11:31:07 +07:00
Jaime Marquínez Ferrándiz
4cb13d0d6a [hrti] Don't redefine variable in list comprehension 2016-07-02 23:02:14 +02:00
Remita Amine
a1f6f5c768 [nationalgeographic] add support Adobe Pass auth 2016-07-02 21:24:22 +01:00
Remita Amine
05c7feec77 [aenetworks] add support Adobe Pass auth 2016-07-02 21:24:22 +01:00
Remita Amine
bf83024826 [theplatform] add basic support for Adobe Pass 2016-07-02 21:24:22 +01:00
Sergey M․
a0cfd82dda release 2016.07.03 2016-07-03 03:19:22 +07:00
Sergey M․
1b734adb2d [xtube] Fix extraction (Closes #9953, closes #9961) 2016-07-03 03:17:35 +07:00
Sergey M․
9b724d7277 [extractors] Add hrti:playlist import 2016-07-03 02:25:39 +07:00
Sergey M․
c3a5dd3b5d Credit @atopuzov for hrti (#9482) 2016-07-03 02:22:59 +07:00
Sergey M․
e3755a624b [hrti] Improve and add support for playlists (Closes #9482) 2016-07-03 02:22:14 +07:00
Sergey M․
95cf60e826 [utils] Add PUTRequest 2016-07-03 02:21:32 +07:00
Aleksandar Topuzovic
6b03e1e25d [HRTi] Implement extractor for Croatian Radiotelevision 2016-07-03 02:20:41 +07:00
Yen Chi Hsuan
712b0b5b70 [la7.it] Fix the extractor 2016-07-02 23:49:03 +08:00
Yen Chi Hsuan
6a424391d9 [facebook] Make embed detection stricter to prevent false-positives 2016-07-02 23:15:55 +08:00
Yen Chi Hsuan
dbf0157a26 [generic] Add MD5 checksums 2016-07-02 21:58:07 +08:00
Yen Chi Hsuan
7deef1ba67 [generic] Support Wordpress "YouTube Video Importer" plugin
Closes #9938
2016-07-02 21:58:07 +08:00
Yen Chi Hsuan
fd6ca38262 [facebook] Improve Facebook embedded detection
Related to #9938.

Another example comes from 9834872bf6.
2016-07-02 21:58:07 +08:00
Sergey M․
bdafd88da0 [vk] Extend _VALID_URLs to support new domain (Closes #9981) 2016-07-02 16:43:19 +07:00
Sergey M․
7a1e71575e release 2016.07.02 2016-07-02 02:47:42 +07:00
Sergey M․
ac2d8f54d1 [vine] Remove superfluous whitespace 2016-07-02 02:45:00 +07:00
Sergey M․
14ff6baa0e [fusion] Improve 2016-07-02 02:44:37 +07:00
TRox1972
bb08101ec4 [Fusion] Add new extractor 2016-07-02 02:37:28 +07:00
Sergey M․
bc4b2d75ba [pornhub] Add support for thumbzilla (Closes #8696) 2016-07-02 02:11:07 +07:00
Sergey M․
35fc3021ba [periscope] Add another fallback source 2016-07-02 01:35:57 +07:00
cant-think-of-a-name
347227237b [periscope] fix playlist extraction (#9967)
The JSON response changed and the extractor needed to be updated in order to gather the video IDs.
2016-07-02 01:29:11 +07:00
Sergey M․
564dc3c6e8 [vine] Fix extraction (Closes #9970) 2016-07-02 01:24:57 +07:00
Sergey M․
9f4576a7eb [twitch] Update usher URL (Closes #9975) 2016-07-01 23:16:43 +07:00
Sergey M․
f11315e8d4 release 2016.07.01 2016-07-01 03:59:57 +07:00
Sergey M․
0c2ac64bb8 [sixplay] Rename preference key to quality in format dict 2016-07-01 03:57:59 +07:00
Jaime Marquínez Ferrándiz
a9eede3913 [test/compat] compat_shlex_split: test with newlines 2016-07-01 03:30:35 +07:00
Jaime Marquínez Ferrándiz
9e29ef13a3 [options] Accept quoted string across multiple lines (#9940)
Like:

    -f "
    bestvideo+bestaudio/
    best
    "
2016-07-01 03:30:31 +07:00
Sergey M․
eaaaaec042 [pornhub] Add more tests with removed videos 2016-07-01 03:18:27 +07:00
Sergey M․
3cb3b60064 [pornhub] Relax removed message regex (Closes #9964) 2016-07-01 03:14:23 +07:00
kidol
044e3d91b5 [Pornhub] Fix error detection 2016-07-01 02:59:50 +07:00
Remita Amine
c9e538a3b1 [ctvnews] use orderedSet, increase the number of items for playlists and use smaller bin list for test 2016-06-30 19:52:32 +01:00
Remita Amine
76dad392f5 [meta] Clarify the source of uppod st decryption algorithm 2016-06-30 18:27:57 +01:00
Remita Amine
9617b557aa [ctv] Add new extractor(closes #4077) 2016-06-30 18:22:35 +01:00
Remita Amine
bf4fa24414 [ctvnews] Add new extractor(closes #2156) 2016-06-30 18:22:35 +01:00
Remita Amine
20361b4f25 [rds] extract 9c9media formats 2016-06-30 18:22:35 +01:00
Remita Amine
05a0068a76 [9c9media] Add new extractor 2016-06-30 18:22:35 +01:00
Sergey M․
66a42309fa release 2016.06.30 2016-06-30 23:56:55 +07:00
Sergey M․
fd94e2671a [meta] Add support for pladform embeds 2016-06-30 23:20:44 +07:00
Sergey M․
8ff6697861 [pladform] Improve embed detection 2016-06-30 23:19:29 +07:00
Sergey M․
eafa643715 [meta] Make duration and description optional
For iframe URLs
2016-06-30 23:06:13 +07:00
Sergey M․
049da7cb6c [meta] Extend _VALID_URL 2016-06-30 23:04:18 +07:00
Remita Amine
7dbeee7e22 [generic] make twitter:player extraction non fatal 2016-06-30 14:11:55 +01:00
Remita Amine
93ad6c6bfa [sixplay] Add new extractor(closes #2183) 2016-06-30 13:50:49 +01:00
Remita Amine
329179073b [generic] add generic support for twitter:player embeds 2016-06-30 12:01:30 +01:00
Remita Amine
4d86d2008e [urplay] fix typo and check with flake8 2016-06-30 11:30:42 +01:00
Remita Amine
ab47b6e881 [theatlantic] Add new extractor(closes #6611) 2016-06-30 04:08:56 +01:00
Remita Amine
df43389ade [skysports] Add new extractor(closes #7066) 2016-06-30 02:54:21 +01:00
Remita Amine
397b305cfe [meta] Add new extractor(closes #8789) 2016-06-30 00:21:03 +01:00
Remita Amine
e496fa50cd [urplay] Add new extractor(closes #9332) 2016-06-29 20:19:31 +01:00
Sergey M․
06a96da15b [eagleplatform] Improve embed detection and extract in separate routine (Closes #9926) 2016-06-29 23:01:34 +07:00
Remita Amine
70157c2c43 [aenetworks] add support for movie pages 2016-06-29 16:55:17 +01:00
Remita Amine
c58ed8563d [aenetworks] extract history topic playlist title 2016-06-29 16:18:16 +01:00
Remita Amine
4c7821227c [aenetworks:historytopic] fix topic video url 2016-06-29 16:03:32 +01:00
Remita Amine
42362fdb5e [aenetworks] add support for show and season for A&E Network sites and History topics(closes #9816) 2016-06-29 15:49:17 +01:00
Sergey M․
97124e572d [arte:playlist] Fix test 2016-06-28 22:39:53 +07:00
Remita Amine
32616c14cc [vrt] extract all formats 2016-06-28 14:02:03 +01:00
Sergey M․
8174d0fe95 release 2016.06.27 2016-06-27 23:09:39 +07:00
Sergey M․
8704778d95 [pbs] Check manually constructed http links (Closes #9921) 2016-06-27 23:06:42 +07:00
Sergey M․
c287f2bc60 [extractor/generic] Use _extract_url for kaltura embeds (Closes #9922) 2016-06-27 22:45:26 +07:00
Sergey M․
9ea5c04c0d [kaltura] Add _extract_url with fixed regex 2016-06-27 22:44:17 +07:00
Sergey M․
fd7a7498a4 [test_all_urls] PEP 8 and change wording 2016-06-27 22:11:45 +07:00
Matthieu Muffato
e3a6747d8f New test-case: extractor names are supposed to be unique
@dstftw explained in
https://github.com/rg3/youtube-dl/pull/9918#issuecomment-228625878 that
extractor names are supposed to be unique. @dstftw has fixed the two
offending extractors, and here I add a test to ensure this does not
happen in the future.
2016-06-27 22:09:29 +07:00
Sergey M․
f41ffc00d1 [skynewsarabia:article] Clarify IE_NAME 2016-06-27 05:08:09 +07:00
Sergey M․
81fda15369 [sr:mediathek] Clarify IE_NAME 2016-06-27 05:07:12 +07:00
Sergey M․
427cd050a3 [extractor/generic] Improve kaltura embed detection (Closes #9911) 2016-06-27 04:11:53 +07:00
Sergey M․
b0c200f1ec [msn] Add test URL with non-alphanumeric characters 2016-06-26 22:03:36 +07:00
Sergey M․
92747e664a release 2016.06.26 2016-06-26 21:15:24 +07:00
Sergey M․
f1f336322d [msn] Fix extraction (Closes #8960, closes #9542) 2016-06-26 21:10:05 +07:00
Sergey M․
bf8dd79045 [extractor/common] Fix sorting with custom field preference 2016-06-26 21:09:07 +07:00
TRox1972
c6781156aa [MSN] add new extractor 2016-06-26 21:07:59 +07:00
remitamine
59bbe4911a [extractor/common] add helper method to extract html5 media entries 2016-06-26 14:04:08 +01:00
remitamine
4f3c5e0627 [utils] add helper function for parsing codecs 2016-06-26 14:03:58 +01:00
Sergey M․
f484c5fa25 [vidbit] Improve (Closes #9759) 2016-06-26 16:59:28 +07:00
Sergey M․
88d9f6c0c4 [utils] Add support for name list in _html_search_meta 2016-06-26 16:57:14 +07:00
TRox1972
3c9c088f9c [Vidbit] Add new extractor 2016-06-26 16:52:52 +07:00
Yen Chi Hsuan
fc3996bfe1 [iqiyi] Remove codes for debugging 2016-06-26 15:45:41 +08:00
Yen Chi Hsuan
5b6ad8630c [iqiyi] Partially fix IqiyiIE
Use the HTML5 API. Only low-resolution formats available

Related: #9839

Thanks @zhangn1985 for the overall algorithm (soimort/you-get#1224)
2016-06-26 15:18:32 +08:00
Yen Chi Hsuan
30105f4ac0 [le] Move urshift() to utils.py 2016-06-26 15:17:26 +08:00
Yen Chi Hsuan
1143535d76 [utils] Add urshift()
Used in IqiyiIE and LeIE
2016-06-26 15:16:49 +08:00
Yen Chi Hsuan
7d52c052ef [generic] Fix test_Generic_76
Broken: https://travis-ci.org/rg3/youtube-dl/jobs/140251658
2016-06-26 11:56:27 +08:00
stepshal
a2406fce3c Fix misspelling 2016-06-26 01:28:55 +07:00
Sergey M․
3b34ab538c [svtplay] Extend _VALID_URL (#9900) 2016-06-26 00:29:53 +07:00
Sergey M․
ac782306f1 [iqiyi] Mark broken 2016-06-26 00:25:41 +07:00
Sergey M․
0c00e889f3 Credit @JakubAdamWieczorek for #9813 2016-06-25 23:35:57 +07:00
Sergey M․
ce96ed05f4 [polskieradio] Add test with video 2016-06-25 23:31:21 +07:00
Sergey M․
0463b77a1f [polskieradio] Improve extraction (Closes #9813) 2016-06-25 23:19:18 +07:00
Jakub Adam Wieczorek
2d185706ea [polskieradio] Add support for Polskie Radio.
Polskie Radio is the main Polish state-funded radio broadcasting service.
2016-06-25 23:19:18 +07:00
Sergey M․
b72b44318c [utils] Add strip_or_none 2016-06-25 23:19:18 +07:00
Sergey M․
46f59e89ea [utils] Add unified_timestamp 2016-06-25 23:19:18 +07:00
Sergey M․
b4241e308e release 2016.06.25 2016-06-25 03:03:20 +07:00
Sergey M․
3d4b08dfc7 [setup.py] Add file version information and quotes consistency (Closes #9878) 2016-06-25 02:50:12 +07:00
Sergey M․
be49068d65 [youtube] Fix and skip some tests 2016-06-24 22:47:19 +07:00
Sergey M․
525cedb971 [youtube] Relax URL expansion in description 2016-06-24 22:37:13 +07:00
Sergey M․
de3c7fe0d4 [youtube] Fix 141 format tests 2016-06-24 22:27:55 +07:00
Yen Chi Hsuan
896cc72750 [mixcloud] View count and like count may be absent
Closes #9874
2016-06-24 17:26:12 +08:00
Yen Chi Hsuan
c1ff6e1ad0 [vimeo:review] Fix extraction for password-protected videos
Closes #9853
2016-06-24 16:48:37 +08:00
Remita Amine
fee70322d7 [appletrailers] correct thumbnail fallback 2016-06-23 19:03:34 +01:00
Remita Amine
8065d6c55f [dcn] extend _VALID_URL for awaan.ae and extract all available formats 2016-06-23 17:22:15 +01:00
Remita Amine
494172d2e5 [appletrailers] extract info from an alternative source if available(closes #8422)(closes #8422) 2016-06-23 15:49:42 +01:00
Remita Amine
6e3c2047f8 [tvp] extract all formats and detect erros 2016-06-23 04:36:16 +01:00
Sergey M․
011bd3221b release 2016.06.23.1 2016-06-23 09:42:56 +07:00
Sergey M․
b46eabecd3 [jsinterp] Relax JS function regex (Closes #9863) 2016-06-23 09:41:34 +07:00
Remita Amine
0437307a41 [nbc:nbcnews] improve extraction and add msnbc to the extractor 2016-06-23 01:36:19 +01:00
Remita Amine
22b7ac13ef [tf1] fix wat id extraction(closes #9862) 2016-06-23 00:14:34 +01:00
Sergey M․
96f88e91b7 release 2016.06.23 2016-06-23 04:29:34 +07:00
Sergey M․
3331a4644d [vk] Remove unused import 2016-06-23 04:27:10 +07:00
Sergey M․
adf1921dc1 [xnxx] Improve _VALID_URL (Closes #9858) 2016-06-23 04:26:49 +07:00
Sergey M․
97674f0419 [xnxx] Replace test 2016-06-23 04:24:00 +07:00
rr-
73843ae8ac [xnxx] fix url regex
The pattern has changed from "video123412" to "video-o8xa19".
The changes maintain backwards compatibility with old-style URLs.
2016-06-23 04:19:55 +07:00
Sergey M․
f2bb8c036a [vk] Modernize 2016-06-23 04:18:43 +07:00
Sergey M․
75ca6bcee2 [vk] Workaround buggy new.vk.com Set-Cookie headers 2016-06-23 04:17:13 +07:00
Sergey M․
089657ed1f [vimeo:album] Add paged example URL 2016-06-23 02:00:03 +07:00
Sergey M․
b5eab86c24 [vimeo:album] Impove _VALID_URL 2016-06-23 01:56:58 +07:00
Sergey M․
c8e3e0974b [vimeo:channel] Improve playlist extraction 2016-06-23 01:28:36 +07:00
Purdea Andrei
dfc8f46e1c [vimeo:channel] Add video id to url_result
This will allow us to decide much faster that we don't want an already archived video,
and will allow having to download webpages for each video that has already been downloaded,
thus significantly speeding up the archival of channels that have no new content.
2016-06-23 01:26:27 +07:00
Sergey M․
c143ddce5d [vimeo] Override original URL only when necessary 2016-06-23 00:51:36 +07:00
Jaime Marquínez Ferrándiz
169d836feb lazy-extractors: Fix after commit 6e6b9f600f
The problem was in the following code:

    class ArteTVPlus7IE(ArteTVBaseIE):

        ...

        @classmethod
        def suitable(cls, url):
            return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url)

And its sublcasses like ArteTVCinemaIE.

Since in the lazy_extractors.py file ArteTVCinemaIE was not a subclass of ArteTVPlus7IE, super(ArteTVPlus7IE, cls) failed.

To fix it we have to make it a subclass. Since the order of _ALL_CLASSES is arbitrary we must sort them so that the base classes are defined first. We also must add base classes like YoutubeBaseInfoExtractor.
2016-06-22 19:20:50 +02:00
TRox1972
6ae938b295 [Vine] Extract view count 2016-06-22 23:57:35 +07:00
Sergey M․
cf40fdf5c1 release 2016.06.22 2016-06-22 23:43:24 +07:00
Sergey M․
23bdae0955 [svt] Various improvements
+ [svt:play] Add fallback path looking for video id and fix extraction for oppetarkiv
* [svt:base] Detect geo restriction
* [svt:base] Extract series related metadata
2016-06-22 23:36:07 +07:00
Shai Coleman
ca74c90bf5 Fix issue downloading facebook videos
youtube-dl expects the format items to be returned as a list,
but when there's only one item Facebook returns a dict instead,
this wraps the dict in a list if necessary
2016-06-22 12:52:15 +01:00
Sergey M․
7cfc1e2a10 [gametrailers] Remove extractor
gametrailers closed (see http://www.polygon.com/2016/2/8/10944452/gametrailers-shuts-down-after-13-year-run)
2016-06-21 22:31:41 +07:00
Remita Amine
1ac5705f62 [gamespot] extract all formats 2016-06-21 13:37:57 +01:00
Yen Chi Hsuan
e4f90ea0a7 [svt] Fix extraction for SVTPlay (closes #9809) 2016-06-21 17:55:53 +08:00
Sergey M․
cdfc187cd5 [cbs] Remove unused import 2016-06-20 22:40:33 +07:00
Sergey M․
feef925f49 [streamcloud] Capture error message (#9840) 2016-06-20 22:40:22 +07:00
Sergey M․
19e2d1cdea release 2016.06.20 2016-06-20 20:50:01 +07:00
Sergey M․
8369a4fe76 [downloader/hls] Simplify and carry long lines 2016-06-20 21:55:17 +07:00
Philipp Hagemeister
1f749b6658 Revert "[jsinterp] Avoid double key lookup for setting new key"
This reverts commit 7c05097633.
2016-06-20 13:29:13 +02:00
Remita Amine
819707920a [cbs] fix _VALID_URL 2016-06-19 23:55:19 +01:00
Remita Amine
43518503a6 [cbs,cbsnews,cbssports] reduce requests while extracting all formats 2016-06-19 23:40:00 +01:00
Remita Amine
5839d556e4 [theplatform] reduce requests for theplatform feed info extraction 2016-06-19 23:37:05 +01:00
Yen Chi Hsuan
6c83e583b3 [radiojavan] PEP8
E275 is added in pycodestyle 2.6

See https://github.com/PyCQA/pycodestyle/pull/491
2016-06-19 13:32:08 +08:00
Yen Chi Hsuan
6aeb64b673 Merge pull request #8201 from remitamine/hls-aes
[downloader/hls] Add support for AES-128 encrypted segments in hlsnative downloader
2016-06-19 13:25:08 +08:00
Remita Amine
6cd64b6806 [foxsports] extract http formats 2016-06-19 05:45:48 +01:00
remitamine
e154c65128 [downloader/hls] Add support for AES-128 encrypted segments in hlsnative downloader 2016-06-19 01:01:40 +01:00
Sergey M․
a50fd6e026 release 2016.06.19.1 2016-06-19 03:57:14 +07:00
Sergey M․
6a55bb66ee [vimeo] Fix rented videos (Closes #9830) 2016-06-19 03:56:01 +07:00
Lucas Moura
7c05097633 [jsinterp] Avoid double key lookup for setting new key
In order to add a new key to both __objects and __functions dicts on jsinterp.py, it is
necessary to first verify if a key was present and if not, create the key and
assign it to a value.

However, this can be done with a single step using dict setdefault method.
2016-06-19 03:29:45 +07:00
Sergey M․
589568789f release 2016.06.19 2016-06-19 02:30:29 +07:00
Sergey M․
7577d849a6 [r7] Fix extraction and add support for articles (Closes #9826) 2016-06-19 02:25:34 +07:00
Sergey M․
cb23192bc4 [closertotruth] Update and improve (Closes #8680) 2016-06-19 00:35:29 +07:00
Steven Gosseling
41c1023300 [closertotruth] Add extractor
Removed print statement from code.

Replaced two regex searches with the corret ones.

Removed some unnecessary semicolumns

fixed title extraction

refactored everything to search_regex

processed comments on commit 5650b0d, fixed feedback from flake8

Improved regexes and returns info dict now.

Added support for closertotruth interview URL

Added support for episodes page
2016-06-18 23:19:56 +07:00
Sergey M․
90b6288cce [arte:+7] Simplify _VALID_URL 2016-06-18 22:23:48 +07:00
Sergey M․
c1823c8ad9 [README.md] Remove 'small' from description (#9814) 2016-06-18 22:08:48 +07:00
Sergey M․
d7c6c656c5 [arte:+7] Expand _VALID_URL (Closes #9820) 2016-06-18 21:42:17 +07:00
Yen Chi Hsuan
b0b128049a [extractors] Update references to sportschau (#9799) 2016-06-18 13:43:47 +08:00
Yen Chi Hsuan
e8f13f2637 [sportschau.de] Fix extraction and moved to its own file (closes #9799) 2016-06-18 13:42:58 +08:00
Yen Chi Hsuan
b5aad37f6b [ard] Remove SportschauIE, which is now based on WDR (#9799) 2016-06-18 13:42:39 +08:00
Yen Chi Hsuan
6d0d4fc26d [wdr] Add WDRBaseIE, for Sportschau (#9799) 2016-06-18 13:40:55 +08:00
Yen Chi Hsuan
0278aa443f [br] Skip invalid tests 2016-06-18 12:53:48 +08:00
Yen Chi Hsuan
1f35745758 [azubu] Don't fail on optional fields 2016-06-18 12:39:08 +08:00
Yen Chi Hsuan
573c35272f [bbc] Skip a geo-restricted test case 2016-06-18 12:35:55 +08:00
Yen Chi Hsuan
09e3f91e40 [arte] Update _TESTS and fix for pages with multiple YouTube videos
Some tests are from #6895 and #6613
2016-06-18 12:34:58 +08:00
Yen Chi Hsuan
1b6cf16be7 [aftonbladet] Fix extraction 2016-06-18 12:27:39 +08:00
Yen Chi Hsuan
26264cb056 [adobetv] Use embedded data in the webpage
Sometimes the HTML webpage is returned even with '?format=json'
2016-06-18 12:21:40 +08:00
Yen Chi Hsuan
a72df5f36f [mtvservices] Fix ext for RTMP streams 2016-06-18 12:19:06 +08:00
Yen Chi Hsuan
c878e635de [bet] Moved to MTVServices 2016-06-18 12:17:24 +08:00
Sergey M․
0f47cc2e92 release 2016.06.18.1 2016-06-18 06:20:34 +07:00
Sergey M․
5fc2757682 release 2016.06.18 2016-06-18 06:00:05 +07:00
Sergey M․
e3944c2621 [pornhd] Add working test 2016-06-18 05:50:17 +07:00
Sergey M․
667d96480b [pornhd] Detect removed videos and modernize 2016-06-18 05:42:20 +07:00
Sergey M․
e6fe993c31 [pornhd] Improve formats extraction 2016-06-18 05:37:53 +07:00
Sergey M․
d0d93f76ea [pornhd] Fix metadata extraction 2016-06-18 05:30:46 +07:00
Sergey M․
20a6a154fe [mtv] Use compat_xpath and fix FutureWarning 2016-06-18 04:46:26 +07:00
Sergey M․
f011876076 [nickde] Add extractor (Closes #9778) 2016-06-18 04:40:48 +07:00
Sergey M․
6929569403 [mitele] Extract series metadata and make title more robust (Closes #9758) 2016-06-18 04:06:19 +07:00
Sergey M․
eb451890da [carambatv] Add extractor (Closes #9815) 2016-06-18 03:04:14 +07:00
Sergey M․
ded7511a70 [bbccouk] Add support for playlists (Closes #9812) 2016-06-17 23:42:52 +07:00
Sergey M․
d2161cade5 release 2016.06.16 2016-06-16 22:40:55 +07:00
Sergey M․
27e5fa8198 [cda] Fix extraction (Closes #9803) 2016-06-16 22:33:12 +07:00
Yen Chi Hsuan
efbd1eb51a [wimp] Fix extraction and update _TESTS 2016-06-16 12:27:21 +08:00
Yen Chi Hsuan
369ff75081 [jwplatform] Improved JWPlayer support 2016-06-16 12:26:45 +08:00
Yen Chi Hsuan
47212f7bcb [utils] Don't transform numbers not starting with a zero
Fix test_Viidea and maybe others
2016-06-16 11:00:54 +08:00
Sergey M․
4c93ee8d14 [imdb] Improve _VALID_URL (Closes #9788) 2016-06-15 22:34:55 +07:00
Yen Chi Hsuan
8bc4dbb1af [wrzuta.pl] Detect error and update _TESTS 2016-06-14 11:14:59 +08:00
Sergey M․
6c3760292c [pornhub] Improve title extraction (Closes #9777) 2016-06-14 04:57:59 +07:00
Sergey M․
4cef70db6c [devscripts/release.sh] Add flag for gpg-sign commits 2016-06-14 03:16:56 +07:00
Sergey M․
ff4af6ec59 [lynda] Remove superfluous _NETRC_MACHINE 2016-06-14 02:49:33 +07:00
Sergey M․
d01fb21d4c release 2016.06.14 2016-06-14 02:19:42 +07:00
Sergey M․
a4ea28eee6 Credit @venth for wrzuta:playlist (#9341) 2016-06-14 02:15:47 +07:00
Sergey M․
bc2a871f3e Credit @dracony for rockstargames (#9737) 2016-06-14 02:15:09 +07:00
Sergey M․
1759672eed [wrzuta:playlist] Improve and simplify (Closes #9341) 2016-06-14 02:13:54 +07:00
venth
fea55ef4a9 [wrzuta.pl:playlist] Added playlist extraction from wrzuta.pl 2016-06-14 02:10:48 +07:00
Sergey M․
16b6bd01d2 [rockstargames] Improve and add Youtube fallback (Closes #9737) 2016-06-14 01:11:24 +07:00
Dracony
14d0f4e0f3 Added extractor for rockstargames.com 2016-06-14 01:09:35 +07:00
Sergey M․
778f969447 [twitch:clips] Add extractor (Closes #9767) 2016-06-14 00:06:31 +07:00
Sergey M․
79cd8b3d8a [README.md] Suggest checking extractor code under all Python versions 2016-06-13 10:04:04 +07:00
Sergey M․
b4663f12b1 [README.md] Update links to info dict metafields 2016-06-13 07:16:35 +07:00
Sergey M․
b50e02c1e4 [README.md] Update links to options available for YoutubeDL 2016-06-13 07:05:32 +07:00
Sergey M․
33b72ce64e [xfileshare] Improve removed videos detection 2016-06-13 01:19:54 +07:00
Sergey M․
cf2bf840ba [xfileshare] Fix test 2016-06-13 01:11:14 +07:00
Sergey M․
bccdac6874 [xfileshare:xvidstage] Add support for videos with packed codes (Closes #4335) 2016-06-13 01:11:04 +07:00
Sergey M․
e69f9f5d68 [downloader/external] Decode error string before writing to stderr 2016-06-12 16:45:07 +07:00
Sergey M․
77a9a9c295 release 2016.06.12 2016-06-12 12:06:48 +07:00
Sergey M․
84dcd1c4e4 [streamcloud] Detect removed videos (Closes #3768) 2016-06-12 11:08:39 +07:00
Sergey M․
971e3b7520 [nrk:skole] Fix extraction 2016-06-12 07:20:37 +07:00
Sergey M․
4e79011729 [nrktv] Fix tests 2016-06-12 06:57:04 +07:00
Sergey M․
a936ac321c [README.md] Document using output template in batch files (Closes #9717) 2016-06-12 06:39:31 +07:00
Sergey M․
98960c911c [instagram] Extract metadata from JSON 2016-06-12 06:06:04 +07:00
Sergey M․
329ca3bef6 [utils] Add try_get
To reduce boilerplate when accessing JSON
2016-06-12 06:05:34 +07:00
Sergey M․
2c3322e36e [youporn] Fix metadata extraction 2016-06-12 04:49:37 +07:00
Sergey M․
80ae228b34 [matchtv] Modernize 2016-06-12 01:57:23 +07:00
Yen Chi Hsuan
6d28c408cf [viki] Do not use a fallback language for title in the first try
In test_Viki_3, 'titles' gives a Hebrew title.
2016-06-11 23:00:44 +08:00
Yen Chi Hsuan
c83b35d4aa [viki] Update _TESTS 2016-06-11 22:39:13 +08:00
Yen Chi Hsuan
94e5d6aedb [viki] Skip a geo-restricted test 2016-06-11 21:49:01 +08:00
Yen Chi Hsuan
531a74968c [vimeo] Fix extraction for VimeoReview videos 2016-06-11 21:35:08 +08:00
Yen Chi Hsuan
c5edd147d1 [generic] Remove an invalid test
Now handled by telewebion.py
2016-06-11 18:39:58 +08:00
Yen Chi Hsuan
856150d056 [telewebion] Add new extractor (closes #5135) 2016-06-11 18:39:58 +08:00
Yen Chi Hsuan
03ebea89b0 Merge pull request #9755 from vxbinaca/patch-2
[utils] Change Firefox 44 to 47
2016-06-11 17:38:45 +08:00
Paul Henning
15d106787e [utils] Change Firefox 44 to 47
See commit title.
2016-06-11 05:36:31 -04:00
Yen Chi Hsuan
7aab3696dd [kuwo] Update _TESTS 2016-06-11 15:37:04 +08:00
Yen Chi Hsuan
47787efa2b [leeco] Recognize Le Sports URLs (fixes #9750) 2016-06-11 13:14:41 +08:00
Sergey M․
4a420119a6 release 2016.06.11.3 2016-06-11 08:34:30 +07:00
Sergey M․
33751818d3 release 2016.06.11.2 2016-06-11 08:28:51 +07:00
Sergey M․
698f127c1a [setup.py] Add python 3.5 classifier 2016-06-11 06:14:22 +07:00
Sergey M․
fe458b6596 [limelight] Extract ttml subtitles (Closes #9739) 2016-06-11 05:57:27 +07:00
Sergey M․
21ac1a8ac3 [limelight] Fix typo 2016-06-11 05:52:50 +07:00
Sergey M․
79027c0ea0 [limelight] Improve _VALID_URLs 2016-06-11 05:40:02 +07:00
Sergey M․
4cad2929cd [limelight] Fix _VALID_URLs 2016-06-11 05:30:44 +07:00
Sergey M․
62666af99f [indavideo] Fix formats' height (Closes #9744) 2016-06-11 05:13:05 +07:00
Sergey M․
9ddc289f88 [README.md] Document missing playlist fields in output template 2016-06-11 04:59:47 +07:00
Sergey M․
6626c214e1 release 2016.06.11.1 2016-06-11 03:00:08 +07:00
Sergey M․
d845622b2e release 2016.06.11 2016-06-11 02:41:48 +07:00
Sergey M
1058f56e96 Merge pull request #9747 from TRox1972/lynda
[Lynda] Extract course description
2016-06-11 02:34:58 +07:00
TRox1972
0434358823 [Lynda] Extract course description 2016-06-10 19:17:58 +02:00
Sergey M․
3841256c2c [lynda] Skip login if already logged in 2016-06-10 23:01:52 +07:00
Sergey M․
bdf16f8140 [lynda] Add support for new authentication (Closes #9740) 2016-06-10 22:40:18 +07:00
Yen Chi Hsuan
836ab0c554 [compat] Import html5 entities correctly 2016-06-10 18:12:57 +08:00
Yen Chi Hsuan
6c0376fe4f [dw] Skip an invalid test
DW documentaries only last for one or two weeks. See #9475
2016-06-10 16:53:40 +08:00
Yen Chi Hsuan
1fa309da40 [generic] Update test_Generic_40
The original link now redirects to an YouTube user channel.
2016-06-10 16:39:31 +08:00
Yen Chi Hsuan
daa0df9e8b [youtube:user] Support another URL form
Such an URL comes from http://www.gametrailers.com/. This is originally
a test case in GenericIE, but now seems all GameTrailers videos are on
YouTube.
2016-06-10 16:37:12 +08:00
Yen Chi Hsuan
09728d5fbc [audiomack:album] Force video_id to be strings
Related: be6217b261
2016-06-10 16:11:28 +08:00
Yen Chi Hsuan
c16f8a4659 [voicerepublic] Force video_id to be strings
Related: be6217b261
2016-06-10 16:04:28 +08:00
Yen Chi Hsuan
a225238530 [vporn] Improve error detection and update _TESTS 2016-06-10 15:12:53 +08:00
Yen Chi Hsuan
55b2f099c0 [utils] Decode HTML5 entities
Used in test_Vporn_1. Also related to #9270
2016-06-10 15:11:55 +08:00
Yen Chi Hsuan
9631a94fb5 [compat] Add compat_html_entities_html5
Used in tset_Vporn_1. Also Related to #9270
2016-06-10 15:05:24 +08:00
Yen Chi Hsuan
cc4444662c [generic] Remove Vulture embed detection
Vulture.com videos now hosts on YouTube, Vimeo, MTV, NBC News or Hulu.
Here's an example of Hulu:
http://www.vulture.com/2016/06/kimmel-interviews-mariah-carey-in-a-bathtub.html
2016-06-10 13:40:57 +08:00
Yen Chi Hsuan
de3eb07ed6 [generic] Detect NBC News embeds 2016-06-10 13:32:59 +08:00
Yen Chi Hsuan
5de008e8c3 [nbcnews] Support embed widgets
Used in some Vulture videos
2016-06-10 13:31:55 +08:00
Yen Chi Hsuan
3e74b444e7 [vulture] Remove the extractor
The first 10 URLs in google search "site:http://video.vulture.com/video"
is dead. I guess Vulture does not host videos on their own anymore.
2016-06-10 13:13:59 +08:00
Yen Chi Hsuan
e1e0a10c56 [weibo] Remove the extractor
The Weibo weishipin (微視頻, tiny videos) service is dead and now all
videos are hosted on Sina videos, which is covered by sina.py
2016-06-10 13:01:22 +08:00
Yen Chi Hsuan
436214baf7 [xfileshare] Skip an invalid test 2016-06-10 12:31:06 +08:00
Yen Chi Hsuan
506d0e9693 [xuite] Skip the invalid test 2016-06-10 12:29:58 +08:00
Yen Chi Hsuan
55290788d3 [yahoo] Yahoo doesn't like region names in lower cases
Fix test_Yahoo_7
2016-06-10 12:28:56 +08:00
Yen Chi Hsuan
bc7e7adf51 [wdr] Subtitles are TTML 2016-06-10 00:22:41 +08:00
Sergey M․
b0aebe702c [godtv] Relax _VALID_URL 2016-06-09 21:34:47 +07:00
Sergey M․
416878f41f [godtv] Add more tests 2016-06-09 21:33:51 +07:00
Sergey M․
c0fed3bda5 [godtv] Improve and add support for playlists (Closes #9608) 2016-06-09 21:29:41 +07:00
TRox1972
bb1e44cc8e [godtv] Add extractor
[GodTV] Improvements
2016-06-09 21:27:27 +07:00
N1k145
21efee5f8b [openload] Relax _VALID_URL
[openload] added to _TESTS, removed escape
2016-06-09 20:46:54 +07:00
Yen Chi Hsuan
e2713d32f4 [openload] Fix extraction. Thanks @perron375 for the solution
Closes #9706
2016-06-09 19:00:13 +08:00
Yen Chi Hsuan
e21c26daf9 Merge pull request #9395 from pmrowla/afreecatv
[afreecatv] Add new extractor for afreecatv.com VODs
2016-06-09 17:20:16 +08:00
Yen Chi Hsuan
1594a4932f [wdr] Misc changes 2016-06-09 13:49:35 +08:00
Yen Chi Hsuan
6869d634c6 [wdr] Simplify extraction 2016-06-09 13:41:12 +08:00
Yen Chi Hsuan
50918c4ee0 [wdr] Support radio players (closes #6147) 2016-06-09 13:04:30 +08:00
Yen Chi Hsuan
6c33d24b46 [utils] Add audio/mpeg to mimetype2ext()
Used in WDR live radios (#6147)
2016-06-09 12:58:24 +08:00
Sergey M․
be6217b261 [YoutubeDL] Force string conversion on non string video ids 2016-06-09 05:34:19 +07:00
Sergey M․
9d51a0a9a1 [vessel] Make hls formats non fatal 2016-06-09 04:13:38 +07:00
Sergey M․
39da509f67 [vessel] Extract DASH formats 2016-06-09 04:12:48 +07:00
Sergey M․
a479b8f687 [vessel] Use native hls by default 2016-06-09 04:09:32 +07:00
Sergey M․
48a5eabc48 [extractor/generic] Add support vessel embeds (Closes #7083) 2016-06-09 04:02:27 +07:00
Sergey M․
11380753b5 [vessel] Add support for embed urls and improve extraction 2016-06-09 04:00:47 +07:00
Yen Chi Hsuan
411c590a1f [youku:show] Add new extractor 2016-06-08 23:45:46 +08:00
Yen Chi Hsuan
6da8d7de69 [twitter] Update _TESTS 2016-06-08 21:48:12 +08:00
Yen Chi Hsuan
c6308b3153 [twitter] Fix extraction for videos with HLS streams
Closes #9623
2016-06-08 21:28:10 +08:00
Yen Chi Hsuan
fc0a45fa41 [twitter] Detect suspended accounts and update _TESTS 2016-06-08 21:12:14 +08:00
Yen Chi Hsuan
e6e90515db [nbc] Add the test case from #9578
Closes #9578
2016-06-08 20:50:01 +08:00
Yen Chi Hsuan
22a0a95247 [theplatform] Some NBC videos require an additional cookie
Related: #9578
2016-06-08 20:47:39 +08:00
Yen Chi Hsuan
50ce1c331c [downloader/external] Add another env for proxies in ffmpeg/avconv
Related sources:
https://git.libav.org/?p=libav.git;a=blob;f=libavformat/http.c;h=8fe8d11e1edfdbb04a8726db2c49cfef3f572aac;hb=HEAD#l152
https://git.libav.org/?p=libav.git;a=blob;f=libavformat/tls.c;h=fab243e93e20034e88e619188c13a44a5d8ccdb9;hb=HEAD#l63
https://github.com/FFmpeg/FFmpeg/blob/f8e89d8/libavformat/http.c#L191
https://github.com/FFmpeg/FFmpeg/blob/f8e89d8/libavformat/tls.c#L92
2016-06-08 14:43:52 +08:00
Yen Chi Hsuan
7264e38591 [bilibili] Fix for videos without upload time (closes #9710) 2016-06-08 14:31:40 +08:00
Sergey M․
33d9f3707c [thesixtyone] Relax _VALID_URL (Closes #9714) 2016-06-08 02:22:04 +07:00
Sergey M․
a26a9d6239 [livestream:event] Ensure video id is string (Closes #9721) 2016-06-07 23:53:08 +07:00
Yen Chi Hsuan
a4a8201c02 [wdr] Update _TESTS 2016-06-08 00:25:51 +08:00
Yen Chi Hsuan
a6571f1073 [common] Fix <bootstrapInfo> detection in F4M manifests
Regression since 0a5685b26f
2016-06-08 00:19:33 +08:00
Sergey M․
57b6e9652e [canal+] Add support for d17.tv 2016-06-07 22:32:08 +07:00
Sergey M․
3d9b3605a3 [canal+] Update tests 2016-06-07 22:26:18 +07:00
Sergey M․
74193838f7 [canal+] Improve extraction (Closes #9718) 2016-06-07 22:12:20 +07:00
Sergey M
fb94e260b5 Merge pull request #9720 from Kagami/vlive-new-statuses
[vlive] Acknowledge vlive+ streams statuses
2016-06-07 21:22:53 +07:00
Kagami Hiiragi
345dec937f [vlive] Acknowledge vlive+ streams statuses
Same as common statuses just with "PRODUCT_" prefix:
PRODUCE_LIVE_END, PRODUCT_COMING_SOON, etc.
2016-06-07 17:12:13 +03:00
Philipp Hagemeister
4315f74fa8 Merge remote-tracking branch 'Boris-de/wdrmaus_fix#8562' 2016-06-07 12:29:18 +02:00
Jaime Marquínez Ferrándiz
e67f688025 [compat] Add 'compat_input' to __all__ 2016-06-05 23:16:08 +02:00
Sergey M․
db59b37d0b [devscripts/create-github-release] Make full published releases by default 2016-06-06 03:02:11 +07:00
Sergey M․
244fe977fe [options] Add --load-info-json alias for symmetry with --write-info-json 2016-06-06 02:52:58 +07:00
Sergey M․
7b0d1c2859 [__init__] Use write_string instead of compat_string (Closes #9689) 2016-06-05 21:01:20 +07:00
Yen Chi Hsuan
21d0a8e48b Merge pull request #9702 from Eun/patch-1
curl: follow redirect
2016-06-05 17:43:26 +08:00
Tobias Salzmann
47f12ad3e3 curl: follow redirect 2016-06-05 11:04:55 +02:00
Sergey M
8f1aaa97a1 [README.md] Update pypi instructions 2016-06-05 11:19:44 +07:00
Sergey M
9d78524cbe Merge pull request #9697 from ryandesign/ryandesign-README-MacPorts
Update README.md to mention MacPorts
2016-06-05 11:09:20 +07:00
Ryan Schmidt
bc270284b5 Update README.md to mention MacPorts 2016-06-04 21:30:22 -05:00
Philipp Hagemeister
c93b4eaceb git pushMerge branch 'master' of github.com:rg3/youtube-dl 2016-06-04 22:55:21 +02:00
Philipp Hagemeister
71b9cb3107 extend FAQ (#9696) 2016-06-04 22:55:15 +02:00
Sergey M․
633b444fd2 [downloader/hls] Correct comment on twitch vods 2016-06-05 03:31:10 +07:00
Sergey M․
51c4d85ce7 [downloader/hls] PEP 8 2016-06-05 03:21:43 +07:00
Sergey M․
631d4c87ee [twitch:vod] Use native hls 2016-06-05 03:19:44 +07:00
Sergey M․
1e236d7e23 [downloader/hls] Do not rely on EXT-X-PLAYLIST-TYPE:EVENT 2016-06-05 03:16:05 +07:00
Sergey M․
2c34735267 [youtube] Add itags 256 and 258 2016-06-05 01:44:13 +07:00
Sergey M․
39b32571df [devscripts/release.sh] Release to GitHub 2016-06-05 00:48:33 +07:00
Sergey M․
db56f281d9 [devscripts/create-github-release] Add script for releasing on GitHub
Yet only Basic authentication is supported either via .netrc or by manual input
2016-06-05 00:47:26 +07:00
Sergey M․
e92b552a10 [devscripts/buildserver] Use compat_input from compat 2016-06-05 00:44:51 +07:00
Sergey M․
1ae6c83bce [compat] Add compat_input 2016-06-05 00:43:55 +07:00
Sergey M․
0fc832e1b2 [vidio] Improve (Closes #9562) 2016-06-04 16:48:24 +07:00
TRox1972
7def35712a [vidio] Add extractor (Closes #7195)
[Vidio] fix fallback value and wrap duration in int_or_none

[Vidio] don't use video_id for _html_search_regex()
2016-06-04 16:48:24 +07:00
Philipp Hagemeister
cad88f96dc disable uploading to yt-dl.org for now 2016-06-04 11:42:52 +02:00
Sergey M․
762d44c956 [channel9] Add support for rss links (Closes #9673) 2016-06-04 04:57:16 +07:00
Sergey M․
4d8856d511 [loc] Extract direct download links 2016-06-04 00:26:03 +07:00
Sergey M․
c917106be4 [loc] Extract subtites 2016-06-03 23:55:22 +07:00
Sergey M․
76e9cd7f24 [loc] Add support for another URL schema and simplify 2016-06-03 23:43:34 +07:00
Sergey M․
bf4c6a38e1 release 2016.06.03 2016-06-03 23:25:24 +07:00
Sergey M․
7f3c3dfa52 [loc] Improve (Closes #9521) 2016-06-03 23:19:11 +07:00
TRox1972
9c3c447eb3 [loc] Add extractor (Closes #3188)
Added extractor of loc.gov, which closes #3188. I am not an experienced programmer, so I am sure I did a bunch of mistakes, but the extractor works (for me at least).

[LibraryOfCongress] don't use video_id for _search_regex()

[LibraryOfCongress] Improvements
2016-06-03 22:17:35 +07:00
Yen Chi Hsuan
ad73083ff0 [bilibili] Add _part%d suffixes back (closes #9660) 2016-06-02 19:29:27 +08:00
Yen Chi Hsuan
1e8b59243f Merge pull request #9669 from bzc6p/master
Added sanitization support for Hungarian letters Ő and Ű
2016-06-02 18:23:54 +08:00
bzc6p
c88270271e Added sanitization support for Hungarian letters Ő and Ű 2016-06-02 11:51:48 +02:00
bzc6p
b96f007eeb Added sanitization support for Hungarian letters Ő and Ű 2016-06-02 11:39:32 +02:00
Yen Chi Hsuan
9a4aec8b7e [utils] Use bytes-like objects as header values on Python 2 2016-06-02 15:00:49 +08:00
Yen Chi Hsuan
54fb199681 [test/test_http] Fix getsockname() on Jython 2016-06-02 15:00:49 +08:00
Yen Chi Hsuan
8c32e5dc32 [test/test_utils] Add test for #9588 2016-06-02 15:00:49 +08:00
Yen Chi Hsuan
0ea590076f [utils] Always decode Location header
escape_url is broken for bytes-like objects
2016-06-02 15:00:49 +08:00
Remita Amine
4a684895c0 [seeker] Add new extractor(closes #9619) 2016-06-01 21:20:25 +01:00
Remita Amine
f4e4aa9b6b [revision3:embed] Add new extractor 2016-06-01 21:20:25 +01:00
Sergey M․
5e3856a2c5 release 2016.06.02 2016-06-02 01:19:57 +07:00
Sergey M․
6e6b9f600f [arte] Add support for playlists and rework tests (Closes #9632) 2016-06-02 01:10:23 +07:00
Sergey M․
6a1df4fb5f [spankwire] Add support for new URL format (Closes #9657) 2016-06-01 21:23:58 +07:00
Yen Chi Hsuan
dde1ce7c06 [tf1] Fix a regular expression (closes #9656)
This is a Python bug fixed in 2.7.6 [1]

[1] https://github.com/rg3/youtube-dl/issues/9656#issuecomment-222968594
2016-06-01 20:04:43 +08:00
Yen Chi Hsuan
811586ebcf [generic] Update the UDNEmbed test case 2016-06-01 19:23:44 +08:00
Yen Chi Hsuan
0ff3749bfe [udn] Fix m3u8 and f4m extraction as well as improve 2016-06-01 19:23:09 +08:00
Yen Chi Hsuan
28bab13348 [generic,viewlift] Move a test case to the specialized extractor 2016-06-01 19:18:01 +08:00
Yen Chi Hsuan
877032314f [generic] Improve Kaltura detection
Closes #4004
2016-06-01 18:37:34 +08:00
Peter Rowlands
e7d85c4ef7 use /track/video/file to determine if video exists 2016-05-31 17:28:49 +09:00
Sergey M․
8ec2b2c41c [options] Add --limit-rate alias for rate limiting option
Closes #9644
In order to follow regular --verb-noun pattern and better conformity with wget and curl
2016-05-30 21:48:35 +07:00
Sergey M․
197a5da1d0 [yandexmusic] Improve captcha detection 2016-05-30 03:26:26 +07:00
Sergey M․
abbb2938fa release 2016.05.30.2 2016-05-30 03:12:12 +07:00
Sergey M․
f657b1a5f2 release 2016.05.30.1 2016-05-30 03:03:06 +07:00
Philipp Hagemeister
86a52881c6 [travis] unsubscribe @phihag 2016-05-29 21:29:38 +02:00
Sergey M․
8267423652 release 2016.05.30 2016-05-30 01:18:23 +07:00
Sergey M
917a3196f8 [README.md] Update c runtime dependency FAQ entry 2016-05-30 01:03:40 +07:00
Sergey M․
56bd028a0f [devscripts/buildserver] Listen on all interfaces 2016-05-30 00:21:18 +07:00
Sergey M․
681b923b5c [devscripts/release.sh] Allow passing buildserver address as cli option 2016-05-29 23:36:42 +07:00
Yen Chi Hsuan
9ed6d8c6c5 [youku] Extract resolution 2016-05-29 13:54:05 +08:00
Sergey M․
f3fb420b82 [devscripts/release.sh] Check for wheel 2016-05-29 11:49:14 +06:00
Sergey M․
165e3561e9 [devscripts/buildserver] Check Wow6432Node first when searching for python
This allows building releases from 64bit OS
2016-05-29 10:02:00 +06:00
Sergey M․
27f17c0eab [Makefile] Fix youtube-dl.1 target
Now it accepts output filename as argument
2016-05-29 09:11:16 +06:00
Sergey M․
44c8892369 [devscripts/prepare_manpage] Fix manpage generation on Windows 2016-05-29 09:06:10 +06:00
Sergey M․
f574103d7c [buildserver] Fix buildserver and make python2 compatible 2016-05-29 09:03:17 +06:00
Yen Chi Hsuan
6d138e98e3 Merge pull request #9621 from venth/feature/ignored_intellij
ignored intellij related files
2016-05-29 03:10:29 +08:00
venth
2a329110b9 ignored intellij related files 2016-05-28 20:27:18 +02:00
Yen Chi Hsuan
2bee7b25f3 [Makefile] Cleanup m4a files
[ci skip]
2016-05-29 01:59:09 +08:00
Yen Chi Hsuan
92cf872a48 [.gitignore] Ignore mp3 files
[ci skip]
2016-05-29 01:59:01 +08:00
Yen Chi Hsuan
6461f2b7ec [bilibili] Fix extraction, improve and cleanup 2016-05-29 01:26:00 +08:00
Sergey M․
807cf7b07f [udemy] Fix authentication for localized layout (Closes #9594) 2016-05-28 21:18:24 +06:00
Sergey M․
de7d76af52 [coub] Add another test 2016-05-27 23:38:17 +06:00
Sergey M․
11c70deba7 [coub] Add extractor (Closes #9609) 2016-05-27 23:34:58 +06:00
Sergey M․
f36532404d [vk] Remove superfluous code 2016-05-27 22:19:10 +06:00
Sergey M․
77b8b4e696 [extractor/common] Borrow quality metadata from parent set-level manifest for f4m 2016-05-27 01:47:44 +06:00
Sergey M․
2615fa7584 [downloader/f4m] Simply select format when it's the only one 2016-05-27 01:46:12 +06:00
Boris Wachtmeister
3a686853e1 [WDR] fixed parsing of playlists 2016-05-26 20:54:51 +02:00
Boris Wachtmeister
949fc42e00 [WDR] the other wdrmaus.de pages also changed to the new player 2016-05-26 20:54:51 +02:00
Boris Wachtmeister
33a1ff7113 [WDR] extract jsonp-url by parsing data-extension of mediaLink 2016-05-26 20:54:51 +02:00
Boris Wachtmeister
bec2c14f2c [WDR] add special handling if alt-url is a m3u8 2016-05-26 20:54:51 +02:00
Boris Wachtmeister
37f972954d [WDR] use _download_json with a strip_jsonp 2016-05-26 20:54:51 +02:00
Boris Wachtmeister
3874e6ea66 [WDR] use single quotes for strings 2016-05-26 20:54:51 +02:00
Yen Chi Hsuan
fac2af3c51 [common] Fix m3u8 extraction in f4m manifests 2016-05-27 01:41:27 +08:00
Sergey M․
6f8cb24219 [tvp] Expand _VALID_URL and improve naming (Closes #9602) 2016-05-26 22:21:55 +06:00
Yen Chi Hsuan
448bb5f333 [common] Fix non-bootstrapped support in f4m 2016-05-27 00:03:48 +08:00
Yen Chi Hsuan
293c255688 [utils] Remove debugging codes 2016-05-26 22:54:16 +08:00
Yen Chi Hsuan
ac88d2316e [dw] Support documentaries (closes #9475) 2016-05-26 22:48:47 +08:00
Yen Chi Hsuan
5950cb1d6d [utils] Support a new form of date
Found in dw.com (#9475)
2016-05-26 22:44:00 +08:00
Yen Chi Hsuan
761052db92 [playwire] Add the test (closed #9531) 2016-05-26 21:57:06 +08:00
Yen Chi Hsuan
240b60453e [common] Support m3u8 in f4m manifests
Related: #9531
2016-05-26 21:55:43 +08:00
Yen Chi Hsuan
85b0fe7d64 [playwire] Use _extract_f4m_formats
Related: #9531
2016-05-26 21:43:35 +08:00
Yen Chi Hsuan
0a5685b26f [common] Support non-bootstraped streams in f4m manifests
Related: #9531
2016-05-26 21:41:47 +08:00
Sergey M․
6f748df43f [eporner] Make test only_matching 2016-05-25 20:51:17 +06:00
Yen Chi Hsuan
b410cb83d4 Merge pull request #9595 from Kagami/vlive-site-update
[vlive] Address site update
2016-05-25 19:24:15 +08:00
Yen Chi Hsuan
da9d82840a Merge pull request #9600 from wankerer/master
[eporner] fix for the new URL layout
2016-05-25 18:52:55 +08:00
wankerer
4ee0b8afdb [eporner] fix for the new URL layout
Recently eporner slightly changed the URL layout, the ID that used to be
digits only are now digits and letters, so youtube-dl falls back to
the generic extractor that doesn't work.

Fix the matching regex to allow letters in ID.

[v2: added a test case]
2016-05-24 15:57:36 -07:00
remitamine
1de32771e1 [eyedotv] Add new extractor(closes #9582) 2016-05-24 20:10:12 +01:00
remitamine
688c634b7d skip some tests to reduce test time 2016-05-24 16:44:11 +01:00
Sergey M․
0d6ee97508 Credit @TRox1972 for tosh.cc (#9566) and localnews8 (#9539) 2016-05-24 21:42:47 +06:00
Sergey M․
6b43132ce9 [xhamster] Update tests 2016-05-24 21:38:27 +06:00
mexican porn commits
a4690b3244 [xhamster] url regex fix for videos with empty title. 2016-05-24 21:35:43 +06:00
remitamine
444417edb5 [radiocanada] Add new extractor(#4020) 2016-05-24 15:58:27 +01:00
remitamine
277c7465f5 [ooyala] check manifest ext with determine_ext and update tests for related extractors 2016-05-24 11:24:29 +01:00
Kagami Hiiragi
25bcd3550e [vlive] Address site update
Changes:
* Fix video params extraction
* Don't make status request since status info now available on the page
* Remove unneeded code
* Fix test
2016-05-24 12:54:28 +03:00
remitamine
a4760d204f [ooyala] use api v2 to reduce requests for format extraction 2016-05-24 00:22:29 +01:00
remitamine
e8593f346a [ooyala] extract subtitles 2016-05-23 23:58:16 +01:00
remitamine
05b651e3a5 [washingtonpost] reduce requests for m3u8 manifests 2016-05-23 13:04:50 +01:00
remitamine
42a7439717 [cbs] allow to pass content id to the extractor(closes #9589) 2016-05-23 09:31:37 +01:00
remitamine
b1e9ebd080 [washingtonpost] remove unnecessary code 2016-05-23 02:30:12 +01:00
remitamine
0c50eeb987 [reuters] Add new extractor 2016-05-23 02:27:31 +01:00
remitamine
4b464a6a78 [washingtonpost] improve format extraction and add support for video pages extraction 2016-05-23 00:48:11 +01:00
Sergey M․
5db9df622f [life:embed] Use native hls 2016-05-23 04:22:09 +06:00
Sergey M․
5181759c0d [life] Update _VALID_URL 2016-05-23 04:00:08 +06:00
Sergey M․
e54373204a [lifenews] Fix metadata extraction 2016-05-23 03:44:04 +06:00
remitamine
102810ef04 [voxmedia] fix volume embed extraction 2016-05-22 20:37:35 +01:00
Yen Chi Hsuan
78d3b3e213 [generic] Improve Livestream detection (closes #2234) 2016-05-23 01:40:11 +08:00
Yen Chi Hsuan
7a46542f97 [livestream] Video IDs should always be strings (#2234) 2016-05-23 01:40:11 +08:00
Yen Chi Hsuan
eb7941e3e6 [compat] Fix for XML with <!DOCTYPE> in Python 2.7 and 3.2
Such XML documents cause DeprecationWarning if python is run
with `-W error`
2016-05-23 01:40:11 +08:00
remitamine
db3b8b2103 [tf1] add support for more related web sites 2016-05-22 17:03:17 +01:00
remitamine
c5f5155100 [wat] extract all formats 2016-05-22 17:03:17 +01:00
Yen Chi Hsuan
4a12077855 [genric] Eliminate duplicated video URLs (closes #6562) 2016-05-22 22:23:20 +08:00
Sergey M
a4a7c44bd3 [README.md] Document solution for extremely slow start on Windows 2016-05-22 15:04:51 +06:00
Thor77
70346165fe [bandcamp] raise ExtractorError when track not streamable (#9465)
* [bandcamp] raise ExtractorError when track not streamable

* [bandcamp] update md5 for second test

* don't rely on json-data, but just check for 'file'

* don't rely on presence of 'file'
2016-05-22 14:15:39 +08:00
Sergey M
c776b99691 [README.md] Remove Windows updating trickery
Windows updating fixed in e9297256d4.
2016-05-22 10:14:02 +06:00
Sergey M․
e9297256d4 [update] Fix youtube-dl.exe updating from arbitrary directory (Closes #2718) 2016-05-22 10:06:45 +06:00
Sergey M
e5871c672b [README.md] Clarify location for youtube-dl.exe even more
%USERPROFILE% not in %PATH% by default.
2016-05-22 09:36:07 +06:00
Sergey M
9b06b0fb92 [README.md] Clarify updating on Windows 2016-05-22 09:26:06 +06:00
Sergey M
4f3a25c2b4 [README.md] Fix typo 2016-05-22 09:00:08 +06:00
Sergey M
21a19aa94d [README.md] Clarify location for youtube-dl.exe 2016-05-22 08:59:28 +06:00
Sergey M․
c6b9cf05e1 [utils] Do not fail on unknown date formats in unified_strdate 2016-05-22 08:28:41 +06:00
Sergey M․
4d8819d249 [extractor/generic] Add support for theplatform embeds (Closes #8636, closes #9476) 2016-05-22 06:52:39 +06:00
Sergey M․
898f4b49cc [theplatform] Add _extract_urls 2016-05-22 06:47:22 +06:00
Sergey M․
0150a00f33 [cc] Add test for tosh.cc (Closes #9566) 2016-05-22 02:58:41 +06:00
TRox1972
c8831015f4 [ComedyCentral] Add support for tosh.cc.com and cc.com/video-clips 2016-05-22 02:55:10 +06:00
Sergey M․
92d221ad48 [periscope] Update uploader_id (Closes #9565) 2016-05-22 02:39:15 +06:00
Sergey M․
0db9a05f88 [periscope:user] Adapt to layout changes (Closes #9563) 2016-05-22 02:15:56 +06:00
Philipp Hagemeister
e03b35b8f9 release 2016.05.21.2 2016-05-21 21:47:39 +02:00
Philipp Hagemeister
d2fee3c99e release.sh: also check for python3 rsa module 2016-05-21 21:47:22 +02:00
Philipp Hagemeister
598869afb1 release 2016.05.21.1 2016-05-21 21:27:00 +02:00
Philipp Hagemeister
7e642e4fd6 release: check for pandoc
Abort releaseing if pandoc is missing.
(pandoc was not included in my essential app database, and thus missing on my new machine.)
2016-05-21 21:26:57 +02:00
Philipp Hagemeister
c8cc3745fb release 2016.05.21 2016-05-21 21:18:59 +02:00
Jaime Marquínez Ferrándiz
4c718d3c50 [rtve] Recognize 'filmoteca' URLs 2016-05-21 17:37:35 +02:00
Yen Chi Hsuan
115c65793a [jwplatform] Don't fail with RTMP URLs without mp4:, mp3: or flv: 2016-05-21 13:50:38 +08:00
Yen Chi Hsuan
661d46b28f [cbslocal] Add new extractor (closes #9522) 2016-05-21 13:40:45 +08:00
Yen Chi Hsuan
5ce3d5bd1b [sendtonews] Add new extractor
Used in CBSLocal. Part of #9522
2016-05-21 13:39:42 +08:00
Yen Chi Hsuan
612b5f403e [jwplatform] Improved m3u8 and rtmp support
Changes made for SendtoNewsIE. Part of #9522
2016-05-21 13:38:01 +08:00
Yen Chi Hsuan
9f54e692d2 [anvato] Add new extractor
Used in CBSLocal (#9522)
2016-05-21 13:18:29 +08:00
Yen Chi Hsuan
7b2fcbfd4e [common] Skip TYPE=CLOSED-CAPTIONS lines in m3u8 manifests
According to [1], valid values for TYPE are AUDIO, VIDEO, SUBTITLES
and CLOSED-CAPTIONS. Such a value is found in Anvato master playlists,
though I don't use _extract_m3u8_formats() in the end.

Part of #9522.

[1] https://tools.ietf.org/html/draft-pantos-http-live-streaming-19#section-4.3.4.1
2016-05-21 13:16:28 +08:00
Yen Chi Hsuan
16da9bbc29 [common] Add _m3u8_meta_format() template
For extractors who handle m3u8 manifests by themselves. (eg., AnvatoIE)

Part of #9522
2016-05-21 13:15:28 +08:00
Sergey M․
c8602b2f9b [nrk] Unquote subtitles' URLs 2016-05-21 05:09:16 +06:00
Sergey M․
b219f5e51b [brightcove:new] Improve error reporting 2016-05-21 00:59:06 +06:00
Sergey M․
1846e9ade0 [localnews8] Fix extractor (Closes #9539) 2016-05-20 22:31:08 +06:00
TRox1972
6756602be6 [LocalNews8] add extractor (Closes #9200) 2016-05-20 22:10:13 +06:00
Sergey M․
6c114b1210 [extractor/generic] Remove generic id and title from wistia extractionand update tests 2016-05-20 21:55:35 +06:00
Sergey M․
7ded6545ed [extractor/generic] Add test for wistia standard embed 2016-05-20 21:43:36 +06:00
Sergey M․
aa5957ac49 [extractor/generic] Add support for async wistia embeds (Closes #9549) 2016-05-20 21:33:31 +06:00
remitamine
64413f7563 [cbc] fix extraction for flv only videos(fixes #5309) 2016-05-20 16:21:23 +01:00
Sergey M․
45f160a43c [wistia] Improve hls support 2016-05-20 21:16:08 +06:00
Sergey M․
36ca2c55db [wistia] Skip storyboard and improve extraction 2016-05-20 21:04:01 +06:00
Sergey M․
f0c96af9cb [wistia] Add alias and modernize 2016-05-20 20:55:10 +06:00
Yen Chi Hsuan
31a70191e7 [cbc] Add the test case from #5156 2016-05-20 19:04:50 +08:00
Yen Chi Hsuan
ad96b4c8f5 [common] Extract audio formats in SMIL
Found in http://www.cbc.ca/player/play/2657631896

Closes #5156
2016-05-20 19:02:53 +08:00
Yen Chi Hsuan
043dc9d36f [cbc] Fix for old-styled URLs
The URL http://www.cbc.ca/player/News/ID/2672225049/ (#6342) redirects
to http://www.cbc.ca/player/play/2672224672, while youtube-dl wasn't
able to handle it correctly.
2016-05-20 18:39:54 +08:00
remitamine
52f7c75cff [cbc] extract http formats and update tests 2016-05-20 06:58:46 +01:00
Sergey M․
f6e588afc0 [24video] Fix description extraction 2016-05-20 08:53:04 +06:00
remitamine
a001296703 [learnr] Add new extractor(closes #4284) 2016-05-19 18:18:03 +01:00
Yen Chi Hsuan
2cbd8c6781 Merge pull request #9537 from TRox1972/p1
[Makefile] delete thumbnails
2016-05-19 16:58:44 +08:00
TRox1972
8585dc4cdc [Makefile] delete thumbnails 2016-05-19 01:21:38 +02:00
Sergey M․
dd81769c62 [ndtv] Fix extraction 2016-05-19 04:34:19 +06:00
Sergey M․
46bc9b7d7c [utils] Allow None in remove_{start,end} 2016-05-19 04:31:30 +06:00
remitamine
b78531a36a [formula1] Add new extractor(closes #3617) 2016-05-18 22:24:46 +01:00
Sergey M․
11e6a0b641 [nfb] Modernize and extract subtitles 2016-05-18 00:25:15 +06:00
Sergey M․
15cda1ef77 [nfb] Fix uploader extraction 2016-05-17 23:46:47 +06:00
Yen Chi Hsuan
055f0d3d06 [abcnews] Added a new extractor (closes #3992)
Related: #6108, #8664, #9459
2016-05-17 15:38:57 +08:00
Yen Chi Hsuan
cdd94c2eae [utils] Check for None values in SOCKS proxy
Originally reported at
https://github.com/rg3/youtube-dl/pull/9287#issuecomment-219617864
2016-05-17 14:38:15 +08:00
Philipp Hagemeister
36755d9d69 release 2016.05.16 2016-05-16 17:25:47 +02:00
Sergey M․
f7199423e5 [groupon] Add support for Youtube embeds (Closes #9508) 2016-05-16 00:30:13 +06:00
Sergey M․
a0a81918f1 [collegehumor] Remove extractor
It now uses brightcove
2016-05-15 22:07:51 +06:00
Yen Chi Hsuan
5572d598a5 [hearthisat] Update the first test 2016-05-15 15:44:04 +08:00
Yen Chi Hsuan
cec9727c7f [hearthisat] Detect invalid download links (fixes #9440) 2016-05-15 15:35:31 +08:00
Yen Chi Hsuan
79298173c5 [utils] Fix getheader in urlhandle_detect_ext
Fixes #7049, related to #9440
2016-05-15 15:34:50 +08:00
Sergey M․
69c9cc2716 [xvideos] Extract html5 player formats (Closes #9495) 2016-05-15 03:38:04 +06:00
Sergey M․
ed56f26039 [extractor/common] Improve name extraction for m3u8 formats 2016-05-15 03:34:35 +06:00
Sergey M․
6f41b2bcf1 [extractor/generic] Improve 3qsdn embeds support (Closes #9453) 2016-05-14 23:58:25 +06:00
Sergey M․
cda6d47aad [utils] Simplify integer conversion in js_to_json 2016-05-14 23:41:57 +06:00
Sergey M․
5d39176f6d [extractor/generic:3qsdn] Add support for embeds 2016-05-14 23:40:34 +06:00
Sergey M․
5c86bfe70f [3qsdn] Add extractor 2016-05-14 23:35:03 +06:00
Sergey M․
364cf465dd [test_utils] PEP 8 2016-05-14 20:46:33 +06:00
Sergey M․
ca950f49e9 [ora] Revert extraction to regexes
It's less fragile than using js_to_json with ora js
2016-05-14 20:45:18 +06:00
Sergey M․
89ac4a19e6 [utils] Process non-base 10 integers in js_to_json 2016-05-14 20:39:58 +06:00
felix
640eea0a0c [ora] minimise fragile regex shenanigans; recognise unsafespeech.com URLs 2016-05-14 20:13:06 +06:00
felix
bd1e484448 [utils] js_to_json: various improvements
now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
2016-05-14 20:12:39 +06:00
Yen Chi Hsuan
a834622b89 Merge pull request #9492 from jwilk/teamcoco
[teamcoco] Fix base64 regexp
2016-05-14 20:02:40 +08:00
Yen Chi Hsuan
707bb426b1 Merge pull request #9493 from jwilk/errno
Don't hardcode errno constant
2016-05-14 20:00:11 +08:00
Jakub Wilk
66e7ace17a Don't hardcode errno constant
The value of ENOENT is architecture-dependent, so don't assume it's
always 2.
2016-05-14 13:41:41 +02:00
Jakub Wilk
791ff52f75 [teamcoco] Fix base64 regexp 2016-05-14 13:19:54 +02:00
Yen Chi Hsuan
98d560f205 [test/test_socks] Skip SOCKS tests
They occasional trigger errors or blocks
(https://travis-ci.org/rg3/youtube-dl/jobs/130184883)
2016-05-14 18:48:36 +08:00
Yen Chi Hsuan
afcc317800 Merge pull request #9466 from TRox1972/patch-1
Update README.md
2016-05-14 17:03:04 +08:00
Sergey M․
b5abf86148 [cinemassacre] Remove extractor (Closes #9457)
It now uses jwplatform
2016-05-14 04:53:14 +06:00
Sergey M․
134c6ea856 [YoutubeDL] Sanitize url for url and url_transparent extraction results 2016-05-14 04:46:38 +06:00
remitamine
0730be9022 [sina] fix extraction(fixes #1146) 2016-05-13 20:25:01 +01:00
Sergey M․
96c2e3e909 [imdb] Improve extraction 2016-05-13 23:25:05 +06:00
Sergey M․
f196508f7b [imdb] Relax _VALID_URL (Closes #9481) 2016-05-13 22:19:00 +06:00
Yen Chi Hsuan
cc1028aa6d [openload] Fix extraction (closes #9472) 2016-05-13 18:11:08 +08:00
remitamine
ad55e10165 [brightcove] change the protocol for m3u8 formats to m3u8_native 2016-05-13 08:35:38 +01:00
remitamine
18cf6381f6 [nrk] extract m3u8 formats 2016-05-13 08:05:28 +01:00
remitamine
cdf32ff15d [extractors] add import for UstudioEmbedIE 2016-05-13 05:25:32 +01:00
remitamine
99d79b8692 [ustudio] add support ustudio app/embed urls 2016-05-13 05:21:45 +01:00
remitamine
b9e7bc55da [mgtv] extract http formats 2016-05-12 22:46:23 +01:00
Sergey M․
d8d540cf0d [nrk] Rework extractor (Closes #9470) 2016-05-13 02:07:12 +06:00
Sergey M․
0df79d552a [twitch:bookmarks] Remove extractor
Bookmarks no longer available
2016-05-13 00:14:30 +06:00
Sergey M․
0db3a66162 [twitch] Skip dead tests 2016-05-12 23:57:52 +06:00
Yen Chi Hsuan
7581bfc958 [utils] Unquote crendentials passed to SOCKS proxies
Fixes #9450
2016-05-13 00:27:25 +08:00
TRox1972
f388f616c1 Update README.md 2016-05-12 16:48:12 +02:00
Yen Chi Hsuan
a3fa6024d6 [bloomberg] Fix test_Bloomberg
In this test case, sometimes HLS is the best format while sometimes HDS
is. To prevent occasional test failures, force HDS to be the best
format. In the past, testing against HDS formats causes the same error
as #9214, which is fixed as #9377 landed.
2016-05-12 20:08:42 +08:00
Yen Chi Hsuan
1b405bb47d [downloader/f4m] Tolerate truncate segments when testing
Replaces #9216

Fixes #9214 and test_Bloomberg partially
2016-05-12 20:02:36 +08:00
Yen Chi Hsuan
7e8ddca1bb [vevo] Delay the georestriction check to prevent false alerts
Fixes #9408
2016-05-12 19:56:58 +08:00
Yen Chi Hsuan
778a1ccca7 [utils] Add Œ and œ found in French to ACCENT_CHARS
Fixes #9463
2016-05-12 19:48:48 +08:00
Yen Chi Hsuan
4540515cb3 [iqiyi] Fix 1080P extraction (closes #9446) 2016-05-12 18:48:27 +08:00
Sergey M․
e0741fd449 [__init__] Simplify colon presence check 2016-05-11 22:03:30 +06:00
teemuy
e73b9c65e2 Bugfix: Allow colons in custom HTTP header values. 2016-05-11 21:59:24 +06:00
Yen Chi Hsuan
702ccf2dc0 [compat] Rename shlex_quote and remove unused subprocess_check_output 2016-05-10 16:00:21 +08:00
Philipp Hagemeister
28b4f73620 release 2016.05.10 2016-05-10 09:08:08 +02:00
Yen Chi Hsuan
c2876afafe [test/test_socks] Use a different port range
Seems on Travis CI, ports in the original range are often used.
2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
6ddb4888d2 [options] Update --proxy description for SOCKS proxies 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
fa5cb8d021 [socks] Remove a superfluous clause 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
e21f17fc86 [test/test_socks] Test with local SOCKS servers 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
edaa23f822 [compat] Rename struct_(un)pack to compat_struct_(un)pack 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
d5ae6bb501 [utils] Add rationale for register_socks_protocols 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
51fb4995a5 [utils] Register SOCKS protocols in urllib and support SOCKS4A 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
9e9cd7248d [socks] Eliminate magic constants and improve 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
72f3289ac4 [test/test_socks] Add tests for SOCKS proxies 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
71aff18809 [socks] Support SOCKS proxies 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
dab0daeeb0 [utils,compat] Move struct_pack and struct_unpack to compat.py 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
4350b74545 [socks] Add socks.py from @bluec0re's public domain implementation
https://gist.github.com/bluec0re/cafd3764412967417fd3
2016-05-10 14:49:25 +08:00
Sergey M․
2937590e8b [downloader/hls] PEP 8 2016-05-09 22:16:33 +06:00
Sergey M․
fad7bbec3a [test_compat] Remove unused import 2016-05-09 22:15:55 +06:00
Sergey M․
e62d9c5caa [downloader/external] Call ffmpeg with with HTTP_PROXY env variable set (#9437) 2016-05-09 22:05:12 +06:00
Sergey M․
20cfdcc910 [test_compat] Avoid None values for compat_setenv 2016-05-09 22:00:14 +06:00
Sergey M․
1292638754 [test_compat] Use compat_setenv 2016-05-09 21:58:38 +06:00
Sergey M․
fe40f9eef2 [compat] Add compat_setenv 2016-05-09 21:55:03 +06:00
Sergey M․
6104cc2985 [downloader/hls] Add event media playlists to unsupported features of hlsnative 2016-05-09 20:55:37 +06:00
Sergey M․
c15c47d19b [downloader/hls] Remove EXT-X-MEDIA-SEQUENCE from unsupported features for hlsnative 2016-05-09 20:45:03 +06:00
Sergey M․
965fefdcd8 Credit @sleep-walker for #9431 2016-05-09 20:38:33 +06:00
Sergey M․
3951e7eb93 [ceskatelevize] Simplify, restore bonus video test and skip georestricted test (Closes #9431) 2016-05-09 20:37:20 +06:00
Tomáš Čech
f1f6f5aa5e [ceskatelevize] Add support for live streams
Live streams has no playlist title, use title of the stream containing
TV channel name. Internal m3u8 handler doesn't seem to handle well
continuous streams. Add test for live stream. Remove no longer
reachable test.
2016-05-09 18:58:15 +06:00
Sergey M
eb785b856f Merge pull request #9358 from dstftw/hls-native-to-ffmpeg-delegation
[downloader/hls] Delegate extraction to ffmpeg when unsupported features detected
2016-05-08 22:07:55 +00:00
Sergey M․
c52f4efaee [mva] Improve _VALID_URLs 2016-05-08 20:10:20 +06:00
Sergey M․
f23a92a0ce [mva] Add extractor (Closes #6667) 2016-05-08 20:02:54 +06:00
Yen Chi Hsuan
3b01a9fbb6 [litv] Add new extractor
LiTV is a streaming platform providing free and paid legal contents in
Taiwan.
2016-05-08 14:34:38 +08:00
Peter Rowlands
93fdb14177 don't use selection by attribute 2016-05-08 10:33:17 +09:00
Peter Rowlands
370d4eb8ad use stricter file selector
in case of empty in case of empty ./track/video/file entries
2016-05-08 10:02:48 +09:00
Peter Rowlands
3452c3a27c update tests 2016-05-08 10:02:19 +09:00
Sergey M․
9c072d38c6 [arte] Improve language preference (Closes #9401, closes #9162) 2016-05-08 06:52:42 +06:00
Peter Rowlands
81f35fee2f fix extractors.py import order 2016-05-08 08:57:16 +09:00
Peter Rowlands
0fdbe3146c use dict.get in case upload_date does not exist 2016-05-08 08:56:22 +09:00
Sergey M․
3e169233da Expanduser for more options with input files 2016-05-08 04:36:57 +06:00
Sergey M․
f5436c5d9e [downloader/external] Add temp fix ffmpeg m3u8 downloads (Closes #9394) 2016-05-08 02:29:26 +06:00
Sergey M․
5c24873a9e Credit @inondle for #9400 2016-05-08 02:04:34 +06:00
Sergey M․
00c21c225d Credit @kdeldycke for #9430 2016-05-08 00:11:44 +06:00
Sergey M
d013b26719 Merge pull request #9430 from kdeldycke/batch_file_home_expansion
Expand user's home in batch file path.
2016-05-07 18:09:51 +00:00
Kevin Deldycke
e2eca6f65e Expand user's home in batch file path. 2016-05-07 20:03:25 +02:00
Yen Chi Hsuan
a0904c5d80 [telegraaf] Fix extractor (closes #9318) 2016-05-08 00:56:31 +08:00
Sergey M․
cb1fa58813 [flickr] Extract uploader URL (Closes #9426) 2016-05-07 20:15:40 +06:00
remitamine
3fd6332c05 [flickr] extract license field(closes #9425) 2016-05-07 15:13:14 +01:00
Sergey M
401d147893 Merge pull request #9400 from inondle/master
[liveleak] Adds support for thumbnails and updates tests
2016-05-06 19:23:31 +00:00
inondle
e2ee97dcd5 [liveleak] Adds support for thumbnails, updates tests 2016-05-06 12:05:37 -07:00
Sergey M․
f745403b5b [vevo] Revert videoplayer.vevo.com to api.vevo.com 2016-05-06 23:37:17 +06:00
Sergey M․
3e80e6f40d [vevo] Allow request to api.vevo.com to fail (Closes #9417)
I don't know whether this it's tempopary or api has just gone
2016-05-06 23:35:58 +06:00
Sergey M․
25cb7a0eeb [youtube] Allow empty attribute values in description regex 2016-05-06 22:11:18 +06:00
Sergey M․
abc97b5eda [utils] Allow empty attribute values in get_element_by_attribute (Closes #9415) 2016-05-06 22:07:30 +06:00
remitamine
04e88ca2ca [vk] improve extraction(fixes #7976) 2016-05-06 15:02:40 +01:00
Peter Rowlands
8d93c21466 add multi_video test case 2016-05-06 12:08:43 +09:00
Peter Rowlands
1dbfd78754 fix multi_video part naming, add upload_date field 2016-05-06 12:07:29 +09:00
Peter Rowlands
22e35adefd use url instead of single formats entry 2016-05-06 10:41:30 +09:00
Yen Chi Hsuan
6f59aa934b [periscope:user] Add new extractor for user pages
Closes #9388
2016-05-06 02:14:39 +08:00
Yen Chi Hsuan
109db8ea64 Merge pull request #9367 from codesparkle/master
Feature: --restrict-filenames: replace accented characters by their unaccented counterpart instead of "_"
2016-05-06 01:44:03 +08:00
Peter Rowlands
833b644fff use xpath_text 2016-05-06 01:24:02 +09:00
Sergey M․
915620fd68 [redtube] PEP 8 2016-05-05 21:34:06 +06:00
Sergey M․
ac12e888f9 [redtube] Extract all formats, duration, upload date and view count (Closes #9397) 2016-05-05 21:02:54 +06:00
Yen Chi Hsuan
b1c6a5bac8 [Makefile] Remove more media files in make clean 2016-05-05 20:50:39 +08:00
Yen Chi Hsuan
7d08f6073d [kuwo:category] Update test 2016-05-05 20:20:26 +08:00
remitamine
758a059241 [dailymail] Add new extractor(closes #2667) 2016-05-05 13:13:22 +01:00
Yen Chi Hsuan
4f8c56eb4e [fczenit] Fix extraction and update test
Closes #9359
2016-05-05 17:55:37 +08:00
Peter Rowlands
57cf9b7f06 [afreecatv] Add new extractor for afreecatv.com VODs 2016-05-05 03:59:23 +09:00
Sergey M․
9da526aae7 [yandexmusic:playlist] Update test 2016-05-04 23:18:48 +06:00
Sergey M․
75b81df3af [udemy] Modernize 2016-05-04 23:14:12 +06:00
Sergey M․
aabdc83d6e [udemy] Fix course enroll (Closes #9393) 2016-05-04 23:03:44 +06:00
Sergey M․
2a48e6f01a [yandexmusic:playlist] Respect track order for long (>150) playlists 2016-05-04 22:45:01 +06:00
Sergey M․
203a3c0e6a [yandexmusic:playlist] Make title optional 2016-05-04 22:35:28 +06:00
Sergey M․
d36724cca4 [yandexmusic:playlist] Remove unused imports 2016-05-04 22:34:37 +06:00
Sergey M․
15fc0658f7 [yandexmusic:playlist] Modernize 2016-05-04 22:33:29 +06:00
Sergey M․
e960c3c223 [yandexmusic:playlist] Improve extraction (Closes #6801) 2016-05-04 22:25:39 +06:00
Sergey M․
bc7e77a04b [vevo] Use raise_geo_restricted 2016-05-03 23:18:36 +06:00
Sergey M․
964f49336f [aol] Improve _VALID_URL (Closes #9381) 2016-05-03 21:24:51 +06:00
Sergey M․
57d8e32a3e [xfileshare] Add support for streamin.to 2016-05-03 16:58:11 +06:00
Sergey M․
4174552391 [xfileshare] Refactor _VALID_URL and remove ded sites 2016-05-03 15:35:32 +06:00
Sergey M․
80bc4106af [xfileshare] Add support for thevideobee.to (Closes #9374) 2016-05-03 15:09:23 +06:00
Yen Chi Hsuan
7759be38da [xiami] Detect georestriction and skip tests 2016-05-03 16:19:43 +08:00
Yen Chi Hsuan
a0a309b973 [kuwo:category] Fix description and update test 2016-05-03 16:06:28 +08:00
Adam Thalhammer
c587cbb793 improved performance by extracting accented chars to top level 2016-05-03 10:40:30 +10:00
Sergey M․
6c52a86f54 [README.md] Update creator description 2016-05-02 21:32:57 +06:00
Sergey M․
8a92e51c60 [extractor/common] Relax wording for creator metafield 2016-05-02 21:31:35 +06:00
Sergey M․
f0e14fdd43 [YoutubeDL] Skip non-relevant field types when building output template 2016-05-02 20:05:06 +06:00
Sergey M․
df5f4e8888 [vevo] Remove superfluous code 2016-05-02 18:47:35 +06:00
Sergey M․
7960b0563b [YoutubeDL] Properly process unable-to-download-error on python2 2016-05-02 18:35:50 +06:00
Sergey M․
5c9ced9504 [vevo] Improve genre extraction 2016-05-02 18:19:00 +06:00
Adam Thalhammer
31c4448f6e Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes #9347 2016-05-02 13:25:12 +10:00
Adam Thalhammer
79a2e94e79 Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes #9347 2016-05-02 13:21:39 +10:00
Sergey M․
686cc89634 [discovery] Fix typo 2016-05-02 07:07:35 +06:00
Sergey M․
9508738f9a [vevo] Extract featured artist 2016-05-02 03:36:40 +06:00
Sergey M․
78a3ff33ab [vevo:playlist] Add fallback for playlist id 2016-05-02 03:29:48 +06:00
Sergey M․
881dbc86c4 [vevo] Extract track related metafields and add artists to title (Closes #1684) 2016-05-02 03:28:58 +06:00
Sergey M․
8e7d004888 [vevo] Add test for video only available via webpage 2016-05-02 03:06:48 +06:00
Sergey M․
9618c44824 [vevo] Extract video versions from webpage as a last resort (Closes #8426, closes #9366) 2016-05-02 02:58:20 +06:00
Sergey M․
516ea41a7d [vevo] Fix _call_api 2016-05-02 02:54:50 +06:00
Sergey M․
e2bd301ce7 [vevo:playlist] Fix genre playlists 2016-05-02 01:00:42 +06:00
Sergey M․
0c9d288ba0 [vevo:playlist] Remove debug params 2016-05-02 00:50:31 +06:00
Sergey M․
e0da32df6e [vevo:playlist] Add extractor (Closes #9334, closes #9364) 2016-05-02 00:48:26 +06:00
Philipp Hagemeister
174aba3223 release 2016.05.01 2016-05-01 10:19:14 +02:00
Sergey M․
0d66bd0eab [downloader/hls] Delegate extraction to ffmpeg when unsupported features detected 2016-05-01 13:56:51 +06:00
Sergey M․
4bd143a3a0 [postprocessor/ffmpeg] Simplify metadata preparation and add track related metafields (Closes #9357) 2016-05-01 10:56:54 +06:00
Sergey M․
6f27bf1c74 Credit @blahgeek for xiami (#9079) 2016-05-01 08:08:51 +06:00
Sergey M․
68bb2fef95 [tagesschau] Restrict playlist entry regex 2016-05-01 07:15:23 +06:00
Sergey M․
854cc54bc1 [tagesschau] Expand video id 2016-05-01 07:01:55 +06:00
Sergey M․
651ad35ce0 [tagesschau] Relax _VALID_URL 2016-05-01 06:57:19 +06:00
Sergey M․
6a0f9a24d0 [tagesschau] Separate player extractor 2016-05-01 06:45:44 +06:00
remitamine
9cf79e8f4b [ccc] improve extraction 2016-05-01 01:45:17 +01:00
Sergey M․
2844b09336 [tagesschau] Fix article media ids 2016-05-01 04:42:05 +06:00
Sergey M․
1a2b377cc2 [tagesschau] Fix audio support 2016-05-01 04:38:46 +06:00
Sergey M․
4c1b2e5c0e [tagesschau] Add support for playlists 2016-05-01 04:18:56 +06:00
Sergey M․
9e1b96ae40 [rtlnl] Match formats only by height 2016-05-01 03:20:36 +06:00
Sergey M․
fc35cd9e0c [tagesschau] Relax _VALID_URL 2016-05-01 02:56:32 +06:00
Sergey M․
339fe7228a [tagesschau] Update _FORMATS map 2016-05-01 02:56:32 +06:00
remitamine
ea7e7fecbd [discovery] remove unused imports 2016-04-30 21:55:28 +01:00
remitamine
d00b93d58c [discovery] extract more info using BrightcoveNewIE 2016-04-30 21:49:32 +01:00
remitamine
93f7a31bf3 [discovery] extract subtitle 2016-04-30 20:51:32 +01:00
remitamine
33a1ec950c [discovery] extract http formats 2016-04-30 20:51:32 +01:00
Sergey M․
4e0c0c1508 [xiami] Improve extraction (Closes #9079)
* Switch to JSON source
* Add abstract IE for playlists
* Extract more track related metadata
2016-04-30 21:50:23 +06:00
BlahGeek
89c0dc9a5f [xiami] Add xiami extractor 2016-04-30 21:48:40 +06:00
remitamine
f628d800fb [ted] add support for youtube embeds and update tests 2016-04-30 16:34:57 +01:00
remitamine
11fa3d7f99 [ted] extract all http formats 2016-04-30 15:44:30 +01:00
Sergey M․
d41ee7b774 [vlive] Pass Referer as bytestring (Closes #9352) 2016-04-30 19:22:42 +06:00
remitamine
e0e9bbb0e9 [pbs] extract srt and vtt subtitles 2016-04-30 14:02:17 +01:00
remitamine
7691184a31 [pbs] remove duplicate format 2016-04-30 12:57:30 +01:00
remitamine
35cd2f4c25 [pbs] extract only the formats that we know that they will be available as http format
https://projects.pbs.org/confluence/display/coveapi/COVE+Video+Specifications
2016-04-30 11:32:13 +01:00
remitamine
350d7963db [pbs] fix the least bitrate http url construction 2016-04-30 11:12:11 +01:00
remitamine
cbc032c8b7 [pbs] extract all http formats 2016-04-30 01:24:36 +01:00
remitamine
69c4cde4ba [wsj] improve extraction 2016-04-29 21:37:05 +01:00
Sergey M․
ca278a182b [rtlnl] Replace test 2016-04-30 02:07:29 +06:00
Sergey M․
373e1230e4 [rtlnl] Clarify tests 2016-04-30 01:50:26 +06:00
Sergey M․
cd63d091ce [rtlnl] Fix tests 2016-04-30 01:48:14 +06:00
Sergey M․
0571ffda7d [rtlnl] Improve extraction (Closes #9329)
* Make hls extraction non fatal and revert ext
* Extract progressive formats' metadata from corresponding hls formats
2016-04-30 01:43:39 +06:00
Reino17
5556047465 [rtlnl] Update 720p PG_URL_TEMPLATE
- Fixed the format_id for the 720p progressive videostream and added the video's resolution.
- The adaptive videostreams have the m3u8-extension, so I removed the confusing mp4-extension in order to make a better distinction between the these and the progressive videostreams.
2016-04-30 01:43:13 +06:00
remitamine
65a3bfb379 [dfb] extract m3u8 formats 2016-04-29 19:21:17 +01:00
Yen Chi Hsuan
cef3f3011f [funimation] Detect blocking and support CloudFlare cookies 2016-04-30 00:17:09 +08:00
Yen Chi Hsuan
e9c6cdf4a1 [common] Fix format_id construction for HLS 2016-04-29 22:50:16 +08:00
Sergey M․
00a17a9e12 [crunchyroll] Sort formats 2016-04-29 19:44:10 +06:00
Sergey M․
8312b1a3d1 [crunchyroll] Add even more relaxed fmt fallback 2016-04-29 19:43:53 +06:00
Sergey M․
6ff4469528 [crunchyroll] Relax fmt regex 2016-04-29 19:39:27 +06:00
Yen Chi Hsuan
68835d687a Merge branch 'Kagami-vlive-hls' 2016-04-29 19:30:51 +08:00
Yen Chi Hsuan
9d186afac8 [vlive] Coding style and PEP8 2016-04-29 19:29:50 +08:00
Yen Chi Hsuan
151d98130b Merge branch 'vlive-hls' of https://github.com/Kagami/youtube-dl into Kagami-vlive-hls 2016-04-29 19:26:39 +08:00
Kagami Hiiragi
b24d6336a7 [vlive] Add support for live videos 2016-04-29 14:22:50 +03:00
remitamine
065216d94f [crunchyroll] reduce requests for formats extraction 2016-04-29 11:46:42 +01:00
remitamine
67167920db [viewlift] replace SnagFilms extractors
- add support for other sites that use the same logic
- improve format extraction and sorting
2016-04-29 11:24:10 +01:00
Yen Chi Hsuan
14638e2915 [sexykarma] Rename to WatchIndianPornIE and fix extraction 2016-04-29 18:17:08 +08:00
Yen Chi Hsuan
1910077ed7 Revert "[sexykarma] Remove the extractor"
This reverts commit 31ff3c074e.
2016-04-29 17:59:23 +08:00
Yen Chi Hsuan
5819edef03 [ooyala] Skip an invalid test
Ooyala is used by lots of extractors and its correctness can be verified
by these websites.
2016-04-29 14:27:15 +08:00
Yen Chi Hsuan
f5535ed0e3 [orf] Skip the expired test 2016-04-29 14:24:07 +08:00
Yen Chi Hsuan
31ff3c074e [sexykarma] Remove the extractor
Its domain name is on sale.

Closes #9317
2016-04-29 13:36:52 +08:00
Sergey M․
72670c39de [arte:+7] Fix typo in _VALID_URL 2016-04-29 04:46:23 +06:00
Sergey M․
683d892bf9 [viewster] Remove unused import 2016-04-29 01:30:53 +06:00
Sergey M․
497971cd4a [yandexmusic] Clarify blockage even more 2016-04-29 01:28:07 +06:00
remitamine
e757fb3d05 [crunchyroll] improve extraction
- extract more metadata(series, episode, episode_number)
- reduce duplicate requests for extracting formats
- remove duplicate formats
2016-04-28 18:42:20 +01:00
remitamine
0ba9e3ca22 [viewster] extract formats for videos with multiple audios/subtitles 2016-04-28 17:45:09 +01:00
Sergey M․
4b53762914 [yandexmusic] Clarify blockage 2016-04-28 21:45:33 +06:00
Sergey M․
eebe6b382e [yandexmusic] Improve error handling 2016-04-28 21:37:34 +06:00
Yen Chi Hsuan
0cbcbdd89d [nuvid] Fix extraction
Closes #7620
2016-04-28 17:51:20 +08:00
Yen Chi Hsuan
7f776fa4b5 [yandexmusic] Skip tests as Travis CI blocked 2016-04-28 17:08:41 +08:00
Yen Chi Hsuan
eb5ad31ce1 Merge branch 'pmrowla-mwave-meetgreet' 2016-04-28 16:03:43 +08:00
Yen Chi Hsuan
a5941305b6 [mwave] Coding style 2016-04-28 16:03:08 +08:00
Yen Chi Hsuan
f8dddaf456 Merge branch 'mwave-meetgreet' of https://github.com/pmrowla/youtube-dl into pmrowla-mwave-meetgreet 2016-04-28 15:56:32 +08:00
Yen Chi Hsuan
618c71dc64 [cloudy] New domain name for the test_cloudy_1
I'm sure whether videoraj.ch still works or not, so keep it.
2016-04-28 15:46:00 +08:00
Sergey M․
52af8f222b [cwtv] Relax _VALID_URL (Closes #9327) 2016-04-28 04:01:21 +06:00
Yen Chi Hsuan
3cc8649c9d [20min] Detect embedded YouTube videos
Fixes #9331
2016-04-28 02:58:11 +08:00
Yen Chi Hsuan
dcf094d626 [theplatform] Fix for Python 3.2
test_AENetworks{,_1} fails as in Python < 3.3, binascii.a2b_* functions
accepts only bytes-like objects
2016-04-27 18:35:33 +08:00
Peter Rowlands
5b5d7cc11e [mwave] Add Mwave Meet & Greet extractor 2016-04-27 15:57:17 +09:00
Yen Chi Hsuan
2ac2cbc0a3 [malemotion] Remove the extractor
Announcement from their homepage:

```
MaleMotion is closed

After another system crash, I'm forced to close the site

This week all content will be erased

Don't forget to cancel your subscription if any !
```

Closes #9311.
2016-04-27 13:55:32 +08:00
Yen Chi Hsuan
a7e03861e8 [scivee] Skip the test
Not accessible from either Travis CI or my machine.

Closes #9315
2016-04-27 13:52:04 +08:00
Sergey M
046ea04a7d [README.md] Mention mpv 2016-04-27 00:22:08 +06:00
Sergey M
7464360379 [README.md] Add FAQ entry on output template conflicts 2016-04-27 00:16:48 +06:00
Sergey M․
175c2e9ec3 [youtube:search_url] Reimplement in terms of youtube:playlistbase 2016-04-26 22:29:29 +06:00
remitamine
f1f879098a [viewster] extract more metadata for http formats 2016-04-26 13:40:40 +01:00
Sergey M․
c9fd530670 [ok] Extract start time 2016-04-25 22:15:15 +06:00
Sergey M․
749b0046a8 [ok] Allow embeds without title (Closes #9303) 2016-04-25 22:05:47 +06:00
Yen Chi Hsuan
e3de3d6f2f [normalboots] Fix extraction
Now it's using ScreenwaveMedia
2016-04-25 23:49:12 +08:00
Yen Chi Hsuan
ad58942d57 [muzu] Remove extractor
MUZU is shutting down in October 2015. [1]

[1] http://www.musicbusinessworldwide.com/youtube-rival-muzu-is-heading-into-liquidation/
2016-04-25 23:35:05 +08:00
Yen Chi Hsuan
4645432d7a [eagleplatform] Checking direct HTTP links
Sometimes they fail with 404
2016-04-25 22:48:17 +08:00
Yen Chi Hsuan
6bdc2d5358 [mitele] Comment out unstable MD5
Also Akamai f4f fragments
2016-04-25 22:27:25 +08:00
Yen Chi Hsuan
2beff95da5 [nrk] Comment out unstable MD5 checksums
Both are Akamai f4f fragments.
2016-04-25 22:26:19 +08:00
Yen Chi Hsuan
abc1723edd [unistra] Sort formats
Originally URLs are passed to set() and not sorted, so the result is not
deterministic, causing occasional FAILs on Travis CI.
2016-04-25 22:24:40 +08:00
Yen Chi Hsuan
b248e6485b Merge branch 'remitamine-akamai_pv' 2016-04-25 21:02:30 +08:00
Yen Chi Hsuan
d6712378e7 Merge branch 'akamai_pv' of https://github.com/remitamine/youtube-dl into remitamine-akamai_pv 2016-04-25 21:02:02 +08:00
remitamine
fb72ec58ae [extractor/common] do not process f4m manifest that contain akamai playerVerificationChallenge 2016-04-25 13:37:03 +01:00
Sergey M․
c83a352227 [openload] Make thumbnail optional 2016-04-25 00:26:06 +06:00
Sergey M․
e9063b5de9 [openload] Add test 2016-04-25 00:22:55 +06:00
Sergey M․
594b0c4c69 [openload] Fix ext extraction 2016-04-25 00:03:29 +06:00
Sergey M․
eb9ee19422 [utils] Allow None mimetypes in mimetype2ext 2016-04-25 00:03:12 +06:00
Sergey M․
a1394b820d [openload] Fix title extraction (Closes #9298) 2016-04-25 00:01:37 +06:00
Yen Chi Hsuan
aa9dc24f5a [douyutv] Improve extraction and update tests
The JSON API sometimes return HTML pages with errors
2016-04-24 23:52:17 +08:00
Yen Chi Hsuan
51762e1a31 [xminus] Fix extraction (closes #9228) 2016-04-24 23:21:45 +08:00
Philipp Hagemeister
8b38f2ac40 release 2016.04.24 2016-04-24 17:06:46 +02:00
Yen Chi Hsuan
a82398bd72 [kwuo:song] Fix extraction and update the test 2016-04-24 22:20:45 +08:00
remitamine
c14dc00df3 [viewster] improve http formats extraction 2016-04-24 14:34:28 +01:00
Yen Chi Hsuan
03dd60ca41 [kuwo:category] Fix the test
Sometimes there are 24 songs and sometimes 30 lol
2016-04-24 21:16:06 +08:00
Yen Chi Hsuan
0738187f9b [ThePlatform] Fix tests failed since 79ba9140dc 2016-04-24 20:46:06 +08:00
Yen Chi Hsuan
a956cb6306 [onionstudios] Fix description extraction
\1 does not work in []. Fixes test_Generic_75
(http://www.clickhole.com/video/dont-understand-bitcoin-man-will-mumble-explanatio-2537)
2016-04-24 20:41:17 +08:00
Yen Chi Hsuan
a8062eabcd [mwave] Skip checking unstable MD5
On my PC the checksum is 02eda6d09fb63131a17a8d44e6237463, while a
recent Travis CI build
(https://travis-ci.org/rg3/youtube-dl/jobs/125341081) shows it's
c930e27b7720aaa3c9d0018dfc8ff6cc
2016-04-24 20:05:24 +08:00
Yen Chi Hsuan
2a7dee8cc5 [yahoo] Improve error detection and update tests 2016-04-24 18:12:16 +08:00
Yen Chi Hsuan
d9ed362116 [yahoo] Extract all <iframe>s
Fixes test_yahoo_6

(https://ca.finance.yahoo.com/news/hackers-sony-more-trouble-well-154609075.html)
2016-04-24 17:46:25 +08:00
Yen Chi Hsuan
4f54958097 [yahoo] Update some tests
One has new fields as ThePlatformIE changed, and others have changed
files.
2016-04-24 17:29:01 +08:00
Yen Chi Hsuan
2a7c38831c [yahoo] Extend _VALID_URL and fix extraction
Closes #9271
2016-04-24 17:01:18 +08:00
Yen Chi Hsuan
949b6497cc [generic] Unescape the video URL
Fixes #9279
2016-04-24 16:25:37 +08:00
Sergey M
2c21152ca7 [README.md] Document track metafields in output template 2016-04-24 12:22:18 +06:00
remitamine
fda9a1ca9e [viewster] simplify qualities_basename regex 2016-04-24 03:06:46 +01:00
remitamine
864d5e7231 [viewster] extract all http formats 2016-04-24 02:32:56 +01:00
Wang Jun Tham
ccff2c404d [ffmpeg] Fix embedding subtitles (#9063)
Changed command line parameters for ffmpeg when embedding subtitles.
Changed to ‘-map 0:v -c:v copy -map 0:a -c:a copy’
2016-04-24 00:08:02 +08:00
Sergey M․
5448b781f6 [dplay] Sign unsigned final download hls URLs 2016-04-23 17:28:45 +06:00
Sergey M․
e239413fbc [dplay] Extract subtitles (Closes #9284) 2016-04-23 16:50:31 +06:00
Sergey M․
fd0ff8bad8 [dplay] Improve extraction and document workarounds and tests 2016-04-23 16:36:17 +06:00
Sergey M․
397ec446f3 [dplay] Try secure api for no tld (Closes #9282) 2016-04-23 15:59:30 +06:00
Boris Wachtmeister
14f7a2b8af [WDRMaus] switch current show to new WDR extractor (fixes #8562)
It seems that the "current show" already uses the new WDR video-player,
while all the others videos still use the old player.

I just added the current show URL to the normal WDR-extractor, which
works fine. This commit needs my changes from PR #8842 that fix the
support for WDR.
2016-04-23 11:53:22 +02:00
Boris Wachtmeister
c0837a12c8 [WDR] complete overhaul after relaunch of the site
The WDR relaunched their site on 2016-02-23 which not only changed the
URL-schema completely but also the layout of their pages.

Apparently the whole "mediathek" now runs on the wdr-domain, so no
separate URL for funkhauseuropa anymore.
There seems to be no explicit handling of video-sizes on the page or in
the URLs anymore. There seems to be only one size for HTML5, but still
several sizes for flash. The extractor adds all to the list of formats.

There is no metadata for the HTML5-stream, so that the best flash-stream
will always be considered as the "best" format. At least in my tests
this seemed to be true anyway.
2016-04-23 11:42:18 +02:00
remitamine
29a7e8f6f8 [nhl] Add new extractor(closes #8419)(closes #8798) 2016-04-22 20:18:27 +01:00
Yen Chi Hsuan
eb01e97e10 [youku] Skip streams with channel_type=tail
Fixes #9275

These video segments look like ads and they don't appear in the web
player.
2016-04-23 02:54:09 +08:00
remitamine
cb7d4d0efd [nbc] add support for today.com(closes #2909) 2016-04-22 18:08:20 +01:00
Yen Chi Hsuan
c80037918b [iqiyi] Improve error detection (#9276) 2016-04-23 00:06:49 +08:00
remitamine
237a41108a [eagleplatform] extract all http formats 2016-04-22 14:32:38 +01:00
remitamine
e962ae15d3 [newstube] extract http formats(closes #9253) 2016-04-22 11:26:43 +01:00
remitamine
7c36ea7d54 [rtbf] improve extraction(fixes #9267) 2016-04-21 22:52:49 +01:00
remitamine
9260cf1d97 [tubitv] fix extraction(closes #8741) 2016-04-21 20:30:19 +01:00
Sergey M․
bdbb8530c7 [vimeo] Pass Referer for check-password request 2016-04-22 00:02:39 +06:00
Sergey M․
09a9fadb84 [dump] Remove extractor 2016-04-21 23:31:34 +06:00
Sergey M․
bf09af3acb Add --hls-prefer-ffmpeg 2016-04-21 23:02:17 +06:00
Sergey M․
88296ac326 [planetaplay] Remove remainings of extractor 2016-04-21 22:57:38 +06:00
Sergey M․
870d525848 [options] Remove experimental mark for --hls-prefer-native 2016-04-21 22:44:01 +06:00
Sergey M․
6577112890 [planetaplay] Remove extractor (Closes #9256) 2016-04-21 22:33:54 +06:00
Sergey M․
1988647dda [tvigle] Skip hls completely (#9259) 2016-04-21 22:15:20 +06:00
Yen Chi Hsuan
a292cba256 [mgtv] Fix _VALID_URL and add localized name 2016-04-22 00:07:43 +08:00
Yen Chi Hsuan
982e518a96 [dispeak] Rename DigitalSpeaking to DigitallySpeaking 2016-04-22 00:07:43 +08:00
Yen Chi Hsuan
748e730099 [dispeak] Several fixes 2016-04-22 00:07:43 +08:00
Sergey M
b6c0d4f431 Merge pull request #9110 from remitamine/parse_duration
[utils] imporove parse_duration to handle more formats
2016-04-21 22:53:16 +07:00
remitamine
acaff49575 [utils] imporove parse_duration to handle more formats 2016-04-21 16:34:54 +01:00
Yen Chi Hsuan
1da19488f9 [mgtv] Add new extractor (closes #9212) 2016-04-21 23:29:51 +08:00
Yen Chi Hsuan
442c4d361f [dispeak/gdcvault] Add the test case from #5784 2016-04-21 19:47:10 +08:00
Yen Chi Hsuan
ec59d657e7 [dispeak] Add new extractor
Both GDCVault and GPUTechConf uses the service of DigitalSpeaking.
2016-04-21 19:36:33 +08:00
Yen Chi Hsuan
99ef96f84c [gdcvault] Fix for videos with hard-coded hostnames
Fixes #9248
2016-04-21 18:07:03 +08:00
Yen Chi Hsuan
4dccea8ad0 [streetvoice] Fix extraction
The old API results in URLs with HTTP 403 from time to time.

Hopefully fixes #9219.
2016-04-21 13:07:53 +08:00
Yen Chi Hsuan
2c0d9c6217 [extractor/common] Allow empty post data 2016-04-21 13:06:06 +08:00
Sergey M․
12a5134596 [tvigle] Fix extraction (Closes #9259) 2016-04-20 23:52:41 +06:00
Sergey M․
16e633a5d7 [quickvid] Remove extractor (Closes #9258) 2016-04-20 23:29:02 +06:00
Sergey M․
494ab6db73 [youtube] Capture and output login error message 2016-04-20 22:14:32 +06:00
Sergey M․
107701fcfc [people] Remove bogus comment 2016-04-20 03:40:02 +06:00
Sergey M․
f77970765a [people] Add extractor 2016-04-20 03:37:23 +06:00
Philipp Hagemeister
81215d5652 release 2016.04.19 2016-04-19 03:03:52 +02:00
Sergey M․
241a318f27 [vimeo] Improve _VALID_URL (Closes #9229) 2016-04-18 21:40:28 +06:00
Sergey M․
4fdf082375 [theonion] Remove extractor (Closes #9220)
It now uses generic onionstudios embed
2016-04-17 23:12:23 +06:00
Jaime Marquínez Ferrándiz
1b6182d8f7 [youtube:playlist] Fetch all the videos in a mix (fixes #3837)
Since there doesn't seem to be any indication, it stops when there aren't new videos in the webpage.
2016-04-17 17:07:57 +02:00
remitamine
7bab22a402 [vice] remove unused import and variable 2016-04-17 14:06:19 +01:00
Yen Chi Hsuan
0f97fb4d00 [musicplayon] Relax _VALID_URL and improve metadata extraction
In r'pl=\d+&play=\d+' pages, several metadata items are missing

Closes #9222.
2016-04-17 17:24:33 +08:00
Yen Chi Hsuan
b1cf58f48f [musicplayon] Fix extraction (closes #9222) 2016-04-17 15:08:51 +08:00
remitamine
3014b0ae83 Merge pull request #9195 from remitamine/ffmpeg-pipe
[downloader/external] enable piping for FFmpegFD(closes #2124)
2016-04-16 22:00:49 +01:00
remitamine
b9f2fdd37f [ffmpeg] Clarify rationale for pipe(-) exclusion in _ffmpeg_filename_argument 2016-04-16 21:50:13 +01:00
remitamine
bbb3f730bb [onionstudios] extract m3u8 formats 2016-04-16 20:53:13 +01:00
remitamine
d868f43c58 [ffmpeg] check for - file name in _ffmpeg_filename_argument 2016-04-16 19:45:56 +01:00
Yen Chi Hsuan
21525bb8ca [kuwo:category] Update the test
Now the webpage says there are 24 songs.
2016-04-17 02:38:05 +08:00
Sergey M․
d8f103159f [nerdist] Remove extractor
It now uses brightcove
2016-04-17 00:16:31 +06:00
remitamine
663ee5f0a9 [vice] extract youtube embed 2016-04-16 17:49:39 +01:00
Sergey M․
b6b950bf58 [cbs] Remove unused import 2016-04-16 22:47:10 +06:00
Sergey M․
11e60fcad8 [extractor/generic] Improve instagram embeds (Closes #9213) 2016-04-16 22:39:20 +06:00
Sergey M․
c23533a100 [instagram] Add support for iframe embeds 2016-04-16 22:31:05 +06:00
Sergey M․
0dafea02e6 [instagram] Add support for embed URLs 2016-04-16 22:23:08 +06:00
Sergey M․
5d6360c3b7 [mooshare] Remove extractor 2016-04-16 21:31:50 +06:00
Yen Chi Hsuan
5e5c30c3fd [mdr] Fix extraction and update tests
It's strange that the date is changed. Anyway, new data matches what the
webpage says.
2016-04-16 21:57:28 +08:00
Yen Chi Hsuan
9154c87fc4 [huffpost] Fix a typo 2016-04-16 21:41:22 +08:00
Yen Chi Hsuan
ef0e4e7bc0 [generic] Fix test_Generic_2
Now a HEAD request returns 400 Bad Request
2016-04-16 19:44:45 +08:00
Yen Chi Hsuan
67d46a3f90 [ustream] Fix /embed/ URLs and add a test 2016-04-16 19:39:25 +08:00
Yen Chi Hsuan
bec47a0748 [tudou] Improve error detection (closes #9175) 2016-04-16 19:11:25 +08:00
Yen Chi Hsuan
36b7d9dbfa [twitter] Don't check /cards/ URLs
Fixes #9181

In this tweet, there are two cards:
1. https://twitter.com/i/cards/tfw/v1/719944006306701313
   This shows #TeamCap vs. #TeamIronMan
2. https://twitter.com/i/videos/tweet/719944021058060289
   This is the real video and can be handled by TwitterCardIE

In all current test_Twitter* tests, /videos/tweet/ approach works fine.
2016-04-16 18:57:50 +08:00
Yen Chi Hsuan
8c65e4a527 [bbc] Fix a test 2016-04-16 18:00:19 +08:00
Yen Chi Hsuan
6ad2ef8b7c [audiomack] Update the test
The original test raises 404
2016-04-16 17:54:39 +08:00
Yen Chi Hsuan
00b426d66d [varzesh3] Add md5 to the test 2016-04-16 17:41:56 +08:00
Yen Chi Hsuan
0de968b584 [newgrounds] Support videos (closes #9138) 2016-04-16 17:41:56 +08:00
remitamine
0841d5013c [cbs] do not catch Exceptions raised by by _extract_theplatform_smil 2016-04-16 10:25:59 +01:00
remitamine
a71fca8577 [theplatform] remove _sort_formats from _extract_theplatform_smil 2016-04-16 10:23:56 +01:00
Yen Chi Hsuan
ee94e7e66d [varzesh3] Fix metadata extraction (closes #9197) 2016-04-16 17:13:22 +08:00
Yen Chi Hsuan
759e37c9e6 [gazeta] Relax _VALID_URL and update tests
Closes #9196
2016-04-16 16:48:47 +08:00
Yen Chi Hsuan
ae65567102 [eagleplatform] Fix error handling 2016-04-16 16:47:16 +08:00
Yen Chi Hsuan
c394b4f4cb [puls4] Fix error detection (#9194) 2016-04-16 16:22:44 +08:00
Yen Chi Hsuan
260c7036ba [sportbox] Fix SportBoxEmbedIE
Also fixes test_Generic_29 (http://www.vestifinance.ru/articles/25753)
2016-04-16 16:13:14 +08:00
remitamine
f74197a074 [cbs] extract rtmp formats 2016-04-15 22:38:37 +01:00
remitamine
f3a58d46bf [youtube:user] check if the url didn't match only the other youtube extractors 2016-04-15 19:06:13 +01:00
Sergey M․
b6612c9b11 [karaoketv] Fix extraction 2016-04-15 21:26:54 +06:00
Yen Chi Hsuan
7e176effb2 [iqiyi] Also suuport pps.tv URLs
PPS is acquired by Baidu and merged with iQiyi in 2013 [1]. Now they
have the same page layouts.

[1] http://www.chinanews.com/it/2013/05-07/4792526.shtml
2016-04-15 22:39:18 +08:00
Yen Chi Hsuan
4a252cc2d2 [karaoketv] Update and mark as not _WORKING 2016-04-15 21:49:17 +08:00
Yen Chi Hsuan
f0ec61b525 [huffpost] Fix extraction 2016-04-15 20:55:56 +08:00
Yen Chi Hsuan
66d40ae3a5 Merge pull request #9041 from kasper93/master
[generic] Add support for LiveLeak embeds
2016-04-15 17:23:55 +08:00
Yen Chi Hsuan
e6da9240d4 [mixcloud:stream] Add new extractor
Closes #7633
2016-04-15 17:14:17 +08:00
Yen Chi Hsuan
dd91dfcd67 [mixcloud] Fix extraction by decrypting play info
Fixes #7521
2016-04-15 15:48:22 +08:00
Yen Chi Hsuan
c773082692 Merge branch 'Phaeilo-mixcloud' 2016-04-15 14:33:04 +08:00
Yen Chi Hsuan
9c250931f5 [mixcloud] Improve and simplify mixcloud:user and mixcloud:playlist 2016-04-15 14:32:02 +08:00
Yen Chi Hsuan
56f1750049 [tdslifeway] Use the new Brightcove API
Thanks for @remitamine's suggestion.
2016-04-15 04:28:54 +08:00
Yen Chi Hsuan
f2159c9815 [wayofthemaster] Remove extractor
Now it's using YouTube embeds.
2016-04-15 04:02:23 +08:00
Yen Chi Hsuan
b0cf2e7c1b [ubu] Remove extractor
1. Videos on ubu.com are now hosted on Vimeo
2. The duration is far from correct, and may not exist on other videos
   (For example http://ubu.com/film/hammons_king.html)
2016-04-15 03:48:23 +08:00
Yen Chi Hsuan
74b47d00c3 [xboxclips] Use http:// URL
xboxclips has misconfigured certificates
2016-04-15 03:30:38 +08:00
Yen Chi Hsuan
8cb57bab8e [ministrygrid] Fix extraction and modernize 2016-04-15 02:48:12 +08:00
Yen Chi Hsuan
e1bf277e19 [tdslifeway] Add TDSLifewayIE
Used by MinistryGridIE
2016-04-15 02:48:12 +08:00
remitamine
ce599d5a7e [downloader/external] enable piping for FFmpegFD(closes #2124) 2016-04-14 18:49:02 +01:00
Sergey M․
9e28538726 [arte:creative] Improve _VALID_URL 2016-04-14 21:54:41 +06:00
Sergey M․
404284132c [arte:info] Add extractor (Closes #9182) 2016-04-14 21:52:05 +06:00
remitamine
5565be9dd9 [aol] relex _VALID_URL regex 2016-04-14 08:47:55 +01:00
Yen Chi Hsuan
b3a9474ad1 Merge branch 'mixcloud' of https://github.com/Phaeilo/youtube-dl into Phaeilo-mixcloud 2016-04-14 15:31:58 +08:00
Yen Chi Hsuan
86475d59b1 [metacritic] Add a new valid test case 2016-04-14 15:12:59 +08:00
Yen Chi Hsuan
73d93f948e [lecture2go] Fix extraction
RTSP stream fails to download. Seems it's a mpv bug as direct playback
works well:

$ mpv --ytdl-format rtsp https://lecture2go.uni-hamburg.de/veranstaltungen/-/v/17473
2016-04-14 15:08:01 +08:00
Yen Chi Hsuan
f5d8743e0a [downloader/rtsp] Print the command 2016-04-14 15:07:31 +08:00
Yen Chi Hsuan
d1c4e4ba15 [laola1tv] Improve error detection and skip an invalid test 2016-04-14 14:11:28 +08:00
Yen Chi Hsuan
f141fefab7 [karrierevideos] Fix extraction
The server serves malformed header "Content Type: text/xml" for the XML
request (it should be Content-Type but not Content Type). Python 3.x,
which uses email.feedparser rejects such headers. As a result,
Content-Encoding header is not parsed, so the returned content is kept
not decompressed, and thus XML parsing error.
2016-04-14 14:06:05 +08:00
aystroganov@gmail.com
8334637f4a Make tbr field 'int' rather than 'tuple'
Closes #9180.
2016-04-13 14:29:34 +02:00
Philipp Hagemeister
b0ba11cc64 release 2016.04.13 2016-04-13 08:02:03 +02:00
Kacper Michajłow
b8f67449ec [generic] Add support for LiveLeak embeds 2016-04-13 01:54:19 +02:00
Yen Chi Hsuan
75af5d59ae [netease] Skip all tests: completely georestricted 2016-04-13 04:52:07 +08:00
Sergey M․
b969d12490 Credit @Phaeilo for presstv (#7113) 2016-04-13 01:52:50 +06:00
Philip Huppert
6d67169509 [mixcloud] improved extraction of user description 2016-04-12 21:18:13 +02:00
Philip Huppert
dcaf00fb3e [mixcloud] support older urllib versions 2016-04-12 21:18:13 +02:00
Philip Huppert
f896e1ccef [mixcloud] fixed some tests 2016-04-12 21:18:13 +02:00
Philip Huppert
c96eca426b [mixcloud] Added support for user uploads, playlists, favorites and listens.
Fixes #3750 and #5272
2016-04-12 21:18:13 +02:00
Sergey M․
466a614537 [youtube:playlist] Recognize popular uploads playlist as mix (Closes #9170) 2016-04-12 21:38:31 +06:00
Sergey M․
ffa2cecf72 [ard] Change subtitles extension to ttml (Closes #9169)
ttml is now served instead of srt
2016-04-12 21:20:31 +06:00
Yen Chi Hsuan
a837416025 [jadorecettepub] Remove extractor: website gone 2016-04-12 18:30:53 +08:00
Yen Chi Hsuan
c9d448876f [izlesene] Fix extraction
description may be absent
2016-04-12 18:29:28 +08:00
Yen Chi Hsuan
8865b8abfd [howstuffworks] Skip a broken test case 2016-04-12 17:30:14 +08:00
Yen Chi Hsuan
c77a0c01cb [groupon] Fix extraction 2016-04-12 17:26:09 +08:00
Yen Chi Hsuan
12355ac473 [goshgay] Fix extraction
isFamilyFriendly no longer exists in the webpage and I can't find
another indicator.
2016-04-12 17:23:00 +08:00
Sergey M․
49f523ca50 [mixcloud] Capture error message (#9156) 2016-04-11 20:45:58 +06:00
remitamine
4a903b93a9 Revert "[openclassroom] Add new extractor(closes #9147)"
This reverts commit 13267a2be3.
2016-04-11 14:44:35 +01:00
remitamine
13267a2be3 [openclassroom] Add new extractor(closes #9147) 2016-04-11 14:24:08 +01:00
Yen Chi Hsuan
134c207e3f [arte.tv:embed] Extended support (#2620) 2016-04-11 19:32:27 +08:00
Yen Chi Hsuan
0f56bd2178 Merge branch 'Phaeilo-presstv' 2016-04-11 16:17:05 +08:00
Yen Chi Hsuan
dfbc7f7f3f [presstv] Improve and simplify 2016-04-11 16:14:07 +08:00
Yen Chi Hsuan
7d58ea7c5b Merge branch 'presstv' of https://github.com/Phaeilo/youtube-dl into Phaeilo-presstv 2016-04-11 15:48:10 +08:00
Sergey M․
452908b257 [telebruxelles] Fix extraction (Closes #9142) 2016-04-11 00:06:05 +06:00
Sergey M․
5899e988d5 [glide] Improve extraction and extract upload info 2016-04-10 23:56:23 +06:00
Sergey M․
4a121d29bb [glide] Fix extraction (Closes #9141) 2016-04-10 23:45:17 +06:00
Sergey M․
7ebc36900d [jwplatform:base] Improve subtitles extraction 2016-04-10 22:55:07 +06:00
Sergey M․
d7eb052fa2 [screencastomatic] Add duration to test 2016-04-10 22:48:04 +06:00
Sergey M․
a6d6722c8f [jwplatform:base] Extract duration 2016-04-10 22:47:38 +06:00
Sergey M․
66fa495868 [screencastomatic] Fix extraction (Closes #9136) 2016-04-10 22:37:14 +06:00
Sergey M․
443285aabe [ebaumsworlds] Update _VALID_URL (Closes #9135) 2016-04-10 22:15:11 +06:00
Philip Huppert
de728757ad [presstv] Refactored extractor. 2016-04-10 16:36:44 +02:00
Sergey M․
f44c276842 [extractor/extractors] Remove non-existant imports 2016-04-10 19:21:58 +06:00
Sergey M․
a1fa60a934 [cliprs] Add extractor (Closes #9099) 2016-04-10 18:43:40 +06:00
Sergey M․
49caf3307f [extractor/common] Remove irrelevant comment 2016-04-10 17:10:27 +06:00
Jaime Marquínez Ferrándiz
6a801f4470 [test/InfoExtractors] add test for _download_json 2016-04-09 23:18:41 +02:00
Sergey M․
61dd350a04 [1tv] Fix extraction (Closes #9103) 2016-04-10 03:02:35 +06:00
Jaime Marquínez Ferrándiz
eb9c3edd5e [test/utils] Add test for date_from_str 2016-04-09 22:40:05 +02:00
Philip Huppert
95153a960d [presstv] updated extractor and tests to work with current PressTV website 2016-04-09 16:14:05 +02:00
Yen Chi Hsuan
6c4c7539f2 [test/helper] Check got values to be strings for md5: fields
Seen in PBSIE tests
2016-04-09 22:04:48 +08:00
Yen Chi Hsuan
c991106706 [videodetective] Adapt to InternetVideoArchiveIE 2016-04-09 21:47:35 +08:00
Yen Chi Hsuan
dae2a058de [rottentomatoes] Adapt to InternetVideoArchiveIE 2016-04-09 21:47:12 +08:00
Yen Chi Hsuan
c05025fdd7 [internetvideoarchive] Fix extraction and support json URLs 2016-04-09 21:46:51 +08:00
Philip Huppert
bfe96d7bea [presstv] Added extractor PressTV.
Fixes #7060
2016-04-09 14:55:54 +02:00
Yen Chi Hsuan
ab481b48e5 [funnyordie] Relax M3U8 URL matching
Also, m3u8_url extraction should be fatal as all formats depends
directly or indirectly on it.

This change fixes test_Generic_26 and TestFunnyOrDieSubtitles
2016-04-09 20:17:35 +08:00
Sergey M․
92c7f3157a [aol] Add coding cookie 2016-04-09 17:32:23 +06:00
Yen Chi Hsuan
cacd996662 [utils] Don't touch URLs if not necessary
Fix test_Generic_15 (Google redirect)
2016-04-09 19:27:54 +08:00
remitamine
bffb245a48 [aol] add support for videos with vidible IDs(closes #9124) 2016-04-09 10:51:23 +01:00
Yen Chi Hsuan
680efb6723 Merge pull request #8497 from jaimeMF/lazy-load
Add experimenta lazy loading of info extractors
2016-04-09 14:08:13 +08:00
Jaime Marquínez Ferrándiz
5a9858bfa9 setup.py: add command for building the lazy_extractors module 2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
8a5dc1c1e1 lazy extractors: Initialize the real info extractor
According to the docs '__init__' is only called automatically if '__new__' returns an instance of the original class.
2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
e0986e31cf lazy extractors: Output if it's enabled in the verbose log 2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
6b97ca96fc lazy extractors: Style fixes
* Sort extractors alphabetically
* Add newlines when needed (youtube_dl/extractors/lazy_extractors.py pass the flake8 test now)
2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
c1ce6acdd7 lazy extractors: Fix building with python2.6 2016-04-08 21:50:07 +02:00
Jaime Marquínez Ferrándiz
0d778b1db9 lazy extractors: specify the encoding
When building with python3 the unicode characters are not escaped, python2 needs to know the encoding.
2016-04-08 21:50:07 +02:00
Jaime Marquínez Ferrándiz
779822d945 Add experimental support for lazy loading the info extractors
'make lazy-extractors' creates the youtube_dl/extractor/lazy_extractors.py (imported by youtube_dl/extractor/__init__.py), which contains simplified classes that only have the 'suitable' class method and that load the appropiate class with the '__new__' method when a instance is created.
2016-04-08 21:50:07 +02:00
Jaime Marquínez Ferrándiz
1b3d5e05a8 Move the extreactors import to youtube_dl/extractor/extractors.py 2016-04-08 21:47:51 +02:00
Jaime Marquínez Ferrándiz
e52d7f85f2 Delay initialization of InfoExtractors until they are needed 2016-04-08 21:43:24 +02:00
Sergey M․
568d2f78d6 [tnaflix] Fix metadata extraction 2016-04-09 00:27:24 +06:00
Sergey M․
2f2fcf1a33 [tnaflix] Fix extraction (Closes #9074) 2016-04-08 23:34:59 +06:00
Sergey M․
bacec0397f [extractor/common] Relax _hidden_inputs 2016-04-08 23:33:45 +06:00
Sergey M․
3c6c7e7d7e [gdcvault] Fix extraction (Closes #9107, closes #9114) 2016-04-08 23:16:02 +06:00
Sergey M․
fb38aa8b53 [extractor/common] Support arbitrary format strings for template based identifiers in mpd manifests (Closes #9119, closes #9120) 2016-04-08 22:48:08 +06:00
Sergey M․
18da24634c [democracynow] Improve extraction 2016-04-08 22:27:27 +06:00
Sergey M․
a134426d61 [democracynow] Fix tests 2016-04-08 22:21:14 +06:00
Sergey M․
a64c0c9b06 [democracynow] Make description optional (Closes #9115) 2016-04-08 22:15:36 +06:00
Sergey M․
56019444cb [novamov] Improve _VALID_URL template (Closes #9116) 2016-04-08 21:26:42 +06:00
remitamine
a1ff3cd5f9 [acast] fix channel extraction(closes #9117) 2016-04-08 15:15:34 +01:00
remitamine
9a32e80477 [acast] fix extraction(#9117) 2016-04-08 14:51:00 +01:00
Sergey M․
536a55dabd [YoutubeDL] Sanitize single thumbnail URL 2016-04-08 00:17:47 +06:00
Sergey M․
ed6fb8b804 [vrt] Add support for direct hls playlists and YouTube (Closes #9108) 2016-04-07 23:22:43 +06:00
Sergey M․
3afef2e3fc [beeg] Improve extraction 2016-04-07 22:40:35 +06:00
Sergey M․
e90d175436 [yandexmusic] Extract music album metafields (Closes #7354) 2016-04-07 02:56:13 +06:00
Sergey M․
7a93ab5f3f [extractor/common] Introduce music album metafields 2016-04-07 02:53:53 +06:00
Philipp Hagemeister
c41cf65d4a release 2016.04.06 2016-04-06 15:13:08 +02:00
Jaime Marquínez Ferrándiz
ec4a4c6fcc Makefile: remove ISSUE_TEMPLATE.md from the 'all' target (fixes #9088)
It isn't included in the tar file, causing build failures.
Since it's only used for GitHub, I think we don't need to store it in the tar file.
2016-04-06 14:16:05 +02:00
Jaime Marquínez Ferrándiz
be0c7009fb Makefile: use full path for the ISSUE_TEMPLATE.md file 2016-04-06 14:09:31 +02:00
Yen Chi Hsuan
92d5477d84 [compat] Handle tuples properly in urlencode()
Fixes #9055
2016-04-06 18:29:54 +08:00
Yen Chi Hsuan
8790249c68 [iqiyi] Improve error detection for VIP-only videos
Closes #9071
2016-04-06 16:12:16 +08:00
Philipp Hagemeister
416930d450 release 2016.04.05 2016-04-05 18:36:24 +02:00
Sergey M․
65150b41bb [deezer] Fix extraction (Closes #9086) 2016-04-05 22:27:33 +06:00
Sergey M․
e42f413716 [rte] Improve thumbnail extraction (Closes #9085) 2016-04-05 22:23:20 +06:00
Sergey M․
40a056d85d [extractor/__init__] Remove novamov extractor and sort novamov based extractors alphabetically 2016-04-05 21:54:09 +06:00
Sergey M․
e7d77efb9d [auroravid] Add extractor (Closes #9070) 2016-04-05 21:52:07 +06:00
Sergey M․
995cf05c96 [novamov] Make title fatal 2016-04-05 21:40:43 +06:00
Jaime Marquínez Ferrándiz
5bf28d7864 [utils] dfxp2srt: add additional namespace
Used by the ZDF subtitles (#9081).
2016-04-04 20:46:35 +02:00
Jaime Marquínez Ferrándiz
8c7d6e8e22 [zdf] Extract subtitles (closes #9081) 2016-04-04 20:44:06 +02:00
Sergey M․
6d4fc66bfc [youtube] Add support for zwearz (Closes #9062) 2016-04-04 02:26:20 +06:00
remitamine
23576edbfc [brightcove:legacy] skip None value for uploader_id 2016-04-02 21:31:21 +01:00
remitamine
4d4cd35f48 [brightcove:legacy] extract uploader_id as a string 2016-04-02 20:55:44 +01:00
remitamine
3aac9b2fb1 [nowness] update tests 2016-04-02 18:57:15 +01:00
remitamine
e47d19e991 [brightcove:new] extract subtitles and strip video title 2016-04-02 18:57:15 +01:00
remitamine
41f5492fbc [brightcove:legacy] improve format extraction and extract uploader_id, duration and timestamp 2016-04-02 18:57:15 +01:00
Jaime Marquínez Ferrándiz
2defa7d75a [instagram:user] Fix extraction (fixes #9059)
The URL for the next page was incorrect and we always got the same page, therefore it got trapped in an infinite loop.
2016-04-02 18:03:56 +02:00
Sergey M․
bbc26c8a01 [bbc] Set vcodec to none for audio formats 2016-04-02 19:00:38 +06:00
Sergey M․
b507cc925b [extractor/common] Carry long line 2016-04-02 18:49:58 +06:00
Sergey M․
db8ee7ec05 [extractor/common] Fix numeric identifiers conversion in DASH URL templates 2016-04-02 18:48:05 +06:00
remitamine
08136dc138 [brightcove] fix format sorting 2016-04-02 10:57:57 +01:00
remitamine
fe7ef95e91 [cbsinteractive] Add support for ZDNet videos 2016-04-01 23:53:32 +01:00
remitamine
5f705baf5e [cnet] extract more formats 2016-04-01 20:42:15 +01:00
remitamine
0750b2491f [ffmpeg] try to convert tt subtitles usng dfxp2srt 2016-04-01 19:47:49 +01:00
remitamine
df634be2ed [common] prefer using mime type over ext for smil subtitle extraction
the subtitle ext for http://www.cnet.com/videos/download-amazon-prime-movies-and-tv/
is adb_xml while using the mime type it get tt(application/smptett+xml)
2016-04-01 19:47:49 +01:00
Jaime Marquínez Ferrándiz
6d628fafca [camwithher] Remove extra blank line 2016-04-01 20:45:21 +02:00
Jaime Marquínez Ferrándiz
0f28777f58 [cbsnews] Remove unused import 2016-04-01 20:43:14 +02:00
Jaime Marquínez Ferrándiz
329c1eae54 [aenetworks] Make pep8 happy 2016-04-01 20:42:19 +02:00
Sergey M․
9aaaf8e8e8 [camwithher] Improve extraction (Closes #8989) 2016-04-01 23:47:27 +06:00
theGeekPirate
04819db58e [camwithher] Add extractor
Corrected unnecessary test

Sane variable naming

RTMP all .flv & url_id for _download_webpage()

Corrected all outstanding issues, next up is a squash!
2016-04-01 23:44:25 +06:00
remitamine
79ba9140dc [theplatform] extract timestamp and uploader 2016-04-01 18:07:17 +01:00
Sergey M․
75d572e9fb [screencast] Improve title regexes (Closes #9025) 2016-04-01 23:01:55 +06:00
Martin Trigaux
791d6aaecc screencast.com: fallback on page title
When determining the title of the page, use the <title> tag of the page
2016-04-01 23:00:52 +06:00
Sergey M․
81de73e5b4 [screencast] Add test 2016-04-01 23:00:45 +06:00
Martin Trigaux
83cedc1cf2 screencast.com: support missing www
The "www." part of the URL is not mandatory
2016-04-01 22:58:16 +06:00
Sergey M․
244cd04237 [pluralsight] Remove unnecessary login/password encode 2016-04-01 22:46:46 +06:00
Sergey M․
fbdaced256 [lynda] Remove unnecessary login/password encode 2016-04-01 22:45:20 +06:00
Sergey M․
a3373823e1 [udemy] Remove unnecessary login/password encode
This is now covered by compat_urllib_parse_urlencode
2016-04-01 22:42:09 +06:00
Sergey M․
03caa463e7 [udemy:course] Skip non-video lectures 2016-04-01 22:38:56 +06:00
remitamine
3f64379eda [movieclips] fix extraction 2016-04-01 16:22:06 +01:00
remitamine
3e0c3d14d9 [cbs] add base extractor 2016-04-01 10:12:29 +01:00
remitamine
d8873d4def [aenetworks] improve format extraction 2016-04-01 09:58:02 +01:00
remitamine
db1c969da5 [theplatform] sign https urls 2016-04-01 09:58:02 +01:00
Philipp Hagemeister
1e02bc7ba2 release 2016.04.01 2016-04-01 09:07:40 +02:00
remitamine
63c55e9f22 [cbs] improve extraction(closes #6321) 2016-04-01 07:33:37 +01:00
remitamine
f9b1529af8 [generic] remove sbnation test(handled by VoxMediaIE) 2016-03-31 23:50:45 +01:00
remitamine
961fc024d2 [voxmedia] improve sbnation support 2016-03-31 23:33:36 +01:00
Sergey M․
b53a06e3b9 [udemy:course] Use new URL format 2016-04-01 02:24:22 +06:00
remitamine
4ecc1fc638 [howstuffworks] improve extraction 2016-03-31 21:11:58 +01:00
Yen Chi Hsuan
5b012dfce8 [tudou] Improve error handling (closes #8988) 2016-04-01 01:42:16 +08:00
remitamine
8369942773 [voxmedia] Add new extractor(closes #3182) 2016-03-31 18:36:41 +01:00
Sergey M․
86f3b66cec [udemy] Remove unused import 2016-03-31 23:00:11 +06:00
Sergey M․
6bb4600717 [udemy:course] Simplify course curriculum downloading 2016-03-31 22:59:19 +06:00
Sergey M․
41d06b0424 [extractor/common] Improve _request_webpage
* Do not ignore data, headers and query for Requests
* Default values for headers and query switched to dicts since these are used by urllib itself
2016-03-31 22:58:38 +06:00
Sergey M․
15d260ebaa [utils] Use update_Request in http_request 2016-03-31 22:55:49 +06:00
Sergey M․
ed0291d153 [utils] Add update_Request 2016-03-31 22:55:01 +06:00
Sergey M․
81da8cbc45 [udemy] Switch to api 2.0 (Closes #9035) 2016-03-31 22:05:25 +06:00
Sergey M․
5299bc3f91 [beeg] Switch to api v6 (Closes #9036) 2016-03-31 20:42:41 +06:00
remitamine
c9c39c22c5 [nationalgeographic] add support for channel.nationalgeographic.com urls 2016-03-31 13:47:38 +01:00
remitamine
d84b48e3f1 [nationalgeographic] improve extraction 2016-03-31 13:44:55 +01:00
remitamine
dd17041c82 [tenplay] remove extractor(fixes #6927) 2016-03-31 12:02:04 +01:00
remitamine
fea7295b14 [brightcove] relax embed_in_page regex 2016-03-31 10:48:22 +01:00
remitamine
9cf01f7f30 [nbc] add new extractor for csnne.com(#5432) 2016-03-31 00:26:42 +01:00
remitamine
ce548296fe [cnbc] fix test 2016-03-31 00:25:11 +01:00
remitamine
c02ec7d430 [cnbc] Add new extractor(closes #8012) 2016-03-30 23:18:31 +01:00
remitamine
6b820a2376 [myspace] improve extraction 2016-03-30 21:18:07 +01:00
Yen Chi Hsuan
e621a344e6 [kwuo] Port to new API and enable --cn-verification-proxy 2016-03-31 02:27:52 +08:00
Yen Chi Hsuan
3ae6f8fec1 [kwuo] Remove _sort_formats() from KuwoBaseIE._get_formats()
Following the idea proposed in 19dbaeece3
2016-03-31 02:11:21 +08:00
Yen Chi Hsuan
597d52fadb [kuwo:song] Correct song ID extraction (fixes #9033)
Bug introduced in daef04a4e7.
2016-03-31 02:00:50 +08:00
Sergey M․
afca767d19 [tumblr] Improve _VALID_URL (Closes #9027) 2016-03-30 22:26:43 +06:00
remitamine
6e359a1534 [comcarcoff] don not depend on crackle extractor(closes #8995)
previously extraction has been delegated to crackle to extract more info
and subtitles #6106 but some of the episodes can't be extracted using
crackle #8995.
2016-03-30 12:27:00 +01:00
Sergey M․
607619bc90 Add manually generated ISSUE_TEMPLATE.md
In order not to wait for the next release
2016-03-29 22:04:29 +06:00
Sergey M․
0b7bfc9422 Improve ISSUE_TEMPLATE_tmpl.md 2016-03-29 22:02:42 +06:00
Sergey M․
7168a6c874 [devscripts/make_issue_template] Fix __version__ again 2016-03-29 03:05:15 +06:00
Sergey M․
034947dd1e Rename ISSUE_TEMPLATE.tmpl in order not to be picked up by github 2016-03-29 02:48:04 +06:00
Sergey M․
3c0de33ad7 Remove ISSUE_TEMPLATE.md 2016-03-29 02:43:48 +06:00
Sergey M․
89924f8230 [devscripts/make_issue_template] Fix NameError under python3 2016-03-29 02:41:27 +06:00
Sergey M․
a39c68f7e5 Exclude make_issue_template.py from flake8 2016-03-29 02:19:24 +06:00
Sergey M․
4a5a67ca25 [devscripts/release.sh] Make ISSUE_TEMPLATE.md and commit it 2016-03-29 02:18:52 +06:00
Sergey M․
8751da85a7 [Makefile] Fix ISSUE_TEMPLATE.md target 2016-03-29 02:17:57 +06:00
Sergey M․
3bf1df51fd [devscripts/make_issue_template] Rework to use ISSUE_TEMPLATE.tmpl (Closes #8785) 2016-03-29 02:16:38 +06:00
Sergey M․
3842a3e652 Add ISSUE_TEMPLATE.tmpl as template for ISSUE_TEMPLATE.md 2016-03-29 02:15:26 +06:00
Sander van den Oever
7710bdf4e8 Add initial ISSUE_TEMPLATE
Add auto-updating of youtube-dl version in ISSUE_TEMPLATE

Move parts of template text and adopt makefile to new format

Moved the 'kind-of-issue' section and rephrased a bit

Rephrased and moved Example URL section upwards

Moved ISSUE_TEMPLATE inside .github folder.

Update makefile to match new folderstructure
2016-03-28 22:43:13 +06:00
Sergey M
8d9dd3c34b [README.md] Add format_id to the list of string meta fields available for use in format selection 2016-03-28 03:08:34 +05:00
Sergey M․
33f3040a3e [YoutubeDL] Fix sanitizing subtitles' url 2016-03-28 03:13:39 +06:00
Sergey M․
03442072c0 [pornhub] Fix typo (Closes #9008) 2016-03-28 01:21:44 +06:00
Sergey M․
c8b13fec02 [foxnews] Restore upload time fields in test 2016-03-28 01:14:12 +06:00
Sergey M․
87d105ac6c [amp] Fix upload timestamp extraction (Closes #9007) 2016-03-28 01:13:47 +06:00
Sergey M․
3454139576 [pornhub:uservideos] Add support for multipage videos (Closes #9006) 2016-03-28 00:50:46 +06:00
Sergey M․
3a23bae9cc [pornhub:playlistbase] Do not include videos not from playlist 2016-03-28 00:32:57 +06:00
Sergey M․
8f9a477e7f [pornhub:playlistbase] Use orderedSet 2016-03-28 00:21:08 +06:00
Sergey M․
a1cf3e38a3 [bbc] Extend vpid regex (Closes #9003) 2016-03-27 23:22:51 +06:00
Philipp Hagemeister
a122e7080b release 2016.03.27 2016-03-27 16:56:33 +02:00
Sergey M․
b22ca76204 [extractor/common] Filter out unsupported encrypted media for f4m formats (Closes #8573) 2016-03-27 07:42:38 +06:00
Sergey M․
f7df343b4a [downloader/f4m] Extract routine for removing unsupported encrypted media 2016-03-27 07:41:19 +06:00
Sergey M․
19dbaeece3 Remove _sort_formats from _extract_*_formats methods
Now _sort_formats should be called explicitly.
_sort_formats has been added to all the necessary places in code.

Closes #8051
2016-03-27 07:03:08 +06:00
Yen Chi Hsuan
395fd4b08a [twitter] Handle another form of embedded Vine
Fixes #8996
2016-03-27 04:36:02 +08:00
Sergey M․
8018028d0f [pluralsight] Extract chapter metadata (Closes #8993) 2016-03-27 02:10:52 +06:00
Sergey M․
00322ad4fd [lynda] Extract chapter metadata (#8993) 2016-03-27 02:00:36 +06:00
Sergey M․
4cf3489c6e [vevo] Update videoservice API URL (Closes #8900) 2016-03-27 01:11:11 +06:00
Sergey M․
b24ab3e341 [udemy] Improve paid course detection 2016-03-27 00:09:12 +06:00
Sergey M․
af4116f4f0 [udemy] Improve format_id 2016-03-27 00:02:52 +06:00
Sergey M․
f973e5d54e [udemy] Drop outputs' formats
Always results in 403
2016-03-26 23:55:07 +06:00
Sergey M․
62f55aa68a [udemy] Add outputs metadata to view_html formats 2016-03-26 23:54:12 +06:00
Sergey M․
02d7634d24 [udemy] Fix outputs' formats format_id 2016-03-26 23:43:25 +06:00
Sergey M․
48dce58ca9 [udemy] Use custom sorting 2016-03-26 23:42:46 +06:00
Sergey M․
efcba804f6 [udemy] Extract formats from view_html (Closes #8979) 2016-03-26 23:42:34 +06:00
Sergey M․
6dee688e6d [youtube:playlistsbase] Restrict playlist regex (Closes #8986) 2016-03-26 20:42:18 +06:00
Sergey M․
eedb7ba536 [YoutubeDL] Sort imports 2016-03-26 19:40:33 +06:00
Sergey M․
dcf77cf1a7 [YoutubeDL] Sanitize final URLs (Closes #8991) 2016-03-26 19:37:41 +06:00
Sergey M․
17bcc626bf [utils] Extract sanitize_url routine 2016-03-26 19:33:57 +06:00
Sergey M․
b5a5bbf376 [mailru] Extend _VALID_URL (Closes #8990) 2016-03-26 19:15:32 +06:00
Yen Chi Hsuan
e68d3a010f [twitter] Fix extraction (closes #8966)
HLS and DASH formats are no longer appeared in test cases. I keep them
for fear of triggering new errors.
2016-03-26 18:34:51 +08:00
Yen Chi Hsuan
d10fe8358c [generic] Add a test case for brightcove embed
Closes #8862
2016-03-26 18:30:43 +08:00
Yen Chi Hsuan
d6c340cae5 [brightcove] Extract more formats (#8862) 2016-03-26 18:21:07 +08:00
Yen Chi Hsuan
5964b598ff [brightcove] Support alternative BrightcoveExperience layout
The full URL lays in the `data` attribute of <object> (#8862)
2016-03-26 17:47:32 +08:00
Philipp Hagemeister
62cdb96f51 release 2016.03.26 2016-03-26 08:58:03 +01:00
Sergey M․
e289d6d62c [test_compat] Add tests for compat_urllib_parse_urlencode 2016-03-26 02:38:33 +06:00
Sergey M․
6e6bc8dae5 Use urlencode_postdata across the codebase 2016-03-26 02:19:24 +06:00
Sergey M․
15707c7e02 [compat] Add compat_urllib_parse_urlencode and eliminate encode_dict
encode_dict functionality has been improved and moved directly into compat_urllib_parse_urlencode
All occurrences of compat_urllib_parse.urlencode throughout the codebase have been replaced by compat_urllib_parse_urlencode

Closes #8974
2016-03-26 01:46:57 +06:00
Sergey M․
2156f16ca7 [thescene] Fix extraction and improve style (Closes #8978) 2016-03-25 20:14:34 +06:00
Sergey M․
4db441de72 [once] Relax _VALID_URL (Closes #8976) 2016-03-25 19:51:28 +06:00
Philipp Hagemeister
0be8314dc8 release 2016.03.25 2016-03-25 09:27:18 +01:00
Yen Chi Hsuan
d7f62b049a [iqiyi] Update enc_key 2016-03-25 15:45:40 +08:00
Yen Chi Hsuan
3bb3356812 [douyutv] Extend _VALID_URL 2016-03-25 15:43:29 +08:00
Sergey M․
3f15fec1d1 Credit @Kagami for mnet (#8958) 2016-03-25 03:56:27 +06:00
Sergey M․
98e68806fb [mnet] Improve (Closes #8958) 2016-03-25 03:26:29 +06:00
Kagami Hiiragi
e031768666 [mnet] Add new extractor 2016-03-25 02:32:06 +06:00
Sergey M․
5eb7db4ee9 [udemy] Add support for new URL schema 2016-03-25 02:28:39 +06:00
Sergey M․
f0e83681d9 [udemy] Extract formats from outputs 2016-03-25 02:27:13 +06:00
Sergey M․
ff9d5d0938 [udemy] Improve course enrolling 2016-03-25 02:26:46 +06:00
Sergey M․
d041a73674 [extractor/__init__] Add youtube:live and sort youtube extractors alphabetically 2016-03-25 01:39:25 +06:00
Sergey M․
f07e276a04 [youtube:live] Add extractor (Closes #8959) 2016-03-25 01:18:14 +06:00
Sergey M․
993271da0a [nytimes] Tolerate missing metadata (Closes #8952) 2016-03-24 23:28:24 +06:00
Sergey M․
369e7e3ff0 [iprima] Fix extraction (Closes #8953) 2016-03-24 22:54:26 +06:00
Sergey M․
5767b4eeae [mtv] Fix description extraction (Closes #8962) 2016-03-24 22:23:31 +06:00
Yen Chi Hsuan
622d19160b [utils] Clarify Python versions affected by buggy struct module 2016-03-24 18:06:15 +08:00
Yen Chi Hsuan
32d88410eb [tumblr] Add a test with Instagram embed
Closes #8817
2016-03-24 16:32:53 +08:00
Yen Chi Hsuan
5a51775a58 [generic] Extract Instagram embeds (#8817) 2016-03-24 16:32:27 +08:00
Yen Chi Hsuan
87696e78d7 [instagram] Unescape description (#8817) 2016-03-24 16:30:01 +08:00
Yen Chi Hsuan
c4096e8aea [instagram] Extract embed videos (#8817) 2016-03-24 16:29:33 +08:00
Yen Chi Hsuan
fc27ea9464 [tumblr] Support Vine embeds (#8817) 2016-03-23 23:55:52 +08:00
Yen Chi Hsuan
088e1aac59 [generic] Support Vine embeds (#8817) 2016-03-23 23:55:08 +08:00
Yen Chi Hsuan
81f36eba88 [test/test_utils] Update for escape_url change (again) 2016-03-23 23:23:26 +08:00
Yen Chi Hsuan
2d60465e44 [test/test_utils] Update for escape_url change 2016-03-23 23:20:28 +08:00
Sergey M
4333d56494 Merge pull request #8898 from dstftw/fragment-retries
Add --fragment-retries option (Fixes #8466)
2016-03-23 20:12:32 +05:00
Sergey M․
882c699296 [tunein] Fix stream data extraction (Closes #8899, closes #8924) 2016-03-23 20:45:39 +06:00
Yen Chi Hsuan
efbed08dc2 [utils] Encode hostnames before passing to urllib
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError

Fixes #8890
2016-03-23 22:24:52 +08:00
Jaime Marquínez Ferrándiz
7da2c87119 Add extractor for thescene.com (closes #8929) 2016-03-22 22:17:59 +01:00
Sergey M․
c6ca11f1b3 [once] Prevent ads from embedding into m3u8 playlists (Closes #8893) 2016-03-22 23:48:05 +06:00
Sergey M․
2beeb286e1 [laola1tv] Add support for livestreams (Closes #8934) 2016-03-22 22:32:59 +06:00
Sergey M․
cc7397b04d [ceskatelevize] Make m3u8 formats extraction non fatal (Closes #8933) 2016-03-22 21:12:29 +06:00
Sergey M․
bc5d16b302 [animeondemand] Skip dash for now 2016-03-21 23:37:39 +06:00
Sergey M․
85c637b737 [animeondemand] Extract teaser when no full episode available (#8923) 2016-03-21 23:35:50 +06:00
Sergey M․
5c69f7a479 [animeondemand] Respect startvideo (Closes #8923) 2016-03-21 23:31:40 +06:00
Sergey M․
ff5873b72d [motherless] Detect friends only videos 2016-03-21 22:24:42 +06:00
Sergey M․
065c4b27bf [xhamster:embed] Extract vars (Closes #8912) 2016-03-21 22:07:34 +06:00
Sergey M․
1600ed1ff9 [rutv] Improve flash version pattern (Closes #8911) 2016-03-21 21:46:49 +06:00
Sergey M․
5886b38d73 Add support for https for all extractors as preventive and future-proof measure 2016-03-21 21:36:32 +06:00
Sergey M․
0cef27ad25 Add missing r prefix for _VALID_URLs 2016-03-21 21:22:37 +06:00
Sergey M․
12af4beb3e [mailru] Add support for https (Closes #8920) 2016-03-21 21:17:29 +06:00
Sergey M․
9016d76f71 [YoutubeDL] Improve _format_note 2016-03-20 22:01:45 +06:00
Sergey M․
3c5d183c19 [animeondemand] Extract all formats (Closes #8906) 2016-03-20 21:51:22 +06:00
Sergey M․
3e8bb9a972 [animeondemand] Detect geo restriction 2016-03-20 20:39:00 +06:00
Yen Chi Hsuan
daef04a4e7 [kwuo] Fix KuwoChartIE and KuwoSingerIE and accept new URL forms 2016-03-20 20:17:56 +08:00
Yen Chi Hsuan
7caae128a7 Credit @vitstradal for the key algorithm in OpenloadIE (#8489)
[ci skip]
2016-03-20 19:12:02 +08:00
Yen Chi Hsuan
2648918c81 [vlive] Fix creator extraction (closes #8814) 2016-03-20 18:15:53 +08:00
Jaime Marquínez Ferrándiz
920d318d3c README: document that BSD make is also supported (#8902) 2016-03-20 10:55:14 +01:00
Yen Chi Hsuan
9e3c2f1d74 [openload] Misc improvements
* Add thumbnail
* Detect errors (#6469)
* Match more (#6469, #8489)
2016-03-20 16:49:44 +08:00
Yen Chi Hsuan
2bfeee69b9 [openload] Add new extractor (closes #8489) 2016-03-20 15:54:58 +08:00
Yen Chi Hsuan
664bcd80b9 [tudou] Use InAdvancePagedList (closes #8884) 2016-03-20 15:45:31 +08:00
Sergey M․
3c20208eff [francetv] Improve formats extraction 2016-03-20 13:00:46 +06:00
Sergey M․
db264e3cc3 [francetvinfo] Add support for france3-regions and strip title (Closes #7673) 2016-03-20 12:44:04 +06:00
Sergey M
d396f30467 Merge pull request #8902 from jaimeMF/bmake
Makefile: make it compatible with bmake
2016-03-20 11:08:57 +05:00
Sergey M․
96a9f22d98 [discovery] Relax _VALID_URL (Closes #8903) 2016-03-20 10:26:58 +06:00
Sergey M․
40025ee2a3 [postprocessort/ffmpeg] Allow embedding webvtt into webm (Closes #8874) 2016-03-20 04:12:34 +06:00
Jaime Marquínez Ferrándiz
3ff63fb365 Makefile: make it compatible with bmake
It's the portable version of BSD make: http://crufty.net/help/sjg/bmake.html
The syntax for conditionals is different in GNU make and BSD make, so we use the shell
2016-03-19 21:51:13 +01:00
Jaime Marquínez Ferrándiz
5c7cd37ebd tox.ini: Exclude test_iqiyi_sdk_interpreter.py 2016-03-19 21:50:16 +01:00
Sergey M․
298c04b464 [91porn] Use common messages' wording 2016-03-20 02:35:48 +06:00
Sergey M․
d95114dd83 [91porn] Unquote final URL (Closes #8881) 2016-03-20 02:34:02 +06:00
Sergey M․
94dcade8f8 Credit @jjatria for biobiochiletv (#7314) 2016-03-20 01:36:20 +06:00
Sergey M․
fa023ccb2c [biobiochiletv] Fix extraction, extract m3u8 formats and overall improve (Closes #7314) 2016-03-20 01:31:55 +06:00
jjatria
e36f4aa72b [biobiotv] Add extractor 2016-03-20 01:29:08 +06:00
Sergey M․
9261e347cc Credit @kasper93 for cda (#8805) 2016-03-19 23:18:04 +06:00
Sergey M․
f1ced6df51 [cda] Improve and simplify (Closes #8805) 2016-03-19 23:17:14 +06:00
Kacper Michajłow
8b0d7a66ef [cda] Add new extractor for cda.pl
Fixes #8760
2016-03-19 22:42:40 +06:00
Sergey M․
3aec71766d [safari:api] Separate extractor (Closes #8871) 2016-03-19 22:30:48 +06:00
Sergey M․
16a8b7986b [downloader/fragment] Document fragment_retries 2016-03-19 20:54:21 +06:00
Sergey M․
617e58d850 [downloader/{common,fragment}] Fix total retries reporting on python 2.6 2016-03-19 20:51:30 +06:00
Sergey M․
e33baba0dd [downloader/dash] Add fragment retry capability
YouTube may often return 404 HTTP error for a fragment causing the
whole download to fail. However if the same fragment is immediately
retried with the same request data this usually succeeds (1-2 attemps
is usually enough) thus allowing to download the whole file successfully.
So, we will retry all fragments that fail with 404 HTTP error for now.
2016-03-19 20:42:23 +06:00
Sergey M․
721f26b821 [downloader/fragment] Add report_retry_fragment 2016-03-19 20:41:24 +06:00
Sergey M․
52bb437e41 [options] Add --fragment-retries option 2016-03-19 20:40:36 +06:00
Jaime Marquínez Ferrándiz
782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string 2016-03-19 11:44:49 +01:00
Sergey M․
0d769bcb78 [extractor/generic] Fix missing byte literal prefix 2016-03-19 05:43:43 +06:00
remitamine
4cd70099ea [hbo] Add new extractor 2016-03-18 21:18:18 +01:00
Jaime Marquínez Ferrándiz
09fc33198a utils: lookup_unit_table: Use a stricter regex
In parse_count multiple units start with the same letter, so it would match different units depending on the order they were sorted when iterating over them.
2016-03-18 19:23:06 +01:00
Sergey M․
4c3b16d5d1 [test_YoutubeDL] Add test for format_id format selection 2016-03-19 00:04:26 +06:00
John Peel
d5aacf9a90 Added format_id to the filers on -f. 2016-03-18 23:59:24 +06:00
Sergey M․
19e2617a6f [commonprotocols] Add generic support for rtmp URLs (Closes #8488) 2016-03-18 23:42:15 +06:00
Sergey M․
edd9b71c2c [extractor/generic] Add a test for m3u playlist served without proper Content-Type 2016-03-18 22:49:11 +06:00
Sergey M․
5940862d5a [extractor/generic] Detect m3u playlists served without proper Content-Type 2016-03-18 22:45:28 +06:00
Sergey M․
de6c51e88e [extractor/generic] Fix direct link semantics 2016-03-18 22:43:07 +06:00
Sergey M․
303dcdb995 [extractor/generic] Simplify upload_date extraction 2016-03-18 22:41:16 +06:00
Sergey M․
20938f768b [extractor/generic] Add another test for generic m3u8 2016-03-18 21:54:33 +06:00
Sergey M․
955737b2d4 [extractor/generic] Force Content-Type to lowecase 2016-03-18 21:50:44 +06:00
Sergey M․
263eff9537 [extractor/generic] Properly extract format id from Content-Type
Fixes extraction for cases like: audio/x-mpegURL; charset=utf-8
2016-03-18 21:50:10 +06:00
Sergey M․
cae21032ab [theplatform] Improve geo restriction detection 2016-03-18 21:08:25 +06:00
remitamine
6187091532 [once] check http formats availability 2016-03-18 11:51:34 +01:00
591 changed files with 32766 additions and 12347 deletions

58
.github/ISSUE_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,58 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x])
- Use *Preview* tab to see how your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.24**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
$ youtube-dl -v <your command line>
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.09.24
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

58
.github/ISSUE_TEMPLATE_tmpl.md vendored Normal file
View File

@@ -0,0 +1,58 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x])
- Use *Preview* tab to see how your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
$ youtube-dl -v <your command line>
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

22
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,22 @@
## Please follow the guide below
- You will be asked some questions, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *pull request* (like that [x])
- Use *Preview* tab to see how your *pull request* will actually look like
---
### Before submitting a *pull request* make sure you have:
- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/rg3/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/rg3/youtube-dl#youtube-dl-coding-conventions) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
### What is the purpose of your *pull request*?
- [ ] Bug fix
- [ ] New extractor
- [ ] New feature
---
### Description of your *pull request* and other information
Explanation of your *pull request* in arbitrary form goes here. Please make sure the description explains the purpose and effect of your *pull request* and is worded well enough to be understood. Provide as much context and examples as possible.

9
.gitignore vendored
View File

@@ -13,6 +13,7 @@ README.txt
youtube-dl.1
youtube-dl.bash-completion
youtube-dl.fish
youtube_dl/extractor/lazy_extractors.py
youtube-dl
youtube-dl.exe
youtube-dl.tar.gz
@@ -27,10 +28,16 @@ updates_key.pem
*.mp4
*.m4a
*.m4v
*.mp3
*.part
*.swp
test/testdata
test/local_parameters.json
.tox
youtube-dl.zsh
# IntelliJ related files
.idea
.idea/*
*.iml
tmp/

View File

@@ -11,7 +11,6 @@ script: nosetests test --verbose
notifications:
email:
- filippo.valsorda@gmail.com
- phihag@phihag.de
- yasoob.khld@gmail.com
# irc:
# channels:

22
AUTHORS
View File

@@ -163,3 +163,25 @@ Patrick Griffis
Aidan Rowe
mutantmonkey
Ben Congdon
Kacper Michajłow
José Joaquín Atria
Viťas Strádal
Kagami Hiiragi
Philip Huppert
blahgeek
Kevin Deldycke
inondle
Tomáš Čech
Déstin Reed
Roman Tsiupa
Artur Krysiak
Jakub Adam Wieczorek
Aleksandar Topuzović
Nehal Patel
Rob van Bekkum
Petr Zvoníček
Pratyush Singh
Aleksander Nitecki
Sebastian Blunt
Matěj Cepl
Xie Yanbo

View File

@@ -46,7 +46,7 @@ Make sure that someone has not already opened the issue you're trying to open. S
### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report?
@@ -85,7 +85,7 @@ To run the test, simply invoke your favorite test runner, or execute a test file
If you want to create a build of youtube-dl yourself, you'll need
* python
* make
* make (both GNU make and BSD make are supported)
* pandoc
* zip
* nosetests
@@ -97,9 +97,17 @@ If you want to add support for a new site, first of all **make sure** this site
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
3. Start a new git branch with `cd youtube-dl; git checkout -b yourextractor`
2. Check out the source code with:
git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
3. Start a new git branch with
cd youtube-dl
git checkout -b yourextractor
4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
```python
# coding: utf-8
from __future__ import unicode_literals
@@ -140,19 +148,151 @@ After you have ensured this site is distributing it's content legally, you can f
# TODO more properties (see youtube_dl/extractor/common.py)
}
```
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](http://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/__init__.py
$ git add youtube_dl/extractor/extractors.py
$ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor
11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions!
## youtube-dl coding conventions
This section introduces a guide lines for writing idiomatic, robust and future-proof extractor code.
Extractors are very fragile by nature since they depend on the layout of the source data provided by 3rd party media hoster out of your control and this layout tend to change. As an extractor implementer your task is not only to write code that will extract media links and metadata correctly but also to minimize code dependency on source's layout changes and even to make the code foresee potential future changes and be ready for that. This is important because it will allow extractor not to break on minor layout changes thus keeping old youtube-dl versions working. Even though this breakage issue is easily fixed by emitting a new version of youtube-dl with fix incorporated all the previous version become broken in all repositories and distros' packages that may not be so prompt in fetching the update from us. Needless to say some may never receive an update at all that is possible for non rolling release distros.
### Mandatory and optional metafields
For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by [information dictionary](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) or simply *info dict*. Only the following meta fields in *info dict* are considered mandatory for successful extraction process by youtube-dl:
- `id` (media identifier)
- `title` (media title)
- `url` (media download URL) or `formats`
In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` to be mandatory. Thus aforementioned metafields are the critical data the extraction does not make any sense without and if any of them fail to be extracted then extractor is considered completely broken.
[Any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.
#### Example
Say you have some source dictionary `meta` that you've fetched as JSON with HTTP request and it has a key `summary`:
```python
meta = self._download_json(url, video_id)
```
Assume at this point `meta`'s layout is:
```python
{
...
"summary": "some fancy summary text",
...
}
```
Assume you want to extract `summary` and put into resulting info dict as `description`. Since `description` is optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
```python
description = meta.get('summary') # correct
```
and not like:
```python
description = meta['summary'] # incorrect
```
The latter will break extraction process with `KeyError` if `summary` disappears from `meta` at some time later but with former approach extraction will just go ahead with `description` set to `None` that is perfectly fine (remember `None` is equivalent for absence of data).
Similarly, you should pass `fatal=False` when extracting optional data from a webpage with `_search_regex`, `_html_search_regex` or similar methods, for instance:
```python
description = self._search_regex(
r'<span[^>]+id="title"[^>]*>([^<]+)<',
webpage, 'description', fatal=False)
```
With `fatal` set to `False` if `_search_regex` fails to extract `description` it will emit a warning and continue extraction.
You can also pass `default=<some fallback value>`, for example:
```python
description = self._search_regex(
r'<span[^>]+id="title"[^>]*>([^<]+)<',
webpage, 'description', default=None)
```
On failure this code will silently continue the extraction with `description` set to `None`. That is useful for metafields that are known to may or may not be present.
### Provide fallbacks
When extracting metadata try to provide several scenarios for that. For example if `title` is present in several places/sources try extracting from at least some of them. This would make it more future-proof in case some of the sources became unavailable.
#### Example
Say `meta` from previous example has a `title` and you are about to extract it. Since `title` is mandatory meta field you should end up with something like:
```python
title = meta['title']
```
If `title` disappeares from `meta` in future due to some changes on hoster's side the extraction would fail since `title` is mandatory. That's expected.
Assume that you have some another source you can extract `title` from, for example `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:
```python
title = meta.get('title') or self._og_search_title(webpage)
```
This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
### Make regular expressions flexible
When using regular expressions try to write them fuzzy and flexible.
#### Example
Say you need to extract `title` from the following HTML code:
```html
<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">some fancy title</span>
```
The code for that task should look similar to:
```python
title = self._search_regex(
r'<span[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
```
Or even better:
```python
title = self._search_regex(
r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)',
webpage, 'title', group='title')
```
Note how you tolerate potential changes in `style` attribute's value or switch from using double quotes to single for `class` attribute:
The code definitely should not look like:
```python
title = self._search_regex(
r'<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">(.*?)</span>',
webpage, 'title', group='title')
```
### Use safe conversion functions
Wrap all extracted numeric data into safe functions from `utils`: `int_or_none`, `float_or_none`. Use them for string to number conversions as well.

710
ChangeLog Normal file
View File

@@ -0,0 +1,710 @@
version 2016.09.24
Core
+ Add support for watchTVeverywhere.com authentication provider based MSOs for
Adobe Pass authentication (#10709)
Extractors
+ [soundcloud:playlist] Provide video id for early playlist entries (#10733)
+ [prosiebensat1] Add support for kabeleinsdoku (#10732)
* [cbs] Extract info from thunder videoPlayerService (#10728)
* [openload] Fix extraction (#10408)
+ [ustream] Support the new HLS streams (#10698)
+ [ooyala] Extract all HLS formats
+ [cartoonnetwork] Add support for Adobe Pass authentication
+ [soundcloud] Extract license metadata
+ [fox] Add support for Adobe Pass authentication (#8584)
+ [tbs] Add support for Adobe Pass authentication (#10642, #10222)
+ [trutv] Add support for Adobe Pass authentication (#10519)
+ [turner] Add support for Adobe Pass authentication
version 2016.09.19
Extractors
+ [crunchyroll] Check if already authenticated (#10700)
- [twitch:stream] Remove fallback to profile extraction when stream is offline
* [thisav] Improve title extraction (#10682)
* [vyborymos] Improve station info extraction
version 2016.09.18
Core
+ Introduce manifest_url and fragments fields in formats dictionary for
fragmented media
+ Provide manifest_url field for DASH segments, HLS and HDS
+ Provide fragments field for DASH segments
* Rework DASH segments downloader to use fragments field
+ Add helper method for Wowza Streaming Engine formats extraction
Extractors
+ [vyborymos] Add extractor for vybory.mos.ru (#10692)
+ [xfileshare] Add title regular expression for streamin.to (#10646)
+ [globo:article] Add support for multiple videos (#10653)
+ [thisav] Recognize HTML5 videos (#10447)
* [jwplatform] Improve JWPlayer detection
+ [mangomolo] Add support for Mangomolo embeds
+ [toutv] Add support for authentication (#10669)
* [franceinter] Fix upload date extraction
* [tv4] Fix HLS and HDS formats extraction (#10659)
version 2016.09.15
Core
* Improve _hidden_inputs
+ Introduce improved explicit Adobe Pass support
+ Add --ap-mso to provide multiple-system operator identifier
+ Add --ap-username to provide MSO account username
+ Add --ap-password to provide MSO account password
+ Add --ap-list-mso to list all supported MSOs
+ Add support for Rogers Cable multiple-system operator (#10606)
Extractors
* [crunchyroll] Fix authentication (#10655)
* [twitch] Fix API calls (#10654, #10660)
+ [bellmedia] Add support for more Bell Media Television sites
* [franceinter] Fix extraction (#10538, #2105)
* [kuwo] Improve error detection (#10650)
+ [go] Add support for free full episodes (#10439)
* [bilibili] Fix extraction for specific videos (#10647)
* [nhk] Fix extraction (#10633)
* [kaltura] Improve audio detection
* [kaltura] Skip chun format
+ [vimeo:ondemand] Pass Referer along with embed URL (#10624)
+ [nbc] Add support for NBC Olympics (#10361)
version 2016.09.11.1
Extractors
+ [tube8] Extract categories and tags (#10579)
+ [pornhub] Extract categories and tags (#10499)
* [openload] Temporary fix (#10408)
+ [foxnews] Add support Fox News articles (#10598)
* [viafree] Improve video id extraction (#10615)
* [iwara] Fix extraction after relaunch (#10462, #3215)
+ [tfo] Add extractor for tfo.org
* [lrt] Fix audio extraction (#10566)
* [9now] Fix extraction (#10561)
+ [canalplus] Add support for c8.fr (#10577)
* [newgrounds] Fix uploader extraction (#10584)
+ [polskieradio:category] Add support for category lists (#10576)
+ [ketnet] Add extractor for ketnet.be (#10343)
+ [canvas] Add support for een.be (#10605)
+ [telequebec] Add extractor for telequebec.tv (#1999)
* [parliamentliveuk] Fix extraction (#9137)
version 2016.09.08
Extractors
+ [jwplatform] Extract height from format label
+ [yahoo] Extract Brightcove Legacy Studio embeds (#9345)
* [videomore] Fix extraction (#10592)
* [foxgay] Fix extraction (#10480)
+ [rmcdecouverte] Add extractor for rmcdecouverte.bfmtv.com (#9709)
* [gamestar] Fix metadata extraction (#10479)
* [puls4] Fix extraction (#10583)
+ [cctv] Add extractor for CCTV and CNTV (#8153)
+ [lci] Add extractor for lci.fr (#10573)
+ [wat] Extract DASH formats
+ [viafree] Improve video id detection (#10569)
+ [trutv] Add extractor for trutv.com (#10519)
+ [nick] Add support for nickelodeon.nl (#10559)
+ [abcotvs:clips] Add support for clips.abcotvs.com
+ [abcotvs] Add support for ABC Owned Television Stations sites (#9551)
+ [miaopai] Add extractor for miaopai.com (#10556)
* [gamestar] Fix metadata extraction (#10479)
+ [bilibili] Add support for episodes (#10190)
+ [tvnoe] Add extractor for tvnoe.cz (#10524)
version 2016.09.04.1
Core
* In DASH downloader if the first segment fails, abort the whole download
process to prevent throttling (#10497)
+ Add support for --skip-unavailable-fragments and --fragment retries in
hlsnative downloader (#10165, #10448).
+ Add support for --skip-unavailable-fragments in DASH downloader
+ Introduce --skip-unavailable-fragments option for fragment based downloaders
that allows to skip fragments unavailable due to a HTTP error
* Fix extraction of video/audio entries with src attribute in
_parse_html5_media_entries (#10540)
Extractors
* [theplatform] Relax URL regular expression (#10546)
* [youtube:playlist] Extend URL regular expression
* [rottentomatoes] Delegate extraction to internetvideoarchive extractor
* [internetvideoarchive] Extract all formats
* [pornvoisines] Fix extraction (#10469)
* [rottentomatoes] Fix extraction (#10467)
* [espn] Extend URL regular expression (#10549)
* [vimple] Extend URL regular expression (#10547)
* [youtube:watchlater] Fix extraction (#10544)
* [youjizz] Fix extraction (#10437)
+ [foxnews] Add support for FoxNews Insider (#10445)
+ [fc2] Recognize Flash player URLs (#10512)
version 2016.09.03
Core
* Restore usage of NAME attribute from EXT-X-MEDIA tag for formats codes in
_extract_m3u8_formats (#10522)
* Handle semicolon in mimetype2ext
Extractors
+ [youtube] Add support for rental videos' previews (#10532)
* [youtube:playlist] Fallback to video extraction for video/playlist URLs when
no playlist is actually served (#10537)
+ [drtv] Add support for dr.dk/nyheder (#10536)
+ [facebook:plugins:video] Add extractor (#10530)
+ [go] Add extractor for *.go.com sites
* [adobepass] Check for authz_token expiration (#10527)
* [nytimes] improve extraction
* [thestar] Fix extraction (#10465)
* [glide] Fix extraction (#10478)
- [exfm] Remove extractor (#10482)
* [youporn] Fix categories and tags extraction (#10521)
+ [curiositystream] Add extractor for app.curiositystream.com
- [thvideo] Remove extractor (#10464)
* [movingimage] Fix for the new site name (#10466)
+ [cbs] Add support for once formats (#10515)
* [limelight] Skip ism snd duplicate manifests
+ [porncom] Extract categories and tags (#10510)
+ [facebook] Extract timestamp (#10508)
+ [yahoo] Extract more formats
version 2016.08.31
Extractors
* [soundcloud] Fix URL regular expression to avoid clashes with sets (#10505)
* [bandcamp:album] Fix title extraction (#10455)
* [pyvideo] Fix extraction (#10468)
+ [ctv] Add support for tsn.ca, bnn.ca and thecomedynetwork.ca (#10016)
* [9c9media] Extract more metadata
* [9c9media] Fix multiple stacks extraction (#10016)
* [adultswim] Improve video info extraction (#10492)
* [vodplatform] Improve embed regular expression
- [played] Remove extractor (#10470)
+ [tbs] Add extractor for tbs.com and tntdrama.com (#10222)
+ [cartoonnetwork] Add extractor for cartoonnetwork.com (#10110)
* [adultswim] Rework in terms of turner extractor
* [cnn] Rework in terms of turner extractor
* [nba] Rework in terms of turner extractor
+ [turner] Add base extractor for Turner Broadcasting System based sites
* [bilibili] Fix extraction (#10375)
* [openload] Fix extraction (#10408)
version 2016.08.28
Core
+ Add warning message that ffmpeg doesn't support SOCKS
* Improve thumbnail sorting
+ Extract formats from #EXT-X-MEDIA tags in _extract_m3u8_formats
* Fill IV with leading zeros for IVs shorter than 16 octets in hlsnative
+ Add ac-3 to the list of audio codecs in parse_codecs
Extractors
* [periscope:user] Fix extraction (#10453)
* [douyutv] Fix extraction (#10153, #10318, #10444)
+ [nhk:vod] Add extractor for www3.nhk.or.jp on demand (#4437, #10424)
- [trutube] Remove extractor (#10438)
+ [usanetwork] Add extractor for usanetwork.com
* [crackle] Fix extraction (#10333)
* [spankbang] Fix description and uploader extraction (#10339)
* [discoverygo] Detect cable provider restricted videos (#10425)
+ [cbc] Add support for watch.cbc.ca
* [kickstarter] Silent the warning for og:description (#10415)
* [mtvservices:embedded] Fix extraction for the new 'edge' player (#10363)
version 2016.08.24.1
Extractors
+ [pluralsight] Add support for subtitles (#9681)
version 2016.08.24
Extractors
* [youtube] Fix authentication (#10392)
* [openload] Fix extraction (#10408)
+ [bravotv] Add support for Adobe Pass (#10407)
* [bravotv] Fix clip info extraction (#10407)
* [eagleplatform] Improve embedded videos detection (#10409)
* [awaan] Fix extraction
* [mtvservices:embedded] Update config URL
+ [abc:iview] Add extractor (#6148)
version 2016.08.22
Core
* Improve formats and subtitles extension auto calculation
+ Recognize full unit names in parse_filesize
+ Add support for m3u8 manifests in HTML5 multimedia tags
* Fix octal/hexadecimal number detection in js_to_json
Extractors
+ [ivi] Add support for 720p and 1080p
+ [charlierose] Add new extractor (#10382)
* [1tv] Fix extraction (#9249)
* [twitch] Renew authentication
* [kaltura] Improve subtitles extension calculation
+ [zingmp3] Add support for video clips
* [zingmp3] Fix extraction (#10041)
* [kaltura] Improve subtitles extraction (#10279)
* [cultureunplugged] Fix extraction (#10330)
+ [cnn] Add support for money.cnn.com (#2797)
* [cbsnews] Fix extraction (#10362)
* [cbs] Fix extraction (#10393)
+ [litv] Support 'promo' URLs (#10385)
* [snotr] Fix extraction (#10338)
* [n-tv.de] Fix extraction (#10331)
* [globo:article] Relax URL and video id regular expressions (#10379)
version 2016.08.19
Core
- Remove output template description from --help
* Recognize lowercase units in parse_filesize
Extractors
+ [porncom] Add extractor for porn.com (#2251, #10251)
+ [generic] Add support for DBTV embeds
* [vk:wallpost] Fix audio extraction for new site layout
* [vk] Fix authentication
+ [hgtvcom:show] Add extractor for hgtv.com shows (#10365)
+ [discoverygo] Add support for another GO network sites
version 2016.08.17
Core
+ Add _get_netrc_login_info
Extractors
* [mofosex] Extract all formats (#10335)
+ [generic] Add support for vbox7 embeds
+ [vbox7] Add support for embed URLs
+ [viafree] Add extractor (#10358)
+ [mtg] Add support for viafree URLs (#10358)
* [theplatform] Extract all subtitles per language
+ [xvideos] Fix HLS extraction (#10356)
+ [amcnetworks] Add extractor
+ [bbc:playlist] Add support for pagination (#10349)
+ [fxnetworks] Add extractor (#9462)
* [cbslocal] Fix extraction for SendtoNews-based videos
* [sendtonews] Fix extraction
* [jwplatform] Extract video id from JWPlayer data
- [zippcast] Remove extractor (#10332)
+ [viceland] Add extractor (#8799)
+ [adobepass] Add base extractor for Adobe Pass Authentication
* [life:embed] Improve extraction
* [vgtv] Detect geo restricted videos (#10348)
+ [uplynk] Add extractor
* [xiami] Fix extraction (#10342)
version 2016.08.13
Core
* Show progress for curl external downloader
* Forward more options to curl external downloader
Extractors
* [pbs] Fix description extraction
* [franceculture] Fix extraction (#10324)
* [pornotube] Fix extraction (#10322)
* [4tube] Fix metadata extraction (#10321)
* [imgur] Fix width and height extraction (#10325)
* [expotv] Improve extraction
+ [vbox7] Fix extraction (#10309)
- [tapely] Remove extractor (#10323)
* [muenchentv] Fix extraction (#10313)
+ [24video] Add support for .me and .xxx TLDs
* [24video] Fix comment count extraction
* [sunporno] Add support for embed URLs
* [sunporno] Fix metadata extraction (#10316)
+ [hgtv] Add extractor for hgtv.ca (#3999)
- [pbs] Remove request to unavailable API
+ [pbs] Add support for high quality HTTP formats
+ [crunchyroll] Add support for HLS formats (#10301)
version 2016.08.12
Core
* Subtitles are now written as is. Newline conversions are disabled. (#10268)
+ Recognize more formats in unified_timestamp
Extractors
- [goldenmoustache] Remove extractor (#10298)
* [drtuber] Improve title extraction
* [drtuber] Make dislike count optional (#10297)
* [chirbit] Fix extraction (#10296)
* [francetvinfo] Relax URL regular expression
* [rtlnl] Relax URL regular expression (#10282)
* [formula1] Relax URL regular expression (#10283)
* [wat] Improve extraction (#10281)
* [ctsnews] Fix extraction
version 2016.08.10
Core
* Make --metadata-from-title non fatal when title does not match the pattern
* Introduce options for randomized sleep before each download
--min-sleep-interval and --max-sleep-interval (#9930)
* Respect default in _search_json_ld
Extractors
+ [uol] Add extractor for uol.com.br (#4263)
* [rbmaradio] Fix extraction and extract all formats (#10242)
+ [sonyliv] Add extractor for sonyliv.com (#10258)
* [aparat] Fix extraction
* [cwtv] Extract HTTP formats
+ [rozhlas] Add extractor for prehravac.rozhlas.cz (#10253)
* [kuwo:singer] Fix extraction
version 2016.08.07
Core
+ Add support for TV Parental Guidelines ratings in parse_age_limit
+ Add decode_png (#9706)
+ Add support for partOfTVSeries in JSON-LD
* Lower master M3U8 manifest preference for better format sorting
Extractors
+ [discoverygo] Add extractor (#10245)
* [flipagram] Make JSON-LD extraction non fatal
* [generic] Make JSON-LD extraction non fatal
+ [bbc] Add support for morph embeds (#10239)
* [tnaflixnetworkbase] Improve title extraction
* [tnaflix] Fix metadata extraction (#10249)
* [fox] Fix theplatform release URL query
* [openload] Fix extraction (#9706)
* [bbc] Skip duplicate manifest URLs
* [bbc] Improve format code
+ [bbc] Add support for DASH and F4M
* [bbc] Improve format sorting and listing
* [bbc] Improve playlist extraction
+ [pokemon] Add extractor (#10093)
+ [condenast] Add fallback scenario for video info extraction
version 2016.08.06
Core
* Add support for JSON-LD root list entries (#10203)
* Improve unified_timestamp
* Lower preference of RTSP formats in generic sorting
+ Add support for multiple properties in _og_search_property
* Improve password hiding from verbose output
Extractors
+ [adultswim] Add support for trailers (#10235)
* [archiveorg] Improve extraction (#10219)
+ [jwplatform] Add support for playlists
+ [jwplatform] Add support for relative URLs
* [jwplatform] Improve audio detection
+ [tvplay] Capture and output native error message
+ [tvplay] Extract series metadata
+ [tvplay] Add support for subtitles (#10194)
* [tvp] Improve extraction (#7799)
* [cbslocal] Fix timestamp parsing (#10213)
+ [naver] Add support for subtitles (#8096)
* [naver] Improve extraction
* [condenast] Improve extraction
* [engadget] Relax URL regular expression
* [5min] Fix extraction
+ [nationalgeographic] Add support for Episode Guide
+ [kaltura] Add support for subtitles
* [kaltura] Optimize network requests
+ [vodplatform] Add extractor for vod-platform.net
- [gamekings] Remove extractor
* [limelight] Extract HTTP formats
* [ntvru] Fix extraction
+ [comedycentral] Re-add :tds and :thedailyshow shortnames
version 2016.08.01
Fixed/improved extractors
- [yandexmusic:track] Adapt to changes in track location JSON (#10193)
- [bloomberg] Support another form of player (#10187)
- [limelight] Skip DRM protected videos
- [safari] Relax regular expressions for URL matching (#10202)
- [cwtv] Add support for cwtvpr.com (#10196)
version 2016.07.30
Fixed/improved extractors
- [twitch:clips] Sort formats
- [tv2] Use m3u8_native
- [tv2:article] Fix video detection (#10188)
- rtve (#10076)
- [dailymotion:playlist] Optimize download archive processing (#10180)
version 2016.07.28
Fixed/improved extractors
- shared (#10170)
- soundcloud (#10179)
- twitch (#9767)
version 2016.07.26.2
Fixed/improved extractors
- smotri
- camdemy
- mtv
- comedycentral
- cmt
- cbc
- mgtv
- orf
version 2016.07.24
New extractors
- arkena (#8682)
- lcp (#8682)
Fixed/improved extractors
- facebook (#10151)
- dailymail
- telegraaf
- dcn
- onet
- tvp
Miscellaneous
- Support $Time$ in DASH manifests
version 2016.07.22
New extractors
- odatv (#9285)
Fixed/improved extractors
- bbc
- youjizz (#10131)
- youtube (#10140)
- pornhub (#10138)
- eporner (#10139)
version 2016.07.17
New extractors
- nintendo (#9986)
- streamable (#9122)
Fixed/improved extractors
- ard (#10095)
- mtv
- comedycentral (#10101)
- viki (#10098)
- spike (#10106)
Miscellaneous
- Improved twitter player detection (#10090)
version 2016.07.16
New extractors
- ninenow (#5181)
Fixed/improved extractors
- rtve (#10076)
- brightcove
- 3qsdn
- syfy (#9087, #3820, #2388)
- youtube (#10083)
Miscellaneous
- Fix subtitle embedding for video-only and audio-only files (#10081)
version 2016.07.13
New extractors
- rudo
Fixed/improved extractors
- biobiochiletv
- tvplay
- dbtv
- brightcove
- tmz
- youtube (#10059)
- shahid (#10062)
- vk
- ellentv (#10067)
version 2016.07.11
New Extractors
- roosterteeth (#9864)
Fixed/improved extractors
- miomio (#9605)
- vuclip
- youtube
- vidzi (#10058)
version 2016.07.09.2
Fixed/improved extractors
- vimeo (#1638)
- facebook (#10048)
- lynda (#10047)
- animeondemand
Fixed/improved features
- Embedding subtitles no longer throws an error with problematic inputs (#9063)
version 2016.07.09.1
Fixed/improved extractors
- youtube
- ard
- srmediatek (#9373)
version 2016.07.09
New extractors
- Flipagram (#9898)
Fixed/improved extractors
- telecinco
- toutv
- radiocanada
- tweakers (#9516)
- lynda
- nick (#7542)
- polskieradio (#10028)
- le
- facebook (#9851)
- mgtv
- animeondemand (#10031)
Fixed/improved features
- `--postprocessor-args` and `--downloader-args` now accepts non-ASCII inputs
on non-Windows systems
version 2016.07.07
New extractors
- kamcord (#10001)
Fixed/improved extractors
- spiegel (#10018)
- metacafe (#8539, #3253)
- onet (#9950)
- francetv (#9955)
- brightcove (#9965)
- daum (#9972)
version 2016.07.06
Fixed/improved extractors
- youtube (#10007, #10009)
- xuite
- stitcher
- spiegel
- slideshare
- sandia
- rtvnh
- prosiebensat1
- onionstudios
version 2016.07.05
Fixed/improved extractors
- brightcove
- yahoo (#9995)
- pornhub (#9997)
- iqiyi
- kaltura (#5557)
- la7
- Changed features
- Rename --cn-verfication-proxy to --geo-verification-proxy
Miscellaneous
- Add script for displaying downloads statistics
version 2016.07.03.1
Fixed/improved extractors
- theplatform
- aenetworks
- nationalgeographic
- hrti (#9482)
- facebook (#5701)
- buzzfeed (#5701)
- rai (#8617, #9157, #9232, #8552, #8551)
- nationalgeographic (#9991)
- iqiyi
version 2016.07.03
New extractors
- hrti (#9482)
Fixed/improved extractors
- vk (#9981)
- facebook (#9938)
- xtube (#9953, #9961)
version 2016.07.02
New extractors
- fusion (#9958)
Fixed/improved extractors
- twitch (#9975)
- vine (#9970)
- periscope (#9967)
- pornhub (#8696)
version 2016.07.01
New extractors
- 9c9media
- ctvnews (#2156)
- ctv (#4077)
Fixed/Improved extractors
- rds
- meta (#8789)
- pornhub (#9964)
- sixplay (#2183)
New features
- Accept quoted strings across multiple lines (#9940)

View File

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete
@@ -12,15 +12,7 @@ SHAREDIR ?= $(PREFIX)/share
PYTHON ?= /usr/bin/env python
# set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
ifeq ($(PREFIX),/usr)
SYSCONFDIR=/etc
else
ifeq ($(PREFIX),/usr/local)
SYSCONFDIR=/etc
else
SYSCONFDIR=$(PREFIX)/etc
endif
endif
SYSCONFDIR != if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi
install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
install -d $(DESTDIR)$(BINDIR)
@@ -45,7 +37,7 @@ test:
ot: offlinetest
offlinetest: codetest
$(PYTHON) -m nose --verbose test --exclude test_download.py --exclude test_age_restriction.py --exclude test_subtitles.py --exclude test_write_annotations.py --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
$(PYTHON) -m nose --verbose test --exclude test_download.py --exclude test_age_restriction.py --exclude test_subtitles.py --exclude test_write_annotations.py --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py --exclude test_socks.py
tar: youtube-dl.tar.gz
@@ -67,6 +59,9 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
CONTRIBUTING.md: README.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
.github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md
supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
@@ -74,7 +69,7 @@ README.txt: README.md
pandoc -f markdown -t plain README.md -o README.txt
youtube-dl.1: README.md
$(PYTHON) devscripts/prepare_manpage.py >youtube-dl.1.temp.md
$(PYTHON) devscripts/prepare_manpage.py youtube-dl.1.temp.md
pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md
@@ -93,7 +88,13 @@ youtube-dl.fish: youtube_dl/*.py youtube_dl/*/*.py devscripts/fish-completion.in
fish-completion: youtube-dl.fish
youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
lazy-extractors: youtube_dl/extractor/lazy_extractors.py
_EXTRACTOR_FILES != find youtube_dl/extractor -iname '*.py' -and -not -iname 'lazy_extractors.py'
youtube_dl/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
$(PYTHON) devscripts/make_lazy_extractors.py $@
youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish ChangeLog
@tar -czf youtube-dl.tar.gz --transform "s|^|youtube-dl/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \
--exclude '*.kate-swp' \
@@ -106,7 +107,7 @@ youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-
--exclude 'docs/_build' \
-- \
bin devscripts test youtube_dl docs \
LICENSE README.md README.txt \
ChangeLog LICENSE README.md README.txt \
Makefile MANIFEST.in youtube-dl.1 youtube-dl.bash-completion \
youtube-dl.zsh youtube-dl.fish setup.py \
youtube-dl

376
README.md
View File

@@ -17,7 +17,7 @@ youtube-dl - download videos from youtube.com or other video platforms
To install it right away for all UNIX users (Linux, OS X, etc.), type:
sudo curl https://yt-dl.org/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
If you do not have curl, you can alternatively use a recent wget:
@@ -25,20 +25,26 @@ If you do not have curl, you can alternatively use a recent wget:
sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
Windows users can [download a .exe file](https://yt-dl.org/latest/youtube-dl.exe) and place it in their home directory or any other location on their [PATH](http://en.wikipedia.org/wiki/PATH_%28variable%29).
OS X users can install **youtube-dl** with [Homebrew](http://brew.sh/).
brew install youtube-dl
Windows users can [download an .exe file](https://yt-dl.org/latest/youtube-dl.exe) and place it in any location on their [PATH](http://en.wikipedia.org/wiki/PATH_%28variable%29) except for `%SYSTEMROOT%\System32` (e.g. **do not** put in `C:\Windows\System32`).
You can also use pip:
sudo pip install youtube-dl
sudo pip install --upgrade youtube-dl
This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
OS X users can install youtube-dl with [Homebrew](http://brew.sh/):
brew install youtube-dl
Or with [MacPorts](https://www.macports.org/):
sudo port install youtube-dl
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
**youtube-dl** is a command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
your Unix box, on Windows or on Mac OS X. It is released to the public domain,
@@ -73,8 +79,8 @@ which means you can modify it, redistribute it or use it however you like.
repairs broken URLs, but emits an error if
this is not possible instead of searching.
--ignore-config Do not read configuration files. When given
in the global configuration file /etc
/youtube-dl.conf: Do not read the user
in the global configuration file
/etc/youtube-dl.conf: Do not read the user
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
@@ -83,11 +89,15 @@ which means you can modify it, redistribute it or use it however you like.
--mark-watched Mark videos watched (YouTube only)
--no-mark-watched Do not mark videos watched (YouTube only)
--no-color Do not emit color codes in output
--abort-on-unavailable-fragment Abort downloading when some fragment is not
available
## Network Options:
--proxy URL Use the specified HTTP/HTTPS proxy. Pass in
an empty string (--proxy "") for direct
connection
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
To enable experimental SOCKS proxy, specify
a proper scheme. For example
socks5://127.0.0.1:1080/. Pass in an empty
string (--proxy "") for direct connection
--socket-timeout SECONDS Time to wait before giving up, in seconds
--source-address IP Client-side IP address to bind to
(experimental)
@@ -95,9 +105,9 @@ which means you can modify it, redistribute it or use it however you like.
(experimental)
-6, --force-ipv6 Make all connections via IPv6
(experimental)
--cn-verification-proxy URL Use this proxy to verify the IP address for
some Chinese sites. The default proxy
specified by --proxy (or none, if the
--geo-verification-proxy URL Use this proxy to verify the IP address for
some geo-restricted sites. The default
proxy specified by --proxy (or none, if the
options is not present) is used for the
actual downloading. (experimental)
@@ -160,10 +170,15 @@ which means you can modify it, redistribute it or use it however you like.
(experimental)
## Download Options:
-r, --rate-limit LIMIT Maximum download rate in bytes per second
-r, --limit-rate RATE Maximum download rate in bytes per second
(e.g. 50K or 4.2M)
-R, --retries RETRIES Number of retries (default is 10), or
"infinite".
--fragment-retries RETRIES Number of retries for a fragment (default
is 10), or "infinite" (DASH and hlsnative
only)
--skip-unavailable-fragments Skip unavailable fragments (DASH and
hlsnative only)
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
(default is 1024)
--no-resize-buffer Do not automatically adjust the buffer
@@ -174,7 +189,9 @@ which means you can modify it, redistribute it or use it however you like.
--xattr-set-filesize Set file xattribute ytdl.filesize with
expected filesize (experimental)
--hls-prefer-native Use the native HLS downloader instead of
ffmpeg (experimental)
ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
downloader
--hls-use-mpegts Use the mpegts container for HLS videos,
allowing to play the video while
downloading (some players may not be able
@@ -189,32 +206,8 @@ which means you can modify it, redistribute it or use it however you like.
-a, --batch-file FILE File containing URLs to download ('-' for
stdin)
--id Use only video ID in file name
-o, --output TEMPLATE Output filename template. Use %(title)s to
get the title, %(uploader)s for the
uploader name, %(uploader_id)s for the
uploader nickname if different,
%(autonumber)s to get an automatically
incremented number, %(ext)s for the
filename extension, %(format)s for the
format description (like "22 - 1280x720" or
"HD"), %(format_id)s for the unique id of
the format (like YouTube's itags: "137"),
%(upload_date)s for the upload date
(YYYYMMDD), %(extractor)s for the provider
(youtube, metacafe, etc), %(id)s for the
video id, %(playlist_title)s,
%(playlist_id)s, or %(playlist)s (=title if
present, ID otherwise) for the playlist the
video is in, %(playlist_index)s for the
position in the playlist. %(height)s and
%(width)s for the width and height of the
video format. %(resolution)s for a textual
description of the resolution of the video
format. %% for a literal percent. Use - to
output to stdout. Can also be used to
download to a different directory, for
example with -o '/my/downloads/%(uploader)s
/%(title)s-%(id)s.%(ext)s' .
-o, --output TEMPLATE Output filename template, see the "OUTPUT
TEMPLATE" for all the info
--autonumber-size NUMBER Specify the number of digits in
%(autonumber)s when it is present in output
filename template or --auto-number option
@@ -243,18 +236,19 @@ which means you can modify it, redistribute it or use it however you like.
--write-info-json Write video metadata to a .info.json file
--write-annotations Write video annotations to a
.annotations.xml file
--load-info FILE JSON file containing the video information
--load-info-json FILE JSON file containing the video information
(created with the "--write-info-json"
option)
--cookies FILE File to read cookies from and dump cookie
jar in
--cache-dir DIR Location in the filesystem where youtube-dl
can store some downloaded information
permanently. By default $XDG_CACHE_HOME
/youtube-dl or ~/.cache/youtube-dl . At the
moment, only YouTube player files (for
videos with obfuscated signatures) are
cached, but that may change.
permanently. By default
$XDG_CACHE_HOME/youtube-dl or
~/.cache/youtube-dl . At the moment, only
YouTube player files (for videos with
obfuscated signatures) are cached, but that
may change.
--no-cache-dir Disable filesystem caching
--rm-cache-dir Delete all filesystem cache files
@@ -317,7 +311,15 @@ which means you can modify it, redistribute it or use it however you like.
bidirectional text support. Requires bidiv
or fribidi executable in PATH
--sleep-interval SECONDS Number of seconds to sleep before each
download.
download when used alone or a lower bound
of a range for randomized sleep before each
download (minimum possible number of
seconds to sleep) when used along with
--max-sleep-interval.
--max-sleep-interval SECONDS Upper bound of a range for randomized sleep
before each download (maximum possible
number of seconds to sleep). Must only be
used along with --min-sleep-interval.
## Video Format Options:
-f, --format FORMAT Video format code, see the "FORMAT
@@ -356,6 +358,17 @@ which means you can modify it, redistribute it or use it however you like.
-n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, smotri, youku)
## Adobe Pass Options:
--ap-mso MSO Adobe Pass multiple-system operator (TV
provider) identifier, use --ap-list-mso for
a list of available MSOs
--ap-username USERNAME Multiple-system operator account login
--ap-password PASSWORD Multiple-system operator account password.
If this option is left out, youtube-dl will
ask interactively.
--ap-list-mso List all supported multiple-system
operators
## Post-processing Options:
-x, --extract-audio Convert video files to audio-only files
(requires ffmpeg or avconv and ffprobe or
@@ -376,8 +389,8 @@ which means you can modify it, redistribute it or use it however you like.
--no-post-overwrites Do not overwrite post-processed files; the
post-processed files are overwritten by
default
--embed-subs Embed subtitles in the video (only for mkv
and mp4 videos)
--embed-subs Embed subtitles in the video (only for mp4,
webm and mkv videos)
--embed-thumbnail Embed thumbnail in the audio as cover art
--add-metadata Write metadata to the video file
--metadata-from-title FORMAT Parse additional metadata like song title /
@@ -411,13 +424,22 @@ which means you can modify it, redistribute it or use it however you like.
# CONFIGURATION
You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`.
You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and OS X, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. Note that by default configuration file may not exist so you may need to create it yourself.
For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under `Movies` directory in your home directory:
```
# Lines starting with # are comments
# Always extract audio
-x
# Do not copy the mtime
--no-mtime
# Use this proxy
--proxy 127.0.0.1:3128
# Save all videos under Movies directory in your home directory
-o ~/Movies/%(title)s.%(ext)s
```
@@ -427,7 +449,7 @@ You can use `--ignore-config` if you want to disable the configuration file for
### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on per extractor basis. For that you will need to create a`.netrc` file in your `$HOME` and restrict permissions to read/write by you only:
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on per extractor basis. For that you will need to create a `.netrc` file in your `$HOME` and restrict permissions to read/write by you only:
```
touch $HOME/.netrc
chmod a-rwx,u+rw $HOME/.netrc
@@ -461,7 +483,7 @@ The basic usage is not to set any template arguments when downloading a single f
- `display_id`: An alternative identifier for the video
- `uploader`: Full name of the video uploader
- `license`: License name the video is licensed under
- `creator`: The main artist who created the video
- `creator`: The creator of the video
- `release_date`: The date (YYYYMMDD) when the video was released
- `timestamp`: UNIX timestamp of the moment the video became available
- `upload_date`: Video upload date (YYYYMMDD)
@@ -498,6 +520,9 @@ The basic usage is not to set any template arguments when downloading a single f
- `autonumber`: Five-digit number that will be increased with each download, starting at zero
- `playlist`: Name or id of the playlist that contains the video
- `playlist_index`: Index of the video in the playlist padded with leading zeros according to the total length of the playlist
- `playlist_id`: Playlist identifier
- `playlist_title`: Playlist title
Available for the video that belongs to some logical chapter or section:
- `chapter`: Name or title of the chapter the video belongs to
@@ -513,6 +538,18 @@ Available for the video that is an episode of some series or programme:
- `episode_number`: Number of the video episode within a season
- `episode_id`: Id of the video episode
Available for the media that is a track or a part of a music album:
- `track`: Title of the track
- `track_number`: Number of the track within an album or a disc
- `track_id`: Id of the track
- `artist`: Artist(s) of the track
- `genre`: Genre(s) of the track
- `album`: Title of the album the track belongs to
- `album_type`: Type of the album
- `album_artist`: List of all artists appeared on the album
- `disc_number`: Number of the disc or other physical medium the track belongs to
- `release_year`: Year (YYYY) when the album was released
Each aforementioned sequence when referenced in output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by particular extractor, such sequences will be replaced with `NA`.
For example for `-o %(title)s-%(id)s.%(ext)s` and mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj` this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
@@ -525,6 +562,10 @@ The current default template is `%(title)s-%(id)s.%(ext)s`.
In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:
#### Output template and Windows batch files
If you are using output template inside a Windows batch file then you must escape plain percent characters (`%`) by doubling, so that `-o "%(title)s-%(id)s.%(ext)s"` should become `-o "%%(title)s-%%(id)s.%%(ext)s"`. However you should not touch `%`'s that are not plain characters, e.g. environment variables for expansion should stay intact: `-o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s"`.
#### Output template examples
Note on Windows you may need to use double quotes instead of single.
@@ -598,6 +639,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
- `format_id`: A short description of the format
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by video hoster.
@@ -627,7 +669,11 @@ $ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
$ youtube-dl -f '(bestvideo+bestaudio/best)[protocol^=http]'
# Download the best video format and the best audio format without merging them
$ youtube-dl -f 'bestvideo,bestaudio' -o '%(title)s.f%(format_id)s.%(ext)s'
```
Note that in the last example, an output template is recommended as bestvideo and bestaudio may have the same file name.
# VIDEO SELECTION
@@ -674,12 +720,20 @@ hash -r
Again, from then on you'll be able to update with `sudo youtube-dl -U`.
### youtube-dl is extremely slow to start on Windows
Add a file exclusion for `youtube-dl.exe` in Windows Defender settings.
### I'm getting an error `Unable to extract OpenGraph title` on YouTube playlists
YouTube changed their playlist format in March 2014 and later on, so you'll need at least youtube-dl 2014.07.25 to download all YouTube videos.
If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging guys](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.
### I'm getting an error when trying to use output template: `error: using output template conflicts with using title, video ID or auto number`
Make sure you are not using `-o` with any of these options `-t`, `--title`, `--id`, `-A` or `--auto-number` set in command line or in a configuration file. Remove the latter if any.
### Do I always have to pass `-citw`?
By default, youtube-dl intends to have the best options (incidentally, if you have a convincing case that these should be different, [please file an issue where you explain that](https://yt-dl.org/bug)). Therefore, it is unnecessary and sometimes harmful to copy long option strings from webpages. In particular, the only option out of `-citw` that is regularly useful is `-i`.
@@ -700,7 +754,7 @@ Videos or video formats streamed via RTMP protocol can only be downloaded when [
### I have downloaded a video but how can I play it?
Once the video is fully downloaded, use any video player, such as [vlc](http://www.videolan.org) or [mplayer](http://www.mplayerhq.hu/).
Once the video is fully downloaded, use any video player, such as [mpv](https://mpv.io/), [vlc](http://www.videolan.org/) or [mplayer](http://www.mplayerhq.hu/).
### I extracted a video URL with `-g`, but it does not play on another machine / in my webbrowser.
@@ -757,9 +811,9 @@ means you're using an outdated version of Python. Please update to Python 2.6 or
Since June 2012 ([#342](https://github.com/rg3/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
### The exe throws a *Runtime error from Visual C++*
### The exe throws an error due to missing `MSVCR100.dll`
To run the exe you need to install first the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).
To run the exe you need to install first the [Microsoft Visual C++ 2010 Redistributable Package (x86)](https://www.microsoft.com/en-US/download/details.aspx?id=5555).
### On Windows, how should I set up ffmpeg and youtube-dl? Where should I put the exe files?
@@ -782,10 +836,42 @@ Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the opt
### How do I pass cookies to youtube-dl?
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`. Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
### How do I stream directly to media player?
You will first need to tell youtube-dl to stream media to stdout with `-o -`, and also tell your media player to read from stdin (it must be capable of this for streaming) and then pipe former to latter. For example, streaming to [vlc](http://www.videolan.org/) can be achieved with:
youtube-dl -o - "http://www.youtube.com/watch?v=BaW_jenozKcj" | vlc -
### How do I download only new videos from a playlist?
Use download-archive feature. With this feature you should initially download the complete playlist with `--download-archive /path/to/download/archive/file.txt` that will record identifiers of all the videos in a special file. Each subsequent run with the same `--download-archive` will download only new videos and skip all videos that have been downloaded before. Note that only successful downloads are recorded in the file.
For example, at first,
youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
will download the complete `PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re` playlist and create a file `archive.txt`. Each subsequent run will only download new videos if any:
youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
### Should I add `--hls-prefer-native` into my config?
When youtube-dl detects an HLS video, it can download it either with the built-in downloader or ffmpeg. Since many HLS streams are slightly invalid and ffmpeg/youtube-dl each handle some invalid cases better than the other, there is an option to switch the downloader if needed.
When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg.
In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](http://rg3.github.io/youtube-dl/supportedsites.html) cannot mandate one specific downloader.
If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case.
### Can you add support for this anime video site, or site which shows current movies for free?
As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion to youtube-dl.
@@ -814,6 +900,12 @@ It is *not* possible to detect whether a URL is supported or not. That's because
If you want to find out whether a given URL is supported, simply call youtube-dl with it. If you get no videos back, chances are the URL is either not referring to a video or unsupported. You can find out which by examining the output (if you run youtube-dl on the console) or catching an `UnsupportedError` exception if you run it from a Python program.
# Why do I need to go through that much red tape when filing bugs?
Before we had the issue template, despite our extensive [bug reporting instructions](#bugs), about 80% of the issue reports we got were useless, for instance because people used ancient versions hundreds of releases old, because of simple syntactic errors (not in youtube-dl but in general shell usage), because the problem was alrady reported multiple times before, because people did not actually read an error message, even if it said "please install ffmpeg", because people did not mention the URL they were trying to download and many more simple, easy-to-avoid problems, many of whom were totally unrelated to youtube-dl.
youtube-dl is an open-source project manned by too few volunteers, so we'd rather spend time fixing bugs where we are certain none of those simple problems apply, and where we can be reasonably confident to be able to reproduce the issue without asking the reporter repeatedly. As such, the output of `youtube-dl -v YOUR_URL_HERE` is really all that's required to file an issue. The issue template also guides you through some basic steps you can do, such as checking that your version of youtube-dl is current.
# DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
@@ -831,7 +923,7 @@ To run the test, simply invoke your favorite test runner, or execute a test file
If you want to create a build of youtube-dl yourself, you'll need
* python
* make
* make (both GNU make and BSD make are supported)
* pandoc
* zip
* nosetests
@@ -843,9 +935,17 @@ If you want to add support for a new site, first of all **make sure** this site
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
3. Start a new git branch with `cd youtube-dl; git checkout -b yourextractor`
2. Check out the source code with:
git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
3. Start a new git branch with
cd youtube-dl
git checkout -b yourextractor
4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
```python
# coding: utf-8
from __future__ import unicode_literals
@@ -886,22 +986,154 @@ After you have ensured this site is distributing it's content legally, you can f
# TODO more properties (see youtube_dl/extractor/common.py)
}
```
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](http://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/__init__.py
$ git add youtube_dl/extractor/extractors.py
$ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor
11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions!
## youtube-dl coding conventions
This section introduces a guide lines for writing idiomatic, robust and future-proof extractor code.
Extractors are very fragile by nature since they depend on the layout of the source data provided by 3rd party media hoster out of your control and this layout tend to change. As an extractor implementer your task is not only to write code that will extract media links and metadata correctly but also to minimize code dependency on source's layout changes and even to make the code foresee potential future changes and be ready for that. This is important because it will allow extractor not to break on minor layout changes thus keeping old youtube-dl versions working. Even though this breakage issue is easily fixed by emitting a new version of youtube-dl with fix incorporated all the previous version become broken in all repositories and distros' packages that may not be so prompt in fetching the update from us. Needless to say some may never receive an update at all that is possible for non rolling release distros.
### Mandatory and optional metafields
For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by [information dictionary](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) or simply *info dict*. Only the following meta fields in *info dict* are considered mandatory for successful extraction process by youtube-dl:
- `id` (media identifier)
- `title` (media title)
- `url` (media download URL) or `formats`
In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` to be mandatory. Thus aforementioned metafields are the critical data the extraction does not make any sense without and if any of them fail to be extracted then extractor is considered completely broken.
[Any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.
#### Example
Say you have some source dictionary `meta` that you've fetched as JSON with HTTP request and it has a key `summary`:
```python
meta = self._download_json(url, video_id)
```
Assume at this point `meta`'s layout is:
```python
{
...
"summary": "some fancy summary text",
...
}
```
Assume you want to extract `summary` and put into resulting info dict as `description`. Since `description` is optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
```python
description = meta.get('summary') # correct
```
and not like:
```python
description = meta['summary'] # incorrect
```
The latter will break extraction process with `KeyError` if `summary` disappears from `meta` at some time later but with former approach extraction will just go ahead with `description` set to `None` that is perfectly fine (remember `None` is equivalent for absence of data).
Similarly, you should pass `fatal=False` when extracting optional data from a webpage with `_search_regex`, `_html_search_regex` or similar methods, for instance:
```python
description = self._search_regex(
r'<span[^>]+id="title"[^>]*>([^<]+)<',
webpage, 'description', fatal=False)
```
With `fatal` set to `False` if `_search_regex` fails to extract `description` it will emit a warning and continue extraction.
You can also pass `default=<some fallback value>`, for example:
```python
description = self._search_regex(
r'<span[^>]+id="title"[^>]*>([^<]+)<',
webpage, 'description', default=None)
```
On failure this code will silently continue the extraction with `description` set to `None`. That is useful for metafields that are known to may or may not be present.
### Provide fallbacks
When extracting metadata try to provide several scenarios for that. For example if `title` is present in several places/sources try extracting from at least some of them. This would make it more future-proof in case some of the sources became unavailable.
#### Example
Say `meta` from previous example has a `title` and you are about to extract it. Since `title` is mandatory meta field you should end up with something like:
```python
title = meta['title']
```
If `title` disappeares from `meta` in future due to some changes on hoster's side the extraction would fail since `title` is mandatory. That's expected.
Assume that you have some another source you can extract `title` from, for example `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:
```python
title = meta.get('title') or self._og_search_title(webpage)
```
This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
### Make regular expressions flexible
When using regular expressions try to write them fuzzy and flexible.
#### Example
Say you need to extract `title` from the following HTML code:
```html
<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">some fancy title</span>
```
The code for that task should look similar to:
```python
title = self._search_regex(
r'<span[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
```
Or even better:
```python
title = self._search_regex(
r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)',
webpage, 'title', group='title')
```
Note how you tolerate potential changes in `style` attribute's value or switch from using double quotes to single for `class` attribute:
The code definitely should not look like:
```python
title = self._search_regex(
r'<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">(.*?)</span>',
webpage, 'title', group='title')
```
### Use safe conversion functions
Wrap all extracted numeric data into safe functions from `utils`: `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
# EMBEDDING YOUTUBE-DL
youtube-dl makes the best effort to be a good command-line program, and thus should be callable from any programming language. If you encounter any problems parsing its output, feel free to [create a report](https://github.com/rg3/youtube-dl/issues/new).
@@ -917,7 +1149,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
Most likely, you'll want to use various options. For a list of what can be done, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L121-L269). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
@@ -1008,7 +1240,7 @@ Make sure that someone has not already opened the issue you're trying to open. S
### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report?

View File

@@ -1,17 +1,38 @@
#!/usr/bin/python3
from http.server import HTTPServer, BaseHTTPRequestHandler
from socketserver import ThreadingMixIn
import argparse
import ctypes
import functools
import shutil
import subprocess
import sys
import tempfile
import threading
import traceback
import os.path
sys.path.insert(0, os.path.dirname(os.path.dirname((os.path.abspath(__file__)))))
from youtube_dl.compat import (
compat_input,
compat_http_server,
compat_str,
compat_urlparse,
)
class BuildHTTPServer(ThreadingMixIn, HTTPServer):
# These are not used outside of buildserver.py thus not in compat.py
try:
import winreg as compat_winreg
except ImportError: # Python 2
import _winreg as compat_winreg
try:
import socketserver as compat_socketserver
except ImportError: # Python 2
import SocketServer as compat_socketserver
class BuildHTTPServer(compat_socketserver.ThreadingMixIn, compat_http_server.HTTPServer):
allow_reuse_address = True
@@ -191,7 +212,7 @@ def main(args=None):
action='store_const', dest='action', const='service',
help='Run as a Windows service')
parser.add_argument('-b', '--bind', metavar='<host:port>',
action='store', default='localhost:8142',
action='store', default='0.0.0.0:8142',
help='Bind to host:port (default %default)')
options = parser.parse_args(args=args)
@@ -216,7 +237,7 @@ def main(args=None):
srv = BuildHTTPServer((host, port), BuildHTTPRequestHandler)
thr = threading.Thread(target=srv.serve_forever)
thr.start()
input('Press ENTER to shut down')
compat_input('Press ENTER to shut down')
srv.shutdown()
thr.join()
@@ -231,8 +252,6 @@ def rmtree(path):
os.remove(fname)
os.rmdir(path)
#==============================================================================
class BuildError(Exception):
def __init__(self, output, code=500):
@@ -249,15 +268,25 @@ class HTTPError(BuildError):
class PythonBuilder(object):
def __init__(self, **kwargs):
pythonVersion = kwargs.pop('python', '2.7')
try:
key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, r'SOFTWARE\Python\PythonCore\%s\InstallPath' % pythonVersion)
python_version = kwargs.pop('python', '3.4')
python_path = None
for node in ('Wow6432Node\\', ''):
try:
self.pythonPath, _ = _winreg.QueryValueEx(key, '')
finally:
_winreg.CloseKey(key)
except Exception:
raise BuildError('No such Python version: %s' % pythonVersion)
key = compat_winreg.OpenKey(
compat_winreg.HKEY_LOCAL_MACHINE,
r'SOFTWARE\%sPython\PythonCore\%s\InstallPath' % (node, python_version))
try:
python_path, _ = compat_winreg.QueryValueEx(key, '')
finally:
compat_winreg.CloseKey(key)
break
except Exception:
pass
if not python_path:
raise BuildError('No such Python version: %s' % python_version)
self.pythonPath = python_path
super(PythonBuilder, self).__init__(**kwargs)
@@ -305,8 +334,10 @@ class YoutubeDLBuilder(object):
def build(self):
try:
subprocess.check_output([os.path.join(self.pythonPath, 'python.exe'), 'setup.py', 'py2exe'],
cwd=self.buildPath)
proc = subprocess.Popen([os.path.join(self.pythonPath, 'python.exe'), 'setup.py', 'py2exe'], stdin=subprocess.PIPE, cwd=self.buildPath)
proc.wait()
#subprocess.check_output([os.path.join(self.pythonPath, 'python.exe'), 'setup.py', 'py2exe'],
# cwd=self.buildPath)
except subprocess.CalledProcessError as e:
raise BuildError(e.output)
@@ -369,12 +400,12 @@ class Builder(PythonBuilder, GITBuilder, YoutubeDLBuilder, DownloadBuilder, Clea
pass
class BuildHTTPRequestHandler(BaseHTTPRequestHandler):
class BuildHTTPRequestHandler(compat_http_server.BaseHTTPRequestHandler):
actionDict = {'build': Builder, 'download': Builder} # They're the same, no more caching.
def do_GET(self):
path = urlparse.urlparse(self.path)
paramDict = dict([(key, value[0]) for key, value in urlparse.parse_qs(path.query).items()])
path = compat_urlparse.urlparse(self.path)
paramDict = dict([(key, value[0]) for key, value in compat_urlparse.parse_qs(path.query).items()])
action, _, path = path.path.strip('/').partition('/')
if path:
path = path.split('/')
@@ -388,7 +419,7 @@ class BuildHTTPRequestHandler(BaseHTTPRequestHandler):
builder.close()
except BuildError as e:
self.send_response(e.code)
msg = unicode(e).encode('UTF-8')
msg = compat_str(e).encode('UTF-8')
self.send_header('Content-Type', 'text/plain; charset=UTF-8')
self.send_header('Content-Length', len(msg))
self.end_headers()
@@ -400,7 +431,5 @@ class BuildHTTPRequestHandler(BaseHTTPRequestHandler):
else:
self.send_response(500, 'Malformed URL')
#==============================================================================
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,111 @@
#!/usr/bin/env python
from __future__ import unicode_literals
import base64
import json
import mimetypes
import netrc
import optparse
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import (
compat_basestring,
compat_input,
compat_getpass,
compat_print,
compat_urllib_request,
)
from youtube_dl.utils import (
make_HTTPS_handler,
sanitized_Request,
)
class GitHubReleaser(object):
_API_URL = 'https://api.github.com/repos/rg3/youtube-dl/releases'
_UPLOADS_URL = 'https://uploads.github.com/repos/rg3/youtube-dl/releases/%s/assets?name=%s'
_NETRC_MACHINE = 'github.com'
def __init__(self, debuglevel=0):
self._init_github_account()
https_handler = make_HTTPS_handler({}, debuglevel=debuglevel)
self._opener = compat_urllib_request.build_opener(https_handler)
def _init_github_account(self):
try:
info = netrc.netrc().authenticators(self._NETRC_MACHINE)
if info is not None:
self._username = info[0]
self._password = info[2]
compat_print('Using GitHub credentials found in .netrc...')
return
else:
compat_print('No GitHub credentials found in .netrc')
except (IOError, netrc.NetrcParseError):
compat_print('Unable to parse .netrc')
self._username = compat_input(
'Type your GitHub username or email address and press [Return]: ')
self._password = compat_getpass(
'Type your GitHub password and press [Return]: ')
def _call(self, req):
if isinstance(req, compat_basestring):
req = sanitized_Request(req)
# Authorizing manually since GitHub does not response with 401 with
# WWW-Authenticate header set (see
# https://developer.github.com/v3/#basic-authentication)
b64 = base64.b64encode(
('%s:%s' % (self._username, self._password)).encode('utf-8')).decode('ascii')
req.add_header('Authorization', 'Basic %s' % b64)
response = self._opener.open(req).read().decode('utf-8')
return json.loads(response)
def list_releases(self):
return self._call(self._API_URL)
def create_release(self, tag_name, name=None, body='', draft=False, prerelease=False):
data = {
'tag_name': tag_name,
'target_commitish': 'master',
'name': name,
'body': body,
'draft': draft,
'prerelease': prerelease,
}
req = sanitized_Request(self._API_URL, json.dumps(data).encode('utf-8'))
return self._call(req)
def create_asset(self, release_id, asset):
asset_name = os.path.basename(asset)
url = self._UPLOADS_URL % (release_id, asset_name)
# Our files are small enough to be loaded directly into memory.
data = open(asset, 'rb').read()
req = sanitized_Request(url, data)
mime_type, _ = mimetypes.guess_type(asset_name)
req.add_header('Content-Type', mime_type or 'application/octet-stream')
return self._call(req)
def main():
parser = optparse.OptionParser(usage='%prog VERSION BUILDPATH')
options, args = parser.parse_args()
if len(args) != 2:
parser.error('Expected a version and a build directory')
version, build_path = args
releaser = GitHubReleaser()
new_release = releaser.create_release(version, name='youtube-dl %s' % version)
release_id = new_release['id']
for asset in os.listdir(build_path):
compat_print('Uploading %s...' % asset)
releaser.create_asset(release_id, os.path.join(build_path, asset))
if __name__ == '__main__':
main()

View File

@@ -15,13 +15,9 @@ data = urllib.request.urlopen(URL).read()
with open('download.html.in', 'r', encoding='utf-8') as tmplf:
template = tmplf.read()
md5sum = hashlib.md5(data).hexdigest()
sha1sum = hashlib.sha1(data).hexdigest()
sha256sum = hashlib.sha256(data).hexdigest()
template = template.replace('@PROGRAM_VERSION@', version)
template = template.replace('@PROGRAM_URL@', URL)
template = template.replace('@PROGRAM_MD5SUM@', md5sum)
template = template.replace('@PROGRAM_SHA1SUM@', sha1sum)
template = template.replace('@PROGRAM_SHA256SUM@', sha256sum)
template = template.replace('@EXE_URL@', versions_info['versions'][version]['exe'][0])
template = template.replace('@EXE_SHA256SUM@', versions_info['versions'][version]['exe'][1])

View File

@@ -0,0 +1,19 @@
# encoding: utf-8
from __future__ import unicode_literals
import re
class LazyLoadExtractor(object):
_module = None
@classmethod
def ie_key(cls):
return cls.__name__[:-2]
def __new__(cls, *args, **kwargs):
mod = __import__(cls._module, fromlist=(cls.__name__,))
real_cls = getattr(mod, cls.__name__)
instance = real_cls.__new__(real_cls)
instance.__init__(*args, **kwargs)
return instance

View File

@@ -0,0 +1,29 @@
#!/usr/bin/env python
from __future__ import unicode_literals
import io
import optparse
def main():
parser = optparse.OptionParser(usage='%prog INFILE OUTFILE')
options, args = parser.parse_args()
if len(args) != 2:
parser.error('Expected an input and an output filename')
infile, outfile = args
with io.open(infile, encoding='utf-8') as inf:
issue_template_tmpl = inf.read()
# Get the version from youtube_dl/version.py without importing the package
exec(compile(open('youtube_dl/version.py').read(),
'youtube_dl/version.py', 'exec'))
out = issue_template_tmpl % {'version': locals()['__version__']}
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,98 @@
from __future__ import unicode_literals, print_function
from inspect import getsource
import os
from os.path import dirname as dirn
import sys
print('WARNING: Lazy loading extractors is an experimental feature that may not always work', file=sys.stderr)
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
lazy_extractors_filename = sys.argv[1]
if os.path.exists(lazy_extractors_filename):
os.remove(lazy_extractors_filename)
from youtube_dl.extractor import _ALL_CLASSES
from youtube_dl.extractor.common import InfoExtractor, SearchInfoExtractor
with open('devscripts/lazy_load_template.py', 'rt') as f:
module_template = f.read()
module_contents = [
module_template + '\n' + getsource(InfoExtractor.suitable) + '\n',
'class LazyLoadSearchExtractor(LazyLoadExtractor):\n pass\n']
ie_template = '''
class {name}({bases}):
_VALID_URL = {valid_url!r}
_module = '{module}'
'''
make_valid_template = '''
@classmethod
def _make_valid_url(cls):
return {valid_url!r}
'''
def get_base_name(base):
if base is InfoExtractor:
return 'LazyLoadExtractor'
elif base is SearchInfoExtractor:
return 'LazyLoadSearchExtractor'
else:
return base.__name__
def build_lazy_ie(ie, name):
valid_url = getattr(ie, '_VALID_URL', None)
s = ie_template.format(
name=name,
bases=', '.join(map(get_base_name, ie.__bases__)),
valid_url=valid_url,
module=ie.__module__)
if ie.suitable.__func__ is not InfoExtractor.suitable.__func__:
s += '\n' + getsource(ie.suitable)
if hasattr(ie, '_make_valid_url'):
# search extractors
s += make_valid_template.format(valid_url=ie._make_valid_url())
return s
# find the correct sorting and add the required base classes so that sublcasses
# can be correctly created
classes = _ALL_CLASSES[:-1]
ordered_cls = []
while classes:
for c in classes[:]:
bases = set(c.__bases__) - set((object, InfoExtractor, SearchInfoExtractor))
stop = False
for b in bases:
if b not in classes and b not in ordered_cls:
if b.__name__ == 'GenericIE':
exit()
classes.insert(0, b)
stop = True
if stop:
break
if all(b in ordered_cls for b in bases):
ordered_cls.append(c)
classes.remove(c)
break
ordered_cls.append(_ALL_CLASSES[-1])
names = []
for ie in ordered_cls:
name = ie.__name__
src = build_lazy_ie(ie, name)
module_contents.append(src)
if ie in _ALL_CLASSES:
names.append(name)
module_contents.append(
'_ALL_CLASSES = [{0}]'.format(', '.join(names)))
module_src = '\n'.join(module_contents) + '\n'
with open(lazy_extractors_filename, 'wt') as f:
f.write(module_src)

View File

@@ -1,13 +1,46 @@
from __future__ import unicode_literals
import io
import optparse
import os.path
import sys
import re
ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
README_FILE = os.path.join(ROOT_DIR, 'README.md')
PREFIX = '''%YOUTUBE-DL(1)
# NAME
youtube\-dl \- download videos from youtube.com or other video platforms
# SYNOPSIS
**youtube-dl** \[OPTIONS\] URL [URL...]
'''
def main():
parser = optparse.OptionParser(usage='%prog OUTFILE.md')
options, args = parser.parse_args()
if len(args) != 1:
parser.error('Expected an output filename')
outfile, = args
with io.open(README_FILE, encoding='utf-8') as f:
readme = f.read()
readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
readme = PREFIX + readme
readme = filter_options(readme)
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(readme)
def filter_options(readme):
ret = ''
@@ -21,43 +54,25 @@ def filter_options(readme):
if in_options:
if line.lstrip().startswith('-'):
option, description = re.split(r'\s{2,}', line.lstrip())
split_option = option.split(' ')
split = re.split(r'\s{2,}', line.lstrip())
# Description string may start with `-` as well. If there is
# only one piece then it's a description bit not an option.
if len(split) > 1:
option, description = split
split_option = option.split(' ')
if not split_option[-1].startswith('-'): # metavar
option = ' '.join(split_option[:-1] + ['*%s*' % split_option[-1]])
if not split_option[-1].startswith('-'): # metavar
option = ' '.join(split_option[:-1] + ['*%s*' % split_option[-1]])
# Pandoc's definition_lists. See http://pandoc.org/README.html
# for more information.
ret += '\n%s\n: %s\n' % (option, description)
else:
ret += line.lstrip() + '\n'
# Pandoc's definition_lists. See http://pandoc.org/README.html
# for more information.
ret += '\n%s\n: %s\n' % (option, description)
continue
ret += line.lstrip() + '\n'
else:
ret += line + '\n'
return ret
with io.open(README_FILE, encoding='utf-8') as f:
readme = f.read()
PREFIX = '''%YOUTUBE-DL(1)
# NAME
youtube\-dl \- download videos from youtube.com or other video platforms
# SYNOPSIS
**youtube-dl** \[OPTIONS\] URL [URL...]
'''
readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
readme = PREFIX + readme
readme = filter_options(readme)
if sys.version_info < (3, 0):
print(readme.encode('utf-8'))
else:
print(readme)
if __name__ == '__main__':
main()

View File

@@ -6,7 +6,7 @@
# * the git config user.signingkey is properly set
# You will need
# pip install coverage nose rsa
# pip install coverage nose rsa wheel
# TODO
# release notes
@@ -15,10 +15,33 @@
set -e
skip_tests=true
if [ "$1" = '--run-tests' ]; then
skip_tests=false
shift
fi
gpg_sign_commits=""
buildserver='localhost:8142'
while true
do
case "$1" in
--run-tests)
skip_tests=false
shift
;;
--gpg-sign-commits|-S)
gpg_sign_commits="-S"
shift
;;
--buildserver)
buildserver="$2"
shift 2
;;
--*)
echo "ERROR: unknown option $1"
exit 1
;;
*)
break
;;
esac
done
if [ -z "$1" ]; then echo "ERROR: specify version number like this: $0 1994.09.06"; exit 1; fi
version="$1"
@@ -33,6 +56,12 @@ if [ ! -z "`git status --porcelain | grep -v CHANGELOG`" ]; then echo 'ERROR: th
useless_files=$(find youtube_dl -type f -not -name '*.py')
if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dl: $useless_files"; exit 1; fi
if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi
if ! type pandoc >/dev/null 2>/dev/null; then echo 'ERROR: pandoc is missing'; exit 1; fi
if ! python3 -c 'import rsa' 2>/dev/null; then echo 'ERROR: python3-rsa is missing'; exit 1; fi
if ! python3 -c 'import wheel' 2>/dev/null; then echo 'ERROR: wheel is missing'; exit 1; fi
read -p "Is ChangeLog up to date? (y/n) " -n 1
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
/bin/echo -e "\n### First of all, testing..."
make clean
@@ -45,10 +74,13 @@ fi
/bin/echo -e "\n### Changing version in version.py..."
sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
make README.md CONTRIBUTING.md supportedsites
git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
git commit -m "release $version"
/bin/echo -e "\n### Changing version in ChangeLog..."
sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog
git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."
git tag -s -m "Release $version" "$version"
@@ -64,7 +96,7 @@ git push origin "$version"
REV=$(git rev-parse HEAD)
make youtube-dl youtube-dl.tar.gz
read -p "VM running? (y/n) " -n 1
wget "http://localhost:8142/build/rg3/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe
wget "http://$buildserver/build/rg3/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe
mkdir -p "build/$version"
mv youtube-dl youtube-dl.exe "build/$version"
mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz"
@@ -74,15 +106,16 @@ RELEASE_FILES="youtube-dl youtube-dl.exe youtube-dl-$version.tar.gz"
(cd build/$version/ && sha256sum $RELEASE_FILES > SHA2-256SUMS)
(cd build/$version/ && sha512sum $RELEASE_FILES > SHA2-512SUMS)
/bin/echo -e "\n### Signing and uploading the new binaries to yt-dl.org ..."
/bin/echo -e "\n### Signing and uploading the new binaries to GitHub..."
for f in $RELEASE_FILES; do gpg --passphrase-repeat 5 --detach-sig "build/$version/$f"; done
scp -r "build/$version" ytdl@yt-dl.org:html/tmp/
ssh ytdl@yt-dl.org "mv html/tmp/$version html/downloads/"
ROOT=$(pwd)
python devscripts/create-github-release.py $version "$ROOT/build/$version"
ssh ytdl@yt-dl.org "sh html/update_latest.sh $version"
/bin/echo -e "\n### Now switching to gh-pages..."
git clone --branch gh-pages --single-branch . build/gh-pages
ROOT=$(pwd)
(
set -e
ORIGIN_URL=$(git config --get remote.origin.url)
@@ -94,7 +127,7 @@ ROOT=$(pwd)
"$ROOT/devscripts/gh-pages/update-copyright.py"
"$ROOT/devscripts/gh-pages/update-sites.py"
git add *.html *.html.in update
git commit -m "release $version"
git commit $gpg_sign_commits -m "release $version"
git push "$ROOT" gh-pages
git push "$ORIGIN_URL" gh-pages
)

View File

@@ -0,0 +1,47 @@
#!/usr/bin/env python
from __future__ import unicode_literals
import itertools
import json
import os
import re
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import (
compat_print,
compat_urllib_request,
)
from youtube_dl.utils import format_bytes
def format_size(bytes):
return '%s (%d bytes)' % (format_bytes(bytes), bytes)
total_bytes = 0
for page in itertools.count(1):
releases = json.loads(compat_urllib_request.urlopen(
'https://api.github.com/repos/rg3/youtube-dl/releases?page=%s' % page
).read().decode('utf-8'))
if not releases:
break
for release in releases:
compat_print(release['name'])
for asset in release['assets']:
asset_name = asset['name']
total_bytes += asset['download_count'] * asset['size']
if all(not re.match(p, asset_name) for p in (
r'^youtube-dl$',
r'^youtube-dl-\d{4}\.\d{2}\.\d{2}(?:\.\d+)?\.tar\.gz$',
r'^youtube-dl\.exe$')):
continue
compat_print(
' %s size: %s downloads: %d'
% (asset_name, format_size(asset['size']), asset['download_count']))
compat_print('total downloads traffic: %s' % format_size(total_bytes))

View File

@@ -6,15 +6,23 @@
- **22tracks:genre**
- **22tracks:track**
- **24video**
- **3qsdn**: 3Q SDN
- **3sat**
- **4tube**
- **56.com**
- **5min**
- **8tracks**
- **91porn**
- **9c9media**
- **9c9media:stack**
- **9gag**
- **9now.com.au**
- **abc.net.au**
- **Abc7News**
- **abc.net.au:iview**
- **abcnews**
- **abcnews:video**
- **abcotvs**: ABC Owned Television Stations
- **abcotvs:clips**
- **AcademicEarth:Course**
- **acast**
- **acast:channel**
@@ -25,11 +33,13 @@
- **AdobeTVVideo**
- **AdultSwim**
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network
- **AfreecaTV**: afreecatv.com
- **Aftonbladet**
- **AirMozilla**
- **AlJazeera**
- **Allocine**
- **AlphaPorno**
- **AMCNetworks**
- **AnimeOnDemand**
- **anitube.se**
- **AnySex**
@@ -40,8 +50,8 @@
- **appletrailers:section**
- **archive.org**: archive.org videos
- **ARD**
- **ARD:mediathek**: Saarländischer Rundfunk
- **ARD:mediathek**
- **Arkena**
- **arte.tv**
- **arte.tv:+7**
- **arte.tv:cinema**
@@ -50,13 +60,20 @@
- **arte.tv:ddc**
- **arte.tv:embed**
- **arte.tv:future**
- **arte.tv:info**
- **arte.tv:magazine**
- **arte.tv:playlist**
- **AtresPlayer**
- **ATTTechChannel**
- **AudiMedia**
- **AudioBoom**
- **audiomack**
- **audiomack:album**
- **auroravid**: AuroraVid
- **AWAAN**
- **awaan:live**
- **awaan:season**
- **awaan:video**
- **Azubu**
- **AzubuLive**
- **BaiduVideo**: 百度视频
@@ -67,13 +84,18 @@
- **bbc**: BBC
- **bbc.co.uk**: BBC iPlayer
- **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:playlist**
- **BeatportPro**
- **Beeg**
- **BehindKink**
- **BellMedia**
- **Bet**
- **Bigflix**
- **Bild**: Bild.de
- **BiliBili**
- **BioBioChileTV**
- **BIQLE**
- **BleacherReport**
- **BleacherReportCMS**
- **blinkx**
@@ -91,52 +113,70 @@
- **BYUtv**
- **Camdemy**
- **CamdemyFolder**
- **CamWithHer**
- **canalc2.tv**
- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
- **Canvas**
- **CBC**
- **CBCPlayer**
- **CarambaTV**
- **CarambaTVPage**
- **CartoonNetwork**
- **cbc.ca**
- **cbc.ca:player**
- **cbc.ca:watch**
- **cbc.ca:watch:video**
- **CBS**
- **CBSInteractive**
- **CBSLocal**
- **CBSNews**: CBS News
- **CBSNewsLiveVideo**: CBS News Live Videos
- **CBSSports**
- **CCTV**
- **CDA**
- **CeskaTelevize**
- **channel9**: Channel 9
- **CharlieRose**
- **Chaturbate**
- **Chilloutzone**
- **chirbit**
- **chirbit:profile**
- **Cinchcast**
- **Cinemassacre**
- **Clipfish**
- **cliphunter**
- **ClipRs**
- **Clipsyndicate**
- **CloserToTruth**
- **cloudtime**: CloudTime
- **Cloudy**
- **Clubic**
- **Clyp**
- **cmt.com**
- **CNET**
- **CNBC**
- **CNN**
- **CNNArticle**
- **CNNBlogs**
- **CollegeHumor**
- **CollegeRama**
- **ComCarCoff**
- **ComedyCentral**
- **ComedyCentralShows**: The Daily Show / The Colbert Report
- **ComedyCentralShortname**
- **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **Coub**
- **Cracked**
- **Crackle**
- **Criterion**
- **CrooksAndLiars**
- **Crunchyroll**
- **crunchyroll:playlist**
- **CSNNE**
- **CSpan**: C-SPAN
- **CtsNews**: 華視新聞
- **CTVNews**
- **culturebox.francetvinfo.fr**
- **CultureUnplugged**
- **curiositystream**
- **curiositystream:collection**
- **CWTV**
- **DailyMail**
- **dailymotion**
- **dailymotion:playlist**
- **dailymotion:user**
@@ -146,17 +186,15 @@
- **daum.net:playlist**
- **daum.net:user**
- **DBTV**
- **DCN**
- **dcn:live**
- **dcn:season**
- **dcn:video**
- **DctpTv**
- **DeezerPlaylist**
- **defense.gouv.fr**
- **democracynow**
- **DHM**: Filmarchiv - Deutsches Historisches Museum
- **DigitallySpeaking**
- **Digiteka**
- **Discovery**
- **DiscoveryGo**
- **Dotsub**
- **DouyuTV**: 斗鱼
- **DPlay**
@@ -166,7 +204,6 @@
- **Dropbox**
- **DrTuber**
- **DRTV**
- **Dump**
- **Dumpert**
- **dvtv**: http://video.aktualne.cz/
- **dw**
@@ -190,27 +227,32 @@
- **EsriVideo**
- **Europa**
- **EveryonesMixtape**
- **exfm**: ex.fm
- **ExpoTV**
- **ExtremeTube**
- **EyedoTV**
- **facebook**
- **FacebookPluginsVideo**
- **faz.net**
- **fc2**
- **fc2:embed**
- **Fczenit**
- **features.aol.com**
- **fernsehkritik.tv**
- **Firstpost**
- **FiveTV**
- **Flickr**
- **Flipagram**
- **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FootyRoom**
- **Formula1**
- **FOX**
- **Foxgay**
- **FoxNews**: Fox News and Fox Business Video
- **foxnews**: Fox News and Fox Business Video
- **foxnews:article**
- **foxnews:insider**
- **FoxSports**
- **france2.fr:generation-quoi**
- **FranceCulture**
- **FranceCultureEmission**
- **FranceInter**
- **francetv**: France 2, 3, 4, 5 and Ô
- **francetvinfo.fr**
@@ -219,14 +261,14 @@
- **FreeVideo**
- **Funimation**
- **FunnyOrDie**
- **Fusion**
- **FXNetworks**
- **GameInformer**
- **Gamekings**
- **GameOne**
- **gameone:playlist**
- **Gamersyde**
- **GameSpot**
- **GameStar**
- **Gametrailers**
- **Gazeta**
- **GDCVault**
- **generic**: Generic downloader that works on some sites
@@ -236,20 +278,25 @@
- **Glide**: Glide mobile video messages (glide.me)
- **Globo**
- **GloboArticle**
- **Go**
- **GodTube**
- **GoldenMoustache**
- **GodTV**
- **Golem**
- **GoogleDrive**
- **Goshgay**
- **GPUTechConf**
- **Groupon**
- **Hark**
- **HBO**
- **HearThisAt**
- **Heise**
- **HellPorno**
- **Helsinki**: helsinki.fi
- **HentaiStigma**
- **HGTV**
- **hgtv.com:show**
- **HistoricFilms**
- **history:topic**: History.com Topic
- **hitbox**
- **hitbox:live**
- **HornBunny**
@@ -257,6 +304,8 @@
- **HotStar**
- **Howcast**
- **HowStuffWorks**
- **HRTi**
- **HRTiPlaylist**
- **HuffPost**: Huffington Post
- **Hypem**
- **Iconosquare**
@@ -278,19 +327,21 @@
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV
- **Iwara**
- **Izlesene**
- **JadoreCettePub**
- **JeuxVideo**
- **Jove**
- **jpopsuki.tv**
- **JWPlatform**
- **Kaltura**
- **Kamcord**
- **KanalPlay**: Kanal 5/9/11 Play
- **Kankan**
- **Karaoketv**
- **KarriereVideos**
- **keek**
- **KeezMovies**
- **Ketnet**
- **KhanAcademy**
- **KickStarter**
- **KonserthusetPlay**
@@ -304,23 +355,30 @@
- **kuwo:mv**: 酷我音乐 - MV
- **kuwo:singer**: 酷我音乐 - 歌手
- **kuwo:song**: 酷我音乐
- **la7.tv**
- **la7.it**
- **Laola1Tv**
- **LCI**
- **Lcp**
- **LcpPlay**
- **Le**: 乐视网
- **Learnr**
- **Lecture2Go**
- **Lemonde**
- **LePlaylist**
- **LetvCloud**: 乐视云
- **Libsyn**
- **life**: Life.ru
- **life:embed**
- **lifenews**: LIFE | NEWS
- **limelight**
- **limelight:channel**
- **limelight:channel_list**
- **LiTV**
- **LiveLeak**
- **livestream**
- **livestream:original**
- **LnkGo**
- **loc**: Library of Congress
- **LocalNews8**
- **LoveHomePorn**
- **lrt.lt**
- **lynda**: lynda.com videos
@@ -330,41 +388,51 @@
- **mailru**: Видео@Mail.Ru
- **MakersChannel**
- **MakerTV**
- **Malemotion**
- **mangomolo:live**
- **mangomolo:video**
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
- **META**
- **metacafe**
- **Metacritic**
- **Mgoon**
- **MGTV**: 芒果TV
- **MiaoPai**
- **Minhateca**
- **MinistryGrid**
- **Minoto**
- **miomio.tv**
- **MiTele**: mitele.es
- **mixcloud**
- **mixcloud:playlist**
- **mixcloud:stream**
- **mixcloud:user**
- **MLB**
- **Mnet**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex**
- **Mojvideo**
- **Moniker**: allmyvideos.net and vidspot.net
- **mooshare**: Mooshare.biz
- **Morningstar**: morningstar.com
- **Motherless**
- **Motorsport**: motorsport.com
- **MovieClips**
- **MovieFap**
- **Moviezine**
- **MovingImage**
- **MPORA**
- **MSNBC**
- **MSN**
- **mtg**: MTG services
- **MTV**
- **mtv.de**
- **mtviggy.com**
- **mtvservices:embedded**
- **MuenchenTV**: münchen.tv
- **MusicPlayOn**
- **muzu.tv**
- **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses
- **Mwave**
- **MwaveMeetGreet**
- **MySpace**
- **MySpace:album**
- **MySpass**
@@ -372,11 +440,14 @@
- **myvideo** (Currently broken)
- **MyVidster**
- **n-tv.de**
- **NationalGeographic**
- **natgeo**
- **natgeo:episodeguide**
- **natgeo:video**
- **Naver**
- **NBA**
- **NBC**
- **NBCNews**
- **NBCOlympics**
- **NBCSports**
- **NBCSportsVPlayer**
- **ndr**: NDR.de - Norddeutscher Rundfunk
@@ -384,7 +455,6 @@
- **ndr:embed:base**
- **NDTV**
- **NerdCubedFeed**
- **Nerdist**
- **netease:album**: 网易云音乐 - 专辑
- **netease:djradio**: 网易云音乐 - 电台
- **netease:mv**: 网易云音乐 - MV
@@ -397,22 +467,24 @@
- **Newstube**
- **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞
- **nextmovie.com**
- **nfb**: National Film Board of Canada
- **nfl.com**
- **NhkVod**
- **nhl.com**
- **nhl.com:news**: NHL news
- **nhl.com:videocenter**: NHL videocenter category
- **nhl.com:videocenter**
- **nhl.com:videocenter:category**: NHL videocenter category
- **nick.com**
- **nick.de**
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **Nintendo**
- **njoy**: N-JOY
- **njoy:embed**
- **Noco**
- **Normalboots**
- **NosVideo**
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
- **novamov**: NovaMov
- **nowness**
- **nowness:playlist**
- **nowness:series**
@@ -434,12 +506,16 @@
- **NYTimes**
- **NYTimesArticle**
- **ocw.mit.edu**
- **OdaTV**
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
- **onet.tv**
- **onet.tv:channel**
- **OnionStudios**
- **Ooyala**
- **OoyalaExternal**
- **Openload**
- **OraTV**
- **orf:fm4**: radio FM4
- **orf:iptv**: iptv.ORF.at
@@ -450,15 +526,15 @@
- **Patreon**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **pcmag**
- **Periscope**: Periscope
- **People**
- **periscope**: Periscope
- **periscope:user**: Periscope user videos
- **PhilharmonieDeParis**: Philharmonie de Paris
- **phoenix.de**
- **Photobucket**
- **Pinkbike**
- **Pladform**
- **PlanetaPlay**
- **play.fm**
- **played.to**
- **PlaysTV**
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
- **Playvid**
@@ -468,13 +544,18 @@
- **plus.google**: Google Plus
- **pluzz.francetv.fr**
- **podomatic**
- **Pokemon**
- **PolskieRadio**
- **PolskieRadioCategory**
- **PornCom**
- **PornHd**
- **PornHub**
- **PornHub**: PornHub and Thumbzilla
- **PornHubPlaylist**
- **PornHubUserVideos**
- **Pornotube**
- **PornoVoisines**
- **PornoXO**
- **PressTV**
- **PrimeShareTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital
@@ -485,10 +566,12 @@
- **qqmusic:playlist**: QQ音乐 - 歌单
- **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜
- **QuickVid**
- **R7**
- **R7Article**
- **radio.de**
- **radiobremen**
- **radiocanada**
- **RadioCanadaAudioVideo**
- **radiofrance**
- **RadioJavan**
- **Rai**
@@ -498,12 +581,18 @@
- **RedTube**
- **RegioTV**
- **Restudy**
- **Reuters**
- **ReverbNation**
- **Revision3**
- **revision**
- **revision3:embed**
- **RICE**
- **RingTV**
- **RMCDecouverte**
- **RockstarGames**
- **RoosterTeeth**
- **RottenTomatoes**
- **Roxwel**
- **Rozhlas**
- **RTBF**
- **rte**: Raidió Teilifís Éireann TV
- **rte:radio**: Raidió Teilifís Éireann radio
@@ -514,7 +603,9 @@
- **rtve.es:alacarta**: RTVE a la carta
- **rtve.es:infantil**: RTVE infantil
- **rtve.es:live**: RTVE.es live streams
- **rtve.es:television**
- **RTVNH**
- **Rudo**
- **RUHD**
- **RulePorn**
- **rutube**: Rutube videos
@@ -525,6 +616,7 @@
- **RUTV**: RUTV.RU
- **Ruutu**
- **safari**: safaribooksonline.com online video
- **safari:api**
- **safari:course**: safaribooksonline.com online courses
- **Sandia**: Sandia National Laboratories
- **Sapo**: SAPO Vídeos
@@ -537,26 +629,28 @@
- **ScreencastOMatic**
- **ScreenJunkies**
- **ScreenwaveMedia**
- **Seeker**
- **SenateISVP**
- **SendtoNews**
- **ServingSys**
- **Sexu**
- **SexyKarma**: Sexy Karma and Watch Indian Porn
- **Shahid**
- **Shared**: shared.sx and vivo.sx
- **ShareSix**
- **Sina**
- **SixPlay**
- **skynewsarabia:article**
- **skynewsarabia:video**
- **skynewsarabia:video**
- **SkySports**
- **Slideshare**
- **Slutload**
- **smotri**: Smotri.com
- **smotri:broadcast**: Smotri.com broadcasts
- **smotri:community**: Smotri.com community videos
- **smotri:user**: Smotri.com user videos
- **SnagFilms**
- **SnagFilmsEmbed**
- **Snotr**
- **Sohu**
- **SonyLIV**
- **soundcloud**
- **soundcloud:playlist**
- **soundcloud:search**: Soundcloud search
@@ -580,12 +674,13 @@
- **SportBoxEmbed**
- **SportDeutschland**
- **Sportschau**
- **sr:mediathek**: Saarländischer Rundfunk
- **SRGSSR**
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
- **SSA**
- **stanfordoc**: Stanford Open ClassRoom
- **Steam**
- **Stitcher**
- **Streamable**
- **streamcloud.eu**
- **StreamCZ**
- **StreetVoice**
@@ -596,8 +691,10 @@
- **Syfy**
- **SztvHu**
- **Tagesschau**
- **Tapely**
- **tagesschau:player**
- **Tass**
- **TBS**
- **TDSLifeway**
- **teachertube**: teachertube.com videos
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
@@ -611,19 +708,19 @@
- **Telecinco**: telecinco.es, cuatro.com and mediaset.es
- **Telegraaf**
- **TeleMB**
- **TeleQuebec**
- **TeleTask**
- **TenPlay**
- **Telewebion**
- **TF1**
- **TFO**
- **TheIntercept**
- **TheOnion**
- **ThePlatform**
- **ThePlatformFeed**
- **TheScene**
- **TheSixtyOne**
- **TheStar**
- **ThisAmericanLife**
- **ThisAV**
- **THVideo**
- **THVideoPlaylist**
- **tinypic**: tinypic.com videos
- **tlc.de**
- **TMZ**
@@ -631,13 +728,13 @@
- **TNAFlix**
- **TNAFlixNetworkEmbed**
- **toggle**
- **Tosh**: Tosh.0
- **tou.tv**
- **Toypics**: Toypics user profile
- **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken)
- **Trilulilu**
- **trollvids**
- **TruTube**
- **TruTV**
- **Tube8**
- **TubiTv**
- **tudou**
@@ -659,12 +756,13 @@
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
- **tvland.com**
- **tvp.pl**
- **tvp.pl:Series**
- **TVPlay**: TV3Play and related services
- **TVNoe**
- **tvp**: Telewizja Polska
- **tvp:embed**: Telewizja Polska
- **tvp:series**
- **Tweakers**
- **twitch:bookmarks**
- **twitch:chapter**
- **twitch:clips**
- **twitch:past_broadcasts**
- **twitch:profile**
- **twitch:stream**
@@ -673,16 +771,21 @@
- **twitter**
- **twitter:amplify**
- **twitter:card**
- **Ubu**
- **udemy**
- **udemy:course**
- **UDNEmbed**: 聯合影音
- **Unistra**
- **uol.com.br**
- **uplynk**
- **uplynk:preplay**
- **Urort**: NRK P3 Urørt
- **URPlay**
- **USANetwork**
- **USAToday**
- **ustream**
- **ustream:channel**
- **Ustudio**
- **ustudio**
- **ustudio:embed**
- **Varzesh3**
- **Vbox7**
- **VeeHD**
@@ -690,10 +793,14 @@
- **Vessel**
- **Vesti**: Вести.Ru
- **Vevo**
- **VevoPlaylist**
- **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
- **vh1.com**
- **Viafree**
- **Vice**
- **Viceland**
- **ViceShow**
- **Vidbit**
- **Viddler**
- **video.google:search**: Google Video search
- **video.mit.edu**
@@ -706,12 +813,15 @@
- **VideoPremium**
- **VideoTt**: video.tt - Your True Tube (Currently broken)
- **videoweed**: VideoWeed
- **Vidio**
- **vidme**
- **vidme:user**
- **vidme:user:likes**
- **Vidzi**
- **vier**
- **vier:videos**
- **ViewLift**
- **ViewLiftEmbed**
- **Viewster**
- **Viidea**
- **viki**
@@ -730,25 +840,27 @@
- **vine:user**
- **vk**: VK
- **vk:uservideos**: VK - User's Videos
- **vk:wallpost**
- **vlive**
- **Vodlocker**
- **VODPlatform**
- **VoiceRepublic**
- **VoxMedia**
- **Vporn**
- **vpro**: npo.nl and ntr.nl
- **VRT**
- **vube**: Vube.com
- **VuClip**
- **vulture.com**
- **VyboryMos**
- **Walla**
- **WashingtonPost**
- **washingtonpost**
- **washingtonpost:article**
- **wat.tv**
- **WayOfTheMaster**
- **WatchIndianPorn**: Watch Indian Porn
- **WDR**
- **wdr:mobile**
- **WDRMaus**: Sendung mit der Maus
- **WebOfStories**
- **WebOfStoriesPlaylist**
- **Weibo**
- **WeiqiTV**: WQTV
- **wholecloud**: WholeCloud
- **Wimp**
@@ -756,12 +868,17 @@
- **WNL**
- **WorldStarHipHop**
- **wrzuta.pl**
- **wrzuta.pl:playlist**
- **WSJ**: Wall Street Journal
- **XBef**
- **XboxClips**
- **XFileShare**: XFileShare based sites: GorillaVid.in, daclips.in, movpod.in, fastvideo.in, realvid.net, filehoot.com and vidto.me
- **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE
- **XHamster**
- **XHamsterEmbed**
- **xiami:album**: 虾米音乐 - 专辑
- **xiami:artist**: 虾米音乐 - 歌手
- **xiami:collection**: 虾米音乐 - 精选集
- **xiami:song**: 虾米音乐
- **XMinus**
- **XNXX**
- **Xstream**
@@ -780,18 +897,21 @@
- **Ynet**
- **YouJizz**
- **youku**: 优酷
- **youku:show**
- **YouPorn**
- **YourUpload**
- **youtube**: YouTube.com
- **youtube:channel**: YouTube.com channels
- **youtube:favorites**: YouTube.com favourite videos, ":ytfav" for short (requires authentication)
- **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
- **youtube:live**: YouTube.com live streams
- **youtube:playlist**: YouTube.com playlists
- **youtube:playlists**: YouTube.com user/channel playlists
- **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
- **youtube:search**: YouTube.com searches
- **youtube:search:date**: YouTube.com searches, newest videos first
- **youtube:search_url**: YouTube.com search URLs
- **youtube:shared**
- **youtube:show**: YouTube.com (multi-season) shows
- **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
- **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
@@ -799,6 +919,4 @@
- **Zapiks**
- **ZDF**
- **ZDFChannel**
- **zingmp3:album**: mp3.zing.vn albums
- **zingmp3:song**: mp3.zing.vn songs
- **ZippCast**
- **zingmp3**: mp3.zing.vn

View File

@@ -2,5 +2,5 @@
universal = True
[flake8]
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,setup.py,build,.git
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
ignore = E402,E501,E731

View File

@@ -8,11 +8,12 @@ import warnings
import sys
try:
from setuptools import setup
from setuptools import setup, Command
setuptools_available = True
except ImportError:
from distutils.core import setup
from distutils.core import setup, Command
setuptools_available = False
from distutils.spawn import spawn
try:
# This will create an exe that needs Microsoft Visual C++ 2008
@@ -20,25 +21,37 @@ try:
import py2exe
except ImportError:
if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
print("Cannot import py2exe", file=sys.stderr)
print('Cannot import py2exe', file=sys.stderr)
exit(1)
py2exe_options = {
"bundle_files": 1,
"compressed": 1,
"optimize": 2,
"dist_dir": '.',
"dll_excludes": ['w9xpopen.exe', 'crypt32.dll'],
'bundle_files': 1,
'compressed': 1,
'optimize': 2,
'dist_dir': '.',
'dll_excludes': ['w9xpopen.exe', 'crypt32.dll'],
}
# Get the version from youtube_dl/version.py without importing the package
exec(compile(open('youtube_dl/version.py').read(),
'youtube_dl/version.py', 'exec'))
DESCRIPTION = 'YouTube video downloader'
LONG_DESCRIPTION = 'Command-line program to download videos from YouTube.com and other video sites'
py2exe_console = [{
"script": "./youtube_dl/__main__.py",
"dest_base": "youtube-dl",
'script': './youtube_dl/__main__.py',
'dest_base': 'youtube-dl',
'version': __version__,
'description': DESCRIPTION,
'comments': LONG_DESCRIPTION,
'product_name': 'youtube-dl',
'product_version': __version__,
}]
py2exe_params = {
'console': py2exe_console,
'options': {"py2exe": py2exe_options},
'options': {'py2exe': py2exe_options},
'zipfile': None
}
@@ -70,16 +83,27 @@ else:
else:
params['scripts'] = ['bin/youtube-dl']
# Get the version from youtube_dl/version.py without importing the package
exec(compile(open('youtube_dl/version.py').read(),
'youtube_dl/version.py', 'exec'))
class build_lazy_extractors(Command):
description = 'Build the extractor lazy loading module'
user_options = []
def initialize_options(self):
pass
def finalize_options(self):
pass
def run(self):
spawn(
[sys.executable, 'devscripts/make_lazy_extractors.py', 'youtube_dl/extractor/lazy_extractors.py'],
dry_run=self.dry_run,
)
setup(
name='youtube_dl',
version=__version__,
description='YouTube video downloader',
long_description='Small command-line program to download videos from'
' YouTube.com and other video sites.',
description=DESCRIPTION,
long_description=LONG_DESCRIPTION,
url='https://github.com/rg3/youtube-dl',
author='Ricardo Garcia',
author_email='ytdl@yt-dl.org',
@@ -95,17 +119,19 @@ setup(
# test_requires = ['nosetest'],
classifiers=[
"Topic :: Multimedia :: Video",
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: Public Domain",
"Programming Language :: Python :: 2.6",
"Programming Language :: Python :: 2.7",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.2",
"Programming Language :: Python :: 3.3",
"Programming Language :: Python :: 3.4",
'Topic :: Multimedia :: Video',
'Development Status :: 5 - Production/Stable',
'Environment :: Console',
'License :: Public Domain',
'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
],
cmdclass={'build_lazy_extractors': build_lazy_extractors},
**params
)

View File

@@ -24,8 +24,13 @@ from youtube_dl.utils import (
def get_params(override=None):
PARAMETERS_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)),
"parameters.json")
LOCAL_PARAMETERS_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)),
"local_parameters.json")
with io.open(PARAMETERS_FILE, encoding='utf-8') as pf:
parameters = json.load(pf)
if os.path.exists(LOCAL_PARAMETERS_FILE):
with io.open(LOCAL_PARAMETERS_FILE, encoding='utf-8') as pf:
parameters.update(json.load(pf))
if override:
parameters.update(override)
return parameters
@@ -143,6 +148,9 @@ def expect_value(self, got, expected, field):
expect_value(self, item_got, item_expected, field)
else:
if isinstance(expected, compat_str) and expected.startswith('md5:'):
self.assertTrue(
isinstance(got, compat_str),
'Expected field %s to be a unicode object, but got value %r of type %r' % (field, got, type(got)))
got = 'md5:' + md5(got)
elif isinstance(expected, compat_str) and expected.startswith('mincount:'):
self.assertTrue(

View File

@@ -11,6 +11,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL
from youtube_dl.extractor.common import InfoExtractor
from youtube_dl.extractor import YoutubeIE, get_info_extractor
from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError
class TestIE(InfoExtractor):
@@ -47,6 +48,9 @@ class TestInfoExtractor(unittest.TestCase):
self.assertEqual(ie._og_search_property('foobar', html), 'Foo')
self.assertEqual(ie._og_search_property('test1', html), 'foo > < bar')
self.assertEqual(ie._og_search_property('test2', html), 'foo >//< bar')
self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar')
self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True)
self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True)
def test_html_search_meta(self):
ie = self.ie
@@ -65,6 +69,20 @@ class TestInfoExtractor(unittest.TestCase):
self.assertEqual(ie._html_search_meta('d', html), '4')
self.assertEqual(ie._html_search_meta('e', html), '5')
self.assertEqual(ie._html_search_meta('f', html), '6')
self.assertEqual(ie._html_search_meta(('a', 'b', 'c'), html), '1')
self.assertEqual(ie._html_search_meta(('c', 'b', 'a'), html), '3')
self.assertEqual(ie._html_search_meta(('z', 'x', 'c'), html), '3')
self.assertRaises(RegexNotFoundError, ie._html_search_meta, 'z', html, None, fatal=True)
self.assertRaises(RegexNotFoundError, ie._html_search_meta, ('z', 'x'), html, None, fatal=True)
def test_download_json(self):
uri = encode_data_uri(b'{"foo": "blah"}', 'application/json')
self.assertEqual(self.ie._download_json(uri, None), {'foo': 'blah'})
uri = encode_data_uri(b'callback({"foo": "blah"})', 'application/javascript')
self.assertEqual(self.ie._download_json(uri, None, transform_source=strip_jsonp), {'foo': 'blah'})
uri = encode_data_uri(b'{"foo": invalid}', 'application/json')
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -222,6 +222,11 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'dash-video-low')
ydl = YDL({'format': 'bestvideo[format_id^=dash][format_id$=low]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'dash-video-low')
formats = [
{'format_id': 'vid-vcodec-dot', 'ext': 'mp4', 'preference': 1, 'vcodec': 'avc1.123456', 'acodec': 'none', 'url': TEST_URL},
]
@@ -330,6 +335,40 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], f1['format_id'])
def test_audio_only_extractor_format_selection(self):
# For extractors with incomplete formats (all formats are audio-only or
# video-only) best and worst should fallback to corresponding best/worst
# video-only or audio-only formats (as per
# https://github.com/rg3/youtube-dl/pull/5556)
formats = [
{'format_id': 'low', 'ext': 'mp3', 'preference': 1, 'vcodec': 'none', 'url': TEST_URL},
{'format_id': 'high', 'ext': 'mp3', 'preference': 2, 'vcodec': 'none', 'url': TEST_URL},
]
info_dict = _make_result(formats)
ydl = YDL({'format': 'best'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'high')
ydl = YDL({'format': 'worst'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'low')
def test_format_not_available(self):
formats = [
{'format_id': 'regular', 'ext': 'mp4', 'height': 360, 'url': TEST_URL},
{'format_id': 'video', 'ext': 'mp4', 'height': 720, 'acodec': 'none', 'url': TEST_URL},
]
info_dict = _make_result(formats)
# This must fail since complete video-audio format does not match filter
# and extractor does not provide incomplete only formats (i.e. only
# video-only or audio-only).
ydl = YDL({'format': 'best[height>360]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
def test_invalid_format_specs(self):
def assert_syntax_error(format_spec):
ydl = YDL({'format': format_spec})

View File

@@ -6,6 +6,7 @@ from __future__ import unicode_literals
import os
import sys
import unittest
import collections
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@@ -100,8 +101,6 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch(':ytsubs', ['youtube:subscriptions'])
self.assertMatch(':ytsubscriptions', ['youtube:subscriptions'])
self.assertMatch(':ythistory', ['youtube:history'])
self.assertMatch(':thedailyshow', ['ComedyCentralShows'])
self.assertMatch(':tds', ['ComedyCentralShows'])
def test_vimeo_matching(self):
self.assertMatch('https://vimeo.com/channels/tributes', ['vimeo:channel'])
@@ -130,6 +129,15 @@ class TestAllURLsMatching(unittest.TestCase):
'https://screen.yahoo.com/smartwatches-latest-wearable-gadgets-163745379-cbs.html',
['Yahoo'])
def test_no_duplicated_ie_names(self):
name_accu = collections.defaultdict(list)
for ie in self.ies:
name_accu[ie.IE_NAME.lower()].append(type(ie).__name__)
for (ie_name, ie_list) in name_accu.items():
self.assertEqual(
len(ie_list), 1,
'Multiple extractors with the same IE_NAME "%s" (%s)' % (ie_name, ', '.join(ie_list)))
if __name__ == '__main__':
unittest.main()

View File

@@ -10,34 +10,39 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.utils import get_filesystem_encoding
from youtube_dl.compat import (
compat_getenv,
compat_setenv,
compat_etree_fromstring,
compat_expanduser,
compat_shlex_split,
compat_str,
compat_struct_unpack,
compat_urllib_parse_unquote,
compat_urllib_parse_unquote_plus,
compat_urllib_parse_urlencode,
)
class TestCompat(unittest.TestCase):
def test_compat_getenv(self):
test_str = 'тест'
os.environ['YOUTUBE-DL-TEST'] = (
test_str if sys.version_info >= (3, 0)
else test_str.encode(get_filesystem_encoding()))
compat_setenv('YOUTUBE-DL-TEST', test_str)
self.assertEqual(compat_getenv('YOUTUBE-DL-TEST'), test_str)
def test_compat_setenv(self):
test_var = 'YOUTUBE-DL-TEST'
test_str = 'тест'
compat_setenv(test_var, test_str)
compat_getenv(test_var)
self.assertEqual(compat_getenv(test_var), test_str)
def test_compat_expanduser(self):
old_home = os.environ.get('HOME')
test_str = 'C:\Documents and Settings\тест\Application Data'
os.environ['HOME'] = (
test_str if sys.version_info >= (3, 0)
else test_str.encode(get_filesystem_encoding()))
compat_setenv('HOME', test_str)
self.assertEqual(compat_expanduser('~'), test_str)
os.environ['HOME'] = old_home
compat_setenv('HOME', old_home or '')
def test_all_present(self):
import youtube_dl.compat
@@ -70,8 +75,20 @@ class TestCompat(unittest.TestCase):
self.assertEqual(compat_urllib_parse_unquote_plus('abc%20def'), 'abc def')
self.assertEqual(compat_urllib_parse_unquote_plus('%7e/abc+def'), '~/abc def')
def test_compat_urllib_parse_urlencode(self):
self.assertEqual(compat_urllib_parse_urlencode({'abc': 'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({'abc': b'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({b'abc': 'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({b'abc': b'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([('abc', 'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([('abc', b'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([(b'abc', 'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([(b'abc', b'def')]), 'abc=def')
def test_compat_shlex_split(self):
self.assertEqual(compat_shlex_split('-option "one two"'), ['-option', 'one two'])
self.assertEqual(compat_shlex_split('-option "one\ntwo" \n -flag'), ['-option', 'one\ntwo', '-flag'])
self.assertEqual(compat_shlex_split('-val 中文'), ['-val', '中文'])
def test_compat_etree_fromstring(self):
xml = '''
@@ -88,5 +105,15 @@ class TestCompat(unittest.TestCase):
self.assertTrue(isinstance(doc.find('chinese').text, compat_str))
self.assertTrue(isinstance(doc.find('foo/bar').text, compat_str))
def test_compat_etree_fromstring_doctype(self):
xml = '''<?xml version="1.0"?>
<!DOCTYPE smil PUBLIC "-//W3C//DTD SMIL 2.0//EN" "http://www.w3.org/2001/SMIL20/SMIL20.dtd">
<smil xmlns="http://www.w3.org/2001/SMIL20/Language"></smil>'''
compat_etree_fromstring(xml)
def test_struct_unpack(self):
self.assertEqual(compat_struct_unpack('!B', b'\x00'), (0,))
if __name__ == '__main__':
unittest.main()

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
@@ -15,6 +16,15 @@ import threading
TEST_DIR = os.path.dirname(os.path.abspath(__file__))
def http_server_port(httpd):
if os.name == 'java' and isinstance(httpd.socket, ssl.SSLSocket):
# In Jython SSLSocket is not a subclass of socket.socket
sock = httpd.socket.sock
else:
sock = httpd.socket
return sock.getsockname()[1]
class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
def log_message(self, format, *args):
pass
@@ -30,6 +40,22 @@ class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
self.send_header('Content-Type', 'video/mp4')
self.end_headers()
self.wfile.write(b'\x00\x00\x00\x00\x20\x66\x74[video]')
elif self.path == '/302':
if sys.version_info[0] == 3:
# XXX: Python 3 http server does not allow non-ASCII header values
self.send_response(404)
self.end_headers()
return
new_url = 'http://localhost:%d/中文.html' % http_server_port(self.server)
self.send_response(302)
self.send_header(b'Location', new_url.encode('utf-8'))
self.end_headers()
elif self.path == '/%E4%B8%AD%E6%96%87.html':
self.send_response(200)
self.send_header('Content-Type', 'text/html; charset=utf-8')
self.end_headers()
self.wfile.write(b'<html><video src="/vid.mp4" /></html>')
else:
assert False
@@ -46,18 +72,32 @@ class FakeLogger(object):
class TestHTTP(unittest.TestCase):
def setUp(self):
self.httpd = compat_http_server.HTTPServer(
('localhost', 0), HTTPTestRequestHandler)
self.port = http_server_port(self.httpd)
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True
self.server_thread.start()
def test_unicode_path_redirection(self):
# XXX: Python 3 http server does not allow non-ASCII header values
if sys.version_info[0] == 3:
return
ydl = YoutubeDL({'logger': FakeLogger()})
r = ydl.extract_info('http://localhost:%d/302' % self.port)
self.assertEqual(r['url'], 'http://localhost:%d/vid.mp4' % self.port)
class TestHTTPS(unittest.TestCase):
def setUp(self):
certfn = os.path.join(TEST_DIR, 'testcert.pem')
self.httpd = compat_http_server.HTTPServer(
('localhost', 0), HTTPTestRequestHandler)
self.httpd.socket = ssl.wrap_socket(
self.httpd.socket, certfile=certfn, server_side=True)
if os.name == 'java':
# In Jython SSLSocket is not a subclass of socket.socket
sock = self.httpd.socket.sock
else:
sock = self.httpd.socket
self.port = sock.getsockname()[1]
self.port = http_server_port(self.httpd)
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True
self.server_thread.start()
@@ -93,32 +133,41 @@ class TestProxy(unittest.TestCase):
def setUp(self):
self.proxy = compat_http_server.HTTPServer(
('localhost', 0), _build_proxy_handler('normal'))
self.port = self.proxy.socket.getsockname()[1]
self.port = http_server_port(self.proxy)
self.proxy_thread = threading.Thread(target=self.proxy.serve_forever)
self.proxy_thread.daemon = True
self.proxy_thread.start()
self.cn_proxy = compat_http_server.HTTPServer(
('localhost', 0), _build_proxy_handler('cn'))
self.cn_port = self.cn_proxy.socket.getsockname()[1]
self.cn_proxy_thread = threading.Thread(target=self.cn_proxy.serve_forever)
self.cn_proxy_thread.daemon = True
self.cn_proxy_thread.start()
self.geo_proxy = compat_http_server.HTTPServer(
('localhost', 0), _build_proxy_handler('geo'))
self.geo_port = http_server_port(self.geo_proxy)
self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever)
self.geo_proxy_thread.daemon = True
self.geo_proxy_thread.start()
def test_proxy(self):
cn_proxy = 'localhost:{0}'.format(self.cn_port)
geo_proxy = 'localhost:{0}'.format(self.geo_port)
ydl = YoutubeDL({
'proxy': 'localhost:{0}'.format(self.port),
'cn_verification_proxy': cn_proxy,
'geo_verification_proxy': geo_proxy,
})
url = 'http://foo.com/bar'
response = ydl.urlopen(url).read().decode('utf-8')
self.assertEqual(response, 'normal: {0}'.format(url))
req = compat_urllib_request.Request(url)
req.add_header('Ytdl-request-proxy', cn_proxy)
req.add_header('Ytdl-request-proxy', geo_proxy)
response = ydl.urlopen(req).read().decode('utf-8')
self.assertEqual(response, 'cn: {0}'.format(url))
self.assertEqual(response, 'geo: {0}'.format(url))
def test_proxy_with_idn(self):
ydl = YoutubeDL({
'proxy': 'localhost:{0}'.format(self.port),
})
url = 'http://中文.tw/'
response = ydl.urlopen(url).read().decode('utf-8')
# b'xn--fiq228c' is '中文'.encode('idna')
self.assertEqual(response, 'normal: http://xn--fiq228c.tw/')
if __name__ == '__main__':
unittest.main()

118
test/test_socks.py Normal file
View File

@@ -0,0 +1,118 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import random
import subprocess
from test.helper import (
FakeYDL,
get_params,
)
from youtube_dl.compat import (
compat_str,
compat_urllib_request,
)
class TestMultipleSocks(unittest.TestCase):
@staticmethod
def _check_params(attrs):
params = get_params()
for attr in attrs:
if attr not in params:
print('Missing %s. Skipping.' % attr)
return
return params
def test_proxy_http(self):
params = self._check_params(['primary_proxy', 'primary_server_ip'])
if params is None:
return
ydl = FakeYDL({
'proxy': params['primary_proxy']
})
self.assertEqual(
ydl.urlopen('http://yt-dl.org/ip').read().decode('utf-8'),
params['primary_server_ip'])
def test_proxy_https(self):
params = self._check_params(['primary_proxy', 'primary_server_ip'])
if params is None:
return
ydl = FakeYDL({
'proxy': params['primary_proxy']
})
self.assertEqual(
ydl.urlopen('https://yt-dl.org/ip').read().decode('utf-8'),
params['primary_server_ip'])
def test_secondary_proxy_http(self):
params = self._check_params(['secondary_proxy', 'secondary_server_ip'])
if params is None:
return
ydl = FakeYDL()
req = compat_urllib_request.Request('http://yt-dl.org/ip')
req.add_header('Ytdl-request-proxy', params['secondary_proxy'])
self.assertEqual(
ydl.urlopen(req).read().decode('utf-8'),
params['secondary_server_ip'])
def test_secondary_proxy_https(self):
params = self._check_params(['secondary_proxy', 'secondary_server_ip'])
if params is None:
return
ydl = FakeYDL()
req = compat_urllib_request.Request('https://yt-dl.org/ip')
req.add_header('Ytdl-request-proxy', params['secondary_proxy'])
self.assertEqual(
ydl.urlopen(req).read().decode('utf-8'),
params['secondary_server_ip'])
class TestSocks(unittest.TestCase):
_SKIP_SOCKS_TEST = True
def setUp(self):
if self._SKIP_SOCKS_TEST:
return
self.port = random.randint(20000, 30000)
self.server_process = subprocess.Popen([
'srelay', '-f', '-i', '127.0.0.1:%d' % self.port],
stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
def tearDown(self):
if self._SKIP_SOCKS_TEST:
return
self.server_process.terminate()
self.server_process.communicate()
def _get_ip(self, protocol):
if self._SKIP_SOCKS_TEST:
return '127.0.0.1'
ydl = FakeYDL({
'proxy': '%s://127.0.0.1:%d' % (protocol, self.port),
})
return ydl.urlopen('http://yt-dl.org/ip').read().decode('utf-8')
def test_socks4(self):
self.assertTrue(isinstance(self._get_ip('socks4'), compat_str))
def test_socks4a(self):
self.assertTrue(isinstance(self._get_ip('socks4a'), compat_str))
def test_socks5(self):
self.assertTrue(isinstance(self._get_ip('socks5'), compat_str))
if __name__ == '__main__':
unittest.main()

View File

@@ -20,6 +20,7 @@ from youtube_dl.utils import (
args_to_str,
encode_base_n,
clean_html,
date_from_str,
DateRange,
detect_exe_version,
determine_ext,
@@ -32,14 +33,18 @@ from youtube_dl.utils import (
ExtractorError,
find_xpath_attr,
fix_xml_ampersands,
get_element_by_class,
InAdvancePagedList,
intlist_to_bytes,
is_html,
js_to_json,
limit_length,
mimetype2ext,
month_by_name,
ohdave_rsa_encrypt,
OnDemandPagedList,
orderedSet,
parse_age_limit,
parse_duration,
parse_filesize,
parse_count,
@@ -49,20 +54,23 @@ from youtube_dl.utils import (
sanitize_path,
prepend_extension,
replace_extension,
remove_start,
remove_end,
remove_quotes,
shell_quote,
smuggle_url,
str_to_int,
strip_jsonp,
struct_unpack,
timeconvert,
unescapeHTML,
unified_strdate,
unified_timestamp,
unsmuggle_url,
uppercase_escape,
lowercase_escape,
url_basename,
urlencode_postdata,
urshift,
update_url_query,
version_tuple,
xpath_with_ns,
@@ -76,6 +84,7 @@ from youtube_dl.utils import (
cli_option,
cli_valueless_option,
cli_bool_option,
parse_codecs,
)
from youtube_dl.compat import (
compat_chr,
@@ -138,8 +147,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual('yes_no', sanitize_filename('yes? no', restricted=True))
self.assertEqual('this_-_that', sanitize_filename('this: that', restricted=True))
tests = 'a\xe4b\u4e2d\u56fd\u7684c'
self.assertEqual(sanitize_filename(tests, restricted=True), 'a_b_c')
tests = 'aäb\u4e2d\u56fd\u7684c'
self.assertEqual(sanitize_filename(tests, restricted=True), 'aab_c')
self.assertTrue(sanitize_filename('\xf6', restricted=True) != '') # No empty filename
forbidden = '"\0\\/&!: \'\t\n()[]{}$;`^,#'
@@ -154,6 +163,10 @@ class TestUtil(unittest.TestCase):
self.assertTrue(sanitize_filename('-', restricted=True) != '')
self.assertTrue(sanitize_filename(':', restricted=True) != '')
self.assertEqual(sanitize_filename(
'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True),
'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYPssaaaaaaaeceeeeiiiionooooooooeuuuuuypy')
def test_sanitize_ids(self):
self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw')
self.assertEqual(sanitize_filename('_BD_eEpuzXw', is_id=True), '_BD_eEpuzXw')
@@ -211,6 +224,16 @@ class TestUtil(unittest.TestCase):
self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
def test_remove_start(self):
self.assertEqual(remove_start(None, 'A - '), None)
self.assertEqual(remove_start('A - B', 'A - '), 'B')
self.assertEqual(remove_start('B - A', 'A - '), 'B - A')
def test_remove_end(self):
self.assertEqual(remove_end(None, ' - B'), None)
self.assertEqual(remove_end('A - B', ' - B'), 'A')
self.assertEqual(remove_end('B - A', ' - B'), 'B - A')
def test_remove_quotes(self):
self.assertEqual(remove_quotes(None), None)
self.assertEqual(remove_quotes('"'), '"')
@@ -233,6 +256,15 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unescapeHTML('&#47;'), '/')
self.assertEqual(unescapeHTML('&eacute;'), 'é')
self.assertEqual(unescapeHTML('&#2013266066;'), '&#2013266066;')
# HTML5 entities
self.assertEqual(unescapeHTML('&period;&apos;'), '.\'')
def test_date_from_str(self):
self.assertEqual(date_from_str('yesterday'), date_from_str('now-1day'))
self.assertEqual(date_from_str('now+7day'), date_from_str('now+1week'))
self.assertEqual(date_from_str('now+14day'), date_from_str('now+2week'))
self.assertEqual(date_from_str('now+365day'), date_from_str('now+1year'))
self.assertEqual(date_from_str('now+30day'), date_from_str('now+1month'))
def test_daterange(self):
_20century = DateRange("19000101", "20000101")
@@ -258,8 +290,29 @@ class TestUtil(unittest.TestCase):
'20150202')
self.assertEqual(unified_strdate('Feb 14th 2016 5:45PM'), '20160214')
self.assertEqual(unified_strdate('25-09-2014'), '20140925')
self.assertEqual(unified_strdate('27.02.2016 17:30'), '20160227')
self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)
self.assertEqual(unified_timestamp('8/7/2009'), 1247011200)
self.assertEqual(unified_timestamp('Dec 14, 2012'), 1355443200)
self.assertEqual(unified_timestamp('2012/10/11 01:56:38 +0000'), 1349920598)
self.assertEqual(unified_timestamp('1968 12 10'), -33436800)
self.assertEqual(unified_timestamp('1968-12-10'), -33436800)
self.assertEqual(unified_timestamp('28/01/2014 21:00:00 +0100'), 1390939200)
self.assertEqual(
unified_timestamp('11/26/2014 11:30:00 AM PST', day_first=False),
1417001400)
self.assertEqual(
unified_timestamp('2/2/2015 6:47:40 PM', day_first=False),
1422902860)
self.assertEqual(unified_timestamp('Feb 14th 2016 5:45PM'), 1455471900)
self.assertEqual(unified_timestamp('25-09-2014'), 1411603200)
self.assertEqual(unified_timestamp('27.02.2016 17:30'), 1456594200)
self.assertEqual(unified_timestamp('UNKNOWN DATE FORMAT'), None)
self.assertEqual(unified_timestamp('May 16, 2016 11:15 PM'), 1463440500)
def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
self.assertEqual(determine_ext('http://example.com/foo/bar/?download', None), None)
@@ -358,6 +411,12 @@ class TestUtil(unittest.TestCase):
self.assertEqual(res_url, url)
self.assertEqual(res_data, None)
smug_url = smuggle_url(url, {'a': 'b'})
smug_smug_url = smuggle_url(smug_url, {'c': 'd'})
res_url, res_data = unsmuggle_url(smug_smug_url)
self.assertEqual(res_url, url)
self.assertEqual(res_data, {'a': 'b', 'c': 'd'})
def test_shell_quote(self):
args = ['ffmpeg', '-i', encodeFilename('ñ€ß\'.mp4')]
self.assertEqual(shell_quote(args), """ffmpeg -i 'ñ€ß'"'"'.mp4'""")
@@ -376,6 +435,20 @@ class TestUtil(unittest.TestCase):
url_basename('http://media.w3.org/2010/05/sintel/trailer.mp4'),
'trailer.mp4')
def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None)
self.assertEqual(parse_age_limit(False), None)
self.assertEqual(parse_age_limit('invalid'), None)
self.assertEqual(parse_age_limit(0), 0)
self.assertEqual(parse_age_limit(18), 18)
self.assertEqual(parse_age_limit(21), 21)
self.assertEqual(parse_age_limit(22), None)
self.assertEqual(parse_age_limit('18'), 18)
self.assertEqual(parse_age_limit('18+'), 18)
self.assertEqual(parse_age_limit('PG-13'), 13)
self.assertEqual(parse_age_limit('TV-14'), 14)
self.assertEqual(parse_age_limit('TV-MA'), 17)
def test_parse_duration(self):
self.assertEqual(parse_duration(None), None)
self.assertEqual(parse_duration(False), None)
@@ -405,6 +478,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('01:02:03:04'), 93784)
self.assertEqual(parse_duration('1 hour 3 minutes'), 3780)
self.assertEqual(parse_duration('87 Min.'), 5220)
self.assertEqual(parse_duration('PT1H0.040S'), 3600.04)
def test_fix_xml_ampersands(self):
self.assertEqual(
@@ -444,9 +518,6 @@ class TestUtil(unittest.TestCase):
testPL(5, 2, (2, 99), [2, 3, 4])
testPL(5, 2, (20, 99), [])
def test_struct_unpack(self):
self.assertEqual(struct_unpack('!B', b'\x00'), (0,))
def test_read_batch_urls(self):
f = io.StringIO('''\xef\xbb\xbf foo
bar\r
@@ -556,6 +627,45 @@ class TestUtil(unittest.TestCase):
limit_length('foo bar baz asd', 12).startswith('foo bar'))
self.assertTrue('...' in limit_length('foo bar baz asd', 12))
def test_mimetype2ext(self):
self.assertEqual(mimetype2ext(None), None)
self.assertEqual(mimetype2ext('video/x-flv'), 'flv')
self.assertEqual(mimetype2ext('application/x-mpegURL'), 'm3u8')
self.assertEqual(mimetype2ext('text/vtt'), 'vtt')
self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt')
self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html')
def test_month_by_name(self):
self.assertEqual(month_by_name(None), None)
self.assertEqual(month_by_name('December', 'en'), 12)
self.assertEqual(month_by_name('décembre', 'fr'), 12)
self.assertEqual(month_by_name('December'), 12)
self.assertEqual(month_by_name('décembre'), None)
self.assertEqual(month_by_name('Unknown', 'unknown'), None)
def test_parse_codecs(self):
self.assertEqual(parse_codecs(''), {})
self.assertEqual(parse_codecs('avc1.77.30, mp4a.40.2'), {
'vcodec': 'avc1.77.30',
'acodec': 'mp4a.40.2',
})
self.assertEqual(parse_codecs('mp4a.40.2'), {
'vcodec': 'none',
'acodec': 'mp4a.40.2',
})
self.assertEqual(parse_codecs('mp4a.40.5,avc1.42001e'), {
'vcodec': 'avc1.42001e',
'acodec': 'mp4a.40.5',
})
self.assertEqual(parse_codecs('avc3.640028'), {
'vcodec': 'avc3.640028',
'acodec': 'none',
})
self.assertEqual(parse_codecs(', h264,,newcodec,aac'), {
'vcodec': 'h264',
'acodec': 'aac',
})
def test_escape_rfc3986(self):
reserved = "!*'();:@&=+$,/?#[]"
unreserved = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~'
@@ -577,11 +687,11 @@ class TestUtil(unittest.TestCase):
)
self.assertEqual(
escape_url('http://тест.рф/фрагмент'),
'http://тест.рф/%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82'
'http://xn--e1aybc.xn--p1ai/%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82'
)
self.assertEqual(
escape_url('http://тест.рф/абв?абв=абв#абв'),
'http://тест.рф/%D0%B0%D0%B1%D0%B2?%D0%B0%D0%B1%D0%B2=%D0%B0%D0%B1%D0%B2#%D0%B0%D0%B1%D0%B2'
'http://xn--e1aybc.xn--p1ai/%D0%B0%D0%B1%D0%B2?%D0%B0%D0%B1%D0%B2=%D0%B0%D0%B1%D0%B2#%D0%B0%D0%B1%D0%B2'
)
self.assertEqual(escape_url('http://vimeo.com/56015672#at=0'), 'http://vimeo.com/56015672#at=0')
@@ -608,6 +718,21 @@ class TestUtil(unittest.TestCase):
json_code = js_to_json(inp)
self.assertEqual(json.loads(json_code), json.loads(inp))
inp = '''{
0:{src:'skipped', type: 'application/dash+xml'},
1:{src:'skipped', type: 'application/vnd.apple.mpegURL'},
}'''
self.assertEqual(js_to_json(inp), '''{
"0":{"src":"skipped", "type": "application/dash+xml"},
"1":{"src":"skipped", "type": "application/vnd.apple.mpegURL"}
}''')
inp = '''{"foo":101}'''
self.assertEqual(js_to_json(inp), '''{"foo":101}''')
inp = '''{"duration": "00:01:07"}'''
self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''')
def test_js_to_json_edgecases(self):
on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
@@ -631,6 +756,27 @@ class TestUtil(unittest.TestCase):
on = js_to_json('{"abc": "def",}')
self.assertEqual(json.loads(on), {'abc': 'def'})
on = js_to_json('{ 0: /* " \n */ ",]" , }')
self.assertEqual(json.loads(on), {'0': ',]'})
on = js_to_json(r'["<p>x<\/p>"]')
self.assertEqual(json.loads(on), ['<p>x</p>'])
on = js_to_json(r'["\xaa"]')
self.assertEqual(json.loads(on), ['\u00aa'])
on = js_to_json("['a\\\nb']")
self.assertEqual(json.loads(on), ['ab'])
on = js_to_json('{0xff:0xff}')
self.assertEqual(json.loads(on), {'255': 255})
on = js_to_json('{077:077}')
self.assertEqual(json.loads(on), {'63': 63})
on = js_to_json('{42:42}')
self.assertEqual(json.loads(on), {'42': 42})
def test_extract_attributes(self):
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
@@ -692,7 +838,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_filesize('2 MiB'), 2097152)
self.assertEqual(parse_filesize('5 GB'), 5000000000)
self.assertEqual(parse_filesize('1.2Tb'), 1200000000000)
self.assertEqual(parse_filesize('1.2tb'), 1200000000000)
self.assertEqual(parse_filesize('1,24 KB'), 1240)
self.assertEqual(parse_filesize('1,24 kb'), 1240)
self.assertEqual(parse_filesize('8.5 megabytes'), 8500000)
def test_parse_count(self):
self.assertEqual(parse_count(None), None)
@@ -702,6 +851,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_count('1.000'), 1000)
self.assertEqual(parse_count('1.1k'), 1100)
self.assertEqual(parse_count('1.1kk'), 1100000)
self.assertEqual(parse_count('1.1kk '), 1100000)
self.assertEqual(parse_count('1.1kk views'), 1100000)
def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,))
@@ -841,6 +992,7 @@ The first line
self.assertEqual(cli_option({'proxy': '127.0.0.1:3128'}, '--proxy', 'proxy'), ['--proxy', '127.0.0.1:3128'])
self.assertEqual(cli_option({'proxy': None}, '--proxy', 'proxy'), [])
self.assertEqual(cli_option({}, '--proxy', 'proxy'), [])
self.assertEqual(cli_option({'retries': 10}, '--retries', 'retries'), ['--retries', '10'])
def test_cli_valueless_option(self):
self.assertEqual(cli_valueless_option(
@@ -901,5 +1053,17 @@ The first line
self.assertRaises(ValueError, encode_base_n, 0, 70)
self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table)
def test_urshift(self):
self.assertEqual(urshift(3, 1), 1)
self.assertEqual(urshift(-3, 1), 2147483646)
def test_get_element_by_class(self):
html = '''
<span class="foo bar">nice</span>
'''
self.assertEqual(get_element_by_class('foo', html), 'nice')
self.assertEqual(get_element_by_class('no-such-class', html), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -0,0 +1,70 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
import unittest
import sys
import os
import subprocess
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
class TestVerboseOutput(unittest.TestCase):
def test_private_info_arg(self):
outp = subprocess.Popen(
[
sys.executable, 'youtube_dl/__main__.py', '-v',
'--username', 'johnsmith@gmail.com',
'--password', 'secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue(b'--username' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'--password' in serr)
self.assertTrue(b'secret' not in serr)
def test_private_info_shortarg(self):
outp = subprocess.Popen(
[
sys.executable, 'youtube_dl/__main__.py', '-v',
'-u', 'johnsmith@gmail.com',
'-p', 'secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue(b'-u' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
def test_private_info_eq(self):
outp = subprocess.Popen(
[
sys.executable, 'youtube_dl/__main__.py', '-v',
'--username=johnsmith@gmail.com',
'--password=secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue(b'--username' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'--password' in serr)
self.assertTrue(b'secret' not in serr)
def test_private_info_shortarg_eq(self):
outp = subprocess.Popen(
[
sys.executable, 'youtube_dl/__main__.py', '-v',
'-u=johnsmith@gmail.com',
'-p=secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue(b'-u' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
if __name__ == '__main__':
unittest.main()

View File

@@ -44,7 +44,7 @@ class TestYoutubeLists(unittest.TestCase):
ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/watch?v=W01L70IGBgE&index=2&list=RDOQpdSVF_k_w')
entries = result['entries']
self.assertTrue(len(entries) >= 20)
self.assertTrue(len(entries) >= 50)
original_video = entries[0]
self.assertEqual(original_video['id'], 'OQpdSVF_k_w')

View File

@@ -8,6 +8,7 @@ deps =
passenv = HOME
defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
--exclude test_subtitles.py --exclude test_write_annotations.py
--exclude test_youtube_lists.py
--exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
--exclude test_socks.py
commands = nosetests --verbose {posargs:{[testenv]defaultargs}} # --with-coverage --cover-package=youtube_dl --cover-html
# test.test_download:TestDownload.test_NowVideo

View File

@@ -5,6 +5,7 @@ from __future__ import absolute_import, unicode_literals
import collections
import contextlib
import copy
import datetime
import errno
import fileinput
@@ -39,6 +40,8 @@ from .compat import (
compat_urllib_request_DataHandler,
)
from .utils import (
age_restricted,
args_to_str,
ContentTooShortError,
date_from_str,
DateRange,
@@ -58,13 +61,17 @@ from .utils import (
PagedList,
parse_filesize,
PerRequestProxyHandler,
PostProcessingError,
platform_name,
PostProcessingError,
preferredencoding,
prepend_extension,
register_socks_protocols,
render_table,
replace_extension,
SameFileError,
sanitize_filename,
sanitize_path,
sanitize_url,
sanitized_Request,
std_headers,
subtitles_filename,
@@ -75,13 +82,9 @@ from .utils import (
write_string,
YoutubeDLCookieProcessor,
YoutubeDLHandler,
prepend_extension,
replace_extension,
args_to_str,
age_restricted,
)
from .cache import Cache
from .extractor import get_info_extractor, gen_extractors
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
from .downloader import get_suitable_downloader
from .downloader.rtmp import rtmpdump_version
from .postprocessor import (
@@ -128,6 +131,9 @@ class YoutubeDL(object):
username: Username for authentication purposes.
password: Password for authentication purposes.
videopassword: Password for accessing a video.
ap_mso: Adobe Pass multiple-system operator identifier.
ap_username: Multiple-system operator account username.
ap_password: Multiple-system operator account password.
usenetrc: Use netrc for authentication instead.
verbose: Print additional info to stdout.
quiet: Do not print messages to stdout.
@@ -194,8 +200,8 @@ class YoutubeDL(object):
prefer_insecure: Use HTTP instead of HTTPS to retrieve information.
At the moment, this is only supported by YouTube.
proxy: URL of the proxy server to use
cn_verification_proxy: URL of the proxy to use for IP address verification
on Chinese sites. (Experimental)
geo_verification_proxy: URL of the proxy to use for IP address verification
on geo-restricted sites. (Experimental)
socket_timeout: Time to wait for unresponsive hosts, in seconds
bidi_workaround: Work around buggy terminals without bidirectional text
support, using fridibi
@@ -246,7 +252,16 @@ class YoutubeDL(object):
source_address: (Experimental) Client-side IP address to bind to.
call_home: Boolean, true iff we are allowed to contact the
youtube-dl servers for debugging.
sleep_interval: Number of seconds to sleep before each download.
sleep_interval: Number of seconds to sleep before each download when
used alone or a lower bound of a range for randomized
sleep before each download (minimum possible number
of seconds to sleep) when used along with
max_sleep_interval.
max_sleep_interval:Upper bound of a range for randomized sleep before each
download (maximum possible number of seconds to sleep).
Must only be used along with sleep_interval.
Actual sleep time will be a random float from range
[sleep_interval; max_sleep_interval].
listformats: Print an overview of available video formats and exit.
list_thumbnails: Print a table of all thumbnails and exit.
match_filter: A function that gets called with the info_dict of
@@ -259,7 +274,9 @@ class YoutubeDL(object):
The following options determine which downloader is picked:
external_downloader: Executable of the external downloader to call.
None or unset for standard (built-in) downloader.
hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv.
hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv
if True, otherwise use ffmpeg/avconv if False, otherwise
use downloader suggested by extractor if None.
The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see youtube_dl/downloader/common.py):
@@ -300,6 +317,11 @@ class YoutubeDL(object):
self.params.update(params)
self.cache = Cache(self)
if self.params.get('cn_verification_proxy') is not None:
self.report_warning('--cn-verification-proxy is deprecated. Use --geo-verification-proxy instead.')
if self.params.get('geo_verification_proxy') is None:
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
if params.get('bidi_workaround', False):
try:
import pty
@@ -322,7 +344,7 @@ class YoutubeDL(object):
['fribidi', '-c', 'UTF-8'] + width_args, **sp_kwargs)
self._output_channel = os.fdopen(master, 'rb')
except OSError as ose:
if ose.errno == 2:
if ose.errno == errno.ENOENT:
self.report_warning('Could not find fribidi executable, ignoring --bidi-workaround . Make sure that fribidi is an executable file in one of the directories in your $PATH.')
else:
raise
@@ -358,6 +380,8 @@ class YoutubeDL(object):
for ph in self.params.get('progress_hooks', []):
self.add_progress_hook(ph)
register_socks_protocols()
def warn_if_short_id(self, argv):
# short YouTube ID starting with dash?
idxs = [
@@ -377,8 +401,9 @@ class YoutubeDL(object):
def add_info_extractor(self, ie):
"""Add an InfoExtractor object to the end of the list."""
self._ies.append(ie)
self._ies_instances[ie.ie_key()] = ie
ie.set_downloader(self)
if not isinstance(ie, type):
self._ies_instances[ie.ie_key()] = ie
ie.set_downloader(self)
def get_info_extractor(self, ie_key):
"""
@@ -396,7 +421,7 @@ class YoutubeDL(object):
"""
Add the InfoExtractors returned by gen_extractors to the end of the list
"""
for ie in gen_extractors():
for ie in gen_extractor_classes():
self.add_info_extractor(ie)
def add_post_processor(self, pp):
@@ -576,7 +601,7 @@ class YoutubeDL(object):
is_id=(k == 'id'))
template_dict = dict((k, sanitize(k, v))
for k, v in template_dict.items()
if v is not None)
if v is not None and not isinstance(v, (list, tuple, dict)))
template_dict = collections.defaultdict(lambda: 'NA', template_dict)
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
@@ -660,6 +685,7 @@ class YoutubeDL(object):
if not ie.suitable(url):
continue
ie = self.get_info_extractor(ie.ie_key())
if not ie.working():
self.report_warning('The program functionality for this site has been marked as broken, '
'and will probably not work.')
@@ -712,6 +738,7 @@ class YoutubeDL(object):
result_type = ie_result.get('_type', 'video')
if result_type in ('url', 'url_transparent'):
ie_result['url'] = sanitize_url(ie_result['url'])
extract_flat = self.params.get('extract_flat', False)
if ((extract_flat == 'in_playlist' and 'playlist' in extra_info) or
extract_flat is True):
@@ -905,7 +932,7 @@ class YoutubeDL(object):
'*=': lambda attr, value: value in attr,
}
str_operator_rex = re.compile(r'''(?x)
\s*(?P<key>ext|acodec|vcodec|container|protocol)
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
\s*(?P<value>[a-zA-Z0-9._-]+)
\s*$
@@ -1037,9 +1064,9 @@ class YoutubeDL(object):
if isinstance(selector, list):
fs = [_build_selector_function(s) for s in selector]
def selector_function(formats):
def selector_function(ctx):
for f in fs:
for format in f(formats):
for format in f(ctx):
yield format
return selector_function
elif selector.type == GROUP:
@@ -1047,17 +1074,17 @@ class YoutubeDL(object):
elif selector.type == PICKFIRST:
fs = [_build_selector_function(s) for s in selector.selector]
def selector_function(formats):
def selector_function(ctx):
for f in fs:
picked_formats = list(f(formats))
picked_formats = list(f(ctx))
if picked_formats:
return picked_formats
return []
elif selector.type == SINGLE:
format_spec = selector.selector
def selector_function(formats):
formats = list(formats)
def selector_function(ctx):
formats = list(ctx['formats'])
if not formats:
return
if format_spec == 'all':
@@ -1070,9 +1097,10 @@ class YoutubeDL(object):
if f.get('vcodec') != 'none' and f.get('acodec') != 'none']
if audiovideo_formats:
yield audiovideo_formats[format_idx]
# for audio only (soundcloud) or video only (imgur) urls, select the best/worst audio format
elif (all(f.get('acodec') != 'none' for f in formats) or
all(f.get('vcodec') != 'none' for f in formats)):
# for extractors with incomplete formats (audio only (soundcloud)
# or video only (imgur)) we will fallback to best/worst
# {video,audio}-only format
elif ctx['incomplete_formats']:
yield formats[format_idx]
elif format_spec == 'bestaudio':
audio_formats = [
@@ -1146,17 +1174,18 @@ class YoutubeDL(object):
}
video_selector, audio_selector = map(_build_selector_function, selector.selector)
def selector_function(formats):
formats = list(formats)
for pair in itertools.product(video_selector(formats), audio_selector(formats)):
def selector_function(ctx):
for pair in itertools.product(
video_selector(copy.deepcopy(ctx)), audio_selector(copy.deepcopy(ctx))):
yield _merge(pair)
filters = [self._build_format_filter(f) for f in selector.filters]
def final_selector(formats):
def final_selector(ctx):
ctx_copy = copy.deepcopy(ctx)
for _filter in filters:
formats = list(filter(_filter, formats))
return selector_function(formats)
ctx_copy['formats'] = list(filter(_filter, ctx_copy['formats']))
return selector_function(ctx_copy)
return final_selector
stream = io.BytesIO(format_spec.encode('utf-8'))
@@ -1214,6 +1243,10 @@ class YoutubeDL(object):
if 'title' not in info_dict:
raise ExtractorError('Missing "title" field in extractor result')
if not isinstance(info_dict['id'], compat_str):
self.report_warning('"id" field is not a string - forcing string conversion')
info_dict['id'] = compat_str(info_dict['id'])
if 'playlist' not in info_dict:
# It isn't part of a playlist
info_dict['playlist'] = None
@@ -1226,9 +1259,12 @@ class YoutubeDL(object):
info_dict['thumbnails'] = thumbnails = [{'url': thumbnail}]
if thumbnails:
thumbnails.sort(key=lambda t: (
t.get('preference'), t.get('width'), t.get('height'),
t.get('id'), t.get('url')))
t.get('preference') if t.get('preference') is not None else -1,
t.get('width') if t.get('width') is not None else -1,
t.get('height') if t.get('height') is not None else -1,
t.get('id') if t.get('id') is not None else '', t.get('url')))
for i, t in enumerate(thumbnails):
t['url'] = sanitize_url(t['url'])
if t.get('width') and t.get('height'):
t['resolution'] = '%dx%d' % (t['width'], t['height'])
if t.get('id') is None:
@@ -1238,7 +1274,10 @@ class YoutubeDL(object):
self.list_thumbnails(info_dict)
return
if thumbnails and 'thumbnail' not in info_dict:
thumbnail = info_dict.get('thumbnail')
if thumbnail:
info_dict['thumbnail'] = sanitize_url(thumbnail)
elif thumbnails:
info_dict['thumbnail'] = thumbnails[-1]['url']
if 'display_id' not in info_dict and 'id' in info_dict:
@@ -1263,7 +1302,9 @@ class YoutubeDL(object):
if subtitles:
for _, subtitle in subtitles.items():
for subtitle_format in subtitle:
if 'ext' not in subtitle_format:
if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if subtitle_format.get('ext') is None:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
if self.params.get('listsubtitles', False):
@@ -1292,6 +1333,8 @@ class YoutubeDL(object):
if 'url' not in format:
raise ExtractorError('Missing "url" key in result (index %d)' % i)
format['url'] = sanitize_url(format['url'])
if format.get('format_id') is None:
format['format_id'] = compat_str(i)
else:
@@ -1316,7 +1359,7 @@ class YoutubeDL(object):
note=' ({0})'.format(format['format_note']) if format.get('format_note') is not None else '',
)
# Automatically determine file extension if missing
if 'ext' not in format:
if format.get('ext') is None:
format['ext'] = determine_ext(format['url']).lower()
# Automatically determine protocol if missing (useful for format
# selection purposes)
@@ -1351,7 +1394,34 @@ class YoutubeDL(object):
req_format_list.append('best')
req_format = '/'.join(req_format_list)
format_selector = self.build_format_selector(req_format)
formats_to_download = list(format_selector(formats))
# While in format selection we may need to have an access to the original
# format set in order to calculate some metrics or do some processing.
# For now we need to be able to guess whether original formats provided
# by extractor are incomplete or not (i.e. whether extractor provides only
# video-only or audio-only formats) for proper formats selection for
# extractors with such incomplete formats (see
# https://github.com/rg3/youtube-dl/pull/5556).
# Since formats may be filtered during format selection and may not match
# the original formats the results may be incorrect. Thus original formats
# or pre-calculated metrics should be passed to format selection routines
# as well.
# We will pass a context object containing all necessary additional data
# instead of just formats.
# This fixes incorrect format selection issue (see
# https://github.com/rg3/youtube-dl/issues/10083).
incomplete_formats = (
# All formats are video-only or
all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats) or
# all formats are audio-only
all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats))
ctx = {
'formats': formats,
'incomplete_formats': incomplete_formats,
}
formats_to_download = list(format_selector(ctx))
if not formats_to_download:
raise ExtractorError('requested format not available',
expected=True)
@@ -1538,7 +1608,9 @@ class YoutubeDL(object):
self.to_screen('[info] Video subtitle %s.%s is already_present' % (sub_lang, sub_format))
else:
self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8') as subfile:
# Use newline='' to prevent conversion of newline characters
# See https://github.com/rg3/youtube-dl/issues/10268
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
subfile.write(sub_data)
except (OSError, IOError):
self.report_error('Cannot write subtitles file ' + sub_filename)
@@ -1626,7 +1698,7 @@ class YoutubeDL(object):
# Just a single file
success = dl(filename, info_dict)
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_error('unable to download video data: %s' % str(err))
self.report_error('unable to download video data: %s' % error_to_compat_str(err))
return
except (OSError, IOError) as err:
raise UnavailableVideoError(err)
@@ -1836,7 +1908,7 @@ class YoutubeDL(object):
if fdict.get('language'):
if res:
res += ' '
res += '[%s]' % fdict['language']
res += '[%s] ' % fdict['language']
if fdict.get('format_note') is not None:
res += fdict['format_note'] + ' '
if fdict.get('tbr') is not None:
@@ -1948,6 +2020,8 @@ class YoutubeDL(object):
write_string(encoding_str, encoding=None)
self._write_string('[debug] youtube-dl version ' + __version__ + '\n')
if _LAZY_LOADER:
self._write_string('[debug] Lazy loading extractors enabled' + '\n')
try:
sp = subprocess.Popen(
['git', 'rev-parse', '--short', 'HEAD'],
@@ -2003,6 +2077,7 @@ class YoutubeDL(object):
if opts_cookiefile is None:
self.cookiejar = compat_cookiejar.CookieJar()
else:
opts_cookiefile = compat_expanduser(opts_cookiefile)
self.cookiejar = compat_cookiejar.MozillaCookieJar(
opts_cookiefile)
if os.access(opts_cookiefile, os.R_OK):

View File

@@ -18,7 +18,6 @@ from .options import (
from .compat import (
compat_expanduser,
compat_getpass,
compat_print,
compat_shlex_split,
workaround_optparse_bug9161,
)
@@ -35,12 +34,14 @@ from .utils import (
setproctitle,
std_headers,
write_string,
render_table,
)
from .update import update_self
from .downloader import (
FileDownloader,
)
from .extractor import gen_extractors, list_extractors
from .extractor.adobepass import MSO_INFO
from .YoutubeDL import YoutubeDL
@@ -67,16 +68,16 @@ def _real_main(argv=None):
# Custom HTTP headers
if opts.headers is not None:
for h in opts.headers:
if h.find(':', 1) < 0:
if ':' not in h:
parser.error('wrong header formatting, it should be key:value, not "%s"' % h)
key, value = h.split(':', 2)
key, value = h.split(':', 1)
if opts.verbose:
write_string('[debug] Adding header from command line option %s:%s\n' % (key, value))
std_headers[key] = value
# Dump user agent
if opts.dump_user_agent:
compat_print(std_headers['User-Agent'])
write_string(std_headers['User-Agent'] + '\n', out=sys.stdout)
sys.exit(0)
# Batch file verification
@@ -86,7 +87,9 @@ def _real_main(argv=None):
if opts.batchfile == '-':
batchfd = sys.stdin
else:
batchfd = io.open(opts.batchfile, 'r', encoding='utf-8', errors='ignore')
batchfd = io.open(
compat_expanduser(opts.batchfile),
'r', encoding='utf-8', errors='ignore')
batch_urls = read_batch_urls(batchfd)
if opts.verbose:
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
@@ -99,10 +102,10 @@ def _real_main(argv=None):
if opts.list_extractors:
for ie in list_extractors(opts.age_limit):
compat_print(ie.IE_NAME + (' (CURRENTLY BROKEN)' if not ie._WORKING else ''))
write_string(ie.IE_NAME + (' (CURRENTLY BROKEN)' if not ie._WORKING else '') + '\n', out=sys.stdout)
matchedUrls = [url for url in all_urls if ie.suitable(url)]
for mu in matchedUrls:
compat_print(' ' + mu)
write_string(' ' + mu + '\n', out=sys.stdout)
sys.exit(0)
if opts.list_extractor_descriptions:
for ie in list_extractors(opts.age_limit):
@@ -115,7 +118,11 @@ def _real_main(argv=None):
_SEARCHES = ('cute kittens', 'slithering pythons', 'falling cat', 'angry poodle', 'purple fish', 'running tortoise', 'sleeping bunny', 'burping cow')
_COUNTS = ('', '5', '10', 'all')
desc += ' (Example: "%s%s:%s" )' % (ie.SEARCH_KEY, random.choice(_COUNTS), random.choice(_SEARCHES))
compat_print(desc)
write_string(desc + '\n', out=sys.stdout)
sys.exit(0)
if opts.ap_list_mso:
table = [[mso_id, mso_info['name']] for mso_id, mso_info in MSO_INFO.items()]
write_string('Supported TV Providers:\n' + render_table(['mso', 'mso name'], table) + '\n', out=sys.stdout)
sys.exit(0)
# Conflicting, missing and erroneous options
@@ -123,12 +130,16 @@ def _real_main(argv=None):
parser.error('using .netrc conflicts with giving username/password')
if opts.password is not None and opts.username is None:
parser.error('account username missing\n')
if opts.ap_password is not None and opts.ap_username is None:
parser.error('TV Provider account username missing\n')
if opts.outtmpl is not None and (opts.usetitle or opts.autonumber or opts.useid):
parser.error('using output template conflicts with using title, video ID or auto number')
if opts.usetitle and opts.useid:
parser.error('using title conflicts with using video ID')
if opts.username is not None and opts.password is None:
opts.password = compat_getpass('Type account password and press [Return]: ')
if opts.ap_username is not None and opts.ap_password is None:
opts.ap_password = compat_getpass('Type TV provider account password and press [Return]: ')
if opts.ratelimit is not None:
numeric_limit = FileDownloader.parse_bytes(opts.ratelimit)
if numeric_limit is None:
@@ -144,14 +155,32 @@ def _real_main(argv=None):
if numeric_limit is None:
parser.error('invalid max_filesize specified')
opts.max_filesize = numeric_limit
if opts.retries is not None:
if opts.retries in ('inf', 'infinite'):
opts_retries = float('inf')
if opts.sleep_interval is not None:
if opts.sleep_interval < 0:
parser.error('sleep interval must be positive or 0')
if opts.max_sleep_interval is not None:
if opts.max_sleep_interval < 0:
parser.error('max sleep interval must be positive or 0')
if opts.max_sleep_interval < opts.sleep_interval:
parser.error('max sleep interval must be greater than or equal to min sleep interval')
else:
opts.max_sleep_interval = opts.sleep_interval
if opts.ap_mso and opts.ap_mso not in MSO_INFO:
parser.error('Unsupported TV Provider, use --ap-list-mso to get a list of supported TV Providers')
def parse_retries(retries):
if retries in ('inf', 'infinite'):
parsed_retries = float('inf')
else:
try:
opts_retries = int(opts.retries)
parsed_retries = int(retries)
except (TypeError, ValueError):
parser.error('invalid retry count specified')
return parsed_retries
if opts.retries is not None:
opts.retries = parse_retries(opts.retries)
if opts.fragment_retries is not None:
opts.fragment_retries = parse_retries(opts.fragment_retries)
if opts.buffersize is not None:
numeric_buffersize = FileDownloader.parse_bytes(opts.buffersize)
if numeric_buffersize is None:
@@ -276,6 +305,9 @@ def _real_main(argv=None):
'password': opts.password,
'twofactor': opts.twofactor,
'videopassword': opts.videopassword,
'ap_mso': opts.ap_mso,
'ap_username': opts.ap_username,
'ap_password': opts.ap_password,
'quiet': (opts.quiet or any_getting or any_printing),
'no_warnings': opts.no_warnings,
'forceurl': opts.geturl,
@@ -299,7 +331,9 @@ def _real_main(argv=None):
'force_generic_extractor': opts.force_generic_extractor,
'ratelimit': opts.ratelimit,
'nooverwrites': opts.nooverwrites,
'retries': opts_retries,
'retries': opts.retries,
'fragment_retries': opts.fragment_retries,
'skip_unavailable_fragments': opts.skip_unavailable_fragments,
'buffersize': opts.buffersize,
'noresizebuffer': opts.noresizebuffer,
'continuedl': opts.continue_dl,
@@ -362,6 +396,7 @@ def _real_main(argv=None):
'source_address': opts.source_address,
'call_home': opts.call_home,
'sleep_interval': opts.sleep_interval,
'max_sleep_interval': opts.max_sleep_interval,
'external_downloader': opts.external_downloader,
'list_thumbnails': opts.list_thumbnails,
'playlist_items': opts.playlist_items,
@@ -374,6 +409,8 @@ def _real_main(argv=None):
'external_downloader_args': external_downloader_args,
'postprocessor_args': postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
'geo_verification_proxy': opts.geo_verification_proxy,
}
with YoutubeDL(ydl_opts) as ydl:
@@ -397,7 +434,7 @@ def _real_main(argv=None):
try:
if opts.load_info_filename is not None:
retcode = ydl.download_with_info_file(opts.load_info_filename)
retcode = ydl.download_with_info_file(compat_expanduser(opts.load_info_filename))
else:
retcode = ydl.download(all_urls)
except MaxDownloadsReached:

File diff suppressed because it is too large Load Diff

View File

@@ -41,9 +41,12 @@ def get_suitable_downloader(info_dict, params={}):
if ed.can_download(info_dict):
return ed
if protocol == 'm3u8' and params.get('hls_prefer_native'):
if protocol == 'm3u8' and params.get('hls_prefer_native') is True:
return HlsFD
if protocol == 'm3u8_native' and params.get('hls_prefer_native') is False:
return FFmpegFD
return PROTOCOL_MAP.get(protocol, HttpFD)

View File

@@ -4,6 +4,7 @@ import os
import re
import sys
import time
import random
from ..compat import compat_os_name
from ..utils import (
@@ -115,6 +116,10 @@ class FileDownloader(object):
return '%10s' % '---b/s'
return '%10s' % ('%s/s' % format_bytes(speed))
@staticmethod
def format_retries(retries):
return 'inf' if retries == float('inf') else '%.0f' % retries
@staticmethod
def best_block_size(elapsed_time, bytes):
new_min = max(bytes / 2.0, 1.0)
@@ -297,7 +302,9 @@ class FileDownloader(object):
def report_retry(self, count, retries):
"""Report retry in case of HTTP error 5xx"""
self.to_screen('[download] Got server HTTP error. Retrying (attempt %d of %.0f)...' % (count, retries))
self.to_screen(
'[download] Got server HTTP error. Retrying (attempt %d of %s)...'
% (count, self.format_retries(retries)))
def report_file_already_downloaded(self, file_name):
"""Report file has already been fully downloaded."""
@@ -336,8 +343,11 @@ class FileDownloader(object):
})
return True
sleep_interval = self.params.get('sleep_interval')
if sleep_interval:
min_sleep_interval = self.params.get('sleep_interval')
if min_sleep_interval:
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
print(min_sleep_interval, max_sleep_interval)
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
time.sleep(sleep_interval)

View File

@@ -1,9 +1,9 @@
from __future__ import unicode_literals
import os
import re
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import (
sanitize_open,
encodeFilename,
@@ -18,38 +18,60 @@ class DashSegmentsFD(FragmentFD):
FD_NAME = 'dashsegments'
def real_download(self, filename, info_dict):
base_url = info_dict['url']
segment_urls = [info_dict['segment_urls'][0]] if self.params.get('test', False) else info_dict['segment_urls']
initialization_url = info_dict.get('initialization_url')
segments = info_dict['fragments'][:1] if self.params.get(
'test', False) else info_dict['fragments']
ctx = {
'filename': filename,
'total_frags': len(segment_urls) + (1 if initialization_url else 0),
'total_frags': len(segments),
}
self._prepare_and_start_frag_download(ctx)
def combine_url(base_url, target_url):
if re.match(r'^https?://', target_url):
return target_url
return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
segments_filenames = []
def append_url_to_file(target_url, target_filename):
success = ctx['dl'].download(target_filename, {'url': combine_url(base_url, target_url)})
if not success:
return False
down, target_sanitized = sanitize_open(target_filename, 'rb')
ctx['dest_stream'].write(down.read())
down.close()
segments_filenames.append(target_sanitized)
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
if initialization_url:
append_url_to_file(initialization_url, ctx['tmpfilename'] + '-Init')
for i, segment_url in enumerate(segment_urls):
segment_filename = '%s-Seg%d' % (ctx['tmpfilename'], i)
append_url_to_file(segment_url, segment_filename)
def process_segment(segment, tmp_filename, num):
segment_url = segment['url']
segment_name = 'Frag%d' % num
target_filename = '%s-%s' % (tmp_filename, segment_name)
# In DASH, the first segment contains necessary headers to
# generate a valid MP4 file, so always abort for the first segment
fatal = num == 0 or not skip_unavailable_fragments
count = 0
while count <= fragment_retries:
try:
success = ctx['dl'].download(target_filename, {'url': segment_url})
if not success:
return False
down, target_sanitized = sanitize_open(target_filename, 'rb')
ctx['dest_stream'].write(down.read())
down.close()
segments_filenames.append(target_sanitized)
break
except compat_urllib_error.HTTPError as err:
# YouTube may often return 404 HTTP error for a fragment causing the
# whole download to fail. However if the same fragment is immediately
# retried with the same request data this usually succeeds (1-2 attemps
# is usually enough) thus allowing to download the whole file successfully.
# To be future-proof we will retry all fragments that fail with any
# HTTP error.
count += 1
if count <= fragment_retries:
self.report_retry_fragment(err, segment_name, count, fragment_retries)
if count > fragment_retries:
if not fatal:
self.report_skip_fragment(segment_name)
return True
self.report_error('giving up after %s fragment retries' % fragment_retries)
return False
return True
for i, segment in enumerate(segments):
if not process_segment(segment, ctx['tmpfilename'], i):
return False
self._finish_frag_download(ctx)

View File

@@ -6,6 +6,7 @@ import sys
import re
from .common import FileDownloader
from ..compat import compat_setenv
from ..postprocessor.ffmpeg import FFmpegPostProcessor, EXT_TO_OUT_FORMATS
from ..utils import (
cli_option,
@@ -84,7 +85,7 @@ class ExternalFD(FileDownloader):
cmd, stderr=subprocess.PIPE)
_, stderr = p.communicate()
if p.returncode != 0:
self.to_stderr(stderr)
self.to_stderr(stderr.decode('utf-8', 'replace'))
return p.returncode
@@ -95,6 +96,12 @@ class CurlFD(ExternalFD):
cmd = [self.exe, '--location', '-o', tmpfilename]
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._bool_option('--continue-at', 'continuedl', '-', '0')
cmd += self._valueless_option('--silent', 'noprogress')
cmd += self._valueless_option('--verbose', 'verbose')
cmd += self._option('--limit-rate', 'ratelimit')
cmd += self._option('--retry', 'retries')
cmd += self._option('--max-filesize', 'max_filesize')
cmd += self._option('--interface', 'source_address')
cmd += self._option('--proxy', 'proxy')
cmd += self._valueless_option('--insecure', 'nocheckcertificate')
@@ -102,6 +109,16 @@ class CurlFD(ExternalFD):
cmd += ['--', info_dict['url']]
return cmd
def _call_downloader(self, tmpfilename, info_dict):
cmd = [encodeArgument(a) for a in self._make_cmd(tmpfilename, info_dict)]
self._debug_cmd(cmd)
# curl writes the progress to stderr so don't capture it.
p = subprocess.Popen(cmd)
p.communicate()
return p.returncode
class AxelFD(ExternalFD):
AVAILABLE_OPT = '-V'
@@ -198,6 +215,25 @@ class FFmpegFD(ExternalFD):
'-headers',
''.join('%s: %s\r\n' % (key, val) for key, val in headers.items())]
env = None
proxy = self.params.get('proxy')
if proxy:
if not re.match(r'^[\da-zA-Z]+://', proxy):
proxy = 'http://%s' % proxy
if proxy.startswith('socks'):
self.report_warning(
'%s does not support SOCKS proxies. Downloading is likely to fail. '
'Consider adding --hls-prefer-native to your command.' % self.get_basename())
# Since December 2015 ffmpeg supports -http_proxy option (see
# http://git.videolan.org/?p=ffmpeg.git;a=commit;h=b4eb1f29ebddd60c41a2eb39f5af701e38e0d3fd)
# We could switch to the following code if we are able to detect version properly
# args += ['-http_proxy', proxy]
env = os.environ.copy()
compat_setenv('HTTP_PROXY', proxy, env=env)
compat_setenv('http_proxy', proxy, env=env)
protocol = info_dict.get('protocol')
if protocol == 'rtmp':
@@ -224,8 +260,8 @@ class FFmpegFD(ExternalFD):
args += ['-rtmp_live', 'live']
args += ['-i', url, '-c', 'copy']
if protocol == 'm3u8':
if self.params.get('hls_use_mpegts', False):
if protocol in ('m3u8', 'm3u8_native'):
if self.params.get('hls_use_mpegts', False) or tmpfilename == '-':
args += ['-f', 'mpegts']
else:
args += ['-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
@@ -239,7 +275,7 @@ class FFmpegFD(ExternalFD):
self._debug_cmd(args)
proc = subprocess.Popen(args, stdin=subprocess.PIPE)
proc = subprocess.Popen(args, stdin=subprocess.PIPE, env=env)
try:
retval = proc.wait()
except KeyboardInterrupt:

View File

@@ -12,37 +12,49 @@ from ..compat import (
compat_urlparse,
compat_urllib_error,
compat_urllib_parse_urlparse,
compat_struct_pack,
compat_struct_unpack,
)
from ..utils import (
encodeFilename,
fix_xml_ampersands,
sanitize_open,
struct_pack,
struct_unpack,
xpath_text,
)
class DataTruncatedError(Exception):
pass
class FlvReader(io.BytesIO):
"""
Reader for Flv files
The file format is documented in https://www.adobe.com/devnet/f4v.html
"""
def read_bytes(self, n):
data = self.read(n)
if len(data) < n:
raise DataTruncatedError(
'FlvReader error: need %d bytes while only %d bytes got' % (
n, len(data)))
return data
# Utility functions for reading numbers and strings
def read_unsigned_long_long(self):
return struct_unpack('!Q', self.read(8))[0]
return compat_struct_unpack('!Q', self.read_bytes(8))[0]
def read_unsigned_int(self):
return struct_unpack('!I', self.read(4))[0]
return compat_struct_unpack('!I', self.read_bytes(4))[0]
def read_unsigned_char(self):
return struct_unpack('!B', self.read(1))[0]
return compat_struct_unpack('!B', self.read_bytes(1))[0]
def read_string(self):
res = b''
while True:
char = self.read(1)
char = self.read_bytes(1)
if char == b'\x00':
break
res += char
@@ -53,18 +65,18 @@ class FlvReader(io.BytesIO):
Read a box and return the info as a tuple: (box_size, box_type, box_data)
"""
real_size = size = self.read_unsigned_int()
box_type = self.read(4)
box_type = self.read_bytes(4)
header_end = 8
if size == 1:
real_size = self.read_unsigned_long_long()
header_end = 16
return real_size, box_type, self.read(real_size - header_end)
return real_size, box_type, self.read_bytes(real_size - header_end)
def read_asrt(self):
# version
self.read_unsigned_char()
# flags
self.read(3)
self.read_bytes(3)
quality_entry_count = self.read_unsigned_char()
# QualityEntryCount
for i in range(quality_entry_count):
@@ -85,7 +97,7 @@ class FlvReader(io.BytesIO):
# version
self.read_unsigned_char()
# flags
self.read(3)
self.read_bytes(3)
# time scale
self.read_unsigned_int()
@@ -119,7 +131,7 @@ class FlvReader(io.BytesIO):
# version
self.read_unsigned_char()
# flags
self.read(3)
self.read_bytes(3)
self.read_unsigned_int() # BootstrapinfoVersion
# Profile,Live,Update,Reserved
@@ -184,6 +196,11 @@ def build_fragments_list(boot_info):
first_frag_number = fragment_run_entry_table[0]['first']
fragments_counter = itertools.count(first_frag_number)
for segment, fragments_count in segment_run_table['segment_run']:
# In some live HDS streams (for example Rai), `fragments_count` is
# abnormal and causing out-of-memory errors. It's OK to change the
# number of fragments for live streams as they are updated periodically
if fragments_count == 4294967295 and boot_info['live']:
fragments_count = 2
for _ in range(fragments_count):
res.append((segment, next(fragments_counter)))
@@ -194,11 +211,11 @@ def build_fragments_list(boot_info):
def write_unsigned_int(stream, val):
stream.write(struct_pack('!I', val))
stream.write(compat_struct_pack('!I', val))
def write_unsigned_int_24(stream, val):
stream.write(struct_pack('!I', val)[1:])
stream.write(compat_struct_pack('!I', val)[1:])
def write_flv_header(stream):
@@ -223,6 +240,12 @@ def write_metadata_tag(stream, metadata):
write_unsigned_int(stream, FLV_TAG_HEADER_LEN + len(metadata))
def remove_encrypted_media(media):
return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
'drmAdditionalHeaderSetId' not in e.attrib,
media))
def _add_ns(prop):
return '{http://ns.adobe.com/f4m/1.0}%s' % prop
@@ -244,9 +267,7 @@ class F4mFD(FragmentFD):
# without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
if 'id' not in e.attrib:
self.report_error('Missing ID in f4m DRM')
media = list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
'drmAdditionalHeaderSetId' not in e.attrib,
media))
media = remove_encrypted_media(media)
if not media:
self.report_error('Unsupported DRM')
return media
@@ -303,7 +324,7 @@ class F4mFD(FragmentFD):
doc = compat_etree_fromstring(manifest)
formats = [(int(f.attrib.get('bitrate', -1)), f)
for f in self._get_unencrypted_media(doc)]
if requested_bitrate is None:
if requested_bitrate is None or len(formats) == 1:
# get the best format
formats = sorted(formats, key=lambda f: f[0])
rate, media = formats[-1]
@@ -313,7 +334,11 @@ class F4mFD(FragmentFD):
base_url = compat_urlparse.urljoin(man_url, media.attrib['url'])
bootstrap_node = doc.find(_add_ns('bootstrapInfo'))
boot_info, bootstrap_url = self._parse_bootstrap_node(bootstrap_node, base_url)
# From Adobe F4M 3.0 spec:
# The <baseURL> element SHALL be the base URL for all relative
# (HTTP-based) URLs in the manifest. If <baseURL> is not present, said
# URLs should be relative to the location of the containing document.
boot_info, bootstrap_url = self._parse_bootstrap_node(bootstrap_node, man_url)
live = boot_info['live']
metadata_node = media.find(_add_ns('metadata'))
if metadata_node is not None:
@@ -370,7 +395,17 @@ class F4mFD(FragmentFD):
down.close()
reader = FlvReader(down_data)
while True:
_, box_type, box_data = reader.read_box_info()
try:
_, box_type, box_data = reader.read_box_info()
except DataTruncatedError:
if test:
# In tests, segments may be truncated, and thus
# FlvReader may not be able to parse the whole
# chunk. If so, write the segment as is
# See https://github.com/rg3/youtube-dl/issues/9214
dest_stream.write(down_data)
break
raise
if box_type == b'mdat':
dest_stream.write(box_data)
break

View File

@@ -6,6 +6,7 @@ import time
from .common import FileDownloader
from .http import HttpFD
from ..utils import (
error_to_compat_str,
encodeFilename,
sanitize_open,
)
@@ -19,8 +20,23 @@ class HttpQuietDownloader(HttpFD):
class FragmentFD(FileDownloader):
"""
A base file downloader class for fragmented media (e.g. f4m/m3u8 manifests).
Available options:
fragment_retries: Number of times to retry a fragment for HTTP error (DASH
and hlsnative only)
skip_unavailable_fragments:
Skip unavailable fragments (DASH and hlsnative only)
"""
def report_retry_fragment(self, err, fragment_name, count, retries):
self.to_screen(
'[download] Got server HTTP error: %s. Retrying fragment %s (attempt %d of %s)...'
% (error_to_compat_str(err), fragment_name, count, self.format_retries(retries)))
def report_skip_fragment(self, fragment_name):
self.to_screen('[download] Skipping fragment %s...' % fragment_name)
def _prepare_and_start_frag_download(self, ctx):
self._prepare_frag_download(ctx)
self._start_frag_download(ctx)

View File

@@ -2,13 +2,26 @@ from __future__ import unicode_literals
import os.path
import re
import binascii
try:
from Crypto.Cipher import AES
can_decrypt_frag = True
except ImportError:
can_decrypt_frag = False
from .fragment import FragmentFD
from .external import FFmpegFD
from ..compat import compat_urlparse
from ..compat import (
compat_urllib_error,
compat_urlparse,
compat_struct_pack,
)
from ..utils import (
encodeFilename,
sanitize_open,
parse_m3u8_attributes,
update_url_query,
)
@@ -17,42 +30,135 @@ class HlsFD(FragmentFD):
FD_NAME = 'hlsnative'
@staticmethod
def can_download(manifest):
UNSUPPORTED_FEATURES = (
r'#EXT-X-KEY:METHOD=(?!NONE|AES-128)', # encrypted streams [1]
r'#EXT-X-BYTERANGE', # playlists composed of byte ranges of media files [2]
# Live streams heuristic does not always work (e.g. geo restricted to Germany
# http://hls-geo.daserste.de/i/videoportal/Film/c_620000/622873/format,716451,716457,716450,716458,716459,.mp4.csmil/index_4_av.m3u8?null=0)
# r'#EXT-X-MEDIA-SEQUENCE:(?!0$)', # live streams [3]
# This heuristic also is not correct since segments may not be appended as well.
# Twitch vods of finished streams have EXT-X-PLAYLIST-TYPE:EVENT despite
# no segments will definitely be appended to the end of the playlist.
# r'#EXT-X-PLAYLIST-TYPE:EVENT', # media segments may be appended to the end of
# # event media playlists [4]
# 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.4
# 2. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.2
# 3. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.2
# 4. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.5
)
check_results = [not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES]
check_results.append(can_decrypt_frag or '#EXT-X-KEY:METHOD=AES-128' not in manifest)
return all(check_results)
def real_download(self, filename, info_dict):
man_url = info_dict['url']
self.to_screen('[%s] Downloading m3u8 manifest' % self.FD_NAME)
manifest = self.ydl.urlopen(man_url).read()
s = manifest.decode('utf-8', 'ignore')
fragment_urls = []
if not self.can_download(s):
self.report_warning(
'hlsnative has detected features it does not support, '
'extraction will be delegated to ffmpeg')
fd = FFmpegFD(self.ydl, self.params)
for ph in self._progress_hooks:
fd.add_progress_hook(ph)
return fd.real_download(filename, info_dict)
total_frags = 0
for line in s.splitlines():
line = line.strip()
if line and not line.startswith('#'):
segment_url = (
line
if re.match(r'^https?://', line)
else compat_urlparse.urljoin(man_url, line))
fragment_urls.append(segment_url)
# We only download the first fragment during the test
if self.params.get('test', False):
break
total_frags += 1
ctx = {
'filename': filename,
'total_frags': len(fragment_urls),
'total_frags': total_frags,
}
self._prepare_and_start_frag_download(ctx)
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
test = self.params.get('test', False)
extra_query = None
extra_param_to_segment_url = info_dict.get('extra_param_to_segment_url')
if extra_param_to_segment_url:
extra_query = compat_urlparse.parse_qs(extra_param_to_segment_url)
i = 0
media_sequence = 0
decrypt_info = {'METHOD': 'NONE'}
frags_filenames = []
for i, frag_url in enumerate(fragment_urls):
frag_filename = '%s-Frag%d' % (ctx['tmpfilename'], i)
success = ctx['dl'].download(frag_filename, {'url': frag_url})
if not success:
return False
down, frag_sanitized = sanitize_open(frag_filename, 'rb')
ctx['dest_stream'].write(down.read())
down.close()
frags_filenames.append(frag_sanitized)
for line in s.splitlines():
line = line.strip()
if line:
if not line.startswith('#'):
frag_url = (
line
if re.match(r'^https?://', line)
else compat_urlparse.urljoin(man_url, line))
frag_name = 'Frag%d' % i
frag_filename = '%s-%s' % (ctx['tmpfilename'], frag_name)
if extra_query:
frag_url = update_url_query(frag_url, extra_query)
count = 0
while count <= fragment_retries:
try:
success = ctx['dl'].download(frag_filename, {'url': frag_url})
if not success:
return False
down, frag_sanitized = sanitize_open(frag_filename, 'rb')
frag_content = down.read()
down.close()
break
except compat_urllib_error.HTTPError as err:
# Unavailable (possibly temporary) fragments may be served.
# First we try to retry then either skip or abort.
# See https://github.com/rg3/youtube-dl/issues/10165,
# https://github.com/rg3/youtube-dl/issues/10448).
count += 1
if count <= fragment_retries:
self.report_retry_fragment(err, frag_name, count, fragment_retries)
if count > fragment_retries:
if skip_unavailable_fragments:
i += 1
media_sequence += 1
self.report_skip_fragment(frag_name)
continue
self.report_error(
'giving up after %s fragment retries' % fragment_retries)
return False
if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
frag_content = AES.new(
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
ctx['dest_stream'].write(frag_content)
frags_filenames.append(frag_sanitized)
# We only download the first fragment during the test
if test:
break
i += 1
media_sequence += 1
elif line.startswith('#EXT-X-KEY'):
decrypt_info = parse_m3u8_attributes(line[11:])
if decrypt_info['METHOD'] == 'AES-128':
if 'IV' in decrypt_info:
decrypt_info['IV'] = binascii.unhexlify(decrypt_info['IV'][2:].zfill(32))
if not re.match(r'^https?://', decrypt_info['URI']):
decrypt_info['URI'] = compat_urlparse.urljoin(
man_url, decrypt_info['URI'])
if extra_query:
decrypt_info['URI'] = update_url_query(decrypt_info['URI'], extra_query)
decrypt_info['KEY'] = self.ydl.urlopen(decrypt_info['URI']).read()
elif line.startswith('#EXT-X-MEDIA-SEQUENCE'):
media_sequence = int(line[22:])
self._finish_frag_download(ctx)

View File

@@ -27,6 +27,8 @@ class RtspFD(FileDownloader):
self.report_error('MMS or RTSP download detected but neither "mplayer" nor "mpv" could be run. Please install any.')
return False
self._debug_cmd(args)
retval = subprocess.call(args)
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))

File diff suppressed because it is too large Load Diff

View File

@@ -7,12 +7,13 @@ from ..utils import (
ExtractorError,
js_to_json,
int_or_none,
parse_iso8601,
)
class ABCIE(InfoExtractor):
IE_NAME = 'abc.net.au'
_VALID_URL = r'http://www\.abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.abc.net.au/news/2014-11-05/australia-to-staff-ebola-treatment-centre-in-sierra-leone/5868334',
@@ -93,3 +94,59 @@ class ABCIE(InfoExtractor):
'description': self._og_search_description(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
}
class ABCIViewIE(InfoExtractor):
IE_NAME = 'abc.net.au:iview'
_VALID_URL = r'https?://iview\.abc\.net\.au/programs/[^/]+/(?P<id>[^/?#]+)'
# ABC iview programs are normally available for 14 days only.
_TESTS = [{
'url': 'http://iview.abc.net.au/programs/gardening-australia/FA1505V024S00',
'md5': '979d10b2939101f0d27a06b79edad536',
'info_dict': {
'id': 'FA1505V024S00',
'ext': 'mp4',
'title': 'Series 27 Ep 24',
'description': 'md5:b28baeae7504d1148e1d2f0e3ed3c15d',
'upload_date': '20160820',
'uploader_id': 'abc1',
'timestamp': 1471719600,
},
'skip': 'Video gone',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_params = self._parse_json(self._search_regex(
r'videoParams\s*=\s*({.+?});', webpage, 'video params'), video_id)
title = video_params['title']
stream = next(s for s in video_params['playlist'] if s.get('type') == 'program')
formats = self._extract_akamai_formats(stream['hds-unmetered'], video_id)
self._sort_formats(formats)
subtitles = {}
src_vtt = stream.get('captions', {}).get('src-vtt')
if src_vtt:
subtitles['en'] = [{
'url': src_vtt,
'ext': 'vtt',
}]
return {
'id': video_id,
'title': title,
'description': self._html_search_meta(['og:description', 'twitter:description'], webpage),
'thumbnail': self._html_search_meta(['og:image', 'twitter:image:src'], webpage),
'duration': int_or_none(video_params.get('eventDuration')),
'timestamp': parse_iso8601(video_params.get('pubDate'), ' '),
'series': video_params.get('seriesTitle'),
'series_id': video_params.get('seriesHouseNumber') or video_id[:7],
'episode_number': int_or_none(self._html_search_meta('episodeNumber', webpage)),
'episode': self._html_search_meta('episode_title', webpage),
'uploader_id': video_params.get('channel'),
'formats': formats,
'subtitles': subtitles,
}

View File

@@ -0,0 +1,135 @@
# coding: utf-8
from __future__ import unicode_literals
import calendar
import re
import time
from .amp import AMPIE
from .common import InfoExtractor
from ..compat import compat_urlparse
class AbcNewsVideoIE(AMPIE):
IE_NAME = 'abcnews:video'
_VALID_URL = r'https?://abcnews\.go\.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'
_TESTS = [{
'url': 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932',
'info_dict': {
'id': '20411932',
'ext': 'mp4',
'display_id': 'week-exclusive-irans-foreign-minister-zarif',
'title': '\'This Week\' Exclusive: Iran\'s Foreign Minister Zarif',
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180,
'thumbnail': 're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('display_id')
video_id = mobj.group('id')
info_dict = self._extract_feed_info(
'http://abcnews.go.com/video/itemfeed?id=%s' % video_id)
info_dict.update({
'id': video_id,
'display_id': display_id,
})
return info_dict
class AbcNewsIE(InfoExtractor):
IE_NAME = 'abcnews'
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
_TESTS = [{
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
'info_dict': {
'id': '10498713',
'ext': 'flv',
'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
'thumbnail': 're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
},
'add_ie': ['AbcNewsVideo'],
}, {
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
'info_dict': {
'id': '39125818',
'ext': 'mp4',
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
'title': 'Justin Timberlake Drops Hints For Secret Single',
'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.',
'upload_date': '20160515',
'timestamp': 1463329500,
},
'params': {
# m3u8 download
'skip_download': True,
# The embedded YouTube video is blocked due to copyright issues
'playlist_items': '1',
},
'add_ie': ['AbcNewsVideo'],
}, {
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('display_id')
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'window\.abcnvideo\.url\s*=\s*"([^"]+)"', webpage, 'video URL')
full_video_url = compat_urlparse.urljoin(url, video_url)
youtube_url = self._html_search_regex(
r'<iframe[^>]+src="(https://www\.youtube\.com/embed/[^"]+)"',
webpage, 'YouTube URL', default=None)
timestamp = None
date_str = self._html_search_regex(
r'<span[^>]+class="timestamp">([^<]+)</span>',
webpage, 'timestamp', fatal=False)
if date_str:
tz_offset = 0
if date_str.endswith(' ET'): # Eastern Time
tz_offset = -5
date_str = date_str[:-3]
date_formats = ['%b. %d, %Y', '%b %d, %Y, %I:%M %p']
for date_format in date_formats:
try:
timestamp = calendar.timegm(time.strptime(date_str.strip(), date_format))
except ValueError:
continue
if timestamp is not None:
timestamp -= tz_offset * 3600
entry = {
'_type': 'url_transparent',
'ie_key': AbcNewsVideoIE.ie_key(),
'url': full_video_url,
'id': video_id,
'display_id': display_id,
'timestamp': timestamp,
}
if youtube_url:
entries = [entry, self.url_result(youtube_url, 'Youtube')]
return self.playlist_result(entries)
return entry

View File

@@ -1,13 +1,19 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import parse_iso8601
from ..utils import (
int_or_none,
parse_iso8601,
)
class Abc7NewsIE(InfoExtractor):
_VALID_URL = r'https?://abc7news\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
class ABCOTVSIE(InfoExtractor):
IE_NAME = 'abcotvs'
IE_DESC = 'ABC Owned Television Stations'
_VALID_URL = r'https?://(?:abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
_TESTS = [
{
'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
@@ -15,7 +21,7 @@ class Abc7NewsIE(InfoExtractor):
'id': '472581',
'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
'ext': 'mp4',
'title': 'East Bay museum celebrates history of synthesized music',
'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
'thumbnail': 're:^https?://.*\.jpg$',
'timestamp': 1421123075,
@@ -41,9 +47,10 @@ class Abc7NewsIE(InfoExtractor):
webpage = self._download_webpage(url, display_id)
m3u8 = self._html_search_meta(
'contentURL', webpage, 'm3u8 url', fatal=True)
'contentURL', webpage, 'm3u8 url', fatal=True).split('?')[0]
formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
self._sort_formats(formats)
title = self._og_search_title(webpage).strip()
description = self._og_search_description(webpage).strip()
@@ -65,3 +72,41 @@ class Abc7NewsIE(InfoExtractor):
'uploader': uploader,
'formats': formats,
}
class ABCOTVSClipsIE(InfoExtractor):
IE_NAME = 'abcotvs:clips'
_VALID_URL = r'https?://clips\.abcotvs\.com/(?:[^/]+/)*video/(?P<id>\d+)'
_TEST = {
'url': 'https://clips.abcotvs.com/kabc/video/214814',
'info_dict': {
'id': '214814',
'ext': 'mp4',
'title': 'SpaceX launch pad explosion destroys rocket, satellite',
'description': 'md5:9f186e5ad8f490f65409965ee9c7be1b',
'upload_date': '20160901',
'timestamp': 1472756695,
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json('https://clips.abcotvs.com/vogo/video/getByIds?ids=' + video_id, video_id)['results'][0]
title = video_data['title']
formats = self._extract_m3u8_formats(
video_data['videoURL'].split('?')[0], video_id, 'mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnailURL'),
'duration': int_or_none(video_data.get('duration')),
'timestamp': int_or_none(video_data.get('pubDate')),
'formats': formats,
}

View File

@@ -2,10 +2,14 @@
from __future__ import unicode_literals
import re
import functools
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import int_or_none
from ..utils import (
int_or_none,
OnDemandPagedList,
)
class ACastIE(InfoExtractor):
@@ -26,13 +30,8 @@ class ACastIE(InfoExtractor):
def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups()
embed_page = self._download_webpage(
re.sub('(?:www\.)?acast\.com', 'embedcdn.acast.com', url), display_id)
cast_data = self._parse_json(self._search_regex(
r'window\[\'acast/queries\'\]\s*=\s*([^;]+);', embed_page, 'acast data'),
display_id)['GetAcast/%s/%s' % (channel, display_id)]
cast_data = self._download_json(
'https://embed.acast.com/api/acasts/%s/%s' % (channel, display_id), display_id)
return {
'id': compat_str(cast_data['id']),
'display_id': display_id,
@@ -58,15 +57,26 @@ class ACastChannelIE(InfoExtractor):
'playlist_mincount': 20,
}
_API_BASE_URL = 'https://www.acast.com/api/'
_PAGE_SIZE = 10
@classmethod
def suitable(cls, url):
return False if ACastIE.suitable(url) else super(ACastChannelIE, cls).suitable(url)
def _real_extract(self, url):
display_id = self._match_id(url)
channel_data = self._download_json(self._API_BASE_URL + 'channels/%s' % display_id, display_id)
casts = self._download_json(self._API_BASE_URL + 'channels/%s/acasts' % display_id, display_id)
entries = [self.url_result('https://www.acast.com/%s/%s' % (display_id, cast['url']), 'ACast') for cast in casts]
def _fetch_page(self, channel_slug, page):
casts = self._download_json(
self._API_BASE_URL + 'channels/%s/acasts?page=%s' % (channel_slug, page),
channel_slug, note='Download page %d of channel data' % page)
for cast in casts:
yield self.url_result(
'https://www.acast.com/%s/%s' % (channel_slug, cast['url']),
'ACast', cast['id'])
return self.playlist_result(entries, compat_str(channel_data['id']), channel_data['name'], channel_data.get('description'))
def _real_extract(self, url):
channel_slug = self._match_id(url)
channel_data = self._download_json(
self._API_BASE_URL + 'channels/%s' % channel_slug, channel_slug)
entries = OnDemandPagedList(functools.partial(
self._fetch_page, channel_slug), self._PAGE_SIZE)
return self.playlist_result(entries, compat_str(
channel_data['id']), channel_data['name'], channel_data.get('description'))

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urllib_parse,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse,
)
from ..utils import (
@@ -16,7 +16,7 @@ from ..utils import (
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'http://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_TESTS = [{
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
@@ -60,7 +60,7 @@ class AddAnimeIE(InfoExtractor):
confirm_url = (
parsed_url.scheme + '://' + parsed_url.netloc +
action + '?' +
compat_urllib_parse.urlencode({
compat_urllib_parse_urlencode({
'jschl_vc': vc, 'jschl_answer': compat_str(av_val)}))
self._download_webpage(
confirm_url, video_id,

File diff suppressed because it is too large Load Diff

View File

@@ -156,7 +156,10 @@ class AdobeTVVideoIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(url + '?format=json', video_id)
webpage = self._download_webpage(url, video_id)
video_data = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
formats = [{
'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),

View File

@@ -3,16 +3,14 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .turner import TurnerBaseIE
from ..utils import (
determine_ext,
ExtractorError,
float_or_none,
xpath_text,
int_or_none,
)
class AdultSwimIE(InfoExtractor):
class AdultSwimIE(TurnerBaseIE):
_VALID_URL = r'https?://(?:www\.)?adultswim\.com/videos/(?P<is_playlist>playlists/)?(?P<show_path>[^/]+)/(?P<episode_path>[^/?#]+)/?'
_TESTS = [{
@@ -83,6 +81,21 @@ class AdultSwimIE(InfoExtractor):
# m3u8 download
'skip_download': True,
}
}, {
# heroMetadata.trailer
'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/',
'info_dict': {
'id': 'I0LQFQkaSUaFp8PnAWHhoQ',
'ext': 'mp4',
'title': 'Decker - Inside Decker: A New Hero',
'description': 'md5:c916df071d425d62d70c86d4399d3ee0',
'duration': 249.008,
},
'params': {
# m3u8 download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest'],
}]
@staticmethod
@@ -133,79 +146,56 @@ class AdultSwimIE(InfoExtractor):
if video_info is None:
if bootstrapped_data.get('slugged_video', {}).get('slug') == episode_path:
video_info = bootstrapped_data['slugged_video']
else:
raise ExtractorError('Unable to find video info')
if not video_info:
video_info = bootstrapped_data.get(
'heroMetadata', {}).get('trailer', {}).get('video')
if not video_info:
video_info = bootstrapped_data.get('onlineOriginals', [None])[0]
if not video_info:
raise ExtractorError('Unable to find video info')
show = bootstrapped_data['show']
show_title = show['title']
stream = video_info.get('stream')
clips = [stream] if stream else video_info.get('clips')
if not clips:
raise ExtractorError(
'This video is only available via cable service provider subscription that'
' is not currently supported. You may want to use --cookies.'
if video_info.get('auth') is True else 'Unable to find stream or clips',
expected=True)
segment_ids = [clip['videoPlaybackID'] for clip in clips]
if stream and stream.get('videoPlaybackID'):
segment_ids = [stream['videoPlaybackID']]
elif video_info.get('clips'):
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
elif video_info.get('videoPlaybackID'):
segment_ids = [video_info['videoPlaybackID']]
else:
if video_info.get('auth') is True:
raise ExtractorError(
'This video is only available via cable service provider subscription that'
' is not currently supported. You may want to use --cookies.', expected=True)
else:
raise ExtractorError('Unable to find stream or clips')
episode_id = video_info['id']
episode_title = video_info['title']
episode_description = video_info['description']
episode_duration = video_info.get('duration')
episode_description = video_info.get('description')
episode_duration = int_or_none(video_info.get('duration'))
view_count = int_or_none(video_info.get('views'))
entries = []
for part_num, segment_id in enumerate(segment_ids):
segment_url = 'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=desktop' % segment_id
segement_info = self._extract_cvp_info(
'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=desktop' % segment_id,
segment_id, {
'secure': {
'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big',
'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do',
},
})
segment_title = '%s - %s' % (show_title, episode_title)
if len(segment_ids) > 1:
segment_title += ' Part %d' % (part_num + 1)
idoc = self._download_xml(
segment_url, segment_title,
'Downloading segment information', 'Unable to download segment information')
segment_duration = float_or_none(
xpath_text(idoc, './/trt', 'segment duration').strip())
formats = []
file_els = idoc.findall('.//files/file') or idoc.findall('./files/file')
unique_urls = []
unique_file_els = []
for file_el in file_els:
media_url = file_el.text
if not media_url or determine_ext(media_url) == 'f4m':
continue
if file_el.text not in unique_urls:
unique_urls.append(file_el.text)
unique_file_els.append(file_el)
for file_el in unique_file_els:
bitrate = file_el.attrib.get('bitrate')
ftype = file_el.attrib.get('type')
media_url = file_el.text
if determine_ext(media_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
media_url, segment_title, 'mp4', preference=0,
m3u8_id='hls', fatal=False))
else:
formats.append({
'format_id': '%s_%s' % (bitrate, ftype),
'url': file_el.text.strip(),
# The bitrate may not be a number (for example: 'iphone')
'tbr': int(bitrate) if bitrate.isdigit() else None,
})
self._sort_formats(formats)
entries.append({
segement_info.update({
'id': segment_id,
'title': segment_title,
'formats': formats,
'duration': segment_duration,
'description': episode_description
'description': episode_description,
})
entries.append(segement_info)
return {
'_type': 'playlist',
@@ -214,5 +204,6 @@ class AdultSwimIE(InfoExtractor):
'entries': entries,
'title': '%s - %s' % (show_title, episode_title),
'description': episode_description,
'duration': episode_duration
'duration': episode_duration,
'view_count': view_count,
}

View File

@@ -1,35 +1,147 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import smuggle_url
import re
from .theplatform import ThePlatformIE
from ..utils import (
smuggle_url,
update_url_query,
unescapeHTML,
extract_attributes,
get_element_by_attribute,
)
from ..compat import (
compat_urlparse,
)
class AENetworksIE(InfoExtractor):
class AENetworksBaseIE(ThePlatformIE):
_THEPLATFORM_KEY = 'crazyjava'
_THEPLATFORM_SECRET = 's3cr3t'
class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks'
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network'
_VALID_URL = r'https?://(?:www\.)?(?:(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:[^/]+/)+(?P<id>[^/]+?)(?:$|[?#])'
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)/full-movie)'
_TESTS = [{
'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false',
'info_dict': {
'id': 'g12m5Gyt3fdR',
'ext': 'mp4',
'title': "Bet You Didn't Know: Valentine's Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
'expected_warnings': ['JSON-LD'],
}, {
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': '8ff93eb073449f151d6b90c0ae1ef0c7',
'info_dict': {
'id': 'eg47EERs_JsZ',
'id': '22253814',
'ext': 'mp4',
'title': 'Winter Is Coming',
'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
'timestamp': 1338306241,
'upload_date': '20120529',
'uploader': 'AENE-NEW',
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/shows/ancient-aliens/season-1',
'info_dict': {
'id': '71889446852',
},
'playlist_mincount': 5,
}, {
'url': 'http://www.mylifetime.com/shows/atlanta-plastic',
'info_dict': {
'id': 'SERIES4317',
'title': 'Atlanta Plastic',
},
'playlist_mincount': 2,
}, {
'url': 'http://www.aetv.com/shows/duck-dynasty/season-9/episode-1',
'only_matching': True
}, {
'url': 'http://www.fyi.tv/shows/tiny-house-nation/season-1/episode-8',
'only_matching': True
}, {
'url': 'http://www.mylifetime.com/shows/project-runway-junior/season-1/episode-6',
'only_matching': True
}, {
'url': 'http://www.mylifetime.com/movies/center-stage-on-pointe/full-movie',
'only_matching': True
}]
_DOMAIN_TO_REQUESTOR_ID = {
'history.com': 'HISTORY',
'aetv.com': 'AETV',
'mylifetime.com': 'LIFETIME',
'fyi.tv': 'FYI',
}
def _real_extract(self, url):
domain, show_path, movie_display_id = re.match(self._VALID_URL, url).groups()
display_id = show_path or movie_display_id
webpage = self._download_webpage(url, display_id)
if show_path:
url_parts = show_path.split('/')
url_parts_len = len(url_parts)
if url_parts_len == 1:
entries = []
for season_url_path in re.findall(r'(?s)<li[^>]+data-href="(/shows/%s/season-\d+)"' % url_parts[0], webpage):
entries.append(self.url_result(
compat_urlparse.urljoin(url, season_url_path), 'AENetworks'))
return self.playlist_result(
entries, self._html_search_meta('aetn:SeriesId', webpage),
self._html_search_meta('aetn:SeriesTitle', webpage))
elif url_parts_len == 2:
entries = []
for episode_item in re.findall(r'(?s)<div[^>]+class="[^"]*episode-item[^"]*"[^>]*>', webpage):
episode_attributes = extract_attributes(episode_item)
episode_url = compat_urlparse.urljoin(
url, episode_attributes['data-canonical'])
entries.append(self.url_result(
episode_url, 'AENetworks',
episode_attributes['data-videoid']))
return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage))
query = {
'mbr': 'true',
'assetTypes': 'medium_video_s3'
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
r"media_url\s*=\s*'([^']+)'", webpage, 'video url')
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link.theplatform.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
if theplatform_metadata.get('AETN$isBehindWall'):
requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain]
resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating'])
query['auth'] = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
info.update(self._search_json_ld(webpage, video_id, fatal=False))
media_url = update_url_query(media_url, query)
media_url = self._sign_url(media_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
self._sort_formats(formats)
info.update({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
})
return info
class HistoryTopicIE(AENetworksBaseIE):
IE_NAME = 'history:topic'
IE_DESC = 'History.com Topic'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/(?:[^/]+/)?(?P<topic_id>[^/]+)(?:/[^/]+(?:/(?P<video_display_id>[^/?#]+))?)?'
_TESTS = [{
'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false',
'info_dict': {
'id': '40700995724',
'ext': 'mp4',
'title': "Bet You Didn't Know: Valentine's Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
'timestamp': 1375819729,
'upload_date': '20130806',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
@@ -37,30 +149,60 @@ class AENetworksIE(InfoExtractor):
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.aetv.com/shows/duck-dynasty/video/inlawful-entry',
'only_matching': True
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/videos',
'info_dict':
{
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 24,
}, {
'url': 'http://www.fyi.tv/shows/tiny-house-nation/videos/207-sq-ft-minnesota-prairie-cottage',
'only_matching': True
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
}, {
'url': 'http://www.mylifetime.com/shows/project-runway-junior/video/season-1/episode-6/superstar-clients',
'only_matching': True
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/speeches',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url_re = [
r'data-href="[^"]*/%s"[^>]+data-release-url="([^"]+)"' % video_id,
r"media_url\s*=\s*'([^']+)'"
]
video_url = self._search_regex(video_url_re, webpage, 'video url')
info = self._search_json_ld(webpage, video_id, fatal=False)
info.update({
def theplatform_url_result(self, theplatform_url, video_id, query):
return {
'_type': 'url_transparent',
'url': smuggle_url(video_url, {'sig': {'key': 'crazyjava', 'secret': 's3cr3t'}}),
})
return info
'id': video_id,
'url': smuggle_url(
update_url_query(theplatform_url, query),
{
'sig': {
'key': self._THEPLATFORM_KEY,
'secret': self._THEPLATFORM_SECRET,
},
'force_smil_url': True
}),
'ie_key': 'ThePlatform',
}
def _real_extract(self, url):
topic_id, video_display_id = re.match(self._VALID_URL, url).groups()
if video_display_id:
webpage = self._download_webpage(url, video_display_id)
release_url, video_id = re.search(r"_videoPlayer.play\('([^']+)'\s*,\s*'[^']+'\s*,\s*'(\d+)'\)", webpage).groups()
release_url = unescapeHTML(release_url)
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
'switch': 'hls'
})
else:
webpage = self._download_webpage(url, topic_id)
entries = []
for episode_item in re.findall(r'<a.+?data-release-url="[^"]+"[^>]*>', webpage):
video_attributes = extract_attributes(episode_item)
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls'
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))

View File

@@ -0,0 +1,133 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlparse,
compat_urlparse,
)
from ..utils import (
ExtractorError,
int_or_none,
xpath_element,
xpath_text,
)
class AfreecaTVIE(InfoExtractor):
IE_DESC = 'afreecatv.com'
_VALID_URL = r'''(?x)^
https?://(?:(live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
(?:
/app/(?:index|read_ucc_bbs)\.cgi|
/player/[Pp]layer\.(?:swf|html))
\?.*?\bnTitleNo=(?P<id>\d+)'''
_TESTS = [{
'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=',
'md5': 'f72c89fe7ecc14c1b5ce506c4996046e',
'info_dict': {
'id': '36164052',
'ext': 'mp4',
'title': '데일리 에이프릴 요정들의 시상식!',
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': 'dailyapril',
'uploader_id': 'dailyapril',
'upload_date': '20160503',
}
}, {
'url': 'http://afbbs.afreecatv.com:8080/app/read_ucc_bbs.cgi?nStationNo=16711924&nTitleNo=36153164&szBjId=dailyapril&nBbsNo=18605867',
'info_dict': {
'id': '36153164',
'title': "BJ유트루와 함께하는 '팅커벨 메이크업!'",
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': 'dailyapril',
'uploader_id': 'dailyapril',
},
'playlist_count': 2,
'playlist': [{
'md5': 'd8b7c174568da61d774ef0203159bf97',
'info_dict': {
'id': '36153164_1',
'ext': 'mp4',
'title': "BJ유트루와 함께하는 '팅커벨 메이크업!'",
'upload_date': '20160502',
},
}, {
'md5': '58f2ce7f6044e34439ab2d50612ab02b',
'info_dict': {
'id': '36153164_2',
'ext': 'mp4',
'title': "BJ유트루와 함께하는 '팅커벨 메이크업!'",
'upload_date': '20160502',
},
}],
}, {
'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
'only_matching': True,
}]
@staticmethod
def parse_video_key(key):
video_key = {}
m = re.match(r'^(?P<upload_date>\d{8})_\w+_(?P<part>\d+)$', key)
if m:
video_key['upload_date'] = m.group('upload_date')
video_key['part'] = m.group('part')
return video_key
def _real_extract(self, url):
video_id = self._match_id(url)
parsed_url = compat_urllib_parse_urlparse(url)
info_url = compat_urlparse.urlunparse(parsed_url._replace(
netloc='afbbs.afreecatv.com:8080',
path='/api/video/get_video_info.php'))
video_xml = self._download_xml(info_url, video_id)
if xpath_element(video_xml, './track/video/file') is None:
raise ExtractorError('Specified AfreecaTV video does not exist',
expected=True)
title = xpath_text(video_xml, './track/title', 'title')
uploader = xpath_text(video_xml, './track/nickname', 'uploader')
uploader_id = xpath_text(video_xml, './track/bj_id', 'uploader id')
duration = int_or_none(xpath_text(video_xml, './track/duration',
'duration'))
thumbnail = xpath_text(video_xml, './track/titleImage', 'thumbnail')
entries = []
for i, video_file in enumerate(video_xml.findall('./track/video/file')):
video_key = self.parse_video_key(video_file.get('key', ''))
if not video_key:
continue
entries.append({
'id': '%s_%s' % (video_id, video_key.get('part', i + 1)),
'title': title,
'upload_date': video_key.get('upload_date'),
'duration': int_or_none(video_file.get('duration')),
'url': video_file.text,
})
info = {
'id': video_id,
'title': title,
'uploader': uploader,
'uploader_id': uploader_id,
'duration': duration,
'thumbnail': thumbnail,
}
if len(entries) > 1:
info['_type'] = 'multi_video'
info['entries'] = entries
elif len(entries) == 1:
info['url'] = entries[0]['url']
info['upload_date'] = entries[0].get('upload_date')
else:
raise ExtractorError(
'No files found for the specified AfreecaTV video, either'
' the URL is incorrect or the video has been made private.',
expected=True)
return info

View File

@@ -6,7 +6,7 @@ from ..utils import int_or_none
class AftonbladetIE(InfoExtractor):
_VALID_URL = r'http://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
_VALID_URL = r'https?://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
'info_dict': {
@@ -24,10 +24,10 @@ class AftonbladetIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
# find internal video meta data
meta_url = 'http://aftonbladet-play.drlib.aptoma.no/video/%s.json'
meta_url = 'http://aftonbladet-play-metadata.cdn.drvideo.aptoma.no/video/%s.json'
player_config = self._parse_json(self._html_search_regex(
r'data-player-config="([^"]+)"', webpage, 'player config'), video_id)
internal_meta_id = player_config['videoId']
internal_meta_id = player_config['aptomaVideoId']
internal_meta_url = meta_url % internal_meta_id
internal_meta_json = self._download_json(
internal_meta_url, video_id, 'Downloading video meta data')

View File

@@ -4,7 +4,7 @@ from .common import InfoExtractor
class AlJazeeraIE(InfoExtractor):
_VALID_URL = r'http://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
_TEST = {
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',

View File

@@ -0,0 +1,91 @@
# coding: utf-8
from __future__ import unicode_literals
from .theplatform import ThePlatformIE
from ..utils import (
update_url_query,
parse_age_limit,
int_or_none,
)
class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?season-\d+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '',
'info_dict': {
'id': 's3MX01Nl4vPH',
'ext': 'mp4',
'title': 'Maron - Season 4 - Step 1',
'description': 'In denial about his current situation, Marc is reluctantly convinced by his friends to enter rehab. Starring Marc Maron and Constance Zimmer.',
'age_limit': 17,
'upload_date': '20160505',
'timestamp': 1462468831,
'uploader': 'AMCN',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://www.bbcamerica.com/shows/the-hunt/full-episodes/season-1/episode-01-the-hardest-challenge',
'only_matching': True,
}, {
'url': 'http://www.amc.com/shows/preacher/full-episodes/season-01/episode-00/pilot',
'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/million-dollar-matchmaker/season-01/episode-06-the-dumped-dj-and-shallow-hal',
'only_matching': True,
}, {
'url': 'http://www.ifc.com/movies/chaos',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
query = {
'mbr': 'true',
'manifest': 'm3u',
}
media_url = self._search_regex(r'window\.platformLinkURL\s*=\s*[\'"]([^\'"]+)', webpage, 'media url')
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link.theplatform.com/s/([^?]+)', media_url, 'theplatform_path'), display_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
video_id = theplatform_metadata['pid']
title = theplatform_metadata['title']
rating = theplatform_metadata['ratings'][0]['rating']
auth_required = self._search_regex(r'window\.authRequired\s*=\s*(true|false);', webpage, 'auth required')
if auth_required == 'true':
requestor_id = self._search_regex(r'window\.requestor_id\s*=\s*[\'"]([^\'"]+)', webpage, 'requestor id')
resource = self._get_mvpd_resource(requestor_id, title, video_id, rating)
query['auth'] = self._extract_mvpd_auth(url, video_id, requestor_id, resource)
media_url = update_url_query(media_url, query)
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
self._sort_formats(formats)
info.update({
'id': video_id,
'subtitles': subtitles,
'formats': formats,
'age_limit': parse_age_limit(parse_age_limit(rating)),
})
ns_keys = theplatform_metadata.get('$xmlns', {}).keys()
if ns_keys:
ns = list(ns_keys)[0]
series = theplatform_metadata.get(ns + '$show')
season_number = int_or_none(theplatform_metadata.get(ns + '$season'))
episode = theplatform_metadata.get(ns + '$episodeTitle')
episode_number = int_or_none(theplatform_metadata.get(ns + '$episode'))
if season_number:
title = 'Season %d - %s' % (season_number, title)
if series:
title = '%s - %s' % (series, title)
info.update({
'title': title,
'series': series,
'season_number': season_number,
'episode': episode,
'episode_number': episode_number,
})
return info

View File

@@ -5,6 +5,8 @@ from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
mimetype2ext,
determine_ext,
)
@@ -50,31 +52,37 @@ class AMPIE(InfoExtractor):
if isinstance(media_content, dict):
media_content = [media_content]
for media_data in media_content:
media = media_data['@attributes']
media_type = media['type']
if media_type == 'video/f4m':
media = media_data.get('@attributes', {})
media_url = media.get('url')
if not media_url:
continue
ext = mimetype2ext(media.get('type')) or determine_ext(media_url)
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
media['url'] + '?hdcore=3.4.0&plugin=aasp-3.4.0.132.124',
media_url + '?hdcore=3.4.0&plugin=aasp-3.4.0.132.124',
video_id, f4m_id='hds', fatal=False))
elif media_type == 'application/x-mpegURL':
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
media['url'], video_id, 'mp4', m3u8_id='hls', fatal=False))
media_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
formats.append({
'format_id': media_data['media-category']['@attributes']['label'],
'format_id': media_data.get('media-category', {}).get('@attributes', {}).get('label'),
'url': media['url'],
'tbr': int_or_none(media.get('bitrate')),
'filesize': int_or_none(media.get('fileSize')),
'ext': ext,
})
self._sort_formats(formats)
timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
return {
'id': video_id,
'title': get_media_node('title'),
'description': get_media_node('description'),
'thumbnails': thumbnails,
'timestamp': parse_iso8601(item.get('pubDate'), ' '),
'timestamp': timestamp,
'duration': int_or_none(media_content[0].get('@attributes', {}).get('duration')),
'subtitles': subtitles,
'formats': formats,

View File

@@ -3,10 +3,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..compat import (
compat_urlparse,
compat_str,
)
from ..utils import (
determine_ext,
encode_dict,
extract_attributes,
ExtractorError,
sanitized_Request,
urlencode_postdata,
@@ -19,6 +22,7 @@ class AnimeOnDemandIE(InfoExtractor):
_APPLY_HTML5_URL = 'https://www.anime-on-demand.de/html5apply'
_NETRC_MACHINE = 'animeondemand'
_TESTS = [{
# jap, OmU
'url': 'https://www.anime-on-demand.de/anime/161',
'info_dict': {
'id': '161',
@@ -27,13 +31,21 @@ class AnimeOnDemandIE(InfoExtractor):
},
'playlist_mincount': 4,
}, {
# Film wording is used instead of Episode
# Film wording is used instead of Episode, ger/jap, Dub/OmU
'url': 'https://www.anime-on-demand.de/anime/39',
'only_matching': True,
}, {
# Episodes without titles
# Episodes without titles, jap, OmU
'url': 'https://www.anime-on-demand.de/anime/162',
'only_matching': True,
}, {
# ger/jap, Dub/OmU, account required
'url': 'https://www.anime-on-demand.de/anime/169',
'only_matching': True,
}, {
# Full length film, non-series, ger/jap, Dub/OmU, account required
'url': 'https://www.anime-on-demand.de/anime/185',
'only_matching': True,
}]
def _login(self):
@@ -44,6 +56,10 @@ class AnimeOnDemandIE(InfoExtractor):
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
if '>Our licensing terms allow the distribution of animes only to German-speaking countries of Europe' in login_page:
self.raise_geo_restricted(
'%s is only available in German-speaking countries of Europe' % self.IE_NAME)
login_form = self._form_hidden_inputs('new_user', login_page)
login_form.update({
@@ -59,7 +75,7 @@ class AnimeOnDemandIE(InfoExtractor):
post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
request = sanitized_Request(
post_url, urlencode_postdata(encode_dict(login_form)))
post_url, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
@@ -99,78 +115,156 @@ class AnimeOnDemandIE(InfoExtractor):
entries = []
for num, episode_html in enumerate(re.findall(
r'(?s)<h3[^>]+class="episodebox-title".+?>Episodeninhalt<', webpage), 1):
episodebox_title = self._search_regex(
(r'class="episodebox-title"[^>]+title=(["\'])(?P<title>.+?)\1',
r'class="episodebox-title"[^>]+>(?P<title>.+?)<'),
episode_html, 'episodebox title', default=None, group='title')
if not episodebox_title:
continue
episode_number = int(self._search_regex(
r'(?:Episode|Film)\s*(\d+)',
episodebox_title, 'episode number', default=num))
episode_title = self._search_regex(
r'(?:Episode|Film)\s*\d+\s*-\s*(.+)',
episodebox_title, 'episode title', default=None)
video_id = 'episode-%d' % episode_number
common_info = {
'id': video_id,
'series': anime_title,
'episode': episode_title,
'episode_number': episode_number,
}
def extract_info(html, video_id, num=None):
title, description = [None] * 2
formats = []
playlist_url = self._search_regex(
r'data-playlist=(["\'])(?P<url>.+?)\1',
episode_html, 'data playlist', default=None, group='url')
if playlist_url:
request = sanitized_Request(
compat_urlparse.urljoin(url, playlist_url),
headers={
'X-Requested-With': 'XMLHttpRequest',
'X-CSRF-Token': csrf_token,
'Referer': url,
'Accept': 'application/json, text/javascript, */*; q=0.01',
})
for input_ in re.findall(
r'<input[^>]+class=["\'].*?streamstarter_html5[^>]+>', html):
attributes = extract_attributes(input_)
playlist_urls = []
for playlist_key in ('data-playlist', 'data-otherplaylist'):
playlist_url = attributes.get(playlist_key)
if isinstance(playlist_url, compat_str) and re.match(
r'/?[\da-zA-Z]+', playlist_url):
playlist_urls.append(attributes[playlist_key])
if not playlist_urls:
continue
playlist = self._download_json(
request, video_id, 'Downloading playlist JSON', fatal=False)
if playlist:
playlist = playlist['playlist'][0]
title = playlist['title']
lang = attributes.get('data-lang')
lang_note = attributes.get('value')
for playlist_url in playlist_urls:
kind = self._search_regex(
r'videomaterialurl/\d+/([^/]+)/',
playlist_url, 'media kind', default=None)
format_id_list = []
if lang:
format_id_list.append(lang)
if kind:
format_id_list.append(kind)
if not format_id_list and num is not None:
format_id_list.append(compat_str(num))
format_id = '-'.join(format_id_list)
format_note = ', '.join(filter(None, (kind, lang_note)))
request = sanitized_Request(
compat_urlparse.urljoin(url, playlist_url),
headers={
'X-Requested-With': 'XMLHttpRequest',
'X-CSRF-Token': csrf_token,
'Referer': url,
'Accept': 'application/json, text/javascript, */*; q=0.01',
})
playlist = self._download_json(
request, video_id, 'Downloading %s playlist JSON' % format_id,
fatal=False)
if not playlist:
continue
start_video = playlist.get('startvideo', 0)
playlist = playlist.get('playlist')
if not playlist or not isinstance(playlist, list):
continue
playlist = playlist[start_video]
title = playlist.get('title')
if not title:
continue
description = playlist.get('description')
for source in playlist.get('sources', []):
file_ = source.get('file')
if file_ and determine_ext(file_) == 'm3u8':
formats = self._extract_m3u8_formats(
if not file_:
continue
ext = determine_ext(file_)
format_id_list = [lang, kind]
if ext == 'm3u8':
format_id_list.append('hls')
elif source.get('type') == 'video/dash' or ext == 'mpd':
format_id_list.append('dash')
format_id = '-'.join(filter(None, format_id_list))
if ext == 'm3u8':
file_formats = self._extract_m3u8_formats(
file_, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls')
entry_protocol='m3u8_native', m3u8_id=format_id, fatal=False)
elif source.get('type') == 'video/dash' or ext == 'mpd':
continue
file_formats = self._extract_mpd_formats(
file_, video_id, mpd_id=format_id, fatal=False)
else:
continue
for f in file_formats:
f.update({
'language': lang,
'format_note': format_note,
})
formats.extend(file_formats)
if formats:
return {
'title': title,
'description': description,
'formats': formats,
}
def extract_entries(html, video_id, common_info, num=None):
info = extract_info(html, video_id, num)
if info['formats']:
self._sort_formats(info['formats'])
f = common_info.copy()
f.update({
'title': title,
'description': description,
'formats': formats,
})
f.update(info)
entries.append(f)
m = re.search(
r'data-dialog-header=(["\'])(?P<title>.+?)\1[^>]+href=(["\'])(?P<href>.+?)\3[^>]*>Teaser<',
episode_html)
if m:
f = common_info.copy()
f.update({
'id': '%s-teaser' % f['id'],
'title': m.group('title'),
'url': compat_urlparse.urljoin(url, m.group('href')),
})
entries.append(f)
# Extract teaser/trailer only when full episode is not available
if not info['formats']:
m = re.search(
r'data-dialog-header=(["\'])(?P<title>.+?)\1[^>]+href=(["\'])(?P<href>.+?)\3[^>]*>(?P<kind>Teaser|Trailer)<',
html)
if m:
f = common_info.copy()
f.update({
'id': '%s-%s' % (f['id'], m.group('kind').lower()),
'title': m.group('title'),
'url': compat_urlparse.urljoin(url, m.group('href')),
})
entries.append(f)
def extract_episodes(html):
for num, episode_html in enumerate(re.findall(
r'(?s)<h3[^>]+class="episodebox-title".+?>Episodeninhalt<', html), 1):
episodebox_title = self._search_regex(
(r'class="episodebox-title"[^>]+title=(["\'])(?P<title>.+?)\1',
r'class="episodebox-title"[^>]+>(?P<title>.+?)<'),
episode_html, 'episodebox title', default=None, group='title')
if not episodebox_title:
continue
episode_number = int(self._search_regex(
r'(?:Episode|Film)\s*(\d+)',
episodebox_title, 'episode number', default=num))
episode_title = self._search_regex(
r'(?:Episode|Film)\s*\d+\s*-\s*(.+)',
episodebox_title, 'episode title', default=None)
video_id = 'episode-%d' % episode_number
common_info = {
'id': video_id,
'series': anime_title,
'episode': episode_title,
'episode_number': episode_number,
}
extract_entries(episode_html, video_id, common_info)
def extract_film(html, video_id):
common_info = {
'id': anime_id,
'title': anime_title,
'description': anime_description,
}
extract_entries(html, video_id, common_info)
extract_episodes(webpage)
if not entries:
extract_film(webpage, anime_id)
return self.playlist_result(entries, anime_id, anime_title, anime_description)

View File

@@ -0,0 +1,224 @@
# coding: utf-8
from __future__ import unicode_literals
import base64
import hashlib
import json
import random
import time
from .common import InfoExtractor
from ..aes import aes_encrypt
from ..compat import compat_str
from ..utils import (
bytes_to_intlist,
determine_ext,
intlist_to_bytes,
int_or_none,
strip_jsonp,
)
def md5_text(s):
if not isinstance(s, compat_str):
s = compat_str(s)
return hashlib.md5(s.encode('utf-8')).hexdigest()
class AnvatoIE(InfoExtractor):
# Copied from anvplayer.min.js
_ANVACK_TABLE = {
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ',
'nbcu_nbcd_desktop_web_qa_1a6f01bdd0dc45a439043b694c8a031d': 'eSxJUbA2UUKBTXryyQ2d6NuM8oEqaPySvaPzfKNA',
'nbcu_nbcd_desktop_web_acc_eb2ff240a5d4ae9a63d4c297c32716b6c523a129': '89JR3RtUGbvKuuJIiKOMK0SoarLb5MUx8v89RcbP',
'nbcu_nbcd_watchvod_web_prod_e61107507180976724ec8e8319fe24ba5b4b60e1': 'Uc7dFt7MJ9GsBWB5T7iPvLaMSOt8BBxv4hAXk5vv',
'nbcu_nbcd_watchvod_web_qa_42afedba88a36203db5a4c09a5ba29d045302232': 'T12oDYVFP2IaFvxkmYMy5dKxswpLHtGZa4ZAXEi7',
'nbcu_nbcd_watchvod_web_acc_9193214448e2e636b0ffb78abacfd9c4f937c6ca': 'MmobcxUxMedUpohNWwXaOnMjlbiyTOBLL6d46ZpR',
'nbcu_local_monitor_web_acc_f998ad54eaf26acd8ee033eb36f39a7b791c6335': 'QvfIoPYrwsjUCcASiw3AIkVtQob2LtJHfidp9iWg',
'nbcu_cable_monitor_web_acc_a413759603e8bedfcd3c61b14767796e17834077': 'uwVPJLShvJWSs6sWEIuVem7MTF8A4IknMMzIlFto',
'nbcu_nbcd_mcpstage_web_qa_4c43a8f6e95a88dbb40276c0630ba9f693a63a4e': 'PxVYZVwjhgd5TeoPRxL3whssb5OUPnM3zyAzq8GY',
'nbcu_comcast_comcast_web_prod_074080762ad4ce956b26b43fb22abf153443a8c4': 'afnaRZfDyg1Z3WZHdupKfy6xrbAG2MHqe3VfuSwh',
'nbcu_comcast_comcast_web_qa_706103bb93ead3ef70b1de12a0e95e3c4481ade0': 'DcjsVbX9b3uoPlhdriIiovgFQZVxpISZwz0cx1ZK',
'nbcu_comcast_comcastcable_web_prod_669f04817536743563d7331c9293e59fbdbe3d07': '0RwMN2cWy10qhAhOscq3eK7aEe0wqnKt3vJ0WS4D',
'nbcu_comcast_comcastcable_web_qa_3d9d2d66219094127f0f6b09cc3c7bb076e3e1ca': '2r8G9DEya7PCqBceKZgrn2XkXgASjwLMuaFE1Aad',
'hearst_hearst_demo_web_stage_960726dfef3337059a01a78816e43b29ec04dfc7': 'cuZBPXTR6kSdoTCVXwk5KGA8rk3NrgGn4H6e9Dsp',
'anvato_mcpqa_demo_web_stage_18b55e00db5a13faa8d03ae6e41f6f5bcb15b922': 'IOaaLQ8ymqVyem14QuAvE5SndQynTcH5CrLkU2Ih',
'anvato_nextmedia_demo_web_stage_9787d56a02ff6b9f43e9a2b0920d8ca88beb5818': 'Pqu9zVzI1ApiIzbVA3VkGBEQHvdKSUuKpD6s2uaR',
'anvato_scripps_app_web_prod_0837996dbe373629133857ae9eb72e740424d80a': 'du1ccmn7RxzgizwbWU7hyUaGodNlJn7HtXI0WgXW',
'anvato_scripps_app_web_stage_360797e00fe2826be142155c4618cc52fce6c26c': '2PMrQ0BRoqCWl7nzphj0GouIMEh2mZYivAT0S1Su',
'fs2go_fs2go_go_all_prod_21934911ccfafc03a075894ead2260d11e2ddd24': 'RcuHlKikW2IJw6HvVoEkqq2UsuEJlbEl11pWXs4Q',
'fs2go_fs2go_go_web_prod_ead4b0eec7460c1a07783808db21b49cf1f2f9a7': '4K0HTT2u1zkQA2MaGaZmkLa1BthGSBdr7jllrhk5',
'fs2go_fs2go_go_web_stage_407585454a4400355d4391691c67f361': 'ftnc37VKRJBmHfoGGi3kT05bHyeJzilEzhKJCyl3',
'fs2go_fs2go_go_android_stage_44b714db6f8477f29afcba15a41e1d30': 'CtxpPvVpo6AbZGomYUhkKs7juHZwNml9b9J0J2gI',
'anvato_cbslocal_app_web_prod_547f3e49241ef0e5d30c79b2efbca5d92c698f67': 'Pw0XX5KBDsyRnPS0R2JrSrXftsy8Jnz5pAjaYC8s',
'anvato_cbslocal_app_web_stage_547a5f096594cd3e00620c6f825cad1096d28c80': '37OBUhX2uwNyKhhrNzSSNHSRPZpApC3trdqDBpuz',
'fs2go_att_att_web_prod_1042dddd089a05438b6a08f972941176f699ffd8': 'JLcF20JwYvpv6uAGcLWIaV12jKwaL1R8us4b6Zkg',
'fs2go_att_att_web_stage_807c5001955fc114a3331fe027ddc76e': 'gbu1oO1y0JiOFh4SUipt86P288JHpyjSqolrrT1x',
'fs2go_fs2go_tudor_web_prod_a7dd8e5a7cdc830cae55eae6f3e9fee5ee49eb9b': 'ipcp87VCEZXPPe868j3orLqzc03oTy7DXsGkAXXH',
'anvato_mhz_app_web_prod_b808218b30de7fdf60340cbd9831512bc1bf6d37': 'Stlm5Gs6BEhJLRTZHcNquyzxGqr23EuFmE5DCgjX',
'fs2go_charter_charter_web_stage_c2c6e5a68375a1bf00fff213d3ff8f61a835a54c': 'Lz4hbJp1fwL6jlcz4M2PMzghM4jp4aAmybtT5dPc',
'fs2go_charter_charter_web_prod_ebfe3b10f1af215a7321cd3d629e0b81dfa6fa8c': 'vUJsK345A1bVmyYDRhZX0lqFIgVXuqhmuyp1EtPK',
'anvato_epfox_app_web_prod_b3373168e12f423f41504f207000188daf88251b': 'GDKq1ixvX3MoBNdU5IOYmYa2DTUXYOozPjrCJnW7',
'anvato_epfox_app_web_stage_a3c2ce60f8f83ef374a88b68ee73a950f8ab87ce': '2jz2NH4BsXMaDsoJ5qkHMbcczAfIReo2eFYuVC1C',
'fs2go_verizon_verizon_web_stage_08e6df0354a4803f1b1f2428b5a9a382e8dbcd62': 'rKTVapNaAcmnUbGL4ZcuOoY4SE7VmZSQsblPFr7e',
'fs2go_verizon_verizon_web_prod_f909564cb606eff1f731b5e22e0928676732c445': 'qLSUuHerM3u9eNPzaHyUK52obai5MvE4XDJfqYe1',
'fs2go_foxcom_synd_web_stage_f7b9091f00ea25a4fdaaae77fca5b54cdc7e7043': '96VKF2vLd24fFiDfwPFpzM5llFN4TiIGAlodE0Re',
'fs2go_foxcom_synd_web_prod_0f2cdd64d87e4ab6a1d54aada0ff7a7c8387a064': 'agiPjbXEyEZUkbuhcnmVPhe9NNVbDjCFq2xkcx51',
'anvato_own_app_web_stage_1214ade5d28422c4dae9d03c1243aba0563c4dba': 'mzhamNac3swG4WsJAiUTacnGIODi6SWeVWk5D7ho',
'anvato_own_app_web_prod_944e162ed927ec3e9ed13eb68ed2f1008ee7565e': '9TSxh6G2TXOLBoYm9ro3LdNjjvnXpKb8UR8KoIP9',
'anvato_scripps_app_ftv_prod_a10a10468edd5afb16fb48171c03b956176afad1': 'COJ2i2UIPK7xZqIWswxe7FaVBOVgRkP1F6O6qGoH',
'anvato_scripps_app_ftv_stage_77d3ad2bdb021ec37ca2e35eb09acd396a974c9a': 'Q7nnopNLe2PPfGLOTYBqxSaRpl209IhqaEuDZi1F',
'anvato_univision_app_web_stage_551236ef07a0e17718c3995c35586b5ed8cb5031': 'D92PoLS6UitwxDRA191HUGT9OYcOjV6mPMa5wNyo',
'anvato_univision_app_web_prod_039a5c0a6009e637ae8ac906718a79911e0e65e1': '5mVS5u4SQjtw6NGw2uhMbKEIONIiLqRKck5RwQLR',
'nbcu_cnbc_springfield_ios_prod_670207fae43d6e9a94c351688851a2ce': 'M7fqCCIP9lW53oJbHs19OlJlpDrVyc2OL8gNeuTa',
'nbcu_cnbc_springfieldvod_ios_prod_7a5f04b1ceceb0e9c9e2264a44aa236e08e034c2': 'Yia6QbJahW0S7K1I0drksimhZb4UFq92xLBmmMvk',
'anvato_cox_app_web_prod_ce45cda237969f93e7130f50ee8bb6280c1484ab': 'cc0miZexpFtdoqZGvdhfXsLy7FXjRAOgb9V0f5fZ',
'anvato_cox_app_web_stage_c23dbe016a8e9d8c7101d10172b92434f6088bf9': 'yivU3MYHd2eDZcOfmLbINVtqxyecKTOp8OjOuoGJ',
'anvato_chnzero_app_web_stage_b1164d1352b579e792e542fddf13ee34c0eeb46b': 'A76QkXMmVH8lTCfU15xva1mZnSVcqeY4Xb22Kp7m',
'anvato_chnzero_app_web_prod_253d358928dc08ec161eda2389d53707288a730c': 'OA5QI3ZWZZkdtUEDqh28AH8GedsF6FqzJI32596b',
'anvato_discovery_vodpoc_web_stage_9fa7077b5e8af1f8355f65d4fb8d2e0e9d54e2b7': 'q3oT191tTQ5g3JCP67PkjLASI9s16DuWZ6fYmry3',
'anvato_discovery_vodpoc_web_prod_688614983167a1af6cdf6d76343fda10a65223c1': 'qRvRQCTVHd0VVOHsMvvfidyWmlYVrTbjby7WqIuK',
'nbcu_cnbc_springfieldvod_ftv_stage_826040aad1925a46ac5dfb4b3c5143e648c6a30d': 'JQaSb5a8Tz0PT4ti329DNmzDO30TnngTHmvX8Vua',
'nbcu_cnbc_springfield_ftv_stage_826040aad1925a46ac5dfb4b3c5143e648c6a30d': 'JQaSb5a8Tz0PT4ti329DNmzDO30TnngTHmvX8Vua',
'nbcu_nbcd_capture_web_stage_4dd9d585bfb984ebf856dee35db027b2465cc4ae': '0j1Ov4Vopyi2HpBZJYdL2m8ERJVGYh3nNpzPiO8F',
'nbcu_nbcd_watch3_android_prod_7712ca5fcf1c22f19ec1870a9650f9c37db22dcf': '3LN2UB3rPUAMu7ZriWkHky9vpLMXYha8JbSnxBlx',
'nbcu_nbcd_watchvod3_android_prod_0910a3a4692d57c0b5ff4316075bc5d096be45b9': 'mJagcQ2II30vUOAauOXne7ERwbf5S9nlB3IP17lQ',
'anvato_scripps_app_atv_prod_790deda22e16e71e83df58f880cd389908a45d52': 'CB6trI1mpoDIM5o54DNTsji90NDBQPZ4z4RqBNSH',
'nbcu_nbcd_watchv4_android_prod_ff67cef9cb409158c6f8c3533edddadd0b750507': 'j8CHQCUWjlYERj4NFRmUYOND85QNbHViH09UwuKm',
'nbcu_nbcd_watchvodv4_android_prod_a814d781609989dea6a629d50ae4c7ad8cc8e907': 'rkVnUXxdA9rawVLUlDQtMue9Y4Q7lFEaIotcUhjt',
'rvVKpA50qlOPLFxMjrCGf5pdkdQDm7qn': '1J7ZkY5Qz5lMLi93QOH9IveE7EYB3rLl',
'nbcu_dtv_local_web_prod_b266cf49defe255fd4426a97e27c09e513e9f82f': 'HuLnJDqzLa4saCzYMJ79zDRSQpEduw1TzjMNQu2b',
'nbcu_att_local_web_prod_4cef038b2d969a6b7d700a56a599040b6a619f67': 'Q0Em5VDc2KpydUrVwzWRXAwoNBulWUxCq2faK0AV',
'nbcu_dish_local_web_prod_c56dcaf2da2e9157a4266c82a78195f1dd570f6b': 'bC1LWmRz9ayj2AlzizeJ1HuhTfIaJGsDBnZNgoRg',
'nbcu_verizon_local_web_prod_88bebd2ce006d4ed980de8133496f9a74cb9b3e1': 'wzhDKJZpgvUSS1EQvpCQP8Q59qVzcPixqDGJefSk',
'nbcu_charter_local_web_prod_9ad90f7fc4023643bb718f0fe0fd5beea2382a50': 'PyNbxNhEWLzy1ZvWEQelRuIQY88Eub7xbSVRMdfT',
'nbcu_suddenlink_local_web_prod_20fb711725cac224baa1c1cb0b1c324d25e97178': '0Rph41lPXZbb3fqeXtHjjbxfSrNbtZp1Ygq7Jypa',
'nbcu_wow_local_web_prod_652d9ce4f552d9c2e7b5b1ed37b8cb48155174ad': 'qayIBZ70w1dItm2zS42AptXnxW15mkjRrwnBjMPv',
'nbcu_centurylink_local_web_prod_2034402b029bf3e837ad46814d9e4b1d1345ccd5': 'StePcPMkjsX51PcizLdLRMzxMEl5k2FlsMLUNV4k',
'nbcu_atlanticbrd_local_web_prod_8d5f5ecbf7f7b2f5e6d908dd75d90ae3565f682e': 'NtYLb4TFUS0pRs3XTkyO5sbVGYjVf17bVbjaGscI',
'nbcu_nbcd_watchvod_web_dev_08bc05699be47c4f31d5080263a8cfadc16d0f7c': 'hwxi2dgDoSWgfmVVXOYZm14uuvku4QfopstXckhr',
'anvato_nextmedia_app_web_prod_a4fa8c7204aa65e71044b57aaf63711980cfe5a0': 'tQN1oGPYY1nM85rJYePWGcIb92TG0gSqoVpQTWOw',
'anvato_mcp_lin_web_prod_4c36fbfd4d8d8ecae6488656e21ac6d1ac972749': 'GUXNf5ZDX2jFUpu4WT2Go4DJ5nhUCzpnwDRRUx1K',
'anvato_mcp_univision_web_prod_37fe34850c99a3b5cdb71dab10a417dd5cdecafa': 'bLDYF8JqfG42b7bwKEgQiU9E2LTIAtnKzSgYpFUH',
'anvato_mcp_fs2go_web_prod_c7b90a93e171469cdca00a931211a2f556370d0a': 'icgGoYGipQMMSEvhplZX1pwbN69srwKYWksz3xWK',
'anvato_mcp_sps_web_prod_54bdc90dd6ba21710e9f7074338365bba28da336': 'fA2iQdI7RDpynqzQYIpXALVS83NTPr8LLFK4LFsu',
'anvato_mcp_anv_web_prod_791407490f4c1ef2a4bcb21103e0cb1bcb3352b3': 'rMOUZqe9lwcGq2mNgG3EDusm6lKgsUnczoOX3mbg',
'anvato_mcp_gray_web_prod_4c10f067c393ed8fc453d3930f8ab2b159973900': 'rMOUZqe9lwcGq2mNgG3EDusm6lKgsUnczoOX3mbg',
'anvato_mcp_hearst_web_prod_5356c3de0fc7c90a3727b4863ca7fec3a4524a99': 'P3uXJ0fXXditBPCGkfvlnVScpPEfKmc64Zv7ZgbK',
'anvato_mcp_cbs_web_prod_02f26581ff80e5bda7aad28226a8d369037f2cbe': 'mGPvo5ZA5SgjOFAPEPXv7AnOpFUICX8hvFQVz69n',
'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582': 'qyT6PXXLjVNCrHaRVj0ugAhalNRS7Ee9BP7LUokD',
'nbcu_nbcd_watchvodv4_web_stage_4108362fba2d4ede21f262fea3c4162cbafd66c7': 'DhaU5lj0W2gEdcSSsnxURq8t7KIWtJfD966crVDk',
'anvato_scripps_app_ios_prod_409c41960c60b308db43c3cc1da79cab9f1c3d93': 'WPxj5GraLTkYCyj3M7RozLqIycjrXOEcDGFMIJPn',
'EZqvRyKBJLrgpClDPDF8I7Xpdp40Vx73': '4OxGd2dEakylntVKjKF0UK9PDPYB6A9W',
'M2v78QkpleXm9hPp9jUXI63x5vA6BogR': 'ka6K32k7ZALmpINkjJUGUo0OE42Md1BQ',
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6_secure': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ'
}
_AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce'
def __init__(self, *args, **kwargs):
super(AnvatoIE, self).__init__(*args, **kwargs)
self.__server_time = None
def _server_time(self, access_key, video_id):
if self.__server_time is not None:
return self.__server_time
self.__server_time = int(self._download_json(
self._api_prefix(access_key) + 'server_time?anvack=' + access_key, video_id,
note='Fetching server time')['server_time'])
return self.__server_time
def _api_prefix(self, access_key):
return 'https://tkx2-%s.anvato.net/rest/v2/' % ('prod' if 'prod' in access_key else 'stage')
def _get_video_json(self, access_key, video_id):
# See et() in anvplayer.min.js, which is an alias of getVideoJSON()
video_data_url = self._api_prefix(access_key) + 'mcp/video/%s?anvack=%s' % (video_id, access_key)
server_time = self._server_time(access_key, video_id)
input_data = '%d~%s~%s' % (server_time, md5_text(video_data_url), md5_text(server_time))
auth_secret = intlist_to_bytes(aes_encrypt(
bytes_to_intlist(input_data[:64]), bytes_to_intlist(self._AUTH_KEY)))
video_data_url += '&X-Anvato-Adst-Auth=' + base64.b64encode(auth_secret).decode('ascii')
anvrid = md5_text(time.time() * 1000 * random.random())[:30]
payload = {
'api': {
'anvrid': anvrid,
'anvstk': md5_text('%s|%s|%d|%s' % (
access_key, anvrid, server_time, self._ANVACK_TABLE[access_key])),
'anvts': server_time,
},
}
return self._download_json(
video_data_url, video_id, transform_source=strip_jsonp,
data=json.dumps(payload).encode('utf-8'))
def _extract_anvato_videos(self, webpage, video_id):
anvplayer_data = self._parse_json(self._html_search_regex(
r'<script[^>]+data-anvp=\'([^\']+)\'', webpage,
'Anvato player data'), video_id)
video_id = anvplayer_data['video']
access_key = anvplayer_data['accessKey']
video_data = self._get_video_json(access_key, video_id)
formats = []
for published_url in video_data['published_urls']:
video_url = published_url['embed_url']
ext = determine_ext(video_url)
if ext == 'smil':
formats.extend(self._extract_smil_formats(video_url, video_id))
continue
tbr = int_or_none(published_url.get('kbps'))
a_format = {
'url': video_url,
'format_id': ('-'.join(filter(None, ['http', published_url.get('cdn_name')]))).lower(),
'tbr': tbr if tbr != 0 else None,
}
if ext == 'm3u8':
# Not using _extract_m3u8_formats here as individual media
# playlists are also included in published_urls.
if tbr is None:
formats.append(self._m3u8_meta_format(video_url, ext='mp4', m3u8_id='hls'))
continue
else:
a_format.update({
'format_id': '-'.join(filter(None, ['hls', compat_str(tbr)])),
'ext': 'mp4',
})
elif ext == 'mp3':
a_format['vcodec'] = 'none'
else:
a_format.update({
'width': int_or_none(published_url.get('width')),
'height': int_or_none(published_url.get('height')),
})
formats.append(a_format)
self._sort_formats(formats)
subtitles = {}
for caption in video_data.get('captions', []):
a_caption = {
'url': caption['url'],
'ext': 'tt' if caption.get('format') == 'SMPTE-TT' else None
}
subtitles.setdefault(caption['language'], []).append(a_caption)
return {
'id': video_id,
'formats': formats,
'title': video_data.get('def_title'),
'description': video_data.get('def_description'),
'categories': video_data.get('categories'),
'thumbnail': video_data.get('thumbnail'),
'subtitles': subtitles,
}

View File

@@ -1,31 +1,118 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
)
class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com'
_VALID_URL = r'(?:aol-video:|http://on\.aol\.com/video/.*-)(?P<id>[0-9]+)(?:$|\?)'
_VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_TESTS = [{
# video with 5min ID
'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img',
'md5': '18ef68f48740e86ae94b98da815eec42',
'info_dict': {
'id': '518167793',
'ext': 'mp4',
'title': 'U.S. Official Warns Of \'Largest Ever\' IRS Phone Scam',
'description': 'A major phone scam has cost thousands of taxpayers more than $1 million, with less than a month until income tax returns are due to the IRS.',
'timestamp': 1395405060,
'upload_date': '20140321',
'uploader': 'Newsy Studio',
},
'add_ie': ['FiveMin'],
'params': {
# m3u8 download
'skip_download': True,
}
}, {
# video with vidible ID
'url': 'http://on.aol.com/video/netflix-is-raising-rates-5707d6b8e4b090497b04f706?context=PC:homepage:PL1944:1460189336183',
'info_dict': {
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
'title': 'Netflix is Raising Rates',
'description': 'Netflix is rewarding millions of its long-standing members with an increase in cost. Veuers Carly Figueroa has more.',
'upload_date': '20160408',
'timestamp': 1460123280,
'uploader': 'Veuer',
},
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'http://on.aol.com/partners/abc-551438d309eab105804dbfe8/sneak-peek-was-haley-really-framed-570eaebee4b0448640a5c944',
'only_matching': True,
}, {
'url': 'http://on.aol.com/shows/park-bench-shw518173474-559a1b9be4b0c3bfad3357a7?context=SH:SHW518173474:PL4327:1460619712763',
'only_matching': True,
}, {
'url': 'http://on.aol.com/video/519442220',
'only_matching': True,
}, {
'url': 'aol-video:5707d6b8e4b090497b04f706',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result('5min:%s' % video_id)
response = self._download_json(
'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id,
video_id)['response']
if response['statusText'] != 'Ok':
raise ExtractorError('%s said: %s' % (self.IE_NAME, response['statusText']), expected=True)
video_data = response['data']
formats = []
m3u8_url = video_data.get('videoMasterPlaylist')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
for rendition in video_data.get('renditions', []):
video_url = rendition.get('url')
if not video_url:
continue
ext = rendition.get('format')
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
f = {
'url': video_url,
'format_id': rendition.get('quality'),
}
mobj = re.search(r'(\d+)x(\d+)', video_url)
if mobj:
f.update({
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
formats.append(f)
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
return {
'id': video_id,
'title': video_data['title'],
'duration': int_or_none(video_data.get('duration')),
'timestamp': int_or_none(video_data.get('publishDate')),
'view_count': int_or_none(video_data.get('views')),
'description': video_data.get('description'),
'uploader': video_data.get('videoOwner'),
'formats': formats,
}
class AolFeaturesIE(InfoExtractor):
IE_NAME = 'features.aol.com'
_VALID_URL = r'http://features\.aol\.com/video/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
@@ -36,6 +123,10 @@ class AolFeaturesIE(InfoExtractor):
'title': 'What To Watch - February 17, 2016',
},
'add_ie': ['FiveMin'],
'params': {
# encrypted m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):

View File

@@ -1,8 +1,6 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
@@ -15,7 +13,7 @@ class AparatIE(InfoExtractor):
_TEST = {
'url': 'http://www.aparat.com/v/wP8On',
'md5': '6714e0af7e0d875c5a39c4dc4ab46ad1',
'md5': '131aca2e14fe7c4dcb3c4877ba300c89',
'info_dict': {
'id': 'wP8On',
'ext': 'mp4',
@@ -31,13 +29,13 @@ class AparatIE(InfoExtractor):
# Note: There is an easier-to-parse configuration at
# http://www.aparat.com/video/video/config/videohash/%video_id
# but the URL in there does not work
embed_url = ('http://www.aparat.com/video/video/embed/videohash/' +
video_id + '/vt/frame')
embed_url = 'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id
webpage = self._download_webpage(embed_url, video_id)
video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
for i, video_url in enumerate(video_urls):
file_list = self._parse_json(self._search_regex(
r'fileList\s*=\s*JSON\.parse\(\'([^\']+)\'\)', webpage, 'file list'), video_id)
for i, item in enumerate(file_list[0]):
video_url = item['file']
req = HEADRequest(video_url)
res = self._request_webpage(
req, video_id, note='Testing video URL %d' % i, errnote=False)

View File

@@ -7,6 +7,8 @@ from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
int_or_none,
parse_duration,
unified_strdate,
)
@@ -16,7 +18,8 @@ class AppleTrailersIE(InfoExtractor):
_TESTS = [{
'url': 'http://trailers.apple.com/trailers/wb/manofsteel/',
'info_dict': {
'id': 'manofsteel',
'id': '5111',
'title': 'Man of Steel',
},
'playlist': [
{
@@ -70,6 +73,15 @@ class AppleTrailersIE(InfoExtractor):
'id': 'blackthorn',
},
'playlist_mincount': 2,
'expected_warnings': ['Unable to download JSON metadata'],
}, {
# json data only available from http://trailers.apple.com/trailers/feeds/data/15881.json
'url': 'http://trailers.apple.com/trailers/fox/kungfupanda3/',
'info_dict': {
'id': '15881',
'title': 'Kung Fu Panda 3',
},
'playlist_mincount': 4,
}, {
'url': 'http://trailers.apple.com/ca/metropole/autrui/',
'only_matching': True,
@@ -85,6 +97,45 @@ class AppleTrailersIE(InfoExtractor):
movie = mobj.group('movie')
uploader_id = mobj.group('company')
webpage = self._download_webpage(url, movie)
film_id = self._search_regex(r"FilmId\s*=\s*'(\d+)'", webpage, 'film id')
film_data = self._download_json(
'http://trailers.apple.com/trailers/feeds/data/%s.json' % film_id,
film_id, fatal=False)
if film_data:
entries = []
for clip in film_data.get('clips', []):
clip_title = clip['title']
formats = []
for version, version_data in clip.get('versions', {}).items():
for size, size_data in version_data.get('sizes', {}).items():
src = size_data.get('src')
if not src:
continue
formats.append({
'format_id': '%s-%s' % (version, size),
'url': re.sub(r'_(\d+p.mov)', r'_h\1', src),
'width': int_or_none(size_data.get('width')),
'height': int_or_none(size_data.get('height')),
'language': version[:2],
})
self._sort_formats(formats)
entries.append({
'id': movie + '-' + re.sub(r'[^a-zA-Z0-9]', '', clip_title).lower(),
'formats': formats,
'title': clip_title,
'thumbnail': clip.get('screen') or clip.get('thumb'),
'duration': parse_duration(clip.get('runtime') or clip.get('faded')),
'upload_date': unified_strdate(clip.get('posted')),
'uploader_id': uploader_id,
})
page_data = film_data.get('page', {})
return self.playlist_result(entries, film_id, page_data.get('movie_title'))
playlist_url = compat_urlparse.urljoin(url, 'includes/playlists/itunes.inc')
def fix_html(s):

View File

@@ -1,67 +1,65 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import unified_strdate
from .jwplatform import JWPlatformBaseIE
from ..utils import (
unified_strdate,
clean_html,
)
class ArchiveOrgIE(InfoExtractor):
class ArchiveOrgIE(JWPlatformBaseIE):
IE_NAME = 'archive.org'
IE_DESC = 'archive.org videos'
_VALID_URL = r'https?://(?:www\.)?archive\.org/details/(?P<id>[^?/]+)(?:[?].*)?$'
_VALID_URL = r'https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#]+)(?:[?].*)?$'
_TESTS = [{
'url': 'http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
'md5': '8af1d4cf447933ed3c7f4871162602db',
'info_dict': {
'id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect',
'ext': 'ogv',
'ext': 'ogg',
'title': '1968 Demo - FJCC Conference Presentation Reel #1',
'description': 'md5:1780b464abaca9991d8968c877bb53ed',
'description': 'md5:da45c349df039f1cc8075268eb1b5c25',
'upload_date': '19681210',
'uploader': 'SRI International'
}
}, {
'url': 'https://archive.org/details/Cops1922',
'md5': '18f2a19e6d89af8425671da1cf3d4e04',
'md5': 'bc73c8ab3838b5a8fc6c6651fa7b58ba',
'info_dict': {
'id': 'Cops1922',
'ext': 'ogv',
'ext': 'mp4',
'title': 'Buster Keaton\'s "Cops" (1922)',
'description': 'md5:70f72ee70882f713d4578725461ffcc3',
'description': 'md5:b4544662605877edd99df22f9620d858',
}
}, {
'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://archive.org/embed/' + video_id, video_id)
jwplayer_playlist = self._parse_json(self._search_regex(
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\);",
webpage, 'jwplayer playlist'), video_id)
info = self._parse_jwplayer_data(
{'playlist': jwplayer_playlist}, video_id, base_url=url)
json_url = url + ('&' if '?' in url else '?') + 'output=json'
data = self._download_json(json_url, video_id)
def get_optional(metadata, field):
return metadata.get(field, [None])[0]
def get_optional(data_dict, field):
return data_dict['metadata'].get(field, [None])[0]
title = get_optional(data, 'title')
description = get_optional(data, 'description')
uploader = get_optional(data, 'creator')
upload_date = unified_strdate(get_optional(data, 'date'))
formats = [
{
'format': fdata['format'],
'url': 'http://' + data['server'] + data['dir'] + fn,
'file_size': int(fdata['size']),
}
for fn, fdata in data['files'].items()
if 'Video' in fdata['format']]
self._sort_formats(formats)
return {
'_type': 'video',
'id': video_id,
'title': title,
'formats': formats,
'description': description,
'uploader': uploader,
'upload_date': upload_date,
'thumbnail': data.get('misc', {}).get('image'),
}
metadata = self._download_json(
'http://archive.org/details/' + video_id, video_id, query={
'output': 'json',
})['metadata']
info.update({
'title': get_optional(metadata, 'title') or info.get('title'),
'description': clean_html(get_optional(metadata, 'description')),
})
if info.get('_type') != 'playlist':
info.update({
'uploader': get_optional(metadata, 'creator'),
'upload_date': unified_strdate(get_optional(metadata, 'date')),
})
return info

View File

@@ -8,19 +8,19 @@ from .generic import GenericIE
from ..utils import (
determine_ext,
ExtractorError,
get_element_by_attribute,
qualities,
int_or_none,
parse_duration,
unified_strdate,
xpath_text,
update_url_query,
)
from ..compat import compat_etree_fromstring
class ARDMediathekIE(InfoExtractor):
IE_NAME = 'ARD:mediathek'
_VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.daserste\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
_VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
_TESTS = [{
'url': 'http://www.ardmediathek.de/tv/Dokumentation-und-Reportage/Ich-liebe-das-Leben-trotzdem/rbb-Fernsehen/Video?documentId=29582122&bcastId=3822114',
@@ -35,6 +35,7 @@ class ARDMediathekIE(InfoExtractor):
# m3u8 download
'skip_download': True,
},
'skip': 'HTTP Error 404: Not Found',
}, {
'url': 'http://www.ardmediathek.de/tv/Tatort/Tatort-Scheinwelten-H%C3%B6rfassung-Video/Das-Erste/Video?documentId=29522730&bcastId=602916',
'md5': 'f4d98b10759ac06c0072bbcd1f0b9e3e',
@@ -45,6 +46,7 @@ class ARDMediathekIE(InfoExtractor):
'description': 'md5:196392e79876d0ac94c94e8cdb2875f1',
'duration': 5252,
},
'skip': 'HTTP Error 404: Not Found',
}, {
# audio
'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
@@ -56,9 +58,22 @@ class ARDMediathekIE(InfoExtractor):
'description': 'md5:f6e39f3461f0e1f54bfa48c8875c86ef',
'duration': 3240,
},
'skip': 'HTTP Error 404: Not Found',
}, {
'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
'only_matching': True,
}, {
# audio
'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
'md5': '4e8f00631aac0395fee17368ac0e9867',
'info_dict': {
'id': '30796318',
'ext': 'mp3',
'title': 'Vor dem Fest',
'description': 'md5:c0c1c8048514deaed2a73b3a60eecacb',
'duration': 3287,
},
'skip': 'Video is no longer available',
}]
def _extract_media_info(self, media_info_url, webpage, video_id):
@@ -83,7 +98,7 @@ class ARDMediathekIE(InfoExtractor):
subtitle_url = media_info.get('_subtitleUrl')
if subtitle_url:
subtitles['de'] = [{
'ext': 'srt',
'ext': 'ttml',
'url': subtitle_url,
}]
@@ -114,11 +129,14 @@ class ARDMediathekIE(InfoExtractor):
continue
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
stream_url + '?hdcore=3.1.1&plugin=aasp-3.1.1.69.124',
video_id, preference=-1, f4m_id='hds', fatal=False))
update_url_query(stream_url, {
'hdcore': '3.1.1',
'plugin': 'aasp-3.1.1.69.124'
}),
video_id, f4m_id='hds', fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', preference=1, m3u8_id='hls', fatal=False))
stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
if server and server.startswith('rtmp'):
f = {
@@ -220,7 +238,7 @@ class ARDMediathekIE(InfoExtractor):
class ARDIE(InfoExtractor):
_VALID_URL = '(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
_TEST = {
'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
'md5': 'd216c3a86493f9322545e045ddc3eb35',
@@ -232,7 +250,8 @@ class ARDIE(InfoExtractor):
'title': 'Die Story im Ersten: Mission unter falscher Flagge',
'upload_date': '20140804',
'thumbnail': 're:^https?://.*\.jpg$',
}
},
'skip': 'HTTP Error 404: Not Found',
}
def _real_extract(self, url):
@@ -274,41 +293,3 @@ class ARDIE(InfoExtractor):
'upload_date': upload_date,
'thumbnail': thumbnail,
}
class SportschauIE(ARDMediathekIE):
IE_NAME = 'Sportschau'
_VALID_URL = r'(?P<baseurl>https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video(?P<id>[^/#?]+))\.html'
_TESTS = [{
'url': 'http://www.sportschau.de/tourdefrance/videoseppeltkokainhatnichtsmitklassischemdopingzutun100.html',
'info_dict': {
'id': 'seppeltkokainhatnichtsmitklassischemdopingzutun100',
'ext': 'mp4',
'title': 'Seppelt: "Kokain hat nichts mit klassischem Doping zu tun"',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'Der ARD-Doping Experte Hajo Seppelt gibt seine Einschätzung zum ersten Dopingfall der diesjährigen Tour de France um den Italiener Luca Paolini ab.',
},
'params': {
# m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
base_url = mobj.group('baseurl')
webpage = self._download_webpage(url, video_id)
title = get_element_by_attribute('class', 'headline', webpage)
description = self._html_search_meta('description', webpage, 'description')
info = self._extract_media_info(
base_url + '-mc_defaultQuality-h.json', webpage, video_id)
info.update({
'title': title,
'description': description,
})
return info

View File

@@ -0,0 +1,115 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
mimetype2ext,
parse_iso8601,
strip_jsonp,
)
class ArkenaIE(InfoExtractor):
_VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
_TESTS = [{
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
'info_dict': {
'id': 'b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe',
'ext': 'mp4',
'title': 'Big Buck Bunny',
'description': 'Royalty free test video',
'timestamp': 1432816365,
'upload_date': '20150528',
'is_live': False,
},
}, {
'url': 'https://play.arkena.com/config/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411/?callbackMethod=jQuery1111023664739129262213_1469227693893',
'only_matching': True,
}, {
'url': 'http://play.arkena.com/config/avp/v1/player/media/327336/darkmatter/131064/?callbackMethod=jQuery1111002221189684892677_1469227595972',
'only_matching': True,
}, {
'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
'only_matching': True,
}]
@staticmethod
def _extract_url(webpage):
# See https://support.arkena.com/display/PLAY/Ways+to+embed+your+video
mobj = re.search(
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//play\.arkena\.com/embed/avp/.+?)\1',
webpage)
if mobj:
return mobj.group('url')
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
account_id = mobj.group('account_id')
playlist = self._download_json(
'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
% (video_id, account_id),
video_id, transform_source=strip_jsonp)['Playlist'][0]
media_info = playlist['MediaInfo']
title = media_info['Title']
media_files = playlist['MediaFiles']
is_live = False
formats = []
for kind_case, kind_formats in media_files.items():
kind = kind_case.lower()
for f in kind_formats:
f_url = f.get('Url')
if not f_url:
continue
is_live = f.get('Live') == 'true'
exts = (mimetype2ext(f.get('Type')), determine_ext(f_url, None))
if kind == 'm3u8' or 'm3u8' in exts:
formats.extend(self._extract_m3u8_formats(
f_url, video_id, 'mp4',
entry_protocol='m3u8' if is_live else 'm3u8_native',
m3u8_id=kind, fatal=False, live=is_live))
elif kind == 'flash' or 'f4m' in exts:
formats.extend(self._extract_f4m_formats(
f_url, video_id, f4m_id=kind, fatal=False))
elif kind == 'dash' or 'mpd' in exts:
formats.extend(self._extract_mpd_formats(
f_url, video_id, mpd_id=kind, fatal=False))
elif kind == 'silverlight':
# TODO: process when ism is supported (see
# https://github.com/rg3/youtube-dl/issues/8118)
continue
else:
tbr = float_or_none(f.get('Bitrate'), 1000)
formats.append({
'url': f_url,
'format_id': '%s-%d' % (kind, tbr) if tbr else kind,
'tbr': tbr,
})
self._sort_formats(formats)
description = media_info.get('Description')
video_id = media_info.get('VideoId') or video_id
timestamp = parse_iso8601(media_info.get('PublishDate'))
thumbnails = [{
'url': thumbnail['Url'],
'width': int_or_none(thumbnail.get('Size')),
} for thumbnail in (media_info.get('Poster') or []) if thumbnail.get('Url')]
return {
'id': video_id,
'title': title,
'description': description,
'timestamp': timestamp,
'is_live': is_live,
'thumbnails': thumbnails,
'formats': formats,
}

View File

@@ -23,7 +23,7 @@ from ..utils import (
class ArteTvIE(InfoExtractor):
_VALID_URL = r'http://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
_VALID_URL = r'https?://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
IE_NAME = 'arte.tv'
def _real_extract(self, url):
@@ -61,10 +61,7 @@ class ArteTvIE(InfoExtractor):
}
class ArteTVPlus7IE(InfoExtractor):
IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/(?:(?:sendungen|emissions|embed)/)?(?P<id>[^/]+)/(?P<name>[^/?#&+])'
class ArteTVBaseIE(InfoExtractor):
@classmethod
def _extract_url_info(cls, url):
mobj = re.match(cls._VALID_URL, url)
@@ -78,6 +75,125 @@ class ArteTVPlus7IE(InfoExtractor):
video_id = mobj.group('id')
return video_id, lang
def _extract_from_json_url(self, json_url, video_id, lang, title=None):
info = self._download_json(json_url, video_id)
player_info = info['videoJsonPlayer']
upload_date_str = player_info.get('shootingDate')
if not upload_date_str:
upload_date_str = (player_info.get('VRA') or player_info.get('VDA') or '').split(' ')[0]
title = (player_info.get('VTI') or title or player_info['VID']).strip()
subtitle = player_info.get('VSU', '').strip()
if subtitle:
title += ' - %s' % subtitle
info_dict = {
'id': player_info['VID'],
'title': title,
'description': player_info.get('VDE'),
'upload_date': unified_strdate(upload_date_str),
'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'),
}
qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ'])
LANGS = {
'fr': 'F',
'de': 'A',
'en': 'E[ANG]',
'es': 'E[ESP]',
}
langcode = LANGS.get(lang, lang)
formats = []
for format_id, format_dict in player_info['VSR'].items():
f = dict(format_dict)
versionCode = f.get('versionCode')
l = re.escape(langcode)
# Language preference from most to least priority
# Reference: section 5.6.3 of
# http://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-05.pdf
PREFERENCES = (
# original version in requested language, without subtitles
r'VO{0}$'.format(l),
# original version in requested language, with partial subtitles in requested language
r'VO{0}-ST{0}$'.format(l),
# original version in requested language, with subtitles for the deaf and hard-of-hearing in requested language
r'VO{0}-STM{0}$'.format(l),
# non-original (dubbed) version in requested language, without subtitles
r'V{0}$'.format(l),
# non-original (dubbed) version in requested language, with subtitles partial subtitles in requested language
r'V{0}-ST{0}$'.format(l),
# non-original (dubbed) version in requested language, with subtitles for the deaf and hard-of-hearing in requested language
r'V{0}-STM{0}$'.format(l),
# original version in requested language, with partial subtitles in different language
r'VO{0}-ST(?!{0}).+?$'.format(l),
# original version in requested language, with subtitles for the deaf and hard-of-hearing in different language
r'VO{0}-STM(?!{0}).+?$'.format(l),
# original version in different language, with partial subtitles in requested language
r'VO(?:(?!{0}).+?)?-ST{0}$'.format(l),
# original version in different language, with subtitles for the deaf and hard-of-hearing in requested language
r'VO(?:(?!{0}).+?)?-STM{0}$'.format(l),
# original version in different language, without subtitles
r'VO(?:(?!{0}))?$'.format(l),
# original version in different language, with partial subtitles in different language
r'VO(?:(?!{0}).+?)?-ST(?!{0}).+?$'.format(l),
# original version in different language, with subtitles for the deaf and hard-of-hearing in different language
r'VO(?:(?!{0}).+?)?-STM(?!{0}).+?$'.format(l),
)
for pref, p in enumerate(PREFERENCES):
if re.match(p, versionCode):
lang_pref = len(PREFERENCES) - pref
break
else:
lang_pref = -1
format = {
'format_id': format_id,
'preference': -10 if f.get('videoFormat') == 'M3U8' else None,
'language_preference': lang_pref,
'format_note': '%s, %s' % (f.get('versionCode'), f.get('versionLibelle')),
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': int_or_none(f.get('bitrate')),
'quality': qfunc(f.get('quality')),
}
if f.get('mediaType') == 'rtmp':
format['url'] = f['streamer']
format['play_path'] = 'mp4:' + f['url']
format['ext'] = 'flv'
else:
format['url'] = f['url']
formats.append(format)
self._check_formats(formats, video_id)
self._sort_formats(formats)
info_dict['formats'] = formats
return info_dict
class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/[^/]+/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
'only_matching': True,
}, {
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url)
def _real_extract(self, url):
video_id, lang = self._extract_url_info(url)
webpage = self._download_webpage(url, video_id)
@@ -127,108 +243,47 @@ class ArteTVPlus7IE(InfoExtractor):
return self._extract_from_json_url(json_url, video_id, lang, title=title)
# Different kind of embed URL (e.g.
# http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
embed_url = self._search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1',
webpage, 'embed url', group='url')
return self.url_result(embed_url)
def _extract_from_json_url(self, json_url, video_id, lang, title=None):
info = self._download_json(json_url, video_id)
player_info = info['videoJsonPlayer']
upload_date_str = player_info.get('shootingDate')
if not upload_date_str:
upload_date_str = (player_info.get('VRA') or player_info.get('VDA') or '').split(' ')[0]
title = (player_info.get('VTI') or title or player_info['VID']).strip()
subtitle = player_info.get('VSU', '').strip()
if subtitle:
title += ' - %s' % subtitle
info_dict = {
'id': player_info['VID'],
'title': title,
'description': player_info.get('VDE'),
'upload_date': unified_strdate(upload_date_str),
'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'),
}
qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ'])
LANGS = {
'fr': 'F',
'de': 'A',
'en': 'E[ANG]',
'es': 'E[ESP]',
}
formats = []
for format_id, format_dict in player_info['VSR'].items():
f = dict(format_dict)
versionCode = f.get('versionCode')
langcode = LANGS.get(lang, lang)
lang_rexs = [r'VO?%s-' % re.escape(langcode), r'VO?.-ST%s$' % re.escape(langcode)]
lang_pref = None
if versionCode:
matched_lang_rexs = [r for r in lang_rexs if re.match(r, versionCode)]
lang_pref = -10 if not matched_lang_rexs else 10 * len(matched_lang_rexs)
source_pref = 0
if versionCode is not None:
# The original version with subtitles has lower relevance
if re.match(r'VO-ST(F|A|E)', versionCode):
source_pref -= 10
# The version with sourds/mal subtitles has also lower relevance
elif re.match(r'VO?(F|A|E)-STM\1', versionCode):
source_pref -= 9
format = {
'format_id': format_id,
'preference': -10 if f.get('videoFormat') == 'M3U8' else None,
'language_preference': lang_pref,
'format_note': '%s, %s' % (f.get('versionCode'), f.get('versionLibelle')),
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': int_or_none(f.get('bitrate')),
'quality': qfunc(f.get('quality')),
'source_preference': source_pref,
}
if f.get('mediaType') == 'rtmp':
format['url'] = f['streamer']
format['play_path'] = 'mp4:' + f['url']
format['ext'] = 'flv'
else:
format['url'] = f['url']
formats.append(format)
self._check_formats(formats, video_id)
self._sort_formats(formats)
info_dict['formats'] = formats
return info_dict
entries = [
self.url_result(url)
for _, url in re.findall(r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1', webpage)]
return self.playlist_result(entries)
# It also uses the arte_vp_url url from the webpage to extract the information
class ArteTVCreativeIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:creative'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:magazine?/)?(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design',
'url': 'http://creative.arte.tv/fr/episode/osmosis-episode-1',
'info_dict': {
'id': '72176',
'id': '057405-001-A',
'ext': 'mp4',
'title': 'Folge 2 - Corporate Design',
'upload_date': '20131004',
'title': 'OSMOSIS - N\'AYEZ PLUS PEUR D\'AIMER (1)',
'upload_date': '20150716',
},
}, {
'url': 'http://creative.arte.tv/fr/Monty-Python-Reunion',
'playlist_count': 11,
'add_ie': ['Youtube'],
}, {
'url': 'http://creative.arte.tv/de/episode/agentur-amateur-4-der-erste-kunde',
'only_matching': True,
}]
class ArteTVInfoIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:info'
_VALID_URL = r'https?://info\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://info.arte.tv/fr/service-civique-un-cache-misere',
'info_dict': {
'id': '160676',
'id': '067528-000-A',
'ext': 'mp4',
'title': 'Monty Python live (mostly)',
'description': 'Événement ! Quarante-cinq ans après leurs premiers succès, les légendaires Monty Python remontent sur scène.\n',
'upload_date': '20140805',
}
'title': 'Service civique, un cache misère ?',
'upload_date': '20160403',
},
}]
@@ -254,6 +309,8 @@ class ArteTVDDCIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:ddc'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>[^/?#&]+)'
_TESTS = []
def _real_extract(self, url):
video_id, lang = self._extract_url_info(url)
if lang == 'folge':
@@ -272,7 +329,7 @@ class ArteTVConcertIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:concert'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TEST = {
_TESTS = [{
'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde',
'md5': '9ea035b7bd69696b67aa2ccaaa218161',
'info_dict': {
@@ -282,24 +339,23 @@ class ArteTVConcertIE(ArteTVPlus7IE):
'upload_date': '20140128',
'description': 'md5:486eb08f991552ade77439fe6d82c305',
},
}
}]
class ArteTVCinemaIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:cinema'
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TEST = {
'url': 'http://cinema.arte.tv/de/node/38291',
'md5': '6b275511a5107c60bacbeeda368c3aa1',
_TESTS = [{
'url': 'http://cinema.arte.tv/fr/article/les-ailes-du-desir-de-julia-reck',
'md5': 'a5b9dd5575a11d93daf0e3f404f45438',
'info_dict': {
'id': '055876-000_PWA12025-D',
'id': '062494-000-A',
'ext': 'mp4',
'title': 'Tod auf dem Nil',
'upload_date': '20160122',
'description': 'md5:7f749bbb77d800ef2be11d54529b96bc',
'title': 'Film lauréat du concours web - "Les ailes du désir" de Julia Reck',
'upload_date': '20150807',
},
}
}]
class ArteTVMagazineIE(ArteTVPlus7IE):
@@ -337,16 +393,49 @@ class ArteTVEmbedIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:embed'
_VALID_URL = r'''(?x)
http://www\.arte\.tv
/playerv2/embed\.php\?json_url=
/(?:playerv2/embed|arte_vp/index)\.php\?json_url=
(?P<json_url>
http://arte\.tv/papi/tvguide/videos/stream/player/
(?P<lang>[^/]+)/(?P<id>[^/]+)[^&]*
)
'''
_TESTS = []
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
lang = mobj.group('lang')
json_url = mobj.group('json_url')
return self._extract_from_json_url(json_url, video_id, lang)
class ArteTVPlaylistIE(ArteTVBaseIE):
IE_NAME = 'arte.tv:playlist'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/[^#]*#collection/(?P<id>PL-\d+)'
_TESTS = [{
'url': 'http://www.arte.tv/guide/de/plus7/?country=DE#collection/PL-013263/ARTETV',
'info_dict': {
'id': 'PL-013263',
'title': 'Areva & Uramin',
'description': 'md5:a1dc0312ce357c262259139cfd48c9bf',
},
'playlist_mincount': 6,
}, {
'url': 'http://www.arte.tv/guide/de/playlists?country=DE#collection/PL-013190/ARTETV',
'only_matching': True,
}]
def _real_extract(self, url):
playlist_id, lang = self._extract_url_info(url)
collection = self._download_json(
'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos'
% (lang, playlist_id), playlist_id)
title = collection.get('title')
description = collection.get('shortDescription') or collection.get('teaserText')
entries = [
self._extract_from_json_url(
video['jsonUrl'], video.get('programId') or playlist_id, lang)
for video in collection['videos'] if video.get('jsonUrl')]
return self.playlist_result(entries, playlist_id, title, description)

View File

@@ -6,16 +6,14 @@ import hashlib
import re
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urllib_parse,
)
from ..compat import compat_str
from ..utils import (
int_or_none,
float_or_none,
sanitized_Request,
xpath_text,
ExtractorError,
float_or_none,
int_or_none,
sanitized_Request,
urlencode_postdata,
xpath_text,
)
@@ -86,7 +84,7 @@ class AtresPlayerIE(InfoExtractor):
}
request = sanitized_Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
response = self._download_webpage(
request, None, 'Logging in as %s' % username)

View File

@@ -6,6 +6,7 @@ import time
from .common import InfoExtractor
from .soundcloud import SoundcloudIE
from ..compat import compat_str
from ..utils import (
ExtractorError,
url_basename,
@@ -30,14 +31,14 @@ class AudiomackIE(InfoExtractor):
# audiomack wrapper around soundcloud song
{
'add_ie': ['Soundcloud'],
'url': 'http://www.audiomack.com/song/xclusiveszone/take-kare',
'url': 'http://www.audiomack.com/song/hip-hop-daily/black-mamba-freestyle',
'info_dict': {
'id': '172419696',
'id': '258901379',
'ext': 'mp3',
'description': 'md5:1fc3272ed7a635cce5be1568c2822997',
'title': 'Young Thug ft Lil Wayne - Take Kare',
'uploader': 'Young Thug World',
'upload_date': '20141016',
'description': 'mamba day freestyle for the legend Kobe Bryant ',
'title': 'Black Mamba Freestyle [Prod. By Danny Wolf]',
'uploader': 'ILOVEMAKONNEN',
'upload_date': '20160414',
}
},
]
@@ -136,7 +137,7 @@ class AudiomackAlbumIE(InfoExtractor):
result[resultkey] = api_response[apikey]
song_id = url_basename(api_response['url']).rpartition('.')[0]
result['entries'].append({
'id': api_response.get('id', song_id),
'id': compat_str(api_response.get('id', song_id)),
'uploader': api_response.get('artist'),
'title': api_response.get('title', song_id),
'url': api_response['url'],

View File

@@ -0,0 +1,184 @@
# coding: utf-8
from __future__ import unicode_literals
import re
import base64
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlencode,
compat_str,
)
from ..utils import (
int_or_none,
parse_iso8601,
smuggle_url,
unsmuggle_url,
urlencode_postdata,
)
class AWAANIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?show/(?P<show_id>\d+)/[^/]+(?:/(?P<video_id>\d+)/(?P<season_id>\d+))?'
def _real_extract(self, url):
show_id, video_id, season_id = re.match(self._VALID_URL, url).groups()
if video_id and int(video_id) > 0:
return self.url_result(
'http://awaan.ae/media/%s' % video_id, 'AWAANVideo')
elif season_id and int(season_id) > 0:
return self.url_result(smuggle_url(
'http://awaan.ae/program/season/%s' % season_id,
{'show_id': show_id}), 'AWAANSeason')
else:
return self.url_result(
'http://awaan.ae/program/%s' % show_id, 'AWAANSeason')
class AWAANBaseIE(InfoExtractor):
def _parse_video_data(self, video_data, video_id, is_live):
title = video_data.get('title_en') or video_data['title_ar']
img = video_data.get('img')
return {
'id': video_id,
'title': self._live_title(title) if is_live else title,
'description': video_data.get('description_en') or video_data.get('description_ar'),
'thumbnail': 'http://admin.mangomolo.com/analytics/%s' % img if img else None,
'duration': int_or_none(video_data.get('duration')),
'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
'is_live': is_live,
}
class AWAANVideoIE(AWAANBaseIE):
IE_NAME = 'awaan:video'
_VALID_URL = r'https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?(?:video(?:/[^/]+)?|media|catchup/[^/]+/[^/]+)/(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.dcndigital.ae/#/video/%D8%B1%D8%AD%D9%84%D8%A9-%D8%A7%D9%84%D8%B9%D9%85%D8%B1-%D8%A7%D9%84%D8%AD%D9%84%D9%82%D8%A9-1/17375',
'md5': '5f61c33bfc7794315c671a62d43116aa',
'info_dict':
{
'id': '17375',
'ext': 'mp4',
'title': 'رحلة العمر : الحلقة 1',
'description': 'md5:0156e935d870acb8ef0a66d24070c6d6',
'duration': 2041,
'timestamp': 1227504126,
'upload_date': '20081124',
},
}, {
'url': 'http://awaan.ae/video/26723981/%D8%AF%D8%A7%D8%B1-%D8%A7%D9%84%D8%B3%D9%84%D8%A7%D9%85:-%D8%AE%D9%8A%D8%B1-%D8%AF%D9%88%D8%B1-%D8%A7%D9%84%D8%A3%D9%86%D8%B5%D8%A7%D8%B1',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(
'http://admin.mangomolo.com/analytics/index.php/plus/video?id=%s' % video_id,
video_id, headers={'Origin': 'http://awaan.ae'})
info = self._parse_video_data(video_data, video_id, False)
embed_url = 'http://admin.mangomolo.com/analytics/index.php/customers/embed/video?' + compat_urllib_parse_urlencode({
'id': video_data['id'],
'user_id': video_data['user_id'],
'signature': video_data['signature'],
'countries': 'Q0M=',
'filter': 'DENY',
})
info.update({
'_type': 'url_transparent',
'url': embed_url,
'ie_key': 'MangomoloVideo',
})
return info
class AWAANLiveIE(AWAANBaseIE):
IE_NAME = 'awaan:live'
_VALID_URL = r'https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?live/(?P<id>\d+)'
_TEST = {
'url': 'http://awaan.ae/live/6/dubai-tv',
'info_dict': {
'id': '6',
'ext': 'mp4',
'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'upload_date': '20150107',
'timestamp': 1420588800,
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
channel_id = self._match_id(url)
channel_data = self._download_json(
'http://admin.mangomolo.com/analytics/index.php/plus/getchanneldetails?channel_id=%s' % channel_id,
channel_id, headers={'Origin': 'http://awaan.ae'})
info = self._parse_video_data(channel_data, channel_id, True)
embed_url = 'http://admin.mangomolo.com/analytics/index.php/customers/embed/index?' + compat_urllib_parse_urlencode({
'id': base64.b64encode(channel_data['user_id'].encode()).decode(),
'channelid': base64.b64encode(channel_data['id'].encode()).decode(),
'signature': channel_data['signature'],
'countries': 'Q0M=',
'filter': 'DENY',
})
info.update({
'_type': 'url_transparent',
'url': embed_url,
'ie_key': 'MangomoloLive',
})
return info
class AWAANSeasonIE(InfoExtractor):
IE_NAME = 'awaan:season'
_VALID_URL = r'https?://(?:www\.)?(?:awaan|dcndigital)\.ae/(?:#/)?program/(?:(?P<show_id>\d+)|season/(?P<season_id>\d+))'
_TEST = {
'url': 'http://dcndigital.ae/#/program/205024/%D9%85%D8%AD%D8%A7%D8%B6%D8%B1%D8%A7%D8%AA-%D8%A7%D9%84%D8%B4%D9%8A%D8%AE-%D8%A7%D9%84%D8%B4%D8%B9%D8%B1%D8%A7%D9%88%D9%8A',
'info_dict':
{
'id': '7910',
'title': 'محاضرات الشيخ الشعراوي',
},
'playlist_mincount': 27,
}
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
show_id, season_id = re.match(self._VALID_URL, url).groups()
data = {}
if season_id:
data['season'] = season_id
show_id = smuggled_data.get('show_id')
if show_id is None:
season = self._download_json(
'http://admin.mangomolo.com/analytics/index.php/plus/season_info?id=%s' % season_id,
season_id, headers={'Origin': 'http://awaan.ae'})
show_id = season['id']
data['show_id'] = show_id
show = self._download_json(
'http://admin.mangomolo.com/analytics/index.php/plus/show',
show_id, data=urlencode_postdata(data), headers={
'Origin': 'http://awaan.ae',
'Content-Type': 'application/x-www-form-urlencoded'
})
if not season_id:
season_id = show['default_season']
for season in show['seasons']:
if season['id'] == season_id:
title = season.get('title_en') or season['title_ar']
entries = []
for video in show['videos']:
video_id = compat_str(video['id'])
entries.append(self.url_result(
'http://awaan.ae/media/%s' % video_id, 'AWAANVideo', video_id))
return self.playlist_result(entries, season_id, title)

View File

@@ -46,6 +46,7 @@ class AzubuIE(InfoExtractor):
'uploader_id': 272749,
'view_count': int,
},
'skip': 'Channel offline',
},
]
@@ -56,22 +57,26 @@ class AzubuIE(InfoExtractor):
'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
title = data['title'].strip()
description = data['description']
thumbnail = data['thumbnail']
view_count = data['view_count']
uploader = data['user']['username']
uploader_id = data['user']['id']
description = data.get('description')
thumbnail = data.get('thumbnail')
view_count = data.get('view_count')
user = data.get('user', {})
uploader = user.get('username')
uploader_id = user.get('id')
stream_params = json.loads(data['stream_params'])
timestamp = float_or_none(stream_params['creationDate'], 1000)
duration = float_or_none(stream_params['length'], 1000)
timestamp = float_or_none(stream_params.get('creationDate'), 1000)
duration = float_or_none(stream_params.get('length'), 1000)
renditions = stream_params.get('renditions') or []
video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
if video:
renditions.append(video)
if not renditions and not user.get('channel', {}).get('is_live', True):
raise ExtractorError('%s said: channel is offline.' % self.IE_NAME, expected=True)
formats = [{
'url': fmt['url'],
'width': fmt['frameWidth'],
@@ -98,7 +103,7 @@ class AzubuIE(InfoExtractor):
class AzubuLiveIE(InfoExtractor):
_VALID_URL = r'http://www.azubu.tv/(?P<id>[^/]+)$'
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/(?P<id>[^/]+)$'
_TEST = {
'url': 'http://www.azubu.tv/MarsTVMDLen',
@@ -120,6 +125,7 @@ class AzubuLiveIE(InfoExtractor):
bc_info = self._download_json(req, user)
m3u8_url = next(source['src'] for source in bc_info['sources'] if source['container'] == 'M2TS')
formats = self._extract_m3u8_formats(m3u8_url, user, ext='mp4')
self._sort_formats(formats)
return {
'id': info['id'],

View File

@@ -9,7 +9,7 @@ from ..utils import unescapeHTML
class BaiduVideoIE(InfoExtractor):
IE_DESC = '百度视频'
_VALID_URL = r'http://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'
_VALID_URL = r'https?://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'
_TESTS = [{
'url': 'http://v.baidu.com/comic/1069.htm?frp=bdbrand&q=%E4%B8%AD%E5%8D%8E%E5%B0%8F%E5%BD%93%E5%AE%B6',
'info_dict': {

View File

@@ -4,15 +4,13 @@ import re
import itertools
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_str,
)
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
float_or_none,
int_or_none,
sanitized_Request,
urlencode_postdata,
)
@@ -58,7 +56,7 @@ class BambuserIE(InfoExtractor):
}
request = sanitized_Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in as %s' % username)

View File

@@ -29,7 +29,7 @@ class BandcampIE(InfoExtractor):
'_skip': 'There is a limit of 200 free downloads / month for the test song'
}, {
'url': 'http://benprunty.bandcamp.com/track/lanius-battle',
'md5': '2b68e5851514c20efdff2afc5603b8b4',
'md5': '73d0b3171568232574e45652f8720b5c',
'info_dict': {
'id': '2650410135',
'ext': 'mp3',
@@ -48,6 +48,10 @@ class BandcampIE(InfoExtractor):
if m_trackinfo:
json_code = m_trackinfo.group(1)
data = json.loads(json_code)[0]
track_id = compat_str(data['id'])
if not data.get('file'):
raise ExtractorError('Not streamable', video_id=track_id, expected=True)
formats = []
for format_id, format_url in data['file'].items():
@@ -64,7 +68,7 @@ class BandcampIE(InfoExtractor):
self._sort_formats(formats)
return {
'id': compat_str(data['id']),
'id': track_id,
'title': data['title'],
'formats': formats,
'duration': float_or_none(data.get('duration')),
@@ -158,6 +162,15 @@ class BandcampAlbumIE(InfoExtractor):
'uploader_id': 'dotscale',
},
'playlist_mincount': 7,
}, {
# with escaped quote in title
'url': 'https://jstrecords.bandcamp.com/album/entropy-ep',
'info_dict': {
'title': '"Entropy" EP',
'uploader_id': 'jstrecords',
'id': 'entropy-ep',
},
'playlist_mincount': 3,
}]
def _real_extract(self, url):
@@ -172,8 +185,11 @@ class BandcampAlbumIE(InfoExtractor):
entries = [
self.url_result(compat_urlparse.urljoin(url, t_path), ie=BandcampIE.ie_key())
for t_path in tracks_paths]
title = self._search_regex(
r'album_title\s*:\s*"(.*?)"', webpage, 'title', fatal=False)
title = self._html_search_regex(
r'album_title\s*:\s*"((?:\\.|[^"\\])+?)"',
webpage, 'title', fatal=False)
if title:
title = title.replace(r'\"', '"')
return {
'_type': 'playlist',
'uploader_id': uploader_id,

View File

@@ -2,19 +2,23 @@
from __future__ import unicode_literals
import re
import itertools
from .common import InfoExtractor
from ..utils import (
dict_get,
ExtractorError,
float_or_none,
int_or_none,
parse_duration,
parse_iso8601,
try_get,
unescapeHTML,
)
from ..compat import (
compat_etree_fromstring,
compat_HTTPError,
compat_urlparse,
)
@@ -31,7 +35,7 @@ class BBCCoUkIE(InfoExtractor):
music/clips[/#]|
radio/player/
)
(?P<id>%s)
(?P<id>%s)(?!/(?:episodes|broadcasts|clips))
''' % _ID_REGEX
_MEDIASELECTOR_URLS = [
@@ -192,6 +196,7 @@ class BBCCoUkIE(InfoExtractor):
# rtmp download
'skip_download': True,
},
'skip': 'Now it\'s really geo-restricted',
}, {
# compact player (https://github.com/rg3/youtube-dl/issues/8147)
'url': 'http://www.bbc.co.uk/programmes/p028bfkf/player',
@@ -228,51 +233,6 @@ class BBCCoUkIE(InfoExtractor):
asx = self._download_xml(connection.get('href'), programme_id, 'Downloading ASX playlist')
return [ref.get('href') for ref in asx.findall('./Entry/ref')]
def _extract_connection(self, connection, programme_id):
formats = []
kind = connection.get('kind')
protocol = connection.get('protocol')
supplier = connection.get('supplier')
if protocol == 'http':
href = connection.get('href')
transfer_format = connection.get('transferFormat')
# ASX playlist
if supplier == 'asx':
for i, ref in enumerate(self._extract_asx_playlist(connection, programme_id)):
formats.append({
'url': ref,
'format_id': 'ref%s_%s' % (i, supplier),
})
# Skip DASH until supported
elif transfer_format == 'dash':
pass
elif transfer_format == 'hls':
formats.extend(self._extract_m3u8_formats(
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=supplier, fatal=False))
# Direct link
else:
formats.append({
'url': href,
'format_id': supplier or kind or protocol,
})
elif protocol == 'rtmp':
application = connection.get('application', 'ondemand')
auth_string = connection.get('authString')
identifier = connection.get('identifier')
server = connection.get('server')
formats.append({
'url': '%s://%s/%s?%s' % (protocol, server, application, auth_string),
'play_path': identifier,
'app': '%s?%s' % (application, auth_string),
'page_url': 'http://www.bbc.co.uk',
'player_url': 'http://www.bbc.co.uk/emp/releases/iplayer/revisions/617463_618125_4/617463_618125_4_emp.swf',
'rtmp_live': False,
'ext': 'flv',
'format_id': supplier,
})
return formats
def _extract_items(self, playlist):
return playlist.findall('./{%s}item' % self._EMP_PLAYLIST_NS)
@@ -293,45 +253,6 @@ class BBCCoUkIE(InfoExtractor):
def _extract_connections(self, media):
return self._findall_ns(media, './{%s}connection')
def _extract_video(self, media, programme_id):
formats = []
vbr = int_or_none(media.get('bitrate'))
vcodec = media.get('encoding')
service = media.get('service')
width = int_or_none(media.get('width'))
height = int_or_none(media.get('height'))
file_size = int_or_none(media.get('media_file_size'))
for connection in self._extract_connections(media):
conn_formats = self._extract_connection(connection, programme_id)
for format in conn_formats:
format.update({
'width': width,
'height': height,
'vbr': vbr,
'vcodec': vcodec,
'filesize': file_size,
})
if service:
format['format_id'] = '%s_%s' % (service, format['format_id'])
formats.extend(conn_formats)
return formats
def _extract_audio(self, media, programme_id):
formats = []
abr = int_or_none(media.get('bitrate'))
acodec = media.get('encoding')
service = media.get('service')
for connection in self._extract_connections(media):
conn_formats = self._extract_connection(connection, programme_id)
for format in conn_formats:
format.update({
'format_id': '%s_%s' % (service, format['format_id']),
'abr': abr,
'acodec': acodec,
})
formats.extend(conn_formats)
return formats
def _get_subtitles(self, media, programme_id):
subtitles = {}
for connection in self._extract_connections(media):
@@ -377,13 +298,87 @@ class BBCCoUkIE(InfoExtractor):
def _process_media_selector(self, media_selection, programme_id):
formats = []
subtitles = None
urls = []
for media in self._extract_medias(media_selection):
kind = media.get('kind')
if kind == 'audio':
formats.extend(self._extract_audio(media, programme_id))
elif kind == 'video':
formats.extend(self._extract_video(media, programme_id))
if kind in ('video', 'audio'):
bitrate = int_or_none(media.get('bitrate'))
encoding = media.get('encoding')
service = media.get('service')
width = int_or_none(media.get('width'))
height = int_or_none(media.get('height'))
file_size = int_or_none(media.get('media_file_size'))
for connection in self._extract_connections(media):
href = connection.get('href')
if href in urls:
continue
if href:
urls.append(href)
conn_kind = connection.get('kind')
protocol = connection.get('protocol')
supplier = connection.get('supplier')
transfer_format = connection.get('transferFormat')
format_id = supplier or conn_kind or protocol
if service:
format_id = '%s_%s' % (service, format_id)
# ASX playlist
if supplier == 'asx':
for i, ref in enumerate(self._extract_asx_playlist(connection, programme_id)):
formats.append({
'url': ref,
'format_id': 'ref%s_%s' % (i, format_id),
})
elif transfer_format == 'dash':
formats.extend(self._extract_mpd_formats(
href, programme_id, mpd_id=format_id, fatal=False))
elif transfer_format == 'hls':
formats.extend(self._extract_m3u8_formats(
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=format_id, fatal=False))
elif transfer_format == 'hds':
formats.extend(self._extract_f4m_formats(
href, programme_id, f4m_id=format_id, fatal=False))
else:
if not service and not supplier and bitrate:
format_id += '-%d' % bitrate
fmt = {
'format_id': format_id,
'filesize': file_size,
}
if kind == 'video':
fmt.update({
'width': width,
'height': height,
'vbr': bitrate,
'vcodec': encoding,
})
else:
fmt.update({
'abr': bitrate,
'acodec': encoding,
'vcodec': 'none',
})
if protocol == 'http':
# Direct link
fmt.update({
'url': href,
})
elif protocol == 'rtmp':
application = connection.get('application', 'ondemand')
auth_string = connection.get('authString')
identifier = connection.get('identifier')
server = connection.get('server')
fmt.update({
'url': '%s://%s/%s?%s' % (protocol, server, application, auth_string),
'play_path': identifier,
'app': '%s?%s' % (application, auth_string),
'page_url': 'http://www.bbc.co.uk',
'player_url': 'http://www.bbc.co.uk/emp/releases/iplayer/revisions/617463_618125_4/617463_618125_4_emp.swf',
'rtmp_live': False,
'ext': 'flv',
})
formats.append(fmt)
elif kind == 'captions':
subtitles = self.extract_subtitles(media, programme_id)
return formats, subtitles
@@ -588,6 +583,7 @@ class BBCIE(BBCCoUkIE):
'id': '150615_telabyad_kentin_cogu',
'ext': 'mp4',
'title': "YPG: Tel Abyad'ın tamamı kontrolümüzde",
'description': 'md5:33a4805a855c9baf7115fcbde57e7025',
'timestamp': 1434397334,
'upload_date': '20150615',
},
@@ -601,6 +597,7 @@ class BBCIE(BBCCoUkIE):
'id': '150619_video_honduras_militares_hospitales_corrupcion_aw',
'ext': 'mp4',
'title': 'Honduras militariza sus hospitales por nuevo escándalo de corrupción',
'description': 'md5:1525f17448c4ee262b64b8f0c9ce66c8',
'timestamp': 1434713142,
'upload_date': '20150619',
},
@@ -650,6 +647,23 @@ class BBCIE(BBCCoUkIE):
# rtmp download
'skip_download': True,
}
}, {
# single video embedded with Morph
'url': 'http://www.bbc.co.uk/sport/live/olympics/36895975',
'info_dict': {
'id': 'p041vhd0',
'ext': 'mp4',
'title': "Nigeria v Japan - Men's First Round",
'description': 'Live coverage of the first round from Group B at the Amazonia Arena.',
'duration': 7980,
'uploader': 'BBC Sport',
'uploader_id': 'bbc_sport',
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'Georestricted to UK',
}, {
# single video with playlist.sxml URL in playlist param
'url': 'http://www.bbc.com/sport/0/football/33653409',
@@ -670,6 +684,7 @@ class BBCIE(BBCCoUkIE):
'info_dict': {
'id': '34475836',
'title': 'Jurgen Klopp: Furious football from a witty and winning coach',
'description': 'Fast-paced football, wit, wisdom and a ready smile - why Liverpool fans should come to love new boss Jurgen Klopp.',
},
'playlist_count': 3,
}, {
@@ -688,11 +703,17 @@ class BBCIE(BBCCoUkIE):
# custom redirection to www.bbc.com
'url': 'http://www.bbc.co.uk/news/science-environment-33661876',
'only_matching': True,
}, {
# single video article embedded with data-media-vpid
'url': 'http://www.bbc.co.uk/sport/rowing/35908187',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if BBCCoUkIE.suitable(url) or BBCCoUkArticleIE.suitable(url) else super(BBCIE, cls).suitable(url)
EXCLUDE_IE = (BBCCoUkIE, BBCCoUkArticleIE, BBCCoUkIPlayerPlaylistIE, BBCCoUkPlaylistIE)
return (False if any(ie.suitable(url) for ie in EXCLUDE_IE)
else super(BBCIE, cls).suitable(url))
def _extract_from_media_meta(self, media_meta, video_id):
# Direct links to media in media metadata (e.g.
@@ -740,7 +761,7 @@ class BBCIE(BBCCoUkIE):
webpage = self._download_webpage(url, playlist_id)
json_ld_info = self._search_json_ld(webpage, playlist_id, default=None)
json_ld_info = self._search_json_ld(webpage, playlist_id, default={})
timestamp = json_ld_info.get('timestamp')
playlist_title = json_ld_info.get('title')
@@ -809,15 +830,36 @@ class BBCIE(BBCCoUkIE):
# http://www.bbc.com/turkce/multimedya/2015/10/151010_vid_ankara_patlama_ani)
playlist = data_playable.get('otherSettings', {}).get('playlist', {})
if playlist:
entries.append(self._extract_from_playlist_sxml(
playlist.get('progressiveDownloadUrl'), playlist_id, timestamp))
entry = None
for key in ('streaming', 'progressiveDownload'):
playlist_url = playlist.get('%sUrl' % key)
if not playlist_url:
continue
try:
info = self._extract_from_playlist_sxml(
playlist_url, playlist_id, timestamp)
if not entry:
entry = info
else:
entry['title'] = info['title']
entry['formats'].extend(info['formats'])
except Exception as e:
# Some playlist URL may fail with 500, at the same time
# the other one may work fine (e.g.
# http://www.bbc.com/turkce/haberler/2015/06/150615_telabyad_kentin_cogu)
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 500:
continue
raise
if entry:
self._sort_formats(entry['formats'])
entries.append(entry)
if entries:
return self.playlist_result(entries, playlist_id, playlist_title, playlist_description)
# single video story (e.g. http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
programme_id = self._search_regex(
[r'data-video-player-vpid="(%s)"' % self._ID_REGEX,
[r'data-(?:video-player|media)-vpid="(%s)"' % self._ID_REGEX,
r'<param[^>]+name="externalIdentifier"[^>]+value="(%s)"' % self._ID_REGEX,
r'videoId\s*:\s*["\'](%s)["\']' % self._ID_REGEX],
webpage, 'vpid', default=None)
@@ -843,6 +885,50 @@ class BBCIE(BBCCoUkIE):
'subtitles': subtitles,
}
# Morph based embed (e.g. http://www.bbc.co.uk/sport/live/olympics/36895975)
# There are several setPayload calls may be present but the video
# seems to be always related to the first one
morph_payload = self._parse_json(
self._search_regex(
r'Morph\.setPayload\([^,]+,\s*({.+?})\);',
webpage, 'morph payload', default='{}'),
playlist_id, fatal=False)
if morph_payload:
components = try_get(morph_payload, lambda x: x['body']['components'], list) or []
for component in components:
if not isinstance(component, dict):
continue
lead_media = try_get(component, lambda x: x['props']['leadMedia'], dict)
if not lead_media:
continue
identifiers = lead_media.get('identifiers')
if not identifiers or not isinstance(identifiers, dict):
continue
programme_id = identifiers.get('vpid') or identifiers.get('playablePid')
if not programme_id:
continue
title = lead_media.get('title') or self._og_search_title(webpage)
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
description = lead_media.get('summary')
uploader = lead_media.get('masterBrand')
uploader_id = lead_media.get('mid')
duration = None
duration_d = lead_media.get('duration')
if isinstance(duration_d, dict):
duration = parse_duration(dict_get(
duration_d, ('rawDuration', 'formattedDuration', 'spokenDuration')))
return {
'id': programme_id,
'title': title,
'description': description,
'duration': duration,
'uploader': uploader,
'uploader_id': uploader_id,
'formats': formats,
'subtitles': subtitles,
}
def extract_all(pattern):
return list(filter(None, map(
lambda s: self._parse_json(s, playlist_id, fatal=False),
@@ -860,7 +946,7 @@ class BBCIE(BBCCoUkIE):
r'setPlaylist\("(%s)"\)' % EMBED_URL, webpage))
if entries:
return self.playlist_result(
[self.url_result(entry, 'BBCCoUk') for entry in entries],
[self.url_result(entry_, 'BBCCoUk') for entry_ in entries],
playlist_id, playlist_title, playlist_description)
# Multiple video article (e.g. http://www.bbc.com/news/world-europe-32668511)
@@ -942,7 +1028,7 @@ class BBCIE(BBCCoUkIE):
class BBCCoUkArticleIE(InfoExtractor):
_VALID_URL = 'http://www.bbc.co.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
IE_NAME = 'bbc.co.uk:article'
IE_DESC = 'BBC articles'
@@ -969,3 +1055,116 @@ class BBCCoUkArticleIE(InfoExtractor):
r'<div[^>]+typeof="Clip"[^>]+resource="([^"]+)"', webpage)]
return self.playlist_result(entries, playlist_id, title, description)
class BBCCoUkPlaylistBaseIE(InfoExtractor):
def _entries(self, webpage, url, playlist_id):
single_page = 'page' in compat_urlparse.parse_qs(
compat_urlparse.urlparse(url).query)
for page_num in itertools.count(2):
for video_id in re.findall(
self._VIDEO_ID_TEMPLATE % BBCCoUkIE._ID_REGEX, webpage):
yield self.url_result(
self._URL_TEMPLATE % video_id, BBCCoUkIE.ie_key())
if single_page:
return
next_page = self._search_regex(
r'<li[^>]+class=(["\'])pagination_+next\1[^>]*><a[^>]+href=(["\'])(?P<url>(?:(?!\2).)+)\2',
webpage, 'next page url', default=None, group='url')
if not next_page:
break
webpage = self._download_webpage(
compat_urlparse.urljoin(url, next_page), playlist_id,
'Downloading page %d' % page_num, page_num)
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
title, description = self._extract_title_and_description(webpage)
return self.playlist_result(
self._entries(webpage, url, playlist_id),
playlist_id, title, description)
class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
IE_NAME = 'bbc.co.uk:iplayer:playlist'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/(?:episodes|group)/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
_URL_TEMPLATE = 'http://www.bbc.co.uk/iplayer/episode/%s'
_VIDEO_ID_TEMPLATE = r'data-ip-id=["\'](%s)'
_TESTS = [{
'url': 'http://www.bbc.co.uk/iplayer/episodes/b05rcz9v',
'info_dict': {
'id': 'b05rcz9v',
'title': 'The Disappearance',
'description': 'French thriller serial about a missing teenager.',
},
'playlist_mincount': 6,
'skip': 'This programme is not currently available on BBC iPlayer',
}, {
# Available for over a year unlike 30 days for most other programmes
'url': 'http://www.bbc.co.uk/iplayer/group/p02tcc32',
'info_dict': {
'id': 'p02tcc32',
'title': 'Bohemian Icons',
'description': 'md5:683e901041b2fe9ba596f2ab04c4dbe7',
},
'playlist_mincount': 10,
}]
def _extract_title_and_description(self, webpage):
title = self._search_regex(r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
description = self._search_regex(
r'<p[^>]+class=(["\'])subtitle\1[^>]*>(?P<value>[^<]+)</p>',
webpage, 'description', fatal=False, group='value')
return title, description
class BBCCoUkPlaylistIE(BBCCoUkPlaylistBaseIE):
IE_NAME = 'bbc.co.uk:playlist'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/programmes/(?P<id>%s)/(?:episodes|broadcasts|clips)' % BBCCoUkIE._ID_REGEX
_URL_TEMPLATE = 'http://www.bbc.co.uk/programmes/%s'
_VIDEO_ID_TEMPLATE = r'data-pid=["\'](%s)'
_TESTS = [{
'url': 'http://www.bbc.co.uk/programmes/b05rcz9v/clips',
'info_dict': {
'id': 'b05rcz9v',
'title': 'The Disappearance - Clips - BBC Four',
'description': 'French thriller serial about a missing teenager.',
},
'playlist_mincount': 7,
}, {
# multipage playlist, explicit page
'url': 'http://www.bbc.co.uk/programmes/b00mfl7n/clips?page=1',
'info_dict': {
'id': 'b00mfl7n',
'title': 'Frozen Planet - Clips - BBC One',
'description': 'md5:65dcbf591ae628dafe32aa6c4a4a0d8c',
},
'playlist_mincount': 24,
}, {
# multipage playlist, all pages
'url': 'http://www.bbc.co.uk/programmes/b00mfl7n/clips',
'info_dict': {
'id': 'b00mfl7n',
'title': 'Frozen Planet - Clips - BBC One',
'description': 'md5:65dcbf591ae628dafe32aa6c4a4a0d8c',
},
'playlist_mincount': 142,
}, {
'url': 'http://www.bbc.co.uk/programmes/b05rcz9v/broadcasts/2016/06',
'only_matching': True,
}, {
'url': 'http://www.bbc.co.uk/programmes/b05rcz9v/clips',
'only_matching': True,
}, {
'url': 'http://www.bbc.co.uk/programmes/b055jkys/episodes/player',
'only_matching': True,
}]
def _extract_title_and_description(self, webpage):
title = self._og_search_title(webpage, fatal=False)
description = self._og_search_description(webpage)
return title, description

View File

@@ -33,8 +33,33 @@ class BeegIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
cpl_url = self._search_regex(
r'<script[^>]+src=(["\'])(?P<url>(?:https?:)?//static\.beeg\.com/cpl/\d+\.js.*?)\1',
webpage, 'cpl', default=None, group='url')
beeg_version, beeg_salt = [None] * 2
if cpl_url:
cpl = self._download_webpage(
self._proto_relative_url(cpl_url), video_id,
'Downloading cpl JS', fatal=False)
if cpl:
beeg_version = self._search_regex(
r'beeg_version\s*=\s*(\d+)', cpl,
'beeg version', default=None) or self._search_regex(
r'/(\d+)\.js', cpl_url, 'beeg version', default=None)
beeg_salt = self._search_regex(
r'beeg_salt\s*=\s*(["\'])(?P<beeg_salt>.+?)\1', cpl, 'beeg beeg_salt',
default=None, group='beeg_salt')
beeg_version = beeg_version or '1750'
beeg_salt = beeg_salt or 'MIDtGaw96f0N1kMMAM1DE46EC9pmFr'
video = self._download_json(
'https://api.beeg.com/api/v5/video/%s' % video_id, video_id)
'http://api.beeg.com/api/v6/%s/video/%s' % (beeg_version, video_id),
video_id)
def split(o, e):
def cut(s, x):
@@ -50,8 +75,8 @@ class BeegIE(InfoExtractor):
return n
def decrypt_key(key):
# Reverse engineered from http://static.beeg.com/cpl/1105.js
a = '5ShMcIQlssOd7zChAIOlmeTZDaUxULbJRnywYaiB'
# Reverse engineered from http://static.beeg.com/cpl/1738.js
a = beeg_salt
e = compat_urllib_parse_unquote(key)
o = ''.join([
compat_chr(compat_ord(e[n]) - compat_ord(a[n % len(a)]) % 21)
@@ -101,5 +126,5 @@ class BeegIE(InfoExtractor):
'duration': duration,
'tags': tags,
'formats': formats,
'age_limit': 18,
'age_limit': self._rta_search(webpage),
}

View File

@@ -8,7 +8,7 @@ from ..utils import url_basename
class BehindKinkIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
_VALID_URL = r'https?://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
_TEST = {
'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',
'md5': '507b57d8fdcd75a41a9a7bdb7989c762',

View File

@@ -0,0 +1,75 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class BellMediaIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?
(?P<domain>
(?:
ctv|
tsn|
bnn|
thecomedynetwork|
discovery|
discoveryvelocity|
sciencechannel|
investigationdiscovery|
animalplanet|
bravo|
mtv|
space
)\.ca|
much\.com
)/.*?(?:\bvid=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6})'''
_TESTS = [{
'url': 'http://www.ctv.ca/video/player?vid=706966',
'md5': 'ff2ebbeae0aa2dcc32a830c3fd69b7b0',
'info_dict': {
'id': '706966',
'ext': 'mp4',
'title': 'Larry Day and Richard Jutras on the TIFF red carpet of \'Stonewall\'',
'description': 'etalk catches up with Larry Day and Richard Jutras on the TIFF red carpet of "Stonewall”.',
'upload_date': '20150919',
'timestamp': 1442624700,
},
'expected_warnings': ['HTTP Error 404'],
}, {
'url': 'http://www.thecomedynetwork.ca/video/player?vid=923582',
'only_matching': True,
}, {
'url': 'http://www.tsn.ca/video/expectations-high-for-milos-raonic-at-us-open~939549',
'only_matching': True,
}, {
'url': 'http://www.bnn.ca/video/berman-s-call-part-two-viewer-questions~939654',
'only_matching': True,
}, {
'url': 'http://www.ctv.ca/YourMorning/Video/S1E6-Monday-August-29-2016-vid938009',
'only_matching': True,
}, {
'url': 'http://www.much.com/shows/atmidnight/episode948007/tuesday-september-13-2016',
'only_matching': True,
}, {
'url': 'http://www.much.com/shows/the-almost-impossible-gameshow/928979/episode-6',
'only_matching': True,
}]
_DOMAINS = {
'thecomedynetwork': 'comedy',
'discoveryvelocity': 'discvel',
'sciencechannel': 'discsci',
'investigationdiscovery': 'invdisc',
'animalplanet': 'aniplan',
}
def _real_extract(self, url):
domain, video_id = re.match(self._VALID_URL, url).groups()
domain = domain.split('.')[0]
return {
'_type': 'url_transparent',
'id': video_id,
'url': '9c9media:%s_web:%s' % (self._DOMAINS.get(domain, domain), video_id),
'ie_key': 'NineCNineMedia',
}

View File

@@ -1,31 +1,26 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
from ..utils import (
xpath_text,
xpath_with_ns,
int_or_none,
parse_iso8601,
)
from .mtv import MTVServicesInfoExtractor
from ..utils import unified_strdate
class BetIE(InfoExtractor):
class BetIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
_TESTS = [
{
'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
'info_dict': {
'id': 'news/national/2014/a-conversation-with-president-obama',
'id': '07e96bd3-8850-3051-b856-271b457f0ab8',
'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
'ext': 'flv',
'title': 'A Conversation With President Obama',
'description': 'md5:699d0652a350cf3e491cd15cc745b5da',
'description': 'President Obama urges persistence in confronting racism and bias.',
'duration': 1534,
'timestamp': 1418075340,
'upload_date': '20141208',
'uploader': 'admin',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
},
'params': {
# rtmp download
@@ -35,16 +30,17 @@ class BetIE(InfoExtractor):
{
'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
'info_dict': {
'id': 'news/national/2014/justice-for-ferguson-a-community-reacts',
'id': '9f516bf1-7543-39c4-8076-dd441b459ba9',
'display_id': 'justice-for-ferguson-a-community-reacts',
'ext': 'flv',
'title': 'Justice for Ferguson: A Community Reacts',
'description': 'A BET News special.',
'duration': 1696,
'timestamp': 1416942360,
'upload_date': '20141125',
'uploader': 'admin',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
},
'params': {
# rtmp download
@@ -53,56 +49,32 @@ class BetIE(InfoExtractor):
}
]
_FEED_URL = "http://feeds.mtvnservices.com/od/feed/bet-mrss-player"
def _get_feed_query(self, uri):
return {
'uuid': uri,
}
def _extract_mgid(self, webpage):
return self._search_regex(r'data-uri="([^"]+)', webpage, 'mgid')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
mgid = self._extract_mgid(webpage)
videos_info = self._get_videos_info(mgid)
media_url = compat_urllib_parse_unquote(self._search_regex(
[r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
webpage, 'media URL'))
info_dict = videos_info['entries'][0]
video_id = self._search_regex(
r'/video/(.*)/_jcr_content/', media_url, 'video id')
upload_date = unified_strdate(self._html_search_meta('date', webpage))
description = self._html_search_meta('description', webpage)
mrss = self._download_xml(media_url, display_id)
item = mrss.find('./channel/item')
NS_MAP = {
'dc': 'http://purl.org/dc/elements/1.1/',
'media': 'http://search.yahoo.com/mrss/',
'ka': 'http://kickapps.com/karss',
}
title = xpath_text(item, './title', 'title')
description = xpath_text(
item, './description', 'description', fatal=False)
timestamp = parse_iso8601(xpath_text(
item, xpath_with_ns('./dc:date', NS_MAP),
'upload date', fatal=False))
uploader = xpath_text(
item, xpath_with_ns('./dc:creator', NS_MAP),
'uploader', fatal=False)
media_content = item.find(
xpath_with_ns('./media:content', NS_MAP))
duration = int_or_none(media_content.get('duration'))
smil_url = media_content.get('url')
thumbnail = media_content.find(
xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
formats = self._extract_smil_formats(smil_url, display_id)
return {
'id': video_id,
info_dict.update({
'display_id': display_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'uploader': uploader,
'duration': duration,
'formats': formats,
}
'upload_date': upload_date,
})
return info_dict

View File

@@ -11,22 +11,13 @@ from ..compat import compat_urllib_parse_unquote
class BigflixIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bigflix\.com/.+/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.bigflix.com/Hindi-movies/Action-movies/Singham-Returns/16537',
'md5': 'ec76aa9b1129e2e5b301a474e54fab74',
'info_dict': {
'id': '16537',
'ext': 'mp4',
'title': 'Singham Returns',
'description': 'md5:3d2ba5815f14911d5cc6a501ae0cf65d',
}
}, {
# 2 formats
'url': 'http://www.bigflix.com/Tamil-movies/Drama-movies/Madarasapatinam/16070',
'info_dict': {
'id': '16070',
'ext': 'mp4',
'title': 'Madarasapatinam',
'description': 'md5:63b9b8ed79189c6f0418c26d9a3452ca',
'description': 'md5:9f0470b26a4ba8e824c823b5d95c2f6b',
'formats': 'mincount:2',
},
'params': {

View File

@@ -1,110 +1,125 @@
# coding: utf-8
from __future__ import unicode_literals
import hashlib
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..compat import compat_parse_qs
from ..utils import (
int_or_none,
unescapeHTML,
ExtractorError,
xpath_text,
float_or_none,
unified_timestamp,
urlencode_postdata,
)
class BiliBiliIE(InfoExtractor):
_VALID_URL = r'http://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)(?:/index_(?P<page_num>\d+).html)?'
_VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/v/)(?P<id>\d+)'
_TESTS = [{
_TEST = {
'url': 'http://www.bilibili.tv/video/av1074402/',
'md5': '2c301e4dab317596e837c3e7633e7d86',
'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
'info_dict': {
'id': '1554319',
'ext': 'flv',
'id': '1074402',
'ext': 'mp4',
'title': '【金坷垃】金泡沫',
'duration': 308313,
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
'duration': 308.315,
'timestamp': 1398012660,
'upload_date': '20140420',
'thumbnail': 're:^https?://.+\.jpg',
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
'timestamp': 1397983878,
'uploader': '菊子桑',
'uploader_id': '156160',
},
}, {
'url': 'http://www.bilibili.com/video/av1041170/',
'info_dict': {
'id': '1041170',
'title': '【BD1080P】刀语【诸神&异域】',
'description': '这是个神奇的故事~每个人不留弹幕不给走哦~切利哦!~',
'uploader': '枫叶逝去',
'timestamp': 1396501299,
},
'playlist_count': 9,
}]
}
_APP_KEY = '6f90a59ac58a4123'
_BILIBILI_KEY = '0bfd84cc3940035173f35e6777508326'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
page_num = mobj.group('page_num') or '1'
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
view_data = self._download_json(
'http://api.bilibili.com/view?type=json&appkey=8e9fc618fbd41e28&id=%s&page=%s' % (video_id, page_num),
video_id)
if 'error' in view_data:
raise ExtractorError('%s said: %s' % (self.IE_NAME, view_data['error']), expected=True)
if 'anime/v' not in url:
cid = compat_parse_qs(self._search_regex(
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
webpage, 'player parameters'))['cid'][0]
else:
js = self._download_json(
'http://bangumi.bilibili.com/web_api/get_source', video_id,
data=urlencode_postdata({'episode_id': video_id}),
headers={'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8'})
cid = js['result']['cid']
cid = view_data['cid']
title = unescapeHTML(view_data['title'])
payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest()
doc = self._download_xml(
'http://interface.bilibili.com/v_cdn_play?appkey=8e9fc618fbd41e28&cid=%s' % cid,
cid,
'Downloading page %s/%s' % (page_num, view_data['pages'])
)
if xpath_text(doc, './result') == 'error':
raise ExtractorError('%s said: %s' % (self.IE_NAME, xpath_text(doc, './message')), expected=True)
video_info = self._download_json(
'http://interface.bilibili.com/playurl?%s&sign=%s' % (payload, sign),
video_id, note='Downloading video info page')
entries = []
for durl in doc.findall('./durl'):
size = xpath_text(durl, ['./filesize', './size'])
for idx, durl in enumerate(video_info['durl']):
formats = [{
'url': durl.find('./url').text,
'filesize': int_or_none(size),
'ext': 'flv',
'url': durl['url'],
'filesize': int_or_none(durl['size']),
}]
backup_urls = durl.find('./backup_url')
if backup_urls is not None:
for backup_url in backup_urls.findall('./url'):
formats.append({'url': backup_url.text})
formats.reverse()
for backup_url in durl.get('backup_url', []):
formats.append({
'url': backup_url,
# backup URLs have lower priorities
'preference': -2 if 'hd.mp4' in backup_url else -3,
})
self._sort_formats(formats)
entries.append({
'id': '%s_part%s' % (cid, xpath_text(durl, './order')),
'title': title,
'duration': int_or_none(xpath_text(durl, './length'), 1000),
'id': '%s_part%s' % (video_id, idx),
'duration': float_or_none(durl.get('length'), 1000),
'formats': formats,
})
title = self._html_search_regex('<h1[^>]+title="([^"]+)">', webpage, 'title')
description = self._html_search_meta('description', webpage)
timestamp = unified_timestamp(self._html_search_regex(
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', fatal=False))
thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)
# TODO 'view_count' requires deobfuscating Javascript
info = {
'id': compat_str(cid),
'id': video_id,
'title': title,
'description': view_data.get('description'),
'thumbnail': view_data.get('pic'),
'uploader': view_data.get('author'),
'timestamp': int_or_none(view_data.get('created')),
'view_count': int_or_none(view_data.get('play')),
'duration': int_or_none(xpath_text(doc, './timelength')),
'description': description,
'timestamp': timestamp,
'thumbnail': thumbnail,
'duration': float_or_none(video_info.get('timelength'), scale=1000),
}
uploader_mobj = re.search(
r'<a[^>]+href="https?://space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
webpage)
if uploader_mobj:
info.update({
'uploader': uploader_mobj.group('name'),
'uploader_id': uploader_mobj.group('id'),
})
for entry in entries:
entry.update(info)
if len(entries) == 1:
entries[0].update(info)
return entries[0]
else:
info.update({
for idx, entry in enumerate(entries):
entry['id'] = '%s_part%d' % (video_id, (idx + 1))
return {
'_type': 'multi_video',
'id': video_id,
'title': title,
'description': description,
'entries': entries,
})
return info
}

View File

@@ -0,0 +1,81 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
ExtractorError,
remove_end,
)
from .rudo import RudoIE
class BioBioChileTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:tv|www)\.biobiochile\.cl/(?:notas|noticias)/(?:[^/]+/)+(?P<id>[^/]+)\.shtml'
_TESTS = [{
'url': 'http://tv.biobiochile.cl/notas/2015/10/21/sobre-camaras-y-camarillas-parlamentarias.shtml',
'md5': '26f51f03cf580265defefb4518faec09',
'info_dict': {
'id': 'sobre-camaras-y-camarillas-parlamentarias',
'ext': 'mp4',
'title': 'Sobre Cámaras y camarillas parlamentarias',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'Fernando Atria',
},
'skip': 'URL expired and redirected to http://www.biobiochile.cl/portada/bbtv/index.html',
}, {
# different uploader layout
'url': 'http://tv.biobiochile.cl/notas/2016/03/18/natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades.shtml',
'md5': 'edc2e6b58974c46d5b047dea3c539ff3',
'info_dict': {
'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
'ext': 'mp4',
'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'Piangella Obrador',
},
'params': {
'skip_download': True,
},
'skip': 'URL expired and redirected to http://www.biobiochile.cl/portada/bbtv/index.html',
}, {
'url': 'http://www.biobiochile.cl/noticias/bbtv/comentarios-bio-bio/2016/07/08/edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos.shtml',
'info_dict': {
'id': 'edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos',
'ext': 'mp4',
'uploader': '(none)',
'upload_date': '20160708',
'title': 'Edecanes del Congreso: Figuras decorativas que le cuestan muy caro a los chilenos',
},
}, {
'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml',
'only_matching': True,
}, {
'url': 'http://tv.biobiochile.cl/notas/2015/10/21/exclusivo-hector-pinto-formador-de-chupete-revela-version-del-ex-delantero-albo.shtml',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
rudo_url = RudoIE._extract_url(webpage)
if not rudo_url:
raise ExtractorError('No videos found')
title = remove_end(self._og_search_title(webpage), ' - BioBioChile TV')
thumbnail = self._og_search_thumbnail(webpage)
uploader = self._html_search_regex(
r'<a[^>]+href=["\']https?://(?:busca|www)\.biobiochile\.cl/(?:lista/)?(?:author|autor)[^>]+>(.+?)</a>',
webpage, 'uploader', fatal=False)
return {
'_type': 'url_transparent',
'url': rudo_url,
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'uploader': uploader,
}

View File

@@ -0,0 +1,40 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class BIQLEIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)'
_TESTS = [{
'url': 'http://www.biqle.ru/watch/847655_160197695',
'md5': 'ad5f746a874ccded7b8f211aeea96637',
'info_dict': {
'id': '160197695',
'ext': 'mp4',
'title': 'Foo Fighters - The Pretender (Live at Wembley Stadium)',
'uploader': 'Andrey Rogozin',
'upload_date': '20110605',
}
}, {
'url': 'https://biqle.org/watch/-44781847_168547604',
'md5': '7f24e72af1db0edf7c1aaba513174f97',
'info_dict': {
'id': '168547604',
'ext': 'mp4',
'title': 'Ребенок в шоке от автоматической мойки',
'uploader': 'Dmitry Kotov',
},
'skip': ' This video was marked as adult. Embedding adult videos on external sites is prohibited.',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
embed_url = self._proto_relative_url(self._search_regex(
r'<iframe.+?src="((?:http:)?//daxab\.com/[^"]+)".*?></iframe>', webpage, 'embed url'))
return {
'_type': 'url_transparent',
'url': embed_url,
}

View File

@@ -1,3 +1,4 @@
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -17,6 +18,21 @@ class BloombergIE(InfoExtractor):
'title': 'Shah\'s Presentation on Foreign-Exchange Strategies',
'description': 'md5:a8ba0302912d03d246979735c17d2761',
},
'params': {
'format': 'best[format_id^=hds]',
},
}, {
# video ID in BPlayer(...)
'url': 'http://www.bloomberg.com/features/2016-hello-world-new-zealand/',
'info_dict': {
'id': '938c7e72-3f25-4ddb-8b85-a9be731baa74',
'ext': 'flv',
'title': 'Meet the Real-Life Tech Wizards of Middle Earth',
'description': 'Hello World, Episode 1: New Zealands freaky AI babies, robot exoskeletons, and a virtual you.',
},
'params': {
'format': 'best[format_id^=hds]',
},
}, {
'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
'only_matching': True,
@@ -30,7 +46,11 @@ class BloombergIE(InfoExtractor):
webpage = self._download_webpage(url, name)
video_id = self._search_regex(
r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
webpage, 'id', group='url')
webpage, 'id', group='url', default=None)
if not video_id:
bplayer_data = self._parse_json(self._search_regex(
r'BPlayer\(null,\s*({[^;]+})\);', webpage, 'id'), name)
video_id = bplayer_data['id']
title = re.sub(': Video$', '', self._og_search_title(webpage))
embed_info = self._download_json(

View File

@@ -33,7 +33,7 @@ class BokeCCBaseIE(InfoExtractor):
class BokeCCIE(BokeCCBaseIE):
_IE_DESC = 'CC视频'
_VALID_URL = r'http://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{
'url': 'http://union.bokecc.com/playvideo.bo?vid=E44D40C15E65EA30&uid=CD0C5D3C8614B28B',

View File

@@ -12,7 +12,7 @@ from ..utils import (
class BpbIE(InfoExtractor):
IE_DESC = 'Bundeszentrale für politische Bildung'
_VALID_URL = r'http://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'
_VALID_URL = r'https?://(?:www\.)?bpb\.de/mediathek/(?P<id>[0-9]+)/'
_TEST = {
'url': 'http://www.bpb.de/mediathek/297/joachim-gauck-zu-1989-und-die-erinnerung-an-die-ddr',

View File

@@ -29,7 +29,8 @@ class BRIE(InfoExtractor):
'duration': 180,
'uploader': 'Reinhard Weber',
'upload_date': '20150422',
}
},
'skip': '404 not found',
},
{
'url': 'http://www.br.de/nachrichten/oberbayern/inhalt/muenchner-polizeipraesident-schreiber-gestorben-100.html',
@@ -40,7 +41,8 @@ class BRIE(InfoExtractor):
'title': 'Manfred Schreiber ist tot',
'description': 'md5:b454d867f2a9fc524ebe88c3f5092d97',
'duration': 26,
}
},
'skip': '404 not found',
},
{
'url': 'https://www.br-klassik.de/audio/peeping-tom-premierenkritik-dance-festival-muenchen-100.html',
@@ -51,7 +53,8 @@ class BRIE(InfoExtractor):
'title': 'Kurzweilig und sehr bewegend',
'description': 'md5:0351996e3283d64adeb38ede91fac54e',
'duration': 296,
}
},
'skip': '404 not found',
},
{
'url': 'http://www.br.de/radio/bayern1/service/team/videos/team-video-erdelt100.html',

View File

@@ -1,28 +1,74 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import smuggle_url
from .adobepass import AdobePassIE
from ..utils import (
smuggle_url,
update_url_query,
int_or_none,
)
class BravoTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+videos/(?P<id>[^/?]+)'
_TEST = {
class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale',
'md5': 'd60cdf68904e854fac669bd26cccf801',
'md5': '9086d0b7ef0ea2aabc4781d75f4e5863',
'info_dict': {
'id': 'LitrBdX64qLn',
'id': 'zHyk1_HU_mPy',
'ext': 'mp4',
'title': 'Last Chance Kitchen Returns',
'description': 'S13: Last Chance Kitchen Returns for Top Chef Season 13',
'title': 'LCK Ep 12: Fishy Finale',
'description': 'S13/E12: Two eliminated chefs have just 12 minutes to cook up a delicious fish dish.',
'uploader': 'NBCU-BRAV',
'upload_date': '20160302',
'timestamp': 1456945320,
}
}
}, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
account_pid = self._search_regex(r'"account_pid"\s*:\s*"([^"]+)"', webpage, 'account pid')
release_pid = self._search_regex(r'"release_pid"\s*:\s*"([^"]+)"', webpage, 'release pid')
return self.url_result(smuggle_url(
'http://link.theplatform.com/s/%s/%s?mbr=true&switch=progressive' % (account_pid, release_pid),
{'force_smil_url': True}), 'ThePlatform', release_pid)
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex(
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', webpage, 'drupal settings'),
display_id)
info = {}
query = {
'mbr': 'true',
}
account_pid, release_pid = [None] * 2
tve = settings.get('sharedTVE')
if tve:
query['manifest'] = 'm3u'
account_pid = 'HNK2IC'
release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('adobePass', {})
resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'),
tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
else:
shared_playlist = settings['shared_playlist']
account_pid = shared_playlist['account_pid']
metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']]
release_pid = metadata['release_pid']
info.update({
'title': metadata['title'],
'description': metadata.get('description'),
'season_number': int_or_none(metadata.get('season_num')),
'episode_number': int_or_none(metadata.get('episode_num')),
})
query['switch'] = 'progressive'
info.update({
'_type': 'url_transparent',
'id': release_pid,
'url': smuggle_url(update_url_query(
'http://link.theplatform.com/s/%s/%s' % (account_pid, release_pid),
query), {'force_smil_url': True}),
'ie_key': 'ThePlatform',
})
return info

View File

@@ -11,7 +11,7 @@ from ..utils import (
class BreakIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': {

View File

@@ -26,6 +26,8 @@ from ..utils import (
unescapeHTML,
unsmuggle_url,
update_url_query,
clean_html,
mimetype2ext,
)
@@ -46,6 +48,9 @@ class BrightcoveLegacyIE(InfoExtractor):
'title': 'Xavier Sala i Martín: “Un banc que no presta és un banc zombi que no serveix per a res”',
'uploader': '8TV',
'description': 'md5:a950cc4285c43e44d763d036710cd9cd',
'timestamp': 1368213670,
'upload_date': '20130510',
'uploader_id': '1589608506001',
}
},
{
@@ -57,6 +62,9 @@ class BrightcoveLegacyIE(InfoExtractor):
'title': 'JVMLS 2012: Arrays 2.0 - Opportunities and Challenges',
'description': 'John Rose speaks at the JVM Language Summit, August 1, 2012.',
'uploader': 'Oracle',
'timestamp': 1344975024,
'upload_date': '20120814',
'uploader_id': '1460825906',
},
},
{
@@ -68,6 +76,9 @@ class BrightcoveLegacyIE(InfoExtractor):
'title': 'This Bracelet Acts as a Personal Thermostat',
'description': 'md5:547b78c64f4112766ccf4e151c20b6a0',
'uploader': 'Mashable',
'timestamp': 1382041798,
'upload_date': '20131017',
'uploader_id': '1130468786001',
},
},
{
@@ -81,22 +92,26 @@ class BrightcoveLegacyIE(InfoExtractor):
'description': 'md5:363109c02998fee92ec02211bd8000df',
'uploader': 'National Ballet of Canada',
},
'skip': 'Video gone',
},
{
# test flv videos served by akamaihd.net
# From http://www.redbull.com/en/bike/stories/1331655643987/replay-uci-dh-world-cup-2014-from-fort-william
'url': 'http://c.brightcove.com/services/viewer/htmlFederated?%40videoPlayer=ref%3ABC2996102916001&linkBaseURL=http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fvideos%2F1331655630249%2Freplay-uci-fort-william-2014-dh&playerKey=AQ%7E%7E%2CAAAApYJ7UqE%7E%2Cxqr_zXk0I-zzNndy8NlHogrCb5QdyZRf&playerID=1398061561001#__youtubedl_smuggle=%7B%22Referer%22%3A+%22http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fstories%2F1331655643987%2Freplay-uci-dh-world-cup-2014-from-fort-william%22%7D',
'url': 'http://c.brightcove.com/services/viewer/htmlFederated?%40videoPlayer=ref%3Aevent-stream-356&linkBaseURL=http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fvideos%2F1331655630249%2Freplay-uci-fort-william-2014-dh&playerKey=AQ%7E%7E%2CAAAApYJ7UqE%7E%2Cxqr_zXk0I-zzNndy8NlHogrCb5QdyZRf&playerID=1398061561001#__youtubedl_smuggle=%7B%22Referer%22%3A+%22http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fstories%2F1331655643987%2Freplay-uci-dh-world-cup-2014-from-fort-william%22%7D',
# The md5 checksum changes on each download
'info_dict': {
'id': '2996102916001',
'id': '3750436379001',
'ext': 'flv',
'title': 'UCI MTB World Cup 2014: Fort William, UK - Downhill Finals',
'uploader': 'Red Bull TV',
'uploader': 'RBTV Old (do not use)',
'description': 'UCI MTB World Cup 2014: Fort William, UK - Downhill Finals',
'timestamp': 1409122195,
'upload_date': '20140827',
'uploader_id': '710858724001',
},
},
{
# playlist test
# playlist with 'videoList'
# from http://support.brightcove.com/en/video-cloud/docs/playlist-support-single-video-players
'url': 'http://c.brightcove.com/services/viewer/htmlFederated?playerID=3550052898001&playerKey=AQ%7E%7E%2CAAABmA9XpXk%7E%2C-Kp7jNgisre1fG5OdqpAFUTcs0lP_ZoL',
'info_dict': {
@@ -105,7 +120,22 @@ class BrightcoveLegacyIE(InfoExtractor):
},
'playlist_mincount': 7,
},
{
# playlist with 'playlistTab' (https://github.com/rg3/youtube-dl/issues/9965)
'url': 'http://c.brightcove.com/services/json/experience/runtime/?command=get_programming_for_experience&playerKey=AQ%7E%7E,AAABXlLMdok%7E,NJ4EoMlZ4rZdx9eU1rkMVd8EaYPBBUlg',
'info_dict': {
'id': '1522758701001',
'title': 'Lesson 08',
},
'playlist_mincount': 10,
},
]
FLV_VCODECS = {
1: 'SORENSON',
2: 'ON2',
3: 'H264',
4: 'VP8',
}
@classmethod
def _build_brighcove_url(cls, object_str):
@@ -136,13 +166,16 @@ class BrightcoveLegacyIE(InfoExtractor):
else:
flashvars = {}
data_url = object_doc.attrib.get('data', '')
data_url_params = compat_parse_qs(compat_urllib_parse_urlparse(data_url).query)
def find_param(name):
if name in flashvars:
return flashvars[name]
node = find_xpath_attr(object_doc, './param', 'name', name)
if node is not None:
return node.attrib['value']
return None
return data_url_params.get(name)
params = {}
@@ -277,24 +310,35 @@ class BrightcoveLegacyIE(InfoExtractor):
info_url, player_key, 'Downloading playlist information')
json_data = json.loads(playlist_info)
if 'videoList' not in json_data:
if 'videoList' in json_data:
playlist_info = json_data['videoList']
playlist_dto = playlist_info['mediaCollectionDTO']
elif 'playlistTabs' in json_data:
playlist_info = json_data['playlistTabs']
playlist_dto = playlist_info['lineupListDTO']['playlistDTOs'][0]
else:
raise ExtractorError('Empty playlist')
playlist_info = json_data['videoList']
videos = [self._extract_video_info(video_info) for video_info in playlist_info['mediaCollectionDTO']['videoDTOs']]
videos = [self._extract_video_info(video_info) for video_info in playlist_dto['videoDTOs']]
return self.playlist_result(videos, playlist_id='%s' % playlist_info['id'],
playlist_title=playlist_info['mediaCollectionDTO']['displayName'])
playlist_title=playlist_dto['displayName'])
def _extract_video_info(self, video_info):
video_id = compat_str(video_info['id'])
publisher_id = video_info.get('publisherId')
info = {
'id': compat_str(video_info['id']),
'id': video_id,
'title': video_info['displayName'].strip(),
'description': video_info.get('shortDescription'),
'thumbnail': video_info.get('videoStillURL') or video_info.get('thumbnailURL'),
'uploader': video_info.get('publisherName'),
'uploader_id': compat_str(publisher_id) if publisher_id else None,
'duration': float_or_none(video_info.get('length'), 1000),
'timestamp': int_or_none(video_info.get('creationDate'), 1000),
}
renditions = video_info.get('renditions')
renditions = video_info.get('renditions', []) + video_info.get('IOSRenditions', [])
if renditions:
formats = []
for rend in renditions:
@@ -306,7 +350,8 @@ class BrightcoveLegacyIE(InfoExtractor):
url_comp = compat_urllib_parse_urlparse(url)
if url_comp.path.endswith('.m3u8'):
formats.extend(
self._extract_m3u8_formats(url, info['id'], 'mp4'))
self._extract_m3u8_formats(
url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
continue
elif 'akamaihd.net' in url_comp.netloc:
# This type of renditions are served through
@@ -315,19 +360,42 @@ class BrightcoveLegacyIE(InfoExtractor):
ext = 'flv'
if ext is None:
ext = determine_ext(url)
size = rend.get('size')
formats.append({
tbr = int_or_none(rend.get('encodingRate'), 1000)
a_format = {
'format_id': 'http%s' % ('-%s' % tbr if tbr else ''),
'url': url,
'ext': ext,
'height': rend.get('frameHeight'),
'width': rend.get('frameWidth'),
'filesize': size if size != 0 else None,
})
'filesize': int_or_none(rend.get('size')) or None,
'tbr': tbr,
}
if rend.get('audioOnly'):
a_format.update({
'vcodec': 'none',
})
else:
a_format.update({
'height': int_or_none(rend.get('frameHeight')),
'width': int_or_none(rend.get('frameWidth')),
'vcodec': rend.get('videoCodec'),
})
# m3u8 manifests with remote == false are media playlists
# Not calling _extract_m3u8_formats here to save network traffic
if ext == 'm3u8':
a_format.update({
'format_id': 'hls%s' % ('-%s' % tbr if tbr else ''),
'ext': 'mp4',
'protocol': 'm3u8_native',
})
formats.append(a_format)
self._sort_formats(formats)
info['formats'] = formats
elif video_info.get('FLVFullLengthURL') is not None:
info.update({
'url': video_info['FLVFullLengthURL'],
'vcodec': self.FLV_VCODECS.get(video_info.get('FLVFullCodec')),
'filesize': int_or_none(video_info.get('FLVFullSize')),
})
if self._downloader.params.get('include_ads', False):
@@ -347,7 +415,7 @@ class BrightcoveLegacyIE(InfoExtractor):
return ad_info
if 'url' not in info and not info.get('formats'):
raise ExtractorError('Unable to extract video url for %s' % info['id'])
raise ExtractorError('Unable to extract video url for %s' % video_id)
return info
@@ -383,6 +451,7 @@ class BrightcoveNewIE(InfoExtractor):
'formats': 'mincount:41',
},
'params': {
# m3u8 download
'skip_download': True,
}
}, {
@@ -393,6 +462,10 @@ class BrightcoveNewIE(InfoExtractor):
# non numeric ref: prefixed video id
'url': 'http://players.brightcove.net/710858724001/default_default/index.html?videoId=ref:event-stream-356',
'only_matching': True,
}, {
# unavailable video without message but with error_code
'url': 'http://players.brightcove.net/1305187701/c832abfb-641b-44eb-9da0-2fe76786505f_default/index.html?videoId=4377407326001',
'only_matching': True,
}]
@staticmethod
@@ -426,7 +499,7 @@ class BrightcoveNewIE(InfoExtractor):
</video>.*?
<script[^>]+
src=["\'](?:https?:)?//players\.brightcove\.net/
(\d+)/([\da-f-]+)_([^/]+)/index(?:\.min)?\.js
(\d+)/([^/]+)_([^/]+)/index(?:\.min)?\.js
''', webpage):
entries.append(
'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s'
@@ -463,23 +536,26 @@ class BrightcoveNewIE(InfoExtractor):
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
json_data = self._parse_json(e.cause.read().decode(), video_id)
raise ExtractorError(json_data[0]['message'], expected=True)
json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
raise ExtractorError(
json_data.get('message') or json_data['error_code'], expected=True)
raise
title = json_data['name']
title = json_data['name'].strip()
formats = []
for source in json_data.get('sources', []):
container = source.get('container')
source_type = source.get('type')
ext = mimetype2ext(source.get('type'))
src = source.get('src')
if source_type == 'application/x-mpegURL' or container == 'M2TS':
if ext == 'ism':
continue
elif ext == 'm3u8' or container == 'M2TS':
if not src:
continue
formats.extend(self._extract_m3u8_formats(
src, video_id, 'mp4', m3u8_id='hls', fatal=False))
elif source_type == 'application/dash+xml':
src, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
elif ext == 'mpd':
if not src:
continue
formats.extend(self._extract_mpd_formats(src, video_id, 'dash', fatal=False))
@@ -495,7 +571,7 @@ class BrightcoveNewIE(InfoExtractor):
'tbr': tbr,
'filesize': int_or_none(source.get('size')),
'container': container,
'ext': container.lower(),
'ext': ext or container.lower(),
}
if width == 0 and height == 0:
f.update({
@@ -520,7 +596,7 @@ class BrightcoveNewIE(InfoExtractor):
f.update({
'url': src or streaming_src,
'format_id': build_format_id('http' if src else 'http-streaming'),
'preference': 2 if src else 1,
'source_preference': 0 if src else -1,
})
else:
f.update({
@@ -529,22 +605,31 @@ class BrightcoveNewIE(InfoExtractor):
'format_id': build_format_id('rtmp'),
})
formats.append(f)
errors = json_data.get('errors')
if not formats and errors:
error = errors[0]
raise ExtractorError(
error.get('message') or error.get('error_subcode') or error['error_code'], expected=True)
self._sort_formats(formats)
description = json_data.get('description')
thumbnail = json_data.get('thumbnail')
timestamp = parse_iso8601(json_data.get('published_at'))
duration = float_or_none(json_data.get('duration'), 1000)
tags = json_data.get('tags', [])
subtitles = {}
for text_track in json_data.get('text_tracks', []):
if text_track.get('src'):
subtitles.setdefault(text_track.get('srclang'), []).append({
'url': text_track['src'],
})
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'description': clean_html(json_data.get('description')),
'thumbnail': json_data.get('thumbnail') or json_data.get('poster'),
'duration': float_or_none(json_data.get('duration'), 1000),
'timestamp': parse_iso8601(json_data.get('published_at')),
'uploader_id': account_id,
'formats': formats,
'tags': tags,
'subtitles': subtitles,
'tags': json_data.get('tags', []),
}

View File

@@ -5,6 +5,7 @@ import json
import re
from .common import InfoExtractor
from .facebook import FacebookIE
class BuzzFeedIE(InfoExtractor):
@@ -20,11 +21,11 @@ class BuzzFeedIE(InfoExtractor):
'info_dict': {
'id': 'aVCR29aE_OQ',
'ext': 'mp4',
'title': 'Angry Ram destroys a punching bag..',
'description': 'md5:c59533190ef23fd4458a5e8c8c872345',
'upload_date': '20141024',
'uploader_id': 'Buddhanz1',
'description': 'He likes to stay in shape with his heavy bag, he wont stop until its on the ground\n\nFollow Angry Ram on Facebook for regular updates -\nhttps://www.facebook.com/pages/Angry-Ram/1436897249899558?ref=hl',
'uploader': 'Buddhanz',
'title': 'Angry Ram destroys a punching bag',
'uploader': 'Angry Ram',
}
}]
}, {
@@ -41,13 +42,30 @@ class BuzzFeedIE(InfoExtractor):
'info_dict': {
'id': 'mVmBL8B-In0',
'ext': 'mp4',
'title': 're:Munchkin the Teddy Bear gets her exercise',
'description': 'md5:28faab95cda6e361bcff06ec12fc21d8',
'upload_date': '20141124',
'uploader_id': 'CindysMunchkin',
'description': 're:© 2014 Munchkin the',
'uploader': 're:^Munchkin the',
'title': 're:Munchkin the Teddy Bear gets her exercise',
},
}]
}, {
'url': 'http://www.buzzfeed.com/craigsilverman/the-most-adorable-crash-landing-ever#.eq7pX0BAmK',
'info_dict': {
'id': 'the-most-adorable-crash-landing-ever',
'title': 'Watch This Baby Goose Make The Most Adorable Crash Landing',
'description': 'This gosling knows how to stick a landing.',
},
'playlist': [{
'md5': '763ca415512f91ca62e4621086900a23',
'info_dict': {
'id': '971793786185728',
'ext': 'mp4',
'title': 'We set up crash pads so that the goslings on our roof would have a safe landi...',
'uploader': 'Calgary Outdoor Centre-University of Calgary',
},
}],
'add_ie': ['Facebook'],
}]
def _real_extract(self, url):
@@ -66,6 +84,10 @@ class BuzzFeedIE(InfoExtractor):
continue
entries.append(self.url_result(video['url']))
facebook_url = FacebookIE._extract_url(webpage)
if facebook_url:
entries.append(self.url_result(facebook_url))
return {
'_type': 'playlist',
'id': playlist_id,

View File

@@ -11,6 +11,7 @@ class BYUtvIE(InfoExtractor):
_VALID_URL = r'^https?://(?:www\.)?byutv.org/watch/[0-9a-f-]+/(?P<video_id>[^/?#]+)'
_TEST = {
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5',
'md5': '05850eb8c749e2ee05ad5a1c34668493',
'info_dict': {
'id': 'studio-c-season-5-episode-5',
'ext': 'mp4',
@@ -21,7 +22,8 @@ class BYUtvIE(InfoExtractor):
},
'params': {
'skip_download': True,
}
},
'add_ie': ['Ooyala'],
}
def _real_extract(self, url):

View File

@@ -1,22 +1,23 @@
# coding: utf-8
from __future__ import unicode_literals
import datetime
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_parse_urlencode,
compat_urlparse,
)
from ..utils import (
parse_iso8601,
clean_html,
parse_duration,
str_to_int,
unified_strdate,
)
class CamdemyIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
_TESTS = [{
# single file
'url': 'http://www.camdemy.com/media/5181/',
@@ -26,14 +27,14 @@ class CamdemyIE(InfoExtractor):
'ext': 'mp4',
'title': 'Ch1-1 Introduction, Signals (02-23-2012)',
'thumbnail': 're:^https?://.*\.jpg$',
'description': '',
'creator': 'ss11spring',
'duration': 1591,
'upload_date': '20130114',
'timestamp': 1358154556,
'view_count': int,
}
}, {
# With non-empty description
# webpage returns "No permission or not login"
'url': 'http://www.camdemy.com/media/13885',
'md5': '4576a3bb2581f86c61044822adbd1249',
'info_dict': {
@@ -41,70 +42,77 @@ class CamdemyIE(InfoExtractor):
'ext': 'mp4',
'title': 'EverCam + Camdemy QuickStart',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'md5:050b62f71ed62928f8a35f1a41e186c9',
'description': 'md5:2a9f989c2b153a2342acee579c6e7db6',
'creator': 'evercam',
'upload_date': '20140620',
'timestamp': 1403271569,
'duration': 318,
}
}, {
# External source
# External source (YouTube)
'url': 'http://www.camdemy.com/media/14842',
'md5': '50e1c3c3aa233d3d7b7daa2fa10b1cf7',
'info_dict': {
'id': '2vsYQzNIsJo',
'ext': 'mp4',
'title': 'Excel 2013 Tutorial - How to add Password Protection',
'description': 'Excel 2013 Tutorial for Beginners - How to add Password Protection',
'upload_date': '20130211',
'uploader': 'Hun Kim',
'description': 'Excel 2013 Tutorial for Beginners - How to add Password Protection',
'uploader_id': 'hunkimtutorials',
'title': 'Excel 2013 Tutorial - How to add Password Protection',
}
},
'params': {
'skip_download': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
page = self._download_webpage(url, video_id)
webpage = self._download_webpage(url, video_id)
src_from = self._html_search_regex(
r"<div class='srcFrom'>Source: <a title='([^']+)'", page,
'external source', default=None)
r"class=['\"]srcFrom['\"][^>]*>Sources?(?:\s+from)?\s*:\s*<a[^>]+(?:href|title)=(['\"])(?P<url>(?:(?!\1).)+)\1",
webpage, 'external source', default=None, group='url')
if src_from:
return self.url_result(src_from)
oembed_obj = self._download_json(
'http://www.camdemy.com/oembed/?format=json&url=' + url, video_id)
title = oembed_obj['title']
thumb_url = oembed_obj['thumbnail_url']
video_folder = compat_urlparse.urljoin(thumb_url, 'video/')
file_list_doc = self._download_xml(
compat_urlparse.urljoin(video_folder, 'fileList.xml'),
video_id, 'Filelist XML')
video_id, 'Downloading filelist XML')
file_name = file_list_doc.find('./video/item/fileName').text
video_url = compat_urlparse.urljoin(video_folder, file_name)
timestamp = parse_iso8601(self._html_search_regex(
r"<div class='title'>Posted\s*:</div>\s*<div class='value'>([^<>]+)<",
page, 'creation time', fatal=False),
delimiter=' ', timezone=datetime.timedelta(hours=8))
view_count = str_to_int(self._html_search_regex(
r"<div class='title'>Views\s*:</div>\s*<div class='value'>([^<>]+)<",
page, 'view count', fatal=False))
# Some URLs return "No permission or not login" in a webpage despite being
# freely available via oembed JSON URL (e.g. http://www.camdemy.com/media/13885)
upload_date = unified_strdate(self._search_regex(
r'>published on ([^<]+)<', webpage,
'upload date', default=None))
view_count = str_to_int(self._search_regex(
r'role=["\']viewCnt["\'][^>]*>([\d,.]+) views',
webpage, 'view count', default=None))
description = self._html_search_meta(
'description', webpage, default=None) or clean_html(
oembed_obj.get('description'))
return {
'id': video_id,
'url': video_url,
'title': oembed_obj['title'],
'title': title,
'thumbnail': thumb_url,
'description': self._html_search_meta('description', page),
'creator': oembed_obj['author_name'],
'duration': oembed_obj['duration'],
'timestamp': timestamp,
'description': description,
'creator': oembed_obj.get('author_name'),
'duration': parse_duration(oembed_obj.get('duration')),
'upload_date': upload_date,
'view_count': view_count,
}
class CamdemyFolderIE(InfoExtractor):
_VALID_URL = r'http://www.camdemy.com/folder/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?camdemy\.com/folder/(?P<id>\d+)'
_TESTS = [{
# links with trailing slash
'url': 'http://www.camdemy.com/folder/450',
@@ -139,7 +147,7 @@ class CamdemyFolderIE(InfoExtractor):
parsed_url = list(compat_urlparse.urlparse(url))
query = dict(compat_urlparse.parse_qsl(parsed_url[4]))
query.update({'displayMode': 'list'})
parsed_url[4] = compat_urllib_parse.urlencode(query)
parsed_url[4] = compat_urllib_parse_urlencode(query)
final_url = compat_urlparse.urlunparse(parsed_url)
page = self._download_webpage(final_url, folder_id)

View File

@@ -0,0 +1,87 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
unified_strdate,
)
class CamWithHerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?camwithher\.tv/view_video\.php\?.*\bviewkey=(?P<id>\w+)'
_TESTS = [{
'url': 'http://camwithher.tv/view_video.php?viewkey=6e9a24e2c0e842e1f177&page=&viewtype=&category=',
'info_dict': {
'id': '5644',
'ext': 'flv',
'title': 'Periscope Tease',
'description': 'In the clouds teasing on periscope to my favorite song',
'duration': 240,
'view_count': int,
'comment_count': int,
'uploader': 'MileenaK',
'upload_date': '20160322',
},
'params': {
'skip_download': True,
}
}, {
'url': 'http://camwithher.tv/view_video.php?viewkey=6dfd8b7c97531a459937',
'only_matching': True,
}, {
'url': 'http://camwithher.tv/view_video.php?page=&viewkey=6e9a24e2c0e842e1f177&viewtype=&category=',
'only_matching': True,
}, {
'url': 'http://camwithher.tv/view_video.php?viewkey=b6c3b5bea9515d1a1fc4&page=&viewtype=&category=mv',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
flv_id = self._html_search_regex(
r'<a[^>]+href=["\']/download/\?v=(\d+)', webpage, 'video id')
# Video URL construction algorithm is reverse-engineered from cwhplayer.swf
rtmp_url = 'rtmp://camwithher.tv/clipshare/%s' % (
('mp4:%s.mp4' % flv_id) if int(flv_id) > 2010 else flv_id)
title = self._html_search_regex(
r'<div[^>]+style="float:left"[^>]*>\s*<h2>(.+?)</h2>', webpage, 'title')
description = self._html_search_regex(
r'>Description:</span>(.+?)</div>', webpage, 'description', default=None)
runtime = self._search_regex(
r'Runtime\s*:\s*(.+?) \|', webpage, 'duration', default=None)
if runtime:
runtime = re.sub(r'[\s-]', '', runtime)
duration = parse_duration(runtime)
view_count = int_or_none(self._search_regex(
r'Views\s*:\s*(\d+)', webpage, 'view count', default=None))
comment_count = int_or_none(self._search_regex(
r'Comments\s*:\s*(\d+)', webpage, 'comment count', default=None))
uploader = self._search_regex(
r'Added by\s*:\s*<a[^>]+>([^<]+)</a>', webpage, 'uploader', default=None)
upload_date = unified_strdate(self._search_regex(
r'Added on\s*:\s*([\d-]+)', webpage, 'upload date', default=None))
return {
'id': flv_id,
'url': rtmp_url,
'ext': 'flv',
'no_resume': True,
'title': title,
'description': description,
'duration': duration,
'view_count': view_count,
'comment_count': comment_count,
'uploader': uploader,
'upload_date': upload_date,
}

View File

@@ -4,11 +4,11 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
from ..utils import (
ExtractorError,
HEADRequest,
unified_strdate,
url_basename,
qualities,
int_or_none,
)
@@ -16,24 +16,40 @@ from ..utils import (
class CanalplusIE(InfoExtractor):
IE_DESC = 'canalplus.fr, piwiplus.fr and d8.tv'
_VALID_URL = r'https?://(?:www\.(?P<site>canalplus\.fr|piwiplus\.fr|d8\.tv|itele\.fr)/.*?/(?P<path>.*)|player\.canalplus\.fr/#/(?P<id>[0-9]+))'
_VALID_URL = r'''(?x)
https?://
(?:
(?:
(?:(?:www|m)\.)?canalplus\.fr|
(?:www\.)?piwiplus\.fr|
(?:www\.)?d8\.tv|
(?:www\.)?c8\.fr|
(?:www\.)?d17\.tv|
(?:www\.)?itele\.fr
)/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
player\.canalplus\.fr/#/(?P<id>\d+)
)
'''
_VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s?format=json'
_SITE_ID_MAP = {
'canalplus.fr': 'cplus',
'piwiplus.fr': 'teletoon',
'd8.tv': 'd8',
'itele.fr': 'itele',
'canalplus': 'cplus',
'piwiplus': 'teletoon',
'd8': 'd8',
'c8': 'd8',
'd17': 'd17',
'itele': 'itele',
}
_TESTS = [{
'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1263092',
'md5': '12164a6f14ff6df8bd628e8ba9b10b78',
'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1192814',
'md5': '41f438a4904f7664b91b4ed0dec969dc',
'info_dict': {
'id': '1263092',
'id': '1192814',
'ext': 'mp4',
'title': 'Le Zapping - 13/05/15',
'description': 'md5:09738c0d06be4b5d06a0940edb0da73f',
'upload_date': '20150513',
'title': "L'Année du Zapping 2014 - L'Année du Zapping 2014",
'description': "Toute l'année 2014 dans un Zapping exceptionnel !",
'upload_date': '20150105',
},
}, {
'url': 'http://www.piwiplus.fr/videos-piwi/pid1405-le-labyrinthe-boing-super-ranger.html?vid=1108190',
@@ -46,35 +62,45 @@ class CanalplusIE(InfoExtractor):
},
'skip': 'Only works from France',
}, {
'url': 'http://www.d8.tv/d8-docs-mags/pid6589-d8-campagne-intime.html',
'url': 'http://www.d8.tv/d8-docs-mags/pid5198-d8-en-quete-d-actualite.html?vid=1390231',
'info_dict': {
'id': '966289',
'ext': 'flv',
'title': 'Campagne intime - Documentaire exceptionnel',
'description': 'md5:d2643b799fb190846ae09c61e59a859f',
'upload_date': '20131108',
},
'skip': 'videos get deleted after a while',
}, {
'url': 'http://www.itele.fr/france/video/aubervilliers-un-lycee-en-colere-111559',
'md5': '38b8f7934def74f0d6f3ba6c036a5f82',
'info_dict': {
'id': '1213714',
'id': '1390231',
'ext': 'mp4',
'title': 'Aubervilliers : un lycée en colère - Le 11/02/2015 à 06h45',
'description': 'md5:8216206ec53426ea6321321f3b3c16db',
'upload_date': '20150211',
'title': "Vacances pas chères : prix discount ou grosses dépenses ? - En quête d'actualité",
'description': 'md5:edb6cf1cb4a1e807b5dd089e1ac8bfc6',
'upload_date': '20160512',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.itele.fr/chroniques/invite-bruce-toussaint/thierry-solere-nicolas-sarkozy-officialisera-sa-candidature-a-la-primaire-quand-il-le-voudra-167224',
'info_dict': {
'id': '1398334',
'ext': 'mp4',
'title': "L'invité de Bruce Toussaint du 07/06/2016 - ",
'description': 'md5:40ac7c9ad0feaeb6f605bad986f61324',
'upload_date': '20160607',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://m.canalplus.fr/?vid=1398231',
'only_matching': True,
}, {
'url': 'http://www.d17.tv/emissions/pid8303-lolywood.html?vid=1397061',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.groupdict().get('id')
video_id = mobj.groupdict().get('id') or mobj.groupdict().get('vid')
site_id = self._SITE_ID_MAP[mobj.group('site') or 'canal']
site_id = self._SITE_ID_MAP[compat_urllib_parse_urlparse(url).netloc.rsplit('.', 2)[-2]]
# Beware, some subclasses do not define an id group
display_id = url_basename(mobj.group('path'))
display_id = mobj.group('display_id') or video_id
if video_id is None:
webpage = self._download_webpage(url, display_id)

View File

@@ -1,11 +1,13 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import float_or_none
class CanvasIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?canvas\.be/video/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?(?P<site_id>canvas|een)\.be/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.canvas.be/video/de-afspraak/najaar-2015/de-afspraak-veilt-voor-de-warmste-week',
'md5': 'ea838375a547ac787d4064d8c7860a6c',
@@ -38,22 +40,42 @@ class CanvasIE(InfoExtractor):
'params': {
'skip_download': True,
}
}, {
'url': 'https://www.een.be/sorry-voor-alles/herbekijk-sorry-voor-alles',
'info_dict': {
'id': 'mz-ast-11a587f8-b921-4266-82e2-0bce3e80d07f',
'display_id': 'herbekijk-sorry-voor-alles',
'ext': 'mp4',
'title': 'Herbekijk Sorry voor alles',
'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 3788.06,
},
'params': {
'skip_download': True,
}
}, {
'url': 'https://www.canvas.be/check-point/najaar-2016/de-politie-uw-vriend',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
site_id, display_id = mobj.group('site_id'), mobj.group('id')
webpage = self._download_webpage(url, display_id)
title = self._search_regex(
title = (self._search_regex(
r'<h1[^>]+class="video__body__header__title"[^>]*>(.+?)</h1>',
webpage, 'title', default=None) or self._og_search_title(webpage)
webpage, 'title', default=None) or self._og_search_title(
webpage)).strip()
video_id = self._html_search_regex(
r'data-video=(["\'])(?P<id>.+?)\1', webpage, 'video id', group='id')
r'data-video=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video id', group='id')
data = self._download_json(
'https://mediazone.vrt.be/api/v1/canvas/assets/%s' % video_id, display_id)
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
% (site_id, video_id), display_id)
formats = []
for target in data['targetUrls']:

View File

@@ -0,0 +1,88 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
float_or_none,
int_or_none,
try_get,
)
class CarambaTVIE(InfoExtractor):
_VALID_URL = r'(?:carambatv:|https?://video1\.carambatv\.ru/v/)(?P<id>\d+)'
_TESTS = [{
'url': 'http://video1.carambatv.ru/v/191910501',
'md5': '2f4a81b7cfd5ab866ee2d7270cb34a2a',
'info_dict': {
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg',
'duration': 2678.31,
},
}, {
'url': 'carambatv:191910501',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
'http://video1.carambatv.ru/v/%s/videoinfo.js' % video_id,
video_id)
title = video['title']
base_url = video.get('video') or 'http://video1.carambatv.ru/v/%s/' % video_id
formats = [{
'url': base_url + f['fn'],
'height': int_or_none(f.get('height')),
'format_id': '%sp' % f['height'] if f.get('height') else None,
} for f in video['qualities'] if f.get('fn')]
self._sort_formats(formats)
thumbnail = video.get('splash')
duration = float_or_none(try_get(
video, lambda x: x['annotations'][0]['end_time'], compat_str))
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}
class CarambaTVPageIE(InfoExtractor):
_VALID_URL = r'https?://carambatv\.ru/(?:[^/]+/)+(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://carambatv.ru/movie/bad-comedian/razborka-v-manile/',
'md5': '',
'info_dict': {
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 2678.31,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._og_search_property('video:iframe', webpage, default=None)
if not video_url:
video_id = self._search_regex(
r'(?:video_id|crmb_vuid)\s*[:=]\s*["\']?(\d+)',
webpage, 'video id')
video_url = 'carambatv:%s' % video_id
return self.url_result(video_url, CarambaTVIE.ie_key())

View File

@@ -0,0 +1,42 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .turner import TurnerBaseIE
class CartoonNetworkIE(TurnerBaseIE):
_VALID_URL = r'https?://(?:www\.)?cartoonnetwork\.com/video/(?:[^/]+/)+(?P<id>[^/?#]+)-(?:clip|episode)\.html'
_TEST = {
'url': 'http://www.cartoonnetwork.com/video/teen-titans-go/starfire-the-cat-lady-clip.html',
'info_dict': {
'id': '8a250ab04ed07e6c014ef3f1e2f9016c',
'ext': 'mp4',
'title': 'Starfire the Cat Lady',
'description': 'Robin decides to become a cat so that Starfire will finally love him.',
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
id_type, video_id = re.search(r"_cnglobal\.cvp(Video|Title)Id\s*=\s*'([^']+)';", webpage).groups()
query = ('id' if id_type == 'Video' else 'titleId') + '=' + video_id
return self._extract_cvp_info(
'http://www.cartoonnetwork.com/video-seo-svc/episodeservices/getCvpPlaylist?networkName=CN2&' + query, video_id, {
'secure': {
'media_src': 'http://androidhls-secure.cdn.turner.com/toon/big',
'tokenizer_src': 'http://www.cartoonnetwork.com/cntv/mvpd/processors/services/token_ipadAdobe.do',
},
}, {
'url': url,
'site_name': 'CartoonNetwork',
'auth_required': self._search_regex(
r'_cnglobal\.cvpFullOrPreviewAuth\s*=\s*(true|false);',
webpage, 'auth required', default='false') == 'true',
})

View File

@@ -4,64 +4,92 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import js_to_json
from ..compat import compat_str
from ..utils import (
js_to_json,
smuggle_url,
try_get,
xpath_text,
xpath_element,
xpath_with_ns,
find_xpath_attr,
parse_iso8601,
parse_age_limit,
int_or_none,
ExtractorError,
)
class CBCIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cbc\.ca/(?:[^/]+/)+(?P<id>[^/?#]+)'
IE_NAME = 'cbc.ca'
_VALID_URL = r'https?://(?:www\.)?cbc\.ca/(?!player/)(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
# with mediaId
'url': 'http://www.cbc.ca/22minutes/videos/clips-season-23/don-cherry-play-offs',
'md5': '97e24d09672fc4cf56256d6faa6c25bc',
'info_dict': {
'id': '2682904050',
'ext': 'flv',
'ext': 'mp4',
'title': 'Don Cherry All-Stars',
'description': 'Don Cherry has a bee in his bonnet about AHL player John Scott because that guys got heart.',
'timestamp': 1454475540,
'timestamp': 1454463000,
'upload_date': '20160203',
'uploader': 'CBCC-NEW',
},
'params': {
# rtmp download
'skip_download': True,
'skip': 'Geo-restricted to Canada',
}, {
# with clipId, feed available via tpfeed.cbc.ca and feed.theplatform.com
'url': 'http://www.cbc.ca/22minutes/videos/22-minutes-update/22-minutes-update-episode-4',
'md5': '162adfa070274b144f4fdc3c3b8207db',
'info_dict': {
'id': '2414435309',
'ext': 'mp4',
'title': '22 Minutes Update: What Not To Wear Quebec',
'description': "This week's latest Canadian top political story is What Not To Wear Quebec.",
'upload_date': '20131025',
'uploader': 'CBCC-NEW',
'timestamp': 1382717907,
},
}, {
# with clipId
# with clipId, feed only available via tpfeed.cbc.ca
'url': 'http://www.cbc.ca/archives/entry/1978-robin-williams-freestyles-on-90-minutes-live',
'md5': '0274a90b51a9b4971fe005c63f592f12',
'info_dict': {
'id': '2487345465',
'ext': 'flv',
'ext': 'mp4',
'title': 'Robin Williams freestyles on 90 Minutes Live',
'description': 'Wacky American comedian Robin Williams shows off his infamous "freestyle" comedic talents while being interviewed on CBC\'s 90 Minutes Live.',
'upload_date': '19700101',
},
'params': {
# rtmp download
'skip_download': True,
'upload_date': '19780210',
'uploader': 'CBCC-NEW',
'timestamp': 255977160,
},
}, {
# multiple iframes
'url': 'http://www.cbc.ca/natureofthings/blog/birds-eye-view-from-vancouvers-burrard-street-bridge-how-we-got-the-shot',
'playlist': [{
'md5': '377572d0b49c4ce0c9ad77470e0b96b4',
'info_dict': {
'id': '2680832926',
'ext': 'flv',
'ext': 'mp4',
'title': 'An Eagle\'s-Eye View Off Burrard Bridge',
'description': 'Hercules the eagle flies from Vancouver\'s Burrard Bridge down to a nearby park with a mini-camera strapped to his back.',
'upload_date': '19700101',
'upload_date': '20160201',
'timestamp': 1454342820,
'uploader': 'CBCC-NEW',
},
}, {
'md5': '415a0e3f586113894174dfb31aa5bb1a',
'info_dict': {
'id': '2658915080',
'ext': 'flv',
'ext': 'mp4',
'title': 'Fly like an eagle!',
'description': 'Eagle equipped with a mini camera flies from the world\'s tallest tower',
'upload_date': '19700101',
'upload_date': '20150315',
'timestamp': 1426443984,
'uploader': 'CBCC-NEW',
},
}],
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'Geo-restricted to Canada',
}]
@classmethod
@@ -79,9 +107,15 @@ class CBCIE(InfoExtractor):
media_id = player_info.get('mediaId')
if not media_id:
clip_id = player_info['clipId']
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
feed = self._download_json(
'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
clip_id, fatal=False)
if feed:
media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
if not media_id:
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
else:
entries = [self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)]
@@ -89,25 +123,219 @@ class CBCIE(InfoExtractor):
class CBCPlayerIE(InfoExtractor):
IE_NAME = 'cbc.ca:player'
_VALID_URL = r'(?:cbcplayer:|https?://(?:www\.)?cbc\.ca/(?:player/play/|i/caffeine/syndicate/\?mediaId=))(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://www.cbc.ca/player/play/2683190193',
'md5': '64d25f841ddf4ddb28a235338af32e2c',
'info_dict': {
'id': '2683190193',
'ext': 'flv',
'ext': 'mp4',
'title': 'Gerry Runs a Sweat Shop',
'description': 'md5:b457e1c01e8ff408d9d801c1c2cd29b0',
'timestamp': 1455067800,
'timestamp': 1455071400,
'upload_date': '20160210',
'uploader': 'CBCC-NEW',
},
'params': {
# rtmp download
'skip_download': True,
'skip': 'Geo-restricted to Canada',
}, {
# Redirected from http://www.cbc.ca/player/AudioMobile/All%20in%20a%20Weekend%20Montreal/ID/2657632011/
'url': 'http://www.cbc.ca/player/play/2657631896',
'md5': 'e5e708c34ae6fca156aafe17c43e8b75',
'info_dict': {
'id': '2657631896',
'ext': 'mp3',
'title': 'CBC Montreal is organizing its first ever community hackathon!',
'description': 'The modern technology we tend to depend on so heavily, is never without it\'s share of hiccups and headaches. Next weekend - CBC Montreal will be getting members of the public for its first Hackathon.',
'timestamp': 1425704400,
'upload_date': '20150307',
'uploader': 'CBCC-NEW',
},
}
}, {
# available only when we add `formats=MPEG4,FLV,MP3` to theplatform url
'url': 'http://www.cbc.ca/player/play/2164402062',
'md5': '17a61eb813539abea40618d6323a7f82',
'info_dict': {
'id': '2164402062',
'ext': 'flv',
'title': 'Cancer survivor four times over',
'description': 'Tim Mayer has beaten three different forms of cancer four times in five years.',
'timestamp': 1320410746,
'upload_date': '20111104',
'uploader': 'CBCC-NEW',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(
'http://feed.theplatform.com/f/ExhSPC/vms_5akSXx4Ng_Zn?byGuid=%s' % video_id,
'ThePlatformFeed', video_id)
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': smuggle_url(
'http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/%s?mbr=true&formats=MPEG4,FLV,MP3' % video_id, {
'force_smil_url': True
}),
'id': video_id,
}
class CBCWatchBaseIE(InfoExtractor):
_device_id = None
_device_token = None
_API_BASE_URL = 'https://api-cbc.cloud.clearleap.com/cloffice/client/'
_NS_MAP = {
'media': 'http://search.yahoo.com/mrss/',
'clearleap': 'http://www.clearleap.com/namespace/clearleap/1.0/',
}
def _call_api(self, path, video_id):
url = path if path.startswith('http') else self._API_BASE_URL + path
result = self._download_xml(url, video_id, headers={
'X-Clearleap-DeviceId': self._device_id,
'X-Clearleap-DeviceToken': self._device_token,
})
error_message = xpath_text(result, 'userMessage') or xpath_text(result, 'systemMessage')
if error_message:
raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message))
return result
def _real_initialize(self):
if not self._device_id or not self._device_token:
device = self._downloader.cache.load('cbcwatch', 'device') or {}
self._device_id, self._device_token = device.get('id'), device.get('token')
if not self._device_id or not self._device_token:
result = self._download_xml(
self._API_BASE_URL + 'device/register',
None, data=b'<device><type>web</type></device>')
self._device_id = xpath_text(result, 'deviceId', fatal=True)
self._device_token = xpath_text(result, 'deviceToken', fatal=True)
self._downloader.cache.store(
'cbcwatch', 'device', {
'id': self._device_id,
'token': self._device_token,
})
def _parse_rss_feed(self, rss):
channel = xpath_element(rss, 'channel', fatal=True)
def _add_ns(path):
return xpath_with_ns(path, self._NS_MAP)
entries = []
for item in channel.findall('item'):
guid = xpath_text(item, 'guid', fatal=True)
title = xpath_text(item, 'title', fatal=True)
media_group = xpath_element(item, _add_ns('media:group'), fatal=True)
content = xpath_element(media_group, _add_ns('media:content'), fatal=True)
content_url = content.attrib['url']
thumbnails = []
for thumbnail in media_group.findall(_add_ns('media:thumbnail')):
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({
'id': thumbnail.get('profile'),
'url': thumbnail_url,
'width': int_or_none(thumbnail.get('width')),
'height': int_or_none(thumbnail.get('height')),
})
timestamp = None
release_date = find_xpath_attr(
item, _add_ns('media:credit'), 'role', 'releaseDate')
if release_date is not None:
timestamp = parse_iso8601(release_date.text)
entries.append({
'_type': 'url_transparent',
'url': content_url,
'id': guid,
'title': title,
'description': xpath_text(item, 'description'),
'timestamp': timestamp,
'duration': int_or_none(content.get('duration')),
'age_limit': parse_age_limit(xpath_text(item, _add_ns('media:rating'))),
'episode': xpath_text(item, _add_ns('clearleap:episode')),
'episode_number': int_or_none(xpath_text(item, _add_ns('clearleap:episodeInSeason'))),
'series': xpath_text(item, _add_ns('clearleap:series')),
'season_number': int_or_none(xpath_text(item, _add_ns('clearleap:season'))),
'thumbnails': thumbnails,
'ie_key': 'CBCWatchVideo',
})
return self.playlist_result(
entries, xpath_text(channel, 'guid'),
xpath_text(channel, 'title'),
xpath_text(channel, 'description'))
class CBCWatchVideoIE(CBCWatchBaseIE):
IE_NAME = 'cbc.ca:watch:video'
_VALID_URL = r'https?://api-cbc\.cloud\.clearleap\.com/cloffice/client/web/play/?\?.*?\bcontentId=(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
def _real_extract(self, url):
video_id = self._match_id(url)
result = self._call_api(url, video_id)
m3u8_url = xpath_text(result, 'url', fatal=True)
formats = self._extract_m3u8_formats(re.sub(r'/([^/]+)/[^/?]+\.m3u8', r'/\1/\1.m3u8', m3u8_url), video_id, 'mp4', fatal=False)
if len(formats) < 2:
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
# Despite metadata in m3u8 all video+audio formats are
# actually video-only (no audio)
for f in formats:
if f.get('acodec') != 'none' and f.get('vcodec') != 'none':
f['acodec'] = 'none'
self._sort_formats(formats)
info = {
'id': video_id,
'title': video_id,
'formats': formats,
}
rss = xpath_element(result, 'rss')
if rss:
info.update(self._parse_rss_feed(rss)['entries'][0])
del info['url']
del info['_type']
del info['ie_key']
return info
class CBCWatchIE(CBCWatchBaseIE):
IE_NAME = 'cbc.ca:watch'
_VALID_URL = r'https?://watch\.cbc\.ca/(?:[^/]+/)+(?P<id>[0-9a-f-]+)'
_TESTS = [{
'url': 'http://watch.cbc.ca/doc-zone/season-6/customer-disservice/38e815a-009e3ab12e4',
'info_dict': {
'id': '38e815a-009e3ab12e4',
'ext': 'mp4',
'title': 'Customer (Dis)Service',
'description': 'md5:8bdd6913a0fe03d4b2a17ebe169c7c87',
'upload_date': '20160219',
'timestamp': 1455840000,
},
'params': {
# m3u8 download
'skip_download': True,
'format': 'bestvideo',
},
'skip': 'Geo-restricted to Canada',
}, {
'url': 'http://watch.cbc.ca/arthur/all/1ed4b385-cd84-49cf-95f0-80f004680057',
'info_dict': {
'id': '1ed4b385-cd84-49cf-95f0-80f004680057',
'title': 'Arthur',
'description': 'Arthur, the sweetest 8-year-old aardvark, and his pals solve all kinds of problems with humour, kindness and teamwork.',
},
'playlist_mincount': 30,
'skip': 'Geo-restricted to Canada',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
rss = self._call_api('web/browse/' + video_id, video_id)
return self._parse_rss_feed(rss)

Some files were not shown because too many files have changed in this diff Show More