Compare commits

288 Commits

Author SHA1 Message Date
Philipp Hagemeister
48c971e073 release 2015.03.24 2015-03-24 16:39:53 +01:00
Philipp Hagemeister
f5e2efbbf0 [options] Handle special characters in argv (Fixes #5157) 2015-03-24 16:39:46 +01:00
Sergey M․
b0872c19ea [npo] Skip broken URL links (Closes #5266) 2015-03-23 22:15:01 +06:00
Sergey M․
9f790b9901 [mlb] Improve _VALID_URL (Closes #5260) 2015-03-23 21:23:57 +06:00
Jaime Marquínez Ferrándiz
93f787070f [twitch] Only match digits for the video id
Urls can also contain a query (for example a timestamp '?t=foo')
2015-03-22 15:39:35 +01:00
Jaime Marquínez Ferrándiz
f9544f6e8f [test/aes] Test aes_decrypt_text with 256 bit 2015-03-22 12:09:58 +01:00
Jaime Marquínez Ferrándiz
336d19044c [lybsyn] pep8: add space around operator 2015-03-22 11:03:52 +01:00
Sergey M․
7866c9e173 Merge branch 'fstirlitz-the-daily-show-podcast' 2015-03-22 08:24:26 +06:00
Sergey M․
1a4123de04 [comedycentral] Remove unused import 2015-03-22 08:23:38 +06:00
Sergey M․
cf2e2eb1c0 [comedycentral] Drop thedailyshow podcast extractor
Generic extractor is just fine for Libsyn embeds
2015-03-22 08:23:20 +06:00
Sergey M․
2051acdeb2 [extractor/generic] Add test for Libsyn embed 2015-03-22 08:20:27 +06:00
Sergey M․
cefdf970cc [extractor/generic] Support Libsyn embeds 2015-03-22 08:18:13 +06:00
Sergey M․
a1d0aa7b88 [libsyn] Fix extractor alphabetic order 2015-03-22 08:11:47 +06:00
Sergey M․
49aeedb8cb [libsyn] Improve and simplify 2015-03-22 08:11:10 +06:00
Sergey M․
ef249a2cd7 Merge branch 'the-daily-show-podcast' of https://github.com/fstirlitz/youtube-dl into fstirlitz-the-daily-show-podcast 2015-03-22 07:44:28 +06:00
Sergey M․
a09141548a [nrk:playlist] Relax video id regex and improve _VALID_URL 2015-03-21 20:42:48 +06:00
Jaime Marquínez Ferrándiz
5379a2d40d [test/utils] Test xpath_text 2015-03-21 14:12:43 +01:00
Jaime Marquínez Ferrándiz
c9450c7ab1 [nrk:playlist] Restrict _VALID_URL
It would also match /videos/PS... urls
2015-03-21 14:00:37 +01:00
Sergey M․
faa1b5c292 [nrk:playlist] Add extractor (Closes #5245) 2015-03-21 18:22:08 +06:00
Sergey M․
393d9fc6d2 [nrk] Extract duration 2015-03-21 18:21:19 +06:00
Sergey M․
4e6a228689 [nrk] Adapt to new URL format 2015-03-21 18:20:49 +06:00
Jaime Marquínez Ferrándiz
179d6678b1 Remove the 'stitle' field
A warning has been printed for more than 2 years (since 97cd3afc75)
2015-03-21 12:34:44 +01:00
Jaime Marquínez Ferrándiz
85698c5086 [crunchyroll] Remove unused class 2015-03-21 12:18:33 +01:00
Jaime Marquínez Ferrándiz
a7d9ded45d [test] Add tests for aes 2015-03-21 12:07:23 +01:00
Jaime Marquínez Ferrándiz
531980d89c [test/YoutubeDL] test match_filter 2015-03-20 17:05:28 +01:00
Sergey M․
1887ecd4d6 [twitch] Fix login 2015-03-20 21:45:09 +06:00
Sergey M․
cd32c2caba Merge branch 'ndac-todoroki-niconico_nm' 2015-03-20 20:53:27 +06:00
Sergey M․
1c9a1457fc [niconico] Add nm video test 2015-03-20 20:53:14 +06:00
Sergey M․
038b0eb1da Merge branch 'niconico_nm' of https://github.com/ndac-todoroki/youtube-dl into ndac-todoroki-niconico_nm 2015-03-20 20:52:56 +06:00
Jaime Marquínez Ferrándiz
f20bf146e2 [test/YoutubeDL] split in two classes
The name was misleading
2015-03-20 15:14:25 +01:00
Jaime Marquínez Ferrándiz
01218f919b [test/http] Add test for proxy support 2015-03-20 14:59:38 +01:00
Naglis Jonaitis
2684871bc1 [vine] Fix formats extraction (Closes #5239) 2015-03-20 01:50:36 +02:00
Naglis Jonaitis
ccf3960eec [nytimes] Improve _VALID_URL (Fixes #5238) 2015-03-19 20:55:05 +02:00
Naglis Jonaitis
eecc0685c9 [videomega] Fix extraction and update test (Fixes #5235) 2015-03-19 19:38:03 +02:00
Sergey M․
2ed849eccf Merge branch 'master' of github.com:rg3/youtube-dl 2015-03-19 21:27:38 +06:00
Sergey M․
3378d67a18 [generic] Add support for nytimes embeds (Closes #5234) 2015-03-19 21:26:57 +06:00
Sergey M․
f3c0c667a6 [nytimes] Modernize 2015-03-19 21:23:52 +06:00
Sergey M․
0ae8bbac2d [nytimes] Support embed URL 2015-03-19 21:17:04 +06:00
Philipp Hagemeister
cbc3cfcab4 release 2015.03.18 2015-03-18 22:02:39 +01:00
Sergey M․
b30ef07c6c [ultimedia] Handle youtube embeds 2015-03-19 01:06:39 +06:00
Sergey M․
73900846b1 [ultimedia] Capture and output unavailable video message 2015-03-19 00:53:26 +06:00
Sergey M․
d1dc7e3991 [ultimedia] Fix alphabetic order 2015-03-18 23:11:48 +06:00
Sergey M․
3073a6d5e9 [ultimedia] Add extractor
Sponsored by thankyoumotion.com
2015-03-18 23:08:18 +06:00
Roman Le Négrate
aae53774f2 [mixcloud] Try preview server first, then further numbers 2015-03-18 17:08:22 +01:00
Jaime Marquínez Ferrándiz
7a757b7194 [mixcloud] Fix extraction of some metadata
The second test had some wrong info.
I couldn't find the timestamp, so I have removed it.
2015-03-18 17:08:19 +01:00
Roman Le Négrate
fa8ce26904 [mixcloud] Fix extraction like-count 2015-03-18 16:30:29 +01:00
Sergey M․
2c2c06e359 [krasview] Fix extraction (Closes #5228) 2015-03-18 20:28:00 +06:00
Todoroki
ee580538fa fix nm video DL issue when logged in 2015-03-18 22:24:17 +09:00
Todoroki
c3c5c31517 fix nm video DL issue when logged in 2015-03-18 22:19:55 +09:00
Sergey M․
ed9a25dd61 [generic] Generalize redirect regex 2015-03-18 00:05:40 +06:00
felix
9ef4f12b53 testcases for libsyn and The Daily Show Podcast extractors 2015-03-17 18:54:36 +01:00
Sergey M․
84f8101606 [generic] Follow redirects specified by Refresh HTTP header 2015-03-17 23:51:40 +06:00
Sergey M․
b1337948eb [grooveshark] Fix extraction 2015-03-17 23:13:43 +06:00
Sergey M․
98f02fdde2 Credit @jbuchbinder for primesharetv (#5123) 2015-03-17 22:33:05 +06:00
Sergey M․
048fdc2292 Merge branch 'bonfy-douyutv' 2015-03-17 22:27:46 +06:00
Sergey M․
2ca1c5aa9f [douyutv] Improve and extract all formats 2015-03-17 22:27:33 +06:00
Sergey M․
674fb0fcc5 Merge branch 'douyutv' of https://github.com/bonfy/youtube-dl into bonfy-douyutv 2015-03-17 21:41:25 +06:00
Sergey M․
00bfe40e4d Merge branch 'yan12125-sohu_fix' 2015-03-17 21:39:45 +06:00
Sergey M․
cd459b1d49 [sohu] Fix test's note info 2015-03-17 21:39:31 +06:00
Sergey M․
92a4793b3c [utils] Place sanitize url function near other sanitizing functions 2015-03-17 21:34:22 +06:00
Sergey M․
dc03a42537 Merge branch 'sohu_fix' of https://github.com/yan12125/youtube-dl into yan12125-sohu_fix 2015-03-17 21:18:36 +06:00
Sergey M․
219da6bb68 [megavideoeu] Remove extractor 2015-03-17 21:13:42 +06:00
Sergey M․
0499cd866e [primesharetv] Clean up 2015-03-17 21:06:38 +06:00
Jeff Buchbinder
13047f4135 [Primesharetv] Handle file not existing properly. 2015-03-17 20:33:32 +06:00
Jeff Buchbinder
af69cab21d [Primesharetv] Add public domain example video 2015-03-17 20:33:24 +06:00
Jeff Buchbinder
d41a3fa1b4 [Primesharetv] Add primeshare.tv extractor, still need test data 2015-03-17 20:33:16 +06:00
Jeff Buchbinder
733be371af Add megavideoz.eu support. 2015-03-17 20:33:03 +06:00
Sergey M․
576904bce6 [letv] Clarify download message 2015-03-17 20:01:31 +06:00
Sergey M.
cf47794f09 Merge pull request #5116 from yan12125/letv_fix
[Letv] Fix test_Letv and test_Letv_1 failures in python 3
2015-03-17 19:58:34 +06:00
Sergey M․
c06a9f8730 [arte+7] Check formats (Closes #5224) 2015-03-17 19:42:50 +06:00
felix
2e90dff2c2 The Daily Show Podcast support 2015-03-16 20:05:02 +01:00
Jaime Marquínez Ferrándiz
90183a46d8 Credit @eferro for the rtve.es:infantil extractor (#5214) 2015-03-15 22:49:03 +01:00
Jaime Marquínez Ferrándiz
b68eedba23 [rtve.es:infantil] Minor fixes (closes #5214) 2015-03-15 22:18:41 +01:00
Eduardo Ferro
d5b559393b [rtve] Add new extractor for rtve infantil 2015-03-15 22:14:36 +01:00
Philipp Hagemeister
1de4ac1385 release 2015.03.15 2015-03-15 19:38:50 +01:00
Sergey M․
39aa42ffbb [ard] Capture and output time restricted videos (Closes #5213) 2015-03-16 00:21:38 +06:00
Sergey M․
ec1b9577ba [cloudy] Fix key extraction (Closes #5211) 2015-03-15 22:42:13 +06:00
Sergey M.
3b4444f99a Merge pull request #5208 from admire93/master
Fix mistyped docstring indent
2015-03-15 17:20:50 +06:00
Kang Hyojun
613b2d9dc6 Fix mistyped docstring indent 2015-03-15 20:18:23 +09:00
Sergey M․
8f4cc22455 [aftenposten] Adapt to new URL format 2015-03-15 10:08:14 +06:00
Jaime Marquínez Ferrándiz
7c42327e0e tox.ini: Add python 3.4 2015-03-14 21:41:56 +01:00
Jaime Marquínez Ferrándiz
873383e9bd tox.ini: Run the same command as 'make offlinetest' by default 2015-03-14 21:41:15 +01:00
Jaime Marquínez Ferrándiz
8508557e77 [test/YoutubeDL] Use valid urls
It failed on python 3.4 when building the http_headers field
2015-03-14 20:51:42 +01:00
Jaime Marquínez Ferrándiz
4d1652484f [test/unicode_literals] Don't look into the .git and .tox directories
The .tox directory contains python code that we can't control
2015-03-14 20:25:37 +01:00
Jaime Marquínez Ferrándiz
88cf6fb368 [metadatafromtitle] Some improvements and cleanup
* Remove the 'songtitle' field, 'title' can be used instead.
* Remove newlines in the help text, for consistency with other options.
* Add 'from __future__ import unicode_literals'.
* Call '__init__' from the parent class.
* Add test for the format_to_regex method
2015-03-14 20:06:33 +01:00
phiresky
e7db87f700 Add metadata from title parser
(Closes #5125)
2015-03-14 19:46:22 +01:00
Yen Chi Hsuan
2cb434e53e [Sohu] Fix title extraction 2015-03-15 01:05:01 +08:00
Yen Chi Hsuan
cd65491c30 [Sohu] Add a multipart video test case 2015-03-15 00:59:49 +08:00
Jaime Marquínez Ferrándiz
082b1155a3 [livestream] Extract all videos in events (fixes #5198)
The webpage only contains the most recent ones, but if you scroll down more will appear.
2015-03-14 12:06:01 +01:00
Jaime Marquínez Ferrándiz
9202b1b787 [eighttracks] Remove unused import 2015-03-14 12:04:49 +01:00
Sergey M․
a7e01c438d [8tracks] Modernize 2015-03-14 15:55:21 +06:00
Sergey M․
05be67e77d [8tracks] Improve extraction 2015-03-14 15:54:23 +06:00
Sergey M․
85741b9986 [8tracks] Use predefined avg duration when duration is negative (Closes #5200) 2015-03-14 15:52:06 +06:00
Sergey M.
f247a199fe Merge pull request #5199 from MamayAlexander/yandexmusic
[yandexmusic] Site mirrors
2015-03-14 15:20:48 +06:00
Mamay Alexander
29171bc2d2 [yandexmusic] Site mirrors 2015-03-14 13:56:04 +06:00
Sergey M․
7be5a62ed7 [viewster] Improve extraction 2015-03-14 03:18:04 +06:00
Sergey M․
3647136f24 [viewster] Add extractor 2015-03-14 02:12:11 +06:00
Sergey M․
13598940e3 [kanalplay] Fix test 2015-03-14 01:27:21 +06:00
Sergey M․
0eb365868e Merge branch 'djpohly-beatport-pro' 2015-03-13 22:15:00 +06:00
Sergey M․
28c6411e49 Credit @djpohly for BeatportPro (#5189) 2015-03-13 22:14:51 +06:00
Sergey M․
bba3fc7960 [beatenpro] Fix tests 2015-03-13 22:13:50 +06:00
Sergey M․
fcd877013e [beatenpro] Simplify 2015-03-13 22:11:56 +06:00
Sergey M․
ba1d4c0488 [beatenpro] Improve display_id 2015-03-13 22:03:58 +06:00
Sergey M․
517bcca299 [beatenpro] Simplify and improve 2015-03-13 22:01:15 +06:00
Sergey M․
1b53778175 [beatenpro] Use generic format sort 2015-03-13 21:51:49 +06:00
Sergey M․
b7a0304d92 Merge branch 'beatport-pro' of https://github.com/djpohly/youtube-dl into djpohly-beatport-pro 2015-03-13 21:47:01 +06:00
Sergey M․
545315a985 [nrk] Use generic subtitles timecode formatter 2015-03-13 21:40:34 +06:00
Sergey M․
3f4327520c [kanalplay] Extract subtitles 2015-03-13 21:39:29 +06:00
Sergey M․
4a34f69ea6 [extractor/common] Add subtitles timecode formatter 2015-03-13 21:38:28 +06:00
Sergey M․
fb7e68833c [kanalplay] Add extractor (Closes #5188) 2015-03-13 20:51:44 +06:00
Philipp Hagemeister
486dd09e0b [YoutubeDL] Check for bytes instead of unicode output templates (#5192)
Also adapt the embedding examples for those poor souls still using 2.x.
2015-03-13 08:40:20 +01:00
Jaime Marquínez Ferrándiz
054b99a330 [jeuxvideo] Fix extraction (fixes #5190) 2015-03-12 22:33:59 +01:00
Devin J. Pohly
65c5e044c7 fix python2 2015-03-12 16:42:55 -04:00
Devin J. Pohly
11984c7467 [BeatportPro] Add new extractor
This extractor is for Beatport's 2-minute, low-quality track previews
only.  To obtain an entire track, you obviously have to purchase and
download it normally through the Beatport store!

Possible future improvements:
- Playlists for albums or other track-list pages
- User login to play from My Beatport, Hold Bin, or Cart
2015-03-12 16:03:37 -04:00
Jaime Marquínez Ferrándiz
3946864c8a [vimeo] Use https for all vimeo.com urls
Unfortunately vimeopro.com doesn't support it yet.
2015-03-12 19:08:16 +01:00
Jaime Marquínez Ferrándiz
b84037013e [vimeo] Fix login (#3886) 2015-03-12 18:45:00 +01:00
Sergey M.
1dbfc62d75 Merge pull request #5186 from leleobhz/master
* Change globo.py flash ver to 17.0.0.132 - Chrome 42.0.2311.22
2015-03-12 23:37:03 +06:00
Leonardo Amaral
d7d79106c7 * Change globo.py flash ver to 17.0.0.132 - Chrome 42.0.2311.22 2015-03-12 14:23:42 -03:00
Sergey M․
1138491631 [yam] Skip test 2015-03-12 21:59:46 +06:00
Sergey M․
71705fa70d [footyroom] Add extractor (Closes #5000) 2015-03-12 21:56:56 +06:00
Sergey M.
602814adab Merge pull request #5150 from yan12125/yam_fix
[Yam] Add an error detection and update test cases
2015-03-12 21:01:49 +06:00
Jaime Marquínez Ferrándiz
3a77719c5a Don't accept '-1' as format, 'all' is clearer 2015-03-11 17:38:35 +01:00
Sergey M․
7e195d0e92 [funnyordie] Add subtitles test 2015-03-11 22:00:37 +06:00
Sergey M․
e04793401d Merge branch 'pishposhmcgee-master' 2015-03-11 21:56:40 +06:00
Sergey M․
a3fbd18824 [funnyordie] Simplify subtitles 2015-03-11 21:56:22 +06:00
Sergey M․
c6052b8c14 Merge branch 'master' of https://github.com/pishposhmcgee/youtube-dl into pishposhmcgee-master 2015-03-11 21:45:43 +06:00
Sergey M․
c792b5011f [ssa] Add extractor (Closes #5169) 2015-03-11 21:15:36 +06:00
Sergey M․
32aaeca775 [npo] Improve smooth stream skipping and set low preference for streams other than hds and hls (Closes #5175) 2015-03-11 20:34:32 +06:00
pishposhmcgee
1593194c63 Update funnyordie.py 2015-03-10 15:35:35 -05:00
PishPosh.McGee
614a7e1e23 Added subtitles for FunnyOrDie 2015-03-10 15:22:46 -05:00
Sergey M․
2ebfeacabc [utils] Keep dot and dotdot unmodified (Closes #5171) 2015-03-10 00:50:11 +06:00
Jaime Marquínez Ferrándiz
f5d8f58a17 [yandexmusic:album] Improve _VALID_URL to avoid matching tracks urls 2015-03-09 18:17:22 +01:00
Jaime Marquínez Ferrándiz
937daef4a7 [niconico] Use '_match_id' 2015-03-09 18:12:41 +01:00
Jaime Marquínez Ferrándiz
dd77f14c64 [yandexmusic] PEP8: remove blank line at the end of file 2015-03-09 18:07:31 +01:00
Sergey M․
c36cbe5a8a Merge branch 'MamayAlexander-YandexMusic' 2015-03-09 21:46:44 +06:00
Sergey M․
41b2194f86 Credit @MamayAlexander for yandexmusic (#5168) 2015-03-09 21:46:31 +06:00
Sergey M․
d1e2e8f583 [yamusic] Rename to yandexmusic 2015-03-09 21:44:59 +06:00
Sergey M․
47fe42e1ab [yamusic] Improve, simplify, fix python3 issues and add tests 2015-03-09 21:43:46 +06:00
Mamay Alexander
4c60393854 [YandexMusic] Add new extractor 2015-03-09 19:06:49 +06:00
Philipp Hagemeister
f848215dfc release 2015.03.09 2015-03-09 03:02:03 +01:00
Philipp Hagemeister
dcca581967 Merge remote-tracking branch 'origin/master'
Conflicts:
	youtube_dl/YoutubeDL.py
2015-03-09 03:01:28 +01:00
Philipp Hagemeister
d475b3384c [README] Better bug reporting instructions
Also address private emails which I get more and more these days.
2015-03-09 03:00:03 +01:00
Sergey M․
dd7831fe94 [breakcom] Process only play purpose media formats (Closes #5164) 2015-03-09 04:55:35 +06:00
Naglis Jonaitis
cc08b11d16 [adultswim] Improve video_info extraction (Fixes #5152)
Look for video_info inside `slugged_video`, if slug is not found among collections.
Also, simplify a bit.
2015-03-08 21:35:04 +02:00
Philipp Hagemeister
8bba753cca [options] Rename --dump-intermediate-pages to --dump-pages for consistence with --write-pages 2015-03-08 18:37:43 +01:00
Jaime Marquínez Ferrándiz
43d6280d0a [downloader/f4m] Fix use of base64 in python 3.2 (fixes #5132)
b64decode needs a byte string, but on 3.4 it also accepts strings.
2015-03-08 18:25:11 +01:00
Sergey M․
e5a11a2293 [YoutubeDL] Sanitize path before creating non-existent paths (Closes #4324) 2015-03-08 22:09:42 +06:00
Sergey M․
f18ef2d144 [utils] Disallow trailing dot in sanitize_path for a path part 2015-03-08 22:08:48 +06:00
Sergey M․
1bb5c511a5 [YoutubeDL] Sanitize outtmpl as path 2015-03-08 20:57:30 +06:00
Sergey M․
d55de57b67 [utils] Fix sanitize_open 2015-03-08 20:56:28 +06:00
Sergey M․
a2aaf4dbc6 [utils] Add sanitize_path 2015-03-08 20:55:22 +06:00
Sergey M․
bdf6eee0ae [gazeta] Extend _VALID_URL 2015-03-08 19:17:54 +06:00
Naglis Jonaitis
8b910bda0c [teamcoco] Fix extraction 2015-03-08 14:28:53 +02:00
Naglis Jonaitis
24993e3b39 [vidme] Fix view_count extraction and remove comment_count extraction (Fixes #5133)
Comment counts seem to no longer be listed on vid.me
2015-03-08 14:12:10 +02:00
Sergey M․
11101076a1 [pladform] Fix format quality sorting 2015-03-08 18:09:47 +06:00
Sergey M․
f838875726 [pladform] Add support for embeds 2015-03-08 18:07:10 +06:00
Sergey M․
28778d6bae [pladform] Add extractor 2015-03-08 18:03:12 +06:00
Naglis Jonaitis
1132eae56d [gazeta] Add new extractor (Closes #4222) 2015-03-08 13:54:01 +02:00
Sergey M․
d34e79492d [twitch] Fix live streams (Closes #5158) 2015-03-08 16:54:11 +06:00
Philipp Hagemeister
ab205b9dc8 Revert "[YoutubeDL] Sanitize outtmpl as it may contain forbidden characters"
This reverts commit 7dcad95d4f.

The output template is most definitely allowed to contain forbidden characters; otherwise -o /foo/bar/vid.mp4 wouldn't work.
2015-03-07 22:18:22 +01:00
Sergey M․
7dcad95d4f [YoutubeDL] Sanitize outtmpl as it may contain forbidden characters 2015-03-08 01:13:23 +06:00
Sergey M․
8a48223a7b [eagleplatform] Remove debug output 2015-03-07 22:35:36 +06:00
Sergey M․
d47ae7f620 [eagleplatform] Add support for ClipYou embeds 2015-03-07 22:34:44 +06:00
Sergey M․
135c9c42bf [eagleplatform] Add support for embeds 2015-03-07 22:22:57 +06:00
Sergey M․
0bf79ac455 [eagleplatform] Add extractor 2015-03-07 22:16:23 +06:00
Sergey M․
98998cded6 [youtube:search_url] Fix extraction (Closes #5155) 2015-03-07 18:59:06 +06:00
Sergey M․
14137b5781 [orf:iptv] Add extractor (Closes #5140) 2015-03-07 17:31:03 +06:00
bonfy
a172d96292 [douyutv] Add new extractor 2015-03-07 14:05:56 +08:00
Jaime Marquínez Ferrándiz
23ba76bc0e [dailymotion] Replace test
It has been removed.
2015-03-06 22:45:05 +01:00
Jaime Marquínez Ferrándiz
61e00a9775 [vimeo] Use https for player.vimeo.com urls (closes #5147) 2015-03-06 22:39:05 +01:00
Jaime Marquínez Ferrándiz
d1508cd68d [vimeo:album] Fix password protected videos
Since it only uses https now, don't recognize http urls.
2015-03-06 22:16:26 +01:00
Jaime Marquínez Ferrándiz
9c85b5376d [vimeo] Fix and use '_verify_video_password' (#5001)
It only supports verifying the password over https now.

Use it instead of manually setting the 'password' cookie because it allows to check if the password is correct.
2015-03-06 19:08:27 +01:00
Jaime Marquínez Ferrándiz
3c6f245083 [vimeo] Fix upload date extraction 2015-03-06 18:16:56 +01:00
Sergey M․
f207019ce5 [extractor/common] Remove 'm3u8' from quality selection URL 2015-03-06 22:53:53 +06:00
Yen Chi Hsuan
bd05aa4e24 [Yam] Add an error detection and update test cases 2015-03-07 00:53:52 +08:00
Sergey M․
8dc9d361c2 [extractor/common] Fix format_id when last_media is None and always include m3u8_id if present
The rationale behind `m3u8_id` was to resolve duplicates when processing several m3u8 playlists within the same media that give equal resulting `format_id`'s,
e.g. `youtube-dl http://www.rts.ch/play/tv/passe-moi-les-jumelles/video/la-fee-des-bois-mustang-les-chemins-du-vent?id=3854925 -F`
2015-03-06 22:52:50 +06:00
Philipp Hagemeister
d0e958c71c [twitch:vod] Prefer source stream (Fixes #5143) 2015-03-06 10:53:49 +01:00
Philipp Hagemeister
a0bb7c5593 [extractor/common] Improve m3u format IDs (#5143) 2015-03-06 10:49:42 +01:00
Philipp Hagemeister
7feddd9fc7 [travis] Declare 3.2 (Fixes #5144) 2015-03-06 10:44:24 +01:00
Yen Chi Hsuan
55969016e9 [utils] Add a function to sanitize consecutive slashes in URLs 2015-03-06 12:43:49 +08:00
Philipp Hagemeister
9609f02e3c [vidme] Modernize 2015-03-05 22:34:56 +01:00
Yen Chi Hsuan
5c7495a194 [sohu] Correct wrong imports 2015-03-06 02:48:27 +08:00
Yen Chi Hsuan
5ee6fc974e [sohu] Fix info extractor and add tests 2015-03-06 02:43:39 +08:00
Naglis Jonaitis
c2ebea6580 [extremetube] Fix extraction (Closes #5127) 2015-03-05 14:45:38 +02:00
Sergey M․
12a129ec6d [playwire] Add extractor 2015-03-05 02:36:53 +06:00
Jaime Marquínez Ferrándiz
f28fe66970 [downloader/http] Add missing fields for _hook_progress call
It would fail if you run 'youtube-dl --no-part URL' a second time when the file has already been downloaded.

(Reported in Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1195779)
2015-03-04 12:14:38 +01:00
Jaime Marquínez Ferrándiz
123397317c [downloader/http] Remove wrong '_hook_progress' call (fixes #5117) 2015-03-03 18:45:56 +01:00
Naglis Jonaitis
dc570c4951 [lrt] Pass --realtime to rtmpdump 2015-03-03 18:41:34 +02:00
Naglis Jonaitis
22d3628319 [tvplay] Adapt _VALID_URL and test case to domain name change 2015-03-03 18:39:28 +02:00
Sergey M․
50c9949d7a [youporn] Improve JSON regex and preserve the old one 2015-03-03 21:39:04 +06:00
Sergey M.
376817c6d4 Merge pull request #5115 from chaos33/youporn-json
fix youporn extractor's json search regex
2015-03-03 21:32:13 +06:00
Yen Chi Hsuan
63fc800057 [Letv] Fix test_Letv and test_Letv_1 failures in python 3 2015-03-03 23:20:55 +08:00
chaos33
e0d0572b73 fix youporn extractor's json search regex 2015-03-03 22:53:05 +08:00
Philipp Hagemeister
7fde87c77d release 2015.03.03.1 2015-03-03 13:59:38 +01:00
Philipp Hagemeister
938c3f65b6 Merge branch 'cn-verification-proxy' 2015-03-03 13:57:29 +01:00
Philipp Hagemeister
2461f79d2a [utils] Correct per-request proxy handling 2015-03-03 13:56:06 +01:00
Philipp Hagemeister
499bfcbfd0 Make sure netrc works for all extractors with login support
Fixes #5112
2015-03-03 12:59:17 +01:00
Philipp Hagemeister
07490f8017 release 2015.03.03 2015-03-03 00:05:05 +01:00
Philipp Hagemeister
91410c9bfa [letv] Add --cn-verification-proxy (Closes #5077) 2015-03-03 00:03:06 +01:00
Philipp Hagemeister
a7440261c5 [utils] Strip leading dots
Fixes #2865, closes #5087
2015-03-02 19:07:19 +01:00
Philipp Hagemeister
76c73715fb [generic] Parse RSS enclosure URLs (Fixes #5091) 2015-03-02 18:21:31 +01:00
Philipp Hagemeister
c75f0b361a [downloader/external] Add support for custom options (Fixes #4885, closes #5098) 2015-03-02 18:21:31 +01:00
Sergey M․
295df4edb9 [soundcloud] Fix glitches (#5101) 2015-03-02 22:47:07 +06:00
Sergey M․
562ceab13d [soundcloud] Check direct links validity (Closes #5101) 2015-03-02 22:39:32 +06:00
Sergey M․
2f0f6578c3 [extractor/common] Assume non HTTP(S) URLs valid 2015-03-02 22:38:44 +06:00
Sergey M․
30cbd4e0d6 [lynda] Completely skip videos we don't have access to, extract base class and modernize (Closes #5093) 2015-03-02 22:12:10 +06:00
Sergey M.
549e58069c Merge pull request #5105 from Ftornik/Lynda-subtitle-hotfix-2
[lynda] Check for the empty subtitles
2015-03-02 21:15:26 +06:00
Sergey
7594be85ff [lynda] Check for the empty subtitle 2015-03-02 11:49:39 +02:00
Sergey M․
3630034609 [vk] Fix test (Closes #5100) 2015-03-02 03:30:18 +06:00
Sergey M․
4e01501bbf [vk] Fix extraction (Closes #4967, closes #4686) 2015-03-01 21:56:30 +06:00
Sergey M․
1aa5172f56 [vk] Catch temporarily unavailable video error message 2015-03-01 21:55:43 +06:00
Philipp Hagemeister
f7e2ee8fa6 Merge branch 'master' of github.com:rg3/youtube-dl 2015-03-01 12:05:13 +01:00
Philipp Hagemeister
66dc9a3701 [README] Document HTTP 429 (Closes #5092) 2015-03-01 12:04:39 +01:00
Jaime Marquínez Ferrándiz
31bd39256b --load-info: Use the fileinput module
It automatically handles the '-' filename as stdin
2015-03-01 11:54:48 +01:00
Jaime Marquínez Ferrándiz
003c69a84b Use shutil.get_terminal_size for getting the terminal width if it's available (python >= 3.3) 2015-02-28 21:44:57 +01:00
Philipp Hagemeister
0134901108 release 2015.02.28 2015-02-28 21:24:25 +01:00
Philipp Hagemeister
eee6293d57 [thechive] remove in favor of Kaltura (#5072) 2015-02-28 20:55:49 +01:00
Philipp Hagemeister
8237bec4f0 [escapist] Extract duration 2015-02-28 20:52:52 +01:00
Philipp Hagemeister
29cad7ad13 Merge remote-tracking branch 'origin/master' 2015-02-28 20:51:54 +01:00
Sergey M․
0d103de3b0 [twitch] Pass api_token along with every request (Closes #3986) 2015-02-28 22:59:55 +06:00
Sergey M․
a0090691d0 Merge branch 'HanYOLO-puls4' 2015-02-28 22:26:35 +06:00
Sergey M․
6c87c2eea8 [puls4] Improve and extract more metadata 2015-02-28 22:25:57 +06:00
Sergey M․
58c2ec6ab3 Merge branch 'puls4' of https://github.com/HanYOLO/youtube-dl 2015-02-28 21:39:10 +06:00
Sergey M․
df5ae3eb16 [oppetarkiv] Merge with svtplay 2015-02-28 21:25:04 +06:00
Sergey M․
efda2d7854 Merge branch 'thc202-oppetarkiv' 2015-02-28 21:12:23 +06:00
Sergey M․
e143f5dae9 [oppetarkiv] Extract f4m formats and age limit 2015-02-28 21:12:06 +06:00
Sergey M․
48218cdb97 Merge branch 'oppetarkiv' of https://github.com/thc202/youtube-dl into thc202-oppetarkiv 2015-02-28 20:41:56 +06:00
Jaime Marquínez Ferrándiz
e9fade72f3 Add postprocessor for converting subtitles (closes #4954) 2015-02-28 14:43:24 +01:00
Jaime Marquínez Ferrándiz
0f2c0d335b [YoutubeDL] Use the InfoExtractor._download_webpage method for getting the subtitles
It handles encodings better, for example for 'http://www.npo.nl/nos-journaal/14-02-2015/POW_00942207'
2015-02-28 14:03:27 +01:00
thc202
40b077bc7e [oppetarkiv] Add new extractor
Some, if not all, of the videos appear to be geo-blocked (Sweden).
Test might fail (403 Forbidden) if not run through a Swedish connection.
2015-02-27 22:27:30 +00:00
Sergey M․
a931092cb3 Merge branch 'puls4' of https://github.com/HanYOLO/youtube-dl into HanYOLO-puls4 2015-02-28 00:22:48 +06:00
Sergey M․
bd3749ed69 [kaltura] Extend _VALID_URL (Closes #5081) 2015-02-28 00:19:31 +06:00
Sergey M․
4ffbf77886 [odnoklassniki] Add extractor (Closes #5075) 2015-02-28 00:15:03 +06:00
Jaime Marquínez Ferrándiz
781a7ef60a [lynda] Use 'lstrip' for the subtitles
The newlines at the end are important, they separate each piece of text.
2015-02-27 16:18:18 +01:00
Sergey M.
5b2949ee0b Merge pull request #5076 from Ftornik/Lynda-subtitles-hotfix
[lynda] Fixed subtitles broken file
2015-02-27 20:56:54 +06:00
Sergey M․
a0d646135a [lynda] Extend _VALID_URL 2015-02-27 20:56:06 +06:00
HanYOLO
7862ad88b7 puls4 Add new extractor 2015-02-27 15:41:58 +01:00
Jaime Marquínez Ferrándiz
f3bff94cf9 [rtve] Extract duration 2015-02-27 12:24:51 +01:00
Sergey
0eba1e1782 [lynda] Fixed subtitles broken file 2015-02-27 00:51:22 +02:00
Naglis Jonaitis
e3216b82bf [generic] Support dynamic Kaltura embeds (#5016) (#5073) 2015-02-27 00:34:19 +02:00
Naglis Jonaitis
da419e2332 [musicvault] Use the Kaltura extractor 2015-02-26 23:47:45 +02:00
Naglis Jonaitis
0d97ef43be [kaltura] Add new extractor 2015-02-26 23:45:54 +02:00
anovicecodemonkey
1a2313a6f2 [TheChiveIE] added support for TheChive.com (Closes #5016) 2015-02-27 02:36:45 +10:30
Sergey M․
250a9bdfe2 [mpora] Improve _VALID_URL 2015-02-26 21:16:35 +06:00
Sergey M․
6317a3e9da [mpora] Fix extraction 2015-02-26 21:10:49 +06:00
Naglis Jonaitis
7ab7c9e932 [gamestar] Fix title extraction 2015-02-26 16:22:05 +02:00
Naglis Jonaitis
e129c5bc0d [laola1tv] Allow live stream downloads 2015-02-26 14:35:48 +02:00
PishPosh.McGee
2e241242a3 Adding subtitles 2015-02-26 03:59:35 -06:00
Philipp Hagemeister
9724e5d336 release 2015.02.26.2 2015-02-26 09:45:11 +01:00
Philipp Hagemeister
63a562f95e [escapist] Detect IP blocking and use another UA (Fixes #5069) 2015-02-26 09:19:26 +01:00
Philipp Hagemeister
5c340b0387 release 2015.02.26.1 2015-02-26 01:47:16 +01:00
Philipp Hagemeister
1c6510f57a [Makefile] clean pyc files in clean target 2015-02-26 01:47:12 +01:00
Philipp Hagemeister
2a15a98a6a [rmtp] Encode filename before invoking subprocess
This fixes #5066.
Reproducible with
LC_ALL=C youtube-dl "http://www.prosieben.de/tv/germanys-next-topmodel/video/playlist/ganze-folge-episode-2-das-casting-in-muenchen"
2015-02-26 01:44:20 +01:00
Philipp Hagemeister
72a406e7aa [extractor/common] Pass in video_id (#5057) 2015-02-26 01:35:43 +01:00
Philipp Hagemeister
feccc3ff37 Merge remote-tracking branch 'aajanki/wdr_live' 2015-02-26 01:34:01 +01:00
Philipp Hagemeister
265bfa2c79 [letv] Simplify 2015-02-26 01:30:18 +01:00
Philipp Hagemeister
8faf9b9b41 Merge remote-tracking branch 'yan12125/IE_Letv' 2015-02-26 01:26:55 +01:00
Philipp Hagemeister
84be7c230c Cred @duncankl for airmozilla 2015-02-26 01:25:54 +01:00
Philipp Hagemeister
3e675fabe0 [airmozilla] Be more tolerant when nonessential items are missing (#5030) 2015-02-26 01:25:00 +01:00
Philipp Hagemeister
cd5b4b0bc2 Merge remote-tracking branch 'duncankl/airmozilla' 2015-02-26 01:15:08 +01:00
Philipp Hagemeister
7ef822021b Merge remote-tracking branch 'mmue/fix-rtlnow' 2015-02-26 01:13:03 +01:00
Philipp Hagemeister
9a48926a57 [escapist] Add support for advertisements 2015-02-26 00:59:53 +01:00
Philipp Hagemeister
13cd97f3df release 2015.02.26 2015-02-26 00:42:02 +01:00
Philipp Hagemeister
183139340b [utils] Bump our user agent 2015-02-26 00:40:12 +01:00
Philipp Hagemeister
1c69bca258 [escapist] Fix config URL matching 2015-02-26 00:24:54 +01:00
Jaime Marquínez Ferrándiz
c10ea454dc [telecinco] Recognize more urls (closes #5065) 2015-02-25 23:52:54 +01:00
Markus Müller
9504fc21b5 Fix the RTL extractor for new episodes by using a different hostname 2015-02-25 23:27:19 +01:00
Jaime Marquínez Ferrándiz
13d8fbef30 [generic] Don't set the 'title' if it's not defined in the entry (closes #5061)
Some of them may be an 'url' result, which in general don't have the 'title' field.
2015-02-25 17:56:51 +01:00
Antti Ajanki
b8988b63a6 [wdr] Download a live stream 2015-02-24 21:23:59 +02:00
Antti Ajanki
5eaaeb7c31 [f4m] Tolerate missed fragments on live streams 2015-02-24 21:22:59 +02:00
Antti Ajanki
c4f8c453ae [f4m] Refresh fragment list periodically on live streams 2015-02-24 21:22:59 +02:00
Antti Ajanki
6f4ba54079 [extractor/common] Extract HTTP (possibly f4m) URLs from a .smil file 2015-02-24 21:22:59 +02:00
Antti Ajanki
637570326b [extractor/common] Extract the first of a seq of videos in a .smil file 2015-02-24 21:22:59 +02:00
Sergey M․
37f885650c [eporner] Simplify and hardcode age limit 2015-02-25 01:08:54 +06:00
Sergey M.
c8c34ccb20 Merge pull request #5056 from logon84/master
Eporner Fix (Closes #5050)
2015-02-25 01:05:35 +06:00
logon84
e765ed3a9c [eporner] Fix redirect_code error 2015-02-24 19:41:46 +01:00
Yen Chi Hsuan
677063594e [Letv] Update testcases 2015-02-25 02:10:55 +08:00
logon84
59c7cbd482 Update eporner.py
Updated to work. Old version shows an error about being unable to extract "redirect_code"
2015-02-24 18:58:32 +01:00
Yen Chi Hsuan
570311610e [Letv] Add playlist support 2015-02-25 01:26:44 +08:00
Sergey M․
41b264e77c [nrktv] Workaround subtitles conversion issues on python 2.6 (Closes #5036) 2015-02-24 23:06:44 +06:00
Philipp Hagemeister
df4bd0d53f [options] Add --yes-playlist as inverse of --no-playlist (Fixes #5051) 2015-02-24 17:25:02 +01:00
Yen Chi Hsuan
7f09a662a0 [Letv] Add new extractor. Single video only 2015-02-24 23:58:21 +08:00
Philipp Hagemeister
4f3b21e1c7 release 2015.02.24.2 2015-02-24 16:34:42 +01:00
Philipp Hagemeister
54233c9080 [escapist] Support JavaScript player (Fixes #5034) 2015-02-24 16:33:07 +01:00
Philipp Hagemeister
db8e13ef71 release 2015.02.24.1 2015-02-24 11:38:21 +01:00
Philipp Hagemeister
5a42414b9c [utils] Prevent hyphen at beginning of filename (Fixes #5035) 2015-02-24 11:38:01 +01:00
Philipp Hagemeister
9c665ab72e [rtve] PEP8 2015-02-24 11:37:27 +01:00
Duncan Keall
1b40dc92eb [airmozilla] Add new extractor 2015-02-23 16:10:08 +13:00
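
Commit 003c69a84b in the list above prefers shutil.get_terminal_size when it is available (python >= 3.3). A minimal Python sketch of that kind of version-tolerant lookup follows. The helper name terminal_width and the COLUMNS fallback are assumptions made for illustration; they are not taken from the youtube-dl source.

```python
import os
import shutil


def terminal_width(default=80):
    """Return the terminal width in columns (hypothetical helper)."""
    # shutil.get_terminal_size exists on Python >= 3.3 only.
    get_size = getattr(shutil, 'get_terminal_size', None)
    if get_size is not None:
        # The fallback tuple is (columns, lines); it is used when the
        # size cannot be determined, e.g. when output is not a terminal.
        return get_size((default, 24)).columns
    # Assumed fallback for older interpreters: the COLUMNS variable.
    try:
        return int(os.environ.get('COLUMNS', default))
    except ValueError:
        return default
```

Probing with getattr rather than comparing version numbers keeps the function usable on any interpreter that happens to provide the call.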
107 changed files with 3859 additions and 875 deletions
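
Commit 43d6280d0a notes that b64decode needs a byte string on python 3.2, while on 3.4 it also accepts strings. One way to stay compatible with both is to encode text before decoding. The helper below, decode_b64, is a hypothetical name used only for this sketch; it is not the actual change made in the f4m downloader.

```python
import base64


def decode_b64(data):
    """Decode base64 input given as text or bytes (hypothetical helper)."""
    if isinstance(data, str):
        # Base64 text is plain ASCII, so encoding it is lossless and
        # gives b64decode the byte string that older Python 3 requires.
        data = data.encode('ascii')
    return base64.b64decode(data)
```

Called as decode_b64('aGVsbG8=') or decode_b64(b'aGVsbG8='), the function returns the same bytes either way.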

@@ -2,6 +2,7 @@ language: python
 python:
 - "2.6"
 - "2.7"
+- "3.2"
 - "3.3"
 - "3.4"
 before_install:

@@ -112,3 +112,8 @@ Frans de Jonge
 Robin de Rooij
 Ryan Schmidt
 Leslie P. Polzer
+Duncan Keall
+Alexander Mamay
+Devin J. Pohly
+Eduardo Ferro Aldama
+Jeff Buchbinder

@@ -18,7 +18,9 @@ If your report is shorter than two lines, it is almost certainly missing some of
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
Site support requests **must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
If your server has multiple IPs or you suspect censorship, adding --call-home may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
### Are you using the latest version?

@@ -2,6 +2,7 @@ all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bas
clean:
	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
	find -name "*.pyc" -delete
PREFIX ?= /usr/local
BINDIR ?= $(PREFIX)/bin

434
README.md

@@ -47,209 +47,109 @@ which means you can modify it, redistribute it or use it however you like.
# OPTIONS
-h, --help print this help text and exit
--version print program version and exit
-U, --update update this program to latest version. Make
sure that you have sufficient permissions
(run with sudo if needed)
-i, --ignore-errors continue on download errors, for example to
skip unavailable videos in a playlist
--abort-on-error Abort downloading of further videos (in the
playlist or the command line) if an error
occurs
-U, --update update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)
-i, --ignore-errors continue on download errors, for example to skip unavailable videos in a playlist
--abort-on-error Abort downloading of further videos (in the playlist or the command line) if an error occurs
--dump-user-agent display the current browser identification
--list-extractors List all supported extractors and the URLs
they would handle
--extractor-descriptions Output descriptions of all supported
extractors
--default-search PREFIX Use this prefix for unqualified URLs. For
example "gvsearch2:" downloads two videos
from google videos for youtube-dl "large
apple". Use the value "auto" to let
youtube-dl guess ("auto_warning" to emit a
warning when guessing). "error" just throws
an error. The default value "fixup_error"
repairs broken URLs, but emits an error if
this is not possible instead of searching.
--ignore-config Do not read configuration files. When given
in the global configuration file /etc
/youtube-dl.conf: Do not read the user
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
--flat-playlist Do not extract the videos of a playlist,
only list them.
--list-extractors List all supported extractors and the URLs they would handle
--extractor-descriptions Output descriptions of all supported extractors
--default-search PREFIX Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple".
Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The
default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.
--ignore-config Do not read configuration files. When given in the global configuration file /etc/youtube-dl.conf: Do not read the user configuration
in ~/.config/youtube-dl/config (%APPDATA%/youtube-dl/config.txt on Windows)
--flat-playlist Do not extract the videos of a playlist, only list them.
--no-color Do not emit color codes in output.
## Network Options:
--proxy URL Use the specified HTTP/HTTPS proxy. Pass in
an empty string (--proxy "") for direct
connection
--proxy URL Use the specified HTTP/HTTPS proxy. Pass in an empty string (--proxy "") for direct connection
--socket-timeout SECONDS Time to wait before giving up, in seconds
--source-address IP Client-side IP address to bind to
(experimental)
-4, --force-ipv4 Make all connections via IPv4
(experimental)
-6, --force-ipv6 Make all connections via IPv6
(experimental)
--source-address IP Client-side IP address to bind to (experimental)
-4, --force-ipv4 Make all connections via IPv4 (experimental)
-6, --force-ipv6 Make all connections via IPv6 (experimental)
--cn-verification-proxy URL Use this proxy to verify the IP address for some Chinese sites. The default proxy specified by --proxy (or none, if the options is
not present) is used for the actual downloading. (experimental)
## Video Selection:
--playlist-start NUMBER playlist video to start at (default is 1)
--playlist-end NUMBER playlist video to end at (default is last)
--playlist-items ITEM_SPEC playlist video items to download. Specify
indices of the videos in the playlist
seperated by commas like: "--playlist-items
1,2,5,8" if you want to download videos
indexed 1, 2, 5, 8 in the playlist. You can
specify range: "--playlist-items
1-3,7,10-13", it will download the videos
at index 1, 2, 3, 7, 10, 11, 12 and 13.
--match-title REGEX download only matching titles (regex or
caseless sub-string)
--reject-title REGEX skip download for matching titles (regex or
caseless sub-string)
--playlist-items ITEM_SPEC playlist video items to download. Specify indices of the videos in the playlist seperated by commas like: "--playlist-items 1,2,5,8"
if you want to download videos indexed 1, 2, 5, 8 in the playlist. You can specify range: "--playlist-items 1-3,7,10-13", it will
download the videos at index 1, 2, 3, 7, 10, 11, 12 and 13.
--match-title REGEX download only matching titles (regex or caseless sub-string)
--reject-title REGEX skip download for matching titles (regex or caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
--min-filesize SIZE Do not download any videos smaller than
SIZE (e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE
(e.g. 50k or 44.6m)
--min-filesize SIZE Do not download any videos smaller than SIZE (e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE (e.g. 50k or 44.6m)
--date DATE download only videos uploaded in this date
--datebefore DATE download only videos uploaded on or before
this date (i.e. inclusive)
--dateafter DATE download only videos uploaded on or after
this date (i.e. inclusive)
--min-views COUNT Do not download any videos with less than
COUNT views
--max-views COUNT Do not download any videos with more than
COUNT views
--match-filter FILTER (Experimental) Generic video filter.
Specify any key (see help for -o for a list
of available keys) to match if the key is
present, !key to check if the key is not
present,key > NUMBER (like "comment_count >
12", also works with >=, <, <=, !=, =) to
compare against a number, and & to require
multiple matches. Values which are not
known are excluded unless you put a
question mark (?) after the operator.For
example, to only match videos that have
been liked more than 100 times and disliked
less than 50 times (or the dislike
functionality is not available at the given
service), but who also have a description,
use --match-filter "like_count > 100 &
--datebefore DATE download only videos uploaded on or before this date (i.e. inclusive)
--dateafter DATE download only videos uploaded on or after this date (i.e. inclusive)
--min-views COUNT Do not download any videos with less than COUNT views
--max-views COUNT Do not download any videos with more than COUNT views
--match-filter FILTER (Experimental) Generic video filter. Specify any key (see help for -o for a list of available keys) to match if the key is present,
!key to check if the key is not present,key > NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to compare against
a number, and & to require multiple matches. Values which are not known are excluded unless you put a question mark (?) after the
operator.For example, to only match videos that have been liked more than 100 times and disliked less than 50 times (or the dislike
functionality is not available at the given service), but who also have a description, use --match-filter "like_count > 100 &
dislike_count <? 50 & description" .
--no-playlist If the URL refers to a video and a
playlist, download only the video.
--age-limit YEARS download only videos suitable for the given
age
--download-archive FILE Download only videos not listed in the
archive file. Record the IDs of all
downloaded videos in it.
--include-ads Download advertisements as well
(experimental)
--no-playlist If the URL refers to a video and a playlist, download only the video.
--yes-playlist If the URL refers to a video and a playlist, download the playlist.
--age-limit YEARS download only videos suitable for the given age
--download-archive FILE Download only videos not listed in the archive file. Record the IDs of all downloaded videos in it.
--include-ads Download advertisements as well (experimental)
## Download Options:
-r, --rate-limit LIMIT maximum download rate in bytes per second
(e.g. 50K or 4.2M)
-R, --retries RETRIES number of retries (default is 10), or
"infinite".
--buffer-size SIZE size of download buffer (e.g. 1024 or 16K)
(default is 1024)
--no-resize-buffer do not automatically adjust the buffer
size. By default, the buffer size is
automatically resized from an initial value
of SIZE.
-r, --rate-limit LIMIT maximum download rate in bytes per second (e.g. 50K or 4.2M)
-R, --retries RETRIES number of retries (default is 10), or "infinite".
--buffer-size SIZE size of download buffer (e.g. 1024 or 16K) (default is 1024)
--no-resize-buffer do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.
--playlist-reverse Download playlist videos in reverse order
--xattr-set-filesize (experimental) set file xattribute
ytdl.filesize with expected filesize
--hls-prefer-native (experimental) Use the native HLS
downloader instead of ffmpeg.
--external-downloader COMMAND (experimental) Use the specified external
downloader. Currently supports
aria2c,curl,wget
--xattr-set-filesize (experimental) set file xattribute ytdl.filesize with expected filesize
--hls-prefer-native (experimental) Use the native HLS downloader instead of ffmpeg.
--external-downloader COMMAND Use the specified external downloader. Currently supports aria2c,curl,wget
--external-downloader-args ARGS Give these arguments to the external downloader.
## Filesystem Options:
-a, --batch-file FILE file containing URLs to download ('-' for
stdin)
-a, --batch-file FILE file containing URLs to download ('-' for stdin)
--id use only video ID in file name
-o, --output TEMPLATE output filename template. Use %(title)s to
get the title, %(uploader)s for the
uploader name, %(uploader_id)s for the
uploader nickname if different,
%(autonumber)s to get an automatically
incremented number, %(ext)s for the
filename extension, %(format)s for the
format description (like "22 - 1280x720" or
"HD"), %(format_id)s for the unique id of
the format (like Youtube's itags: "137"),
%(upload_date)s for the upload date
(YYYYMMDD), %(extractor)s for the provider
(youtube, metacafe, etc), %(id)s for the
video id, %(playlist_title)s,
%(playlist_id)s, or %(playlist)s (=title if
present, ID otherwise) for the playlist the
video is in, %(playlist_index)s for the
position in the playlist. %(height)s and
%(width)s for the width and height of the
video format. %(resolution)s for a textual
description of the resolution of the video
format. %% for a literal percent. Use - to
output to stdout. Can also be used to
download to a different directory, for
example with -o '/my/downloads/%(uploader)s
/%(title)s-%(id)s.%(ext)s' .
--autonumber-size NUMBER Specifies the number of digits in
%(autonumber)s when it is present in output
filename template or --auto-number option
is given
--restrict-filenames Restrict filenames to only ASCII
characters, and avoid "&" and spaces in
filenames
-A, --auto-number [deprecated; use -o
"%(autonumber)s-%(title)s.%(ext)s" ] number
downloaded files starting from 00000
-t, --title [deprecated] use title in file name
(default)
-o, --output TEMPLATE output filename template. Use %(title)s to get the title, %(uploader)s for the uploader name, %(uploader_id)s for the uploader
nickname if different, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(format)s for
the format description (like "22 - 1280x720" or "HD"), %(format_id)s for the unique id of the format (like Youtube's itags: "137"),
%(upload_date)s for the upload date (YYYYMMDD), %(extractor)s for the provider (youtube, metacafe, etc), %(id)s for the video id,
%(playlist_title)s, %(playlist_id)s, or %(playlist)s (=title if present, ID otherwise) for the playlist the video is in,
%(playlist_index)s for the position in the playlist. %(height)s and %(width)s for the width and height of the video format.
%(resolution)s for a textual description of the resolution of the video format. %% for a literal percent. Use - to output to stdout.
Can also be used to download to a different directory, for example with -o '/my/downloads/%(uploader)s/%(title)s-%(id)s.%(ext)s' .
--autonumber-size NUMBER Specifies the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given
--restrict-filenames Restrict filenames to only ASCII characters, and avoid "&" and spaces in filenames
-A, --auto-number [deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] number downloaded files starting from 00000
-t, --title [deprecated] use title in file name (default)
-l, --literal [deprecated] alias of --title
-w, --no-overwrites do not overwrite files
-c, --continue force resume of partially downloaded files.
By default, youtube-dl will resume
downloads if possible.
--no-continue do not resume partially downloaded files
(restart from beginning)
--no-part do not use .part files - write directly
into output file
--no-mtime do not use the Last-modified header to set
the file modification time
--write-description write video description to a .description
file
-c, --continue force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
--no-continue do not resume partially downloaded files (restart from beginning)
--no-part do not use .part files - write directly into output file
--no-mtime do not use the Last-modified header to set the file modification time
--write-description write video description to a .description file
--write-info-json write video metadata to a .info.json file
--write-annotations write video annotations to a .annotation
file
--load-info FILE json file containing the video information
(created with the "--write-json" option)
--cookies FILE file to read cookies from and dump cookie
jar in
--cache-dir DIR Location in the filesystem where youtube-dl
can store some downloaded information
permanently. By default $XDG_CACHE_HOME
/youtube-dl or ~/.cache/youtube-dl . At the
moment, only YouTube player files (for
videos with obfuscated signatures) are
cached, but that may change.
--write-annotations write video annotations to a .annotation file
--load-info FILE json file containing the video information (created with the "--write-json" option)
--cookies FILE file to read cookies from and dump cookie jar in
--cache-dir DIR Location in the filesystem where youtube-dl can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl
or ~/.cache/youtube-dl . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may
change.
--no-cache-dir Disable filesystem caching
--rm-cache-dir Delete all filesystem cache files
## Thumbnail images:
--write-thumbnail write thumbnail image to disk
--write-all-thumbnails write all thumbnail image formats to disk
--list-thumbnails Simulate and list all available thumbnail
formats
--list-thumbnails Simulate and list all available thumbnail formats
## Verbosity / Simulation Options:
-q, --quiet activates quiet mode
--no-warnings Ignore warnings
-s, --simulate do not download the video and do not write
anything to disk
-s, --simulate do not download the video and do not write anything to disk
--skip-download do not download the video
-g, --get-url simulate, quiet but print URL
-e, --get-title simulate, quiet but print title
@@ -259,153 +159,87 @@ which means you can modify it, redistribute it or use it however you like.
--get-duration simulate, quiet but print video length
--get-filename simulate, quiet but print output filename
--get-format simulate, quiet but print output format
-j, --dump-json simulate, quiet but print JSON information.
See --output for a description of available
keys.
-J, --dump-single-json simulate, quiet but print JSON information
for each command-line argument. If the URL
refers to a playlist, dump the whole
playlist information in a single line.
--print-json Be quiet and print the video information as
JSON (video is still being downloaded).
-j, --dump-json simulate, quiet but print JSON information. See --output for a description of available keys.
-J, --dump-single-json simulate, quiet but print JSON information for each command-line argument. If the URL refers to a playlist, dump the whole playlist
information in a single line.
--print-json Be quiet and print the video information as JSON (video is still being downloaded).
--newline output progress bar as new lines
--no-progress do not print progress bar
--console-title display progress in console titlebar
-v, --verbose print various debugging information
--dump-intermediate-pages print downloaded pages to debug problems
(very verbose)
--write-pages Write downloaded intermediary pages to
files in the current directory to debug
problems
--dump-pages print downloaded pages to debug problems (very verbose)
--write-pages Write downloaded intermediary pages to files in the current directory to debug problems
--print-traffic Display sent and read HTTP traffic
-C, --call-home Contact the youtube-dl server for
debugging.
--no-call-home Do NOT contact the youtube-dl server for
debugging.
-C, --call-home Contact the youtube-dl server for debugging.
--no-call-home Do NOT contact the youtube-dl server for debugging.
## Workarounds:
--encoding ENCODING Force the specified encoding (experimental)
--no-check-certificate Suppress HTTPS certificate validation.
--prefer-insecure Use an unencrypted connection to retrieve
information about the video. (Currently
supported only for YouTube)
--prefer-insecure Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)
--user-agent UA specify a custom user agent
--referer URL specify a custom referer, use if the video
access is restricted to one domain
--add-header FIELD:VALUE specify a custom HTTP header and its value,
separated by a colon ':'. You can use this
option multiple times
--bidi-workaround Work around terminals that lack
bidirectional text support. Requires bidiv
or fribidi executable in PATH
--sleep-interval SECONDS Number of seconds to sleep before each
download.
--referer URL specify a custom referer, use if the video access is restricted to one domain
--add-header FIELD:VALUE specify a custom HTTP header and its value, separated by a colon ':'. You can use this option multiple times
--bidi-workaround Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH
--sleep-interval SECONDS Number of seconds to sleep before each download.
## Video Format Options:
-f, --format FORMAT video format code, specify the order of
preference using slashes, as in -f 22/17/18
. Instead of format codes, you can select
by extension for the extensions aac, m4a,
mp3, mp4, ogg, wav, webm. You can also use
the special names "best", "bestvideo",
"bestaudio", "worst". You can filter the
video results by putting a condition in
brackets, as in -f "best[height=720]" (or
-f "[filesize>10M]"). This works for
filesize, height, width, tbr, abr, vbr,
asr, and fps and the comparisons <, <=, >,
>=, =, != and for ext, acodec, vcodec,
container, and protocol and the comparisons
=, != . Formats for which the value is not
known are excluded unless you put a
question mark (?) after the operator. You
can combine format filters, so -f "[height
<=? 720][tbr>500]" selects up to 720p
videos (or videos where the height is not
known) with a bitrate of at least 500
KBit/s. By default, youtube-dl will pick
the best quality. Use commas to download
multiple audio formats, such as -f
136/137/mp4/bestvideo,140/m4a/bestaudio.
You can merge the video and audio of two
formats into a single file using -f <video-
format>+<audio-format> (requires ffmpeg or
avconv), for example -f
-f, --format FORMAT video format code, specify the order of preference using slashes, as in -f 22/17/18 . Instead of format codes, you can select by
extension for the extensions aac, m4a, mp3, mp4, ogg, wav, webm. You can also use the special names "best", "bestvideo", "bestaudio",
"worst". You can filter the video results by putting a condition in brackets, as in -f "best[height=720]" (or -f "[filesize>10M]").
This works for filesize, height, width, tbr, abr, vbr, asr, and fps and the comparisons <, <=, >, >=, =, != and for ext, acodec,
vcodec, container, and protocol and the comparisons =, != . Formats for which the value is not known are excluded unless you put a
question mark (?) after the operator. You can combine format filters, so -f "[height <=? 720][tbr>500]" selects up to 720p videos
(or videos where the height is not known) with a bitrate of at least 500 KBit/s. By default, youtube-dl will pick the best quality.
Use commas to download multiple audio formats, such as -f 136/137/mp4/bestvideo,140/m4a/bestaudio. You can merge the video and audio
of two formats into a single file using -f <video-format>+<audio-format> (requires ffmpeg or avconv), for example -f
bestvideo+bestaudio.
--all-formats download all available video formats
--prefer-free-formats prefer free video formats unless a specific
one is requested
--prefer-free-formats prefer free video formats unless a specific one is requested
--max-quality FORMAT highest quality format to download
-F, --list-formats list all available formats
--youtube-skip-dash-manifest Do not download the DASH manifest on
YouTube videos
--merge-output-format FORMAT If a merge is required (e.g.
bestvideo+bestaudio), output to given
container format. One of mkv, mp4, ogg,
webm, flv.Ignored if no merge is required
--youtube-skip-dash-manifest Do not download the DASH manifest on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g. bestvideo+bestaudio), output to given container format. One of mkv, mp4, ogg, webm, flv.Ignored if no
merge is required
## Subtitle Options:
--write-sub write subtitle file
--write-auto-sub write automatic subtitle file (youtube
only)
--all-subs downloads all the available subtitles of
the video
--write-auto-sub write automatic subtitle file (youtube only)
--all-subs downloads all the available subtitles of the video
--list-subs lists all available subtitles for the video
--sub-format FORMAT subtitle format, accepts formats
preference, for example: "ass/srt/best"
--sub-lang LANGS languages of the subtitles to download
(optional) separated by commas, use IETF
language tags like 'en,pt'
--sub-format FORMAT subtitle format, accepts formats preference, for example: "ass/srt/best"
--sub-lang LANGS languages of the subtitles to download (optional) separated by commas, use IETF language tags like 'en,pt'
## Authentication Options:
-u, --username USERNAME login with this account ID
-p, --password PASSWORD account password. If this option is left
out, youtube-dl will ask interactively.
-p, --password PASSWORD account password. If this option is left out, youtube-dl will ask interactively.
-2, --twofactor TWOFACTOR two-factor auth code
-n, --netrc use .netrc authentication data
--video-password PASSWORD video password (vimeo, smotri)
## Post-processing Options:
-x, --extract-audio convert video files to audio-only files
(requires ffmpeg or avconv and ffprobe or
avprobe)
--audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a",
"opus", or "wav"; "best" by default
--audio-quality QUALITY ffmpeg/avconv audio quality specification,
insert a value between 0 (better) and 9
(worse) for VBR or a specific bitrate like
128K (default 5)
--recode-video FORMAT Encode the video to another format if
necessary (currently supported:
mp4|flv|ogg|webm|mkv)
-k, --keep-video keeps the video file on disk after the
post-processing; the video is erased by
default
--no-post-overwrites do not overwrite post-processed files; the
post-processed files are overwritten by
default
--embed-subs embed subtitles in the video (only for mp4
videos)
-x, --extract-audio convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "best" by default
--audio-quality QUALITY ffmpeg/avconv audio quality specification, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K
(default 5)
--recode-video FORMAT Encode the video to another format if necessary (currently supported: mp4|flv|ogg|webm|mkv)
-k, --keep-video keeps the video file on disk after the post-processing; the video is erased by default
--no-post-overwrites do not overwrite post-processed files; the post-processed files are overwritten by default
--embed-subs embed subtitles in the video (only for mp4 videos)
--embed-thumbnail embed thumbnail in the audio as cover art
--add-metadata write metadata to the video file
--xattrs write metadata to the video file's xattrs
(using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the
file. One of never (do nothing), warn (only
emit a warning), detect_or_warn(the
default; fix file if we can, warn
otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the
postprocessors (default)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
postprocessors
--ffmpeg-location PATH Location of the ffmpeg/avconv binary;
either the path to the binary or its
containing directory.
--exec CMD Execute a command on the file after
downloading, similar to find's -exec
syntax. Example: --exec 'adb push {}
/sdcard/Music/ && rm {}'
--metadata-from-title FORMAT parse additional metadata like song title / artist from the video title. The format syntax is the same as --output, the parsed
parameters replace existing values. Additional templates: %(album), %(artist). Example: --metadata-from-title "%(artist)s -
%(title)s" matches a title like "Coldplay - Paradise"
--xattrs write metadata to the video file's xattrs (using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn(the default;
fix file if we can, warn otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the postprocessors (default)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the postprocessors
--ffmpeg-location PATH Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory.
--exec CMD Execute a command on the file after downloading, similar to find's -exec syntax. Example: --exec 'adb push {} /sdcard/Music/ && rm
{}'
--convert-subtitles FORMAT Convert the subtitles to other format (currently supported: srt|ass|vtt)
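The `--playlist-items` description above says that "1-3,7,10-13" selects the videos at index 1, 2, 3, 7, 10, 11, 12 and 13. As a rough illustration of that syntax only, the expansion could be sketched as follows. The function name `parse_playlist_items` is hypothetical and this is not youtube-dl's actual implementation; it assumes a well-formed spec made of comma-separated indices and inclusive `start-end` ranges.

```python
def parse_playlist_items(spec):
    """Expand an ITEM_SPEC such as "1-3,7,10-13" into a list of 1-based indices.

    Hypothetical helper for illustration; assumes the spec is well-formed.
    """
    indices = []
    for part in spec.split(','):
        part = part.strip()
        if '-' in part:
            # A range like "10-13" is inclusive on both ends.
            start, end = part.split('-')
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


print(parse_playlist_items('1-3,7,10-13'))  # [1, 2, 3, 7, 10, 11, 12, 13]
```

The sketch keeps the order given on the command line and does no validation; a real parser would presumably need to reject malformed or empty parts.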
# CONFIGURATION
@@ -525,6 +359,10 @@ YouTube requires an additional signature since September 2012 which is not suppo
In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### HTTP Error 429: Too Many Requests or 402: Payment Required
These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--network-address` options](#network-options) to select another IP address.
### SyntaxError: Non-ASCII character ###
The error
@@ -569,6 +407,18 @@ A note on the service that they don't host the infringing content, but just link
Support requests for services that **do** purchase the rights to distribute their content are perfectly fine though. If in doubt, you can simply include a source that mentions the legitimate purchase of content.
### How can I speed up work on my issue?
(Also known as: Help, my important issue not being solved!) The youtube-dl core developer team is quite small. While we do our best to solve as many issues as possible, sometimes that can take quite a while. To speed up your issue, here's what you can do:
First of all, please do report the issue [at our issue tracker](https://yt-dl.org/bugs). That allows us to coordinate all efforts by users and developers, and serves as a unified point. Unfortunately, the youtube-dl project has grown too large to use personal email as an effective communication channel.
Please read the [bug reporting instructions](#bugs) below. A lot of bugs lack all the necessary information. If you can, offer proxy, VPN, or shell access to the youtube-dl developers. If you are able to, test the issue from multiple computers in multiple countries to exclude local censorship or misconfiguration issues.
If nobody is interested in solving your issue, you are welcome to take matters into your own hands and submit a pull request (or coerce/pay somebody else to do so).
Feel free to bump the issue from time to time by writing a small comment ("Issue is still present in youtube-dl version ...from France, but fixed from Belgium"), but please not more than once a month. Please do not declare your issue as `important` or `urgent`.
### How can I detect whether a given URL is supported by youtube-dl?
For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug.
@@ -668,6 +518,7 @@ youtube-dl makes the best effort to be a good command-line program, and thus sho
From a Python program, you can embed youtube-dl in a more powerful fashion, like this:
```python
from __future__ import unicode_literals
import youtube_dl
ydl_opts = {}
@@ -680,6 +531,7 @@ Most likely, you'll want to use various options. For a list of what can be done,
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
```python
from __future__ import unicode_literals
import youtube_dl
@@ -737,7 +589,9 @@ If your report is shorter than two lines, it is almost certainly missing some of
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
Site support requests **must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
If your server has multiple IPs or you suspect censorship, adding --call-home may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
### Are you using the latest version?

View File

@@ -0,0 +1,42 @@
from __future__ import unicode_literals

import codecs
import subprocess

import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from youtube_dl.utils import intlist_to_bytes
from youtube_dl.aes import aes_encrypt, key_expansion

secret_msg = b'Secret message goes here'


def hex_str(int_list):
    return codecs.encode(intlist_to_bytes(int_list), 'hex')


def openssl_encode(algo, key, iv):
    cmd = ['openssl', 'enc', '-e', '-' + algo, '-K', hex_str(key), '-iv', hex_str(iv)]
    prog = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = prog.communicate(secret_msg)
    return out

iv = key = [0x20, 0x15] + 14 * [0]

r = openssl_encode('aes-128-cbc', key, iv)
print('aes_cbc_decrypt')
print(repr(r))

password = key
new_key = aes_encrypt(password, key_expansion(password))
r = openssl_encode('aes-128-ctr', new_key, iv)
print('aes_decrypt_text 16')
print(repr(r))

password = key + 16 * [0]
new_key = aes_encrypt(password, key_expansion(password)) * (32 // 16)
r = openssl_encode('aes-256-ctr', new_key, iv)
print('aes_decrypt_text 32')
print(repr(r))

View File

@@ -17,6 +17,7 @@
- **AdultSwim**
- **Aftenposten**
- **Aftonbladet**
- **AirMozilla**
- **AlJazeera**
- **Allocine**
- **AlphaPorno**
@@ -46,6 +47,7 @@
- **Bandcamp**
- **Bandcamp:album**
- **bbc.co.uk**: BBC iPlayer
- **BeatportPro**
- **Beeg**
- **BehindKink**
- **Bet**
@@ -110,12 +112,14 @@
- **Discovery**
- **divxstage**: DivxStage
- **Dotsub**
- **DouyuTV**
- **DRBonanza**
- **Dropbox**
- **DrTuber**
- **DRTV**
- **Dump**
- **dvtv**: http://video.aktualne.cz/
- **EaglePlatform**
- **EbaumsWorld**
- **EchoMsk**
- **eHow**
@@ -143,6 +147,7 @@
- **Firstpost**
- **Flickr**
- **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FootyRoom**
- **Foxgay**
- **FoxNews**
- **france2.fr:generation-quoi**
@@ -160,6 +165,7 @@
- **GameSpot**
- **GameStar**
- **Gametrailers**
- **Gazeta**
- **GDCVault**
- **generic**: Generic downloader that works on some sites
- **GiantBomb**
@@ -209,6 +215,8 @@
- **Jove**
- **jpopsuki.tv**
- **Jukebox**
- **Kaltura**
- **KanalPlay**: Kanal 5/9/11 Play
- **Kankan**
- **Karaoketv**
- **keek**
@@ -220,6 +228,10 @@
- **Ku6**
- **la7.tv**
- **Laola1Tv**
- **Letv**
- **LetvPlaylist**
- **LetvTv**
- **Libsyn**
- **lifenews**: LIFE | NEWS
- **LiveLeak**
- **livestream**
@@ -299,16 +311,19 @@
- **npo.nl:radio**
- **npo.nl:radio:fragment**
- **NRK**
- **NRKPlaylist**
- **NRKTV**
- **ntv.ru**
- **Nuvid**
- **NYTimes**
- **ocw.mit.edu**
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
- **Ooyala**
- **OpenFilm**
- **orf:fm4**: radio FM4
- **orf:iptv**: iptv.ORF.at
- **orf:oe1**: Radio Österreich 1
- **orf:tvthek**: ORF TVthek
- **parliamentlive.tv**: UK parliament videos
@@ -316,10 +331,12 @@
- **PBS**
- **Phoenix**
- **Photobucket**
- **Pladform**
- **PlanetaPlay**
- **play.fm**
- **played.to**
- **Playvid**
- **Playwire**
- **plus.google**: Google Plus
- **pluzz.francetv.fr**
- **podomatic**
@@ -328,8 +345,10 @@
- **PornHubPlaylist**
- **Pornotube**
- **PornoXO**
- **PrimeShareTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital
- **Puls4**
- **Pyvideo**
- **QuickVid**
- **R7**
@@ -352,6 +371,7 @@
- **RTP**
- **RTS**: RTS.ch
- **rtve.es:alacarta**: RTVE a la carta
- **rtve.es:infantil**: RTVE infantil
- **rtve.es:live**: RTVE.es live streams
- **RUHD**
- **rutube**: Rutube videos
@@ -402,13 +422,14 @@
- **SportBox**
- **SportDeutschland**
- **SRMediathek**: Saarländischer Rundfunk
- **SSA**
- **stanfordoc**: Stanford Open ClassRoom
- **Steam**
- **streamcloud.eu**
- **StreamCZ**
- **StreetVoice**
- **SunPorno**
- **SVTPlay**
- **SVTPlay**: SVT Play and Öppet arkiv
- **SWRMediathek**
- **Syfy**
- **SztvHu**
@@ -471,6 +492,7 @@
- **Ubu**
- **udemy**
- **udemy:course**
- **Ultimedia**
- **Unistra**
- **Urort**: NRK P3 Urørt
- **ustream**
@@ -498,6 +520,7 @@
- **Vidzi**
- **vier**
- **vier:videos**
- **Viewster**
- **viki**
- **vimeo**
- **vimeo:album**
@@ -544,6 +567,9 @@
- **XXXYMovies**
- **Yahoo**: Yahoo screen and movies
- **Yam**
- **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
- **yandexmusic:track**: Яндекс.Музыка - Трек
- **YesJapan**
- **Ynet**
- **YouJizz**

View File

@@ -14,6 +14,9 @@ from test.helper import FakeYDL, assertRegexpMatches
from youtube_dl import YoutubeDL
from youtube_dl.extractor import YoutubeIE
from youtube_dl.postprocessor.common import PostProcessor
from youtube_dl.utils import match_filter_func
TEST_URL = 'http://localhost/sample.mp4'
class YDL(FakeYDL):
@@ -46,8 +49,8 @@ class TestFormatSelection(unittest.TestCase):
ydl = YDL()
ydl.params['prefer_free_formats'] = True
formats = [
{'ext': 'webm', 'height': 460, 'url': 'x'},
{'ext': 'mp4', 'height': 460, 'url': 'y'},
{'ext': 'webm', 'height': 460, 'url': TEST_URL},
{'ext': 'mp4', 'height': 460, 'url': TEST_URL},
]
info_dict = _make_result(formats)
yie = YoutubeIE(ydl)
@@ -60,8 +63,8 @@ class TestFormatSelection(unittest.TestCase):
ydl = YDL()
ydl.params['prefer_free_formats'] = True
formats = [
{'ext': 'webm', 'height': 720, 'url': 'a'},
{'ext': 'mp4', 'height': 1080, 'url': 'b'},
{'ext': 'webm', 'height': 720, 'url': TEST_URL},
{'ext': 'mp4', 'height': 1080, 'url': TEST_URL},
]
info_dict['formats'] = formats
yie = YoutubeIE(ydl)
@@ -74,9 +77,9 @@ class TestFormatSelection(unittest.TestCase):
ydl = YDL()
ydl.params['prefer_free_formats'] = False
formats = [
{'ext': 'webm', 'height': 720, 'url': '_'},
{'ext': 'mp4', 'height': 720, 'url': '_'},
{'ext': 'flv', 'height': 720, 'url': '_'},
{'ext': 'webm', 'height': 720, 'url': TEST_URL},
{'ext': 'mp4', 'height': 720, 'url': TEST_URL},
{'ext': 'flv', 'height': 720, 'url': TEST_URL},
]
info_dict['formats'] = formats
yie = YoutubeIE(ydl)
@@ -88,8 +91,8 @@ class TestFormatSelection(unittest.TestCase):
ydl = YDL()
ydl.params['prefer_free_formats'] = False
formats = [
{'ext': 'flv', 'height': 720, 'url': '_'},
{'ext': 'webm', 'height': 720, 'url': '_'},
{'ext': 'flv', 'height': 720, 'url': TEST_URL},
{'ext': 'webm', 'height': 720, 'url': TEST_URL},
]
info_dict['formats'] = formats
yie = YoutubeIE(ydl)
@@ -133,10 +136,10 @@ class TestFormatSelection(unittest.TestCase):
def test_format_selection(self):
formats = [
{'format_id': '35', 'ext': 'mp4', 'preference': 1, 'url': '_'},
{'format_id': '45', 'ext': 'webm', 'preference': 2, 'url': '_'},
{'format_id': '47', 'ext': 'webm', 'preference': 3, 'url': '_'},
{'format_id': '2', 'ext': 'flv', 'preference': 4, 'url': '_'},
{'format_id': '35', 'ext': 'mp4', 'preference': 1, 'url': TEST_URL},
{'format_id': '45', 'ext': 'webm', 'preference': 2, 'url': TEST_URL},
{'format_id': '47', 'ext': 'webm', 'preference': 3, 'url': TEST_URL},
{'format_id': '2', 'ext': 'flv', 'preference': 4, 'url': TEST_URL},
]
info_dict = _make_result(formats)
@@ -167,10 +170,10 @@ class TestFormatSelection(unittest.TestCase):
def test_format_selection_audio(self):
formats = [
{'format_id': 'audio-low', 'ext': 'webm', 'preference': 1, 'vcodec': 'none', 'url': '_'},
{'format_id': 'audio-mid', 'ext': 'webm', 'preference': 2, 'vcodec': 'none', 'url': '_'},
{'format_id': 'audio-high', 'ext': 'flv', 'preference': 3, 'vcodec': 'none', 'url': '_'},
{'format_id': 'vid', 'ext': 'mp4', 'preference': 4, 'url': '_'},
{'format_id': 'audio-low', 'ext': 'webm', 'preference': 1, 'vcodec': 'none', 'url': TEST_URL},
{'format_id': 'audio-mid', 'ext': 'webm', 'preference': 2, 'vcodec': 'none', 'url': TEST_URL},
{'format_id': 'audio-high', 'ext': 'flv', 'preference': 3, 'vcodec': 'none', 'url': TEST_URL},
{'format_id': 'vid', 'ext': 'mp4', 'preference': 4, 'url': TEST_URL},
]
info_dict = _make_result(formats)
@@ -185,8 +188,8 @@ class TestFormatSelection(unittest.TestCase):
self.assertEqual(downloaded['format_id'], 'audio-low')
formats = [
{'format_id': 'vid-low', 'ext': 'mp4', 'preference': 1, 'url': '_'},
{'format_id': 'vid-high', 'ext': 'mp4', 'preference': 2, 'url': '_'},
{'format_id': 'vid-low', 'ext': 'mp4', 'preference': 1, 'url': TEST_URL},
{'format_id': 'vid-high', 'ext': 'mp4', 'preference': 2, 'url': TEST_URL},
]
info_dict = _make_result(formats)
@@ -228,9 +231,9 @@ class TestFormatSelection(unittest.TestCase):
def test_format_selection_video(self):
formats = [
{'format_id': 'dash-video-low', 'ext': 'mp4', 'preference': 1, 'acodec': 'none', 'url': '_'},
{'format_id': 'dash-video-high', 'ext': 'mp4', 'preference': 2, 'acodec': 'none', 'url': '_'},
{'format_id': 'vid', 'ext': 'mp4', 'preference': 3, 'url': '_'},
{'format_id': 'dash-video-low', 'ext': 'mp4', 'preference': 1, 'acodec': 'none', 'url': TEST_URL},
{'format_id': 'dash-video-high', 'ext': 'mp4', 'preference': 2, 'acodec': 'none', 'url': TEST_URL},
{'format_id': 'vid', 'ext': 'mp4', 'preference': 3, 'url': TEST_URL},
]
info_dict = _make_result(formats)
@@ -337,6 +340,8 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'G')
class TestYoutubeDL(unittest.TestCase):
def test_subtitles(self):
def s_formats(lang, autocaption=False):
return [{
@@ -459,6 +464,73 @@ class TestFormatSelection(unittest.TestCase):
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
os.unlink(audiofile)
    def test_match_filter(self):
        class FilterYDL(YDL):
            def __init__(self, *args, **kwargs):
                super(FilterYDL, self).__init__(*args, **kwargs)
                self.params['simulate'] = True

            def process_info(self, info_dict):
                super(YDL, self).process_info(info_dict)

            def _match_entry(self, info_dict, incomplete):
                res = super(FilterYDL, self)._match_entry(info_dict, incomplete)
                if res is None:
                    self.downloaded_info_dicts.append(info_dict)
                return res

        first = {
            'id': '1',
            'url': TEST_URL,
            'title': 'one',
            'extractor': 'TEST',
            'duration': 30,
            'filesize': 10 * 1024,
        }
        second = {
            'id': '2',
            'url': TEST_URL,
            'title': 'two',
            'extractor': 'TEST',
            'duration': 10,
            'description': 'foo',
            'filesize': 5 * 1024,
        }
        videos = [first, second]

        def get_videos(filter_=None):
            ydl = FilterYDL({'match_filter': filter_})
            for v in videos:
                ydl.process_ie_result(v, download=True)
            return [v['id'] for v in ydl.downloaded_info_dicts]

        res = get_videos()
        self.assertEqual(res, ['1', '2'])

        def f(v):
            if v['id'] == '1':
                return None
            else:
                return 'Video id is not 1'
        res = get_videos(f)
        self.assertEqual(res, ['1'])

        f = match_filter_func('duration < 30')
        res = get_videos(f)
        self.assertEqual(res, ['2'])

        f = match_filter_func('description = foo')
        res = get_videos(f)
        self.assertEqual(res, ['2'])

        f = match_filter_func('description =? foo')
        res = get_videos(f)
        self.assertEqual(res, ['1', '2'])

        f = match_filter_func('filesize > 5KiB')
        res = get_videos(f)
        self.assertEqual(res, ['1'])
if __name__ == '__main__':
unittest.main()
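
The filter convention that `test_match_filter` relies on can be sketched without youtube-dl: a filter receives an info dict and returns `None` to accept the video, or a message saying why it is skipped. The names `select_ids` and `shorter_than` below are hypothetical and only illustrate that convention; they are not part of the library, and `match_filter_func` itself takes a filter string rather than a number.

```python
def select_ids(videos, filter_=None):
    """Return the ids of the videos that the filter accepts.

    A filter returns None to accept a video, or a string giving the
    reason for skipping it (the convention used in the test).
    """
    accepted = []
    for video in videos:
        reason = filter_(video) if filter_ is not None else None
        if reason is None:
            accepted.append(video['id'])
    return accepted


def shorter_than(seconds):
    """Build a filter that accepts only videos shorter than `seconds`."""
    def _filter(video):
        duration = video.get('duration')
        if duration is not None and duration < seconds:
            return None
        return 'duration is not below %d seconds' % seconds
    return _filter


videos = [
    {'id': '1', 'title': 'one', 'duration': 30},
    {'id': '2', 'title': 'two', 'duration': 10},
]
print(select_ids(videos))                    # no filter: every video is accepted
print(select_ids(videos, shorter_than(30)))  # only the 10-second video passes
```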

test/test_aes.py Normal file

@@ -0,0 +1,55 @@
#!/usr/bin/env python

from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from youtube_dl.aes import aes_decrypt, aes_encrypt, aes_cbc_decrypt, aes_decrypt_text
from youtube_dl.utils import bytes_to_intlist, intlist_to_bytes
import base64

# the encrypted data can be generated with 'devscripts/generate_aes_testdata.py'


class TestAES(unittest.TestCase):
    def setUp(self):
        self.key = self.iv = [0x20, 0x15] + 14 * [0]
        self.secret_msg = b'Secret message goes here'

    def test_encrypt(self):
        msg = b'message'
        key = list(range(16))
        encrypted = aes_encrypt(bytes_to_intlist(msg), key)
        decrypted = intlist_to_bytes(aes_decrypt(encrypted, key))
        self.assertEqual(decrypted, msg)

    def test_cbc_decrypt(self):
        data = bytes_to_intlist(
            b"\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6'\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd"
        )
        decrypted = intlist_to_bytes(aes_cbc_decrypt(data, self.key, self.iv))
        self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)

    def test_decrypt_text(self):
        password = intlist_to_bytes(self.key).decode('utf-8')
        encrypted = base64.b64encode(
            intlist_to_bytes(self.iv[:8]) +
            b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae'
        )
        decrypted = (aes_decrypt_text(encrypted, password, 16))
        self.assertEqual(decrypted, self.secret_msg)

        password = intlist_to_bytes(self.key).decode('utf-8')
        encrypted = base64.b64encode(
            intlist_to_bytes(self.iv[:8]) +
            b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83'
        )
        decrypted = (aes_decrypt_text(encrypted, password, 32))
        self.assertEqual(decrypted, self.secret_msg)

if __name__ == '__main__':
    unittest.main()
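
`test_cbc_decrypt` strips trailing `\x08` bytes before comparing. A plausible reading, assuming the test data was produced with OpenSSL's default block padding (PKCS#7), is that the 24-byte message was padded to 32 bytes with eight bytes of value 8. The helpers `pkcs7_pad` and `pkcs7_unpad` below are hypothetical and not part of `youtube_dl.aes`; they only sketch the padding arithmetic (Python 3).

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so that the length is a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data):
    """Remove PKCS#7 padding; the last byte gives the padding length."""
    n = data[-1]
    if n < 1 or n > len(data) or data[-n:] != bytes([n]) * n:
        raise ValueError('invalid padding')
    return data[:-n]


secret_msg = b'Secret message goes here'  # 24 bytes
padded = pkcs7_pad(secret_msg)
print(len(secret_msg), len(padded), padded[-8:])
```

Reading the padding length from the last byte, as `pkcs7_unpad` does, is stricter than `rstrip`, which would also eat message bytes that happen to equal the padding value.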

View File

@@ -104,11 +104,11 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch(':tds', ['ComedyCentralShows'])
def test_vimeo_matching(self):
self.assertMatch('http://vimeo.com/channels/tributes', ['vimeo:channel'])
self.assertMatch('http://vimeo.com/channels/31259', ['vimeo:channel'])
self.assertMatch('http://vimeo.com/channels/31259/53576664', ['vimeo'])
self.assertMatch('http://vimeo.com/user7108434', ['vimeo:user'])
self.assertMatch('http://vimeo.com/user7108434/videos', ['vimeo:user'])
self.assertMatch('https://vimeo.com/channels/tributes', ['vimeo:channel'])
self.assertMatch('https://vimeo.com/channels/31259', ['vimeo:channel'])
self.assertMatch('https://vimeo.com/channels/31259/53576664', ['vimeo'])
self.assertMatch('https://vimeo.com/user7108434', ['vimeo:user'])
self.assertMatch('https://vimeo.com/user7108434/videos', ['vimeo:user'])
self.assertMatch('https://vimeo.com/user21297594/review/75524534/3c257a1b5d', ['vimeo:review'])
# https://github.com/rg3/youtube-dl/issues/1930

View File

@@ -1,4 +1,6 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
import unittest
@@ -27,5 +29,12 @@ class TestExecution(unittest.TestCase):
def test_main_exec(self):
subprocess.check_call([sys.executable, 'youtube_dl/__main__.py', '--version'], cwd=rootDir, stdout=_DEV_NULL)
def test_cmdline_umlauts(self):
p = subprocess.Popen(
[sys.executable, 'youtube_dl/__main__.py', 'ä', '--version'],
cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
_, stderr = p.communicate()
self.assertFalse(stderr)
if __name__ == '__main__':
unittest.main()

View File

@@ -8,7 +8,7 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_http_server
from youtube_dl.compat import compat_http_server, compat_urllib_request
import ssl
import threading
@@ -68,5 +68,52 @@ class TestHTTP(unittest.TestCase):
r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
self.assertEqual(r['url'], 'https://localhost:%d/vid.mp4' % self.port)
def _build_proxy_handler(name):
    class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
        proxy_name = name

        def log_message(self, format, *args):
            pass

        def do_GET(self):
            self.send_response(200)
            self.send_header('Content-Type', 'text/plain; charset=utf-8')
            self.end_headers()
            self.wfile.write('{self.proxy_name}: {self.path}'.format(self=self).encode('utf-8'))
    return HTTPTestRequestHandler


class TestProxy(unittest.TestCase):
    def setUp(self):
        self.proxy = compat_http_server.HTTPServer(
            ('localhost', 0), _build_proxy_handler('normal'))
        self.port = self.proxy.socket.getsockname()[1]
        self.proxy_thread = threading.Thread(target=self.proxy.serve_forever)
        self.proxy_thread.daemon = True
        self.proxy_thread.start()

        self.cn_proxy = compat_http_server.HTTPServer(
            ('localhost', 0), _build_proxy_handler('cn'))
        self.cn_port = self.cn_proxy.socket.getsockname()[1]
        self.cn_proxy_thread = threading.Thread(target=self.cn_proxy.serve_forever)
        self.cn_proxy_thread.daemon = True
        self.cn_proxy_thread.start()

    def test_proxy(self):
        cn_proxy = 'localhost:{0}'.format(self.cn_port)
        ydl = YoutubeDL({
            'proxy': 'localhost:{0}'.format(self.port),
            'cn_verification_proxy': cn_proxy,
        })
        url = 'http://foo.com/bar'
        response = ydl.urlopen(url).read().decode('utf-8')
        self.assertEqual(response, 'normal: {0}'.format(url))

        req = compat_urllib_request.Request(url)
        req.add_header('Ytdl-request-proxy', cn_proxy)
        response = ydl.urlopen(req).read().decode('utf-8')
        self.assertEqual(response, 'cn: {0}'.format(url))
if __name__ == '__main__':
unittest.main()

test/test_netrc.py Normal file

@@ -0,0 +1,26 @@
# coding: utf-8
from __future__ import unicode_literals

import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))


from youtube_dl.extractor import (
    gen_extractors,
)


class TestNetRc(unittest.TestCase):
    def test_netrc_present(self):
        for ie in gen_extractors():
            if not hasattr(ie, '_login'):
                continue
            self.assertTrue(
                hasattr(ie, '_NETRC_MACHINE'),
                'Extractor %s supports login, but is missing a _NETRC_MACHINE property' % ie.IE_NAME)


if __name__ == '__main__':
    unittest.main()

View File

@@ -0,0 +1,17 @@
#!/usr/bin/env python

from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from youtube_dl.postprocessor import MetadataFromTitlePP


class TestMetadataFromTitle(unittest.TestCase):
    def test_format_to_regex(self):
        pp = MetadataFromTitlePP(None, '%(title)s - %(artist)s')
        self.assertEqual(pp._titleregex, '(?P<title>.+)\ \-\ (?P<artist>.+)')
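
The test above expects a title format such as `%(title)s - %(artist)s` to become a regular expression with one named group per placeholder. One way this conversion could work is sketched below. The standalone function `format_to_regex` is an assumption made for illustration; the diff only shows the resulting `_titleregex` attribute, not how `MetadataFromTitlePP` computes it.

```python
import re


def format_to_regex(fmt):
    """Turn '%(name)s' placeholders into named groups, escaping the rest."""
    regex = ''
    last = 0
    for m in re.finditer(r'%\((\w+)\)s', fmt):
        regex += re.escape(fmt[last:m.start()])  # literal text before the placeholder
        regex += '(?P<%s>.+)' % m.group(1)       # one named group per placeholder
        last = m.end()
    regex += re.escape(fmt[last:])               # trailing literal text, if any
    return regex


pattern = format_to_regex('%(title)s - %(artist)s')
match = re.match(pattern, 'Some Song - Some Artist')
print(match.group('title'), '|', match.group('artist'))
```

Because the groups use a greedy `.+`, a title that itself contains ` - ` is split at the last separator; a real implementation may choose differently.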

View File

@@ -26,6 +26,7 @@ from youtube_dl.extractor import (
VikiIE,
ThePlatformIE,
RTVEALaCartaIE,
FunnyOrDieIE,
)
@@ -320,5 +321,17 @@ class TestRtveSubtitles(BaseTestSubtitles):
self.assertEqual(md5(subtitles['es']), '69e70cae2d40574fb7316f31d6eb7fca')
class TestFunnyOrDieSubtitles(BaseTestSubtitles):
url = 'http://www.funnyordie.com/videos/224829ff6d/judd-apatow-will-direct-your-vine'
IE = FunnyOrDieIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), 'c5593c193eacd353596c11c2d4f9ecc4')
if __name__ == '__main__':
unittest.main()

View File

@@ -17,13 +17,22 @@ IGNORED_FILES = [
'buildserver.py',
]
IGNORED_DIRS = [
'.git',
'.tox',
]
from test.helper import assertRegexpMatches
class TestUnicodeLiterals(unittest.TestCase):
def test_all_files(self):
for dirpath, _, filenames in os.walk(rootDir):
for dirpath, dirnames, filenames in os.walk(rootDir):
for ignore_dir in IGNORED_DIRS:
if ignore_dir in dirnames:
# If we remove the directory from dirnames os.walk won't
# recurse into it
dirnames.remove(ignore_dir)
for basename in filenames:
if not basename.endswith('.py'):
continue
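
The pruning trick used in this hunk (removing entries from `dirnames` so that `os.walk` does not descend into them) can be tried in isolation. The sketch below is illustrative: `walk_python_files` is a hypothetical helper, and the demonstration tree is created in a temporary directory. Only `IGNORED_DIRS` is taken from the diff.

```python
import os
import tempfile

IGNORED_DIRS = ['.git', '.tox']


def walk_python_files(root):
    """Yield paths of .py files under root, skipping IGNORED_DIRS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place: with the default top-down walk, os.walk only
        # descends into the names still present in dirnames.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for basename in filenames:
            if basename.endswith('.py'):
                yield os.path.join(dirpath, basename)


# Small demonstration on a throwaway directory tree.
root = tempfile.mkdtemp()
for sub in ('.git', 'pkg'):
    os.mkdir(os.path.join(root, sub))
    open(os.path.join(root, sub, 'mod.py'), 'w').close()

print([os.path.relpath(p, root) for p in walk_python_files(root)])
```

The slice assignment `dirnames[:] = ...` matters: rebinding the name with `dirnames = ...` would leave the list held by `os.walk` unchanged.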

View File

@@ -24,6 +24,7 @@ from youtube_dl.utils import (
encodeFilename,
escape_rfc3986,
escape_url,
ExtractorError,
find_xpath_attr,
fix_xml_ampersands,
InAdvancePagedList,
@@ -38,6 +39,8 @@ from youtube_dl.utils import (
parse_iso8601,
read_batch_urls,
sanitize_filename,
sanitize_path,
sanitize_url_path_consecutive_slashes,
shell_quote,
smuggle_url,
str_to_int,
@@ -52,6 +55,7 @@ from youtube_dl.utils import (
urlencode_postdata,
version_tuple,
xpath_with_ns,
xpath_text,
render_table,
match_str,
)
@@ -86,6 +90,11 @@ class TestUtil(unittest.TestCase):
sanitize_filename('New World record at 0:12:34'),
'New World record at 0_12_34')
self.assertEqual(sanitize_filename('--gasdgf'), '_-gasdgf')
self.assertEqual(sanitize_filename('--gasdgf', is_id=True), '--gasdgf')
self.assertEqual(sanitize_filename('.gasdgf'), 'gasdgf')
self.assertEqual(sanitize_filename('.gasdgf', is_id=True), '.gasdgf')
forbidden = '"\0\\/'
for fc in forbidden:
for fbc in forbidden:
@@ -126,6 +135,62 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_filename('_BD_eEpuzXw', is_id=True), '_BD_eEpuzXw')
self.assertEqual(sanitize_filename('N0Y__7-UOdI', is_id=True), 'N0Y__7-UOdI')
def test_sanitize_path(self):
if sys.platform != 'win32':
return
self.assertEqual(sanitize_path('abc'), 'abc')
self.assertEqual(sanitize_path('abc/def'), 'abc\\def')
self.assertEqual(sanitize_path('abc\\def'), 'abc\\def')
self.assertEqual(sanitize_path('abc|def'), 'abc#def')
self.assertEqual(sanitize_path('<>:"|?*'), '#######')
self.assertEqual(sanitize_path('C:/abc/def'), 'C:\\abc\\def')
self.assertEqual(sanitize_path('C?:/abc/def'), 'C##\\abc\\def')
self.assertEqual(sanitize_path('\\\\?\\UNC\\ComputerName\\abc'), '\\\\?\\UNC\\ComputerName\\abc')
self.assertEqual(sanitize_path('\\\\?\\UNC/ComputerName/abc'), '\\\\?\\UNC\\ComputerName\\abc')
self.assertEqual(sanitize_path('\\\\?\\C:\\abc'), '\\\\?\\C:\\abc')
self.assertEqual(sanitize_path('\\\\?\\C:/abc'), '\\\\?\\C:\\abc')
self.assertEqual(sanitize_path('\\\\?\\C:\\ab?c\\de:f'), '\\\\?\\C:\\ab#c\\de#f')
self.assertEqual(sanitize_path('\\\\?\\C:\\abc'), '\\\\?\\C:\\abc')
self.assertEqual(
sanitize_path('youtube/%(uploader)s/%(autonumber)s-%(title)s-%(upload_date)s.%(ext)s'),
'youtube\\%(uploader)s\\%(autonumber)s-%(title)s-%(upload_date)s.%(ext)s')
self.assertEqual(
sanitize_path('youtube/TheWreckingYard ./00001-Not bad, Especially for Free! (1987 Yamaha 700)-20141116.mp4.part'),
'youtube\\TheWreckingYard #\\00001-Not bad, Especially for Free! (1987 Yamaha 700)-20141116.mp4.part')
self.assertEqual(sanitize_path('abc/def...'), 'abc\\def..#')
self.assertEqual(sanitize_path('abc.../def'), 'abc..#\\def')
self.assertEqual(sanitize_path('abc.../def...'), 'abc..#\\def..#')
self.assertEqual(sanitize_path('../abc'), '..\\abc')
self.assertEqual(sanitize_path('../../abc'), '..\\..\\abc')
self.assertEqual(sanitize_path('./abc'), 'abc')
self.assertEqual(sanitize_path('./../abc'), '..\\abc')
def test_sanitize_url_path_consecutive_slashes(self):
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/foo//bar/filename.html'),
'http://hostname/foo/bar/filename.html')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname//foo/bar/filename.html'),
'http://hostname/foo/bar/filename.html')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname//'),
'http://hostname/')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/foo/bar/filename.html'),
'http://hostname/foo/bar/filename.html')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/'),
'http://hostname/')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/abc//'),
'http://hostname/abc/')
def test_ordered_set(self):
self.assertEqual(orderedSet([1, 1, 2, 3, 4, 4, 5, 6, 7, 3, 5]), [1, 2, 3, 4, 5, 6, 7])
self.assertEqual(orderedSet([]), [])
@@ -187,6 +252,17 @@ class TestUtil(unittest.TestCase):
self.assertEqual(find('media:song/media:author').text, 'The Author')
self.assertEqual(find('media:song/url').text, 'http://server.com/download.mp3')
def test_xpath_text(self):
testxml = '''<root>
<div>
<p>Foo</p>
</div>
</root>'''
doc = xml.etree.ElementTree.fromstring(testxml)
self.assertEqual(xpath_text(doc, 'div/p'), 'Foo')
self.assertTrue(xpath_text(doc, 'div/bar') is None)
self.assertRaises(ExtractorError, xpath_text, doc, 'div/bar', fatal=True)
def test_smuggle_url(self):
data = {"ö": "ö", "abc": [3]}
url = 'https://foo.bar/baz?x=y#a'
@@ -244,6 +320,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('2.5 hours'), 9000)
self.assertEqual(parse_duration('02:03:04'), 7384)
self.assertEqual(parse_duration('01:02:03:04'), 93784)
self.assertEqual(parse_duration('1 hour 3 minutes'), 3780)
def test_fix_xml_ampersands(self):
self.assertEqual(
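
The behaviour checked by `test_sanitize_url_path_consecutive_slashes` in this file can be reproduced with the standard library. The following Python 3 sketch is not youtube-dl's implementation; `collapse_path_slashes` is a hypothetical name, and the approach (parse the URL, collapse slashes in the path only) is one assumption about how such a helper might work.

```python
import re
from urllib.parse import urlparse, urlunparse


def collapse_path_slashes(url):
    """Collapse runs of '/' in the path component of a URL."""
    parts = urlparse(url)
    # Only the path is rewritten, so the '//' after the scheme is untouched.
    clean_path = re.sub(r'/{2,}', '/', parts.path)
    return urlunparse(parts._replace(path=clean_path))


print(collapse_path_slashes('http://hostname/foo//bar/filename.html'))
```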

View File

@@ -1,8 +1,11 @@
[tox]
envlist = py26,py27,py33
envlist = py26,py27,py33,py34
[testenv]
deps =
nose
coverage
commands = nosetests --verbose {posargs:test} # --with-coverage --cover-package=youtube_dl --cover-html
defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
--exclude test_subtitles.py --exclude test_write_annotations.py
--exclude test_youtube_lists.py
commands = nosetests --verbose {posargs:{[testenv]defaultargs}} # --with-coverage --cover-package=youtube_dl --cover-html
# test.test_download:TestDownload.test_NowVideo

View File

@@ -4,8 +4,10 @@
from __future__ import absolute_import, unicode_literals
import collections
import contextlib
import datetime
import errno
import fileinput
import io
import itertools
import json
@@ -28,6 +30,7 @@ from .compat import (
compat_basestring,
compat_cookiejar,
compat_expanduser,
compat_get_terminal_size,
compat_http_client,
compat_kwargs,
compat_str,
@@ -46,18 +49,19 @@ from .utils import (
ExtractorError,
format_bytes,
formatSeconds,
get_term_width,
locked_file,
make_HTTPS_handler,
MaxDownloadsReached,
PagedList,
parse_filesize,
PerRequestProxyHandler,
PostProcessingError,
platform_name,
preferredencoding,
render_table,
SameFileError,
sanitize_filename,
sanitize_path,
std_headers,
subtitles_filename,
takewhile_inclusive,
@@ -181,6 +185,8 @@ class YoutubeDL(object):
prefer_insecure: Use HTTP instead of HTTPS to retrieve information.
At the moment, this is only supported by YouTube.
proxy: URL of the proxy server to use
cn_verification_proxy: URL of the proxy to use for IP address verification
on Chinese sites. (Experimental)
socket_timeout: Time to wait for unresponsive hosts, in seconds
bidi_workaround: Work around buggy terminals without bidirectional text
support, using fribidi
@@ -247,10 +253,10 @@ class YoutubeDL(object):
hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv.
The following parameters are not used by YoutubeDL itself, they are used by
the FileDownloader:
the downloader (see youtube_dl/downloader/common.py):
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
noresizebuffer, retries, continuedl, noprogress, consoletitle,
xattr_set_filesize.
xattr_set_filesize, external_downloader_args.
The following options are used by the post processors:
prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
@@ -284,7 +290,7 @@ class YoutubeDL(object):
try:
import pty
master, slave = pty.openpty()
width = get_term_width()
width = compat_get_terminal_size().columns
if width is None:
width_args = []
else:
@@ -317,8 +323,10 @@ class YoutubeDL(object):
'Set the LC_ALL environment variable to fix this.')
self.params['restrictfilenames'] = True
if '%(stitle)s' in self.params.get('outtmpl', ''):
self.report_warning('%(stitle)s is deprecated. Use the %(title)s and the --restrict-filenames flag(which also secures %(uploader)s et al) instead.')
if isinstance(params.get('outtmpl'), bytes):
self.report_warning(
'Parameter outtmpl is bytes, but should be a unicode string. '
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
self._setup_opener()
@@ -557,7 +565,7 @@ class YoutubeDL(object):
if v is not None)
template_dict = collections.defaultdict(lambda: 'NA', template_dict)
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
outtmpl = sanitize_path(self.params.get('outtmpl', DEFAULT_OUTTMPL))
tmpl = compat_expanduser(outtmpl)
filename = tmpl % template_dict
# Temporary fix for #4787
@@ -624,7 +632,7 @@ class YoutubeDL(object):
Returns a list with a dictionary for each video we find.
If 'download', also downloads the videos.
extra_info is a dict containing the extra values to add to each result
'''
'''
if ie_key:
ies = [self.get_info_extractor(ie_key)]
@@ -1080,8 +1088,7 @@ class YoutubeDL(object):
if req_format is None:
req_format = 'best'
formats_to_download = []
# The -1 is for supporting YoutubeIE
if req_format in ('-1', 'all'):
if req_format == 'all':
formats_to_download = formats
else:
for rfstr in req_format.split(','):
@@ -1208,9 +1215,6 @@ class YoutubeDL(object):
if len(info_dict['title']) > 200:
info_dict['title'] = info_dict['title'][:197] + '...'
# Keep for backwards compatibility
info_dict['stitle'] = info_dict['title']
if 'format' not in info_dict:
info_dict['format'] = info_dict['ext']
@@ -1256,7 +1260,7 @@ class YoutubeDL(object):
return
try:
dn = os.path.dirname(encodeFilename(filename))
dn = os.path.dirname(sanitize_path(encodeFilename(filename)))
if dn and not os.path.exists(dn):
os.makedirs(dn)
except (OSError, IOError) as err:
@@ -1300,17 +1304,18 @@ class YoutubeDL(object):
# subtitles download errors are already managed as troubles in relevant IE
# that way it will silently go on when used with unsupporting IE
subtitles = info_dict['requested_subtitles']
ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext']
if sub_info.get('data') is not None:
sub_data = sub_info['data']
else:
try:
uf = self.urlopen(sub_info['url'])
sub_data = uf.read().decode('utf-8')
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
sub_data = ie._download_webpage(
sub_info['url'], info_dict['id'], note=False)
except ExtractorError as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
(sub_lang, compat_str(err)))
(sub_lang, compat_str(err.cause)))
continue
try:
sub_filename = subtitles_filename(filename, sub_lang, sub_format)
@@ -1451,8 +1456,11 @@ class YoutubeDL(object):
return self._download_retcode
def download_with_info_file(self, info_filename):
with io.open(info_filename, 'r', encoding='utf-8') as f:
info = json.load(f)
with contextlib.closing(fileinput.FileInput(
[info_filename], mode='r',
openhook=fileinput.hook_encoded('utf-8'))) as f:
# FileInput doesn't have a read method, we can't call json.load
info = json.loads('\n'.join(f))
try:
self.process_ie_result(info, download=True)
except DownloadError:
@@ -1756,13 +1764,14 @@ class YoutubeDL(object):
# Set HTTPS proxy to HTTP one if given (https://github.com/rg3/youtube-dl/issues/805)
if 'http' in proxies and 'https' not in proxies:
proxies['https'] = proxies['http']
proxy_handler = compat_urllib_request.ProxyHandler(proxies)
proxy_handler = PerRequestProxyHandler(proxies)
debuglevel = 1 if self.params.get('debug_printtraffic') else 0
https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel)
ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel)
opener = compat_urllib_request.build_opener(
https_handler, proxy_handler, cookie_processor, ydlh)
proxy_handler, https_handler, cookie_processor, ydlh)
# Delete the default user-agent header, which would otherwise apply in
# cases where our custom HTTP handler doesn't come into play
# (See https://github.com/rg3/youtube-dl/issues/1309 for details)

View File

@@ -9,6 +9,7 @@ import codecs
import io
import os
import random
import shlex
import sys
@@ -170,6 +171,9 @@ def _real_main(argv=None):
if opts.recodevideo is not None:
if opts.recodevideo not in ['mp4', 'flv', 'webm', 'ogg', 'mkv']:
parser.error('invalid video recode format specified')
if opts.convertsubtitles is not None:
if opts.convertsubtitles not in ['srt', 'vtt', 'ass']:
parser.error('invalid subtitle format specified')
if opts.date is not None:
date = DateRange.day(opts.date)
@@ -209,6 +213,11 @@ def _real_main(argv=None):
# PostProcessors
postprocessors = []
# Add the metadata pp first, the other pps will copy it
if opts.metafromtitle:
postprocessors.append({
'key': 'MetadataFromTitle',
'titleformat': opts.metafromtitle
})
if opts.addmetadata:
postprocessors.append({'key': 'FFmpegMetadata'})
if opts.extractaudio:
@@ -223,6 +232,11 @@ def _real_main(argv=None):
'key': 'FFmpegVideoConvertor',
'preferedformat': opts.recodevideo,
})
if opts.convertsubtitles:
postprocessors.append({
'key': 'FFmpegSubtitlesConvertor',
'format': opts.convertsubtitles,
})
if opts.embedsubtitles:
postprocessors.append({
'key': 'FFmpegEmbedSubtitle',
@@ -247,6 +261,9 @@ def _real_main(argv=None):
xattr # Confuse flake8
except ImportError:
parser.error('setting filesize xattr requested but python-xattr is not available')
external_downloader_args = None
if opts.external_downloader_args:
external_downloader_args = shlex.split(opts.external_downloader_args)
match_filter = (
None if opts.match_filter is None
else match_filter_func(opts.match_filter))
@@ -351,6 +368,8 @@ def _real_main(argv=None):
'no_color': opts.no_color,
'ffmpeg_location': opts.ffmpeg_location,
'hls_prefer_native': opts.hls_prefer_native,
'external_downloader_args': external_downloader_args,
'cn_verification_proxy': opts.cn_verification_proxy,
}
with YoutubeDL(ydl_opts) as ydl:
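Review note on the `external_downloader_args` hunk above: the option value arrives as one string and is turned into an argument list with `shlex.split`, so quoted values survive as single arguments. A small sketch of that step, assuming a hypothetical helper name `parse_external_downloader_args`:

```python
import shlex


def parse_external_downloader_args(raw):
    """Split a shell-style argument string into a list, or return None."""
    if not raw:
        # Mirrors the diff: an unset option stays None so that each
        # downloader can fall back to its own default arguments.
        return None
    return shlex.split(raw)
```

Returning `None` rather than an empty list matters downstream, because `_configuration_args` only applies its defaults when the parameter is `None`.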

View File

@@ -1,9 +1,11 @@
from __future__ import unicode_literals
import collections
import getpass
import optparse
import os
import re
import shutil
import socket
import subprocess
import sys
@@ -364,6 +366,33 @@ def workaround_optparse_bug9161():
return real_add_option(self, *bargs, **bkwargs)
optparse.OptionGroup.add_option = _compat_add_option
if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3
compat_get_terminal_size = shutil.get_terminal_size
else:
_terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines'])
def compat_get_terminal_size():
columns = compat_getenv('COLUMNS', None)
if columns:
columns = int(columns)
else:
columns = None
lines = compat_getenv('LINES', None)
if lines:
lines = int(lines)
else:
lines = None
try:
sp = subprocess.Popen(
['stty', 'size'],
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = sp.communicate()
lines, columns = map(int, out.split())
except:
pass
return _terminal_size(columns, lines)
__all__ = [
'compat_HTTPError',
@@ -371,6 +400,7 @@ __all__ = [
'compat_chr',
'compat_cookiejar',
'compat_expanduser',
'compat_get_terminal_size',
'compat_getenv',
'compat_getpass',
'compat_html_entities',
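Review note on `compat_get_terminal_size` above: the fallback reads `COLUMNS` and `LINES` from the environment and then lets `stty size` override both values if it succeeds. The sketch below covers only the environment half and, unlike the diff, tolerates non-numeric values; `TerminalSize` and `terminal_size_from_env` are hypothetical names used for illustration.

```python
import collections

TerminalSize = collections.namedtuple('TerminalSize', ['columns', 'lines'])


def terminal_size_from_env(environ):
    """Build a (columns, lines) tuple from a mapping such as os.environ."""
    def _to_int(value):
        # Missing or malformed values become None instead of raising.
        try:
            return int(value)
        except (TypeError, ValueError):
            return None

    return TerminalSize(
        _to_int(environ.get('COLUMNS')), _to_int(environ.get('LINES')))
```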

View File

@@ -42,6 +42,8 @@ class FileDownloader(object):
max_filesize: Skip files larger than this size
xattr_set_filesize: Set ytdl.filesize user xattribute with expected size.
(experimental)
external_downloader_args: A list of additional command-line arguments for the
external downloader.
Subclasses of this one must re-define the real_download method.
"""

View File

@@ -51,6 +51,13 @@ class ExternalFD(FileDownloader):
return []
return [command_option, source_address]
def _configuration_args(self, default=[]):
ex_args = self.params.get('external_downloader_args')
if ex_args is None:
return default
assert isinstance(ex_args, list)
return ex_args
def _call_downloader(self, tmpfilename, info_dict):
""" Either overwrite this or implement _make_cmd """
cmd = self._make_cmd(tmpfilename, info_dict)
@@ -79,6 +86,7 @@ class CurlFD(ExternalFD):
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._source_address('--interface')
cmd += self._configuration_args()
cmd += ['--', info_dict['url']]
return cmd
@@ -89,15 +97,16 @@ class WgetFD(ExternalFD):
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._source_address('--bind-address')
cmd += self._configuration_args()
cmd += ['--', info_dict['url']]
return cmd
class Aria2cFD(ExternalFD):
def _make_cmd(self, tmpfilename, info_dict):
cmd = [
self.exe, '-c',
'--min-split-size', '1M', '--max-connection-per-server', '4']
cmd = [self.exe, '-c']
cmd += self._configuration_args([
'--min-split-size', '1M', '--max-connection-per-server', '4'])
dn = os.path.dirname(tmpfilename)
if dn:
cmd += ['--dir', dn]
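Review note on `_configuration_args` above: user-supplied arguments replace the per-downloader defaults rather than extending them, which is why the aria2c defaults moved into the call. A sketch of that behaviour follows, written as free functions with hypothetical names and with `default=None` in place of the mutable `default=[]` used in the diff.

```python
def configuration_args(params, default=None):
    """Return user-supplied downloader arguments, or a copy of the defaults."""
    ex_args = params.get('external_downloader_args')
    if ex_args is None:
        return list(default or [])
    if not isinstance(ex_args, list):
        raise TypeError('external_downloader_args must be a list')
    return ex_args


def aria2c_cmd(exe, params):
    """Build the start of an aria2c command line as the diff does."""
    cmd = [exe, '-c']
    # Defaults apply only when the user passed no arguments at all.
    cmd += configuration_args(params, [
        '--min-split-size', '1M', '--max-connection-per-server', '4'])
    return cmd
```

Copying the default list avoids sharing one list object between calls, which is the usual concern with mutable default arguments.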

View File

@@ -11,6 +11,7 @@ from .common import FileDownloader
from .http import HttpFD
from ..compat import (
compat_urlparse,
compat_urllib_error,
)
from ..utils import (
struct_pack,
@@ -121,7 +122,8 @@ class FlvReader(io.BytesIO):
self.read_unsigned_int() # BootstrapinfoVersion
# Profile,Live,Update,Reserved
self.read(1)
flags = self.read_unsigned_char()
live = flags & 0x20 != 0
# time scale
self.read_unsigned_int()
# CurrentMediaTime
@@ -160,6 +162,7 @@ class FlvReader(io.BytesIO):
return {
'segments': segments,
'fragments': fragments,
'live': live,
}
def read_bootstrap_info(self):
@@ -182,6 +185,10 @@ def build_fragments_list(boot_info):
for segment, fragments_count in segment_run_table['segment_run']:
for _ in range(fragments_count):
res.append((segment, next(fragments_counter)))
if boot_info['live']:
res = res[-2:]
return res
@@ -246,6 +253,38 @@ class F4mFD(FileDownloader):
self.report_error('Unsupported DRM')
return media
def _get_bootstrap_from_url(self, bootstrap_url):
bootstrap = self.ydl.urlopen(bootstrap_url).read()
return read_bootstrap_info(bootstrap)
def _update_live_fragments(self, bootstrap_url, latest_fragment):
fragments_list = []
retries = 30
while (not fragments_list) and (retries > 0):
boot_info = self._get_bootstrap_from_url(bootstrap_url)
fragments_list = build_fragments_list(boot_info)
fragments_list = [f for f in fragments_list if f[1] > latest_fragment]
if not fragments_list:
# Retry after a while
time.sleep(5.0)
retries -= 1
if not fragments_list:
self.report_error('Failed to update fragments')
return fragments_list
def _parse_bootstrap_node(self, node, base_url):
if node.text is None:
bootstrap_url = compat_urlparse.urljoin(
base_url, node.attrib['url'])
boot_info = self._get_bootstrap_from_url(bootstrap_url)
else:
bootstrap_url = None
bootstrap = base64.b64decode(node.text.encode('ascii'))
boot_info = read_bootstrap_info(bootstrap)
return (boot_info, bootstrap_url)
def real_download(self, filename, info_dict):
man_url = info_dict['url']
requested_bitrate = info_dict.get('tbr')
@@ -265,18 +304,13 @@ class F4mFD(FileDownloader):
base_url = compat_urlparse.urljoin(man_url, media.attrib['url'])
bootstrap_node = doc.find(_add_ns('bootstrapInfo'))
if bootstrap_node.text is None:
bootstrap_url = compat_urlparse.urljoin(
base_url, bootstrap_node.attrib['url'])
bootstrap = self.ydl.urlopen(bootstrap_url).read()
else:
bootstrap = base64.b64decode(bootstrap_node.text)
boot_info, bootstrap_url = self._parse_bootstrap_node(bootstrap_node, base_url)
live = boot_info['live']
metadata_node = media.find(_add_ns('metadata'))
if metadata_node is not None:
metadata = base64.b64decode(metadata_node.text)
metadata = base64.b64decode(metadata_node.text.encode('ascii'))
else:
metadata = None
boot_info = read_bootstrap_info(bootstrap)
fragments_list = build_fragments_list(boot_info)
if self.params.get('test', False):
@@ -301,7 +335,8 @@ class F4mFD(FileDownloader):
(dest_stream, tmpfilename) = sanitize_open(tmpfilename, 'wb')
write_flv_header(dest_stream)
write_metadata_tag(dest_stream, metadata)
if not live:
write_metadata_tag(dest_stream, metadata)
# This dict stores the download progress, it's updated by the progress
# hook
@@ -348,24 +383,45 @@ class F4mFD(FileDownloader):
http_dl.add_progress_hook(frag_progress_hook)
frags_filenames = []
for (seg_i, frag_i) in fragments_list:
while fragments_list:
seg_i, frag_i = fragments_list.pop(0)
name = 'Seg%d-Frag%d' % (seg_i, frag_i)
url = base_url + name
if akamai_pv:
url += '?' + akamai_pv.strip(';')
frag_filename = '%s-%s' % (tmpfilename, name)
success = http_dl.download(frag_filename, {'url': url})
if not success:
return False
with open(frag_filename, 'rb') as down:
down_data = down.read()
reader = FlvReader(down_data)
while True:
_, box_type, box_data = reader.read_box_info()
if box_type == b'mdat':
dest_stream.write(box_data)
break
frags_filenames.append(frag_filename)
try:
success = http_dl.download(frag_filename, {'url': url})
if not success:
return False
with open(frag_filename, 'rb') as down:
down_data = down.read()
reader = FlvReader(down_data)
while True:
_, box_type, box_data = reader.read_box_info()
if box_type == b'mdat':
dest_stream.write(box_data)
break
if live:
os.remove(frag_filename)
else:
frags_filenames.append(frag_filename)
except (compat_urllib_error.HTTPError, ) as err:
if live and (err.code == 404 or err.code == 410):
# We didn't keep up with the live window. Continue
# with the next available fragment.
msg = 'Fragment %d unavailable' % frag_i
self.report_warning(msg)
fragments_list = []
else:
raise
if not fragments_list and live and bootstrap_url:
fragments_list = self._update_live_fragments(bootstrap_url, frag_i)
total_frags += len(fragments_list)
if fragments_list and (fragments_list[0][1] > frag_i + 1):
msg = 'Missed %d fragments' % (fragments_list[0][1] - (frag_i + 1))
self.report_warning(msg)
dest_stream.close()
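Review note on the live f4m handling above: for live streams only the last two fragments of the bootstrap are kept, already downloaded fragments are filtered out, and a warning reports any gap. The sketch below combines those steps in one pure function; `next_live_fragments` is a hypothetical name, and the network refresh and retry loop from `_update_live_fragments` are left out.

```python
def next_live_fragments(fragments, latest_fragment):
    """Select the fragments to fetch next and count the ones that were missed.

    fragments is a list of (segment, fragment) tuples in ascending order.
    """
    # Live bootstraps are trimmed to their last two entries, then anything
    # at or before the last downloaded fragment is dropped.
    fresh = [f for f in fragments[-2:] if f[1] > latest_fragment]
    missed = 0
    if fresh and fresh[0][1] > latest_fragment + 1:
        missed = fresh[0][1] - (latest_fragment + 1)
    return fresh, missed
```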

View File

@@ -92,6 +92,8 @@ class HttpFD(FileDownloader):
self._hook_progress({
'filename': filename,
'status': 'finished',
'downloaded_bytes': resume_len,
'total_bytes': resume_len,
})
return True
else:
@@ -218,12 +220,6 @@ class HttpFD(FileDownloader):
if tmpfilename != '-':
stream.close()
self._hook_progress({
'downloaded_bytes': byte_counter,
'total_bytes': data_len,
'tmpfilename': tmpfilename,
'status': 'error',
})
if data_len is not None and byte_counter != data_len:
raise ContentTooShortError(byte_counter, int(data_len))
self.try_rename(tmpfilename, filename)

View File

@@ -119,7 +119,9 @@ class RtmpFD(FileDownloader):
# Download using rtmpdump. rtmpdump returns exit code 2 when
# the connection was interrupted and resuming appears to be
# possible. This is part of rtmpdump's normal usage, AFAIK.
basic_args = ['rtmpdump', '--verbose', '-r', url, '-o', tmpfilename]
basic_args = [
'rtmpdump', '--verbose', '-r', url,
'-o', encodeFilename(tmpfilename, True)]
if player_url is not None:
basic_args += ['--swfVfy', player_url]
if page_url is not None:

View File

@@ -8,6 +8,7 @@ from .adobetv import AdobeTVIE
from .adultswim import AdultSwimIE
from .aftenposten import AftenpostenIE
from .aftonbladet import AftonbladetIE
from .airmozilla import AirMozillaIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
from .anitube import AnitubeIE
@@ -36,6 +37,7 @@ from .bandcamp import BandcampIE, BandcampAlbumIE
from .bbccouk import BBCCoUkIE
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .beatportpro import BeatportProIE
from .bet import BetIE
from .bild import BildIE
from .bilibili import BiliBiliIE
@@ -105,6 +107,7 @@ from .dctp import DctpTvIE
from .deezer import DeezerPlaylistIE
from .dfb import DFBIE
from .dotsub import DotsubIE
from .douyutv import DouyuTVIE
from .dreisat import DreiSatIE
from .drbonanza import DRBonanzaIE
from .drtuber import DrTuberIE
@@ -115,6 +118,7 @@ from .defense import DefenseGouvFrIE
from .discovery import DiscoveryIE
from .divxstage import DivxStageIE
from .dropbox import DropboxIE
from .eagleplatform import EaglePlatformIE
from .ebaumsworld import EbaumsWorldIE
from .echomsk import EchoMskIE
from .ehow import EHowIE
@@ -149,6 +153,7 @@ from .fktv import (
)
from .flickr import FlickrIE
from .folketinget import FolketingetIE
from .footyroom import FootyRoomIE
from .fourtube import FourTubeIE
from .foxgay import FoxgayIE
from .foxnews import FoxNewsIE
@@ -173,6 +178,7 @@ from .gameone import (
from .gamespot import GameSpotIE
from .gamestar import GameStarIE
from .gametrailers import GametrailersIE
from .gazeta import GazetaIE
from .gdcvault import GDCVaultIE
from .generic import GenericIE
from .giantbomb import GiantBombIE
@@ -226,6 +232,8 @@ from .jeuxvideo import JeuxVideoIE
from .jove import JoveIE
from .jukebox import JukeboxIE
from .jpopsukitv import JpopsukiIE
from .kaltura import KalturaIE
from .kanalplay import KanalPlayIE
from .kankan import KankanIE
from .karaoketv import KaraoketvIE
from .keezmovies import KeezMoviesIE
@@ -237,6 +245,12 @@ from .krasview import KrasViewIE
from .ku6 import Ku6IE
from .la7 import LA7IE
from .laola1tv import Laola1TvIE
from .letv import (
LetvIE,
LetvTvIE,
LetvPlaylistIE
)
from .libsyn import LibsynIE
from .lifenews import LifeNewsIE
from .liveleak import LiveLeakIE
from .livestream import (
@@ -333,12 +347,14 @@ from .npo import (
)
from .nrk import (
NRKIE,
NRKPlaylistIE,
NRKTVIE,
)
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
from .nytimes import NYTimesIE
from .nuvid import NuvidIE
from .odnoklassniki import OdnoklassnikiIE
from .oktoberfesttv import OktoberfestTVIE
from .ooyala import OoyalaIE
from .openfilm import OpenFilmIE
@@ -346,6 +362,7 @@ from .orf import (
ORFTVthekIE,
ORFOE1IE,
ORFFM4IE,
ORFIPTVIE,
)
from .parliamentliveuk import ParliamentLiveUKIE
from .patreon import PatreonIE
@@ -353,9 +370,11 @@ from .pbs import PBSIE
from .phoenix import PhoenixIE
from .photobucket import PhotobucketIE
from .planetaplay import PlanetaPlayIE
from .pladform import PladformIE
from .played import PlayedIE
from .playfm import PlayFMIE
from .playvid import PlayvidIE
from .playwire import PlaywireIE
from .podomatic import PodomaticIE
from .pornhd import PornHdIE
from .pornhub import (
@@ -364,8 +383,10 @@ from .pornhub import (
)
from .pornotube import PornotubeIE
from .pornoxo import PornoXOIE
from .primesharetv import PrimeShareTVIE
from .promptfile import PromptFileIE
from .prosiebensat1 import ProSiebenSat1IE
from .puls4 import Puls4IE
from .pyvideo import PyvideoIE
from .quickvid import QuickVidIE
from .r7 import R7IE
@@ -388,7 +409,7 @@ from .rtlnow import RTLnowIE
from .rtl2 import RTL2IE
from .rtp import RTPIE
from .rts import RTSIE
from .rtve import RTVEALaCartaIE, RTVELiveIE
from .rtve import RTVEALaCartaIE, RTVELiveIE, RTVEInfantilIE
from .ruhd import RUHDIE
from .rutube import (
RutubeIE,
@@ -446,6 +467,7 @@ from .sport5 import Sport5IE
from .sportbox import SportBoxIE
from .sportdeutschland import SportDeutschlandIE
from .srmediathek import SRMediathekIE
from .ssa import SSAIE
from .stanfordoc import StanfordOpenClassroomIE
from .steam import SteamIE
from .streamcloud import StreamcloudIE
@@ -518,6 +540,7 @@ from .udemy import (
UdemyIE,
UdemyCourseIE
)
from .ultimedia import UltimediaIE
from .unistra import UnistraIE
from .urort import UrortIE
from .ustream import UstreamIE, UstreamChannelIE
@@ -541,6 +564,7 @@ from .videoweed import VideoWeedIE
from .vidme import VidmeIE
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE
from .viewster import ViewsterIE
from .vimeo import (
VimeoIE,
VimeoAlbumIE,
@@ -597,6 +621,11 @@ from .yahoo import (
YahooSearchIE,
)
from .yam import YamIE
from .yandexmusic import (
YandexMusicTrackIE,
YandexMusicAlbumIE,
YandexMusicPlaylistIE,
)
from .yesjapan import YesJapanIE
from .ynet import YnetIE
from .youjizz import YouJizzIE

View File

@@ -2,13 +2,12 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
xpath_text,
float_or_none,
xpath_text,
)
@@ -60,6 +59,24 @@ class AdultSwimIE(InfoExtractor):
'title': 'American Dad - Putting Francine Out of Business',
'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
},
}, {
'url': 'http://www.adultswim.com/videos/tim-and-eric-awesome-show-great-job/dr-steve-brule-for-your-wine/',
'playlist': [
{
'md5': '3e346a2ab0087d687a05e1e7f3b3e529',
'info_dict': {
'id': 'sY3cMUR_TbuE4YmdjzbIcQ-0',
'ext': 'flv',
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \r\nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.\r\n\r\n',
},
}
],
'info_dict': {
'id': 'sY3cMUR_TbuE4YmdjzbIcQ',
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \r\nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.\r\n\r\n',
},
}]
@staticmethod
@@ -80,6 +97,7 @@ class AdultSwimIE(InfoExtractor):
for video in collection.get('videos'):
if video.get('slug') == slug:
return collection, video
return None, None
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
@@ -90,28 +108,30 @@ class AdultSwimIE(InfoExtractor):
webpage = self._download_webpage(url, episode_path)
# Extract the value of `bootstrappedData` from the Javascript in the page.
bootstrappedDataJS = self._search_regex(r'var bootstrappedData = ({.*});', webpage, episode_path)
try:
bootstrappedData = json.loads(bootstrappedDataJS)
except ValueError as ve:
errmsg = '%s: Failed to parse JSON ' % episode_path
raise ExtractorError(errmsg, cause=ve)
bootstrapped_data = self._parse_json(self._search_regex(
r'var bootstrappedData = ({.*});', webpage, 'bootstrapped data'), episode_path)
# Downloading videos from a /videos/playlist/ URL needs to be handled differently.
# NOTE: We are only downloading one video (the current one) not the playlist
if is_playlist:
collections = bootstrappedData['playlists']['collections']
collections = bootstrapped_data['playlists']['collections']
collection = self.find_collection_by_linkURL(collections, show_path)
video_info = self.find_video_info(collection, episode_path)
show_title = video_info['showTitle']
segment_ids = [video_info['videoPlaybackID']]
else:
collections = bootstrappedData['show']['collections']
collections = bootstrapped_data['show']['collections']
collection, video_info = self.find_collection_containing_video(collections, episode_path)
show = bootstrappedData['show']
# Video wasn't found in the collections, let's try `slugged_video`.
if video_info is None:
if bootstrapped_data.get('slugged_video', {}).get('slug') == episode_path:
video_info = bootstrapped_data['slugged_video']
else:
raise ExtractorError('Unable to find video info')
show = bootstrapped_data['show']
show_title = show['title']
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]

View File

@@ -14,10 +14,10 @@ from ..utils import (
class AftenpostenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/([^/]+/)*(?P<id>[^/]+)-\d+\.html'
_VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/(?:#!/)?video/(?P<id>\d+)'
_TEST = {
'url': 'http://www.aftenposten.no/webtv/serier-og-programmer/sweatshopenglish/TRAILER-SWEATSHOP---I-cant-take-any-more-7800835.html?paging=&section=webtv_serierogprogrammer_sweatshop_sweatshopenglish',
'url': 'http://www.aftenposten.no/webtv/#!/video/21039/trailer-sweatshop-i-can-t-take-any-more',
'md5': 'fd828cd29774a729bf4d4425fe192972',
'info_dict': {
'id': '21039',
@@ -30,12 +30,7 @@ class AftenpostenIE(InfoExtractor):
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._html_search_regex(
r'data-xs-id="(\d+)"', webpage, 'video id')
video_id = self._match_id(url)
data = self._download_xml(
'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)

View File

@@ -0,0 +1,74 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
)
class AirMozillaIE(InfoExtractor):
_VALID_URL = r'https?://air\.mozilla\.org/(?P<id>[0-9a-z-]+)/?'
_TEST = {
'url': 'https://air.mozilla.org/privacy-lab-a-meetup-for-privacy-minded-people-in-san-francisco/',
'md5': '2e3e7486ba5d180e829d453875b9b8bf',
'info_dict': {
'id': '6x4q2w',
'ext': 'mp4',
'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
'thumbnail': 're:https://\w+\.cloudfront\.net/6x4q2w/poster\.jpg\?t=\d+',
'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
'timestamp': 1422487800,
'upload_date': '20150128',
'location': 'SFO Commons',
'duration': 3780,
'view_count': int,
'categories': ['Main'],
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._html_search_regex(r'//vid.ly/(.*?)/embed', webpage, 'id')
embed_script = self._download_webpage('https://vid.ly/{0}/embed'.format(video_id), video_id)
jwconfig = self._search_regex(r'\svar jwconfig = (\{.*?\});\s', embed_script, 'metadata')
metadata = self._parse_json(jwconfig, video_id)
formats = [{
'url': source['file'],
'ext': source['type'],
'format_id': self._search_regex(r'&format=(.*)$', source['file'], 'video format'),
'format': source['label'],
'height': int(source['label'].rstrip('p')),
} for source in metadata['playlist'][0]['sources']]
self._sort_formats(formats)
view_count = int_or_none(self._html_search_regex(
r'Views since archived: ([0-9]+)',
webpage, 'view count', fatal=False))
timestamp = parse_iso8601(self._html_search_regex(
r'<time datetime="(.*?)"', webpage, 'timestamp', fatal=False))
duration = parse_duration(self._search_regex(
r'Duration:\s*(\d+\s*hours?\s*\d+\s*minutes?)',
webpage, 'duration', fatal=False))
return {
'id': video_id,
'title': self._og_search_title(webpage),
'formats': formats,
'url': self._og_search_url(webpage),
'display_id': display_id,
'thumbnail': metadata['playlist'][0].get('image'),
'description': self._og_search_description(webpage),
'timestamp': timestamp,
'location': self._html_search_regex(r'Location: (.*)', webpage, 'location', default=None),
'duration': duration,
'view_count': view_count,
'categories': re.findall(r'<a href=".*?" class="channel">(.*?)</a>', webpage),
}

View File

@@ -50,6 +50,9 @@ class ARDMediathekIE(InfoExtractor):
if '>Der gewünschte Beitrag ist nicht mehr verfügbar.<' in webpage:
raise ExtractorError('Video %s is no longer available' % video_id, expected=True)
if 'Diese Sendung ist für Jugendliche unter 12 Jahren nicht geeignet. Der Clip ist deshalb nur von 20 bis 6 Uhr verfügbar.' in webpage:
raise ExtractorError('This program is only suitable for those aged 12 and older. Video %s is therefore only available between 8 pm and 6 am.' % video_id, expected=True)
if re.search(r'[\?&]rss($|[=&])', url):
doc = parse_xml(webpage)
if doc.tag == 'rss':

View File

@@ -146,6 +146,7 @@ class ArteTVPlus7IE(InfoExtractor):
formats.append(format)
self._check_formats(formats, video_id)
self._sort_formats(formats)
info_dict['formats'] = formats

View File

@@ -19,6 +19,7 @@ from ..utils import (
class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
_NETRC_MACHINE = 'atresplayer'
_TESTS = [
{
'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',

View File

@@ -0,0 +1,103 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import int_or_none
class BeatportProIE(InfoExtractor):
_VALID_URL = r'https?://pro\.beatport\.com/track/(?P<display_id>[^/]+)/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://pro.beatport.com/track/synesthesia-original-mix/5379371',
'md5': 'b3c34d8639a2f6a7f734382358478887',
'info_dict': {
'id': '5379371',
'display_id': 'synesthesia-original-mix',
'ext': 'mp4',
'title': 'Froxic - Synesthesia (Original Mix)',
},
}, {
'url': 'https://pro.beatport.com/track/love-and-war-original-mix/3756896',
'md5': 'e44c3025dfa38c6577fbaeb43da43514',
'info_dict': {
'id': '3756896',
'display_id': 'love-and-war-original-mix',
'ext': 'mp3',
'title': 'Wolfgang Gartner - Love & War (Original Mix)',
},
}, {
'url': 'https://pro.beatport.com/track/birds-original-mix/4991738',
'md5': 'a1fd8e8046de3950fd039304c186c05f',
'info_dict': {
'id': '4991738',
'display_id': 'birds-original-mix',
'ext': 'mp4',
'title': "Tos, Middle Milk, Mumblin' Johnsson - Birds (Original Mix)",
}
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
track_id = mobj.group('id')
display_id = mobj.group('display_id')
webpage = self._download_webpage(url, display_id)
playables = self._parse_json(
self._search_regex(
r'window\.Playables\s*=\s*({.+?});', webpage,
'playables info', flags=re.DOTALL),
track_id)
track = next(t for t in playables['tracks'] if t['id'] == int(track_id))
title = ', '.join((a['name'] for a in track['artists'])) + ' - ' + track['name']
if track['mix']:
title += ' (' + track['mix'] + ')'
formats = []
for ext, info in track['preview'].items():
if not info['url']:
continue
fmt = {
'url': info['url'],
'ext': ext,
'format_id': ext,
'vcodec': 'none',
}
if ext == 'mp3':
fmt['preference'] = 0
fmt['acodec'] = 'mp3'
fmt['abr'] = 96
fmt['asr'] = 44100
elif ext == 'mp4':
fmt['preference'] = 1
fmt['acodec'] = 'aac'
fmt['abr'] = 96
fmt['asr'] = 44100
formats.append(fmt)
self._sort_formats(formats)
images = []
for name, info in track['images'].items():
image_url = info.get('url')
if name == 'dynamic' or not image_url:
continue
image = {
'id': name,
'url': image_url,
'height': int_or_none(info.get('height')),
'width': int_or_none(info.get('width')),
}
images.append(image)
return {
'id': compat_str(track.get('id')) or track_id,
'display_id': track.get('slug') or display_id,
'title': title,
'formats': formats,
'thumbnails': images,
}
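Review note on `BeatportProIE` above: the title is assembled from the artist names, the track name, and an optional mix label. A sketch of that composition, assuming a hypothetical helper `build_title` and a `track` dict shaped like the `window.Playables` entries the extractor reads:

```python
def build_title(track):
    """Compose 'Artist A, Artist B - Name (Mix)' from a track dict."""
    title = ', '.join(a['name'] for a in track['artists'])
    title += ' - ' + track['name']
    # The mix suffix is added only when the field is present and non-empty.
    if track.get('mix'):
        title += ' (' + track['mix'] + ')'
    return title
```

Unlike the diff, which indexes `track['mix']` directly, this version uses `.get` so a track without that key does not raise.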

View File

@@ -41,7 +41,7 @@ class BreakIE(InfoExtractor):
'tbr': media['bitRate'],
'width': media['width'],
'height': media['height'],
} for media in info['media']]
} for media in info['media'] if media.get('mediaPurpose') == 'play']
if not formats:
formats.append({

View File

@@ -105,6 +105,7 @@ class CloudyIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
file_key = self._search_regex(
r'filekey\s*=\s*"([^"]+)"', webpage, 'file_key')
[r'key\s*:\s*"([^"]+)"', r'filekey\s*=\s*"([^"]+)"'],
webpage, 'file_key')
return self._extract_video(video_host, video_id, file_key)

View File

@@ -250,6 +250,8 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
})
self._sort_formats(formats)
subtitles = self._extract_subtitles(cdoc, guid)
virtual_id = show_name + ' ' + epTitle + ' part ' + compat_str(part_num + 1)
entries.append({
'id': guid,
@@ -260,6 +262,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
'duration': duration,
'thumbnail': thumbnail,
'description': description,
'subtitles': subtitles,
})
return {

View File

@@ -767,6 +767,10 @@ class InfoExtractor(object):
formats)
def _is_valid_url(self, url, video_id, item='video'):
url = self._proto_relative_url(url, scheme='http:')
# For now, assume non-HTTP(S) URLs are always valid
if not (url.startswith('http://') or url.startswith('https://')):
return True
try:
self._request_webpage(url, video_id, 'Checking %s URL' % item)
return True
@@ -835,7 +839,7 @@ class InfoExtractor(object):
m3u8_id=None):
formats = [{
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-meta'])),
'format_id': '-'.join(filter(None, [m3u8_id, 'meta'])),
'url': m3u8_url,
'ext': ext,
'protocol': 'm3u8',
@@ -879,8 +883,13 @@ class InfoExtractor(object):
formats.append({'url': format_url(line)})
continue
tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
format_id = []
if m3u8_id:
format_id.append(m3u8_id)
last_media_name = last_media.get('NAME') if last_media else None
format_id.append(last_media_name if last_media_name else '%d' % (tbr if tbr else len(formats)))
f = {
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-%d' % (tbr if tbr else len(formats))])),
'format_id': '-'.join(format_id),
'url': format_url(line.strip()),
'tbr': tbr,
'ext': ext,
@@ -921,39 +930,57 @@ class InfoExtractor(object):
formats = []
rtmp_count = 0
for video in smil.findall('./body/switch/video'):
src = video.get('src')
if not src:
continue
bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
width = int_or_none(video.get('width'))
height = int_or_none(video.get('height'))
proto = video.get('proto')
if not proto:
if base:
if base.startswith('rtmp'):
proto = 'rtmp'
elif base.startswith('http'):
proto = 'http'
ext = video.get('ext')
if proto == 'm3u8':
formats.extend(self._extract_m3u8_formats(src, video_id, ext))
elif proto == 'rtmp':
rtmp_count += 1
streamer = video.get('streamer') or base
formats.append({
'url': streamer,
'play_path': src,
'ext': 'flv',
'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
'tbr': bitrate,
'width': width,
'height': height,
})
if smil.findall('./body/seq/video'):
video = smil.findall('./body/seq/video')[0]
fmts, rtmp_count = self._parse_smil_video(video, video_id, base, rtmp_count)
formats.extend(fmts)
else:
for video in smil.findall('./body/switch/video'):
fmts, rtmp_count = self._parse_smil_video(video, video_id, base, rtmp_count)
formats.extend(fmts)
self._sort_formats(formats)
return formats
def _parse_smil_video(self, video, video_id, base, rtmp_count):
src = video.get('src')
if not src:
return ([], rtmp_count)
bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
width = int_or_none(video.get('width'))
height = int_or_none(video.get('height'))
proto = video.get('proto')
if not proto:
if base:
if base.startswith('rtmp'):
proto = 'rtmp'
elif base.startswith('http'):
proto = 'http'
ext = video.get('ext')
if proto == 'm3u8':
return (self._extract_m3u8_formats(src, video_id, ext), rtmp_count)
elif proto == 'rtmp':
rtmp_count += 1
streamer = video.get('streamer') or base
return ([{
'url': streamer,
'play_path': src,
'ext': 'flv',
'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
'tbr': bitrate,
'width': width,
'height': height,
}], rtmp_count)
elif proto.startswith('http'):
return ([{
'url': base + src,
'ext': ext or 'flv',
'tbr': bitrate,
'width': width,
'height': height,
}], rtmp_count)
def _live_title(self, name):
""" Generate the title for a live video """
now = datetime.datetime.now()
@@ -1035,6 +1062,9 @@ class InfoExtractor(object):
def _get_automatic_captions(self, *args, **kwargs):
raise NotImplementedError("This method must be implemented by subclasses")
def _subtitles_timecode(self, seconds):
return '%02d:%02d:%02d.%03d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 1000)
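Review note on `_subtitles_timecode` above: the method formats a number of seconds as `HH:MM:SS.mmm`. The diff relies on `%d` truncating the float results of `/`; the sketch below makes the truncation explicit with integer arithmetic. The free function `subtitles_timecode` is a hypothetical stand-in for the method.

```python
def subtitles_timecode(seconds):
    """Format a duration in seconds as HH:MM:SS.mmm."""
    hours = int(seconds // 3600)
    minutes = int(seconds % 3600 // 60)
    secs = int(seconds % 60)
    # The fractional part is truncated, not rounded, to milliseconds.
    millis = int(seconds % 1 * 1000)
    return '%02d:%02d:%02d.%03d' % (hours, minutes, secs, millis)
```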
class SearchInfoExtractor(InfoExtractor):
"""

View File

@@ -23,12 +23,12 @@ from ..utils import (
)
from ..aes import (
aes_cbc_decrypt,
inc,
)
class CrunchyrollIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.(?:com|fr)/(?:[^/]*/[^/?&]*?|media/\?id=)(?P<video_id>[0-9]+))(?:[/?&]|$)'
_NETRC_MACHINE = 'crunchyroll'
_TESTS = [{
'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
'info_dict': {
@@ -101,13 +101,6 @@ class CrunchyrollIE(InfoExtractor):
key = obfuscate_key(id)
class Counter:
__value = iv
def next_value(self):
temp = self.__value
self.__value = inc(self.__value)
return temp
decrypted_data = intlist_to_bytes(aes_cbc_decrypt(data, key, iv))
return zlib.decompress(decrypted_data)

View File

@@ -46,13 +46,13 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
_TESTS = [
{
'url': 'http://www.dailymotion.com/video/x33vw9_tutoriel-de-youtubeur-dl-des-video_tech',
'md5': '392c4b85a60a90dc4792da41ce3144eb',
'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames',
'md5': '2137c41a8e78554bb09225b8eb322406',
'info_dict': {
'id': 'x33vw9',
'id': 'x2iuewm',
'ext': 'mp4',
'uploader': 'Amphora Alex and Van .',
'title': 'Tutoriel de Youtubeur"DL DES VIDEO DE YOUTUBE"',
'uploader': 'IGN',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
}
},
# Vevo video

View File

@@ -0,0 +1,77 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import ExtractorError
class DouyuTVIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?douyutv\.com/(?P<id>[A-Za-z0-9]+)'
_TEST = {
'url': 'http://www.douyutv.com/iseven',
'info_dict': {
'id': 'iseven',
'ext': 'flv',
'title': 're:^清晨醒脑T-ara根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:9e525642c25a0a24302869937cf69d17',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': '7师傅',
'uploader_id': '431925',
'is_live': True,
},
'params': {
'skip_download': True,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
config = self._download_json(
'http://www.douyutv.com/api/client/room/%s' % video_id, video_id)
data = config['data']
error_code = config.get('error', 0)
show_status = data.get('show_status')
if error_code != 0:
raise ExtractorError(
'Server reported error %i' % error_code, expected=True)
# 1 = live, 2 = offline
if show_status == '2':
raise ExtractorError(
'Live stream is offline', expected=True)
base_url = data['rtmp_url']
live_path = data['rtmp_live']
title = self._live_title(data['room_name'])
description = data.get('show_details')
thumbnail = data.get('room_src')
uploader = data.get('nickname')
uploader_id = data.get('owner_uid')
multi_formats = data.get('rtmp_multi_bitrate')
if not isinstance(multi_formats, dict):
multi_formats = {}
multi_formats['live'] = live_path
formats = [{
'url': '%s/%s' % (base_url, format_path),
'format_id': format_id,
'preference': 1 if format_id == 'live' else 0,
} for format_id, format_path in multi_formats.items()]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'uploader': uploader,
'uploader_id': uploader_id,
'formats': formats,
'is_live': True,
}

View File

@@ -0,0 +1,98 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
)
class EaglePlatformIE(InfoExtractor):
_VALID_URL = r'''(?x)
(?:
eagleplatform:(?P<custom_host>[^/]+):|
https?://(?P<host>.+?\.media\.eagleplatform\.com)/index/player\?.*\brecord_id=
)
(?P<id>\d+)
'''
_TESTS = [{
# http://lenta.ru/news/2015/03/06/navalny/
'url': 'http://lentaru.media.eagleplatform.com/index/player?player=new&record_id=227304&player_template_id=5201',
'md5': '0b7994faa2bd5c0f69a3db6db28d078d',
'info_dict': {
'id': '227304',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
},
}, {
# http://muz-tv.ru/play/7129/
# http://media.clipyou.ru/index/player?record_id=12820&width=730&height=415&autoplay=true
'url': 'eagleplatform:media.clipyou.ru:12820',
'md5': '6c2ebeab03b739597ce8d86339d5a905',
'info_dict': {
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},
}]
def _handle_error(self, response):
status = int_or_none(response.get('status', 200))
if status != 200:
raise ExtractorError(' '.join(response['errors']), expected=True)
def _download_json(self, url_or_request, video_id, note='Downloading JSON metadata'):
response = super(EaglePlatformIE, self)._download_json(url_or_request, video_id, note)
self._handle_error(response)
return response
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host, video_id = mobj.group('custom_host') or mobj.group('host'), mobj.group('id')
player_data = self._download_json(
'http://%s/api/player_data?id=%s' % (host, video_id), video_id)
media = player_data['data']['playlist']['viewports'][0]['medialist'][0]
title = media['title']
description = media.get('description')
thumbnail = media.get('snapshot')
duration = int_or_none(media.get('duration'))
view_count = int_or_none(media.get('views'))
age_restriction = media.get('age_restriction')
age_limit = None
if age_restriction:
age_limit = 0 if age_restriction == 'allow_all' else 18
m3u8_data = self._download_json(
media['sources']['secure_m3u8']['auto'],
video_id, 'Downloading m3u8 JSON')
formats = self._extract_m3u8_formats(
m3u8_data['data'][0], video_id,
'mp4', entry_protocol='m3u8_native')
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'view_count': view_count,
'age_limit': age_limit,
'formats': formats,
}
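Review note: the `_VALID_URL` in this new extractor accepts two spellings of the same video, an internal `eagleplatform:<host>:<id>` form and a regular player URL. The sketch below copies the pattern and shows how either form resolves to a `(host, id)` pair. It only uses the standard `re` module; `split_eagle_url` is a hypothetical helper name for illustration and is not part of the extractor.

```python
import re

# Pattern copied from EaglePlatformIE._VALID_URL in the diff above.
EAGLE_URL_RE = re.compile(r'''(?x)
    (?:
        eagleplatform:(?P<custom_host>[^/]+):|
        https?://(?P<host>.+?\.media\.eagleplatform\.com)/index/player\?.*\brecord_id=
    )
    (?P<id>\d+)
''')


def split_eagle_url(url):
    """Hypothetical helper: return (host, video_id), or None if the URL does not match."""
    mobj = EAGLE_URL_RE.match(url)
    if mobj is None:
        return None
    # Exactly one of the two host groups is set, depending on which form matched.
    host = mobj.group('custom_host') or mobj.group('host')
    return host, mobj.group('id')
```

Both test URLs from the diff go through the same code path after this step, which is presumably why the generic extractor can hand ClipYou embeds over as `eagleplatform:media.clipyou.ru:<id>`.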

View File

@@ -3,7 +3,6 @@ from __future__ import unicode_literals
import json
import random
import re
from .common import InfoExtractor
from ..compat import (
@@ -103,20 +102,23 @@ class EightTracksIE(InfoExtractor):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
json_like = self._search_regex(
r"(?s)PAGE.mix = (.*?);\n", webpage, 'trax information')
data = json.loads(json_like)
data = self._parse_json(
self._search_regex(
r"(?s)PAGE\.mix\s*=\s*({.+?});\n", webpage, 'trax information'),
playlist_id)
session = str(random.randint(0, 1000000000))
mix_id = data['id']
track_count = data['tracks_count']
duration = data['duration']
avg_song_duration = float(duration) / track_count
# duration is sometimes negative, use predefined avg duration
if avg_song_duration <= 0:
avg_song_duration = 300
first_url = 'http://8tracks.com/sets/%s/play?player=sm&mix_id=%s&format=jsonh' % (session, mix_id)
next_url = first_url
entries = []

View File

@@ -35,10 +35,7 @@ class EpornerIE(InfoExtractor):
title = self._html_search_regex(
r'<title>(.*?) - EPORNER', webpage, 'title')
redirect_code = self._html_search_regex(
r'<script type="text/javascript" src="/config5/%s/([a-f\d]+)/">' % video_id,
webpage, 'redirect_code')
redirect_url = 'http://www.eporner.com/config5/%s/%s' % (video_id, redirect_code)
redirect_url = 'http://www.eporner.com/config5/%s' % video_id
player_code = self._download_webpage(
redirect_url, display_id, note='Downloading player config')
@@ -69,5 +66,5 @@ class EpornerIE(InfoExtractor):
'duration': duration,
'view_count': view_count,
'formats': formats,
'age_limit': self._rta_search(webpage),
'age_limit': 18,
}

View File

@@ -3,15 +3,18 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
js_to_json,
parse_duration,
)
class EscapistIE(InfoExtractor):
_VALID_URL = r'https?://?(www\.)?escapistmagazine\.com/videos/view/[^/?#]+/(?P<id>[0-9]+)-[^/?#]*(?:$|[?#])'
_USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko'
_TEST = {
'url': 'http://www.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
'md5': 'ab3a706c681efca53f0a35f1415cf0d1',
@@ -23,12 +26,15 @@ class EscapistIE(InfoExtractor):
'uploader': 'The Escapist Presents',
'title': "Breaking Down Baldur's Gate",
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 264,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage_req = compat_urllib_request.Request(url)
webpage_req.add_header('User-Agent', self._USER_AGENT)
webpage = self._download_webpage(webpage_req, video_id)
uploader_id = self._html_search_regex(
r"<h1\s+class='headline'>\s*<a\s+href='/videos/view/(.*?)'",
@@ -37,31 +43,50 @@ class EscapistIE(InfoExtractor):
r"<h1\s+class='headline'>(.*?)</a>",
webpage, 'uploader', fatal=False)
description = self._html_search_meta('description', webpage)
duration = parse_duration(self._html_search_meta('duration', webpage))
raw_title = self._html_search_meta('title', webpage, fatal=True)
title = raw_title.partition(' : ')[2]
config_url = compat_urllib_parse.unquote(self._html_search_regex(
r'<param\s+name="flashvars"\s+value="config=([^"&]+)', webpage, 'config URL'))
r'''(?x)
(?:
<param\s+name="flashvars".*?\s+value="config=|
flashvars=&quot;config=
)
(https?://[^"&]+)
''',
webpage, 'config URL'))
formats = []
ad_formats = []
def _add_format(name, cfgurl, quality):
def _add_format(name, cfg_url, quality):
cfg_req = compat_urllib_request.Request(cfg_url)
cfg_req.add_header('User-Agent', self._USER_AGENT)
config = self._download_json(
cfgurl, video_id,
cfg_req, video_id,
'Downloading ' + name + ' configuration',
'Unable to download ' + name + ' configuration',
transform_source=js_to_json)
playlist = config['playlist']
video_url = next(
p['url'] for p in playlist
if p.get('eventCategory') == 'Video')
formats.append({
'url': video_url,
'format_id': name,
'quality': quality,
})
for p in playlist:
if p.get('eventCategory') == 'Video':
ar = formats
elif p.get('eventCategory') == 'Video Postroll':
ar = ad_formats
else:
continue
ar.append({
'url': p['url'],
'format_id': name,
'quality': quality,
'http_headers': {
'User-Agent': self._USER_AGENT,
},
})
_add_format('normal', config_url, quality=0)
hq_url = (config_url +
@@ -70,10 +95,12 @@ class EscapistIE(InfoExtractor):
_add_format('hq', hq_url, quality=1)
except ExtractorError:
pass # That's fine, we'll just use normal quality
self._sort_formats(formats)
return {
if '/escapist/sales-marketing/' in formats[-1]['url']:
raise ExtractorError('This IP address has been blocked by The Escapist', expected=True)
res = {
'id': video_id,
'formats': formats,
'uploader': uploader,
@@ -81,4 +108,21 @@ class EscapistIE(InfoExtractor):
'title': title,
'thumbnail': self._og_search_thumbnail(webpage),
'description': description,
'duration': duration,
}
if self._downloader.params.get('include_ads') and ad_formats:
self._sort_formats(ad_formats)
ad_res = {
'id': '%s-ad' % video_id,
'title': '%s (Postroll)' % title,
'formats': ad_formats,
}
return {
'_type': 'playlist',
'entries': [res, ad_res],
'title': title,
'id': video_id,
}
return res

View File

@@ -4,11 +4,11 @@ import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlparse,
compat_parse_qs,
compat_urllib_request,
compat_urllib_parse,
)
from ..utils import (
qualities,
str_to_int,
)
@@ -17,7 +17,7 @@ class ExtremeTubeIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?P<url>extremetube\.com/.*?video/.+?(?P<id>[0-9]+))(?:[/?&]|$)'
_TESTS = [{
'url': 'http://www.extremetube.com/video/music-video-14-british-euro-brit-european-cumshots-swallow-652431',
'md5': '1fb9228f5e3332ec8c057d6ac36f33e0',
'md5': '344d0c6d50e2f16b06e49ca011d8ac69',
'info_dict': {
'id': '652431',
'ext': 'mp4',
@@ -49,19 +49,27 @@ class ExtremeTubeIE(InfoExtractor):
r'Views:\s*</strong>\s*<span>([\d,\.]+)</span>',
webpage, 'view count', fatal=False))
video_url = compat_urllib_parse.unquote(self._html_search_regex(
r'video_url=(.+?)&amp;', webpage, 'video_url'))
path = compat_urllib_parse_urlparse(video_url).path
format = path.split('/')[5].split('_')[:2]
format = "-".join(format)
flash_vars = compat_parse_qs(self._search_regex(
r'<param[^>]+?name="flashvars"[^>]+?value="([^"]+)"', webpage, 'flash vars'))
formats = []
quality = qualities(['180p', '240p', '360p', '480p', '720p', '1080p'])
for k, vals in flash_vars.items():
m = re.match(r'quality_(?P<quality>[0-9]+p)$', k)
if m is not None:
formats.append({
'format_id': m.group('quality'),
'quality': quality(m.group('quality')),
'url': vals[0],
})
self._sort_formats(formats)
return {
'id': video_id,
'title': video_title,
'formats': formats,
'uploader': uploader,
'view_count': view_count,
'url': video_url,
'format': format,
'format_id': format,
'age_limit': 18,
}
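Review note: the new format extraction reads every `quality_<N>p` key out of the flashvars query string and ranks it against a fixed quality list. A minimal standalone sketch of that idea follows. It assumes Python 3's `urllib.parse.parse_qs` in place of `compat_parse_qs`, and the local `qualities` function is a stand-in written for this example (unknown ids rank as -1), not the project's own helper. `formats_from_flashvars` is likewise a hypothetical name.

```python
import re
from urllib.parse import parse_qs


def qualities(quality_ids):
    """Local stand-in: rank a quality id by its position in the list, -1 if unknown."""
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q


def formats_from_flashvars(flash_vars_string):
    """Hypothetical helper: build a format list from a flashvars query string."""
    flash_vars = parse_qs(flash_vars_string)
    quality = qualities(['180p', '240p', '360p', '480p', '720p', '1080p'])
    formats = []
    for key, vals in flash_vars.items():
        m = re.match(r'quality_(?P<quality>[0-9]+p)$', key)
        if m is not None:
            formats.append({
                'format_id': m.group('quality'),
                'quality': quality(m.group('quality')),
                'url': vals[0],
            })
    # Sort worst to best; this ordering is an assumption of the sketch.
    formats.sort(key=lambda f: f['quality'])
    return formats
```

Keys that do not match the `quality_` pattern (titles, ids and so on) are simply ignored, so the approach tolerates extra flashvars.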

View File

@@ -0,0 +1,41 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class FootyRoomIE(InfoExtractor):
_VALID_URL = r'http://footyroom\.com/(?P<id>[^/]+)'
_TEST = {
'url': 'http://footyroom.com/schalke-04-0-2-real-madrid-2015-02/',
'info_dict': {
'id': 'schalke-04-0-2-real-madrid-2015-02',
'title': 'Schalke 04 0 2 Real Madrid',
},
'playlist_count': 3,
}
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
playlist = self._parse_json(
self._search_regex(
r'VideoSelector\.load\((\[.+?\])\);', webpage, 'video selector'),
playlist_id)
playlist_title = self._og_search_title(webpage)
entries = []
for video in playlist:
payload = video.get('payload')
if not payload:
continue
playwire_url = self._search_regex(
r'data-config="([^"]+)"', payload,
'playwire url', default=None)
if playwire_url:
entries.append(self.url_result(playwire_url, 'Playwire'))
return self.playlist_result(entries, playlist_id, playlist_title)

View File

@@ -50,7 +50,6 @@ class FunnyOrDieIE(InfoExtractor):
bitrates.sort()
formats = []
for bitrate in bitrates:
for link in links:
formats.append({
@@ -59,6 +58,13 @@ class FunnyOrDieIE(InfoExtractor):
'vbr': bitrate,
})
subtitles = {}
for src, src_lang in re.findall(r'<track kind="captions" src="([^"]+)" srclang="([^"]+)"', webpage):
subtitles[src_lang] = [{
'ext': src.split('/')[-1],
'url': 'http://www.funnyordie.com%s' % src,
}]
post_json = self._search_regex(
r'fb_post\s*=\s*(\{.*?\});', webpage, 'post details')
post = json.loads(post_json)
@@ -69,4 +75,5 @@ class FunnyOrDieIE(InfoExtractor):
'description': post.get('description'),
'thumbnail': post.get('picture'),
'formats': formats,
'subtitles': subtitles,
}

View File

@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
@@ -31,7 +33,7 @@ class GameStarIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
og_title = self._og_search_title(webpage)
title = og_title.replace(' - Video bei GameStar.de', '').strip()
title = re.sub(r'\s*- Video (bei|-) GameStar\.de$', '', og_title)
url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id

View File

@@ -0,0 +1,38 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class GazetaIE(InfoExtractor):
_VALID_URL = r'(?P<url>https?://(?:www\.)?gazeta\.ru/(?:[^/]+/)?video/(?:(?:main|\d{4}/\d{2}/\d{2})/)?(?P<id>[A-Za-z0-9-_.]+)\.s?html)'
_TESTS = [{
'url': 'http://www.gazeta.ru/video/main/zadaite_vopros_vladislavu_yurevichu.shtml',
'md5': 'd49c9bdc6e5a7888f27475dc215ee789',
'info_dict': {
'id': '205566',
'ext': 'mp4',
'title': '«7080 процентов гражданских в Донецке на грани голода»',
'description': 'md5:38617526050bd17b234728e7f9620a71',
'thumbnail': 're:^https?://.*\.jpg',
},
}, {
'url': 'http://www.gazeta.ru/lifestyle/video/2015/03/08/master-klass_krasivoi_byt._delaem_vesennii_makiyazh.shtml',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('id')
embed_url = '%s?p=embed' % mobj.group('url')
embed_page = self._download_webpage(
embed_url, display_id, 'Downloading embed page')
video_id = self._search_regex(
r'<div[^>]*?class="eagleplayer"[^>]*?data-id="([^"]+)"', embed_page, 'video id')
return self.url_result(
'eagleplatform:gazeta.media.eagleplatform.com:%s' % video_id, 'EaglePlatform')

View File

@@ -12,6 +12,7 @@ from ..utils import remove_end
class GDCVaultIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)/(?P<name>(\w|-)+)'
_NETRC_MACHINE = 'gdcvault'
_TESTS = [
{
'url': 'http://www.gdcvault.com/play/1019721/Doki-Doki-Universe-Sweet-Simple',

View File

@@ -26,6 +26,7 @@ from ..utils import (
unsmuggle_url,
UnsupportedError,
url_basename,
xpath_text,
)
from .brightcove import BrightcoveIE
from .ooyala import OoyalaIE
@@ -526,6 +527,17 @@ class GenericIE(InfoExtractor):
},
'add_ie': ['Viddler'],
},
# Libsyn embed
{
'url': 'http://thedailyshow.cc.com/podcast/episodetwelve',
'info_dict': {
'id': '3377616',
'ext': 'mp3',
'title': "The Daily Show Podcast without Jon Stewart - Episode 12: Bassem Youssef: Egypt's Jon Stewart",
'description': 'md5:601cb790edd05908957dae8aaa866465',
'upload_date': '20150220',
},
},
# jwplayer YouTube
{
'url': 'http://media.nationalarchives.gov.uk/index.php/webinar-using-discovery-national-archives-online-catalogue/',
@@ -557,6 +569,67 @@ class GenericIE(InfoExtractor):
'title': 'EP3S5 - Bon Appétit - Baqueira Mi Corazon !',
}
},
# Kaltura embed
{
'url': 'http://www.monumentalnetwork.com/videos/john-carlson-postgame-2-25-15',
'info_dict': {
'id': '1_eergr3h1',
'ext': 'mp4',
'upload_date': '20150226',
'uploader_id': 'MonumentalSports-Kaltura@perfectsensedigital.com',
'timestamp': int,
'title': 'John Carlson Postgame 2/25/15',
},
},
# Eagle.Platform embed (generic URL)
{
'url': 'http://lenta.ru/news/2015/03/06/navalny/',
'info_dict': {
'id': '227304',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
},
},
# ClipYou (Eagle.Platform) embed (custom URL)
{
'url': 'http://muz-tv.ru/play/7129/',
'info_dict': {
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},
},
# Pladform embed
{
'url': 'http://muz-tv.ru/kinozal/view/7400/',
'info_dict': {
'id': '100183293',
'ext': 'mp4',
'title': 'Тайны перевала Дятлова • Тайна перевала Дятлова 1 серия 2 часть',
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 694,
'age_limit': 0,
},
},
# RSS feed with enclosure
{
'url': 'http://podcastfeeds.nbcnews.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
'info_dict': {
'id': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
'ext': 'm4v',
'upload_date': '20150228',
'title': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
}
}
]
def report_following_redirect(self, new_url):
@@ -568,11 +641,24 @@ class GenericIE(InfoExtractor):
playlist_desc_el = doc.find('./channel/description')
playlist_desc = None if playlist_desc_el is None else playlist_desc_el.text
entries = [{
'_type': 'url',
'url': e.find('link').text,
'title': e.find('title').text,
} for e in doc.findall('./channel/item')]
entries = []
for it in doc.findall('./channel/item'):
next_url = xpath_text(it, 'link', fatal=False)
if not next_url:
enclosure_nodes = it.findall('./enclosure')
for e in enclosure_nodes:
next_url = e.attrib.get('url')
if next_url:
break
if not next_url:
continue
entries.append({
'_type': 'url',
'url': next_url,
'title': it.find('title').text,
})
return {
'_type': 'playlist',
@@ -931,6 +1017,19 @@ class GenericIE(InfoExtractor):
if mobj is not None:
return self.url_result(mobj.group('url'))
# Look for NYTimes player
mobj = re.search(
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//graphics8\.nytimes\.com/bcvideo/[^/]+/iframe/embed\.html.+?)\1>',
webpage)
if mobj is not None:
return self.url_result(mobj.group('url'))
# Look for Libsyn player
mobj = re.search(
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//html5-player\.libsyn\.com/embed/.+?)\1', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'))
# Look for Ooyala videos
mobj = (re.search(r'player\.ooyala\.com/[^"?]+\?[^"]*?(?:embedCode|ec)=(?P<ec>[^"&]+)', webpage) or
re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage) or
@@ -1113,6 +1212,30 @@ class GenericIE(InfoExtractor):
if mobj is not None:
return self.url_result(mobj.group('url'), 'Zapiks')
# Look for Kaltura embeds
mobj = re.search(
r"(?s)kWidget\.(?:thumb)?[Ee]mbed\(\{.*?'wid'\s*:\s*'_?(?P<partner_id>[^']+)',.*?'entry_id'\s*:\s*'(?P<id>[^']+)',", webpage)
if mobj is not None:
return self.url_result('kaltura:%(partner_id)s:%(id)s' % mobj.groupdict(), 'Kaltura')
# Look for Eagle.Platform embeds
mobj = re.search(
r'<iframe[^>]+src="(?P<url>https?://.+?\.media\.eagleplatform\.com/index/player\?.+?)"', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'), 'EaglePlatform')
# Look for ClipYou (uses Eagle.Platform) embeds
mobj = re.search(
r'<iframe[^>]+src="https?://(?P<host>media\.clipyou\.ru)/index/player\?.*\brecord_id=(?P<id>\d+).*"', webpage)
if mobj is not None:
return self.url_result('eagleplatform:%(host)s:%(id)s' % mobj.groupdict(), 'EaglePlatform')
# Look for Pladform embeds
mobj = re.search(
r'<iframe[^>]+src="(?P<url>https?://out\.pladform\.ru/player\?.+?)"', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'), 'Pladform')
def check_video(vurl):
if YoutubeIE.suitable(vurl):
return True
@@ -1169,10 +1292,16 @@ class GenericIE(InfoExtractor):
# HTML5 video
found = re.findall(r'(?s)<video[^<]*(?:>.*?<source[^>]*)?\s+src=["\'](.*?)["\']', webpage)
if not found:
REDIRECT_REGEX = r'[0-9]{,2};\s*(?:URL|url)=\'?([^\'"]+)'
found = re.search(
r'(?i)<meta\s+(?=(?:[a-z-]+="[^"]+"\s+)*http-equiv="refresh")'
r'(?:[a-z-]+="[^"]+"\s+)*?content="[0-9]{,2};url=\'?([^\'"]+)',
r'(?:[a-z-]+="[^"]+"\s+)*?content="%s' % REDIRECT_REGEX,
webpage)
if not found:
# Look also in Refresh HTTP header
refresh_header = head_response.headers.get('Refresh')
if refresh_header:
found = re.search(REDIRECT_REGEX, refresh_header)
if found:
new_url = found.group(1)
self.report_following_redirect(new_url)
@@ -1208,7 +1337,9 @@ class GenericIE(InfoExtractor):
return entries[0]
else:
for num, e in enumerate(entries, start=1):
e['title'] = '%s (%d)' % (e['title'], num)
# 'url' results don't have a title
if e.get('title') is not None:
e['title'] = '%s (%d)' % (e['title'], num)
return {
'_type': 'playlist',
'entries': entries,
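Review note: the RSS change in this file falls back to the first `<enclosure url="...">` when an item has no `<link>`, and skips items that have neither. The sketch below reproduces that fallback with the standard library's `xml.etree.ElementTree` only. `rss_entries` is a hypothetical name, and unlike the diff it tolerates a missing `<title>` (an assumption made for the example, not behaviour of the patched code).

```python
import xml.etree.ElementTree as ET


def rss_entries(rss_text):
    """Hypothetical helper: list playlist entries from an RSS document string."""
    doc = ET.fromstring(rss_text)
    entries = []
    for item in doc.findall('./channel/item'):
        link_el = item.find('link')
        next_url = link_el.text if link_el is not None else None
        if not next_url:
            # No usable <link>: take the first enclosure that carries a url attribute.
            for enclosure in item.findall('./enclosure'):
                next_url = enclosure.attrib.get('url')
                if next_url:
                    break
        if not next_url:
            continue  # nothing to follow for this item
        title_el = item.find('title')
        entries.append({
            '_type': 'url',
            'url': next_url,
            'title': title_el.text if title_el is not None else None,
        })
    return entries
```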

View File

@@ -20,7 +20,7 @@ class GloboIE(InfoExtractor):
_VALID_URL = 'https?://.+?\.globo\.com/(?P<id>.+)'
_API_URL_TEMPLATE = 'http://api.globovideos.com/videos/%s/playlist'
_SECURITY_URL_TEMPLATE = 'http://security.video.globo.com/videos/%s/hash?player=flash&version=2.9.9.50&resource_id=%s'
_SECURITY_URL_TEMPLATE = 'http://security.video.globo.com/videos/%s/hash?player=flash&version=17.0.0.132&resource_id=%s'
_VIDEOID_REGEXES = [
r'\bdata-video-id="(\d+)"',

View File

@@ -140,9 +140,9 @@ class GroovesharkIE(InfoExtractor):
if webpage is not None:
o = GroovesharkHtmlParser.extract_object_tags(webpage)
return (webpage, [x for x in o if x['attrs']['id'] == 'jsPlayerEmbed'])
return webpage, [x for x in o if x['attrs']['id'] == 'jsPlayerEmbed']
return (webpage, None)
return webpage, None
def _real_initialize(self):
self.ts = int(time.time() * 1000) # timestamp in millis
@@ -154,7 +154,7 @@ class GroovesharkIE(InfoExtractor):
swf_referer = None
if self.do_playerpage_request:
(_, player_objs) = self._get_playerpage(url)
if player_objs is not None:
if player_objs:
swf_referer = self._build_swf_referer(url, player_objs[0])
self.to_screen('SWF Referer: %s' % swf_referer)

View File

@@ -2,7 +2,6 @@
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
@@ -15,10 +14,10 @@ class JeuxVideoIE(InfoExtractor):
'url': 'http://www.jeuxvideo.com/reportages-videos-jeux/0004/00046170/tearaway-playstation-vita-gc-2013-tearaway-nous-presente-ses-papiers-d-identite-00115182.htm',
'md5': '046e491afb32a8aaac1f44dd4ddd54ee',
'info_dict': {
'id': '5182',
'id': '114765',
'ext': 'mp4',
'title': 'GC 2013 : Tearaway nous présente ses papiers d\'identité',
'description': 'Lorsque les développeurs de LittleBigPlanet proposent un nouveau titre, on ne peut que s\'attendre à un résultat original et fort attrayant.\n',
'title': 'Tearaway : GC 2013 : Tearaway nous présente ses papiers d\'identité',
'description': 'Lorsque les développeurs de LittleBigPlanet proposent un nouveau titre, on ne peut que s\'attendre à un résultat original et fort attrayant.',
},
}
@@ -26,26 +25,29 @@ class JeuxVideoIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url)
title = mobj.group(1)
webpage = self._download_webpage(url, title)
xml_link = self._html_search_regex(
r'<param name="flashvars" value="config=(.*?)" />',
title = self._html_search_meta('name', webpage)
config_url = self._html_search_regex(
r'data-src="(/contenu/medias/video.php.*?)"',
webpage, 'config URL')
config_url = 'http://www.jeuxvideo.com' + config_url
video_id = self._search_regex(
r'http://www\.jeuxvideo\.com/config/\w+/\d+/(.*?)/\d+_player\.xml',
xml_link, 'video ID')
r'id=(\d+)',
config_url, 'video ID')
config = self._download_xml(
xml_link, title, 'Downloading XML config')
info_json = config.find('format.json').text
info = json.loads(info_json)['versions'][0]
config = self._download_json(
config_url, title, 'Downloading JSON config')
video_url = 'http://video720.jeuxvideo.com/' + info['file']
formats = [{
'url': source['file'],
'format_id': source['label'],
'resolution': source['label'],
} for source in reversed(config['sources'])]
return {
'id': video_id,
'title': config.find('titre_video').text,
'ext': 'mp4',
'url': video_url,
'title': title,
'formats': formats,
'description': self._og_search_description(webpage),
'thumbnail': config.find('image').text,
'thumbnail': config.get('image'),
}

View File

@@ -0,0 +1,138 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import (
ExtractorError,
int_or_none,
)
class KalturaIE(InfoExtractor):
_VALID_URL = r'''(?x)
(?:kaltura:|
https?://(?:(?:www|cdnapisec)\.)?kaltura\.com/index\.php/kwidget/(?:[^/]+/)*?wid/_
)(?P<partner_id>\d+)
(?::|
/(?:[^/]+/)*?entry_id/
)(?P<id>[0-9a-z_]+)'''
_API_BASE = 'http://cdnapi.kaltura.com/api_v3/index.php?'
_TESTS = [
{
'url': 'kaltura:269692:1_1jc2y3e4',
'md5': '3adcbdb3dcc02d647539e53f284ba171',
'info_dict': {
'id': '1_1jc2y3e4',
'ext': 'mp4',
'title': 'Track 4',
'upload_date': '20131219',
'uploader_id': 'mlundberg@wolfgangsvault.com',
'description': 'The Allman Brothers Band, 12/16/1981',
'thumbnail': 're:^https?://.*/thumbnail/.*',
'timestamp': int,
},
},
{
'url': 'http://www.kaltura.com/index.php/kwidget/cache_st/1300318621/wid/_269692/uiconf_id/3873291/entry_id/1_1jc2y3e4',
'only_matching': True,
},
{
'url': 'https://cdnapisec.kaltura.com/index.php/kwidget/wid/_557781/uiconf_id/22845202/entry_id/1_plr1syf3',
'only_matching': True,
},
]
def _kaltura_api_call(self, video_id, actions, *args, **kwargs):
params = actions[0]
if len(actions) > 1:
for i, a in enumerate(actions[1:], start=1):
for k, v in a.items():
params['%d:%s' % (i, k)] = v
query = compat_urllib_parse.urlencode(params)
url = self._API_BASE + query
data = self._download_json(url, video_id, *args, **kwargs)
status = data if len(actions) == 1 else data[0]
if status.get('objectType') == 'KalturaAPIException':
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, status['message']))
return data
def _get_kaltura_signature(self, video_id, partner_id):
actions = [{
'apiVersion': '3.1',
'expiry': 86400,
'format': 1,
'service': 'session',
'action': 'startWidgetSession',
'widgetId': '_%s' % partner_id,
}]
return self._kaltura_api_call(
video_id, actions, note='Downloading Kaltura signature')['ks']
def _get_video_info(self, video_id, partner_id):
signature = self._get_kaltura_signature(video_id, partner_id)
actions = [
{
'action': 'null',
'apiVersion': '3.1.5',
'clientTag': 'kdp:v3.8.5',
'format': 1, # JSON, 2 = XML, 3 = PHP
'service': 'multirequest',
'ks': signature,
},
{
'action': 'get',
'entryId': video_id,
'service': 'baseentry',
'version': '-1',
},
{
'action': 'getContextData',
'contextDataParams:objectType': 'KalturaEntryContextDataParams',
'contextDataParams:referrer': 'http://www.kaltura.com/',
'contextDataParams:streamerType': 'http',
'entryId': video_id,
'service': 'baseentry',
},
]
return self._kaltura_api_call(
video_id, actions, note='Downloading video info JSON')
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
partner_id, entry_id = mobj.group('partner_id'), mobj.group('id')
info, source_data = self._get_video_info(entry_id, partner_id)
formats = [{
'format_id': '%(fileExt)s-%(bitrate)s' % f,
'ext': f['fileExt'],
'tbr': f['bitrate'],
'fps': f.get('frameRate'),
'filesize_approx': int_or_none(f.get('size'), invscale=1024),
'container': f.get('containerFormat'),
'vcodec': f.get('videoCodecId'),
'height': f.get('height'),
'width': f.get('width'),
'url': '%s/flavorId/%s' % (info['dataUrl'], f['id']),
} for f in source_data['flavorAssets']]
self._sort_formats(formats)
return {
'id': video_id,
'title': info['name'],
'formats': formats,
'description': info.get('description'),
'thumbnail': info.get('thumbnailUrl'),
'duration': info.get('duration'),
'timestamp': info.get('createdAt'),
'uploader_id': info.get('userId'),
'view_count': info.get('plays'),
}
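Review note: `_kaltura_api_call` packs several actions into one request by keeping the first action's keys as they are and prefixing the keys of every following action with its index (`1:service`, `2:action`, and so on). The sketch below isolates that flattening step. It assumes Python 3's `urllib.parse.urlencode` in place of `compat_urllib_parse.urlencode`, `build_multirequest_query` is a hypothetical name, and unlike the diff it copies the first dict so the caller's data is not mutated.

```python
from urllib.parse import urlencode


def build_multirequest_query(actions):
    """Hypothetical helper: flatten a list of action dicts into one query string."""
    params = dict(actions[0])  # copy, so the caller's first action is left untouched
    for i, action in enumerate(actions[1:], start=1):
        for key, value in action.items():
            # Keys of the i-th sub-request are namespaced as "<i>:<key>".
            params['%d:%s' % (i, key)] = value
    return urlencode(params)
```

With a single action the loop body never runs, which matches the signature request in the diff where only one dict is passed.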

View File

@@ -0,0 +1,96 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
)
class KanalPlayIE(InfoExtractor):
IE_DESC = 'Kanal 5/9/11 Play'
_VALID_URL = r'https?://(?:www\.)?kanal(?P<channel_id>5|9|11)play\.se/(?:#!/)?(?:play/)?program/\d+/video/(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.kanal5play.se/#!/play/program/3060212363/video/3270012277',
'info_dict': {
'id': '3270012277',
'ext': 'flv',
'title': 'Saknar både dusch och avlopp',
'description': 'md5:6023a95832a06059832ae93bc3c7efb7',
'duration': 2636.36,
},
'params': {
# rtmp download
'skip_download': True,
}
}, {
'url': 'http://www.kanal9play.se/#!/play/program/335032/video/246042',
'only_matching': True,
}, {
'url': 'http://www.kanal11play.se/#!/play/program/232835958/video/367135199',
'only_matching': True,
}]
def _fix_subtitles(self, subs):
return '\r\n\r\n'.join(
'%s\r\n%s --> %s\r\n%s'
% (
num,
self._subtitles_timecode(item['startMillis'] / 1000.0),
self._subtitles_timecode(item['endMillis'] / 1000.0),
item['text'],
) for num, item in enumerate(subs, 1))
def _get_subtitles(self, channel_id, video_id):
subs = self._download_json(
'http://www.kanal%splay.se/api/subtitles/%s' % (channel_id, video_id),
video_id, 'Downloading subtitles JSON', fatal=False)
return {'se': [{'ext': 'srt', 'data': self._fix_subtitles(subs)}]} if subs else {}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
channel_id = mobj.group('channel_id')
video = self._download_json(
'http://www.kanal%splay.se/api/getVideo?format=FLASH&videoId=%s' % (channel_id, video_id),
video_id)
reasons_for_no_streams = video.get('reasonsForNoStreams')
if reasons_for_no_streams:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, '\n'.join(reasons_for_no_streams)),
expected=True)
title = video['title']
description = video.get('description')
duration = float_or_none(video.get('length'), 1000)
thumbnail = video.get('posterUrl')
stream_base_url = video['streamBaseUrl']
formats = [{
'url': stream_base_url,
'play_path': stream['source'],
'ext': 'flv',
'tbr': float_or_none(stream.get('bitrate'), 1000),
'rtmp_real_time': True,
} for stream in video['streams']]
self._sort_formats(formats)
subtitles = {}
if video.get('hasSubtitle'):
subtitles = self.extract_subtitles(channel_id, video_id)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
'subtitles': subtitles,
}
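Review note: `_fix_subtitles` turns the JSON cue list (`startMillis`, `endMillis`, `text`) into numbered SRT-style blocks joined by blank lines. The sketch below shows the same assembly in isolation. It is a deliberate variation, not the code in the diff: the timecode is computed from integer milliseconds with `divmod` to sidestep float rounding, while the dot before the milliseconds is kept as in the diff. Both function names are hypothetical.

```python
def timecode_from_millis(millis):
    """Hypothetical helper: format integer milliseconds as HH:MM:SS.mmm."""
    seconds, ms = divmod(int(millis), 1000)
    minutes, secs = divmod(seconds, 60)
    hours, mins = divmod(minutes, 60)
    return '%02d:%02d:%02d.%03d' % (hours, mins, secs, ms)


def cues_to_srt(cues):
    """Hypothetical helper: join cue dicts into numbered blocks separated by blank lines."""
    return '\r\n\r\n'.join(
        '%s\r\n%s --> %s\r\n%s' % (
            num,
            timecode_from_millis(cue['startMillis']),
            timecode_from_millis(cue['endMillis']),
            cue['text'],
        ) for num, cue in enumerate(cues, 1))
```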

View File

@@ -40,8 +40,10 @@ class KrasViewIE(InfoExtractor):
description = self._og_search_description(webpage, default=None)
thumbnail = flashvars.get('image') or self._og_search_thumbnail(webpage)
duration = int_or_none(flashvars.get('duration'))
width = int_or_none(self._og_search_property('video:width', webpage, 'video width'))
height = int_or_none(self._og_search_property('video:height', webpage, 'video height'))
width = int_or_none(self._og_search_property(
'video:width', webpage, 'video width', default=None))
height = int_or_none(self._og_search_property(
'video:height', webpage, 'video height', default=None))
return {
'id': video_id,

View File

@@ -27,8 +27,6 @@ class Laola1TvIE(InfoExtractor):
}
}
_BROKEN = True # Not really - extractor works fine, but f4m downloader does not support live streams yet.
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
@@ -57,11 +55,7 @@ class Laola1TvIE(InfoExtractor):
title = xpath_text(hd_doc, './/video/title', fatal=True)
flash_url = xpath_text(hd_doc, './/video/url', fatal=True)
uploader = xpath_text(hd_doc, './/video/meta_organistation')
is_live = xpath_text(hd_doc, './/video/islive') == 'true'
if is_live:
raise ExtractorError(
'Live streams are not supported by the f4m downloader.')
categories = xpath_text(hd_doc, './/video/meta_sports')
if categories:

View File

@@ -0,0 +1,207 @@
# coding: utf-8
from __future__ import unicode_literals
import datetime
import re
import time
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
compat_urlparse,
)
from ..utils import (
determine_ext,
ExtractorError,
parse_iso8601,
)
class LetvIE(InfoExtractor):
_VALID_URL = r'http://www\.letv\.com/ptv/vplay/(?P<id>\d+)\.html'
_TESTS = [{
'url': 'http://www.letv.com/ptv/vplay/22005890.html',
'md5': 'cab23bd68d5a8db9be31c9a222c1e8df',
'info_dict': {
'id': '22005890',
'ext': 'mp4',
'title': '第87届奥斯卡颁奖礼完美落幕 《鸟人》成最大赢家',
'timestamp': 1424747397,
'upload_date': '20150224',
'description': 'md5:a9cb175fd753e2962176b7beca21a47c',
}
}, {
'url': 'http://www.letv.com/ptv/vplay/1415246.html',
'info_dict': {
'id': '1415246',
'ext': 'mp4',
'title': '美人天下01',
'description': 'md5:f88573d9d7225ada1359eaf0dbf8bcda',
},
}, {
'note': 'This video is available only in Mainland China, thus a proxy is needed',
'url': 'http://www.letv.com/ptv/vplay/1118082.html',
'md5': 'f80936fbe20fb2f58648e81386ff7927',
'info_dict': {
'id': '1118082',
'ext': 'mp4',
'title': '与龙共舞 完整版',
'description': 'md5:7506a5eeb1722bb9d4068f85024e3986',
},
'params': {
'cn_verification_proxy': 'http://proxy.uku.im:8888'
},
}]
@staticmethod
def urshift(val, n):
return val >> n if val >= 0 else (val + 0x100000000) >> n
# ror() and calc_time_key() are reverse-engineered from an SWF file embedded in KLetvPlayer.swf
def ror(self, param1, param2):
_loc3_ = 0
while _loc3_ < param2:
param1 = self.urshift(param1, 1) + ((param1 & 1) << 31)
_loc3_ += 1
return param1
def calc_time_key(self, param1):
_loc2_ = 773625421
_loc3_ = self.ror(param1, _loc2_ % 13)
_loc3_ = _loc3_ ^ _loc2_
_loc3_ = self.ror(_loc3_, _loc2_ % 17)
return _loc3_
def _real_extract(self, url):
media_id = self._match_id(url)
page = self._download_webpage(url, media_id)
params = {
'id': media_id,
'platid': 1,
'splatid': 101,
'format': 1,
'tkey': self.calc_time_key(int(time.time())),
'domain': 'www.letv.com'
}
play_json_req = compat_urllib_request.Request(
'http://api.letv.com/mms/out/video/playJson?' + compat_urllib_parse.urlencode(params)
)
cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
if cn_verification_proxy:
play_json_req.add_header('Ytdl-request-proxy', cn_verification_proxy)
play_json = self._download_json(
play_json_req,
media_id, 'Downloading playJson data')
# Check for errors
playstatus = play_json['playstatus']
if playstatus['status'] == 0:
flag = playstatus['flag']
if flag == 1:
msg = 'Country %s auth error' % playstatus['country']
else:
msg = 'Generic error. flag = %d' % flag
raise ExtractorError(msg, expected=True)
playurl = play_json['playurl']
formats = ['350', '1000', '1300', '720p', '1080p']
dispatch = playurl['dispatch']
urls = []
for format_id in formats:
if format_id in dispatch:
media_url = playurl['domain'][0] + dispatch[format_id][0]
# Mimic what flvxz.com does
url_parts = list(compat_urlparse.urlparse(media_url))
qs = dict(compat_urlparse.parse_qs(url_parts[4]))
qs.update({
'platid': '14',
'splatid': '1401',
'tss': 'no',
'retry': 1
})
url_parts[4] = compat_urllib_parse.urlencode(qs)
media_url = compat_urlparse.urlunparse(url_parts)
url_info_dict = {
'url': media_url,
'ext': determine_ext(dispatch[format_id][1]),
'format_id': format_id,
}
if format_id[-1:] == 'p':
url_info_dict['height'] = int(format_id[:-1])
urls.append(url_info_dict)
publish_time = parse_iso8601(self._html_search_regex(
r'发布时间&nbsp;([^<>]+) ', page, 'publish time', default=None),
delimiter=' ', timezone=datetime.timedelta(hours=8))
description = self._html_search_meta('description', page, fatal=False)
return {
'id': media_id,
'formats': urls,
'title': playurl['title'],
'thumbnail': playurl['pic'],
'description': description,
'timestamp': publish_time,
}
class LetvTvIE(InfoExtractor):
_VALID_URL = r'http://www\.letv\.com/tv/(?P<id>\d+)\.html'
_TESTS = [{
'url': 'http://www.letv.com/tv/46177.html',
'info_dict': {
'id': '46177',
'title': '美人天下',
'description': 'md5:395666ff41b44080396e59570dbac01c'
},
'playlist_count': 35
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
page = self._download_webpage(url, playlist_id)
media_urls = list(set(re.findall(
r'http://www.letv.com/ptv/vplay/\d+.html', page)))
entries = [self.url_result(media_url, ie='Letv')
for media_url in media_urls]
title = self._html_search_meta('keywords', page,
fatal=False).split('，')[0]
description = self._html_search_meta('description', page, fatal=False)
return self.playlist_result(entries, playlist_id, playlist_title=title,
playlist_description=description)
class LetvPlaylistIE(LetvTvIE):
_VALID_URL = r'http://tv\.letv\.com/[a-z]+/(?P<id>[a-z]+)/index\.s?html'
_TESTS = [{
'url': 'http://tv.letv.com/izt/wuzetian/index.html',
'info_dict': {
'id': 'wuzetian',
'title': '武媚娘传奇',
'description': 'md5:e12499475ab3d50219e5bba00b3cb248'
},
# This playlist contains some extra videos other than the drama itself
'playlist_mincount': 96
}, {
'url': 'http://tv.letv.com/pzt/lswjzzjc/index.shtml',
'info_dict': {
'id': 'lswjzzjc',
# The title should be "劲舞青春", but I can't find a simple way to
# determine the playlist title
'title': '乐视午间自制剧场',
'description': 'md5:b1eef244f45589a7b5b1af9ff25a4489'
},
'playlist_mincount': 7
}]
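
Review note on the `tkey` calculation in `LetvIE`: `ror()` rotates a 32-bit value right one bit at a time, and `calc_time_key()` applies it twice around an XOR with a fixed constant. The standalone sketch below restates that logic outside the extractor so it can be checked in isolation. `ror_closed_form` is a hypothetical helper that is not part of the diff; it is included only to show that the loop appears to match an ordinary 32-bit rotate for non-negative inputs.

```python
def urshift(val, n):
    """Unsigned right shift of a 32-bit value (ActionScript's >>>)."""
    return val >> n if val >= 0 else (val + 0x100000000) >> n


def ror(value, count):
    """Rotate a 32-bit value right by `count` bits, one bit per step."""
    for _ in range(count):
        value = urshift(value, 1) + ((value & 1) << 31)
    return value


def ror_closed_form(value, count):
    """Hypothetical helper (not in the diff): the same rotation without a loop."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def calc_time_key(timestamp):
    """Derive the request key from a Unix timestamp, as in the diff."""
    magic = 773625421
    key = ror(timestamp, magic % 13)
    key ^= magic
    return ror(key, magic % 17)
```

If the two rotations agree, the loop could be replaced by the closed form, although keeping the loop stays closer to the decompiled player code.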

View File

@@ -0,0 +1,59 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import unified_strdate
class LibsynIE(InfoExtractor):
_VALID_URL = r'https?://html5-player\.libsyn\.com/embed/episode/id/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://html5-player.libsyn.com/embed/episode/id/3377616/',
'md5': '443360ee1b58007bc3dcf09b41d093bb',
'info_dict': {
'id': '3377616',
'ext': 'mp3',
'title': "The Daily Show Podcast without Jon Stewart - Episode 12: Bassem Youssef: Egypt's Jon Stewart",
'description': 'md5:601cb790edd05908957dae8aaa866465',
'upload_date': '20150220',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
formats = [{
'url': media_url,
} for media_url in set(re.findall(r'var\s+mediaURL(?:Libsyn)?\s*=\s*"([^"]+)"', webpage))]
podcast_title = self._search_regex(
r'<h2>([^<]+)</h2>', webpage, 'title')
episode_title = self._search_regex(
r'<h3>([^<]+)</h3>', webpage, 'title', default=None)
title = '%s - %s' % (podcast_title, episode_title) if podcast_title else episode_title
description = self._html_search_regex(
r'<div id="info_text_body">(.+?)</div>', webpage,
'description', fatal=False)
thumbnail = self._search_regex(
r'<img[^>]+class="info-show-icon"[^>]+src="([^"]+)"',
webpage, 'thumbnail', fatal=False)
release_date = unified_strdate(self._search_regex(
r'<div class="release_date">Released: ([^<]+)<', webpage, 'release date', fatal=False))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': release_date,
'formats': formats,
}

View File

@@ -2,6 +2,7 @@ from __future__ import unicode_literals
import re
import json
import itertools
from .common import InfoExtractor
from ..compat import (
@@ -40,6 +41,13 @@ class LivestreamIE(InfoExtractor):
'id': '2245590',
},
'playlist_mincount': 4,
}, {
'url': 'http://new.livestream.com/chess24/tatasteelchess',
'info_dict': {
'title': 'Tata Steel Chess',
'id': '3705884',
},
'playlist_mincount': 60,
}, {
'url': 'https://new.livestream.com/accounts/362/events/3557232/videos/67864563/player?autoPlay=false&height=360&mute=false&width=640',
'only_matching': True,
@@ -117,6 +125,30 @@ class LivestreamIE(InfoExtractor):
'view_count': video_data.get('views'),
}
def _extract_event(self, info):
event_id = compat_str(info['id'])
account = compat_str(info['owner_account_id'])
root_url = (
'https://new.livestream.com/api/accounts/{account}/events/{event}/'
'feed.json'.format(account=account, event=event_id))
def _extract_videos():
last_video = None
for i in itertools.count(1):
if last_video is None:
info_url = root_url
else:
info_url = '{root}?&id={id}&newer=-1&type=video'.format(
root=root_url, id=last_video)
videos_info = self._download_json(info_url, event_id, 'Downloading page {0}'.format(i))['data']
videos_info = [v['data'] for v in videos_info if v['type'] == 'video']
if not videos_info:
break
for v in videos_info:
yield self._extract_video_info(v)
last_video = videos_info[-1]['id']
return self.playlist_result(_extract_videos(), event_id, info['full_name'])
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
@@ -144,14 +176,13 @@ class LivestreamIE(InfoExtractor):
result = result and compat_str(vdata['data']['id']) == vid
return result
videos = [self._extract_video_info(video_data['data'])
for video_data in info['feed']['data']
if is_relevant(video_data, video_id)]
if video_id is None:
# This is an event page:
return self.playlist_result(
videos, '%s' % info['id'], info['full_name'])
return self._extract_event(info)
else:
videos = [self._extract_video_info(video_data['data'])
for video_data in info['feed']['data']
if is_relevant(video_data, video_id)]
if not videos:
raise ExtractorError('Cannot find video %s' % video_id)
return videos[0]
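
Review note on `_extract_event`: the new code pages through the event feed by passing the id of the last video seen and stops at the first page that contains no videos. The sketch below isolates that loop. `fetch_page` and the sample `FEED` are hypothetical stand-ins for the `_download_json` call and the real API response, whose exact shape is assumed here from the diff alone.

```python
import itertools


def iter_videos(fetch_page):
    """Yield video entries page by page until a page has no videos."""
    last_video = None
    for page_number in itertools.count(1):
        items = fetch_page(last_video, page_number)
        # Keep only feed items of type 'video', as the diff does.
        videos = [item['data'] for item in items if item['type'] == 'video']
        if not videos:
            break
        for video in videos:
            yield video
        # The next request asks for entries older than this id.
        last_video = videos[-1]['id']


# Hypothetical feed, keyed by the id passed as `last_video`.
FEED = {
    None: [
        {'type': 'video', 'data': {'id': 3}},
        {'type': 'image', 'data': {'id': 99}},
        {'type': 'video', 'data': {'id': 2}},
    ],
    2: [{'type': 'video', 'data': {'id': 1}}],
    1: [],
}

video_ids = [v['id'] for v in iter_videos(lambda last, page: FEED[last])]
```

Because the result is a generator, the playlist entries are produced lazily, so nothing is requested until the entries are consumed.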

View File

@@ -52,6 +52,7 @@ class LRTIE(InfoExtractor):
'url': data['streamer'],
'play_path': 'mp4:%s' % data['file'],
'preference': -1,
'rtmp_real_time': True,
})
else:
formats.extend(

View File

@@ -15,19 +15,74 @@ from ..utils import (
)
class LyndaIE(InfoExtractor):
IE_NAME = 'lynda'
IE_DESC = 'lynda.com videos'
_VALID_URL = r'https?://www\.lynda\.com/[^/]+/[^/]+/\d+/(\d+)-\d\.html'
class LyndaBaseIE(InfoExtractor):
_LOGIN_URL = 'https://www.lynda.com/login/login.aspx'
_SUCCESSFUL_LOGIN_REGEX = r'isLoggedIn: true'
_ACCOUNT_CREDENTIALS_HINT = 'Use --username and --password options to provide lynda.com account credentials.'
_NETRC_MACHINE = 'lynda'
def _real_initialize(self):
self._login()
def _login(self):
(username, password) = self._get_login_info()
if username is None:
return
login_form = {
'username': username,
'password': password,
'remember': 'false',
'stayPut': 'false'
}
request = compat_urllib_request.Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form))
login_page = self._download_webpage(
request, None, 'Logging in as %s' % username)
# Not (yet) logged in
m = re.search(r'loginResultJson = \'(?P<json>[^\']+)\';', login_page)
if m is not None:
response = m.group('json')
response_json = json.loads(response)
state = response_json['state']
if state == 'notlogged':
raise ExtractorError(
'Unable to login, incorrect username and/or password',
expected=True)
# This is when we get popup:
# > You're already logged in to lynda.com on two devices.
# > If you log in here, we'll log you out of another device.
# So, we need to confirm this.
if state == 'conflicted':
confirm_form = {
'username': '',
'password': '',
'resolve': 'true',
'remember': 'false',
'stayPut': 'false',
}
request = compat_urllib_request.Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(confirm_form))
login_page = self._download_webpage(
request, None,
'Confirming log in and log out from another device')
if re.search(self._SUCCESSFUL_LOGIN_REGEX, login_page) is None:
raise ExtractorError('Unable to log in')
class LyndaIE(LyndaBaseIE):
IE_NAME = 'lynda'
IE_DESC = 'lynda.com videos'
_VALID_URL = r'https?://www\.lynda\.com/(?:[^/]+/[^/]+/\d+|player/embed)/(?P<id>\d+)'
_NETRC_MACHINE = 'lynda'
_SUCCESSFUL_LOGIN_REGEX = r'isLoggedIn: true'
_TIMECODE_REGEX = r'\[(?P<timecode>\d+:\d+:\d+[\.,]\d+)\]'
ACCOUNT_CREDENTIALS_HINT = 'Use --username and --password options to provide lynda.com account credentials.'
_TEST = {
_TESTS = [{
'url': 'http://www.lynda.com/Bootstrap-tutorials/Using-exercise-files/110885/114408-4.html',
'md5': 'ecfc6862da89489161fb9cd5f5a6fac1',
'info_dict': {
@@ -36,25 +91,27 @@ class LyndaIE(InfoExtractor):
'title': 'Using the exercise files',
'duration': 68
}
}
def _real_initialize(self):
self._login()
}, {
'url': 'https://www.lynda.com/player/embed/133770?tr=foo=1;bar=g;fizz=rt&fs=0',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
video_id = self._match_id(url)
page = self._download_webpage('http://www.lynda.com/ajax/player?videoId=%s&type=video' % video_id, video_id,
'Downloading video JSON')
page = self._download_webpage(
'http://www.lynda.com/ajax/player?videoId=%s&type=video' % video_id,
video_id, 'Downloading video JSON')
video_json = json.loads(page)
if 'Status' in video_json:
raise ExtractorError('lynda returned error: %s' % video_json['Message'], expected=True)
raise ExtractorError(
'lynda returned error: %s' % video_json['Message'], expected=True)
if video_json['HasAccess'] is False:
raise ExtractorError(
'Video %s is only available for members. ' % video_id + self.ACCOUNT_CREDENTIALS_HINT, expected=True)
'Video %s is only available for members. '
% video_id + self._ACCOUNT_CREDENTIALS_HINT, expected=True)
video_id = compat_str(video_json['ID'])
duration = video_json['DurationInSeconds']
@@ -97,50 +154,9 @@ class LyndaIE(InfoExtractor):
'formats': formats
}
def _login(self):
(username, password) = self._get_login_info()
if username is None:
return
login_form = {
'username': username,
'password': password,
'remember': 'false',
'stayPut': 'false'
}
request = compat_urllib_request.Request(self._LOGIN_URL, compat_urllib_parse.urlencode(login_form))
login_page = self._download_webpage(request, None, 'Logging in as %s' % username)
# Not (yet) logged in
m = re.search(r'loginResultJson = \'(?P<json>[^\']+)\';', login_page)
if m is not None:
response = m.group('json')
response_json = json.loads(response)
state = response_json['state']
if state == 'notlogged':
raise ExtractorError('Unable to login, incorrect username and/or password', expected=True)
# This is when we get popup:
# > You're already logged in to lynda.com on two devices.
# > If you log in here, we'll log you out of another device.
# So, we need to confirm this.
if state == 'conflicted':
confirm_form = {
'username': '',
'password': '',
'resolve': 'true',
'remember': 'false',
'stayPut': 'false',
}
request = compat_urllib_request.Request(self._LOGIN_URL, compat_urllib_parse.urlencode(confirm_form))
login_page = self._download_webpage(request, None, 'Confirming log in and log out from another device')
if re.search(self._SUCCESSFUL_LOGIN_REGEX, login_page) is None:
raise ExtractorError('Unable to log in')
def _fix_subtitles(self, subs):
srt = ''
seq_counter = 0
for pos in range(0, len(subs) - 1):
seq_current = subs[pos]
m_current = re.match(self._TIMECODE_REGEX, seq_current['Timecode'])
@@ -152,8 +168,10 @@ class LyndaIE(InfoExtractor):
continue
appear_time = m_current.group('timecode')
disappear_time = m_next.group('timecode')
text = seq_current['Caption']
srt += '%s\r\n%s --> %s\r\n%s' % (str(pos), appear_time, disappear_time, text)
text = seq_current['Caption'].strip()
if text:
seq_counter += 1
srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (seq_counter, appear_time, disappear_time, text)
if srt:
return srt
@@ -166,7 +184,7 @@ class LyndaIE(InfoExtractor):
return {}
class LyndaCourseIE(InfoExtractor):
class LyndaCourseIE(LyndaBaseIE):
IE_NAME = 'lynda:course'
IE_DESC = 'lynda.com online courses'
@@ -179,35 +197,37 @@ class LyndaCourseIE(InfoExtractor):
course_path = mobj.group('coursepath')
course_id = mobj.group('courseid')
page = self._download_webpage('http://www.lynda.com/ajax/player?courseId=%s&type=course' % course_id,
course_id, 'Downloading course JSON')
page = self._download_webpage(
'http://www.lynda.com/ajax/player?courseId=%s&type=course' % course_id,
course_id, 'Downloading course JSON')
course_json = json.loads(page)
if 'Status' in course_json and course_json['Status'] == 'NotFound':
raise ExtractorError('Course %s does not exist' % course_id, expected=True)
raise ExtractorError(
'Course %s does not exist' % course_id, expected=True)
unaccessible_videos = 0
videos = []
(username, _) = self._get_login_info()
# Might want to extract videos right here from video['Formats'] as it seems 'Formats' is not provided
# by single video API anymore
for chapter in course_json['Chapters']:
for video in chapter['Videos']:
if username is None and video['HasAccess'] is False:
if video['HasAccess'] is False:
unaccessible_videos += 1
continue
videos.append(video['ID'])
if unaccessible_videos > 0:
self._downloader.report_warning('%s videos are only available for members and will not be downloaded. '
% unaccessible_videos + LyndaIE.ACCOUNT_CREDENTIALS_HINT)
self._downloader.report_warning(
'%s videos are only available for members (or paid members) and will not be downloaded. '
% unaccessible_videos + self._ACCOUNT_CREDENTIALS_HINT)
entries = [
self.url_result('http://www.lynda.com/%s/%s-4.html' %
(course_path, video_id),
'Lynda')
self.url_result(
'http://www.lynda.com/%s/%s-4.html' % (course_path, video_id),
'Lynda')
for video_id in videos]
course_title = course_json['Title']
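
Review note on the `_fix_subtitles` change: each cue now ends with a blank line, empty captions are skipped, and the sequence number comes from a separate counter instead of the list position. The sketch below reproduces that behaviour as a free function. The name `cues_to_srt` and the sample cues are assumptions made for illustration; only the `Timecode` and `Caption` keys and the regex come from the diff.

```python
import re

TIMECODE_REGEX = r'\[(?P<timecode>\d+:\d+:\d+[\.,]\d+)\]'


def cues_to_srt(cues):
    """Build SRT text; each cue lasts until the next cue's timecode."""
    srt = ''
    seq_counter = 0
    for current, following in zip(cues, cues[1:]):
        m_current = re.match(TIMECODE_REGEX, current['Timecode'])
        m_next = re.match(TIMECODE_REGEX, following['Timecode'])
        if m_current is None or m_next is None:
            continue
        text = current['Caption'].strip()
        if not text:
            # Skip empty captions without consuming a sequence number.
            continue
        seq_counter += 1
        srt += '%d\r\n%s --> %s\r\n%s\r\n\r\n' % (
            seq_counter, m_current.group('timecode'),
            m_next.group('timecode'), text)
    return srt


sample = [
    {'Timecode': '[00:00:00.00]', 'Caption': 'Hello'},
    {'Timecode': '[00:00:02.50]', 'Caption': '  '},
    {'Timecode': '[00:00:04.00]', 'Caption': 'World'},
    {'Timecode': '[00:00:06.00]', 'Caption': ''},
]
result = cues_to_srt(sample)
```

As in the diff, the final cue only supplies an end time for the one before it and is never emitted itself.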

View File

@@ -18,7 +18,7 @@ class MiTeleIE(InfoExtractor):
IE_NAME = 'mitele.es'
_VALID_URL = r'http://www\.mitele\.es/[^/]+/[^/]+/[^/]+/(?P<id>[^/]+)/'
_TEST = {
_TESTS = [{
'url': 'http://www.mitele.es/programas-tv/diario-de/la-redaccion/programa-144/',
'md5': '6a75fe9d0d3275bead0cb683c616fddb',
'info_dict': {
@@ -29,7 +29,7 @@ class MiTeleIE(InfoExtractor):
'display_id': 'programa-144',
'duration': 2913,
},
}
}]
def _real_extract(self, url):
episode = self._match_id(url)

View File

@@ -1,6 +1,7 @@
from __future__ import unicode_literals
import re
import itertools
from .common import InfoExtractor
from ..compat import (
@@ -10,7 +11,6 @@ from ..utils import (
ExtractorError,
HEADRequest,
str_to_int,
parse_iso8601,
)
@@ -27,8 +27,6 @@ class MixcloudIE(InfoExtractor):
'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
'uploader': 'Daniel Holbach',
'uploader_id': 'dholbach',
'upload_date': '20111115',
'timestamp': 1321359578,
'thumbnail': 're:https?://.*\.jpg',
'view_count': int,
'like_count': int,
@@ -37,31 +35,30 @@ class MixcloudIE(InfoExtractor):
'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
'info_dict': {
'id': 'gillespeterson-caribou-7-inch-vinyl-mix-chat',
'ext': 'm4a',
'title': 'Electric Relaxation vol. 3',
'ext': 'mp3',
'title': 'Caribou 7 inch Vinyl Mix & Chat',
'description': 'md5:2b8aec6adce69f9d41724647c65875e8',
'uploader': 'Daniel Drumz',
'uploader': 'Gilles Peterson Worldwide',
'uploader_id': 'gillespeterson',
'thumbnail': 're:https?://.*\.jpg',
'thumbnail': 're:https?://.*/images/',
'view_count': int,
'like_count': int,
},
}]
def _get_url(self, track_id, template_url):
server_count = 30
for i in range(server_count):
url = template_url % i
def _get_url(self, track_id, template_url, server_number):
boundaries = (1, 30)
for nr in server_numbers(server_number, boundaries):
url = template_url % nr
try:
# We only want to know whether the request succeeds;
# don't download the whole file
self._request_webpage(
HEADRequest(url), track_id,
'Checking URL %d/%d ...' % (i + 1, server_count + 1))
'Checking URL %d/%d ...' % (nr, boundaries[-1]))
return url
except ExtractorError:
pass
return None
def _real_extract(self, url):
@@ -75,17 +72,18 @@ class MixcloudIE(InfoExtractor):
preview_url = self._search_regex(
r'\s(?:data-preview-url|m-preview)="([^"]+)"', webpage, 'preview url')
song_url = preview_url.replace('/previews/', '/c/originals/')
server_number = int(self._search_regex(r'stream(\d+)', song_url, 'server number'))
template_url = re.sub(r'(stream\d*)', 'stream%d', song_url)
final_song_url = self._get_url(track_id, template_url)
final_song_url = self._get_url(track_id, template_url, server_number)
if final_song_url is None:
self.to_screen('Trying with m4a extension')
template_url = template_url.replace('.mp3', '.m4a').replace('originals/', 'm4a/64/')
final_song_url = self._get_url(track_id, template_url)
final_song_url = self._get_url(track_id, template_url, server_number)
if final_song_url is None:
raise ExtractorError('Unable to extract track url')
PREFIX = (
r'<span class="play-button[^"]*?"'
r'm-play-on-spacebar[^>]+'
r'(?:\s+[a-zA-Z0-9-]+(?:="[^"]+")?)*?\s+')
title = self._html_search_regex(
PREFIX + r'm-title="([^"]+)"', webpage, 'title')
@@ -99,16 +97,12 @@ class MixcloudIE(InfoExtractor):
r'\s+"profile": "([^"]+)",', webpage, 'uploader id', fatal=False)
description = self._og_search_description(webpage)
like_count = str_to_int(self._search_regex(
[r'<meta itemprop="interactionCount" content="UserLikes:([0-9]+)"',
r'/favorites/?">([0-9]+)<'],
r'\bbutton-favorite\b.+m-ajax-toggle-count="([^"]+)"',
webpage, 'like count', fatal=False))
view_count = str_to_int(self._search_regex(
[r'<meta itemprop="interactionCount" content="UserPlays:([0-9]+)"',
r'/listeners/?">([0-9,.]+)</a>'],
webpage, 'play count', fatal=False))
timestamp = parse_iso8601(self._search_regex(
r'<time itemprop="dateCreated" datetime="([^"]+)">',
webpage, 'upload date', default=None))
return {
'id': track_id,
@@ -118,7 +112,38 @@ class MixcloudIE(InfoExtractor):
'thumbnail': thumbnail,
'uploader': uploader,
'uploader_id': uploader_id,
'timestamp': timestamp,
'view_count': view_count,
'like_count': like_count,
}
def server_numbers(first, boundaries):
""" Server numbers to try in descending order of probable availability.
Starting from first (i.e. the number of the server hosting the preview file)
and going further and further up to the higher boundary and down to the
lower one in an alternating fashion. Namely:
server_numbers(2, (1, 5))
# Where the preview server is 2, min number is 1 and max is 5.
# Yields: 2, 3, 1, 4, 5
Why not random numbers or increasing sequences? Since from what I've seen,
full length files seem to be hosted on servers whose number is closer to
that of the preview; to be confirmed.
"""
zip_longest = getattr(itertools, 'zip_longest', None)
if zip_longest is None:
# python 2.x
zip_longest = itertools.izip_longest
if len(boundaries) != 2:
raise ValueError("boundaries should be a two-element tuple")
min, max = boundaries
highs = range(first + 1, max + 1)
lows = range(first - 1, min - 1, -1)
rest = filter(
None, itertools.chain.from_iterable(zip_longest(highs, lows)))
yield first
for n in rest:
yield n
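
Review note on `server_numbers`: the docstring describes an ordering that starts at the preview server and alternates upward and downward until both boundaries are reached. The sketch below is a standalone restatement of that ordering, not the code from the diff. It filters on `is not None` rather than `filter(None, ...)`, an assumption made so that a server number of 0 would not be dropped; with the boundaries `(1, 30)` used in the diff the two behave the same.

```python
try:
    from itertools import zip_longest  # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2


def server_numbers(first, boundaries):
    """Yield `first`, then numbers alternating above and below it."""
    low, high = boundaries
    highs = range(first + 1, high + 1)
    lows = range(first - 1, low - 1, -1)
    yield first
    for pair in zip_longest(highs, lows):
        for number in pair:
            # zip_longest pads the shorter side with None.
            if number is not None:
                yield number


order = list(server_numbers(2, (1, 5)))
```

For the docstring's example, `order` is `[2, 3, 1, 4, 5]`.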

View File

@@ -10,7 +10,7 @@ from ..utils import (
class MLBIE(InfoExtractor):
_VALID_URL = r'https?://m(?:lb)?\.mlb\.com/(?:(?:.*?/)?video/(?:topic/[\da-z_-]+/)?v|(?:shared/video/embed/embed\.html|[^/]+/video/play\.jsp)\?.*?\bcontent_id=)(?P<id>n?\d+)'
_VALID_URL = r'https?://m(?:lb)?\.(?:[\da-z_-]+\.)?mlb\.com/(?:(?:.*?/)?video/(?:topic/[\da-z_-]+/)?v|(?:shared/video/embed/embed\.html|[^/]+/video/play\.jsp)\?.*?\bcontent_id=)(?P<id>n?\d+)'
_TESTS = [
{
'url': 'http://m.mlb.com/sea/video/topic/51231442/v34698933/nymsea-ackley-robs-a-home-run-with-an-amazing-catch/?c_id=sea',
@@ -80,6 +80,10 @@ class MLBIE(InfoExtractor):
'url': 'http://mlb.mlb.com/es/video/play.jsp?content_id=36599553',
'only_matching': True,
},
{
'url': 'http://m.cardinals.mlb.com/stl/video/v51175783/atlstl-piscotty-makes-great-sliding-catch-on-line/?partnerId=as_mlb_20150321_42500876&adbid=579409712979910656&adbpl=tw&adbpr=52847728',
'only_matching': True,
}
]
def _real_extract(self, url):

View File

@@ -5,7 +5,7 @@ from ..utils import int_or_none
class MporaIE(InfoExtractor):
_VALID_URL = r'https?://(www\.)?mpora\.(?:com|de)/videos/(?P<id>[^?#/]+)'
_VALID_URL = r'https?://(?:www\.)?mpora\.(?:com|de)/videos/(?P<id>[^?#/]+)'
IE_NAME = 'MPORA'
_TEST = {
@@ -25,7 +25,9 @@ class MporaIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
data_json = self._search_regex(
r"new FM\.Player\('[^']+',\s*(\{.*?)\).player;", webpage, 'json')
[r"new FM\.Player\('[^']+',\s*(\{.*?)\).player;",
r"new\s+FM\.Kaltura\.Player\('[^']+'\s*,\s*({.+?})\);"],
webpage, 'json')
data = self._parse_json(data_json, video_id)
uploader = data['info_overlay'].get('username')

View File

@@ -3,17 +3,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
parse_duration,
unified_strdate,
)
class MusicVaultIE(InfoExtractor):
_VALID_URL = r'https?://www\.musicvault\.com/(?P<uploader_id>[^/?#]*)/video/(?P<display_id>[^/?#]*)_(?P<id>[0-9]+)\.html'
_TEST = {
'url': 'http://www.musicvault.com/the-allman-brothers-band/video/straight-from-the-heart_1010863.html',
'md5': '2cdbb3ae75f7fb3519821507d2fb3c15',
'md5': '3adcbdb3dcc02d647539e53f284ba171',
'info_dict': {
'id': '1010863',
'ext': 'mp4',
@@ -22,9 +18,10 @@ class MusicVaultIE(InfoExtractor):
'duration': 244,
'uploader': 'The Allman Brothers Band',
'thumbnail': 're:^https?://.*/thumbnail/.*',
'upload_date': '19811216',
'upload_date': '20131219',
'location': 'Capitol Theatre (Passaic, NJ)',
'description': 'Listen to The Allman Brothers Band perform Straight from the Heart at Capitol Theatre (Passaic, NJ) on Dec 16, 1981',
'timestamp': int,
}
}
@@ -43,34 +40,24 @@ class MusicVaultIE(InfoExtractor):
r'<h1.*?>(.*?)</h1>', data_div, 'uploader', fatal=False)
title = self._html_search_regex(
r'<h2.*?>(.*?)</h2>', data_div, 'title')
upload_date = unified_strdate(self._html_search_regex(
r'<h3.*?>(.*?)</h3>', data_div, 'uploader', fatal=False))
location = self._html_search_regex(
r'<h4.*?>(.*?)</h4>', data_div, 'location', fatal=False)
duration = parse_duration(self._html_search_meta('duration', webpage))
VIDEO_URL_TEMPLATE = 'http://cdnapi.kaltura.com/p/%(uid)s/sp/%(wid)s/playManifest/entryId/%(entry_id)s/format/url/protocol/http'
kaltura_id = self._search_regex(
r'<div id="video-detail-player" data-kaltura-id="([^"]+)"',
webpage, 'kaltura ID')
video_url = VIDEO_URL_TEMPLATE % {
'entry_id': kaltura_id,
'wid': self._search_regex(r'/wid/_([0-9]+)/', webpage, 'wid'),
'uid': self._search_regex(r'uiconf_id/([0-9]+)/', webpage, 'uid'),
}
wid = self._search_regex(r'/wid/_([0-9]+)/', webpage, 'wid')
return {
'id': mobj.group('id'),
'url': video_url,
'ext': 'mp4',
'_type': 'url_transparent',
'url': 'kaltura:%s:%s' % (wid, kaltura_id),
'ie_key': 'Kaltura',
'display_id': display_id,
'uploader_id': mobj.group('uploader_id'),
'thumbnail': thumbnail,
'description': self._html_search_meta('description', webpage),
'upload_date': upload_date,
'location': location,
'title': title,
'uploader': uploader,
'duration': duration,
}

View File

@@ -22,7 +22,7 @@ class NiconicoIE(InfoExtractor):
IE_NAME = 'niconico'
IE_DESC = 'ニコニコ動画'
_TEST = {
_TESTS = [{
'url': 'http://www.nicovideo.jp/watch/sm22312215',
'md5': 'd1a75c0823e2f629128c43e1212760f9',
'info_dict': {
@@ -39,9 +39,26 @@ class NiconicoIE(InfoExtractor):
'username': 'ydl.niconico@gmail.com',
'password': 'youtube-dl',
},
}
}, {
'url': 'http://www.nicovideo.jp/watch/nm14296458',
'md5': '8db08e0158457cf852a31519fceea5bc',
'info_dict': {
'id': 'nm14296458',
'ext': 'swf',
'title': '【鏡音リン】Dance on media【オリジナル】take2!',
'description': 'md5:',
'uploader': 'りょうた',
'uploader_id': '18822557',
'upload_date': '20110429',
'duration': 209,
},
'params': {
'username': 'ydl.niconico@gmail.com',
'password': 'youtube-dl',
},
}]
_VALID_URL = r'https?://(?:www\.|secure\.)?nicovideo\.jp/watch/((?:[a-z]{2})?[0-9]+)'
_VALID_URL = r'https?://(?:www\.|secure\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?[0-9]+)'
_NETRC_MACHINE = 'niconico'
# Determine whether the downloader used authentication to download video
_AUTHENTICATED = False
@@ -76,8 +93,7 @@ class NiconicoIE(InfoExtractor):
return True
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
video_id = self._match_id(url)
# Get video webpage. We are not actually interested in it, but need
# the cookies in order to be able to download the info webpage
@@ -90,7 +106,7 @@ class NiconicoIE(InfoExtractor):
if self._AUTHENTICATED:
# Get flv info
flv_info_webpage = self._download_webpage(
'http://flapi.nicovideo.jp/api/getflv?v=' + video_id,
'http://flapi.nicovideo.jp/api/getflv/' + video_id + '?as3=1',
video_id, 'Downloading flv info')
else:
# Get external player info

View File

@@ -219,7 +219,8 @@ class NPOLiveIE(NPOBaseIE):
if streams:
for stream in streams:
stream_type = stream.get('type').lower()
if stream_type == 'ss':
# smooth streaming is not supported
if stream_type in ['ss', 'ms']:
continue
stream_info = self._download_json(
'http://ida.omroep.nl/aapi/?stream=%s&token=%s&type=jsonp'
@@ -230,7 +231,10 @@ class NPOLiveIE(NPOBaseIE):
stream_url = self._download_json(
stream_info['stream'], display_id,
'Downloading %s URL' % stream_type,
transform_source=strip_jsonp)
'Unable to download %s URL' % stream_type,
transform_source=strip_jsonp, fatal=False)
if not stream_url:
continue
if stream_type == 'hds':
f4m_formats = self._extract_f4m_formats(stream_url, display_id)
# The f4m downloader downloads only a piece of the live stream
@@ -242,6 +246,7 @@ class NPOLiveIE(NPOBaseIE):
else:
formats.append({
'url': stream_url,
'preference': -10,
})
self._sort_formats(formats)

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
float_or_none,
@@ -13,46 +14,48 @@ from ..utils import (
class NRKIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?nrk\.no/(?:video|lyd)/[^/]+/(?P<id>[\dA-F]{16})'
_VALID_URL = r'(?:nrk:|http://(?:www\.)?nrk\.no/video/PS\*)(?P<id>\d+)'
_TESTS = [
{
'url': 'http://www.nrk.no/video/dompap_og_andre_fugler_i_piip_show/D0FA54B5C8B6CE59/emne/piipshow/',
'md5': 'a6eac35052f3b242bb6bb7f43aed5886',
'url': 'http://www.nrk.no/video/PS*150533',
'md5': 'bccd850baebefe23b56d708a113229c2',
'info_dict': {
'id': '150533',
'ext': 'flv',
'title': 'Dompap og andre fugler i Piip-Show',
'description': 'md5:d9261ba34c43b61c812cb6b0269a5c8f'
'description': 'md5:d9261ba34c43b61c812cb6b0269a5c8f',
'duration': 263,
}
},
{
'url': 'http://www.nrk.no/lyd/lyd_av_oppleser_for_blinde/AEFDDD5473BA0198/',
'md5': '3471f2a51718195164e88f46bf427668',
'url': 'http://www.nrk.no/video/PS*154915',
'md5': '0b1493ba1aae7d9579a5ad5531bc395a',
'info_dict': {
'id': '154915',
'ext': 'flv',
'title': 'Slik høres internett ut når du er blind',
'description': 'md5:a621f5cc1bd75c8d5104cb048c6b8568',
'duration': 20,
}
},
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
page = self._download_webpage(url, video_id)
video_id = self._html_search_regex(r'<div class="nrk-video" data-nrk-id="(\d+)">', page, 'video id')
video_id = self._match_id(url)
data = self._download_json(
'http://v7.psapi.nrk.no/mediaelement/%s' % video_id, video_id, 'Downloading media JSON')
'http://v8.psapi.nrk.no/mediaelement/%s' % video_id,
video_id, 'Downloading media JSON')
if data['usageRights']['isGeoBlocked']:
raise ExtractorError('NRK har ikke rettig-heter til å vise dette programmet utenfor Norge', expected=True)
raise ExtractorError(
'NRK har ikke rettig-heter til å vise dette programmet utenfor Norge',
expected=True)
video_url = data['mediaUrl'] + '?hdcore=3.1.1&plugin=aasp-3.1.1.69.124'
video_url = data['mediaUrl'] + '?hdcore=3.5.0&plugin=aasp-3.5.0.151.81'
duration = parse_duration(data.get('duration'))
images = data.get('images')
if images:
@@ -68,10 +71,51 @@ class NRKIE(InfoExtractor):
'ext': 'flv',
'title': data['title'],
'description': data['description'],
'duration': duration,
'thumbnail': thumbnail,
}
class NRKPlaylistIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?nrk\.no/(?!video)(?:[^/]+/)+(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.nrk.no/troms/gjenopplev-den-historiske-solformorkelsen-1.12270763',
'info_dict': {
'id': 'gjenopplev-den-historiske-solformorkelsen-1.12270763',
'title': 'Gjenopplev den historiske solformørkelsen',
'description': 'md5:c2df8ea3bac5654a26fc2834a542feed',
},
'playlist_count': 2,
}, {
'url': 'http://www.nrk.no/kultur/bok/rivertonprisen-til-karin-fossum-1.12266449',
'info_dict': {
'id': 'rivertonprisen-til-karin-fossum-1.12266449',
'title': 'Rivertonprisen til Karin Fossum',
'description': 'Første kvinne på 15 år til å vinne krimlitteraturprisen.',
},
'playlist_count': 5,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result('nrk:%s' % video_id, 'NRK')
for video_id in re.findall(
r'class="[^"]*\brich\b[^"]*"[^>]+data-video-id="([^"]+)"',
webpage)
]
playlist_title = self._og_search_title(webpage)
playlist_description = self._og_search_description(webpage)
return self.playlist_result(
entries, playlist_id, playlist_title, playlist_description)
class NRKTVIE(InfoExtractor):
_VALID_URL = r'(?P<baseurl>http://tv\.nrk(?:super)?\.no/)(?:serie/[^/]+|program)/(?P<id>[a-zA-Z]{4}\d{8})(?:/\d{2}-\d{2}-\d{4})?(?:#del=(?P<part_id>\d+))?'
@@ -148,9 +192,6 @@ class NRKTVIE(InfoExtractor):
}
]
def _seconds2str(self, s):
return '%02d:%02d:%02d.%03d' % (s / 3600, (s % 3600) / 60, s % 60, (s % 1) * 1000)
def _debug_print(self, txt):
if self._downloader.params.get('verbose', False):
self.to_screen('[debug] %s' % txt)
@@ -158,17 +199,18 @@ class NRKTVIE(InfoExtractor):
def _get_subtitles(self, subtitlesurl, video_id, baseurl):
url = "%s%s" % (baseurl, subtitlesurl)
self._debug_print('%s: Subtitle url: %s' % (video_id, url))
captions = self._download_xml(url, video_id, 'Downloading subtitles')
captions = self._download_xml(
url, video_id, 'Downloading subtitles',
transform_source=lambda s: s.replace(r'<br />', '\r\n'))
lang = captions.get('lang', 'no')
ps = captions.findall('./{0}body/{0}div/{0}p'.format('{http://www.w3.org/ns/ttml}'))
srt = ''
for pos, p in enumerate(ps):
begin = parse_duration(p.get('begin'))
duration = parse_duration(p.get('dur'))
starttime = self._seconds2str(begin)
endtime = self._seconds2str(begin + duration)
text = '\n'.join(p.itertext())
srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (str(pos), starttime, endtime, text)
starttime = self._subtitles_timecode(begin)
endtime = self._subtitles_timecode(begin + duration)
srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (compat_str(pos), starttime, endtime, p.text)
return {lang: [
{'ext': 'ttml', 'url': url},
{'ext': 'srt', 'data': srt},
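
Note on the subtitle hunk above: each TTML cue is turned into one SRT-style block built from a begin time and a duration. The following is a minimal standalone sketch of that conversion, assuming the `HH:MM:SS.mmm` layout of the removed `_seconds2str`; `seconds_to_timecode` and `build_srt` are hypothetical names and are not part of youtube-dl.

```python
def seconds_to_timecode(seconds):
    """Format seconds as HH:MM:SS.mmm (hypothetical helper, same layout as _seconds2str)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3600 * 1000)
    minutes, rem = divmod(rem, 60 * 1000)
    secs, millis = divmod(rem, 1000)
    return '%02d:%02d:%02d.%03d' % (hours, minutes, secs, millis)


def build_srt(cues):
    """Join (begin_seconds, duration_seconds, text) cues into one string.

    The cue index starts at 0, as enumerate() does in the diff.
    """
    blocks = []
    for pos, (begin, duration, text) in enumerate(cues):
        blocks.append('%d\r\n%s --> %s\r\n%s\r\n\r\n' % (
            pos,
            seconds_to_timecode(begin),
            seconds_to_timecode(begin + duration),
            text))
    return ''.join(blocks)
```

Working in integer milliseconds avoids the float modulo arithmetic of the old helper.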


@@ -1,15 +1,17 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import parse_iso8601
from ..utils import (
float_or_none,
int_or_none,
parse_iso8601,
)
class NYTimesIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?nytimes\.com/video/(?:[^/]+/)+(?P<id>\d+)'
_VALID_URL = r'https?://(?:(?:www\.)?nytimes\.com/video/(?:[^/]+/)+?|graphics8\.nytimes\.com/bcvideo/\d+(?:\.\d+)?/iframe/embed\.html\?videoId=)(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://www.nytimes.com/video/opinion/100000002847155/verbatim-what-is-a-photocopier.html?playlistId=100000001150263',
'md5': '18a525a510f942ada2720db5f31644c0',
'info_dict': {
@@ -22,18 +24,21 @@ class NYTimesIE(InfoExtractor):
'uploader': 'Brett Weiner',
'duration': 419,
}
}
}, {
'url': 'http://www.nytimes.com/video/travel/100000003550828/36-hours-in-dubai.html',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
video_data = self._download_json(
'http://www.nytimes.com/svc/video/api/v2/video/%s' % video_id, video_id, 'Downloading video JSON')
'http://www.nytimes.com/svc/video/api/v2/video/%s' % video_id,
video_id, 'Downloading video JSON')
title = video_data['headline']
description = video_data['summary']
duration = video_data['duration'] / 1000.0
description = video_data.get('summary')
duration = float_or_none(video_data.get('duration'), 1000)
uploader = video_data['byline']
timestamp = parse_iso8601(video_data['publication_date'][:-8])
@@ -49,11 +54,11 @@ class NYTimesIE(InfoExtractor):
formats = [
{
'url': video['url'],
'format_id': video['type'],
'vcodec': video['video_codec'],
'width': video['width'],
'height': video['height'],
'filesize': get_file_size(video['fileSize']),
'format_id': video.get('type'),
'vcodec': video.get('video_codec'),
'width': int_or_none(video.get('width')),
'height': int_or_none(video.get('height')),
'filesize': get_file_size(video.get('fileSize')),
} for video in video_data['renditions']
]
self._sort_formats(formats)
@@ -61,7 +66,8 @@ class NYTimesIE(InfoExtractor):
thumbnails = [
{
'url': 'http://www.nytimes.com/%s' % image['url'],
'resolution': '%dx%d' % (image['width'], image['height']),
'width': int_or_none(image.get('width')),
'height': int_or_none(image.get('height')),
} for image in video_data['images']
]
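
Note on the NYTimes hunks above: direct indexing such as `video['width']` is replaced by `.get()` plus `int_or_none` / `float_or_none`, so a missing field yields `None` instead of an exception. A minimal sketch of that pattern follows; the `*_sketch` functions are stand-ins written for illustration and may differ from the real helpers in `youtube_dl.utils`.

```python
def int_or_none_sketch(value):
    """Return int(value), or None when the value is missing or not numeric."""
    if value is None:
        return None
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def float_or_none_sketch(value, scale=1):
    """Return float(value) / scale, or None when the value is missing or not numeric."""
    if value is None:
        return None
    try:
        return float(value) / scale
    except (TypeError, ValueError):
        return None


# Illustrative API response with a missing 'height' (made-up data).
video = {'width': '640', 'duration': 419000}
width = int_or_none_sketch(video.get('width'))
height = int_or_none_sketch(video.get('height'))
duration = float_or_none_sketch(video.get('duration'), 1000)
```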


@@ -0,0 +1,85 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
unified_strdate,
int_or_none,
qualities,
)
class OdnoklassnikiIE(InfoExtractor):
_VALID_URL = r'https?://(?:odnoklassniki|ok)\.ru/(?:video|web-api/video/moviePlayer)/(?P<id>\d+)'
_TESTS = [{
'url': 'http://ok.ru/video/20079905452',
'md5': '8e24ad2da6f387948e7a7d44eb8668fe',
'info_dict': {
'id': '20079905452',
'ext': 'mp4',
'title': 'Культура меняет нас (прекрасный ролик!))',
'duration': 100,
'upload_date': '20141207',
'uploader_id': '330537914540',
'uploader': 'Виталий Добровольский',
'like_count': int,
'age_limit': 0,
},
}, {
'url': 'http://ok.ru/web-api/video/moviePlayer/20079905452',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player = self._parse_json(
self._search_regex(
r"OKVideo\.start\(({.+?})\s*,\s*'VideoAutoplay_player'", webpage, 'player'),
video_id)
metadata = self._parse_json(player['flashvars']['metadata'], video_id)
movie = metadata['movie']
title = movie['title']
thumbnail = movie.get('poster')
duration = int_or_none(movie.get('duration'))
author = metadata.get('author', {})
uploader_id = author.get('id')
uploader = author.get('name')
upload_date = unified_strdate(self._html_search_meta(
'ya:ovs:upload_date', webpage, 'upload date'))
age_limit = None
adult = self._html_search_meta(
'ya:ovs:adult', webpage, 'age limit')
if adult:
age_limit = 18 if adult == 'true' else 0
like_count = int_or_none(metadata.get('likeCount'))
quality = qualities(('mobile', 'lowest', 'low', 'sd', 'hd'))
formats = [{
'url': f['url'],
'ext': 'mp4',
'format_id': f['name'],
'quality': quality(f['name']),
} for f in metadata['videos']]
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'upload_date': upload_date,
'uploader': uploader,
'uploader_id': uploader_id,
'like_count': like_count,
'age_limit': age_limit,
'formats': formats,
}
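
Note on the Odnoklassniki extractor above: `qualities((...))` is given an ordered tuple of format names, and the resulting callable supplies the `quality` value for each format. The sketch below shows one plausible way such a ranking function can work, assuming later names rank higher and unknown names rank lowest; `qualities_sketch` is a stand-in, not the library function.

```python
def qualities_sketch(quality_ids):
    """Return a function mapping a quality name to its position in quality_ids."""
    def rank(quality_id):
        try:
            return quality_ids.index(quality_id)
        except ValueError:
            # Unknown names sort below every known one.
            return -1
    return rank


rank = qualities_sketch(('mobile', 'lowest', 'low', 'sd', 'hd'))

# Made-up format list, ordered worst to best by rank.
formats = [{'format_id': name, 'quality': rank(name)}
           for name in ('hd', 'mobile', 'sd', 'unknown')]
formats.sort(key=lambda f: f['quality'])
```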


@@ -11,6 +11,11 @@ from ..utils import (
HEADRequest,
unified_strdate,
ExtractorError,
strip_jsonp,
int_or_none,
float_or_none,
determine_ext,
remove_end,
)
@@ -197,3 +202,92 @@ class ORFFM4IE(InfoExtractor):
'description': data['subtitle'],
'entries': entries
}
class ORFIPTVIE(InfoExtractor):
IE_NAME = 'orf:iptv'
IE_DESC = 'iptv.ORF.at'
_VALID_URL = r'http://iptv\.orf\.at/(?:#/)?stories/(?P<id>\d+)'
_TEST = {
'url': 'http://iptv.orf.at/stories/2267952',
'md5': '26ffa4bab6dbce1eee78bbc7021016cd',
'info_dict': {
'id': '339775',
'ext': 'flv',
'title': 'Kreml-Kritiker Nawalny wieder frei',
'description': 'md5:6f24e7f546d364dacd0e616a9e409236',
'duration': 84.729,
'thumbnail': 're:^https?://.*\.jpg$',
'upload_date': '20150306',
},
}
def _real_extract(self, url):
story_id = self._match_id(url)
webpage = self._download_webpage(
'http://iptv.orf.at/stories/%s' % story_id, story_id)
video_id = self._search_regex(
r'data-video(?:id)?="(\d+)"', webpage, 'video id')
data = self._download_json(
'http://bits.orf.at/filehandler/static-api/json/current/data.json?file=%s' % video_id,
video_id)[0]
duration = float_or_none(data['duration'], 1000)
video = data['sources']['default']
load_balancer_url = video['loadBalancerUrl']
abr = int_or_none(video.get('audioBitrate'))
vbr = int_or_none(video.get('bitrate'))
fps = int_or_none(video.get('videoFps'))
width = int_or_none(video.get('videoWidth'))
height = int_or_none(video.get('videoHeight'))
thumbnail = video.get('preview')
rendition = self._download_json(
load_balancer_url, video_id, transform_source=strip_jsonp)
f = {
'abr': abr,
'vbr': vbr,
'fps': fps,
'width': width,
'height': height,
}
formats = []
for format_id, format_url in rendition['redirect'].items():
if format_id == 'rtmp':
ff = f.copy()
ff.update({
'url': format_url,
'format_id': format_id,
})
formats.append(ff)
elif determine_ext(format_url) == 'f4m':
formats.extend(self._extract_f4m_formats(
format_url, video_id, f4m_id=format_id))
elif determine_ext(format_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', m3u8_id=format_id))
else:
continue
self._sort_formats(formats)
title = remove_end(self._og_search_title(webpage), ' - iptv.ORF.at')
description = self._og_search_description(webpage)
upload_date = unified_strdate(self._html_search_meta(
'dc.date', webpage, 'upload date'))
return {
'id': video_id,
'title': title,
'description': description,
'duration': duration,
'thumbnail': thumbnail,
'upload_date': upload_date,
'formats': formats,
}
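
Note on the ORF IPTV extractor above: the load balancer response is passed through `strip_jsonp` before JSON parsing, which suggests the endpoint wraps its payload in a callback. A minimal sketch of such an unwrapping step follows; `strip_jsonp_sketch` and the sample payload are assumptions made for illustration, not the library implementation or real server output.

```python
import json
import re


def strip_jsonp_sketch(text):
    """Unwrap 'callback({...});' to '{...}'; return the text unchanged if it is not wrapped."""
    match = re.match(r'(?s)^\s*[\w.$]+\s*\(\s*(.*)\s*\)\s*;?\s*$', text)
    return match.group(1) if match else text


# Made-up JSONP response shaped like the 'redirect' mapping used in the diff.
SAMPLE = 'callback({"redirect": {"rtmp": "rtmp://example.com/app"}});'
rendition = json.loads(strip_jsonp_sketch(SAMPLE))
```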


@@ -0,0 +1,90 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
xpath_text,
qualities,
)
class PladformIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
(?:
out\.pladform\.ru/player|
static\.pladform\.ru/player\.swf
)
\?.*\bvideoid=|
video\.pladform\.ru/catalog/video/videoid/
)
(?P<id>\d+)
'''
_TESTS = [{
# http://muz-tv.ru/kinozal/view/7400/
'url': 'http://out.pladform.ru/player?pl=24822&videoid=100183293',
'md5': '61f37b575dd27f1bb2e1854777fe31f4',
'info_dict': {
'id': '100183293',
'ext': 'mp4',
'title': 'Тайны перевала Дятлова • Тайна перевала Дятлова 1 серия 2 часть',
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 694,
'age_limit': 0,
},
}, {
'url': 'http://static.pladform.ru/player.swf?pl=21469&videoid=100183293&vkcid=0',
'only_matching': True,
}, {
'url': 'http://video.pladform.ru/catalog/video/videoid/100183293/vkcid/0',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_xml(
'http://out.pladform.ru/getVideo?pl=1&videoid=%s' % video_id,
video_id)
if video.tag == 'error':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, video.text),
expected=True)
quality = qualities(('ld', 'sd', 'hd'))
formats = [{
'url': src.text,
'format_id': src.get('quality'),
'quality': quality(src.get('quality')),
} for src in video.findall('./src')]
self._sort_formats(formats)
webpage = self._download_webpage(
'http://video.pladform.ru/catalog/video/videoid/%s' % video_id,
video_id)
title = self._og_search_title(webpage, fatal=False) or xpath_text(
video, './/title', 'title', fatal=True)
description = self._search_regex(
r'</h3>\s*<p>([^<]+)</p>', webpage, 'description', fatal=False)
thumbnail = self._og_search_thumbnail(webpage) or xpath_text(
video, './/cover', 'cover')
duration = int_or_none(xpath_text(video, './/time', 'duration'))
age_limit = int_or_none(xpath_text(video, './/age18', 'age limit'))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'age_limit': age_limit,
'formats': formats,
}
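
Note on the Pladform `_VALID_URL` above: one verbose regex covers three URL shapes (player query string, SWF query string, catalog path). The sketch below recompiles the pattern from the diff and runs it against the three test URLs listed there; only the name `video_id_from_url` is invented.

```python
import re

# Same alternatives as the _VALID_URL in the diff, compiled in verbose mode.
PLADFORM_URL_RE = re.compile(r'''
    https?://
    (?:
        (?:
            out\.pladform\.ru/player|
            static\.pladform\.ru/player\.swf
        )
        \?.*\bvideoid=|
        video\.pladform\.ru/catalog/video/videoid/
    )
    (?P<id>\d+)
    ''', re.VERBOSE)


def video_id_from_url(url):
    """Return the numeric video id, or None when the URL does not match."""
    match = PLADFORM_URL_RE.match(url)
    return match.group('id') if match else None
```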


@@ -0,0 +1,78 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
xpath_text,
float_or_none,
int_or_none,
)
class PlaywireIE(InfoExtractor):
_VALID_URL = r'https?://(?:config|cdn)\.playwire\.com(?:/v2)?/(?P<publisher_id>\d+)/(?:videos/v2|embed|config)/(?P<id>\d+)'
_TESTS = [{
'url': 'http://config.playwire.com/14907/videos/v2/3353705/player.json',
'md5': 'e6398701e3595888125729eaa2329ed9',
'info_dict': {
'id': '3353705',
'ext': 'mp4',
'title': 'S04_RM_UCL_Rus',
'thumbnail': 're:^http://.*\.png$',
'duration': 145.94,
},
}, {
'url': 'http://cdn.playwire.com/11625/embed/85228.html',
'only_matching': True,
}, {
'url': 'http://config.playwire.com/12421/videos/v2/3389892/zeus.json',
'only_matching': True,
}, {
'url': 'http://cdn.playwire.com/v2/12342/config/1532636.json',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
publisher_id, video_id = mobj.group('publisher_id'), mobj.group('id')
player = self._download_json(
'http://config.playwire.com/%s/videos/v2/%s/zeus.json' % (publisher_id, video_id),
video_id)
title = player['settings']['title']
duration = float_or_none(player.get('duration'), 1000)
content = player['content']
thumbnail = content.get('poster')
src = content['media']['f4m']
f4m = self._download_xml(src, video_id)
base_url = xpath_text(f4m, './{http://ns.adobe.com/f4m/1.0}baseURL', 'base url', fatal=True)
formats = []
for media in f4m.findall('./{http://ns.adobe.com/f4m/1.0}media'):
media_url = media.get('url')
if not media_url:
continue
tbr = int_or_none(media.get('bitrate'))
width = int_or_none(media.get('width'))
height = int_or_none(media.get('height'))
f = {
'url': '%s/%s' % (base_url, media.attrib['url']),
'tbr': tbr,
'width': width,
'height': height,
}
if not (tbr or width or height):
f['quality'] = 1 if '-hd.' in media_url else 0
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}
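
Note on the Playwire extractor above: formats are read from an F4M manifest by joining `baseURL` with each `media` element's `url`, and a filename-based quality is used only when no bitrate or size is present. The following sketch reproduces that logic with the standard library; the manifest is made-up sample data and `formats_from_f4m` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

F4M_NS = '{http://ns.adobe.com/f4m/1.0}'

# Made-up manifest for illustration only.
SAMPLE_F4M = '''<manifest xmlns="http://ns.adobe.com/f4m/1.0">
  <baseURL>http://cdn.example.com/videos</baseURL>
  <media url="clip-hd.mp4" bitrate="1200" width="1280" height="720"/>
  <media url="clip.mp4"/>
  <media bitrate="300"/>
</manifest>'''


def formats_from_f4m(xml_text):
    root = ET.fromstring(xml_text)
    base_url = root.findtext('./%sbaseURL' % F4M_NS)
    formats = []
    for media in root.findall('./%smedia' % F4M_NS):
        media_url = media.get('url')
        if not media_url:
            continue  # entries without a URL cannot be downloaded
        fmt = {'url': '%s/%s' % (base_url, media_url)}
        for key, attr in (('tbr', 'bitrate'), ('width', 'width'), ('height', 'height')):
            raw = media.get(attr)
            fmt[key] = int(raw) if raw is not None else None
        if not (fmt['tbr'] or fmt['width'] or fmt['height']):
            # No numeric hints: fall back to the '-hd.' filename marker.
            fmt['quality'] = 1 if '-hd.' in media_url else 0
        formats.append(fmt)
    return formats


formats = formats_from_f4m(SAMPLE_F4M)
```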


@@ -0,0 +1,69 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
)
from ..utils import ExtractorError
class PrimeShareTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?primeshare\.tv/download/(?P<id>[\da-zA-Z]+)'
_TEST = {
'url': 'http://primeshare.tv/download/238790B611',
'md5': 'b92d9bf5461137c36228009f31533fbc',
'info_dict': {
'id': '238790B611',
'ext': 'mp4',
'title': 'Public Domain - 1960s Commercial - Crest Toothpaste-YKsuFona',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if '>File not exist<' in webpage:
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
fields = dict(re.findall(r'''(?x)<input\s+
type="hidden"\s+
name="([^"]+)"\s+
(?:id="[^"]+"\s+)?
value="([^"]*)"
''', webpage))
headers = {
'Referer': url,
'Content-Type': 'application/x-www-form-urlencoded',
}
wait_time = int(self._search_regex(
r'var\s+cWaitTime\s*=\s*(\d+)',
webpage, 'wait time', default=7)) + 1
self._sleep(wait_time, video_id)
req = compat_urllib_request.Request(
url, compat_urllib_parse.urlencode(fields), headers)
video_page = self._download_webpage(
req, video_id, 'Downloading video page')
video_url = self._search_regex(
r"url\s*:\s*'([^']+\.primeshare\.tv(?::443)?/file/[^']+)'",
video_page, 'video url')
title = self._html_search_regex(
r'<h1>Watch\s*(?:&nbsp;)?\s*\((.+?)(?:\s*\[\.\.\.\])?\)\s*(?:&nbsp;)?\s*<strong>',
video_page, 'title')
return {
'id': video_id,
'url': video_url,
'title': title,
'ext': 'mp4',
}
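
Note on the PrimeShare extractor above: the hidden `<input>` fields of the download page are collected with a verbose regex and posted back as a form body. A minimal sketch of the collection and encoding step follows, using the regex from the diff on a made-up page; `build_post_body` is a hypothetical name, and the wait and the request itself are left out.

```python
import re
from urllib.parse import urlencode

# Regex copied from the diff: captures (name, value) of each hidden input.
HIDDEN_INPUT_RE = re.compile(r'''(?x)<input\s+
    type="hidden"\s+
    name="([^"]+)"\s+
    (?:id="[^"]+"\s+)?
    value="([^"]*)"
    ''')

# Made-up page fragment for illustration.
SAMPLE_PAGE = (
    '<form>'
    '<input type="hidden" name="hash" value="abc123">'
    '<input type="hidden" name="t" id="t1" value="">'
    '</form>'
)


def build_post_body(page):
    """Return the urlencoded form body as bytes, ready to use as POST data."""
    fields = dict(HIDDEN_INPUT_RE.findall(page))
    return urlencode(fields).encode('ascii')


body = build_post_body(SAMPLE_PAGE)
```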


@@ -0,0 +1,88 @@
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
ExtractorError,
unified_strdate,
int_or_none,
)
class Puls4IE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?puls4\.com/video/[^/]+/play/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.puls4.com/video/pro-und-contra/play/2716816',
'md5': '49f6a6629747eeec43cef6a46b5df81d',
'info_dict': {
'id': '2716816',
'ext': 'mp4',
'title': 'Pro und Contra vom 23.02.2015',
'description': 'md5:293e44634d9477a67122489994675db6',
'duration': 2989,
'upload_date': '20150224',
'uploader': 'PULS_4',
},
'skip': 'Only works from Germany',
}, {
'url': 'http://www.puls4.com/video/kult-spielfilme/play/1298106',
'md5': '6a48316c8903ece8dab9b9a7bf7a59ec',
'info_dict': {
'id': '1298106',
'ext': 'mp4',
'title': 'Lucky Fritz',
},
'skip': 'Only works from Germany',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
error_message = self._html_search_regex(
r'<div class="message-error">(.+?)</div>',
webpage, 'error message', default=None)
if error_message:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
real_url = self._html_search_regex(
r'\"fsk-button\".+?href=\"([^"]+)',
webpage, 'fsk_button', default=None)
if real_url:
webpage = self._download_webpage(real_url, video_id)
player = self._search_regex(
r'p4_video_player(?:_iframe)?\("video_\d+_container"\s*,(.+?)\);\s*\}',
webpage, 'player')
player_json = self._parse_json(
'[%s]' % player, video_id,
transform_source=lambda s: s.replace('undefined,', ''))
formats = None
result = None
for v in player_json:
if isinstance(v, list) and not formats:
formats = [{
'url': f['url'],
'format': 'hd' if f.get('hd') else 'sd',
'width': int_or_none(f.get('size_x')),
'height': int_or_none(f.get('size_y')),
'tbr': int_or_none(f.get('bitrate')),
} for f in v]
self._sort_formats(formats)
elif isinstance(v, dict) and not result:
result = {
'id': video_id,
'title': v['videopartname'].strip(),
'description': v.get('videotitle'),
'duration': int_or_none(v.get('videoduration') or v.get('episodeduration')),
'upload_date': unified_strdate(v.get('clipreleasetime')),
'uploader': v.get('channel'),
}
result['formats'] = formats
return result
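
Note on the Puls4 extractor above: the arguments of the JavaScript player call are wrapped in `[...]` and parsed as JSON after dropping `undefined,`; the first list is then taken as the formats and the first dict as the metadata. The sketch below shows that approach on a made-up argument string; the function names are hypothetical.

```python
import json


def parse_player_args(js_args):
    """Parse a JS argument list as a JSON array after removing 'undefined,' entries."""
    cleaned = js_args.replace('undefined,', '')
    return json.loads('[%s]' % cleaned)


def split_player_args(values):
    """Return (first list, first dict) found among the parsed arguments."""
    formats, meta = None, None
    for value in values:
        if isinstance(value, list) and formats is None:
            formats = value
        elif isinstance(value, dict) and meta is None:
            meta = value
    return formats, meta


# Made-up argument string for illustration.
SAMPLE_ARGS = ('undefined, [{"url": "http://example.com/a.mp4", "hd": true}], '
               '{"videopartname": " Lucky Fritz "}')
formats, meta = split_player_args(parse_player_args(SAMPLE_ARGS))
```

This only works while the remaining arguments are valid JSON. Single-quoted strings or a trailing `undefined` without a comma would still fail to parse.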


@@ -146,7 +146,7 @@ class RTLnowIE(InfoExtractor):
mobj = re.search(r'.*/(?P<hoster>[^/]+)/videos/(?P<play_path>.+)\.f4m', filename.text)
if mobj:
fmt = {
'url': 'rtmpe://fmspay-fra2.rtl.de/' + mobj.group('hoster'),
'url': 'rtmpe://fms.rtl.de/' + mobj.group('hoster'),
'play_path': 'mp4:' + mobj.group('play_path'),
'page_url': url,
'player_url': video_page_url + 'includes/vodplayer.swf',


@@ -8,8 +8,9 @@ import time
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
struct_unpack,
float_or_none,
remove_end,
struct_unpack,
)
@@ -67,6 +68,7 @@ class RTVEALaCartaIE(InfoExtractor):
'id': '2491869',
'ext': 'mp4',
'title': 'Balonmano - Swiss Cup masculina. Final: España-Suecia',
'duration': 5024.566,
},
}, {
'note': 'Live stream',
@@ -113,16 +115,59 @@ class RTVEALaCartaIE(InfoExtractor):
'thumbnail': info.get('image'),
'page_url': url,
'subtitles': subtitles,
'duration': float_or_none(info.get('duration'), scale=1000),
}
def _get_subtitles(self, video_id, sub_file):
subs = self._download_json(
sub_file + '.json', video_id,
'Downloading subtitles info')['page']['items']
return dict((s['lang'], [{'ext': 'vtt', 'url': s['src']}])
return dict(
(s['lang'], [{'ext': 'vtt', 'url': s['src']}])
for s in subs)
class RTVEInfantilIE(InfoExtractor):
IE_NAME = 'rtve.es:infantil'
IE_DESC = 'RTVE infantil'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/infantil/serie/(?P<show>[^/]*)/video/(?P<short_title>[^/]*)/(?P<id>[0-9]+)/'
_TESTS = [{
'url': 'http://www.rtve.es/infantil/serie/cleo/video/maneras-vivir/3040283/',
'md5': '915319587b33720b8e0357caaa6617e6',
'info_dict': {
'id': '3040283',
'ext': 'mp4',
'title': 'Maneras de vivir',
'thumbnail': 'http://www.rtve.es/resources/jpg/6/5/1426182947956.JPG',
'duration': 357.958,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
info = self._download_json(
'http://www.rtve.es/api/videos/%s/config/alacarta_videos.json' % video_id,
video_id)['page']['items'][0]
webpage = self._download_webpage(url, video_id)
vidplayer_id = self._search_regex(
r' id="vidplayer([0-9]+)"', webpage, 'internal video ID')
png_url = 'http://www.rtve.es/ztnr/movil/thumbnail/default/videos/%s.png' % vidplayer_id
png = self._download_webpage(png_url, video_id, 'Downloading url information')
video_url = _decrypt_url(png)
return {
'id': video_id,
'ext': 'mp4',
'title': info['title'],
'url': video_url,
'thumbnail': info.get('image'),
'duration': float_or_none(info.get('duration'), scale=1000),
}
class RTVELiveIE(InfoExtractor):
IE_NAME = 'rtve.es:live'
IE_DESC = 'RTVE.es live streams'


@@ -4,22 +4,87 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .common import compat_str
from ..compat import (
compat_str,
compat_urllib_request
)
from ..utils import sanitize_url_path_consecutive_slashes
class SohuIE(InfoExtractor):
_VALID_URL = r'https?://(?P<mytv>my\.)?tv\.sohu\.com/.+?/(?(mytv)|n)(?P<id>\d+)\.shtml.*?'
_TEST = {
_TESTS = [{
'note': 'This video is available only in Mainland China',
'url': 'http://tv.sohu.com/20130724/n382479172.shtml#super',
'md5': 'bde8d9a6ffd82c63a1eefaef4eeefec7',
'md5': '29175c8cadd8b5cc4055001e85d6b372',
'info_dict': {
'id': '382479172',
'ext': 'mp4',
'title': 'MVFar East Movement《The Illest》',
},
'skip': 'Only available from China',
}
'params': {
'cn_verification_proxy': 'proxy.uku.im:8888'
}
}, {
'url': 'http://tv.sohu.com/20150305/n409385080.shtml',
'md5': '699060e75cf58858dd47fb9c03c42cfb',
'info_dict': {
'id': '409385080',
'ext': 'mp4',
'title': '《2015湖南卫视羊年元宵晚会》唐嫣《花好月圆》',
}
}, {
'url': 'http://my.tv.sohu.com/us/232799889/78693464.shtml',
'md5': '9bf34be48f2f4dadcb226c74127e203c',
'info_dict': {
'id': '78693464',
'ext': 'mp4',
'title': '【爱范品】第31期MWC见不到的奇葩手机',
}
}, {
'note': 'Multipart video',
'url': 'http://my.tv.sohu.com/pl/8384802/78910339.shtml',
'info_dict': {
'id': '78910339',
},
'playlist': [{
'md5': 'bdbfb8f39924725e6589c146bc1883ad',
'info_dict': {
'id': '78910339_part1',
'ext': 'mp4',
'duration': 294,
'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
}
}, {
'md5': '3e1f46aaeb95354fd10e7fca9fc1804e',
'info_dict': {
'id': '78910339_part2',
'ext': 'mp4',
'duration': 300,
'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
}
}, {
'md5': '8407e634175fdac706766481b9443450',
'info_dict': {
'id': '78910339_part3',
'ext': 'mp4',
'duration': 150,
'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
}
}]
}, {
'note': 'Video with title containing dash',
'url': 'http://my.tv.sohu.com/us/249884221/78932792.shtml',
'info_dict': {
'id': '78932792',
'ext': 'mp4',
'title': 'youtube-dl testing video',
},
'params': {
'skip_download': True
}
}]
def _real_extract(self, url):
@@ -29,8 +94,14 @@ class SohuIE(InfoExtractor):
else:
base_data_url = 'http://hot.vrs.sohu.com/vrs_flash.action?vid='
req = compat_urllib_request.Request(base_data_url + vid_id)
cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
if cn_verification_proxy:
req.add_header('Ytdl-request-proxy', cn_verification_proxy)
return self._download_json(
base_data_url + vid_id, video_id,
req, video_id,
'Downloading JSON data for %s' % vid_id)
mobj = re.match(self._VALID_URL, url)
@@ -38,10 +109,8 @@ class SohuIE(InfoExtractor):
mytv = mobj.group('mytv') is not None
webpage = self._download_webpage(url, video_id)
raw_title = self._html_search_regex(
r'(?s)<title>(.+?)</title>',
webpage, 'video title')
title = raw_title.partition('-')[0].strip()
title = self._og_search_title(webpage)
vid = self._html_search_regex(
r'var vid ?= ?["\'](\d+)["\']',
@@ -77,7 +146,9 @@ class SohuIE(InfoExtractor):
% (format_id, i + 1, part_count))
part_info = part_str.split('|')
video_url = '%s%s?key=%s' % (part_info[0], su[i], part_info[3])
video_url = sanitize_url_path_consecutive_slashes(
'%s%s?key=%s' % (part_info[0], su[i], part_info[3]))
formats.append({
'url': video_url,
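
Note on the Sohu hunk above: the part URL is assembled from several server-provided pieces, and the result is passed through `sanitize_url_path_consecutive_slashes`. Judging by the name, the helper collapses repeated slashes in the path. A sketch under that assumption follows; `collapse_path_slashes` is a stand-in, not the library function.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_path_slashes(url):
    """Collapse runs of '/' in the path only; scheme, host and query stay untouched."""
    parts = urlsplit(url)
    path = re.sub(r'/{2,}', '/', parts.path)
    return urlunsplit(parts._replace(path=path))


# Made-up pieces that produce a doubled slash when concatenated.
prefix, suffix, key = 'http://data.example.com/', '/ipad/part1.mp4', 'abc'
video_url = collapse_path_slashes('%s%s?key=%s' % (prefix, suffix, key))
```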


@@ -180,7 +180,7 @@ class SoundcloudIE(InfoExtractor):
'format_id': key,
'url': url,
'play_path': 'mp3:' + path,
'ext': ext,
'ext': 'flv',
'vcodec': 'none',
})
@@ -200,8 +200,9 @@ class SoundcloudIE(InfoExtractor):
if f['format_id'].startswith('rtmp'):
f['protocol'] = 'rtmp'
self._sort_formats(formats)
result['formats'] = formats
self._check_formats(formats, track_id)
self._sort_formats(formats)
result['formats'] = formats
return result


@@ -0,0 +1,58 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
unescapeHTML,
parse_duration,
)
class SSAIE(InfoExtractor):
_VALID_URL = r'http://ssa\.nls\.uk/film/(?P<id>\d+)'
_TEST = {
'url': 'http://ssa.nls.uk/film/3561',
'info_dict': {
'id': '3561',
'ext': 'flv',
'title': 'SHETLAND WOOL',
'description': 'md5:c5afca6871ad59b4271e7704fe50ab04',
'duration': 900,
'thumbnail': 're:^https?://.*\.jpg$',
},
'params': {
# rtmp download
'skip_download': True,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
streamer = self._search_regex(
r"'streamer'\s*,\s*'(rtmp[^']+)'", webpage, 'streamer')
play_path = self._search_regex(
r"'file'\s*,\s*'([^']+)'", webpage, 'file').rpartition('.')[0]
def search_field(field_name, fatal=False):
return self._search_regex(
r'<span\s+class="field_title">%s:</span>\s*<span\s+class="field_content">([^<]+)</span>' % field_name,
webpage, 'title', fatal=fatal)
title = unescapeHTML(search_field('Title', fatal=True)).strip('()[]')
description = unescapeHTML(search_field('Description'))
duration = parse_duration(search_field('Running time'))
thumbnail = self._search_regex(
r"'image'\s*,\s*'([^']+)'", webpage, 'thumbnails', fatal=False)
return {
'id': video_id,
'url': streamer,
'play_path': play_path,
'ext': 'flv',
'title': title,
'description': description,
'duration': duration,
'thumbnail': thumbnail,
}


@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
@@ -8,23 +10,40 @@ from ..utils import (
class SVTPlayIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?svtplay\.se/video/(?P<id>[0-9]+)'
_TEST = {
IE_DESC = 'SVT Play and Öppet arkiv'
_VALID_URL = r'https?://(?:www\.)?(?P<host>svtplay|oppetarkiv)\.se/video/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.svtplay.se/video/2609989/sm-veckan/sm-veckan-rally-final-sasong-1-sm-veckan-rally-final',
'md5': 'f4a184968bc9c802a9b41316657aaa80',
'md5': 'ade3def0643fa1c40587a422f98edfd9',
'info_dict': {
'id': '2609989',
'ext': 'mp4',
'ext': 'flv',
'title': 'SM veckan vinter, Örebro - Rally, final',
'duration': 4500,
'thumbnail': 're:^https?://.*[\.-]jpg$',
'age_limit': 0,
},
}
}, {
'url': 'http://www.oppetarkiv.se/video/1058509/rederiet-sasong-1-avsnitt-1-av-318',
'md5': 'c3101a17ce9634f4c1f9800f0746c187',
'info_dict': {
'id': '1058509',
'ext': 'flv',
'title': 'Farlig kryssning',
'duration': 2566,
'thumbnail': 're:^https?://.*[\.-]jpg$',
'age_limit': 0,
},
'skip': 'Only works from Sweden',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
host = mobj.group('host')
info = self._download_json(
'http://www.svtplay.se/video/%s?output=json' % video_id, video_id)
'http://www.%s.se/video/%s?output=json' % (host, video_id), video_id)
title = info['context']['title']
thumbnail = info['context'].get('thumbnailImage')
@@ -33,11 +52,16 @@ class SVTPlayIE(InfoExtractor):
formats = []
for vr in video_info['videoReferences']:
vurl = vr['url']
if determine_ext(vurl) == 'm3u8':
ext = determine_ext(vurl)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
vurl, video_id,
ext='mp4', entry_protocol='m3u8_native',
m3u8_id=vr.get('playerType')))
elif ext == 'f4m':
formats.extend(self._extract_f4m_formats(
vurl + '?hdcore=3.3.0', video_id,
f4m_id=vr.get('playerType')))
else:
formats.append({
'format_id': vr.get('playerType'),
@@ -46,6 +70,7 @@ class SVTPlayIE(InfoExtractor):
self._sort_formats(formats)
duration = video_info.get('materialLength')
age_limit = 18 if video_info.get('inappropriateForChildren') else 0
return {
'id': video_id,
@@ -53,4 +78,5 @@ class SVTPlayIE(InfoExtractor):
'formats': formats,
'thumbnail': thumbnail,
'duration': duration,
'age_limit': age_limit,
}
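
Note on the SVT Play hunks above: each video reference is now dispatched on the extension of its URL (`m3u8`, `f4m`, or anything else as a direct link). The following sketch shows that dispatch in isolation; `guess_ext` is a simplified stand-in for `determine_ext`, and the returned labels are illustrative only.

```python
import os.path
from urllib.parse import urlsplit


def guess_ext(url, default=None):
    """Return the extension of the URL path, ignoring the query string."""
    path = urlsplit(url).path
    ext = os.path.splitext(path)[1].lstrip('.')
    return ext or default


def classify_reference(vurl):
    """Name the branch of the if/elif/else in the diff that a URL would take."""
    ext = guess_ext(vurl)
    if ext == 'm3u8':
        return 'hls'      # handled by _extract_m3u8_formats in the diff
    if ext == 'f4m':
        return 'hds'      # handled by _extract_f4m_formats in the diff
    return 'direct'       # appended as a single format
```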


@@ -53,10 +53,10 @@ class TeamcocoIE(InfoExtractor):
embed = self._download_webpage(
embed_url, video_id, 'Downloading embed page')
encoded_data = self._search_regex(
r'"preload"\s*:\s*"([^"]+)"', embed, 'encoded data')
player_data = self._parse_json(self._search_regex(
r'Y\.Ginger\.Module\.Player\((\{.*?\})\);', embed, 'player data'), video_id)
data = self._parse_json(
base64.b64decode(encoded_data.encode('ascii')).decode('utf-8'), video_id)
base64.b64decode(player_data['preload'].encode('ascii')).decode('utf-8'), video_id)
formats = []
get_quality = qualities(['500k', '480p', '1000k', '720p', '1080p'])
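
Note on the Teamcoco hunk above: the embed page now yields a JSON object whose `preload` member is base64-encoded JSON, decoded with `base64.b64decode(...).decode('utf-8')`. A minimal sketch of that decoding step follows; the sample object is generated locally for illustration and `decode_preload` is a hypothetical name.

```python
import base64
import json


def decode_preload(player_data):
    """Decode the base64 'preload' member and parse it as JSON."""
    raw = base64.b64decode(player_data['preload'].encode('ascii'))
    return json.loads(raw.decode('utf-8'))


# Made-up player data: encode a small JSON document the same way in reverse.
SAMPLE_PLAYER_DATA = {
    'preload': base64.b64encode(
        json.dumps({'title': 'Example clip'}).encode('utf-8')).decode('ascii'),
}
data = decode_preload(SAMPLE_PLAYER_DATA)
```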


@@ -6,9 +6,9 @@ from .mitele import MiTeleIE
class TelecincoIE(MiTeleIE):
IE_NAME = 'telecinco.es'
_VALID_URL = r'https?://www\.telecinco\.es/[^/]+/[^/]+/[^/]+/(?P<id>.*?)\.html'
_VALID_URL = r'https?://www\.telecinco\.es/[^/]+/[^/]+/(?:[^/]+/)?(?P<id>.*?)\.html'
_TEST = {
_TESTS = [{
'url': 'http://www.telecinco.es/robinfood/temporada-01/t01xp14/Bacalao-cocochas-pil-pil_0_1876350223.html',
'info_dict': {
'id': 'MDSVID20141015_0058',
@@ -16,4 +16,7 @@ class TelecincoIE(MiTeleIE):
'title': 'Con Martín Berasategui, hacer un bacalao al ...',
'duration': 662,
},
}
}, {
'url': 'http://www.telecinco.es/informativos/nacional/Pablo_Iglesias-Informativos_Telecinco-entrevista-Pedro_Piqueras_2_1945155182.html',
'only_matching': True,
}]


@@ -16,6 +16,7 @@ class TVPlayIE(InfoExtractor):
_VALID_URL = r'''(?x)http://(?:www\.)?
(?:tvplay\.lv/parraides|
tv3play\.lt/programos|
play\.tv3\.lt/programos|
tv3play\.ee/sisu|
tv3play\.se/program|
tv6play\.se/program|
@@ -45,7 +46,7 @@ class TVPlayIE(InfoExtractor):
},
},
{
'url': 'http://www.tv3play.lt/programos/moterys-meluoja-geriau/409229?autostart=true',
'url': 'http://play.tv3.lt/programos/moterys-meluoja-geriau/409229?autostart=true',
'info_dict': {
'id': '409229',
'ext': 'flv',


@@ -23,6 +23,8 @@ class TwitchBaseIE(InfoExtractor):
_API_BASE = 'https://api.twitch.tv'
_USHER_BASE = 'http://usher.twitch.tv'
_LOGIN_URL = 'https://secure.twitch.tv/user/login'
_LOGIN_POST_URL = 'https://secure-login.twitch.tv/login'
_NETRC_MACHINE = 'twitch'
def _handle_error(self, response):
if not isinstance(response, dict):
@@ -34,7 +36,15 @@ class TwitchBaseIE(InfoExtractor):
expected=True)
def _download_json(self, url, video_id, note='Downloading JSON metadata'):
response = super(TwitchBaseIE, self)._download_json(url, video_id, note)
headers = {
'Referer': 'http://api.twitch.tv/crossdomain/receiver.html?v=2',
'X-Requested-With': 'XMLHttpRequest',
}
for cookie in self._downloader.cookiejar:
if cookie.name == 'api_token':
headers['Twitch-Api-Token'] = cookie.value
request = compat_urllib_request.Request(url, headers=headers)
response = super(TwitchBaseIE, self)._download_json(request, video_id, note)
self._handle_error(response)
return response
@@ -58,14 +68,14 @@ class TwitchBaseIE(InfoExtractor):
'authenticity_token': authenticity_token,
'redirect_on_login': '',
'embed_form': 'false',
'mp_source_action': '',
'mp_source_action': 'login-button',
'follow': '',
'user[login]': username,
'user[password]': password,
'login': username,
'password': password,
}
request = compat_urllib_request.Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
self._LOGIN_POST_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in as %s' % username)
@@ -76,6 +86,14 @@ class TwitchBaseIE(InfoExtractor):
raise ExtractorError(
'Unable to login: %s' % m.group('msg').strip(), expected=True)
def _prefer_source(self, formats):
try:
source = next(f for f in formats if f['format_id'] == 'Source')
source['preference'] = 10
except StopIteration:
pass # No Source stream present
self._sort_formats(formats)
class TwitchItemBaseIE(TwitchBaseIE):
def _download_info(self, item, item_id):
@@ -131,7 +149,7 @@ class TwitchItemBaseIE(TwitchBaseIE):
class TwitchVideoIE(TwitchItemBaseIE):
IE_NAME = 'twitch:video'
_VALID_URL = r'%s/[^/]+/b/(?P<id>[^/]+)' % TwitchBaseIE._VALID_URL_BASE
_VALID_URL = r'%s/[^/]+/b/(?P<id>\d+)' % TwitchBaseIE._VALID_URL_BASE
_ITEM_TYPE = 'video'
_ITEM_SHORTCUT = 'a'
@@ -147,7 +165,7 @@ class TwitchVideoIE(TwitchItemBaseIE):
class TwitchChapterIE(TwitchItemBaseIE):
IE_NAME = 'twitch:chapter'
_VALID_URL = r'%s/[^/]+/c/(?P<id>[^/]+)' % TwitchBaseIE._VALID_URL_BASE
_VALID_URL = r'%s/[^/]+/c/(?P<id>\d+)' % TwitchBaseIE._VALID_URL_BASE
_ITEM_TYPE = 'chapter'
_ITEM_SHORTCUT = 'c'
@@ -166,7 +184,7 @@ class TwitchChapterIE(TwitchItemBaseIE):
class TwitchVodIE(TwitchItemBaseIE):
IE_NAME = 'twitch:vod'
_VALID_URL = r'%s/[^/]+/v/(?P<id>[^/]+)' % TwitchBaseIE._VALID_URL_BASE
_VALID_URL = r'%s/[^/]+/v/(?P<id>\d+)' % TwitchBaseIE._VALID_URL_BASE
_ITEM_TYPE = 'vod'
_ITEM_SHORTCUT = 'v'
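Review note: the three `_VALID_URL` changes above narrow the id group from `[^/]+` to `\d+`. The sketch below illustrates the difference on a URL that carries a query string. Only the tail of each pattern is reproduced, the URL is made up for illustration, and `match_id` is a hypothetical helper.

```python
import re

# Only the tail of each pattern; the real ones are prefixed with
# TwitchBaseIE._VALID_URL_BASE.
OLD_ID = r'/b/(?P<id>[^/]+)'
NEW_ID = r'/b/(?P<id>\d+)'

# Hypothetical URL with a timestamp query appended.
url = 'http://www.twitch.tv/somechannel/b/577357806?t=49m'


def match_id(pattern, url):
    """Return the 'id' group of the first match of pattern in url."""
    return re.search(pattern, url).group('id')


print(match_id(OLD_ID, url))  # id polluted by the query string
print(match_id(NEW_ID, url))  # digits only
```

With the old pattern the query string leaks into the id, which is what restricting the group to digits avoids.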
@@ -200,6 +218,7 @@ class TwitchVodIE(TwitchItemBaseIE):
'%s/vod/%s?nauth=%s&nauthsig=%s'
% (self._USHER_BASE, item_id, access_token['token'], access_token['sig']),
item_id, 'mp4')
self._prefer_source(formats)
info['formats'] = formats
return info
@@ -340,21 +359,14 @@ class TwitchStreamIE(TwitchBaseIE):
'p': random.randint(1000000, 10000000),
'player': 'twitchweb',
'segment_preference': '4',
'sig': access_token['sig'],
'token': access_token['token'],
'sig': access_token['sig'].encode('utf-8'),
'token': access_token['token'].encode('utf-8'),
}
formats = self._extract_m3u8_formats(
'%s/api/channel/hls/%s.m3u8?%s'
% (self._USHER_BASE, channel_id, compat_urllib_parse.urlencode(query).encode('utf-8')),
% (self._USHER_BASE, channel_id, compat_urllib_parse.urlencode(query)),
channel_id, 'mp4')
# prefer the 'source' stream, the others are limited to 30 fps
def _sort_source(f):
if f.get('m3u8_media') is not None and f['m3u8_media'].get('NAME') == 'Source':
return 1
return 0
formats = sorted(formats, key=_sort_source)
self._prefer_source(formats)
view_count = stream.get('viewers')
timestamp = parse_iso8601(stream.get('created_at'))

@@ -0,0 +1,104 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
qualities,
unified_strdate,
clean_html,
)
class UltimediaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ultimedia\.com/default/index/video[^/]+/id/(?P<id>[\d+a-z]+)'
_TESTS = [{
# news
'url': 'https://www.ultimedia.com/default/index/videogeneric/id/s8uk0r',
'md5': '276a0e49de58c7e85d32b057837952a2',
'info_dict': {
'id': 's8uk0r',
'ext': 'mp4',
'title': 'Loi sur la fin de vie: le texte prévoit un renforcement des directives anticipées',
'description': 'md5:3e5c8fd65791487333dda5db8aed32af',
'thumbnail': 're:^https?://.*\.jpg',
'upload_date': '20150317',
},
}, {
# music
'url': 'https://www.ultimedia.com/default/index/videomusic/id/xvpfp8',
'md5': '2ea3513813cf230605c7e2ffe7eca61c',
'info_dict': {
'id': 'xvpfp8',
'ext': 'mp4',
'title': "Two - C'est la vie (Clip)",
'description': 'Two',
'thumbnail': 're:^https?://.*\.jpg',
'upload_date': '20150224',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
deliver_url = self._search_regex(
r'<iframe[^>]+src="(https?://(?:www\.)?ultimedia\.com/deliver/[^"]+)"',
webpage, 'deliver URL')
deliver_page = self._download_webpage(
deliver_url, video_id, 'Downloading iframe page')
if '>This video is currently not available' in deliver_page:
raise ExtractorError(
'Video %s is currently not available' % video_id, expected=True)
player = self._parse_json(
self._search_regex(
r"jwplayer\('player(?:_temp)?'\)\.setup\(({.+?})\)\.on", deliver_page, 'player'),
video_id)
quality = qualities(['flash', 'html5'])
formats = []
for mode in player['modes']:
video_url = mode.get('config', {}).get('file')
if not video_url:
continue
if re.match(r'https?://www\.youtube\.com/.+?', video_url):
return self.url_result(video_url, 'Youtube')
formats.append({
'url': video_url,
'format_id': mode.get('type'),
'quality': quality(mode.get('type')),
})
self._sort_formats(formats)
thumbnail = player.get('image')
title = clean_html((
self._html_search_regex(
r'(?s)<div\s+id="catArticle">.+?</div>(.+?)</h1>',
webpage, 'title', default=None)
or self._search_regex(
r"var\s+nameVideo\s*=\s*'([^']+)'",
deliver_page, 'title')))
description = clean_html(self._html_search_regex(
r'(?s)<span>Description</span>(.+?)</p>', webpage,
'description', fatal=False))
upload_date = unified_strdate(self._search_regex(
r'Ajouté le\s*<span>([^<]+)', webpage,
'upload date', fatal=False))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': upload_date,
'formats': formats,
}
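Review note: the extractor above ranks its formats with `qualities(['flash', 'html5'])`. A minimal sketch of how such a ranking helper could behave is given below. It is written from the way the helper is used here, not copied from `..utils`, so treat the exact fallback value as an assumption.

```python
def qualities(quality_ids):
    """Return a function mapping a quality id to its rank in quality_ids."""
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1  # Assumption: unknown ids rank below every known one
    return q


quality = qualities(['flash', 'html5'])
print(quality('flash'), quality('html5'), quality(None))
```

Listing `html5` last gives it the higher rank, so an HTML5 file would be preferred over a Flash one when both modes are present.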

@@ -4,28 +4,21 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
remove_start,
)
from ..compat import compat_urllib_request
class VideoMegaIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://
(?:www\.)?videomega\.tv/
(?:iframe\.php)?\?ref=(?P<id>[A-Za-z0-9]+)
(?:iframe\.php|cdn\.php)?\?ref=(?P<id>[A-Za-z0-9]+)
'''
_TEST = {
'url': 'http://videomega.tv/?ref=QR0HCUHI1661IHUCH0RQ',
'url': 'http://videomega.tv/?ref=4GNA688SU99US886ANG4',
'md5': 'bf5c2f95c4c917536e80936af7bc51e1',
'info_dict': {
'id': 'QR0HCUHI1661IHUCH0RQ',
'id': '4GNA688SU99US886ANG4',
'ext': 'mp4',
'title': 'Big Buck Bunny',
'title': 'BigBuckBunny_320x180',
'thumbnail': 're:^https?://.*\.jpg$',
}
}
@@ -33,34 +26,24 @@ class VideoMegaIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
iframe_url = 'http://videomega.tv/iframe.php?ref={0:}'.format(video_id)
iframe_url = 'http://videomega.tv/cdn.php?ref=%s' % video_id
req = compat_urllib_request.Request(iframe_url)
req.add_header('Referer', url)
webpage = self._download_webpage(req, video_id)
try:
escaped_data = re.findall(r'unescape\("([^"]+)"\)', webpage)[-1]
except IndexError:
raise ExtractorError('Unable to extract escaped data')
playlist = compat_urllib_parse.unquote(escaped_data)
title = self._html_search_regex(
r'<title>(.*?)</title>', webpage, 'title')
title = re.sub(
r'(?:^[Vv]ideo[Mm]ega\.tv\s-\s?|\s?-\svideomega\.tv$)', '', title)
thumbnail = self._search_regex(
r'image:\s*"([^"]+)"', playlist, 'thumbnail', fatal=False)
video_url = self._search_regex(r'file:\s*"([^"]+)"', playlist, 'URL')
title = remove_start(self._html_search_regex(
r'<title>(.*?)</title>', webpage, 'title'), 'VideoMega.tv - ')
formats = [{
'format_id': 'sd',
'url': video_url,
}]
self._sort_formats(formats)
r'<video[^>]+?poster="([^"]+)"', webpage, 'thumbnail', fatal=False)
video_url = self._search_regex(
r'<source[^>]+?src="([^"]+)"', webpage, 'video URL')
return {
'id': video_id,
'title': title,
'formats': formats,
'url': video_url,
'thumbnail': thumbnail,
'http_headers': {
'Referer': iframe_url,
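Review note: the title clean-up in this file moves from a `re.sub` that stripped either a leading or a trailing site name to `remove_start` with a fixed prefix. The sketch below assumes `remove_start` simply drops the prefix when present; the function body is an illustration, not the `..utils` source.

```python
import re


def remove_start(s, start):
    """Drop `start` from the beginning of `s` if it is there (assumed behaviour)."""
    return s[len(start):] if s.startswith(start) else s


def old_clean(title):
    """The previous regex-based clean-up, as shown in the removed lines."""
    return re.sub(
        r'(?:^[Vv]ideo[Mm]ega\.tv\s-\s?|\s?-\svideomega\.tv$)', '', title)


title = 'VideoMega.tv - BigBuckBunny_320x180'
print(remove_start(title, 'VideoMega.tv - '))
print(old_clean(title))
```

Both give the same result for the prefixed form. A title that only carries the trailing ` - videomega.tv` suffix would be cleaned by the old regex but left untouched by the prefix-only version.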

@@ -1,7 +1,5 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
@@ -28,12 +26,11 @@ class VidmeIE(InfoExtractor):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(r'<source src="([^"]+)"', webpage, 'video URL')
video_url = self._html_search_regex(
r'<source src="([^"]+)"', webpage, 'video URL')
title = self._og_search_title(webpage)
description = self._og_search_description(webpage, default='')
@@ -44,13 +41,10 @@ class VidmeIE(InfoExtractor):
duration = float_or_none(self._html_search_regex(
r'data-duration="([^"]+)"', webpage, 'duration', fatal=False))
view_count = str_to_int(self._html_search_regex(
r'<span class="video_views">\s*([\d,\.]+)\s*plays?', webpage, 'view count', fatal=False))
r'<(?:li|span) class="video_views">\s*([\d,\.]+)\s*plays?', webpage, 'view count', fatal=False))
like_count = str_to_int(self._html_search_regex(
r'class="score js-video-vote-score"[^>]+data-score="([\d,\.\s]+)">',
webpage, 'like count', fatal=False))
comment_count = str_to_int(self._html_search_regex(
r'class="js-comment-count"[^>]+data-count="([\d,\.\s]+)">',
webpage, 'comment count', fatal=False))
return {
'id': video_id,
@@ -64,5 +58,4 @@ class VidmeIE(InfoExtractor):
'duration': duration,
'view_count': view_count,
'like_count': like_count,
'comment_count': comment_count,
}

@@ -0,0 +1,129 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_request
class ViewsterIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?viewster\.com/movie/(?P<id>\d+-\d+-\d+)'
_TESTS = [{
# movielink, paymethod=fre
'url': 'http://www.viewster.com/movie/1293-19341-000/hout-wood/',
'playlist': [{
'md5': '8f9d94b282d80c42b378dffdbb11caf3',
'info_dict': {
'id': '1293-19341-000-movie',
'ext': 'flv',
'title': "'Hout' (Wood) - Movie",
},
}],
'info_dict': {
'id': '1293-19341-000',
'title': "'Hout' (Wood)",
'description': 'md5:925733185a9242ef96f436937683f33b',
}
}, {
# movielink, paymethod=adv
'url': 'http://www.viewster.com/movie/1140-11855-000/the-listening-project/',
'playlist': [{
'md5': '77a005453ca7396cbe3d35c9bea30aef',
'info_dict': {
'id': '1140-11855-000-movie',
'ext': 'flv',
'title': "THE LISTENING PROJECT - Movie",
},
}],
'info_dict': {
'id': '1140-11855-000',
'title': "THE LISTENING PROJECT",
'description': 'md5:714421ae9957e112e672551094bf3b08',
}
}, {
# direct links, no movielink
'url': 'http://www.viewster.com/movie/1198-56411-000/sinister/',
'playlist': [{
'md5': '0307b7eac6bfb21ab0577a71f6eebd8f',
'info_dict': {
'id': '1198-56411-000-trailer',
'ext': 'mp4',
'title': "Sinister - Trailer",
},
}, {
'md5': '80b9ee3ad69fb368f104cb5d9732ae95',
'info_dict': {
'id': '1198-56411-000-behind-scenes',
'ext': 'mp4',
'title': "Sinister - Behind Scenes",
},
}, {
'md5': '3b3ea897ecaa91fca57a8a94ac1b15c5',
'info_dict': {
'id': '1198-56411-000-scene-from-movie',
'ext': 'mp4',
'title': "Sinister - Scene from movie",
},
}],
'info_dict': {
'id': '1198-56411-000',
'title': "Sinister",
'description': 'md5:014c40b0488848de9683566a42e33372',
}
}]
_ACCEPT_HEADER = 'application/json, text/javascript, */*; q=0.01'
def _real_extract(self, url):
video_id = self._match_id(url)
request = compat_urllib_request.Request(
'http://api.live.viewster.com/api/v1/movie/%s' % video_id)
request.add_header('Accept', self._ACCEPT_HEADER)
movie = self._download_json(
request, video_id, 'Downloading movie metadata JSON')
title = movie.get('title') or movie['original_title']
description = movie.get('synopsis')
thumbnail = movie.get('large_artwork') or movie.get('artwork')
entries = []
for clip in movie['play_list']:
entry = None
# movielink api
link_request = clip.get('link_request')
if link_request:
request = compat_urllib_request.Request(
'http://api.live.viewster.com/api/v1/movielink?movieid=%(movieid)s&action=%(action)s&paymethod=%(paymethod)s&price=%(price)s&currency=%(currency)s&language=%(language)s&subtitlelanguage=%(subtitlelanguage)s&ischromecast=%(ischromecast)s'
% link_request)
request.add_header('Accept', self._ACCEPT_HEADER)
movie_link = self._download_json(
request, video_id, 'Downloading movie link JSON', fatal=False)
if movie_link:
formats = self._extract_f4m_formats(
movie_link['url'] + '&hdcore=3.2.0&plugin=flowplayer-3.2.0.1', video_id)
self._sort_formats(formats)
entry = {
'formats': formats,
}
# direct link
clip_url = clip.get('clip_data', {}).get('url')
if clip_url:
entry = {
'url': clip_url,
'ext': 'mp4',
}
if entry:
entry.update({
'id': '%s-%s' % (video_id, clip['canonical_title']),
'title': '%s - %s' % (title, clip['title']),
})
entries.append(entry)
playlist = self.playlist_result(entries, video_id, title, description)
playlist['thumbnail'] = thumbnail
return playlist
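Review note: the movielink URL above is assembled with `%`-formatting from the `link_request` dict, so the values are inserted unescaped. One possible alternative is sketched below using `urlencode`. It uses the Python 3 standard library directly rather than the project's compat layer, `build_movielink_url` is a hypothetical helper, and the sample values are invented.

```python
from urllib.parse import urlencode

# Parameter names taken from the format string in the extractor above.
MOVIELINK_KEYS = (
    'movieid', 'action', 'paymethod', 'price', 'currency',
    'language', 'subtitlelanguage', 'ischromecast',
)


def build_movielink_url(link_request):
    """Build the movielink API URL, escaping every parameter value."""
    query = [(key, link_request[key]) for key in MOVIELINK_KEYS]
    return ('http://api.live.viewster.com/api/v1/movielink?'
            + urlencode(query))


# Invented sample; the real dict comes from clip['link_request'].
sample = {
    'movieid': '1293-19341-000', 'action': 'movierent',
    'paymethod': 'fre', 'price': 0, 'currency': '',
    'language': 'en', 'subtitlelanguage': 'x', 'ischromecast': 'false',
}
print(build_movielink_url(sample))
```

Passing a list of pairs keeps the parameter order identical to the original format string, which may matter if the API is order-sensitive.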

@@ -4,7 +4,6 @@ from __future__ import unicode_literals
import json
import re
import itertools
import hashlib
from .common import InfoExtractor
from ..compat import (
@@ -20,6 +19,7 @@ from ..utils import (
RegexNotFoundError,
smuggle_url,
std_headers,
unified_strdate,
unsmuggle_url,
urlencode_postdata,
)
@@ -38,7 +38,7 @@ class VimeoBaseInfoExtractor(InfoExtractor):
self.report_login()
login_url = 'https://vimeo.com/log_in'
webpage = self._download_webpage(login_url, None, False)
token = self._search_regex(r'xsrft: \'(.*?)\'', webpage, 'login token')
token = self._search_regex(r'xsrft = \'(.*?)\'', webpage, 'login token')
data = urlencode_postdata({
'email': username,
'password': password,
@@ -140,6 +140,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'description': 'md5:8678b246399b070816b12313e8b4eb5c',
'uploader_id': 'atencio',
'uploader': 'Peter Atencio',
'upload_date': '20130927',
'duration': 187,
},
},
@@ -176,17 +177,15 @@ class VimeoIE(VimeoBaseInfoExtractor):
password = self._downloader.params.get('videopassword', None)
if password is None:
raise ExtractorError('This video is protected by a password, use the --video-password option', expected=True)
token = self._search_regex(r'xsrft: \'(.*?)\'', webpage, 'login token')
data = compat_urllib_parse.urlencode({
token = self._search_regex(r'xsrft = \'(.*?)\'', webpage, 'login token')
data = urlencode_postdata({
'password': password,
'token': token,
})
# I didn't manage to use the password with https
if url.startswith('https'):
pass_url = url.replace('https', 'http')
else:
pass_url = url
password_request = compat_urllib_request.Request(pass_url + '/password', data)
if url.startswith('http://'):
# vimeo only supports https now, but the user can give an http url
url = url.replace('http://', 'https://')
password_request = compat_urllib_request.Request(url + '/password', data)
password_request.add_header('Content-Type', 'application/x-www-form-urlencoded')
password_request.add_header('Cookie', 'xsrft=%s' % token)
return self._download_webpage(
@@ -223,12 +222,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
video_id = mobj.group('id')
orig_url = url
if mobj.group('pro') or mobj.group('player'):
url = 'http://player.vimeo.com/video/' + video_id
password = self._downloader.params.get('videopassword', None)
if password:
headers['Cookie'] = '%s_password=%s' % (
video_id, hashlib.md5(password.encode('utf-8')).hexdigest())
url = 'https://player.vimeo.com/video/' + video_id
# Retrieve video webpage to extract further information
request = compat_urllib_request.Request(url, None, headers)
@@ -323,9 +317,9 @@ class VimeoIE(VimeoBaseInfoExtractor):
# Extract upload date
video_upload_date = None
mobj = re.search(r'<meta itemprop="dateCreated" content="(\d{4})-(\d{2})-(\d{2})T', webpage)
mobj = re.search(r'<time[^>]+datetime="([^"]+)"', webpage)
if mobj is not None:
video_upload_date = mobj.group(1) + mobj.group(2) + mobj.group(3)
video_upload_date = unified_strdate(mobj.group(1))
try:
view_count = int(self._search_regex(r'UserPlays:(\d+)', webpage, 'view count'))
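Review note: the upload date is now read from a `<time datetime="...">` attribute and normalised through `unified_strdate`. The sketch below shows the same idea with the standard library only. `upload_date_from_html` is a hypothetical helper, it handles just the ISO 8601 date prefix (the real `unified_strdate` accepts more formats), and the HTML snippet is invented.

```python
import re


def upload_date_from_html(webpage):
    """Return the upload date as YYYYMMDD, or None if it cannot be found."""
    mobj = re.search(r'<time[^>]+datetime="([^"]+)"', webpage)
    if mobj is None:
        return None
    # Assumption: the attribute starts with an ISO 8601 date (YYYY-MM-DD).
    date = re.match(r'(\d{4})-(\d{2})-(\d{2})', mobj.group(1))
    return ''.join(date.groups()) if date else None


# Invented markup for illustration.
html = '<time class="x" datetime="2013-09-27T14:00:00-04:00">1 year ago</time>'
print(upload_date_from_html(html))
```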
@@ -379,7 +373,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
for tt in text_tracks:
subtitles[tt['lang']] = [{
'ext': 'vtt',
'url': 'http://vimeo.com' + tt['url'],
'url': 'https://vimeo.com' + tt['url'],
}]
return {
@@ -402,11 +396,11 @@ class VimeoIE(VimeoBaseInfoExtractor):
class VimeoChannelIE(InfoExtractor):
IE_NAME = 'vimeo:channel'
_VALID_URL = r'https?://vimeo\.com/channels/(?P<id>[^/?#]+)/?(?:$|[?#])'
_VALID_URL = r'https://vimeo\.com/channels/(?P<id>[^/?#]+)/?(?:$|[?#])'
_MORE_PAGES_INDICATOR = r'<a.+?rel="next"'
_TITLE_RE = r'<link rel="alternate"[^>]+?title="(.*?)"'
_TESTS = [{
'url': 'http://vimeo.com/channels/tributes',
'url': 'https://vimeo.com/channels/tributes',
'info_dict': {
'id': 'tributes',
'title': 'Vimeo Tributes',
@@ -435,10 +429,10 @@ class VimeoChannelIE(InfoExtractor):
name="([^"]+)"\s+
value="([^"]*)"
''', login_form))
token = self._search_regex(r'xsrft: \'(.*?)\'', webpage, 'login token')
token = self._search_regex(r'xsrft = \'(.*?)\'', webpage, 'login token')
fields['token'] = token
fields['password'] = password
post = compat_urllib_parse.urlencode(fields)
post = urlencode_postdata(fields)
password_path = self._search_regex(
r'action="([^"]+)"', login_form, 'password URL')
password_url = compat_urlparse.urljoin(page_url, password_path)
@@ -465,7 +459,7 @@ class VimeoChannelIE(InfoExtractor):
if re.search(self._MORE_PAGES_INDICATOR, webpage, re.DOTALL) is None:
break
entries = [self.url_result('http://vimeo.com/%s' % video_id, 'Vimeo')
entries = [self.url_result('https://vimeo.com/%s' % video_id, 'Vimeo')
for video_id in video_ids]
return {'_type': 'playlist',
'id': list_id,
@@ -476,15 +470,15 @@ class VimeoChannelIE(InfoExtractor):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
channel_id = mobj.group('id')
return self._extract_videos(channel_id, 'http://vimeo.com/channels/%s' % channel_id)
return self._extract_videos(channel_id, 'https://vimeo.com/channels/%s' % channel_id)
class VimeoUserIE(VimeoChannelIE):
IE_NAME = 'vimeo:user'
_VALID_URL = r'https?://vimeo\.com/(?![0-9]+(?:$|[?#/]))(?P<name>[^/]+)(?:/videos|[#?]|$)'
_VALID_URL = r'https://vimeo\.com/(?![0-9]+(?:$|[?#/]))(?P<name>[^/]+)(?:/videos|[#?]|$)'
_TITLE_RE = r'<a[^>]+?class="user">([^<>]+?)</a>'
_TESTS = [{
'url': 'http://vimeo.com/nkistudio/videos',
'url': 'https://vimeo.com/nkistudio/videos',
'info_dict': {
'title': 'Nki',
'id': 'nkistudio',
@@ -495,15 +489,15 @@ class VimeoUserIE(VimeoChannelIE):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
name = mobj.group('name')
return self._extract_videos(name, 'http://vimeo.com/%s' % name)
return self._extract_videos(name, 'https://vimeo.com/%s' % name)
class VimeoAlbumIE(VimeoChannelIE):
IE_NAME = 'vimeo:album'
_VALID_URL = r'https?://vimeo\.com/album/(?P<id>\d+)'
_VALID_URL = r'https://vimeo\.com/album/(?P<id>\d+)'
_TITLE_RE = r'<header id="page_header">\n\s*<h1>(.*?)</h1>'
_TESTS = [{
'url': 'http://vimeo.com/album/2632481',
'url': 'https://vimeo.com/album/2632481',
'info_dict': {
'id': '2632481',
'title': 'Staff Favorites: November 2013',
@@ -527,14 +521,14 @@ class VimeoAlbumIE(VimeoChannelIE):
def _real_extract(self, url):
album_id = self._match_id(url)
return self._extract_videos(album_id, 'http://vimeo.com/album/%s' % album_id)
return self._extract_videos(album_id, 'https://vimeo.com/album/%s' % album_id)
class VimeoGroupsIE(VimeoAlbumIE):
IE_NAME = 'vimeo:group'
_VALID_URL = r'(?:https?://)?vimeo\.com/groups/(?P<name>[^/]+)'
_VALID_URL = r'https://vimeo\.com/groups/(?P<name>[^/]+)'
_TESTS = [{
'url': 'http://vimeo.com/groups/rolexawards',
'url': 'https://vimeo.com/groups/rolexawards',
'info_dict': {
'id': 'rolexawards',
'title': 'Rolex Awards for Enterprise',
@@ -548,13 +542,13 @@ class VimeoGroupsIE(VimeoAlbumIE):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
name = mobj.group('name')
return self._extract_videos(name, 'http://vimeo.com/groups/%s' % name)
return self._extract_videos(name, 'https://vimeo.com/groups/%s' % name)
class VimeoReviewIE(InfoExtractor):
IE_NAME = 'vimeo:review'
IE_DESC = 'Review pages on vimeo'
_VALID_URL = r'https?://vimeo\.com/[^/]+/review/(?P<id>[^/]+)'
_VALID_URL = r'https://vimeo\.com/[^/]+/review/(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://vimeo.com/user21297594/review/75524534/3c257a1b5d',
'md5': 'c507a72f780cacc12b2248bb4006d253',
@@ -566,7 +560,7 @@ class VimeoReviewIE(InfoExtractor):
}
}, {
'note': 'video player needs Referer',
'url': 'http://vimeo.com/user22258446/review/91613211/13f927e053',
'url': 'https://vimeo.com/user22258446/review/91613211/13f927e053',
'md5': '6295fdab8f4bf6a002d058b2c6dce276',
'info_dict': {
'id': '91613211',
@@ -588,11 +582,11 @@ class VimeoReviewIE(InfoExtractor):
class VimeoWatchLaterIE(VimeoBaseInfoExtractor, VimeoChannelIE):
IE_NAME = 'vimeo:watchlater'
IE_DESC = 'Vimeo watch later list, "vimeowatchlater" keyword (requires authentication)'
_VALID_URL = r'https?://vimeo\.com/home/watchlater|:vimeowatchlater'
_VALID_URL = r'https://vimeo\.com/home/watchlater|:vimeowatchlater'
_LOGIN_REQUIRED = True
_TITLE_RE = r'href="/home/watchlater".*?>(.*?)<'
_TESTS = [{
'url': 'http://vimeo.com/home/watchlater',
'url': 'https://vimeo.com/home/watchlater',
'only_matching': True,
}]
@@ -612,7 +606,7 @@ class VimeoWatchLaterIE(VimeoBaseInfoExtractor, VimeoChannelIE):
class VimeoLikesIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vimeo\.com/user(?P<id>[0-9]+)/likes/?(?:$|[?#]|sort:)'
_VALID_URL = r'https://(?:www\.)?vimeo\.com/user(?P<id>[0-9]+)/likes/?(?:$|[?#]|sort:)'
IE_NAME = 'vimeo:likes'
IE_DESC = 'Vimeo user likes'
_TEST = {
@@ -640,8 +634,8 @@ class VimeoLikesIE(InfoExtractor):
description = self._html_search_meta('description', webpage)
def _get_page(idx):
page_url = '%s//vimeo.com/user%s/likes/page:%d/sort:date' % (
self.http_scheme(), user_id, idx + 1)
page_url = 'https://vimeo.com/user%s/likes/page:%d/sort:date' % (
user_id, idx + 1)
webpage = self._download_webpage(
page_url, user_id,
note='Downloading page %d/%d' % (idx + 1, page_count))

@@ -33,14 +33,13 @@ class VineIE(InfoExtractor):
r'window\.POST_DATA = { %s: ({.+?}) }' % video_id, webpage, 'vine data'))
formats = [{
'url': data['videoLowURL'],
'ext': 'mp4',
'format_id': 'low',
}, {
'url': data['videoUrl'],
'ext': 'mp4',
'format_id': 'standard',
}]
'format_id': '%(format)s-%(rate)s' % f,
'vcodec': f['format'],
'quality': f['rate'],
'url': f['videoUrl'],
} for f in data['videoUrls'] if f.get('rate')]
self._sort_formats(formats)
return {
'id': video_id,
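Review note: the two hard-coded formats are replaced by a comprehension over `data['videoUrls']`. The sketch below runs that comprehension on invented data to show which entries survive the `f.get('rate')` filter. `build_formats` is a hypothetical wrapper, and the sample entries are not taken from a real Vine response.

```python
def build_formats(data):
    """Turn the entries of data['videoUrls'] into format dicts."""
    return [{
        'format_id': '%(format)s-%(rate)s' % f,
        'vcodec': f['format'],
        'quality': f['rate'],
        'url': f['videoUrl'],
    } for f in data['videoUrls'] if f.get('rate')]


# Invented sample data.
data = {'videoUrls': [
    {'format': 'h264', 'rate': 200, 'videoUrl': 'http://example.com/low.mp4'},
    {'format': 'h264', 'rate': 0, 'videoUrl': 'http://example.com/zero.mp4'},
    {'format': 'webm', 'videoUrl': 'http://example.com/norate.webm'},
]}
print(build_formats(data))
```

Entries with a missing or zero `rate` are dropped, since the rate doubles as the `quality` value used for sorting.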

@@ -31,7 +31,7 @@ class VKIE(InfoExtractor):
'id': '162222515',
'ext': 'flv',
'title': 'ProtivoGunz - Хуёвая песня',
'uploader': 're:Noize MC.*',
'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
'duration': 195,
'upload_date': '20120212',
},
@@ -140,7 +140,7 @@ class VKIE(InfoExtractor):
if not video_id:
video_id = '%s_%s' % (mobj.group('oid'), mobj.group('id'))
info_url = 'http://vk.com/al_video.php?act=show&al=1&video=%s' % video_id
info_url = 'http://vk.com/al_video.php?act=show&al=1&module=video&video=%s' % video_id
info_page = self._download_webpage(info_url, video_id)
ERRORS = {
@@ -152,7 +152,10 @@ class VKIE(InfoExtractor):
'use --username and --password options to provide account credentials.',
r'<!>Unknown error':
'Video %s does not exist.'
'Video %s does not exist.',
r'<!>Видео временно недоступно':
'Video %s is temporarily unavailable.',
}
for error_re, error_msg in ERRORS.items():
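Review note: the `ERRORS` table maps a marker in the response page to a user-facing message. The sketch below shows the lookup loop in isolation, using only the two entries visible in this hunk. `find_error` is a hypothetical helper that returns the message, whereas the extractor is expected to raise an error with it; the sample page text is invented.

```python
import re

# Only the two entries visible in the hunk above.
ERRORS = {
    r'<!>Unknown error': 'Video %s does not exist.',
    r'<!>Видео временно недоступно': 'Video %s is temporarily unavailable.',
}


def find_error(info_page, video_id):
    """Return the formatted message of the first matching error, else None."""
    for error_re, error_msg in ERRORS.items():
        if re.search(error_re, info_page):
            return error_msg % video_id
    return None


print(find_error('0<!>Видео временно недоступно<!>', '1_2'))
```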

@@ -28,6 +28,7 @@ class WDRIE(InfoExtractor):
'title': 'Servicezeit',
'description': 'md5:c8f43e5e815eeb54d0b96df2fba906cb',
'upload_date': '20140310',
'is_live': False
},
'params': {
'skip_download': True,
@@ -41,6 +42,7 @@ class WDRIE(InfoExtractor):
'title': 'Marga Spiegel ist tot',
'description': 'md5:2309992a6716c347891c045be50992e4',
'upload_date': '20140311',
'is_live': False
},
'params': {
'skip_download': True,
@@ -55,6 +57,7 @@ class WDRIE(InfoExtractor):
'title': 'Erlebte Geschichten: Marga Spiegel (29.11.2009)',
'description': 'md5:2309992a6716c347891c045be50992e4',
'upload_date': '20091129',
'is_live': False
},
},
{
@@ -66,6 +69,7 @@ class WDRIE(InfoExtractor):
'title': 'Flavia Coelho: Amar é Amar',
'description': 'md5:7b29e97e10dfb6e265238b32fa35b23a',
'upload_date': '20140717',
'is_live': False
},
},
{
@@ -74,6 +78,20 @@ class WDRIE(InfoExtractor):
'info_dict': {
'id': 'mediathek/video/sendungen/quarks_und_co/filterseite-quarks-und-co100',
}
},
{
'url': 'http://www1.wdr.de/mediathek/video/livestream/index.html',
'info_dict': {
'id': 'mdb-103364',
'title': 're:^WDR Fernsehen [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:ae2ff888510623bf8d4b115f95a9b7c9',
'ext': 'flv',
'upload_date': '20150212',
'is_live': True
},
'params': {
'skip_download': True,
},
}
]
@@ -119,6 +137,10 @@ class WDRIE(InfoExtractor):
video_url = flashvars['dslSrc'][0]
title = flashvars['trackerClipTitle'][0]
thumbnail = flashvars['startPicture'][0] if 'startPicture' in flashvars else None
is_live = flashvars.get('isLive', ['0'])[0] == '1'
if is_live:
title = self._live_title(title)
if 'trackerClipAirTime' in flashvars:
upload_date = flashvars['trackerClipAirTime'][0]
@@ -131,6 +153,13 @@ class WDRIE(InfoExtractor):
if video_url.endswith('.f4m'):
video_url += '?hdcore=3.2.0&plugin=aasp-3.2.0.77.18'
ext = 'flv'
elif video_url.endswith('.smil'):
fmt = self._extract_smil_formats(video_url, page_id)[0]
video_url = fmt['url']
sep = '&' if '?' in video_url else '?'
video_url += sep
video_url += 'hdcore=3.3.0&plugin=aasp-3.3.0.99.43'
ext = fmt['ext']
else:
ext = determine_ext(video_url)
@@ -144,6 +173,7 @@ class WDRIE(InfoExtractor):
'description': description,
'thumbnail': thumbnail,
'upload_date': upload_date,
'is_live': is_live
}
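Review note: the `.smil` branch appends the `hdcore` parameters with `&` or `?` depending on whether the URL already has a query. A minimal sketch of that step follows; `append_query` is a hypothetical helper and the URLs are invented.

```python
def append_query(video_url, query):
    """Append `query` using '&' if the URL already has a query, else '?'."""
    sep = '&' if '?' in video_url else '?'
    return video_url + sep + query


HDCORE = 'hdcore=3.3.0&plugin=aasp-3.3.0.99.43'

print(append_query('http://example.com/live.f4m', HDCORE))
print(append_query('http://example.com/live.f4m?token=abc', HDCORE))
```

The older `.f4m` branch above always appends `?hdcore=...`, which would produce a second `?` if that URL ever carried its own query.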

@@ -8,6 +8,7 @@ from ..compat import compat_urlparse
from ..utils import (
float_or_none,
month_by_abbreviation,
ExtractorError,
)
@@ -28,23 +29,45 @@ class YamIE(InfoExtractor):
}
}, {
# An external video hosted on YouTube
'url': 'http://mymedia.yam.com/m/3598173',
'md5': '0238ceec479c654e8c2f1223755bf3e9',
'url': 'http://mymedia.yam.com/m/3599430',
'md5': '03127cf10d8f35d120a9e8e52e3b17c6',
'info_dict': {
'id': 'pJ2Deys283c',
'id': 'CNpEoQlrIgA',
'ext': 'mp4',
'upload_date': '20150202',
'upload_date': '20150306',
'uploader': '新莊社大瑜伽社',
'description': 'md5:f5cc72f0baf259a70fb731654b0d2eff',
'description': 'md5:11e2e405311633ace874f2e6226c8b17',
'uploader_id': '2323agoy',
'title': '外婆的澎湖灣KTV-潘安邦',
}
'title': '20090412陽明山二子坪-1',
},
'skip': 'Video does not exist',
}, {
'url': 'http://mymedia.yam.com/m/3598173',
'info_dict': {
'id': '3598173',
'ext': 'mp4',
},
'skip': 'cause Yam system error',
}, {
'url': 'http://mymedia.yam.com/m/3599437',
'info_dict': {
'id': '3599437',
'ext': 'mp4',
},
'skip': 'invalid YouTube URL',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
page = self._download_webpage(url, video_id)
# Check for errors
system_msg = self._html_search_regex(
r'系統訊息(?:<br>|\n|\r)*([^<>]+)<br>', page, 'system message',
default=None)
if system_msg:
raise ExtractorError(system_msg, expected=True)
# Is it hosted externally on YouTube?
youtube_url = self._html_search_regex(
r'<embed src="(http://www.youtube.com/[^"]+)"',

@@ -0,0 +1,127 @@
# coding=utf-8
from __future__ import unicode_literals
import re
import hashlib
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
float_or_none,
)
class YandexMusicBaseIE(InfoExtractor):
def _get_track_url(self, storage_dir, track_id):
data = self._download_json(
'http://music.yandex.ru/api/v1.5/handlers/api-jsonp.jsx?action=getTrackSrc&p=download-info/%s'
% storage_dir,
track_id, 'Downloading track location JSON')
key = hashlib.md5(('XGRlBW9FXlekgbPrRHuSiA' + data['path'][1:] + data['s']).encode('utf-8')).hexdigest()
storage = storage_dir.split('.')
return ('http://%s/get-mp3/%s/%s?track-id=%s&from=service-10-track&similarities-experiment=default'
% (data['host'], key, data['ts'] + data['path'], storage[1]))
def _get_track_info(self, track):
return {
'id': track['id'],
'ext': 'mp3',
'url': self._get_track_url(track['storageDir'], track['id']),
'title': '%s - %s' % (track['artists'][0]['name'], track['title']),
'filesize': int_or_none(track.get('fileSize')),
'duration': float_or_none(track.get('durationMs'), 1000),
}
class YandexMusicTrackIE(YandexMusicBaseIE):
IE_NAME = 'yandexmusic:track'
IE_DESC = 'Яндекс.Музыка - Трек'
_VALID_URL = r'https?://music\.yandex\.(?:ru|kz|ua|by)/album/(?P<album_id>\d+)/track/(?P<id>\d+)'
_TEST = {
'url': 'http://music.yandex.ru/album/540508/track/4878838',
'md5': 'f496818aa2f60b6c0062980d2e00dc20',
'info_dict': {
'id': '4878838',
'ext': 'mp3',
'title': 'Carlo Ambrosio - Gypsy Eyes 1',
'filesize': 4628061,
'duration': 193.04,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
album_id, track_id = mobj.group('album_id'), mobj.group('id')
track = self._download_json(
'http://music.yandex.ru/handlers/track.jsx?track=%s:%s' % (track_id, album_id),
track_id, 'Downloading track JSON')['track']
return self._get_track_info(track)
class YandexMusicAlbumIE(YandexMusicBaseIE):
IE_NAME = 'yandexmusic:album'
IE_DESC = 'Яндекс.Музыка - Альбом'
_VALID_URL = r'https?://music\.yandex\.(?:ru|kz|ua|by)/album/(?P<id>\d+)/?(\?|$)'
_TEST = {
'url': 'http://music.yandex.ru/album/540508',
'info_dict': {
'id': '540508',
'title': 'Carlo Ambrosio - Gypsy Soul (2009)',
},
'playlist_count': 50,
}
def _real_extract(self, url):
album_id = self._match_id(url)
album = self._download_json(
'http://music.yandex.ru/handlers/album.jsx?album=%s' % album_id,
album_id, 'Downloading album JSON')
entries = [self._get_track_info(track) for track in album['volumes'][0]]
title = '%s - %s' % (album['artists'][0]['name'], album['title'])
year = album.get('year')
if year:
title += ' (%s)' % year
return self.playlist_result(entries, compat_str(album['id']), title)
class YandexMusicPlaylistIE(YandexMusicBaseIE):
IE_NAME = 'yandexmusic:playlist'
IE_DESC = 'Яндекс.Музыка - Плейлист'
_VALID_URL = r'https?://music\.yandex\.(?:ru|kz|ua|by)/users/[^/]+/playlists/(?P<id>\d+)'
_TEST = {
'url': 'http://music.yandex.ru/users/music.partners/playlists/1245',
'info_dict': {
'id': '1245',
'title': 'Что слушают Enter Shikari',
'description': 'md5:3b9f27b0efbe53f2ee1e844d07155cc9',
},
'playlist_count': 6,
}
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
playlist = self._parse_json(
self._search_regex(
r'var\s+Mu\s*=\s*({.+?});\s*</script>', webpage, 'player'),
playlist_id)['pageData']['playlist']
entries = [self._get_track_info(track) for track in playlist['tracks']]
return self.playlist_result(
entries, compat_str(playlist_id),
playlist['title'], playlist.get('description'))
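Review note: `_get_track_url` derives the download URL from an MD5 over a fixed salt, the path without its leading character, and the `s` field. The sketch below separates that computation from the network call. `track_key` and `track_url` are hypothetical helpers, and every value in the sample response and the `storage_dir` string is invented, including its dotted layout.

```python
import hashlib

SALT = 'XGRlBW9FXlekgbPrRHuSiA'  # constant taken from the extractor above


def track_key(data):
    """MD5 hex digest over salt + path (minus first char) + s."""
    raw = SALT + data['path'][1:] + data['s']
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


def track_url(data, storage_dir):
    """Assemble the MP3 URL the same way _get_track_url does."""
    storage = storage_dir.split('.')
    return ('http://%s/get-mp3/%s/%s?track-id=%s'
            '&from=service-10-track&similarities-experiment=default'
            % (data['host'], track_key(data),
               data['ts'] + data['path'], storage[1]))


# Invented sample response and storage dir.
data = {'host': 'storage.example.net', 'path': '/rmusic/abc/1',
        's': 'deadbeef', 'ts': '0005a1b2'}
print(track_url(data, '51.4878838.rest'))
```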

@@ -47,7 +47,8 @@ class YouPornIE(InfoExtractor):
# Get JSON parameters
json_params = self._search_regex(
r'var currentVideo = new Video\((.*)\)[,;]',
[r'var\s+videoJa?son\s*=\s*({.+?});',
r'var\s+currentVideo\s*=\s*new\s+Video\((.+?)\)[,;]'],
webpage, 'JSON parameters')
try:
params = json.loads(json_params)
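Review note: `_search_regex` is now given a list of patterns, so the newer `videoJson` variable is tried before the older `new Video(...)` call. The sketch below mimics that "first pattern that matches wins" behaviour with plain `re`. `search_first` is a hypothetical helper and the page snippets are invented.

```python
import json
import re

PATTERNS = [
    r'var\s+videoJa?son\s*=\s*({.+?});',
    r'var\s+currentVideo\s*=\s*new\s+Video\((.+?)\)[,;]',
]


def search_first(patterns, text):
    """Return group 1 of the first pattern that matches, else None."""
    for pattern in patterns:
        mobj = re.search(pattern, text)
        if mobj:
            return mobj.group(1)
    return None


# Invented page snippets covering both layouts.
new_page = 'var videoJson = {"a": 1};'
old_page = 'var currentVideo = new Video({"b": 2});'
print(json.loads(search_first(PATTERNS, new_page)))
print(json.loads(search_first(PATTERNS, old_page)))
```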

Some files were not shown because too many files have changed in this diff.