Compare commits

...

903 Commits

Author SHA1 Message Date
Philipp Hagemeister
4b9f9010b0 release 2015.06.15 2015-06-15 01:35:50 +02:00
Sergey M․
67d95f177c [niconico] Simplify format info 2015-06-15 03:43:33 +06:00
Sergey M.
44773ad125 Merge pull request #5975 from chaoskagami/niconico_qualitynote
Quality note for niconico
2015-06-15 02:38:33 +05:00
Sergey M․
5774ef35c4 [options] Add missing whitespace for --fixup description 2015-06-15 02:57:07 +06:00
Sergey M․
b95cfa9170 [liveleak] Clarify test 2015-06-15 02:54:49 +06:00
Sergey M․
afa1ded425 [liveleak] Clarify rationale for restoring raw video 2015-06-15 02:54:05 +06:00
Sergey M․
00ac23e6e0 [liveleak] Improve regex for restoring original video URL 2015-06-15 02:51:21 +06:00
Sergey M.
7d0c934a3e Merge pull request #5983 from jomo/master
LiveLeak: support more original videos
2015-06-15 01:49:41 +05:00
jomo
8f75761f24 LiveLak: add test for URLs with 'h264_270p' 2015-06-14 22:41:44 +02:00
jomo
9fd24e3a22 LiveLeak: support more original videos
some (old?) videos use ...mp4.h264_270p.mp4... instead of ...mp4.h264_base.mp4...
This is an addition to #4768
2015-06-14 21:50:03 +02:00
Naglis Jonaitis
755a9d3d1a [tvplay] Add support for NovaTv 2015-06-14 20:59:22 +03:00
Sergey M.
ac499cb61c Merge pull request #5976 from SpEcHiDe/patch-1
spelling mistake corrected
2015-06-14 18:26:53 +05:00
Shrimadhav U K
180940e02d spelling mistake corrected
acces changed to accessing
2015-06-14 11:19:42 +05:30
chaoskagami
976b03c56b Quality note for niconico - at least notify whether you'll get low or src 2015-06-14 00:18:40 -04:00
Sergey M․
450d89ddc1 [dramafever] Improve _VALID_URL 2015-06-14 09:58:26 +06:00
Sergey M․
463b2e5542 [dramafever:series] Rollback _PAGE_SIZE to max possible 2015-06-14 09:51:07 +06:00
Sergey M․
70a2002399 [dramafever:series] Fix _VALID_URL (Closes #5973) 2015-06-14 09:50:23 +06:00
Sergey M․
a617b10075 Merge branch 'ping-dramafever' 2015-06-14 07:43:50 +06:00
Sergey M․
0029071adb [dramefever] Improve and simplify 2015-06-14 07:43:14 +06:00
Sergey M․
ad49fe7c8f Merge branch 'dramafever' of https://github.com/ping/youtube-dl into ping-dramafever 2015-06-14 04:56:54 +06:00
Sergey M․
49bc802f81 Merge branch 'atomicdryad-brightcove_custombc_extractor' 2015-06-13 19:54:02 +06:00
Sergey M․
af9cdee9cb [brightcove] Improve and generalize brightcove URL extraction from JS 2015-06-13 19:53:32 +06:00
fnord
b4e1576aee Brightcove extractor: support customBC.createVideo(...); method
found in http://www.americanbar.org/groups/family_law.html and
http://america.aljazeera.com/watch/shows/america-tonight/2015/6/exclusive-hunting-isil-with-the-pkk.html
2015-06-13 06:20:30 -05:00
Sergey M․
78e2b74bb9 [tumblr] Add support for pornhub embeds (Closes #5963) 2015-06-13 03:39:14 +06:00
Sergey M․
65d161c480 [extractor/generic] Add support for pornhub embeds 2015-06-13 03:36:16 +06:00
Sergey M․
9fcbd5db2a [pornhub] Add support for embeds 2015-06-13 03:24:36 +06:00
Sergey M․
4f3bf679f5 [vk] Fix authentication for non-ASCII login/password 2015-06-13 03:09:35 +06:00
Sergey M․
8b6c896c4b [prosiebensat1] Add title regex 2015-06-12 21:18:13 +06:00
Sergey M․
185dbc4974 [prosiebensat1] Fix rtmp extraction (Closes #5962) 2015-06-12 21:13:14 +06:00
Sergey M․
3d535e0471 [tvc] Fix embed regex 2015-06-12 19:31:52 +06:00
Sergey M․
9872d3110c [extractor/generic] Add support for tvigle embeds 2015-06-12 18:37:09 +06:00
Sergey M․
b859971873 [extractor/generic] Rename tvc embed url variable 2015-06-12 18:15:30 +06:00
Sergey M․
e5095f1198 Merge branch 'hlintala-5tv' 2015-06-12 17:49:07 +06:00
Sergey M․
499a077761 [5tv] Improve 2015-06-12 17:48:42 +06:00
Sergey M․
5da7177729 Merge branch '5tv' of https://github.com/hlintala/youtube-dl into hlintala-5tv 2015-06-12 16:34:28 +06:00
Sergey M․
3507766bd0 Merge branch 'hlintala-tvc' 2015-06-12 16:29:10 +06:00
Sergey M․
f37bdbe537 [extractor/generic] Add test for tvc embed 2015-06-12 16:28:45 +06:00
Sergey M․
2da09ff8b0 [extractor/generic] Fix tvc ie_key 2015-06-12 16:26:31 +06:00
Sergey M․
5ccddb7ecf [tvc] Fix ie_key 2015-06-12 16:25:26 +06:00
Sergey M․
954c1d0529 [tvc] Refactor extractor names 2015-06-12 16:24:13 +06:00
Sergey M․
494f20cbdc [extractor/generic] Add support for tvc embeds 2015-06-12 16:22:46 +06:00
Sergey M․
29902c8ec0 [tvc:embed] Add embed extraction routine 2015-06-12 16:22:23 +06:00
Sergey M․
9f15bdabc8 [tvc] Separate embed extractor 2015-06-12 16:13:36 +06:00
Sergey M․
fff3455f58 Merge branch 'tvc' of https://github.com/hlintala/youtube-dl into hlintala-tvc 2015-06-12 15:12:54 +06:00
Hannu Lintala
87446dc618 [tvc] Add extractor (Closes #5795) 2015-06-12 01:34:10 +03:00
Hannu Lintala
99ac0390f5 [fivetv] Add extractor (Closes #5794) 2015-06-12 01:03:14 +03:00
Sergey M․
ff0f0b9172 [tube8] Fix extraction (Closes #5952) 2015-06-11 22:18:08 +06:00
Sergey M․
97b570a94c [generic] Improve rtl.nl embeds detection (Closes #5950) 2015-06-11 19:04:12 +06:00
Sergey M․
a9d56c6843 [rtlnl] Improve _VALID_URL (#5950) 2015-06-11 19:03:22 +06:00
Sergey M․
f98470df69 [bilibili] Fix FutureWarning 2015-06-10 23:01:12 +06:00
Jaime Marquínez Ferrándiz
eb8be1fe76 [rtbf] Extract all formats (closes #5947) 2015-06-10 14:12:43 +02:00
Yen Chi Hsuan
7ebd5376fe [nfl] Relax _VALID_URL (fixes #5940) 2015-06-10 14:17:03 +08:00
Jaime Marquínez Ferrándiz
70219b0f43 [youtube:playlist] Use an iterator for the entries (closes #5935)
So that '--playlist-end' downloads only the required pages.
2015-06-09 23:49:11 +02:00
Sergey M․
bd5bc0cd5a [theplatform] Check for /select/media URLs first (#5746) 2015-06-09 23:12:13 +06:00
Sergey M․
6e054aacca [theplatform] Take care of /select/media URLs (Closes #5746) 2015-06-09 23:07:22 +06:00
Sergey M․
9d581f3d52 [cbs] Extract display_id 2015-06-09 21:39:45 +06:00
Sergey M․
9bf99891d0 [cbs] Add support for colbertlateshow (Closes #5888) 2015-06-09 21:23:53 +06:00
Sergey M․
d9cf48e81e [spiegeltv] Extract all formats and prefer hls (Closes #5843) 2015-06-09 20:36:08 +06:00
Yen Chi Hsuan
e1b9322b09 [youtube] Restricter DASH signature pattern
A problematic DASH url is:
https://manifest.googlevideo.com/api/manifest/dash/mm/35/key/yt5/ip/140.112.247.145/ms/pm/mv/s/mt/1433794435/id/o-AD2Od_dsOlAUYPu03ZsVWKSbGEbCJJrMp9vnXGhnyRhd/mn/sn-aigllm7r/sparams/as%2Chfr%2Cid%2Cip%2Cipbits%2Citag%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cplayback_host%2Crequiressl%2Csource%2Cexpire/fexp/9406009%2C9406821%2C9407575%2C9408142%2C9408420%2C9408710%2C9409121%2C9409208%2C9412514%2C9412780%2C9413208%2C9413426%2C9413476%2C9413503%2C9415304%2C9415753/upn/viDQrs8SnmE/as/fmp4_audio_clear%2Cwebm_audio_clear%2Cfmp4_sd_hd_clear%2Cwebm_sd_hd_clear%2Cwebm2_sd_hd_clear/playback_host/r4---sn-aigllm7r.googlevideo.com/ipbits/0/requiressl/yes/pl/20/itag/0/source/youtube/expire/1433824806/nh/EAQ/signature/81ABE6391E351BA495F5B041B00FF1257A353318.1A6E48ABB74E8F4AE73CA2CB1F963FC34E33DEE7/sver/3/hfr/1
2015-06-09 14:48:18 +08:00
Yen Chi Hsuan
627b964825 [kickstarted] Extract thumbnails in embedded videos (#5929) 2015-06-09 11:54:13 +08:00
Sergey M․
a55e36f48d [YoutubeDL] Handle out-of-range timestamps (#5826) 2015-06-08 21:05:17 +06:00
Yen Chi Hsuan
01e21b89ee [noco] Skip invalid timestamps (closes #5826) 2015-06-08 17:39:55 +08:00
Yen Chi Hsuan
788be3313d [cnet] Fix theplatform vid extraction (fixes #5924) 2015-06-08 13:34:23 +08:00
Yen Chi Hsuan
e1ec93304d [instagram:user] Truncate title to 80 characters (#5919)
This is a workaround. Currently YoutubeDL.process_info() truncates
info_dict['title'] to 200 characters, but the implementation can't
handle wide characters.
2015-06-08 01:46:33 +08:00
Yen Chi Hsuan
edb99d4c18 [instagram] Handling null values (fixes #5919)
I didn't add the test case here because it takes too much time. (7
minutes on my machine)
2015-06-08 01:17:21 +08:00
Yen Chi Hsuan
68477c3dab [tlc] Fix test failure due to DiscoveryIE changes 2015-06-07 16:38:39 +08:00
Yen Chi Hsuan
65ba8b23f4 [discovery] Rewrite DiscoveryIE (fixes #5898)
Discovery.com now uses a completely different approach for serving
videos. At least in both test cases brightcove are involved. However,
AMF support is necessary for these brightcove videos. As a result, I
try to extract videos from the info page ('?flat=1'). The downloaded
file can be different from the one in browsers.
2015-06-07 16:34:19 +08:00
Yen Chi Hsuan
621ed9f5f4 [common] Add note and errnote field for _extract_m3u8_formats 2015-06-07 16:33:22 +08:00
Yen Chi Hsuan
b26733ba7f [brightcove] Allow single quotes in Brightcove URLs (fixes #5901) 2015-06-07 15:29:42 +08:00
Sergey M․
9836cfb8d6 [options] Clarify --list-extractors (Closes #5916) 2015-06-07 08:12:21 +06:00
Sergey M․
665b6c1236 Merge branch 'hlintala-ruutu' 2015-06-07 05:38:29 +06:00
Sergey M․
9414338a48 [ruutu] Improve, make more robust and fix python 2.6 support 2015-06-07 05:37:29 +06:00
Jaime Marquínez Ferrándiz
de390ea077 update: Use https for getting the version info (fixes #5909) 2015-06-07 00:21:30 +02:00
Sergey M․
717b0239fd Merge branch 'ruutu' of https://github.com/hlintala/youtube-dl into hlintala-ruutu 2015-06-07 04:01:28 +06:00
Hannu Lintala
d00735a0c5 [ruutu] Don't use fallback for DASH and other non-HTTP urls 2015-06-06 23:01:23 +03:00
Yen Chi Hsuan
c23d5ce926 Merge branch 'PeterDing-iqiyi' 2015-06-07 02:59:27 +08:00
Yen Chi Hsuan
b5a3c7f109 [iqiyi] Cache encryption keys 2015-06-07 02:47:36 +08:00
Yen Chi Hsuan
9c5f685ef1 [iqiyi] Improve regex pattern again 2015-06-07 02:39:03 +08:00
Yen Chi Hsuan
08bb8ef201 [iqiyi] Unify get_format() and get_bid() 2015-06-07 02:25:00 +08:00
Yen Chi Hsuan
865ab62f43 [iqiyi] Make _VALID_URL more accurate
v_* urls are individual videos, while a_* urls are playlists, which are
not supported yet.
2015-06-07 02:13:22 +08:00
Yen Chi Hsuan
9948113590 [iqiyi] Add a multipart test case 2015-06-07 02:09:33 +08:00
Yen Chi Hsuan
c4ee87022b [iqiyi] Change id for multipart videos 2015-06-07 01:57:05 +08:00
Yen Chi Hsuan
ffba4edb06 [iqiyi] Improve some variable names and add download notes 2015-06-07 01:52:51 +08:00
Yen Chi Hsuan
958d0b659b [iqiyi] Reorder imports 2015-06-07 01:35:09 +08:00
Yen Chi Hsuan
aacda28b28 [iqiyi] Give error message for assertion failures 2015-06-07 01:32:03 +08:00
Yen Chi Hsuan
29e7e0781b [iqiyi] Simplify and improve regex patterns
See the comments in #5849
2015-06-07 00:56:08 +08:00
Yen Chi Hsuan
7012620e2b [iqiyi] Remove format selection codes 2015-06-07 00:44:54 +08:00
Yen Chi Hsuan
f1da861018 [iqiyi] PEP8 2015-06-07 00:37:29 +08:00
Naglis Jonaitis
05aa9c82d9 [sunporno] Fix view_count extraction 2015-06-06 13:58:52 +03:00
Naglis Jonaitis
a9e58ecd3f [turbo] Improve description extraction
`og:description` is empty for some videos.
2015-06-06 13:58:51 +03:00
Hannu Lintala
223544552f [Ruutu] Add new extractor 2015-06-06 04:29:03 +03:00
Sergey M․
3d8e9573a4 [youtube:channel] Improve channel id extraction (#5904) 2015-06-06 06:25:37 +06:00
Naglis Jonaitis
54eb81a087 [pornovoisines] Improve average_rating extraction and update test case 2015-06-06 03:11:43 +03:00
Naglis Jonaitis
c33c547d66 [izlesene] Avoid timestamp differences in tests due to DST 2015-06-06 02:57:21 +03:00
Naglis Jonaitis
dfe7dd9bdb [izlesene] Unquote video URLs and simplify 2015-06-06 02:57:21 +03:00
Yen Chi Hsuan
63ccf6474d Merge branch 'ping-qqmusic-more-formats' 2015-06-05 23:19:54 +08:00
Yen Chi Hsuan
e8ac61e840 [qqmusic] Use meaningful variable names 2015-06-05 23:19:25 +08:00
Yen Chi Hsuan
f00a650705 [qqmusic] Rearrange codes 2015-06-05 23:16:34 +08:00
Yen Chi Hsuan
4bde5ce992 Merge branch 'qqmusic-more-formats' of https://github.com/ping/youtube-dl into ping-qqmusic-more-formats 2015-06-05 23:14:44 +08:00
Yen Chi Hsuan
d31573fa37 [teamcoco] Handle incomplete m3u8 URLs (fixes #5798)
There are 2 TODOs. I don't know how to handle these cases correctly.
2015-06-05 22:59:04 +08:00
ping
8b8cde2140 [qqmusic] Set abr for mp3 formats 2015-06-05 06:04:26 +08:00
Philipp Hagemeister
0e805e782b release 2015.06.04.1 2015-06-04 21:54:33 +02:00
Philipp Hagemeister
f5c78d118b release 2015.06.04 2015-06-04 21:49:02 +02:00
Yen Chi Hsuan
9d4f213f90 [qqmusic:toplist] List name and description are optional 2015-06-05 00:52:18 +08:00
Yen Chi Hsuan
168db222c6 Merge pull request #5891 from ping/qqmusic-toplist-fix
[qqmusic] Fix toplist extraction
2015-06-05 00:50:59 +08:00
Sergey M․
3d6388e34e [tnaflix] Fix relative URLs (empflix) 2015-06-04 20:42:37 +06:00
Sergey M․
3ce9bc712a [empflix] Fix typo 2015-06-04 20:39:03 +06:00
Sergey M․
e52c0bd0eb [tnaflix] Modernize 2015-06-04 20:37:05 +06:00
Sergey M․
56c837ccb7 [tnaflix] Fix typo 2015-06-04 20:34:48 +06:00
ping
55e5841f14 [qqmusic] Extract additional formats (mp3-128, mp3-320) 2015-06-04 17:41:29 +08:00
ping
ed15e9ba02 [qqmusic] Remove unused import 2015-06-04 17:32:06 +08:00
ping
eedda32e6b [qqmusic] Fix toplist 2015-06-04 11:27:18 +08:00
Jaime Marquínez Ferrándiz
4c8fea92f3 [test/aes] Fix on python 3.3 and higher
Since 878563c847 the aes functions only accepts the base64 data as a unicode string.
2015-06-03 23:50:38 +02:00
Sergey M.
d073055dcd Merge pull request #5876 from slava-sh/nova
[nova] Update
2015-06-03 23:18:01 +05:00
Slava Shklyaev
e4ac7bb1e5 [nova] Revert "Fix extension extraction bug"
This reverts commit 9464a194db.
2015-06-03 19:25:30 +03:00
Yen Chi Hsuan
9bac8c57e3 Merge branch 'iqiyi' of https://github.com/PeterDing/youtube-dl into PeterDing-iqiyi 2015-06-03 23:59:52 +08:00
Sergey M․
3153a2c98d [tvigle] Skip tests 2015-06-03 20:53:54 +06:00
Sergey M․
15b74b94be [tvigle] Capture error message 2015-06-03 20:52:47 +06:00
Sergey M․
687cb3ad35 [24video] Fix uploader extraction 2015-06-03 20:47:11 +06:00
Yen Chi Hsuan
8f94784124 [tumblr] Detect vid.me embeds (fixes #5883) 2015-06-03 10:26:39 +08:00
Yen Chi Hsuan
23dd1fc74c [vidme] Always use the non-embedded page
For example, https://vid.me/Wmur contains more information than
https://vid.me/e/Wmur
2015-06-03 10:24:02 +08:00
Slava Shklyaev
fa971259e6 [nova] Add a comment about html in description 2015-06-02 19:09:47 +03:00
Slava Shklyaev
b0cda32f72 [nova] Fix Python 2.6 compatability issue 2015-06-02 18:30:25 +03:00
Slava Shklyaev
08b7968e28 [nova] Fix display_id extraction bug 2015-06-02 18:24:19 +03:00
Slava Shklyaev
4b5fe1349f [nova] Comply with review 2015-06-02 18:23:42 +03:00
Sergey M․
d23da75b32 [iprima] Fix description extraction
`og:description` does not contain actual description anymore.
2015-06-02 21:10:18 +06:00
Sergey M.
06e027992d Merge pull request #5877 from slava-sh/iprima
[iprima] Update
2015-06-02 20:04:04 +05:00
Slava Shklyaev
b5597738d4 [iprima] Comply with review 2015-06-02 17:42:53 +03:00
Slava Shklyaev
bc03e58565 [iprima] Update 2015-06-02 13:19:02 +03:00
Slava Shklyaev
a00234f1c5 [nova] Minor style improvement 2015-06-02 12:57:03 +03:00
Slava Shklyaev
34c0f95db2 [nova] Remove html tags from description 2015-06-02 12:56:36 +03:00
Slava Shklyaev
fcb04bcaca [nova] Extract upload_date in some cases 2015-06-02 12:55:41 +03:00
Slava Shklyaev
9464a194db [nova] Fix extension extraction bug
Replace the hardcoded flv with determine_ext. Let rtmpdump parse the url.
2015-06-02 12:54:20 +03:00
Slava Shklyaev
9f4b9118cc [nova] Fix display_id extraction bug
Make id group non-greedy so that .html is not included in it.
2015-06-02 12:49:01 +03:00
Sergey M․
60158217ef [nova] Add tv test 2015-06-02 00:57:08 +06:00
Sergey M․
923e79e2e4 [nova] Add extractor 2015-06-02 00:53:04 +06:00
Naglis Jonaitis
866b296d0f [aftonbladet] Fix extraction and update _VALID_URL (Fixes #5863) 2015-06-01 16:12:11 +03:00
Yen Chi Hsuan
4053ee9104 Credit @PeterDing for 91porn extractor (#5830) 2015-06-01 14:44:10 +08:00
Sergey M․
47fd8c2f76 [patreon] Fix embeds extraction (Closes #5862) 2015-06-01 00:04:36 +06:00
Sergey M․
96b9690985 [imgur] Improve extraction 2015-05-31 04:05:26 +06:00
Sergey M․
df15ef8dab [YoutubeDL] Tweak select_format for video only media 2015-05-31 04:05:09 +06:00
Sergey M.
002c0fb511 Merge pull request #5852 from ivan/make-playlist-url
[youtube] Construct a playlist URL in case the page is missing one
2015-05-31 02:19:00 +05:00
Sergey M․
7584e38ce4 [tvigle] Modernize 2015-05-31 03:01:41 +06:00
Sergey M․
eb47569f8a [tvigle] Add support for m3u8 2015-05-31 03:00:13 +06:00
Ivan Kozik
d2a9de78df [youtube] Construct a playlist URL in case the page is missing one
This fixes jumping from user/channel -> playlist for some users like
https://www.youtube.com/user/BitcoinFoundation

This also removes the superfluous log message
"add --no-playlist to just download video VIDEOID"
when downloading a user/channel.
2015-05-30 20:54:03 +00:00
Sergey M․
c5138a7ce4 [extractor/generic] Clarify test comment 2015-05-31 02:36:20 +06:00
Sergey M․
c5fa81fe81 [extractor/generic] Put all direct link tests near to each other for better navigation 2015-05-31 02:22:29 +06:00
Sergey M․
a074e92296 [extractor/generic] Add test for large compressed media 2015-05-31 02:13:24 +06:00
Sergey M․
1ddb9456c4 [extractor/generic] Use compat_urllib_parse_unquote for unquoting video_id and title from URL 2015-05-31 01:23:58 +06:00
Sergey M․
58bde34a23 [extractor/generic] Force Accept-Encoding to any for extraction pass 2015-05-31 00:44:54 +06:00
Sergey M․
339516072b [extractor/generic] Unescape video_id and title extracted from URL 2015-05-30 23:16:14 +06:00
Sergey M․
931bc3c3a7 [YoutubeDL] Do not loose request method information 2015-05-30 22:52:02 +06:00
Yen Chi Hsuan
db1e9ee771 Merge branch 'PeterDing-porn91' 2015-05-31 00:33:30 +08:00
Yen Chi Hsuan
a2d971309b [porn91] Use single quotes 2015-05-31 00:31:18 +08:00
Yen Chi Hsuan
d05a1dbe70 [porn91] Catch daily limit error 2015-05-31 00:26:12 +08:00
Yen Chi Hsuan
a80601f8d9 [porn91] Extract more info 2015-05-31 00:20:37 +08:00
Yen Chi Hsuan
1c22238756 [porn91] Simplify 2015-05-31 00:03:19 +08:00
Yen Chi Hsuan
9ff811c5cd [porn91] PEP8 2015-05-30 23:35:55 +08:00
Yen Chi Hsuan
1ebc05df91 Merge branch 'porn91' of https://github.com/PeterDing/youtube-dl into PeterDing-porn91 2015-05-30 23:33:10 +08:00
Sergey M․
386bdfa698 [youtube:user] Workaround 35 pages limitation (Closes #5778) 2015-05-30 18:29:16 +06:00
Naglis Jonaitis
1ae7ff771b [tubitv] Add error message for videos that require login (#5524) 2015-05-30 14:33:27 +03:00
Naglis Jonaitis
5196b98897 [tubitv] Add new extractor (Closes #5524) 2015-05-30 14:16:18 +03:00
Sergey M․
e6e63e91a7 [tf1] Extend _VALID_URL (Closes #5848) 2015-05-30 16:18:11 +06:00
Sergey M․
b4dd98358f [vgtv] Properly handle lives 2015-05-30 16:12:07 +06:00
Sergey M․
181c7053e3 [YoutubeDL] Make sure all formats have unique format_id 2015-05-30 16:04:44 +06:00
Sergey M․
4d454c5e4b [vgtv] Check for inactive videos 2015-05-30 15:15:42 +06:00
Sergey M․
5c2191a605 [vgtv] Skip wasLive hds (Closes #5835) 2015-05-30 15:14:10 +06:00
Sergey M․
bba5bfc890 Merge branch 'ping-soompi' 2015-05-30 14:37:18 +06:00
Sergey M․
1a5b77dc21 [crunchyroll] Fix python 3.2 2015-05-30 14:36:45 +06:00
Sergey M․
b2cf6543b2 [soompi] Improve and simplify 2015-05-30 14:30:04 +06:00
Sergey M․
0385d64223 [crunchyroll] Extract subtitles extraction routine 2015-05-30 14:12:58 +06:00
pulpe
6ebdfe43e4 [tube8] fix extractor (fixes #5846) 2015-05-30 09:30:14 +02:00
Yen Chi Hsuan
fafec39d41 [spiegeltv] Changed RTMP server (fixes #5788 and fixes #5843)
Thanks to @brickleroux for finding out the problem
2015-05-30 13:23:09 +08:00
PeterDing
670861bd20 [iqiyi] Do not request for unneeded formats 2015-05-30 10:37:54 +08:00
PeterDing
605ec701b7 [iqiyi] Add new extractor for iqiyi.com 2015-05-29 23:32:04 +08:00
pulpe
d6aa68ce75 [postprocessor/embedthumbnail] embed mp4 too (fixes #5840) 2015-05-29 12:47:20 +02:00
Philipp Hagemeister
eb6cb9fbe9 release 2015.05.29 2015-05-29 07:52:17 +02:00
Yen Chi Hsuan
84e1e036c2 [senate] Extend _VALID_URL (fixes #5836) 2015-05-29 12:44:31 +08:00
PeterDing
806598b94d [porn91] the one that _search_regex returns not needs to be checked 2015-05-29 08:21:24 +08:00
Sergey M․
e26be70bca Merge branch 'soompi' of https://github.com/ping/youtube-dl into ping-soompi 2015-05-28 22:22:29 +06:00
Sergey M․
9e0b579128 [nowtv] Add test for rtlnitro 2015-05-28 01:26:14 +06:00
Sergey M․
ff4a1279f2 [nowtv] Do not request unnecessary metadata 2015-05-28 01:15:04 +06:00
Sergey M․
9b254aa177 [nowtv] Add non-free video check 2015-05-27 23:41:43 +06:00
PeterDing
703d78bbf5 [porn91] change re to _search_regex 2015-05-28 01:37:24 +08:00
Sergey M․
d9446c7319 Merge branch 'akirk-nowtv' 2015-05-27 23:22:19 +06:00
Sergey M․
b25b645d51 [nowtv] Improve and simplify 2015-05-27 23:20:32 +06:00
PeterDing
d90b3854ca [porn91] Add new extractor for 91porn.com 2015-05-28 00:37:00 +08:00
Sergey M․
bf24c3d017 [facebook] Improve title regex (Closes #5816) 2015-05-27 21:25:07 +06:00
Yen Chi Hsuan
f0bfaa2d7d [nrk] Update subtitles test
Subtitle conversion routine is removed, so the subtitles are TTML now. See
1c7e2e64f6
2015-05-27 15:23:34 +08:00
Yen Chi Hsuan
f9f3e3df9a [teamcoco] Use determine_ext to determine the video type
Some videos does not contain a 'type' field (#5798)
2015-05-27 14:51:18 +08:00
Yen Chi Hsuan
f8d5e1cfb5 [naver] Fix video url (fixes #5809)
RTMP urls in test:naver does not work. Need more investigation.
2015-05-27 14:44:08 +08:00
Yen Chi Hsuan
c23848b3c5 [naver] Enhanced error detection 2015-05-27 14:20:29 +08:00
Yen Chi Hsuan
6d00a2dcd1 [bilibili] Catch API call failures
JSON are returned in a failed API call
2015-05-27 04:23:21 +08:00
Yen Chi Hsuan
b535170b21 [bilibili] Skip assertion if HQ videos not available 2015-05-27 04:14:24 +08:00
Sergey M․
1434184c57 [spankwire] Do not modify aes key string 2015-05-27 01:42:53 +06:00
Sergey M․
7a372b64df [pornhub] Do not modify aes key string (Closes #5824) 2015-05-27 01:41:00 +06:00
Sergey M․
5406af92bc [dailymotion:user] Fix _VALID_URL 2015-05-26 22:16:47 +06:00
Sergey M․
7d65242dc3 [dailymotion:user] Process user home as user (Closes #5823) 2015-05-26 22:12:26 +06:00
Naglis Jonaitis
544a8693b7 Remove Firedrive and Sockshare imports
Oops
2015-05-26 13:53:14 +03:00
Naglis Jonaitis
35a4f24a37 [firedrive] Remove extractor (Closes #3870)
Haywire since last October.
2015-05-26 13:44:46 +03:00
Naglis Jonaitis
ff305edd64 [sockshare] Remove extractor
Haywire since last October.
2015-05-26 13:43:00 +03:00
Yen Chi Hsuan
efec4358b9 [cinemassacre] Support an alternative form of screenwavemedia URL
fixes #5821
2015-05-26 13:54:41 +08:00
Yen Chi Hsuan
db3ca36403 [facebook] Move the title extraction warning below (fixes #5820) 2015-05-26 13:41:38 +08:00
Yen Chi Hsuan
42833b44b5 [tf1] Extend _VALID_URL (fixes #5819) 2015-05-26 13:32:43 +08:00
Alexander Kirk
5d0a33eebc rtlnow is now hosted at nowtv.de 2015-05-25 20:36:25 +02:00
Sergey M․
ba2df04b41 [odnoklassniki] Make URL explicit 2015-05-25 21:27:43 +06:00
Sergey M․
c6bbdadd79 [odnoklassniki] Support extraction from metadata URL (Closes #5813) 2015-05-25 21:22:13 +06:00
Sergey M․
b885bae634 Credit @misterhat for karrierevideos (#5729) 2015-05-25 04:53:53 +06:00
Sergey M?
d41ebe146b [tenplay] Fix formats and modernize (Closes #5806) 2015-05-24 23:58:09 +06:00
Jaime Marquínez Ferrándiz
4b4e1af059 [arte] Remove unused import 2015-05-24 18:46:29 +02:00
Sergey M.
80240b347e Merge pull request #5780 from jaimeMF/remove-nondash
[youtube] Remove the nondash formats (fixes #5774)
2015-05-24 21:42:15 +05:00
Jaime Marquínez Ferrándiz
04b3b3df05 [youtube] Remove the nondash formats (fixes #5774)
Since we use fixed values for some fields like width and height they can be wrong, and would get picked by some formats filters.
For example for https://www.youtube.com/watch?v=EQCrhbBxsjA the biggest height is 720 and for nondash formats it's set to 1440, so -f 'bestvideo[height>=1200]+bestaudio' would incorrectly pick the nondash format, instead it should report that the requested format is not available.
2015-05-24 18:26:20 +02:00
Sergey M․
2ad5708c43 [arte:future] Switch to search_regex for now (Closes #5801) 2015-05-24 21:25:00 +06:00
Sergey M․
63f3cab4ae [rtbf] Fix extraction (Closes #5803) 2015-05-24 21:09:08 +06:00
Sergey M․
8cdf03a7a2 Merge branch 'misterhat-karrierevideos' 2015-05-24 20:14:54 +06:00
Sergey M․
d78c834ead [karrierevideos] Improve and simplify 2015-05-24 20:04:13 +06:00
Sergey M․
05a976cd99 Merge branch 'karrierevideos' of https://github.com/misterhat/youtube-dl into misterhat-karrierevideos 2015-05-24 19:19:48 +06:00
Sergey M․
34fb7e46ad [empflix] Relax _VALID_URL 2015-05-24 19:11:40 +06:00
Sergey M․
abac15f3c6 [tnaflix] Do not capture cat_id 2015-05-24 19:11:31 +06:00
Sergey M.
b700055ba4 Merge pull request #5772 from frenchy1983/fix_tnaflix_regex
[TNAFlix] Allow dot (and more) in cat_id and display_id
2015-05-24 17:54:25 +05:00
Sergey M.
23905927e1 [README.md] Keep more idiomatic rwx order 2015-05-24 18:32:04 +06:00
Sergey M.
56be5f1567 Merge pull request #5800 from WassimAttar/patch-1
[README.md] chmod error
2015-05-24 17:29:26 +05:00
WassimAttar
1807ae22dd chmod error
After installing youtube-dl with this method
    sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
    sudo chmod a+xr /usr/local/bin/youtube-dl
When i try to use it, i get this error
    python: can't open file '/usr/local/bin/youtube-dl': [Errno 13] Permission denied

The correct chmod is a+xr
2015-05-24 10:37:05 +02:00
Sergey M․
71646e4653 [YoutubeDL] Initialize files_to_delete (Closes #5797) 2015-05-24 04:14:01 +06:00
Sergey M?
1335c3aca8 [drtv] Improve extraction (Closes #5792) 2015-05-24 01:22:11 +06:00
Yen Chi Hsuan
30455ce255 [nextmedia] Extend and reorder _VALID_URL 2015-05-24 02:42:01 +08:00
Yen Chi Hsuan
9bf87ae3aa [nextmedia] Merge AppleDailyRealtimeNewsIE and AppleDailyAnimationNewsIE 2015-05-24 02:36:47 +08:00
Yen Chi Hsuan
abca34cbc0 [cnn] Relax _VALID_URL again (fixes #5737)
The problem is the same as test:CNN_1, so I didn't add the test case
2015-05-24 02:04:02 +08:00
Sergey M․
d386878af9 [prosiebensat1] Add support for .at domain names (Closes #5786) 2015-05-23 21:25:53 +06:00
Sergey M․
685c74d315 [rutv] Extend embed URL (Closes #5782) 2015-05-23 01:01:47 +06:00
Sergey M․
69e0f1b445 Credit @ping for viki:channel, qqmusic:toplist 2015-05-23 00:08:10 +06:00
Jaime Marquínez Ferrándiz
79979c6897 Clarify that --dump-pages encodes the pages using base64 (#5781) 2015-05-22 16:15:50 +02:00
Jaime Marquínez Ferrándiz
ba64547616 [sportbox] Remove unused import 2015-05-22 11:35:09 +02:00
frenchy1983
ed5a637d62 [TNAFlix] Restore test
See dstftw's comment in #5772
2015-05-22 09:29:35 +02:00
Yen Chi Hsuan
8a278a1d7e [nba] Fix duration extraction (fixes #5777) 2015-05-22 13:30:39 +08:00
Sergey M․
77d9cb2f04 [sportbox] Fix extraction 2015-05-22 00:45:33 +06:00
Sergey M․
0459432d96 [shared] Fix for python 3.2 2015-05-22 00:10:53 +06:00
Sergey M․
43150d7ac3 [shared] Fix for python 3.2 2015-05-22 00:10:05 +06:00
Sergey M․
afe8b594be [rtve.es:alacarta] Fix for python 3.2 2015-05-22 00:09:15 +06:00
Sergey M․
878563c847 [aes] Fix for python 3.2 2015-05-22 00:06:10 +06:00
Sergey M․
06947add03 [chilloutzone] Fix for python 3.2 2015-05-22 00:03:47 +06:00
Sergey M․
5cd47a5e4f [videott] Fix for python 3.2 2015-05-21 23:58:46 +06:00
Sergey M․
53de95da5e [viki] Extend _VALID_URLs 2015-05-21 22:27:22 +06:00
Sergey M․
663004ac2b [options] Clarify --metadata-from-title additional templates 2015-05-21 22:06:25 +06:00
Jaime Marquínez Ferrándiz
6ad9cb224a [mitele] It now uses m3u8 (#5764)
It should also be possible to use Adobe HDS, but it would require more work.
2015-05-21 12:02:53 +02:00
frenchy1983
e7752cd578 [TNAFlix] Allow dot (and more) in cat_id and display_id
URLs with dots were raising a "UnsupportedError: Unsupported URL" error.
2015-05-21 11:47:16 +02:00
Jaime Marquínez Ferrándiz
4d2f42361e [viki] remove unused import 2015-05-21 11:42:20 +02:00
Sergey M․
4d8ee01389 [viki] Fix typo 2015-05-21 02:38:43 +06:00
Sergey M․
d01924f488 [viki:channel] Extend matching URLs and extract movies 2015-05-21 02:30:04 +06:00
Sergey M․
bc56355ec6 [viki:channel] Switch to API 2015-05-21 02:08:13 +06:00
Sergey M․
ac20d95f97 [viki] Add support for youtube externals 2015-05-21 01:56:02 +06:00
Sergey M․
1a83c731bd [viki] Switch extraction to API 2015-05-21 01:44:05 +06:00
Sergey M․
ca57a59883 Merge branch 'ping-viki-shows' 2015-05-20 22:10:06 +06:00
Sergey M․
b0d619fde2 [viki:channel] Extract title from JSON 2015-05-20 21:28:04 +06:00
Sergey M․
cc7051efd7 Merge branch 'viki-shows' of https://github.com/ping/youtube-dl into ping-viki-shows 2015-05-20 20:17:47 +06:00
ping
5137adb94d [soompi] Switch to non-geoblocked test video 2015-05-20 16:16:10 +08:00
Philipp Hagemeister
0b9f7cd074 release 2015.05.20 2015-05-20 10:01:48 +02:00
ping
2632941f32 [soompi] Add new extractor for tv.soompi.com 2015-05-20 15:53:45 +08:00
ping
137597b0ea [dramafever] Streamline code 2015-05-20 15:15:28 +08:00
Yen Chi Hsuan
051df9ad99 [letv/sohu] Skip tests relying on external proxies
The proxy is currently broken. See #5655 and zhuzhuor/Unblock-Youku#427
2015-05-20 14:08:23 +08:00
ping
f670ef1c8e [dramafever] Add new extractor for dramafever.com 2015-05-20 13:51:43 +08:00
Sergey M․
d9d747a06a [ultimedia] Fix extraction 2015-05-19 21:28:41 +06:00
Yen Chi Hsuan
b813d8caf1 [qqmusic] Unescape '\\n' in description (#5705) 2015-05-19 01:01:42 +08:00
Yen Chi Hsuan
ecee572411 [yahoo] Add support for closed captions (closes #5714) 2015-05-19 00:50:24 +08:00
Yen Chi Hsuan
1b0427e6c4 [utils] Support TTML without default namespace
In a strict sense such TTML is invalid, but Yahoo uses it.
2015-05-19 00:45:01 +08:00
Jaime Marquínez Ferrándiz
2aa64b89b3 tox: Pass HOME environment variable
Since version 2.0 it only passes a limited set of variables and we need HOME for the tests
2015-05-18 17:58:53 +02:00
Sergey M․
484c9d2d5b [vier] Fix extraction 2015-05-18 21:43:54 +06:00
Sergey M․
5d8dcb5342 [vuclip] Fix extraction 2015-05-18 21:39:15 +06:00
Sergey M․
2328f2fe68 [vulture] Fix extraction 2015-05-18 21:34:20 +06:00
Sergey M․
4f514c7e88 [wimp] Fix youtube extraction (Closes #5690) 2015-05-18 21:29:41 +06:00
Sergey M․
5bdc520cf1 [xminus] Fix extraction 2015-05-18 21:23:05 +06:00
Jaime Marquínez Ferrándiz
fc6e75dd57 [instagram] Only recognize https urls (fixes #5739)
http urls redirect to them.
2015-05-18 11:21:09 +02:00
Sergey M․
4a5a898a8f [YoutubeDL] Clarify incompatible formats merge message
When `-f` is not specified it's misleading to see `You have requested ...` as user did not actually request any formats.
2015-05-17 20:56:03 +06:00
Mister Hat
ba9d16291b manually specify namespace 2015-05-17 03:35:08 -05:00
Mister Hat
725652e924 [karrierevideos] add support for www.karrierevideos.at (closes #5354) 2015-05-16 19:50:58 -05:00
ping
8da0e0e946 [viki] Change IE name to channel, better message output 2015-05-17 06:19:38 +08:00
Sergey M․
588b82bbf8 [tv2:article] Add extractor (Closes #5724) 2015-05-17 03:32:53 +06:00
Sergey M․
bc0f937b55 [tv2] Add extractor (#5724) 2015-05-17 03:01:52 +06:00
Sergey M․
baa43cbaf0 [extractor/common] Relax valid url check verbosity 2015-05-17 02:59:35 +06:00
Sergey M․
adb6b1b316 Merge branch 'viki-shows' of https://github.com/ping/youtube-dl into ping-viki-shows 2015-05-17 00:38:58 +06:00
ping
1c18de0019 [viki] Add proper paging and include clips 2015-05-17 01:38:50 +08:00
Jaime Marquínez Ferrándiz
4d52f2eb7f [sbs] Remove unused import 2015-05-16 18:38:28 +02:00
Sergey M․
363cf58645 Merge branch 'viki-shows' of https://github.com/ping/youtube-dl into ping-viki-shows 2015-05-16 21:28:36 +06:00
Sergey M․
7e760fc188 [espn] Add extractor (#4396)
Unfinished
2015-05-16 21:14:19 +06:00
Sergey M․
ef2dcbe4ad [sbs] Fix extraction (Closes #5725) 2015-05-16 21:07:29 +06:00
Sergey M․
9354a5fad4 [ooyala] Fix unresolved reference 2015-05-16 20:15:31 +06:00
Sergey M․
1c97b0a777 [ooyala:external] Add extractor 2015-05-16 20:00:40 +06:00
ping
2f3bdab2b9 [viki] Fix code format 2015-05-16 15:56:37 +08:00
ping
0d7f036429 [viki] Add support for shows 2015-05-16 15:43:13 +08:00
Sergey M.
2cda13213d Merge pull request #5717 from blissland/master
[CBSNewsIE] Relax thumbnail regex so test passes
2015-05-15 22:36:07 +05:00
Sergey M․
70d0d43b5e [rts] Check formats (Closes #5711) 2015-05-15 23:32:25 +06:00
Sergey M․
25c3a7348f [generic] Fix typo 2015-05-15 23:23:51 +06:00
Sergey M․
9123d64592 Merge branch 'maddoger-sportbox-fix' 2015-05-15 23:19:21 +06:00
Sergey M․
b827a6015c [generic] Add test for sportbox embeds 2015-05-15 23:18:21 +06:00
Sergey M․
d40a3b5b55 [generic] Add support for sportbox embeds 2015-05-15 23:09:34 +06:00
Sergey M․
ef28a6cb26 [sportbox:embed] Relax thumbnail 2015-05-15 23:09:10 +06:00
Sergey M․
1436a6835e [sportbox:embed] Add _extract_urls 2015-05-15 23:08:44 +06:00
blissland
e8cfacae37 [CBSNewsIE] Relax thumbnail regex so test passes 2015-05-15 17:57:32 +01:00
Sergey M․
3a7382950b [sportbox:embed] Add extractor 2015-05-15 22:50:44 +06:00
Jaime Marquínez Ferrándiz
eeb23eb7ea [gamespot] The protocol is not optional 2015-05-15 18:44:08 +02:00
Jaime Marquínez Ferrándiz
34fe5a94ba [gamespot] Add support for videos that don't use 'f4m_stream' (fixes #5707) 2015-05-15 18:42:59 +02:00
Sergey M․
6181864290 Merge branch 'sportbox-fix' of https://github.com/maddoger/youtube-dl into maddoger-sportbox-fix 2015-05-15 22:09:18 +06:00
Vitaliy Syrchikov
e9ca615a98 New test 2015-05-15 19:57:54 +04:00
Sergey M․
62c95fd5fc [youtube:feed] Check each 'load more' portion for unique video ids 2015-05-15 21:42:34 +06:00
Sergey M․
25f14e9f93 [youtube] Separate feed extractor 2015-05-15 21:06:59 +06:00
Vitaliy Syrchikov
ae670a6ed8 Sportbox source fix. HD videos support. 2015-05-15 17:53:05 +04:00
Vitaliy Syrchikov
a7b8467ac0 Sportbox extractor fix. 2015-05-15 16:52:11 +04:00
blissland
15da7ce7fb Fix file format extraction regex and update test file checksum 2015-05-15 14:12:52 +02:00
Jaime Marquínez Ferrándiz
e9eaf3fbcf [test/YoutubeDL] Add tests for 'playliststart', 'playlistend' and 'playlist_items' 2015-05-15 14:08:26 +02:00
Jaime Marquínez Ferrándiz
3884dcf313 YoutubeDL: ignore indexes from 'playlist_items' that are not in the list (fixes #5706)
We ignore them instead of failing to match the behaviour of the 'playliststart' parameter.
2015-05-15 14:08:26 +02:00
Philipp Hagemeister
c4fc559f45 release 2015.05.15 2015-05-15 10:13:43 +02:00
Jaime Marquínez Ferrándiz
2bc4330303 [youtube:history] Fix extraction (fixes #5702)
It uses the same method as YoutubeSubscriptionsIE, if other feed starts using it we should consider using base class.
2015-05-14 23:41:27 +02:00
Yen Chi Hsuan
12675275a1 [teamcoco] Detect expired videos (#5626) 2015-05-15 02:28:41 +08:00
Yen Chi Hsuan
3a105f7b20 [teamcoco] Rewrite preload data extraction
Idea: "puncture" some consecutive fragments and check whether the
b64decode result of a punctured string is a valid JSON or not.

It's a O(N^3) algorithm, but should be fast for a small N (less than 30
fragments in all test cases)
2015-05-15 02:28:40 +08:00
Sergey M․
1ae72fb23d [soundcloud:user] Defer download link resolve (Closes #5248)
Looks like final download links can expire before downloading process reach them. So, resolving download links right before actual downloading.
2015-05-14 22:28:42 +06:00
Yen Chi Hsuan
7ec676bb3d [qqmusic] Add IE_NAME for all extractors 2015-05-14 23:32:36 +08:00
Yen Chi Hsuan
29ea57283e [qqmusic] Refactoring QQMusicToplistIE 2015-05-14 23:28:42 +08:00
Yen Chi Hsuan
5488973961 [qqmusic] flake8 2015-05-14 23:25:43 +08:00
Yen Chi Hsuan
96d45a5489 Merge pull request #5680 from ping/qqmusic-toplist-ie
[qqmusic] Add support for charts / top lists
2015-05-14 23:23:32 +08:00
Sergey M․
7a012d5a16 [screenwavemedia] Add support for player2 URLs (Closes #5696) 2015-05-14 16:39:35 +06:00
Yen Chi Hsuan
fa6a16996e [worldstarhiphop] Support Android URLs (fixes #5629) 2015-05-14 18:00:57 +08:00
Sergey M․
82245a6de7 [YoutubeDL] Restore filename for thumbnails 2015-05-14 15:21:27 +06:00
Sergey M․
ff28ede2d1 Merge branch 'dstftw-best-fallback-on-outdated-avconv' 2015-05-14 15:19:14 +06:00
Sergey M․
98b8ec8616 Merge branch 'best-fallback-on-outdated-avconv' of https://github.com/dstftw/youtube-dl into dstftw-best-fallback-on-outdated-avconv
Conflicts:
	youtube_dl/YoutubeDL.py
2015-05-14 15:18:58 +06:00
Yen Chi Hsuan
88f9d8748c Merge remote-tracking branch 'upstream/master' 2015-05-14 17:07:02 +08:00
Sergey M․
7d57d2e18b [canalplus] Restore checksums in tests 2015-05-14 14:59:27 +06:00
Sergey M.
38caa00d18 Merge pull request #5695 from blissland/master
[CanalplusIE] Update tests that were no longer working
2015-05-14 13:57:56 +05:00
Yen Chi Hsuan
c827d4cfdb [xattr] Enhanced error messages on Windows 2015-05-14 16:53:10 +08:00
blissland
509c630db8 [CanalplusIE] Update tests that were no longer working 2015-05-14 08:09:56 +01:00
Yen Chi Hsuan
fbff30d2db [xattr] Catch 'Argument list too long' 2015-05-14 14:51:00 +08:00
Yen Chi Hsuan
86c7fdb17c [xattr] Enhance error handling to catch ENOSPC
Fixes #5589
2015-05-14 14:28:41 +08:00
Yen Chi Hsuan
62bd6589c7 Merge pull request #5692 from yan12125/fix-embedthumbnailpp
Use thumbnails downloaded by YoutubeDL in EmbedThumbnailPP
2015-05-14 12:35:58 +08:00
Yen Chi Hsuan
2cc6d13547 [postprocessor/embedthumbnail] Encode arguments in calling AtomicParsley 2015-05-14 04:41:30 +08:00
Yen Chi Hsuan
bb8ca1d112 [postprocessor/embedthumbnail] Use run_ffmpeg_multiple_files 2015-05-14 02:35:28 +08:00
Yen Chi Hsuan
8e59539752 [postprocessor/embedthumbnail] Use thumbnails downloaded by YoutubeDL 2015-05-14 02:32:00 +08:00
Sergey M․
372744c544 [odnoklassniki] Fix extraction (Closes #5671) 2015-05-13 22:26:30 +06:00
Sergey M.
83880949a1 Merge pull request #5682 from blissland/master
[BYUtvIE] Relax thumbnail regex so test does not fail
2015-05-13 19:36:22 +05:00
Yen Chi Hsuan
3749e36e9f [YoutubeDL] Fix PEP8 W503 2015-05-13 21:16:45 +08:00
blissland
0b4253fa37 [BYUtvIE] Change thumbnail regex so test does not fail 2015-05-12 18:57:06 +01:00
ping
86ec1e487c [qqmusic] Code fixes 2015-05-13 01:37:56 +08:00
ping
fd4eefed39 [qqmusic] Fix extraction for global list 2015-05-13 01:14:02 +08:00
ping
b480e7874b [qqmusic] Fix code formatting 2015-05-12 22:41:37 +08:00
ping
41333b97b9 [qqmusic] Add support for charts / top lists 2015-05-12 22:35:16 +08:00
Yen Chi Hsuan
c1c924abfe [utils,common] Merge format_srt_time and _subtitles_timecode
format_srt_time uses a comma as the delimiter between seconds and
milliseconds while _subtitles_timecode uses a dot. All .srt examples I
found on the Internet uses a comma, so I use a comma in the merged
version. See http://matroska.org/technical/specs/subtitles/srt.html and
http://devel.aegisub.org/wiki/SubtitleFormats/SRT
2015-05-12 13:04:54 +08:00
Yen Chi Hsuan
1c7e2e64f6 [nrk] Remove TTML to srt conversion codes
A common routine is implemented in utils.py and can be used via
--convert-subtitles.
2015-05-12 12:55:14 +08:00
Yen Chi Hsuan
7dff03636a [utils] Support 'dur' field in TTML 2015-05-12 12:47:37 +08:00
Yen Chi Hsuan
5332fd91bf [nytimes] Correct _VALID_URL of NYTimesArticleIE 2015-05-12 12:42:13 +08:00
Sergey M․
d4b963d0a6 [vine] Relax alt_title (Closes #5677) 2015-05-12 01:54:56 +06:00
Sergey M․
6d3f5935e5 [southpark] Fix IE_NAME 2015-05-11 23:47:50 +06:00
rrooij
968ee17677 [southparkdk] Add extractor 2015-05-11 23:45:38 +06:00
rrooij
81ed3bb9c0 [southpark] Sort alphabetically 2015-05-11 23:45:29 +06:00
Sergey M․
5115652828 [zingmp3] Capture error message 2015-05-11 21:31:36 +06:00
Sergey M․
1f92865494 [dumpert] Add cpc cookie (Closes #5672) 2015-05-11 21:05:39 +06:00
Yen Chi Hsuan
e41f450f28 [tmz] Add support for articles (fixes #5477) 2015-05-11 20:06:10 +08:00
Sergey M․
97fcf1bbd0 [YoutubeDL] Check if merger can actually merge 2015-05-11 02:01:16 +06:00
Sergey M․
13763ce599 [postprocessor/ffmpeg] Add can_merge method 2015-05-11 02:00:31 +06:00
Sergey M․
7fcb605b82 [YoutubeDL] Fallback to -f best when merger is outdated 2015-05-11 00:27:29 +06:00
Sergey M․
70484b9f8a [postprocessor/ffmpeg] Extract check_outdated method 2015-05-11 00:26:39 +06:00
Jaime Marquínez Ferrándiz
69b46b3d95 ExecAfterDownloadPP: fix __init__ method 2015-05-10 17:47:49 +02:00
Jaime Marquínez Ferrándiz
95c5534f8e ExecAfterDownloadPP, YoutubeDL: remove unused parameters 2015-05-10 17:41:11 +02:00
Sergey M․
370b39e8ec [voicerepublic] Fix fallback branch formats extraction 2015-05-10 18:37:52 +06:00
Sergey M․
3da8038918 Merge branch 'duncankl-voicerepublic' 2015-05-10 18:29:36 +06:00
Sergey M․
a6762c4a22 [voicerepublic] Make more robust and extract more metadata 2015-05-10 18:29:15 +06:00
Sergey M․
98c2c0febc Merge branch 'voicerepublic' of https://github.com/duncankl/youtube-dl into duncankl-voicerepublic 2015-05-10 17:31:55 +06:00
Yen Chi Hsuan
63cbd19f50 [ndr] Replace the 404 test case 2015-05-10 18:30:26 +08:00
Yen Chi Hsuan
1934f3a0ea [ndr] Extended to support n-joy.de as well (closes #4527)
According to http://en.wikipedia.org/wiki/N-Joy, n-joy.de is a service
hosted by NDR, so I put them together.
2015-05-10 18:22:07 +08:00
ping
a909e6ad43 [dailymotion] Patch upload_date detection.
(closes #5665)
2015-05-10 11:13:14 +02:00
Duncan
1dcb52188d [voicerepublic] Remove hardcoded paths to media files 2015-05-10 17:06:34 +12:00
Duncan
28ebef0b1b [voicerepublic] Detect list of available formats from the web page 2015-05-10 16:03:09 +12:00
Duncan
f03a8a3c4e [voicerepublic] Raise ExtractorError if audio is still being processed 2015-05-10 15:50:06 +12:00
Duncan
03f760b1c0 [voicerepublic] Remove creator field 2015-05-10 15:41:27 +12:00
Duncan
f900dc3fb9 [voicerepublic] Extract author using _html_search_meta 2015-05-10 15:01:58 +12:00
Sergey M․
95eb1adda8 [life:embed] Sort formats 2015-05-10 08:54:50 +06:00
Duncan
c6ddbdb66c [voicerepublic] Add new extractor 2015-05-10 12:39:24 +12:00
Sergey M․
3800b908b1 [mlb] Fix #5663 2015-05-10 06:14:34 +06:00
Philipp Hagemeister
69fe3a5f09 release 2015.05.10 2015-05-10 01:05:24 +02:00
Sergey M․
754270313a [life:embed] Move to separated extractor and extract m3u8 formats 2015-05-10 01:03:26 +06:00
Sergey M․
057ebeaca3 [lifenews] Add test for #5660 2015-05-10 00:27:49 +06:00
Sergey M․
480065172d [lifenews] Add support for video URLs (Closes #5660) 2015-05-10 00:26:42 +06:00
Sergey M․
f2e0056579 [vgtv] Avoid duplicate format_id 2015-05-09 21:23:09 +06:00
Sergey M․
32fffff2cc [eroprofile] Fix video URL extraction (Closes #5657) 2015-05-09 21:19:09 +06:00
Sergey M.
3c47824d6b Merge pull request #5658 from blissland/master
[BRIE] Updated two test cases
2015-05-09 20:07:21 +05:00
blissland
0892090a56 Added audio test for BRIE 2015-05-09 16:02:07 +01:00
blissland
d592b42f5c Updated two tests for BRIE 2015-05-09 15:26:00 +01:00
Jaime Marquínez Ferrándiz
3b5f65a64c [mlb] Fix extraction of articles
And move test from generic, since it's directly handled by MLBIE
2015-05-09 12:41:56 +02:00
Jaime Marquínez Ferrándiz
5c0b2c16a8 [vgtv] Escape '#' in _VALID_URL and remove empty newlines at the end
In verbose mode, '#' is interpreted as the start of a comment.
2015-05-09 12:34:45 +02:00
Yen Chi Hsuan
d39e0f05db [utils] Remove sanitize_url_path_consecutive_slashes()
This function is used only in SohuIE, which is updated to use a new
extraction logic.
2015-05-09 17:37:39 +08:00
Yen Chi Hsuan
6d14d08e06 [yam] Fix title and uploader id 2015-05-09 17:36:07 +08:00
Yen Chi Hsuan
32060c6d6b [sohu] Update extractor
The original extraction logic always fails for all test videos
2015-05-09 14:02:11 +08:00
Yen Chi Hsuan
3dbec410a0 [sohu] Enhance error handling 2015-05-09 14:02:11 +08:00
Sergey M․
de765f6c31 [foxsports] Support some more URLs (#5611) 2015-05-09 02:15:51 +06:00
Sergey M․
dc455a5f88 [extractor/generic] Add test for svt embed 2015-05-09 00:27:37 +06:00
Sergey M․
bab19a8e91 [extractor/generic] Add support for svt embeds (Closes #5622) 2015-05-09 00:23:35 +06:00
Sergey M․
322915014f [svtplay] Rename to svt 2015-05-09 00:13:40 +06:00
Sergey M․
79998cd5af [svtplay] Generalize svt extractors and add svt.se extractor 2015-05-09 00:12:42 +06:00
Sergey M.
50b9013064 [README.md] Fix typo 2015-05-08 23:21:23 +06:00
Sergey M.
bb03fdae0d [README.md] Clarify format selection when streaming to stdout 2015-05-08 23:19:57 +06:00
Sergey M․
4384cf9e7d [extractor/__init__] Fix alphabetic order 2015-05-08 23:04:27 +06:00
Sergey M.
d47e980d0d Merge pull request #5641 from dstftw/preserve-best-for-stdout-outtmpl
[YoutubeDL] Do not force bestvideo+bestaudio when outtmpl is stdout
2015-05-08 22:01:50 +05:00
Sergey M․
fe373287eb [vgtv] Add support for bt vestlendingen (Closes #5620) 2015-05-08 22:59:50 +06:00
Sergey M․
cbe443362f [aftenposten] Implement in terms of xtream extractor 2015-05-08 22:52:20 +06:00
Sergey M․
2c0c9dc46c [xstream] Move xstream to separate extractor 2015-05-08 22:50:01 +06:00
Sergey M․
0ceab84749 [vgtv] Add support for bt.no articles (#5620) 2015-05-08 22:18:43 +06:00
Sergey M․
34e7dc81a9 [vgtv] Add support for generic bt.no URLs (#5620) 2015-05-08 22:03:03 +06:00
Sergey M․
4e6e9d21bd [mlb] Improve _VALID_URL 2015-05-08 21:48:47 +06:00
Sergey M․
d1feb30811 [mlb] Fallback to extracting video id from webpage for all URLs that does not contain it explicitly (Closes #5630) 2015-05-08 20:07:53 +06:00
blissland
43837189c1 Fix URL template extraction for netzkino. Fixes #5614 2015-05-08 12:20:34 +02:00
blissland
249962ffa2 [bet] Use unique part of xml url as the video id and fix tests (closes #5642)
The guid changes often.
2015-05-08 11:31:05 +02:00
Jaime Marquínez Ferrándiz
541168039d [utils] get_exe_version: encode executable name (fixes #5647)
It failed in python 2.x when $PATH contains a directory with non-ascii characters.
2015-05-08 11:01:24 +02:00
Yen Chi Hsuan
7ef00afe9d [nhl] Support RTMP videos (fixes #4481) 2015-05-08 03:11:25 +08:00
Yen Chi Hsuan
156fc83a55 [downloader/rtmp] Fix a typo 2015-05-08 03:11:24 +08:00
Naglis Jonaitis
46be82b811 [vessel] Use main_video_asset when searching for video_asset (Fixes #5623) 2015-05-07 22:00:07 +03:00
Yen Chi Hsuan
09b412dafa [nhl] Partial support for hlg id (fixes #4285) 2015-05-08 02:14:28 +08:00
Jaime Marquínez Ferrándiz
5268a05e47 [ooyala] Style fix 2015-05-07 17:04:15 +02:00
Sergey M․
406224be52 [extractor/generic] Fix following incomplete redirects (#5640) 2015-05-07 21:02:59 +06:00
Sergey M․
3799834dcf [YoutubeDL] Do not force bestvideo+bestaudio when outtmpl is stdout (#5627) 2015-05-07 20:46:11 +06:00
Yen Chi Hsuan
553e412bda Merge branch 'master' of github.com:rg3/youtube-dl 2015-05-07 22:24:49 +08:00
Sergey M․
f22834a372 [bild] Relax thumbnail test check 2015-05-07 20:20:43 +06:00
Sergey M.
bd349a8704 Merge pull request #5638 from blissland/master
[BildIE] Fix ampersands in xml attributes & update test thumbnails
2015-05-07 19:18:35 +05:00
blissland
bc08873cff Fix indents 2015-05-07 15:09:27 +01:00
Yen Chi Hsuan
aafe273990 [ooyala] Use SAS API to extract info (fixes #4336) 2015-05-07 22:07:32 +08:00
blissland
c09593c04e [BildIE] Escape ampersands in xml and update test thumbnail 2015-05-07 15:07:11 +01:00
Yen Chi Hsuan
84bf31aaf8 [ooyala] Extract m3u8 information (#2292) 2015-05-07 18:12:01 +08:00
Yen Chi Hsuan
05d5392cda [common] Ignore subtitles in m3u8 2015-05-07 18:06:22 +08:00
Yen Chi Hsuan
d9a743d917 [vice] Remove a redundant print 2015-05-07 18:05:37 +08:00
Yen Chi Hsuan
ac6c358c2a [teamcoco] Fix extracting preload data again 2015-05-07 12:58:00 +08:00
Sergey M․
ad0c0ad3b4 [historicfilms] Fix tape id extraction 2015-05-06 21:52:26 +06:00
Sergey M․
1ed34f3dd6 [gorillavid] Switch 404 test to only matching 2015-05-06 21:43:36 +06:00
Sergey M․
6a8f9cd22e [giga] Fix view count extraction 2015-05-06 21:39:53 +06:00
Sergey M․
e8b9ab8957 [pbs] Add format_id for direct links 2015-05-06 21:31:25 +06:00
Sergey M․
74f728249f [extractor/common] Fallback to empty string for (yet) missing format_id in _sort_formats (Closes #5624) 2015-05-06 21:24:24 +06:00
blissland
d6a1738892 [archive.org] Fix incorrect url condition (closes #5628)
The condition for assigning to json_url is the wrong way round:

currently for url: aaa.com/xxx

we get:

aaa.com/xxx&output=json

instead of the correct value:

aaa.com/xxx?output=json
2015-05-06 15:06:10 +02:00
Sergey M․
b326b07adc [lifenews] Use _proto_relative_url 2015-05-05 21:49:36 +06:00
Yen Chi Hsuan
07d2921c6d [lifenews] Correctly determine iframe links (fixes #5618) 2015-05-05 23:39:54 +08:00
Sergey M.
22e462c97a Merge pull request #5612 from rrooij/southparknl
Southparknl
2015-05-05 19:32:27 +05:00
rrooij
dcf8077906 [southparknl] Fix test to match playlist tests 2015-05-05 09:17:21 +02:00
rrooij
3408f6e64a [southparkde] Fix naming inconsistency
The class was first called 'SouthparkDe'. It is now changed to
'SouthParkDe' to match the name of the other extractors.
2015-05-05 09:01:07 +02:00
rrooij
e10dc0e1f0 [southparknl] Add extractor for southpark.nl 2015-05-05 08:59:09 +02:00
Sergey M․
ce5c1ae517 [noco] Remove unused import 2015-05-05 02:52:21 +06:00
Sergey M․
bbe718c97f Merge branch 'Tassatux-noco' 2015-05-05 02:50:58 +06:00
Sergey M․
01e4b1ee14 [noco] Update tests 2015-05-05 02:50:39 +06:00
Sergey M․
815ac0293e [noco] Modernize 2015-05-05 02:38:13 +06:00
Sergey M․
6568382d6f [noco] Extract all variations of audio/subtitles media 2015-05-05 02:27:24 +06:00
Sergey M․
f943b7ddce Merge branch 'noco' of https://github.com/Tassatux/youtube-dl into Tassatux-noco 2015-05-05 00:39:24 +06:00
Aurélien Dunand
ff9d68e7be [noco] Add test for multi languages video 2015-05-04 19:55:29 +02:00
Aurélien Dunand
7212560f4d [noco] Retrieve video language according to user options 2015-05-04 18:06:12 +02:00
Sergey M․
1aa43d77c0 [rutv] Remove superfluous check 2015-05-04 21:29:56 +06:00
Sergey M․
e038d5c4e3 [rutv] Fix preference 2015-05-04 21:29:32 +06:00
Sergey M․
dfad3aac98 [rutv] Fix live stream test URL 2015-05-04 21:23:26 +06:00
Yen Chi Hsuan
df8418ffcf [nytimes] Extend _VALID_URL (#2754) 2015-05-04 23:03:47 +08:00
Yen Chi Hsuan
50aa43b3ae [nytimes] Implement extracting videos from articles (closes #5436) 2015-05-04 23:03:47 +08:00
Jaime Marquínez Ferrándiz
a90552663e [livestream:original] Update url format (fixes #5598) 2015-05-04 16:54:01 +02:00
Jaime Marquínez Ferrándiz
883340c107 [livestream:original] Fix extraction (fixes #4702) 2015-05-04 16:52:17 +02:00
Yen Chi Hsuan
0fe2ff78e6 [NBC] Enhance embedURL extraction (closes #2549) 2015-05-04 21:55:04 +08:00
Philipp Hagemeister
dc1eed93be release 2015.05.04 2015-05-04 15:12:48 +02:00
Sergey M․
b2f82360d7 [escapist] Add uploader to tests 2015-05-04 19:06:07 +06:00
Sergey M․
782e0568ef [escapist] Modernize 2015-05-04 19:04:49 +06:00
Sergey M․
90b4b0eabe [escapist] Improve _VALID_URL 2015-05-04 19:01:08 +06:00
Sergey M․
cec04ef3a6 [escapist] Update tests' checksums 2015-05-04 19:00:34 +06:00
Sergey M․
71fa56b887 [escapist] Fix formats extraction 2015-05-04 18:59:22 +06:00
Yen Chi Hsuan
b9b3ab45ea [NBC] Enhance extraction of ThePlatform URL (fixes #5470) 2015-05-04 19:09:18 +08:00
Philipp Hagemeister
957b794c26 release 2015.05.03 2015-05-03 22:31:39 +02:00
Yen Chi Hsuan
8001607e90 [generic] Detect more MLB videos (fixes #5443) 2015-05-04 02:20:07 +08:00
Yen Chi Hsuan
3e7202c1bc [MLB] Extend _VALID_URL (#5443) 2015-05-04 01:59:26 +08:00
Yen Chi Hsuan
848edeab89 [lifenews] Detect <iframe> (fixes #5346) 2015-05-04 01:24:19 +08:00
Yen Chi Hsuan
1748d67aea [lifenews] Fix view count and comment count 2015-05-04 01:11:23 +08:00
Jaime Marquínez Ferrándiz
5477ca8239 [dailymotion] Use https urls
The video url still redirects to an http url, but it doesn't explicitly contain the video id.
2015-05-03 16:59:14 +02:00
Sergey M․
d0fd305023 [rutv] Add test for #5584 2015-05-03 10:00:34 +06:00
Sergey M․
8dab1e9072 [rutv] Recognize live streams (#5584) 2015-05-03 09:56:03 +06:00
Sergey M․
963aea5279 [baiduvideo] Improve _VALID_URL 2015-05-03 07:45:15 +06:00
Sergey M․
0a64aa7355 [vgtv] Fix _VALID_URL (Closes #5578) 2015-05-03 00:58:42 +06:00
Sergey M․
0669c89c55 [options] Clarify --write-annotations help 2015-05-02 23:38:30 +06:00
Sergey M․
2699da8041 [YoutubeDL] Improve description file naming 2015-05-02 23:36:55 +06:00
Sergey M․
98727e123f [YoutubeDL] Improve annotations file naming 2015-05-02 23:35:18 +06:00
Sergey M․
b29e0000e6 [YoutubeDL] Improve JSON info file naming 2015-05-02 23:23:44 +06:00
Sergey M․
b3ed15b760 [utils] Add replace_extension 2015-05-02 23:23:06 +06:00
Sergey M․
666a9a2b95 [YoutubeDL] Improve audio/video-only file naming 2015-05-02 23:11:34 +06:00
Sergey M․
a4bcaad773 [test_utils] Add tests for prepend_extension 2015-05-02 23:10:48 +06:00
Sergey M․
e65e4c8874 [utils] Improve prepend_extension
Now `ext` is appended to filename if real extension != expected extension.
2015-05-02 23:06:01 +06:00
Yen Chi Hsuan
21f6330274 [baiduvideo] Add new extractor (closes #4563) 2015-05-03 00:53:24 +08:00
Sergey M․
38c6902b90 [YoutubeDL] Ensure correct extension is always present for a merged file (Closes #5535) 2015-05-02 22:52:21 +06:00
Jaime Marquínez Ferrándiz
2ddcd88129 Remove code that was only used by the Grooveshark extractor 2015-05-02 17:29:56 +02:00
Yen Chi Hsuan
dd8920653c [Grooveshark] Remove the extractor
grooveshark.com was shut down on 2015/04/30
2015-05-02 21:46:33 +08:00
Sergey M․
c938c35f95 [iconosquare] Fix extraction 2015-05-02 07:18:22 +06:00
Yen Chi Hsuan
2eb0192155 [viki] Remove clean_html call 2015-05-02 01:35:46 +08:00
Yen Chi Hsuan
d948e09b61 [viki] Extract m3u8 videos (#4855) 2015-05-02 01:20:16 +08:00
Yen Chi Hsuan
89966a5aea [viki] Enhance error message handling (#3774) 2015-05-02 01:20:15 +08:00
Yen Chi Hsuan
8e3df9dfee [viki] Fix extractor and add a global availble test case 2015-05-02 01:20:15 +08:00
Sergey M․
5890eef6b0 [pbs] Add support for HD (Closes #3564, closes #5390) 2015-05-01 17:43:06 +06:00
Nikoli
083c1bb960 Add ability to embed subtitles in mkv files (closes #5434) 2015-05-01 11:54:40 +02:00
Yen Chi Hsuan
861e65eb05 [yahoo] Extend _VALID_URL 2015-05-01 12:32:24 +08:00
Sergey M․
650cfd0cb0 [bbccouk] Mute thumbnail 2015-05-01 04:07:30 +06:00
Sergey M․
e68ae99a41 [bbccouk] Add test for #5530 2015-05-01 04:02:56 +06:00
Sergey M․
8683b4d8d9 [bbccouk] Improve extraction (Closes #5530) 2015-05-01 03:59:13 +06:00
Sergey M․
1dbd717eb4 [theplaform] Fix FutureWarning 2015-05-01 02:51:55 +06:00
Sergey M․
6a8422b942 [foxsports] Add extractor (Closes #5517) 2015-05-01 02:49:06 +06:00
Sergey M․
cb202fd286 [YoutubeDL] Filter requested info fields on --load-info as well
In order to properly handle JSON info files generated by youtube-dl versions prior to 4070b458ec
2015-05-01 00:44:34 +06:00
Naglis Jonaitis
67fc8ecd53 [dreisat] Extend _VALID_URL (Closes #5548) 2015-04-30 21:28:08 +03:00
Jaime Marquínez Ferrándiz
df8301fef5 [YoutubeDL] pep8: use 'k not in' instead of 'not k in' 2015-04-30 20:18:42 +02:00
Sergey M․
4070b458ec [YoutubeDL] Do not write requested info in info JSON file (Closes #5562, closes #5564) 2015-04-30 23:55:05 +06:00
Yen Chi Hsuan
ffbc3901d2 Merge remote-tracking branch 'upstream/master' 2015-04-30 23:33:49 +08:00
Sergey M․
7a03280df4 [vporn] More metadata extraction fixes and tests update (#5560) 2015-04-30 21:31:38 +06:00
Yen Chi Hsuan
482a1258de [VeeHD] Replace the third test case due to copyright issues 2015-04-30 23:27:07 +08:00
Sergey M․
cd298882cd [vporn] Fix metadata extraction (#5560) 2015-04-30 21:25:17 +06:00
Sergey M․
e01c56f9e1 [YoutubeDL] Generalize best/worst format match behavior 2015-04-30 21:06:51 +06:00
Sergey M.
4d72df4031 Merge pull request #5556 from jaimeMF/best-format-nodash
Make 'best' format only match non-DASH formats (closes #5554)
2015-04-30 19:57:02 +05:00
Yen Chi Hsuan
f7f1df1d82 [VeeHD] Enhance extraction and fix tests (fixes #4965) 2015-04-30 22:37:41 +08:00
Yen Chi Hsuan
c4a21bc9db [bilibili] Extract multipart videos (closes #3250) 2015-04-30 18:26:08 +08:00
Yen Chi Hsuan
621ffe7bf4 [niconico] Fix so* video extraction (fixes #4874) (#2087) 2015-04-30 17:05:02 +08:00
Jaime Marquínez Ferrándiz
8dd5418803 Make 'best' format only match non-DASH formats (closes #5554)
Otherwise it's impossible to only download non-DASH formats, for example `best[height=?480]/best` would download a DASH video if it's the only one with height=480, instead for falling back to the second format specifier.
For audio only urls (soundcloud, bandcamp ...), the best audio will be downloaded as before.
2015-04-29 22:53:18 +02:00
Jaime Marquínez Ferrándiz
965cb8d530 [escapist] pep8 fixes 2015-04-29 22:46:19 +02:00
Yen Chi Hsuan
b2e8e7dab5 [niconico] Try to extract all optional fields from various sources 2015-04-30 02:24:05 +08:00
Yen Chi Hsuan
59d814f793 [niconico] Remove credentials from tests and enhance title extraction
All test videos can be downloaded without username and password now.
2015-04-30 00:50:48 +08:00
Yen Chi Hsuan
bb865f3a5e [niconico] Fix extraction and update tests (closes #5511) 2015-04-30 00:50:48 +08:00
Yen Chi Hsuan
9ee53a49f0 [YouPorn] Fix extractor 2015-04-30 00:50:48 +08:00
Sergey M.
79adb09baa Merge pull request #5553 from zouhair/master
Typo: twice "the the" to "the"
2015-04-29 20:05:48 +05:00
zouhair
cf0649f8b7 Typo: twice "the the" to "the" 2015-04-29 11:03:10 -04:00
Sergey M.
f8690631e2 Merge pull request #5552 from zouhair/master
Typo "incompatible" instead of "uncompatible"
2015-04-29 19:09:47 +05:00
zouhair
5456d78f0c Typo "incompatible" instead of "uncompatible" 2015-04-29 10:07:49 -04:00
Yen Chi Hsuan
cbbece96a2 [yourupload] Simplify 2015-04-29 04:05:14 +08:00
Yen Chi Hsuan
9d8ba307ef [yourupload] Fix extraction 2015-04-29 04:03:07 +08:00
Yen Chi Hsuan
ec7c1e85e0 [testtube] Fix test case 1
Seems the site now provides webm with higher bitrates
2015-04-29 00:24:58 +08:00
Yen Chi Hsuan
e70c7568c0 [testtube] Detect Youtube iframes (fixes #4867) 2015-04-29 00:22:17 +08:00
Yen Chi Hsuan
39b62db116 [youtube] Catch more alert messages (closes #5074) 2015-04-28 23:07:56 +08:00
Jaime Marquínez Ferrándiz
2edce52584 [vimeo] Fix password protected videos again (#5082)
Since they have changed again to the previous format, I've modified the regex to match both formats.
2015-04-28 15:06:08 +02:00
pulpe
10831b5ec9 [vimeo] Fix redirection 2015-04-28 14:56:48 +02:00
Philipp Hagemeister
3a0f0c263a release 2015.04.28 2015-04-28 09:11:18 +02:00
Sergey M․
2419a376b9 [moniker] Check not found error (#5541) 2015-04-27 23:46:16 +06:00
Sergey M․
e206740fd7 [moniker] Capture and output error message (#5541) 2015-04-27 23:44:05 +06:00
Sergey M․
290a5a8d85 [escapist] Fix imsVideo regex (#5090) 2015-04-27 22:17:51 +06:00
pulpe
e2dc351d25 [escapist] Fix extractor (fixes #5090) 2015-04-27 17:44:13 +02:00
Sergey M․
c86b61428b [utils] Fix another old python 2.6 kwargs issue (Closes #5539) 2015-04-27 20:00:18 +06:00
Sergey M.
40b96352c9 Merge pull request #5523 from jaimeMF/remove-format-limit
Remove the --max-quality option
2015-04-27 16:44:58 +05:00
Sergey M.
189ba90996 [README] Use youtube-dl test video URL 2015-04-27 16:05:01 +06:00
Sergey M.
c8183e661d [README] Document special characters escaping (#5538) 2015-04-27 16:01:30 +06:00
Sergey M․
053c94f1b3 [README] Clarify youtube-dl version when format selection changed to bestvideo+bestaudio/best 2015-04-27 15:21:51 +06:00
Sergey M․
b9d76a9571 Merge branch 'fstirlitz-philharmoniedeparis' 2015-04-27 03:36:46 +06:00
Sergey M․
a01cfc2951 [philharmoniedeparis] Fix extraction and tests, improve, simplify 2015-04-27 03:36:32 +06:00
Philipp Hagemeister
4eb5c65bee release 2015.04.26 2015-04-26 22:45:20 +02:00
felix
06d07c4000 New extractor: live.philharmoniedeparis.fr 2015-04-26 14:15:29 +02:00
Sergey M․
74f8654a53 [downloader/external] Use encodeArgument 2015-04-26 04:33:43 +06:00
Sergey M․
9e105a858c [downloader/rtmp] Fix arguments encoding and simplify retry logic (Closes #5528) 2015-04-26 04:32:54 +06:00
Sergey M․
cd8a07a764 [downloader/common] Use decodeArgument 2015-04-26 04:30:45 +06:00
Sergey M․
aa49acd15a [utils] Add get_subprocess_encoding and filename/argument decode counterparts 2015-04-26 04:29:41 +06:00
Jaime Marquínez Ferrándiz
642f23bd81 [southpark] Use 'ñ' in the spanish extractor name
IE_NAME can contain unicode characters, so it shouldn't be a problem.
2015-04-25 22:36:11 +02:00
pulpe
2e24e6bd17 Merge branch 'mp3' 2015-04-25 20:41:59 +02:00
pulpe
2a09c1b8ab [postprocessor/embedthumbnail] Fix mp3 embedding with avconv (fixes #5526) 2015-04-25 20:41:15 +02:00
Sergey M․
a5ebf77d87 [mplayer] Rename to RTSP 2015-04-26 00:25:51 +06:00
Sergey M․
b874495b1f [mplayer] Simplify 2015-04-26 00:23:16 +06:00
Sergey M․
b860f5dfd4 [mplayer] Clarify error message 2015-04-26 00:22:13 +06:00
Sergey M.
b19fc36c81 Merge pull request #5521 from mrkrossxdx/mpv
Added support for mpv if mplayer is not available (new version)
2015-04-25 23:19:59 +05:00
Sergey M․
d2d8248f68 [instagram] Modernize 2015-04-25 22:42:15 +06:00
Sergey M․
f54bab4d67 [instagram] Improve _VALID_URL 2015-04-25 22:39:50 +06:00
Yen Chi Hsuan
bf6427d2fb [ffmpeg] Add dfxp (TTML) subtitles support (#3432, #5146) 2015-04-25 23:18:27 +08:00
Yen Chi Hsuan
672f1bd849 [cspan] Extract subtitles 2015-04-25 23:18:27 +08:00
Sergey M․
529d26c3e1 [orf:iptv] Update test 2015-04-25 21:06:27 +06:00
Sergey M․
857f00ed94 [southpark] Improve some _VALID_URL's 2015-04-25 20:24:15 +06:00
Sergey M․
e4a5e772f2 [southpark:espanol] Add extractor (Closes #5525) 2015-04-25 20:23:42 +06:00
Sergey M․
a542e372ab [mtv] Stuff lang into info URL when available 2015-04-25 20:22:20 +06:00
Jaime Marquínez Ferrándiz
0d1bd5d62f README: remove --max-quality 2015-04-25 15:14:16 +02:00
Jaime Marquínez Ferrándiz
9f3fa89f7c Remove the --max-quality option
It doesn't work well with 'bestvideo' and 'bestaudio' because they are usually before the max quality.
Format filters should be used instead, they are more flexible and don't require the requested quality to exist for each video.
2015-04-25 11:59:54 +02:00
Jaime Marquínez Ferrándiz
92995e6265 [postprocessor/embedthumbnail] Style fix 2015-04-24 22:08:00 +02:00
Jaime Marquínez Ferrándiz
a4196c3ea5 [ellentv] Remove unused import 2015-04-24 22:06:22 +02:00
mrkrossxdx
db37e0c273 Added support for mpv if mplayer is not available 2015-04-24 20:50:34 +02:00
Sergey M․
d0aefec99a [ellentv:clips] Fix test 2015-04-24 22:10:27 +06:00
Sergey M․
66be4b89d7 [ellentv:clips] Fix extraction 2015-04-24 22:09:54 +06:00
Sergey M․
870744ce8f [ellentv] Fix tests 2015-04-24 22:07:15 +06:00
Sergey M․
2ad978532b [ellentv] Fix extraction 2015-04-24 22:03:14 +06:00
Sergey M․
5090d93f2c [dotsub] Fix extraction 2015-04-24 21:47:13 +06:00
Yen Chi Hsuan
c8ff645766 [gdcvault] Add display_id 2015-04-24 22:43:33 +08:00
Yen Chi Hsuan
25f7d1beba [gdcvault] Extend _VALID_URL (fixes #5236) 2015-04-24 22:33:35 +08:00
pulpe
09aa111918 Merge branch 'embedthumb' 2015-04-24 09:25:44 +02:00
pulpe
10fb7710e8 Forgot to clean the remains of class 2015-04-24 09:17:46 +02:00
pulpe
c0ea8ebb9b [ffmpeg] Remove unneeded class 2015-04-24 09:11:39 +02:00
pulpe
31fd9c7601 [embedthumbnail] use FFmpegPostProcessor for mp3 2015-04-24 09:08:57 +02:00
pulpe
ddbed36455 [embedthumbnail] Add support for mp3 cover embedding 2015-04-24 08:48:49 +02:00
Yen Chi Hsuan
a9b0d4e1f4 [Crunchyroll] Fix extraction on Python 2.6
XPath with recursive children selection not supported
2015-04-24 14:09:35 +08:00
Sergey M․
4d6a3ff411 [README] Finally fix configuration file link 2015-04-24 01:41:36 +06:00
Sergey M․
7fb993e1f4 [README] Fix configuration file link and typo 2015-04-24 01:38:02 +06:00
Sergey M․
02f502f435 [README] Document on how to enable old format selection behavior (#5510, #5511) 2015-04-24 01:34:57 +06:00
Sergey M․
4515cb43ca [xattrpp] Fix typo 2015-04-23 22:11:09 +06:00
Sergey M․
d740333224 [cracked] Modernize 2015-04-23 21:59:18 +06:00
Sergey M․
c610f38ba9 [cracked] Update tests 2015-04-23 21:58:50 +06:00
Sergey M․
6447353f52 [cracked] Add support for youtube embeds 2015-04-23 21:49:54 +06:00
Sergey M․
b46ed49996 [cracked] Fix extraction 2015-04-23 21:44:51 +06:00
Yen Chi Hsuan
cd9fdccde0 [ustream] Try to extract uploader from JSON data (#5128) 2015-04-23 18:33:25 +08:00
Yen Chi Hsuan
2a8137272d [ustream] Add an alternative approach to extract title (fixes #5128) 2015-04-23 18:24:44 +08:00
Yen Chi Hsuan
762155cc90 [ustream] Checking errors 2015-04-23 18:10:18 +08:00
Yen Chi Hsuan
f8610ba1ca [ustream] Fix extraction (closes #3998) 2015-04-23 18:10:18 +08:00
pulpe
c99f4098c4 Merge branch 'master' of github.com:rg3/youtube-dl 2015-04-23 11:43:37 +02:00
pulpe
3eec9fef30 [realvid] Add extractor for realvid.net (closes #5504) 2015-04-23 11:41:21 +02:00
Yen Chi Hsuan
8c8826176d [xattr] Add version detection for python-pyxattr
For more information, see #5498 and changes to convertObj() in
iustin/pyxattr@cc84e466f6
2015-04-23 13:50:44 +08:00
pulpe
14a2d6789f [vimeo] one token overlooked 2015-04-22 23:55:19 +02:00
pulpe
7513f298b0 [vimeo] Fix login token (fixes #5082) 2015-04-22 23:50:11 +02:00
Jaime Marquínez Ferrándiz
c04c3e334c [flickr] Don't use regex for extracting the info from the xml files 2015-04-22 19:58:39 +02:00
Jaime Marquínez Ferrándiz
f8e51f60b3 [flickr] Fix extraction (fixes #5501) 2015-04-22 19:24:14 +02:00
Sergey M․
33b066bda0 [hitbox] Clarify download messages 2015-04-22 21:09:21 +06:00
Sergey M․
14f41bc2fb [hitbox:live] Extract formats before metadata 2015-04-22 21:05:08 +06:00
Sergey M․
008bee0f50 [hitbox] Extract formats before metadata 2015-04-22 21:03:56 +06:00
Sergey M․
29492f3332 [hitbox] Sort formats 2015-04-22 21:01:52 +06:00
Sergey M․
bc94bd510b [hitbox] Extract all formats (Closes #5494) 2015-04-22 21:01:25 +06:00
Sergey M․
9dd8e46a2d [youtube:search] Cancel out _TESTS 2015-04-22 20:28:33 +06:00
Yen Chi Hsuan
8be2bdfabd [YoutubeDL] Remove the redundant assignment to old_filename
Caused by commmit 592e97e855
2015-04-22 15:05:35 +08:00
Jaime Marquínez Ferrándiz
b4c0806963 [youtube:ytsearch] Use the same system as the search webpage (fixes #5483)
The gdata api V2 was deprecated and according to http://youtube-eng.blogspot.com.es/2014/03/committing-to-youtube-data-api-v3_4.html remains available until April 20, 2015.
2015-04-21 19:30:31 +02:00
Sergey M․
cc38fa6cfb [youtube] Remove unused import 2015-04-21 22:55:59 +06:00
Sergey M․
6de5dbafee [youtube:channel] Make extract_videos_from_page static 2015-04-21 22:42:21 +06:00
Sergey M․
60bf45c80d [youtube:channel] Specify first page download message 2015-04-21 22:37:45 +06:00
Sergey M․
eb0f3e7ec0 [youtube:user] Extract in terms of load_more_widget_html 2015-04-21 22:36:41 +06:00
Sergey M․
ed553379df [youtube:ytsearch] Temporary workaround (#5483) 2015-04-21 20:55:05 +06:00
Jaime Marquínez Ferrándiz
5c1e6f69c4 [senate] Simplify
There isn't any problem if the 'formats' field only has one element
2015-04-21 15:04:55 +02:00
Yen Chi Hsuan
757cda0a96 [Cinemassacre] Support Youtube embedded videos (fixes #5131) 2015-04-21 15:21:04 +08:00
Yen Chi Hsuan
e94443de80 [Cinemassacre] Move to a standalone module 2015-04-21 15:10:27 +08:00
Yen Chi Hsuan
0954cd8aa4 [Cinemassacre] Add detection for videos from blip.tv 2015-04-21 13:48:02 +08:00
Yen Chi Hsuan
da55dac047 [CSpan] Removed the md5 sum of CSpan_3 2015-04-21 05:22:23 +08:00
Yen Chi Hsuan
13a11b195f [SenateISVP] Fix tests
Remove md5 sums. They differs from my PC and the travis worker.
2015-04-21 05:13:25 +08:00
Yen Chi Hsuan
92dcba1e1c [CSpan] Fix test cases CSpan_1 and CSpan_2 2015-04-21 03:30:54 +08:00
Yen Chi Hsuan
2fe1b5bd2a [CSpan] Add detection for Senate ISVP. Closes #5302 2015-04-21 03:18:38 +08:00
Yen Chi Hsuan
f91e1a8739 [Senate] Try to capture thumbnails 2015-04-21 02:57:32 +08:00
Yen Chi Hsuan
24e21613b6 [bilibili] Capture the video-not-exist message 2015-04-21 02:32:10 +08:00
Yen Chi Hsuan
c6391cd587 [Senate] Add new extractor (#5302) 2015-04-21 02:29:56 +08:00
Sergey M․
006ce15a0c [bambuser] Add support for authentication (#5478) 2015-04-20 23:00:37 +06:00
Sergey M․
edf4216119 [bambuser] Modernize and extract more metadata 2015-04-20 22:46:01 +06:00
Sergey M․
ae8953409e [bambuser] Capture and output error message (#5478) 2015-04-20 22:35:53 +06:00
Sergey M․
bda44f31a1 [bambuser] Modernize 2015-04-20 22:33:35 +06:00
Sergey M․
6621ca39a3 [ted] Skip hls quality selection format 2015-04-20 22:04:42 +06:00
Sergey M․
14f7abfa71 [ted] Lower preference for direct audio since it's mono 2015-04-20 22:04:17 +06:00
Sergey M․
0f0b5736da [ted] Fix hls audio/video-only formats 2015-04-20 22:01:02 +06:00
Sergey M․
6728187ac0 [YoutubeDL] mp3 is compatible with mp4 2015-04-20 21:58:46 +06:00
Sergey M․
17c8675853 [YoutubeDL] Allow bestvideo+bestaudio/best strategy for ted extractor 2015-04-20 21:58:29 +06:00
Sergey M․
cfbee8a431 [ted] Clarify IE_NAME 2015-04-20 21:42:42 +06:00
Sergey M․
736785ab63 [ted] Clarify audio/video-only formats 2015-04-20 21:42:20 +06:00
Sergey M․
3ded7bac16 [extractor/common] Add ability to specify custom field preference for _sort_formats 2015-04-20 21:13:31 +06:00
Quentin Rameau
b524a001d6 [bandcamp] fix video_id parsing (fixes #4861) 2015-04-20 15:45:57 +02:00
Jaime Marquínez Ferrándiz
7b071e317b README: document bestvideo+bestaudio/best (#5447) 2015-04-19 18:54:05 +02:00
Jaime Marquínez Ferrándiz
a380509259 Move the documentation for the --format option to the manpage
It's too big for beeing embedded in the help message and it's easier to edit in the markdown file.
2015-04-19 18:53:28 +02:00
Sergey M․
c0dea0a782 [YoutubeDL] Respect explicit --merge-format-output for uncompatible formats as well 2015-04-19 22:33:52 +06:00
Sergey M․
70947ea7b1 [parameters.json] Set default format parameter to best 2015-04-19 17:56:06 +02:00
Sergey M․
81cd954a51 [YoutubeDL] Merge incompatible formats into mkv (#5456) 2015-04-19 17:55:42 +02:00
Sergey M․
feccf29c87 [YoutubeDL] Make bestvideo+bestaudio/best default format when merger is available 2015-04-19 17:51:56 +02:00
Jaime Marquínez Ferrándiz
5b5fbc0867 Detect already merged videos
Without the '--keep-video' option the two files would be downloaded again and even using the option, ffmpeg would be run again, which for some videos can take a long time.
We use a temporary file with ffmpeg so that the final file only exists if it success
2015-04-19 17:51:41 +02:00
Yen Chi Hsuan
f158799bbe [Sohu] Fix title extraction 2015-04-19 19:19:44 +08:00
Yen Chi Hsuan
8b0e8990c2 [miomio] Replace the slow test case
MioMio_1 takes about 25~35 seconds on information retrieval
2015-04-19 19:12:23 +08:00
Yen Chi Hsuan
880ee801cf [tests] Allow multi_video to be tested as playlists 2015-04-19 19:08:37 +08:00
Sergey M․
163965d861 [megavideoz] Improve non-existing videos check 2015-04-19 04:14:58 +06:00
Sergey M․
6e218b3f9a [megavideoz] Check non-existing videos 2015-04-19 04:09:01 +06:00
Sergey M․
cc9b9df0b6 [megavideozeu] Rename extractor 2015-04-19 04:08:29 +06:00
Sergey M․
31f224008e [megavideozeu] Simplify (Closes #5454) 2015-04-19 04:07:45 +06:00
Jeff Buchbinder
f32cb5cb14 [megavideoez] Add working test 2015-04-19 04:06:45 +06:00
Jeff Buchbinder
fec2d97ca2 Add megavideoz.eu support. 2015-04-19 04:06:27 +06:00
Sergey M.
f2eeafb061 Merge pull request #5462 from hedii/hedii-patch-1
Update wat.py misspelling 'downloding'
2015-04-18 18:43:03 +05:00
hedii
8f4e8bf280 Update wat.py
line 116, modify 'Downloding' to 'Downloading'.
It looks like nothing, but it is very annoying when youtube-dl command's output is parsed to find progress on a php (or other language) website for example.
2015-04-18 15:40:40 +02:00
Jaime Marquínez Ferrándiz
cc36e2295a [ign] Fix extraction of some videos in articles
Give higher preference to the hero-poster regex because some articles may contain other videos
2015-04-18 13:27:35 +02:00
Jaime Marquínez Ferrándiz
d47aeb2252 FFmpegMergerPP: use the new system for specifying which files can be delete 2015-04-18 11:52:36 +02:00
Jaime Marquínez Ferrándiz
14523ed969 FFmpegEmbedSubtitlePP: remove the subtitle files if '--keep-video' is not given (closes #5435) 2015-04-18 11:44:42 +02:00
Jaime Marquínez Ferrándiz
592e97e855 Postprocessors: use a list for the files that can be deleted
We could only know if we had to delete the original file, but this system allows to specify us more files (like subtitles).
2015-04-18 11:36:42 +02:00
Yen Chi Hsuan
53faa3ca5f [facebook] Extend _VALID_URL take 2 (#5120) 2015-04-18 16:08:24 +08:00
Yen Chi Hsuan
c62566971f [facebook] Extend _VALID_URL 2015-04-18 16:00:33 +08:00
Sergey M․
902be27cf9 Merge branch 'julianrichen-gfycat' 2015-04-18 03:51:38 +06:00
Sergey M․
bf12cbe07c Credit @julianrichen for gfycat (#5440) 2015-04-18 03:51:21 +06:00
Sergey M․
f52e66505a [gfycat] Simplify (Closes #5439, Closes #5394) 2015-04-18 03:50:22 +06:00
Sergey M․
ca75235d3d Merge branch 'gfycat' of https://github.com/julianrichen/youtube-dl into julianrichen-gfycat 2015-04-18 03:49:32 +06:00
Jaime Marquínez Ferrándiz
ecc6bd1341 YoutubeDL.post_process: simplify keep_video handling
Since keep_video started as None we always set it to keep_video_wish unless it was None, so in the end keep_video == keep_video_wish. This should have been changed in f3ff1a3696, but I didn't notice it.
2015-04-17 22:38:14 +02:00
Jaime Marquínez Ferrándiz
ce81b1411d FFmpegExtractAudioPP: Simplify handling of already existing files 2015-04-17 22:37:27 +02:00
Sergey M․
7691a7a3bd [comedycentral] Fix feed uri request (Closes #5449, closes #5455) 2015-04-17 23:41:07 +06:00
Jaime Marquínez Ferrándiz
214e74bf6f [soundcloud] Raise an error instead of calling 'report_error' 2015-04-17 19:24:30 +02:00
Jaime Marquínez Ferrándiz
c5826a491b [mixcloud] Simplify url extraction
On the tracks I tested the server number in the url from the webpage is valid
for the mp3 or the m4a file and any other number is invalid, it's a
waste of time to check them.
2015-04-17 19:02:49 +02:00
Sergey M․
d8e7ef04dc [vimple] Fix extraction (Closes #5448) 2015-04-17 22:56:26 +06:00
Jaime Marquínez Ferrándiz
08f2a92c9c InfoExtractor._search_regex: Suggest updating when the regex is not found (suggested in #5442)
Reuse the same message from ExtractorError
2015-04-17 14:55:24 +02:00
Philipp Hagemeister
3220c50f9a release 2015.04.17 2015-04-17 11:14:25 +02:00
Jaime Marquínez Ferrándiz
024ebb2706 [soundcloud] Handle 'secret_token' for 'w.soundcloud.com/player/?url=*' urls (fixes #5453) 2015-04-17 10:46:25 +02:00
FireDart
954352c4c0 [gfycat] Fixed preferences. 2015-04-16 18:11:30 -04:00
FireDart
4aec95f3c9 [gfycat] Updated tests. 2015-04-16 18:10:53 -04:00
Sergey M․
be531ef1ec [utils] Fix splitunc deprecation warning 2015-04-16 22:12:38 +06:00
Sergey M․
65c1a750f5 [srf] Show display_id when present 2015-04-16 21:48:22 +06:00
Sergey M․
5141249c59 [srf] Extend _VALID_URL 2015-04-16 21:47:42 +06:00
Sergey M․
6225984681 [generic] Update pladform embed test 2015-04-16 21:37:15 +06:00
Sergey M․
5cb91ceaa5 [pladform] Update test 2015-04-16 21:33:01 +06:00
Sergey M․
89c09e2a08 [srf] Update test 2015-04-16 21:30:13 +06:00
Sergey M․
fbbb219409 [srf] Fix direct links ext 2015-04-16 21:28:21 +06:00
Sergey M․
820b064804 [srf] Extract subtitles 2015-04-16 20:48:17 +06:00
Sergey M․
355c524bfa [srf] Extract all formats and prefer direct links over hls and hds 2015-04-16 20:31:02 +06:00
Yen Chi Hsuan
c052ce6cde [Srf] Add new extractor (fixes #981) 2015-04-16 22:00:45 +08:00
Yen Chi Hsuan
c9a779695d [extractor/common] Add the encoding parameter
The QQMusic info extractor need forced encoding for correct working.
2015-04-16 17:34:54 +08:00
Yen Chi Hsuan
a685ae511a [QQMusic] Song extractor: Add lyrics as description
Note: Test fails on python 3 due to encoding issues
2015-04-16 17:34:54 +08:00
Yen Chi Hsuan
5edea45fab [QQMusic] Add album info extractor 2015-04-16 17:34:54 +08:00
Yen Chi Hsuan
8afff9f849 [QQMusic] Add singer info extractor 2015-04-16 17:34:54 +08:00
Yen Chi Hsuan
a2043572aa [QQMusic] Implement the guid algorithm 2015-04-16 17:34:54 +08:00
Yen Chi Hsuan
5d98908b26 [QQMusic] Add new extractor 2015-04-16 17:34:54 +08:00
Yen Chi Hsuan
d6fd958c5f [generic] Extract videos from SMIL manifests (closes #5145 and fixes #5135) 2015-04-16 17:16:11 +08:00
Yen Chi Hsuan
d0eb724e22 [UDNEmbed] Enhance error checking and extend _VALID_URL 2015-04-16 17:04:53 +08:00
FireDart
afe4a8c769 [gfycat] Add new extractor 2015-04-15 22:17:45 -04:00
Yen Chi Hsuan
9fc03aa87c [brightcove] Always return lists from _extract_brightcove_urls
In Python 3, filter() returns an iterable object, which is equivalently
to True even for an empty result set. It causes false positive playlists
in generic extraction logic.
2015-04-16 00:27:39 +08:00
Sergey M․
c798f15b98 [generic] Add test for playwire embed (#5430) 2015-04-15 22:14:29 +06:00
Sergey M․
2dcc114f84 [generic] Add support for playwire embeds (Closes #5430) 2015-04-15 22:10:08 +06:00
Sergey M․
0dfe9bc9d2 [mtv] Capture and output error message (#5420) 2015-04-15 21:02:34 +06:00
Sergey M․
4d1cdb5bfe [spike] Extend _VALID_URL (Closes #5420) 2015-04-15 20:58:48 +06:00
Yen Chi Hsuan
9c5335a027 [teamcoco] Fix "preload" data extraction (fixes #5179) 2015-04-15 19:56:21 +08:00
Yen Chi Hsuan
ae849ca170 [tumblr] Dismiss warnings for optional fields (fixes #5202) 2015-04-15 17:45:28 +08:00
Sergey M․
94c1255782 [brightcove] Handle non well-formed XMLs (#5421) 2015-04-14 17:50:53 +06:00
Sergey M․
476e1095fa [brightcove] Improve brightcove experience regex (Closes #5421) 2015-04-14 17:48:41 +06:00
Yen Chi Hsuan
8da1bb0418 [miomio] Enhance error checking and replace dead test case 2015-04-14 15:27:56 +08:00
Yen Chi Hsuan
01c58f8473 [generic] Fix test generic_51
The website replaced the original video with a new one
2015-04-14 13:10:10 +08:00
Yen Chi Hsuan
edfcf7abe2 [generic] Support another type of Ooyala embedded video 2015-04-14 12:45:43 +08:00
Jaime Marquínez Ferrándiz
37b44fe7c1 [postprocessor/atomicparsley] Don't try to remove the temporary and original files if the format is unsupported (fixes #5419) 2015-04-13 22:50:40 +02:00
Sergey M․
8f02ad4f12 [youtube] Simplify 2015-04-13 20:28:16 +06:00
Yen Chi Hsuan
51f1244600 [vine] flake8 2015-04-13 19:26:15 +08:00
Sergey M․
7bd930368c [youtube] Remove unused variable 2015-04-13 00:08:39 +06:00
Sergey M․
fb69240ca0 [youtube] Extract video titles for channel playlist if possible (Closes #4971) 2015-04-12 23:19:00 +06:00
Sergey M․
830d53bfae [utils] Add video_title for url_result 2015-04-12 23:11:47 +06:00
Sergey M․
c36a959549 [YoutubeDL] Try to download worst audio+video served by a single file first (Closes #5408) 2015-04-12 17:36:29 +06:00
Sergey M․
e91b2d14e3 Credit @snipem for gamersyde (#5352) 2015-04-12 17:17:31 +06:00
Sergey M․
ac58e68bc3 [footyroom] Remove superfluous whitespace 2015-04-12 17:11:11 +06:00
Sergey M․
504c1cedfe [footyroom] Improve 2015-04-12 17:09:52 +06:00
snipem
9a4d8fae82 [FootyTube] Fixed wrong md5 checksum 2015-04-12 17:01:12 +06:00
snipem
7d2ba6394c [FootyRoom] Fixed missing http prefix
For some reason FootyTube is missing the „http:“ prefix on some
Playwire links for some videos
2015-04-12 17:01:03 +06:00
Sergey M․
b04b94da5f [options] Fix file based configurations for python 2 (Closes #5401) 2015-04-12 03:57:56 +06:00
Sergey M․
9933857f67 Merge branch 'fstirlitz-crooksandliars' 2015-04-11 20:27:53 +06:00
Sergey M․
ed5641e249 [crooksandliars] Quotes consistency 2015-04-11 20:27:39 +06:00
Sergey M․
a4257017ef [generic] Add tests for Crooks and Liars embeds 2015-04-11 20:26:42 +06:00
Sergey M․
18153f1b32 [generic] Add support for Crooks and Liars embeds 2015-04-11 20:20:20 +06:00
Sergey M․
7a91d1fc43 [crooksandliars] Improve embed extractor and remove article extractor 2015-04-11 20:03:12 +06:00
Sergey M․
af14ded75e Merge branch 'crooksandliars' of https://github.com/fstirlitz/youtube-dl into fstirlitz-crooksandliars 2015-04-11 19:34:06 +06:00
Sergey M․
65939effb5 [hitbox:live] Fix hls extration (Closes #5315) 2015-04-11 18:52:41 +06:00
Sergey M․
66ee7b3234 [ted] Extract all formats (Closes #5397) 2015-04-10 23:36:28 +06:00
Sergey M․
cd47a628fc [rai] Add test for #5396 2015-04-10 22:44:41 +06:00
Sergey M․
d7c78decb0 [rai] Improve extraction 2015-04-10 22:44:33 +06:00
Sergey M․
8749477ed0 [rai] Fix extraction (Closes #5396) 2015-04-10 22:44:24 +06:00
Naglis Jonaitis
7088f5b5fa [teamcoco] Extract duration 2015-04-10 02:03:38 +03:00
Naglis Jonaitis
5bb6328cb9 [teamcoco] Extract m3u8 URLs 2015-04-09 23:57:51 +03:00
Naglis Jonaitis
ce9f47de99 [teamcoco] Fix extraction 2015-04-09 23:54:53 +03:00
Sergey M․
4c4780c25e [vine] Modernize 2015-04-09 22:41:41 +06:00
Sergey M․
64f1aba8f1 [vine] Extend _VALID_URL 2015-04-09 22:40:18 +06:00
Sergey M․
3359fb661f [vine] Add tests for #5389 2015-04-09 22:37:54 +06:00
Sergey M․
58a9f1b864 [vine] Fix post data regex (Closes #5389) 2015-04-09 22:32:48 +06:00
Sergey M․
6ac41a4ef5 [vine] Zero rate videos is perfectly valid (#5389) 2015-04-09 22:32:22 +06:00
Sergey M․
aa2af7ba74 [dumpert] Add nsfw cookie (Closes #5382) 2015-04-09 19:53:00 +06:00
Jaime Marquínez Ferrándiz
ce73839fe4 [rtve] Detect videos that are no longer available 2015-04-09 14:01:33 +02:00
Philipp Hagemeister
1dc2726f8d release 2015.04.09 2015-04-09 00:21:19 +02:00
Sergey M․
af76e8174d [dailymotion:user] Improve _VALID_URL (Closes #5380) 2015-04-09 02:25:31 +06:00
Sergey M․
402a3efc92 [theplatform] Modernize 2015-04-08 22:29:10 +06:00
Sergey M․
372f08c990 [theplatform] Fix for python 2.6
At least single depth level extraction...
2015-04-08 22:27:25 +06:00
Sergey M․
dd29eb7f81 [postprocessor/common:postprocessor/ffmpeg] Generalize utime 2015-04-08 21:40:31 +06:00
Sergey M.
bca788ab1d Merge pull request #5376 from PeteHemery/ffmpeg-postproc-utime-bug
[ffmpeg] adding exception catching for call to os.utime in run_ffmpeg_multiple_files
2015-04-08 20:27:17 +05:00
Sergey M․
aef8fdba11 [theplatform] Allow <par> without <swtich> at all
Bare `wget` on http://link.theplatform.com/s/kYEXFC/22d_qsQ6MIRTl results in an XML without <switch> at all
but with <par> and <video> inside it. Let's handle this possible outcome as well.
2015-04-08 21:03:11 +06:00
Yen Chi Hsuan
0a1603634b [utils] Remove url_infer_protocol 2015-04-08 21:39:34 +08:00
Yen Chi Hsuan
a662163fd5 [theplatform] Rework on <switch> inside <par> 2015-04-08 20:21:34 +08:00
Yen Chi Hsuan
bd7a6478a2 [theplatform] Fix video url extraction (fixes #5340)
In SMIL 2.1, <switch> nodes may be enclosed in <par>. See
http://www.w3.org/TR/SMIL2/smil-timing.html#edef-par
2015-04-08 19:20:34 +08:00
Yen Chi Hsuan
4a20c9f628 [livestream] Extend _VALID_URL (fixes #5375) 2015-04-08 17:42:26 +08:00
Yen Chi Hsuan
418c5cc3fc [udn] Add new extractor 2015-04-08 17:26:51 +08:00
Pete Hemery
cc55d08832 [ffmpeg] adding exception catching for call to os.utime in run_ffmpeg_multiple_files 2015-04-07 22:33:18 +01:00
Yen Chi Hsuan
de5c545648 [youtube] Skip WebVTT in DASH manifest (#5297) 2015-04-08 03:47:27 +08:00
Sergey M․
a35099bd33 [addanime] Add test for #5372 2015-04-07 21:01:35 +06:00
Sergey M․
5f4b5cf044 [addanime] Extend _VALID_URL (Closes #5372) 2015-04-07 21:00:52 +06:00
Sergey M․
beb10f843f [addanime] Add format quality (Closes #5371) 2015-04-07 21:00:22 +06:00
Philipp Hagemeister
29713e4268 [cnn] Match more affilliates 2015-04-07 14:59:13 +02:00
Jaime Marquínez Ferrándiz
8e4b83b96b Remove check for ssl certs
When it uses a capath instead of a cafile, 'get_ca_certs' or 'cert_store_stats' only returns certificates already used in a connection.
(see #5364)
2015-04-06 22:18:08 +02:00
Sergey M․
ae603b500e Merge branch 'newtonelectron-spankbang.com' 2015-04-06 21:24:45 +06:00
Sergey M․
d97aae7572 [spankbang] Improve and simplify 2015-04-06 21:24:17 +06:00
Sergey M․
a55e2f04a0 Merge branch 'spankbang.com' of https://github.com/newtonelectron/youtube-dl into newtonelectron-spankbang.com 2015-04-06 20:46:40 +06:00
felix
6e53c91608 [crooksandliars] resolve protocol-relative URLs 2015-04-06 10:12:43 +02:00
felix
d2272fcf6e crooksandliars.com extractor 2015-04-06 09:54:19 +02:00
newtonelectron
c7ac5dce8c [SpankBang] Remove regexp type prefix from _TEST url. 2015-04-05 14:02:05 -07:00
newtonelectron
5c1d459ae9 [SpankBang] Add test 2015-04-05 13:57:59 -07:00
newtonelectron
2e7daef502 [SpankBang] Use python2.6 compatible string formatting spec 2015-04-05 13:43:21 -07:00
newtonelectron
6410229681 [SpankBang] Add new extractor 2015-04-05 12:50:21 -07:00
Sergey M․
e40bd5f06b [youtube] Simplify url_encoded_fmt_stream_map check 2015-04-06 00:45:57 +06:00
Sergey M․
06b491eb7b [youtube] Add test for #5361 2015-04-06 00:35:55 +06:00
Yen Chi Hsuan
3a9fadd6df [youtube] Enhance url_encoded_fmt_stream_map checking (fix #5361) 2015-04-05 22:29:06 +08:00
Sergey M․
0de9312a7e [ellentv] Replace test 2015-04-05 00:01:55 +06:00
Sergey M․
27fe5e3473 [ellentv] Make video url extraction fatal 2015-04-05 00:00:04 +06:00
Sergey M․
f67dcc09f5 [eagleplatform] Skip georestricted test 2015-04-04 23:36:45 +06:00
Sergey M․
fefc9d121d [dump] Fix title extraction 2015-04-04 23:33:07 +06:00
Sergey M․
a319c33d8b [dreisat] Update test 2015-04-04 23:30:38 +06:00
Sergey M․
218d6bcc05 [dreisat] Capture status errors 2015-04-04 23:28:47 +06:00
Sergey M․
7d25463972 [drtv] Update test 2015-04-04 23:19:28 +06:00
Sergey M․
aff84bec07 [drtv] Check for unavailable videos 2015-04-04 23:17:09 +06:00
Sergey M․
ac651e974e [culturebox] Fix test 2015-04-04 23:06:16 +06:00
Sergey M․
e21a55abcc [extractor/common] Remove f4m section
It's now provided by `f4m_id`
2015-04-04 23:05:25 +06:00
Sergey M․
bc03228ab5 [francetv] Improve formats extraction 2015-04-04 23:02:04 +06:00
Sergey M․
f05d0e73c6 [francetv] Fix duration 2015-04-04 22:52:25 +06:00
Sergey M․
aed2d4b31e [culturebox] Replace test 2015-04-04 22:50:13 +06:00
Sergey M․
184a197441 [culturebox] Check for unavailable videos 2015-04-04 22:43:34 +06:00
Sergey M․
ed676e8c0a [bliptv] Check format URLs
Some formats are now 404
2015-04-04 22:27:25 +06:00
Sergey M․
8e1f937473 [aftonbladet] Modernize 2015-04-04 22:19:34 +06:00
Sergey M․
1a68d39211 [aftonbladet] Fix extraction 2015-04-04 22:15:59 +06:00
Sergey M․
4ba7d5b14c Merge branch 'tuexss-patch-1' 2015-04-04 20:09:36 +06:00
Sergey M․
1a48181a9f [options] Fix load info help string 2015-04-04 20:09:11 +06:00
Sergey M․
6b70a4eb7d [options] Number is a verb here 2015-04-04 20:02:29 +06:00
Sergey M․
f01855813b [options] extractor is lowercase 2015-04-04 20:01:24 +06:00
Sergey M․
4a3cdf81af [options] Restore some strings 2015-04-04 20:00:23 +06:00
Sergey M․
f777397aca Merge branch 'patch-1' of https://github.com/tuexss/youtube-dl into tuexss-patch-1 2015-04-04 19:50:47 +06:00
Sergey M․
8fb2e5a4f5 [radiojavan] Sort formats 2015-04-04 19:25:08 +06:00
Sergey M․
4e8cc1e973 [radiojavan] Fix height 2015-04-04 19:24:37 +06:00
Sergey M․
ff02a228e3 [test_execution] Fix test under python 2 @ windows 2015-04-04 19:21:50 +06:00
Sergey M․
424266abb1 Credit @Roman2K for pornovoisines (#5264) 2015-04-04 19:16:18 +06:00
Sergey M․
3fde134791 Merge branch 'Roman2K-pornovoisines' 2015-04-04 19:14:01 +06:00
Sergey M․
7c39a65543 [pornovoisines] Simplify 2015-04-04 19:13:37 +06:00
Sergey M․
8cf70de428 [test_utils] Add test for unified_strdate 2015-04-04 19:11:01 +06:00
Sergey M․
15ac8413c7 [utils] Avoid treating *-%Y date template as UTC offset 2015-04-04 19:08:48 +06:00
Sergey M․
79c21abba7 [utils] Add one more template to unified_strdate 2015-04-04 18:45:46 +06:00
Sergey M․
d5c418f29f Merge branch 'pornovoisines' of https://github.com/Roman2K/youtube-dl into Roman2K-pornovoisines 2015-04-04 18:08:20 +06:00
Sergey M․
536b94e56f Merge branch 'snipem-gamersyde' 2015-04-04 17:54:02 +06:00
Sergey M․
5c29dbd0c7 [gamersyde] Simplify 2015-04-04 17:53:22 +06:00
Sergey M․
ba9e68f402 [utils] Drop trailing comma before closing brace 2015-04-04 17:48:55 +06:00
Jaime Marquínez Ferrándiz
e9f65f8749 [rtve] Extract a better quality video 2015-04-04 13:11:55 +02:00
Sergey M․
ae0dd4b298 Merge branch 'gamersyde' of https://github.com/snipem/youtube-dl into snipem-gamersyde 2015-04-04 16:59:39 +06:00
Sergey M․
f1ce35af1a Merge branch 'mtp1376-radiojavan' 2015-04-04 16:47:24 +06:00
Sergey M․
6e617ed0b6 Credit @mtp1376 for varzesh3 and radiojavan 2015-04-04 16:47:09 +06:00
Sergey M․
7cf97daf77 [radiojavan] Simplify and extract upload date 2015-04-04 16:45:41 +06:00
snipem
3d24d997ae Fixed intendation of test cases
Leaded to error on Linux machine
2015-04-04 12:42:14 +02:00
snipem
115c281672 [Gamersyde] Improved robustness, added duration and tests
Fix for Json syntax is now less error prone for Json syntax inside of
values. Extractor is now also using native Json handling. Added tests
for several videos that were producing errors in the first place.
2015-04-04 12:31:48 +02:00
Sergey M․
cce23e43a9 Merge branch 'radiojavan' of https://github.com/mtp1376/youtube-dl into mtp1376-radiojavan 2015-04-04 16:10:17 +06:00
Sergey M․
ff556f5c09 Do not encode outtmpl twice (Closes #5288) 2015-04-04 00:30:37 +06:00
Sergey M․
16fa01291b [prosiebensat1] Fix test 2015-04-03 23:44:13 +06:00
Sergey M․
01534bf54f [prosiebensat1] Fix bitrate (Closes #5350 closes #5351) 2015-04-03 23:42:53 +06:00
Jaime Marquínez Ferrándiz
cd341b6e06 [mixcloud] Fix extraction of like count (reported in #5231) 2015-04-03 19:37:35 +02:00
Mohammad Teimori Pabandi
185a7e25e7 [RadioJavan] Add new extractor 2015-04-03 20:55:39 +04:30
snipem
e81a474603 [Gamersyde] Add new extractor 2015-04-03 15:34:49 +02:00
Jaime Marquínez Ferrándiz
ff2be6e180 [bloomberg] Adapt to website changes (fixes #5347) 2015-04-03 15:01:17 +02:00
Jaime Marquínez Ferrándiz
3da4b31359 [postprocessor/ffmpeg] Fix crash when ffprobe/avprobe are not installed (closes #5349)
'self.probe_basename' was None, so 'probe_executable' raised a KeyError exception
2015-04-03 14:09:50 +02:00
Jaime Marquínez Ferrándiz
4bbeb19fc7 [miomio] pep8: remove whitespaces in empty line 2015-04-03 14:09:07 +02:00
Philipp Hagemeister
a9cbab1735 release 2015.04.03 2015-04-03 10:22:25 +02:00
Sergey M․
6b7556a554 Credit @tiktok7 for miomio.tv (#5265) 2015-04-03 01:47:18 +06:00
Sergey M․
a3c7019e06 [YoutubeDL] Check for get_ca_certs availability
`get_ca_certs` is not available in python <3.4
2015-04-02 22:50:10 +06:00
Sergey M․
416b9c29f7 Merge branch 'tiktok7-MiomioTv' 2015-04-02 22:34:53 +06:00
Sergey M․
2ec8e04cac [miomio] Fix alphabetic order 2015-04-02 22:34:08 +06:00
Sergey M․
e03bfb30ce [miomio] Rename extractor 2015-04-02 22:33:30 +06:00
Sergey M․
f5b669113f [miomio] Simplify and fix python 2.6 issue 2015-04-02 22:32:16 +06:00
Sergey M․
d08225edf4 Merge branch 'MiomioTv' of https://github.com/tiktok7/youtube-dl into tiktok7-MiomioTv 2015-04-02 21:12:47 +06:00
Sergey M․
8075d4f99d [playfm] Adapt to v2api (Closes #5344) 2015-04-02 20:26:05 +06:00
Jaime Marquínez Ferrándiz
1a944d8a2a Print a warning if no ssl certificates are loaded 2015-04-02 14:09:55 +02:00
Sergey M․
7cf02b6619 Merge branch 'mtp1376-varzesh3' 2015-04-01 22:03:35 +06:00
Sergey M․
55cde6ef3c [varzesh3] Simplify 2015-04-01 22:02:55 +06:00
Sergey M․
69c3af567d Merge branch 'varzesh3' of https://github.com/mtp1376/youtube-dl into mtp1376-varzesh3 2015-04-01 20:25:46 +06:00
Sergey M․
60e1fe0079 Merge branch 'master' of github.com:rg3/youtube-dl 2015-04-01 20:25:11 +06:00
Sergey M.
4669393070 Merge pull request #5311 from yan12125/fix_douyu
[douyutv] Fix extractor and improve error handling
2015-04-01 20:22:11 +06:00
Sergey M․
ce3bfe5d57 Merge branch 'fix_douyu' of https://github.com/yan12125/youtube-dl 2015-04-01 19:54:00 +06:00
Sergey M․
2a0c2ca2b8 [dailymotion] Fix ff cookie and use it for embed page (Closes #5330) 2015-03-31 20:55:21 +06:00
Sergey M․
c89fbfb385 [nbc] Remove redundant note
This is already supposed by `only_matching`
2015-03-31 20:14:37 +06:00
Sergey M․
facecb84a1 [generic] Add working NBC Sports vplayer test 2015-03-31 20:11:14 +06:00
Sergey M.
ed06e9949b Merge pull request #5328 from yan12125/fix_5226
[Yahoo/NBCSports] Fix 5226 and add support for NBC sports
2015-03-31 20:00:47 +06:00
Yen Chi Hsuan
e15307a612 [NBCSports/Yahoo] Comment out some MD5 checksums
They seems to change constantly
2015-03-31 13:13:29 +08:00
Yen Chi Hsuan
5cbb2699ee [NBCSports] Add a test case for extended _VALID_URL 2015-03-31 03:38:45 +08:00
Yen Chi Hsuan
a2edf2e7ff [NBC/ThePlatform/Generic] Add a generic detector for NBCSportsVPlayer and enhance error detection in ThePlatformIE 2015-03-31 03:36:09 +08:00
Yen Chi Hsuan
1d31e7a2fc [NBCSports] Move imports alphabetically 2015-03-31 02:51:11 +08:00
Yen Chi Hsuan
a2a4d5fa31 [Yahoo/NBCSports] Generalize NBC sports info extractor 2015-03-31 02:47:18 +08:00
Yen Chi Hsuan
a28ccbabc6 [Yahoo/NBCSports] Fix #5226 2015-03-31 02:21:27 +08:00
Naglis Jonaitis
edd7344820 [phoenix] Extend _VALID_URL (#5322) 2015-03-30 18:16:51 +03:00
Sergey M․
c808ef81bb [soundcloud:set:user] Support mobile URLs (Closes #5323) 2015-03-30 21:03:38 +06:00
Sergey M․
fd203fe357 Credit @jorams or dumpert.nl (#5319) 2015-03-30 20:12:55 +06:00
Sergey M․
5bb7ab9928 Merge branch 'jorams-dumpert' 2015-03-30 20:12:09 +06:00
Sergey M․
87270c8416 [dumpert] Simplify and fix python 3.2 2015-03-30 20:11:51 +06:00
Sergey M․
ebc2f7a2db Merge branch 'dumpert' of https://github.com/jorams/youtube-dl into jorams-dumpert 2015-03-30 19:47:11 +06:00
Sergey M․
7700207ec7 [pornhub] Fix comment count extraction (Closes #5320) 2015-03-30 19:41:04 +06:00
Joram Schrijver
4d5d14f5cf [Dumpert] Add new extractor
Add support for the Dutch video site Dumpert. http://www.dumpert.nl/
2015-03-29 23:41:06 +02:00
Sergey M.
72b249bf1f Merge pull request #5313 from yan12125/fix_xuite_python32
[Xuite] Fix extraction on python 3.2
2015-03-30 01:28:30 +06:00
Yen Chi Hsuan
9b4774b21b [Xuite] Fix extraction on python 3.2
base64.b64decode() accept only binary types in Python 3.2
2015-03-29 20:51:33 +08:00
Yen Chi Hsuan
2ddf083588 [douyutv] Simplify usage of isinstance 2015-03-29 18:17:48 +08:00
Yen Chi Hsuan
8343a03357 [douyutv] Fix extractor and improve error handling 2015-03-29 14:26:28 +08:00
Naglis Jonaitis
ad320e9b83 [generic] Add support for 5min embeds (#5310) 2015-03-29 04:57:37 +03:00
Philipp Hagemeister
ecb750a446 [cnn] Match more URLs 2015-03-28 23:39:41 +01:00
Philipp Hagemeister
5f88e02818 [ultimedia] PEP8 2015-03-28 23:35:55 +01:00
Sergey M․
616af2f4b9 Unduplicate @ossi96 2015-03-29 00:03:59 +06:00
Sergey M․
5a3b315b5f [dhm] Improve _VALID_URL and add test 2015-03-28 23:55:15 +06:00
Sergey M․
b7a2268e7b Credit @ossi96 for dhm (#5305) 2015-03-28 23:43:15 +06:00
Sergey M․
20d729228c Merge branch 'ossi96-dhm' 2015-03-28 22:30:27 +06:00
Sergey M․
af8c93086c [dhm] Simplify 2015-03-28 22:30:13 +06:00
Sergey M․
79fd11ab8e Merge branch 'dhm' of https://github.com/ossi96/youtube-dl into ossi96-dhm 2015-03-28 22:09:05 +06:00
Jaime Marquínez Ferrándiz
cb88671e37 [nbc] Recognize https urls (fixes #5300) 2015-03-28 14:18:11 +01:00
Oskar Jauch
ff79552f13 [DHM] Add extractor description 2015-03-28 10:42:35 +01:00
Oskar Jauch
643fe72717 [DHM] Add new extractor 2015-03-28 10:38:52 +01:00
Philipp Hagemeister
4747e2183a release 2015.03.28 2015-03-28 08:12:05 +01:00
Philipp Hagemeister
c59e701e35 Default to continuedl=True
We already do this in the CLI interface, so it should be just fine.
2015-03-28 08:11:39 +01:00
Jaime Marquínez Ferrándiz
8e678af4ba Makefile: fix 'find' command
It worked with the GNU version, but not with the BSD version.
2015-03-27 14:21:53 +01:00
Jaime Marquínez Ferrándiz
70a1165b32 Don't use bare 'except:'
They catch any exception, including KeyboardInterrupt, we don't want to catch it.
2015-03-27 13:02:20 +01:00
Naglis Jonaitis
af14000215 [eroprofile] Add login support (#5269) 2015-03-26 23:24:28 +02:00
Sergey M․
998e6cdba0 [vimeo] Capture and output error message (#5294) 2015-03-27 03:05:08 +06:00
Mohammad Teimori Pabandi
2315fb5e5f unicde :( 2015-03-26 23:53:57 +04:30
Jaime Marquínez Ferrándiz
157e9e5aa5 [youtube:watchlater] Remove unused properties and fix tests 2015-03-26 20:03:31 +01:00
Jaime Marquínez Ferrándiz
c496ec0848 [vessel] Fix pep8 issue 2015-03-26 19:51:40 +01:00
Sergey M․
15b67a268a Merge branch 'zx8-master' 2015-03-26 23:57:56 +06:00
Sergey M․
31c4809827 [safari] Improve and simplify 2015-03-26 23:57:46 +06:00
Sergey M․
ac0df2350a Merge branch 'master' of https://github.com/zx8/youtube-dl into zx8-master 2015-03-26 23:57:13 +06:00
Naglis Jonaitis
223b27f46c [vessel] Add new extractor (Closes #5275) 2015-03-26 19:48:22 +02:00
Naglis Jonaitis
425142be60 [slideshare] Fix extraction (#5279) 2015-03-26 17:47:25 +02:00
Sergey M․
7e17ec8c71 [youtube] Clarify some IE_NAMEs 2015-03-26 21:42:28 +06:00
Sergey M․
448830ce7b [youtube:watchlater] Extract watchlater as playlist (Closes #5280) 2015-03-26 21:41:09 +06:00
Mohammad Teimori Pabandi
8896b614a9 removing unicode literal because it is imported :)) 2015-03-26 20:06:50 +04:30
Mohammad Teimori Pabandi
a7fce980ad removed one of tests that made problem with testing server 2015-03-26 19:47:34 +04:30
Naglis Jonaitis
91757b0f37 [utils] Escape all HTML entities written in hexadecimal form 2015-03-26 17:15:27 +02:00
Naglis Jonaitis
fbfcc2972b [teamcoco] Fix extraction 2015-03-26 16:13:53 +02:00
Mohammad Teimori Pabandi
db40364b87 [Varzesh3] Add new extractor 2015-03-26 18:17:21 +04:30
Sergey M․
094ce39c45 Credit @amishb for 22tracks (#5276) 2015-03-25 22:27:20 +06:00
Sergey M․
ae67d082fe [22tracks] Improve and simplify 2015-03-25 22:26:02 +06:00
Amish Bhadeshia
8f76df7f37 Updated init to add 22tracks 2015-03-25 21:11:31 +06:00
Amish Bhadeshia
5c19d18cbf [22Tracks] Add new extractor
Conflicts:
	youtube_dl/extractor/__init__.py
2015-03-25 21:10:54 +06:00
Sergey M․
838b93405b [redtube] Fix test 2015-03-25 20:09:01 +06:00
Sergey M․
2676caf344 [redtube] Capture and output removed video message (#5281) 2015-03-25 20:08:35 +06:00
testbonn
17941321ab Clean up of --help output
For consistency and readability
2015-03-25 11:02:55 +01:00
tiktok
5d1f0e607b [MiomioTv] updated based on feedback to merge request:
1) added comment to explain extra xml link download
    2) changed {} entries to {0}, {1} etc
    3) removed redundant language header (the others are required)
    4) checked out the old version of the supported sites md (the change was
    not required)
2015-03-23 23:16:50 +01:00
tiktok
c41a2ec4af [MiomioTv] Add new extractor 2015-03-23 01:42:17 +01:00
Roman Le Négrate
575dad3c98 [pornovoisines] Add extractor 2015-03-22 20:27:45 +01:00
zx8
32d687f55e [safari] Add safaribooksonline extractor 2015-03-22 18:04:50 +00:00
234 changed files with 9403 additions and 3216 deletions

10
AUTHORS
View File

@@ -117,3 +117,13 @@ Alexander Mamay
Devin J. Pohly
Eduardo Ferro Aldama
Jeff Buchbinder
Amish Bhadeshia
Joram Schrijver
Will W.
Mohammad Teimori Pabandi
Roman Le Négrate
Matthias Küch
Julian Richen
Ping O.
Mister Hat
Peter Ding

View File

@@ -2,7 +2,7 @@ all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bas
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
find -name "*.pyc" -delete
find . -name "*.pyc" -delete
PREFIX ?= /usr/local
BINDIR ?= $(PREFIX)/bin

244
README.md
View File

@@ -5,6 +5,7 @@ youtube-dl - download videos from youtube.com or other video platforms
- [OPTIONS](#options)
- [CONFIGURATION](#configuration)
- [OUTPUT TEMPLATE](#output-template)
- [FORMAT SELECTION](#format-selection)
- [VIDEO SELECTION](#video-selection)
- [FAQ](#faq)
- [DEVELOPER INSTRUCTIONS](#developer-instructions)
@@ -16,12 +17,12 @@ youtube-dl - download videos from youtube.com or other video platforms
To install it right away for all UNIX users (Linux, OS X, etc.), type:
sudo curl https://yt-dl.org/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+x /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
If you do not have curl, you can alternatively use a recent wget:
sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+x /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
Windows users can [download a .exe file](https://yt-dl.org/latest/youtube-dl.exe) and place it in their home directory or any other location on their [PATH](http://en.wikipedia.org/wiki/PATH_%28variable%29).
@@ -45,21 +46,21 @@ which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...]
# OPTIONS
-h, --help print this help text and exit
--version print program version and exit
-U, --update update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)
-i, --ignore-errors continue on download errors, for example to skip unavailable videos in a playlist
-h, --help Print this help text and exit
--version Print program version and exit
-U, --update Update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)
-i, --ignore-errors Continue on download errors, for example to skip unavailable videos in a playlist
--abort-on-error Abort downloading of further videos (in the playlist or the command line) if an error occurs
--dump-user-agent display the current browser identification
--list-extractors List all supported extractors and the URLs they would handle
--dump-user-agent Display the current browser identification
--list-extractors List all supported extractors
--extractor-descriptions Output descriptions of all supported extractors
--default-search PREFIX Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple".
--default-search PREFIX Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple".
Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The
default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.
--ignore-config Do not read configuration files. When given in the global configuration file /etc/youtube-dl.conf: Do not read the user configuration
in ~/.config/youtube-dl/config (%APPDATA%/youtube-dl/config.txt on Windows)
--flat-playlist Do not extract the videos of a playlist, only list them.
--no-color Do not emit color codes in output.
--no-color Do not emit color codes in output
## Network Options:
--proxy URL Use the specified HTTP/HTTPS proxy. Pass in an empty string (--proxy "") for direct connection
@@ -71,70 +72,70 @@ which means you can modify it, redistribute it or use it however you like.
not present) is used for the actual downloading. (experimental)
## Video Selection:
--playlist-start NUMBER playlist video to start at (default is 1)
--playlist-end NUMBER playlist video to end at (default is last)
--playlist-items ITEM_SPEC playlist video items to download. Specify indices of the videos in the playlist seperated by commas like: "--playlist-items 1,2,5,8"
--playlist-start NUMBER Playlist video to start at (default is 1)
--playlist-end NUMBER Playlist video to end at (default is last)
--playlist-items ITEM_SPEC Playlist video items to download. Specify indices of the videos in the playlist seperated by commas like: "--playlist-items 1,2,5,8"
if you want to download videos indexed 1, 2, 5, 8 in the playlist. You can specify range: "--playlist-items 1-3,7,10-13", it will
download the videos at index 1, 2, 3, 7, 10, 11, 12 and 13.
--match-title REGEX download only matching titles (regex or caseless sub-string)
--reject-title REGEX skip download for matching titles (regex or caseless sub-string)
--match-title REGEX Download only matching titles (regex or caseless sub-string)
--reject-title REGEX Skip download for matching titles (regex or caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
--min-filesize SIZE Do not download any videos smaller than SIZE (e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE (e.g. 50k or 44.6m)
--date DATE download only videos uploaded in this date
--datebefore DATE download only videos uploaded on or before this date (i.e. inclusive)
--dateafter DATE download only videos uploaded on or after this date (i.e. inclusive)
--date DATE Download only videos uploaded in this date
--datebefore DATE Download only videos uploaded on or before this date (i.e. inclusive)
--dateafter DATE Download only videos uploaded on or after this date (i.e. inclusive)
--min-views COUNT Do not download any videos with less than COUNT views
--max-views COUNT Do not download any videos with more than COUNT views
--match-filter FILTER (Experimental) Generic video filter. Specify any key (see help for -o for a list of available keys) to match if the key is present,
--match-filter FILTER Generic video filter (experimental). Specify any key (see help for -o for a list of available keys) to match if the key is present,
!key to check if the key is not present,key > NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to compare against
a number, and & to require multiple matches. Values which are not known are excluded unless you put a question mark (?) after the
operator.For example, to only match videos that have been liked more than 100 times and disliked less than 50 times (or the dislike
functionality is not available at the given service), but who also have a description, use --match-filter "like_count > 100 &
dislike_count <? 50 & description" .
--no-playlist If the URL refers to a video and a playlist, download only the video.
--yes-playlist If the URL refers to a video and a playlist, download the playlist.
--age-limit YEARS download only videos suitable for the given age
--no-playlist Download only the video, if the URL refers to a video and a playlist.
--yes-playlist Download the playlist, if the URL refers to a video and a playlist.
--age-limit YEARS Download only videos suitable for the given age
--download-archive FILE Download only videos not listed in the archive file. Record the IDs of all downloaded videos in it.
--include-ads Download advertisements as well (experimental)
## Download Options:
-r, --rate-limit LIMIT maximum download rate in bytes per second (e.g. 50K or 4.2M)
-R, --retries RETRIES number of retries (default is 10), or "infinite".
--buffer-size SIZE size of download buffer (e.g. 1024 or 16K) (default is 1024)
--no-resize-buffer do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.
-r, --rate-limit LIMIT Maximum download rate in bytes per second (e.g. 50K or 4.2M)
-R, --retries RETRIES Number of retries (default is 10), or "infinite".
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K) (default is 1024)
--no-resize-buffer Do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.
--playlist-reverse Download playlist videos in reverse order
--xattr-set-filesize (experimental) set file xattribute ytdl.filesize with expected filesize
--hls-prefer-native (experimental) Use the native HLS downloader instead of ffmpeg.
--xattr-set-filesize Set file xattribute ytdl.filesize with expected filesize (experimental)
--hls-prefer-native Use the native HLS downloader instead of ffmpeg (experimental)
--external-downloader COMMAND Use the specified external downloader. Currently supports aria2c,curl,wget
--external-downloader-args ARGS Give these arguments to the external downloader.
--external-downloader-args ARGS Give these arguments to the external downloader
## Filesystem Options:
-a, --batch-file FILE file containing URLs to download ('-' for stdin)
--id use only video ID in file name
-o, --output TEMPLATE output filename template. Use %(title)s to get the title, %(uploader)s for the uploader name, %(uploader_id)s for the uploader
-a, --batch-file FILE File containing URLs to download ('-' for stdin)
--id Use only video ID in file name
-o, --output TEMPLATE Output filename template. Use %(title)s to get the title, %(uploader)s for the uploader name, %(uploader_id)s for the uploader
nickname if different, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(format)s for
the format description (like "22 - 1280x720" or "HD"), %(format_id)s for the unique id of the format (like Youtube's itags: "137"),
the format description (like "22 - 1280x720" or "HD"), %(format_id)s for the unique id of the format (like YouTube's itags: "137"),
%(upload_date)s for the upload date (YYYYMMDD), %(extractor)s for the provider (youtube, metacafe, etc), %(id)s for the video id,
%(playlist_title)s, %(playlist_id)s, or %(playlist)s (=title if present, ID otherwise) for the playlist the video is in,
%(playlist_index)s for the position in the playlist. %(height)s and %(width)s for the width and height of the video format.
%(resolution)s for a textual description of the resolution of the video format. %% for a literal percent. Use - to output to stdout.
Can also be used to download to a different directory, for example with -o '/my/downloads/%(uploader)s/%(title)s-%(id)s.%(ext)s' .
--autonumber-size NUMBER Specifies the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given
--autonumber-size NUMBER Specify the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given
--restrict-filenames Restrict filenames to only ASCII characters, and avoid "&" and spaces in filenames
-A, --auto-number [deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] number downloaded files starting from 00000
-t, --title [deprecated] use title in file name (default)
-l, --literal [deprecated] alias of --title
-w, --no-overwrites do not overwrite files
-c, --continue force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
--no-continue do not resume partially downloaded files (restart from beginning)
--no-part do not use .part files - write directly into output file
--no-mtime do not use the Last-modified header to set the file modification time
--write-description write video description to a .description file
--write-info-json write video metadata to a .info.json file
--write-annotations write video annotations to a .annotation file
--load-info FILE json file containing the video information (created with the "--write-json" option)
--cookies FILE file to read cookies from and dump cookie jar in
-A, --auto-number [deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] Number downloaded files starting from 00000
-t, --title [deprecated] Use title in file name (default)
-l, --literal [deprecated] Alias of --title
-w, --no-overwrites Do not overwrite files
-c, --continue Force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
--no-continue Do not resume partially downloaded files (restart from beginning)
--no-part Do not use .part files - write directly into output file
--no-mtime Do not use the Last-modified header to set the file modification time
--write-description Write video description to a .description file
--write-info-json Write video metadata to a .info.json file
--write-annotations Write video annotations to a .annotations.xml file
--load-info FILE JSON file containing the video information (created with the "--write-info-json" option)
--cookies FILE File to read cookies from and dump cookie jar in
--cache-dir DIR Location in the filesystem where youtube-dl can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl
or ~/.cache/youtube-dl . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may
change.
@@ -142,97 +143,87 @@ which means you can modify it, redistribute it or use it however you like.
--rm-cache-dir Delete all filesystem cache files
## Thumbnail images:
--write-thumbnail write thumbnail image to disk
--write-all-thumbnails write all thumbnail image formats to disk
--write-thumbnail Write thumbnail image to disk
--write-all-thumbnails Write all thumbnail image formats to disk
--list-thumbnails Simulate and list all available thumbnail formats
## Verbosity / Simulation Options:
-q, --quiet activates quiet mode
-q, --quiet Activate quiet mode
--no-warnings Ignore warnings
-s, --simulate do not download the video and do not write anything to disk
--skip-download do not download the video
-g, --get-url simulate, quiet but print URL
-e, --get-title simulate, quiet but print title
--get-id simulate, quiet but print id
--get-thumbnail simulate, quiet but print thumbnail URL
--get-description simulate, quiet but print video description
--get-duration simulate, quiet but print video length
--get-filename simulate, quiet but print output filename
--get-format simulate, quiet but print output format
-j, --dump-json simulate, quiet but print JSON information. See --output for a description of available keys.
-J, --dump-single-json simulate, quiet but print JSON information for each command-line argument. If the URL refers to a playlist, dump the whole playlist
-s, --simulate Do not download the video and do not write anything to disk
--skip-download Do not download the video
-g, --get-url Simulate, quiet but print URL
-e, --get-title Simulate, quiet but print title
--get-id Simulate, quiet but print id
--get-thumbnail Simulate, quiet but print thumbnail URL
--get-description Simulate, quiet but print video description
--get-duration Simulate, quiet but print video length
--get-filename Simulate, quiet but print output filename
--get-format Simulate, quiet but print output format
-j, --dump-json Simulate, quiet but print JSON information. See --output for a description of available keys.
-J, --dump-single-json Simulate, quiet but print JSON information for each command-line argument. If the URL refers to a playlist, dump the whole playlist
information in a single line.
--print-json Be quiet and print the video information as JSON (video is still being downloaded).
--newline output progress bar as new lines
--no-progress do not print progress bar
--console-title display progress in console titlebar
-v, --verbose print various debugging information
--dump-pages print downloaded pages to debug problems (very verbose)
--newline Output progress bar as new lines
--no-progress Do not print progress bar
--console-title Display progress in console titlebar
-v, --verbose Print various debugging information
--dump-pages Print downloaded pages encoded using base64 to debug problems (very verbose)
--write-pages Write downloaded intermediary pages to files in the current directory to debug problems
--print-traffic Display sent and read HTTP traffic
-C, --call-home Contact the youtube-dl server for debugging.
--no-call-home Do NOT contact the youtube-dl server for debugging.
-C, --call-home Contact the youtube-dl server for debugging
--no-call-home Do NOT contact the youtube-dl server for debugging
## Workarounds:
--encoding ENCODING Force the specified encoding (experimental)
--no-check-certificate Suppress HTTPS certificate validation.
--no-check-certificate Suppress HTTPS certificate validation
--prefer-insecure Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)
--user-agent UA specify a custom user agent
--referer URL specify a custom referer, use if the video access is restricted to one domain
--add-header FIELD:VALUE specify a custom HTTP header and its value, separated by a colon ':'. You can use this option multiple times
--user-agent UA Specify a custom user agent
--referer URL Specify a custom referer, use if the video access is restricted to one domain
--add-header FIELD:VALUE Specify a custom HTTP header and its value, separated by a colon ':'. You can use this option multiple times
--bidi-workaround Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH
--sleep-interval SECONDS Number of seconds to sleep before each download.
## Video Format Options:
-f, --format FORMAT video format code, specify the order of preference using slashes, as in -f 22/17/18 . Instead of format codes, you can select by
extension for the extensions aac, m4a, mp3, mp4, ogg, wav, webm. You can also use the special names "best", "bestvideo", "bestaudio",
"worst". You can filter the video results by putting a condition in brackets, as in -f "best[height=720]" (or -f "[filesize>10M]").
This works for filesize, height, width, tbr, abr, vbr, asr, and fps and the comparisons <, <=, >, >=, =, != and for ext, acodec,
vcodec, container, and protocol and the comparisons =, != . Formats for which the value is not known are excluded unless you put a
question mark (?) after the operator. You can combine format filters, so -f "[height <=? 720][tbr>500]" selects up to 720p videos
(or videos where the height is not known) with a bitrate of at least 500 KBit/s. By default, youtube-dl will pick the best quality.
Use commas to download multiple audio formats, such as -f 136/137/mp4/bestvideo,140/m4a/bestaudio. You can merge the video and audio
of two formats into a single file using -f <video-format>+<audio-format> (requires ffmpeg or avconv), for example -f
bestvideo+bestaudio.
--all-formats download all available video formats
--prefer-free-formats prefer free video formats unless a specific one is requested
--max-quality FORMAT highest quality format to download
-F, --list-formats list all available formats
-f, --format FORMAT Video format code, see the "FORMAT SELECTION" for all the info
--all-formats Download all available video formats
--prefer-free-formats Prefer free video formats unless a specific one is requested
-F, --list-formats List all available formats
--youtube-skip-dash-manifest Do not download the DASH manifest on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g. bestvideo+bestaudio), output to given container format. One of mkv, mp4, ogg, webm, flv.Ignored if no
merge is required
## Subtitle Options:
--write-sub write subtitle file
--write-auto-sub write automatic subtitle file (youtube only)
--all-subs downloads all the available subtitles of the video
--list-subs lists all available subtitles for the video
--sub-format FORMAT subtitle format, accepts formats preference, for example: "ass/srt/best"
--sub-lang LANGS languages of the subtitles to download (optional) separated by commas, use IETF language tags like 'en,pt'
--write-sub Write subtitle file
--write-auto-sub Write automatic subtitle file (YouTube only)
--all-subs Download all the available subtitles of the video
--list-subs List all available subtitles for the video
--sub-format FORMAT Subtitle format, accepts formats preference, for example: "srt" or "ass/srt/best"
--sub-lang LANGS Languages of the subtitles to download (optional) separated by commas, use IETF language tags like 'en,pt'
## Authentication Options:
-u, --username USERNAME login with this account ID
-p, --password PASSWORD account password. If this option is left out, youtube-dl will ask interactively.
-2, --twofactor TWOFACTOR two-factor auth code
-n, --netrc use .netrc authentication data
--video-password PASSWORD video password (vimeo, smotri)
-u, --username USERNAME Login with this account ID
-p, --password PASSWORD Account password. If this option is left out, youtube-dl will ask interactively.
-2, --twofactor TWOFACTOR Two-factor auth code
-n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, smotri)
## Post-processing Options:
-x, --extract-audio convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "best" by default
--audio-quality QUALITY ffmpeg/avconv audio quality specification, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K
(default 5)
-x, --extract-audio Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "best" by default
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default
5)
--recode-video FORMAT Encode the video to another format if necessary (currently supported: mp4|flv|ogg|webm|mkv)
-k, --keep-video keeps the video file on disk after the post-processing; the video is erased by default
--no-post-overwrites do not overwrite post-processed files; the post-processed files are overwritten by default
--embed-subs embed subtitles in the video (only for mp4 videos)
--embed-thumbnail embed thumbnail in the audio as cover art
--add-metadata write metadata to the video file
--metadata-from-title FORMAT parse additional metadata like song title / artist from the video title. The format syntax is the same as --output, the parsed
parameters replace existing values. Additional templates: %(album), %(artist). Example: --metadata-from-title "%(artist)s -
-k, --keep-video Keep the video file on disk after the post-processing; the video is erased by default
--no-post-overwrites Do not overwrite post-processed files; the post-processed files are overwritten by default
--embed-subs Embed subtitles in the video (only for mkv and mp4 videos)
--embed-thumbnail Embed thumbnail in the audio as cover art
--add-metadata Write metadata to the video file
--metadata-from-title FORMAT Parse additional metadata like song title / artist from the video title. The format syntax is the same as --output, the parsed
parameters replace existing values. Additional templates: %(album)s, %(artist)s. Example: --metadata-from-title "%(artist)s -
%(title)s" matches a title like "Coldplay - Paradise"
--xattrs write metadata to the video file's xattrs (using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn(the default;
--xattrs Write metadata to the video file's xattrs (using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn (the default;
fix file if we can, warn otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the postprocessors (default)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the postprocessors
@@ -271,6 +262,17 @@ $ youtube-dl --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filena
youtube-dl_test_video_.mp4 # A simple file name
```
# FORMAT SELECTION
By default youtube-dl tries to download the best quality, but sometimes you may want to download other format.
The simplest case is requesting a specific format, for example `-f 22`. You can get the list of available formats using `--list-formats`, you can also use a file extension (currently it supports aac, m4a, mp3, mp4, ogg, wav, webm) or the special names `best`, `bestvideo`, `bestaudio` and `worst`.
If you want to download multiple videos and they don't have the same formats available, you can specify the order of preference using slashes, as in `-f 22/17/18`. You can also filter the video results by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`). This works for filesize, height, width, tbr, abr, vbr, asr, and fps and the comparisons <, <=, >, >=, =, != and for ext, acodec, vcodec, container, and protocol and the comparisons =, != . Formats for which the value is not known are excluded unless you put a question mark (?) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. Use commas to download multiple formats, such as `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`. You can merge the video and audio of two formats into a single file using `-f <video-format>+<audio-format>` (requires ffmpeg or avconv), for example `-f bestvideo+bestaudio`.
Since the end of April 2015 and version 2015.04.26 youtube-dl uses `-f bestvideo+bestaudio/best` as default format selection (see #5447, #5456). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some dash formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl.
# VIDEO SELECTION
Videos can be filtered by their upload date using the options `--date`, `--datebefore` or `--dateafter`, they accept dates in two formats:
@@ -321,9 +323,9 @@ YouTube changed their playlist format in March 2014 and later on, so you'll need
If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging guys](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.
### Do I always have to pass in `--max-quality FORMAT`, or `-citw`?
### Do I always have to pass `-citw`?
By default, youtube-dl intends to have the best options (incidentally, if you have a convincing case that these should be different, [please file an issue where you explain that](https://yt-dl.org/bug)). Therefore, it is unnecessary and sometimes harmful to copy long option strings from webpages. In particular, `--max-quality` *limits* the video quality (so if you want the best quality, do NOT pass it in), and the only option out of `-citw` that is regularly useful is `-i`.
By default, youtube-dl intends to have the best options (incidentally, if you have a convincing case that these should be different, [please file an issue where you explain that](https://yt-dl.org/bug)). Therefore, it is unnecessary and sometimes harmful to copy long option strings from webpages. In particular, the only option out of `-citw` that is regularly useful is `-i`.
### Can you please put the -b option back?
@@ -355,6 +357,22 @@ YouTube has switched to a new video info format in July 2011 which is not suppor
YouTube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### Video URL contains an ampersand and I'm getting some strange output `[1] 2839` or `'v' is not recognized as an internal or external command` ###
That's actually the output from your shell. Since ampersand is one of the special shell characters it's interpreted by shell preventing you from passing the whole URL to youtube-dl. To disable your shell from interpreting the ampersands (or any other special characters) you have to either put the whole URL in quotes or escape them with a backslash (which approach will work depends on your shell).
For example if your URL is https://www.youtube.com/watch?t=4&v=BaW_jenozKc you should end up with following command:
```youtube-dl 'https://www.youtube.com/watch?t=4&v=BaW_jenozKc'```
or
```youtube-dl https://www.youtube.com/watch?t=4\&v=BaW_jenozKc```
For Windows you have to use the double quotes:
```youtube-dl "https://www.youtube.com/watch?t=4&v=BaW_jenozKc"```
### ExtractorError: Could not find JS function u'OF'
In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.

View File

@@ -28,7 +28,7 @@ for test in get_testcases():
if METHOD == 'EURISTIC':
try:
webpage = compat_urllib_request.urlopen(test['url'], timeout=10).read()
except:
except Exception:
print('\nFail: {0}'.format(test['name']))
continue

View File

@@ -2,12 +2,15 @@
- **1tv**: Первый канал
- **1up.com**
- **220.ro**
- **22tracks:genre**
- **22tracks:track**
- **24video**
- **3sat**
- **4tube**
- **56.com**
- **5min**
- **8tracks**
- **91porn**
- **9gag**
- **abc.net.au**
- **Abc7News**
@@ -24,8 +27,7 @@
- **anitube.se**
- **AnySex**
- **Aparat**
- **AppleDailyAnimationNews**
- **AppleDailyRealtimeNews**
- **AppleDaily**
- **AppleTrailers**
- **archive.org**: archive.org videos
- **ARD**
@@ -42,6 +44,7 @@
- **audiomack**
- **audiomack:album**
- **Azubu**
- **BaiduVideo**
- **bambuser**
- **bambuser:channel**
- **Bandcamp**
@@ -61,6 +64,8 @@
- **BR**: Bayerischer Rundfunk Mediathek
- **Break**
- **Brightcove**
- **bt:article**: Bergens Tidende Articles
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **BuzzFeed**
- **BYUtv**
- **Camdemy**
@@ -96,6 +101,7 @@
- **CondeNast**: Condé Nast media group: Condé Nast, GQ, Glamour, Vanity Fair, Vogue, W Magazine, WIRED
- **Cracked**
- **Criterion**
- **CrooksAndLiars**
- **Crunchyroll**
- **crunchyroll:playlist**
- **CSpan**: C-SPAN
@@ -109,15 +115,19 @@
- **DctpTv**
- **DeezerPlaylist**
- **defense.gouv.fr**
- **DHM**: Filmarchiv - Deutsches Historisches Museum
- **Discovery**
- **divxstage**: DivxStage
- **Dotsub**
- **DouyuTV**
- **dramafever**
- **dramafever:series**
- **DRBonanza**
- **Dropbox**
- **DrTuber**
- **DRTV**
- **Dump**
- **Dumpert**
- **dvtv**: http://video.aktualne.cz/
- **EaglePlatform**
- **EbaumsWorld**
@@ -134,6 +144,7 @@
- **Eporner**
- **EroProfile**
- **Escapist**
- **ESPN** (Currently broken)
- **EveryonesMixtape**
- **exfm**: ex.fm
- **ExpoTV**
@@ -143,13 +154,14 @@
- **fc2**
- **fernsehkritik.tv**
- **fernsehkritik.tv:postecke**
- **Firedrive**
- **Firstpost**
- **FiveTV**
- **Flickr**
- **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FootyRoom**
- **Foxgay**
- **FoxNews**
- **FoxSports**
- **france2.fr:generation-quoi**
- **FranceCulture**
- **FranceInter**
@@ -162,12 +174,14 @@
- **Gamekings**
- **GameOne**
- **gameone:playlist**
- **Gamersyde**
- **GameSpot**
- **GameStar**
- **Gametrailers**
- **Gazeta**
- **GDCVault**
- **generic**: Generic downloader that works on some sites
- **Gfycat**
- **GiantBomb**
- **Giga**
- **Glide**: Glide mobile video messages (glide.me)
@@ -175,9 +189,8 @@
- **GodTube**
- **GoldenMoustache**
- **Golem**
- **GorillaVid**: GorillaVid.in, daclips.in, movpod.in and fastvideo.in
- **GorillaVid**: GorillaVid.in, daclips.in, movpod.in, fastvideo.in and realvid.net
- **Goshgay**
- **Grooveshark**
- **Groupon**
- **Hark**
- **HearThisAt**
@@ -207,6 +220,7 @@
- **instagram:user**: Instagram user profile
- **InternetVideoArchive**
- **IPrima**
- **iqiyi**
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **Izlesene**
@@ -219,6 +233,7 @@
- **KanalPlay**: Kanal 5/9/11 Play
- **Kankan**
- **Karaoketv**
- **KarriereVideos**
- **keek**
- **KeezMovies**
- **KhanAcademy**
@@ -232,6 +247,7 @@
- **LetvPlaylist**
- **LetvTv**
- **Libsyn**
- **life:embed**
- **lifenews**: LIFE | NEWS
- **LiveLeak**
- **livestream**
@@ -246,11 +262,13 @@
- **Malemotion**
- **MDR**
- **media.ccc.de**
- **MegaVideoz**
- **metacafe**
- **Metacritic**
- **Mgoon**
- **Minhateca**
- **MinistryGrid**
- **miomio.tv**
- **mitele.es**
- **mixcloud**
- **MLB**
@@ -278,12 +296,15 @@
- **MySpass**
- **myvideo**
- **MyVidster**
- **N-JOY**
- **n-tv.de**
- **NationalGeographic**
- **Naver**
- **NBA**
- **NBC**
- **NBCNews**
- **NBCSports**
- **NBCSportsVPlayer**
- **ndr**: NDR.de - Mediathek
- **NDTV**
- **NerdCubedFeed**
@@ -303,8 +324,10 @@
- **Noco**
- **Normalboots**
- **NosVideo**
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
- **novamov**: NovaMov
- **Nowness**
- **NowTV**
- **nowvideo**: NowVideo
- **npo.nl**
- **npo.nl:live**
@@ -316,11 +339,13 @@
- **ntv.ru**
- **Nuvid**
- **NYTimes**
- **NYTimesArticle**
- **ocw.mit.edu**
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
- **Ooyala**
- **OoyalaExternal**
- **OpenFilm**
- **orf:fm4**: radio FM4
- **orf:iptv**: iptv.ORF.at
@@ -329,6 +354,7 @@
- **parliamentlive.tv**: UK parliament videos
- **Patreon**
- **PBS**
- **PhilharmonieDeParis**: Philharmonie de Paris
- **Phoenix**
- **Photobucket**
- **Pladform**
@@ -344,17 +370,23 @@
- **PornHub**
- **PornHubPlaylist**
- **Pornotube**
- **PornoVoisines**
- **PornoXO**
- **PrimeShareTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital
- **Puls4**
- **Pyvideo**
- **qqmusic**
- **qqmusic:album**
- **qqmusic:singer**
- **qqmusic:toplist**
- **QuickVid**
- **R7**
- **radio.de**
- **radiobremen**
- **radiofrance**
- **RadioJavan**
- **Rai**
- **RBMARadio**
- **RedTube**
@@ -367,7 +399,6 @@
- **Rte**
- **rtl.nl**: rtl.nl and rtlxl.nl
- **RTL2**
- **RTLnow**
- **RTP**
- **RTS**: RTS.ch
- **rtve.es:alacarta**: RTVE a la carta
@@ -380,6 +411,9 @@
- **rutube:movie**: Rutube movies
- **rutube:person**: Rutube person videos
- **RUTV**: RUTV.RU
- **Ruutu**
- **safari**: safaribooksonline.com online video
- **safari:course**: safaribooksonline.com online courses
- **Sandia**: Sandia National Laboratories
- **Sapo**: SAPO Vídeos
- **savefrom.net**
@@ -389,6 +423,7 @@
- **Screencast**
- **ScreencastOMatic**
- **ScreenwaveMedia**
- **SenateISVP**
- **ServingSys**
- **Sexu**
- **SexyKarma**: Sexy Karma and Watch Indian Porn
@@ -402,8 +437,9 @@
- **smotri:community**: Smotri.com community videos
- **smotri:user**: Smotri.com user videos
- **Snotr**
- **Sockshare**
- **Sohu**
- **soompi**
- **soompi:show**
- **soundcloud**
- **soundcloud:playlist**
- **soundcloud:set**
@@ -411,8 +447,12 @@
- **soundgasm**
- **soundgasm:profile**
- **southpark.cc.com**
- **southpark.cc.com:español**
- **southpark.de**
- **southpark.nl**
- **southparkstudios.dk**
- **Space**
- **SpankBang**
- **Spankwire**
- **Spiegel**
- **Spiegel:Article**: Articles on spiegel.de
@@ -420,7 +460,9 @@
- **Spike**
- **Sport5**
- **SportBox**
- **SportBoxEmbed**
- **SportDeutschland**
- **Srf**
- **SRMediathek**: Saarländischer Rundfunk
- **SSA**
- **stanfordoc**: Stanford Open ClassRoom
@@ -429,6 +471,7 @@
- **StreamCZ**
- **StreetVoice**
- **SunPorno**
- **SVT**
- **SVTPlay**: SVT Play and Öppet arkiv
- **SWRMediathek**
- **Syfy**
@@ -443,7 +486,7 @@
- **TeamFour**
- **TechTalks**
- **techtv.mit.edu**
- **TED**
- **ted**
- **tegenlicht.vpro.nl**
- **TeleBruxelles**
- **telecinco.es**
@@ -462,6 +505,7 @@
- **tlc.com**
- **tlc.de**
- **TMZ**
- **TMZArticle**
- **TNAFlix**
- **tou.tv**
- **Toypics**: Toypics user profile
@@ -470,13 +514,18 @@
- **Trilulilu**
- **TruTube**
- **Tube8**
- **TubiTv**
- **Tudou**
- **Tumblr**
- **TuneIn**
- **Turbo**
- **Tutv**
- **tv.dfb.de**
- **TV2**
- **TV2Article**
- **TV4**: tv4.se and tv4play.se
- **TVC**
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
- **tvp.pl**
- **tvp.pl:Series**
@@ -492,17 +541,20 @@
- **Ubu**
- **udemy**
- **udemy:course**
- **UDNEmbed**
- **Ultimedia**
- **Unistra**
- **Urort**: NRK P3 Urørt
- **ustream**
- **ustream:channel**
- **Varzesh3**
- **Vbox7**
- **VeeHD**
- **Veoh**
- **Vessel**
- **Vesti**: Вести.Ru
- **Vevo**
- **VGTV**
- **VGTV**: VGTV and BTTV
- **vh1.com**
- **Vice**
- **Viddler**
@@ -522,6 +574,7 @@
- **vier:videos**
- **Viewster**
- **viki**
- **viki:channel**
- **vimeo**
- **vimeo:album**
- **vimeo:channel**
@@ -530,12 +583,13 @@
- **vimeo:review**: Review pages on vimeo
- **vimeo:user**
- **vimeo:watchlater**: Vimeo watch later list, "vimeowatchlater" keyword (requires authentication)
- **Vimple**: Vimple.ru
- **Vimple**: Vimple - one-click video hosting
- **Vine**
- **vine:user**
- **vk.com**
- **vk.com:user-videos**: vk.com:All of a user's videos
- **Vodlocker**
- **VoiceRepublic**
- **Vporn**
- **VRT**
- **vube**: Vube.com
@@ -560,6 +614,7 @@
- **XHamster**
- **XMinus**
- **XNXX**
- **Xstream**
- **XTube**
- **XTubeUser**: XTube user profile
- **Xuite**
@@ -588,7 +643,7 @@
- **youtube:show**: YouTube.com (multi-season) shows
- **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
- **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
- **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **Zapiks**
- **ZDF**
- **ZDFChannel**

View File

@@ -150,7 +150,7 @@ def expect_info_dict(self, got_dict, expected_dict):
'invalid value for field %s, expected %r, got %r' % (info_field, expected, got))
# Check for the presence of mandatory fields
if got_dict.get('_type') != 'playlist':
if got_dict.get('_type') not in ('playlist', 'multi_video'):
for key in ('id', 'url', 'title', 'ext'):
self.assertTrue(got_dict.get(key), 'Missing mandatory field %s' % key)
# Check for mandatory fields that are automatically set by YoutubeDL

View File

@@ -7,8 +7,7 @@
"forcethumbnail": false,
"forcetitle": false,
"forceurl": false,
"format": null,
"format_limit": null,
"format": "best",
"ignoreerrors": false,
"listformats": null,
"logtostderr": false,

View File

@@ -12,6 +12,7 @@ import copy
from test.helper import FakeYDL, assertRegexpMatches
from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_str
from youtube_dl.extractor import YoutubeIE
from youtube_dl.postprocessor.common import PostProcessor
from youtube_dl.utils import match_filter_func
@@ -101,39 +102,6 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['ext'], 'flv')
def test_format_limit(self):
formats = [
{'format_id': 'meh', 'url': 'http://example.com/meh', 'preference': 1},
{'format_id': 'good', 'url': 'http://example.com/good', 'preference': 2},
{'format_id': 'great', 'url': 'http://example.com/great', 'preference': 3},
{'format_id': 'excellent', 'url': 'http://example.com/exc', 'preference': 4},
]
info_dict = _make_result(formats)
ydl = YDL()
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'excellent')
ydl = YDL({'format_limit': 'good'})
assert ydl.params['format_limit'] == 'good'
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'good')
ydl = YDL({'format_limit': 'great', 'format': 'all'})
ydl.process_ie_result(info_dict.copy())
self.assertEqual(ydl.downloaded_info_dicts[0]['format_id'], 'meh')
self.assertEqual(ydl.downloaded_info_dicts[1]['format_id'], 'good')
self.assertEqual(ydl.downloaded_info_dicts[2]['format_id'], 'great')
self.assertTrue('3' in ydl.msgs[0])
ydl = YDL()
ydl.params['format_limit'] = 'excellent'
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'excellent')
def test_format_selection(self):
formats = [
{'format_id': '35', 'ext': 'mp4', 'preference': 1, 'url': TEST_URL},
@@ -270,7 +238,7 @@ class TestFormatSelection(unittest.TestCase):
f2['url'] = 'url:' + f2id
info_dict = _make_result([f1, f2], extractor='youtube')
ydl = YDL()
ydl = YDL({'format': 'best/bestvideo'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.process_ie_result(info_dict)
@@ -278,7 +246,7 @@ class TestFormatSelection(unittest.TestCase):
self.assertEqual(downloaded['format_id'], f1id)
info_dict = _make_result([f2, f1], extractor='youtube')
ydl = YDL()
ydl = YDL({'format': 'best/bestvideo'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.process_ie_result(info_dict)
@@ -443,27 +411,36 @@ class TestYoutubeDL(unittest.TestCase):
def run(self, info):
with open(audiofile, 'wt') as f:
f.write('EXAMPLE')
info['filepath']
return False, info
return [info['filepath']], info
def run_pp(params):
def run_pp(params, PP):
with open(filename, 'wt') as f:
f.write('EXAMPLE')
ydl = YoutubeDL(params)
ydl.add_post_processor(SimplePP())
ydl.add_post_processor(PP())
ydl.post_process(filename, {'filepath': filename})
run_pp({'keepvideo': True})
run_pp({'keepvideo': True}, SimplePP)
self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
os.unlink(filename)
os.unlink(audiofile)
run_pp({'keepvideo': False})
run_pp({'keepvideo': False}, SimplePP)
self.assertFalse(os.path.exists(filename), '%s exists' % filename)
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
os.unlink(audiofile)
class ModifierPP(PostProcessor):
def run(self, info):
with open(info['filepath'], 'wt') as f:
f.write('MODIFIED')
return [], info
run_pp({'keepvideo': False}, ModifierPP)
self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
os.unlink(filename)
def test_match_filter(self):
class FilterYDL(YDL):
def __init__(self, *args, **kwargs):
@@ -531,6 +508,51 @@ class TestYoutubeDL(unittest.TestCase):
res = get_videos(f)
self.assertEqual(res, ['1'])
def test_playlist_items_selection(self):
entries = [{
'id': compat_str(i),
'title': compat_str(i),
'url': TEST_URL,
} for i in range(1, 5)]
playlist = {
'_type': 'playlist',
'id': 'test',
'entries': entries,
'extractor': 'test:playlist',
'extractor_key': 'test:playlist',
'webpage_url': 'http://example.com',
}
def get_ids(params):
ydl = YDL(params)
# make a copy because the dictionary can be modified
ydl.process_ie_result(playlist.copy())
return [int(v['id']) for v in ydl.downloaded_info_dicts]
result = get_ids({})
self.assertEqual(result, [1, 2, 3, 4])
result = get_ids({'playlistend': 10})
self.assertEqual(result, [1, 2, 3, 4])
result = get_ids({'playlistend': 2})
self.assertEqual(result, [1, 2])
result = get_ids({'playliststart': 10})
self.assertEqual(result, [])
result = get_ids({'playliststart': 2})
self.assertEqual(result, [2, 3, 4])
result = get_ids({'playlist_items': '2-4'})
self.assertEqual(result, [2, 3, 4])
result = get_ids({'playlist_items': '2,4'})
self.assertEqual(result, [2, 4])
result = get_ids({'playlist_items': '10'})
self.assertEqual(result, [])
if __name__ == '__main__':
unittest.main()

View File

@@ -39,7 +39,7 @@ class TestAES(unittest.TestCase):
encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) +
b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae'
)
).decode('utf-8')
decrypted = (aes_decrypt_text(encrypted, password, 16))
self.assertEqual(decrypted, self.secret_msg)
@@ -47,7 +47,7 @@ class TestAES(unittest.TestCase):
encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) +
b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83'
)
).decode('utf-8')
decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg)

View File

@@ -59,7 +59,7 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch('www.youtube.com/NASAgovVideo/videos', ['youtube:user'])
def test_youtube_feeds(self):
self.assertMatch('https://www.youtube.com/feed/watch_later', ['youtube:watch_later'])
self.assertMatch('https://www.youtube.com/feed/watch_later', ['youtube:watchlater'])
self.assertMatch('https://www.youtube.com/feed/subscriptions', ['youtube:subscriptions'])
self.assertMatch('https://www.youtube.com/feed/recommended', ['youtube:recommended'])
self.assertMatch('https://www.youtube.com/my_favorites', ['youtube:favorites'])

View File

@@ -153,7 +153,7 @@ def generator(test_case):
break
if is_playlist:
self.assertEqual(res_dict['_type'], 'playlist')
self.assertTrue(res_dict['_type'] in ['playlist', 'multi_video'])
self.assertTrue('entries' in res_dict)
expect_info_dict(self, res_dict, test_case.get('info_dict', {}))

View File

@@ -8,6 +8,9 @@ import unittest
import sys
import os
import subprocess
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.utils import encodeArgument
rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
@@ -31,7 +34,7 @@ class TestExecution(unittest.TestCase):
def test_cmdline_umlauts(self):
p = subprocess.Popen(
[sys.executable, 'youtube_dl/__main__.py', 'ä', '--version'],
[sys.executable, 'youtube_dl/__main__.py', encodeArgument('ä'), '--version'],
cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
_, stderr = p.communicate()
self.assertFalse(stderr)

View File

@@ -266,7 +266,7 @@ class TestNRKSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['no']))
self.assertEqual(md5(subtitles['no']), '1d221e6458c95c5494dcd38e6a1f129a')
self.assertEqual(md5(subtitles['no']), '544fa917d3197fcbee64634559221cc2')
class TestRaiSubtitles(BaseTestSubtitles):

View File

@@ -40,7 +40,8 @@ from youtube_dl.utils import (
read_batch_urls,
sanitize_filename,
sanitize_path,
sanitize_url_path_consecutive_slashes,
prepend_extension,
replace_extension,
shell_quote,
smuggle_url,
str_to_int,
@@ -51,6 +52,7 @@ from youtube_dl.utils import (
unified_strdate,
unsmuggle_url,
uppercase_escape,
lowercase_escape,
url_basename,
urlencode_postdata,
version_tuple,
@@ -58,6 +60,8 @@ from youtube_dl.utils import (
xpath_text,
render_table,
match_str,
parse_dfxp_time_expr,
dfxp2srt,
)
@@ -171,25 +175,21 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_path('./abc'), 'abc')
self.assertEqual(sanitize_path('./../abc'), '..\\abc')
def test_sanitize_url_path_consecutive_slashes(self):
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/foo//bar/filename.html'),
'http://hostname/foo/bar/filename.html')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname//foo/bar/filename.html'),
'http://hostname/foo/bar/filename.html')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname//'),
'http://hostname/')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/foo/bar/filename.html'),
'http://hostname/foo/bar/filename.html')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/'),
'http://hostname/')
self.assertEqual(
sanitize_url_path_consecutive_slashes('http://hostname/abc//'),
'http://hostname/abc/')
def test_prepend_extension(self):
self.assertEqual(prepend_extension('abc.ext', 'temp'), 'abc.temp.ext')
self.assertEqual(prepend_extension('abc.ext', 'temp', 'ext'), 'abc.temp.ext')
self.assertEqual(prepend_extension('abc.unexpected_ext', 'temp', 'ext'), 'abc.unexpected_ext.temp')
self.assertEqual(prepend_extension('abc', 'temp'), 'abc.temp')
self.assertEqual(prepend_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(prepend_extension('.abc.ext', 'temp'), '.abc.temp.ext')
def test_replace_extension(self):
self.assertEqual(replace_extension('abc.ext', 'temp'), 'abc.temp')
self.assertEqual(replace_extension('abc.ext', 'temp', 'ext'), 'abc.temp')
self.assertEqual(replace_extension('abc.unexpected_ext', 'temp', 'ext'), 'abc.unexpected_ext.temp')
self.assertEqual(replace_extension('abc', 'temp'), 'abc.temp')
self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
def test_ordered_set(self):
self.assertEqual(orderedSet([1, 1, 2, 3, 4, 4, 5, 6, 7, 3, 5]), [1, 2, 3, 4, 5, 6, 7])
@@ -200,6 +200,8 @@ class TestUtil(unittest.TestCase):
def test_unescape_html(self):
self.assertEqual(unescapeHTML('%20;'), '%20;')
self.assertEqual(unescapeHTML('&#x2F;'), '/')
self.assertEqual(unescapeHTML('&#47;'), '/')
self.assertEqual(
unescapeHTML('&eacute;'), 'é')
@@ -225,6 +227,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(
unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
'20150202')
self.assertEqual(unified_strdate('25-09-2014'), '20140925')
def test_find_xpath_attr(self):
testxml = '''<root>
@@ -395,6 +398,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(uppercase_escape(''), '')
self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐')
def test_lowercase_escape(self):
self.assertEqual(lowercase_escape(''), '')
self.assertEqual(lowercase_escape('\\u0026'), '&')
def test_limit_length(self):
self.assertEqual(limit_length(None, 12), None)
self.assertEqual(limit_length('foo', 12), 'foo')
@@ -468,6 +475,12 @@ class TestUtil(unittest.TestCase):
self.assertEqual(d['x'], 1)
self.assertEqual(d['y'], 'a')
on = js_to_json('["abc", "def",]')
self.assertEqual(json.loads(on), ['abc', 'def'])
on = js_to_json('{"abc": "def",}')
self.assertEqual(json.loads(on), {'abc': 'def'})
def test_clean_html(self):
self.assertEqual(clean_html('a:\nb'), 'a: b')
self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
@@ -572,6 +585,57 @@ ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
'like_count > 100 & dislike_count <? 50 & description',
{'like_count': 190, 'dislike_count': 10}))
def test_parse_dfxp_time_expr(self):
self.assertEqual(parse_dfxp_time_expr(None), 0.0)
self.assertEqual(parse_dfxp_time_expr(''), 0.0)
self.assertEqual(parse_dfxp_time_expr('0.1'), 0.1)
self.assertEqual(parse_dfxp_time_expr('0.1s'), 0.1)
self.assertEqual(parse_dfxp_time_expr('00:00:01'), 1.0)
self.assertEqual(parse_dfxp_time_expr('00:00:01.100'), 1.1)
def test_dfxp2srt(self):
dfxp_data = '''<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en" xmlns:tts="http://www.w3.org/ns/ttml#parameter">
<body>
<div xml:lang="en">
<p begin="0" end="1">The following line contains Chinese characters and special symbols</p>
<p begin="1" end="2">第二行<br/>♪♪</p>
<p begin="2" dur="1"><span>Third<br/>Line</span></p>
</div>
</body>
</tt>'''
srt_data = '''1
00:00:00,000 --> 00:00:01,000
The following line contains Chinese characters and special symbols
2
00:00:01,000 --> 00:00:02,000
第二行
♪♪
3
00:00:02,000 --> 00:00:03,000
Third
Line
'''
self.assertEqual(dfxp2srt(dfxp_data), srt_data)
dfxp_data_no_default_namespace = '''<?xml version="1.0" encoding="UTF-8"?>
<tt xml:lang="en" xmlns:tts="http://www.w3.org/ns/ttml#parameter">
<body>
<div xml:lang="en">
<p begin="0" end="1">The first line</p>
</div>
</body>
</tt>'''
srt_data = '''1
00:00:00,000 --> 00:00:01,000
The first line
'''
self.assertEqual(dfxp2srt(dfxp_data_no_default_namespace), srt_data)
if __name__ == '__main__':
unittest.main()

View File

@@ -4,6 +4,8 @@ envlist = py26,py27,py33,py34
deps =
nose
coverage
# We need a valid $HOME for test_compat_expanduser
passenv = HOME
defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
--exclude test_subtitles.py --exclude test_write_annotations.py
--exclude test_youtube_lists.py

View File

@@ -49,6 +49,7 @@ from .utils import (
ExtractorError,
format_bytes,
formatSeconds,
HEADRequest,
locked_file,
make_HTTPS_handler,
MaxDownloadsReached,
@@ -64,7 +65,6 @@ from .utils import (
sanitize_path,
std_headers,
subtitles_filename,
takewhile_inclusive,
UnavailableVideoError,
url_basename,
version_tuple,
@@ -72,6 +72,7 @@ from .utils import (
write_string,
YoutubeDLHandler,
prepend_extension,
replace_extension,
args_to_str,
age_restricted,
)
@@ -118,7 +119,7 @@ class YoutubeDL(object):
username: Username for authentication purposes.
password: Password for authentication purposes.
videopassword: Password for acces a video.
videopassword: Password for accessing a video.
usenetrc: Use netrc for authentication instead.
verbose: Print additional info to stdout.
quiet: Do not print messages to stdout.
@@ -135,7 +136,6 @@ class YoutubeDL(object):
(or video) as a single JSON line.
simulate: Do not download the video files.
format: Video format code. See options.py for more information.
format_limit: Highest quality format to try.
outtmpl: Template for output names.
restrictfilenames: Do not allow "&" and spaces in file names
ignoreerrors: Do not stop on download errors.
@@ -261,7 +261,6 @@ class YoutubeDL(object):
The following options are used by the post processors:
prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
otherwise prefer avconv.
exec_cmd: Arbitrary command to run after downloading
"""
params = None
@@ -761,7 +760,9 @@ class YoutubeDL(object):
if isinstance(ie_entries, list):
n_all_entries = len(ie_entries)
if playlistitems:
entries = [ie_entries[i - 1] for i in playlistitems]
entries = [
ie_entries[i - 1] for i in playlistitems
if -n_all_entries <= i - 1 < n_all_entries]
else:
entries = ie_entries[playliststart:playlistend]
n_entries = len(entries)
@@ -916,10 +917,17 @@ class YoutubeDL(object):
if not available_formats:
return None
if format_spec == 'best' or format_spec is None:
return available_formats[-1]
elif format_spec == 'worst':
return available_formats[0]
if format_spec in ['best', 'worst', None]:
format_idx = 0 if format_spec == 'worst' else -1
audiovideo_formats = [
f for f in available_formats
if f.get('vcodec') != 'none' and f.get('acodec') != 'none']
if audiovideo_formats:
return audiovideo_formats[format_idx]
# for audio only (soundcloud) or video only (imgur) urls, select the best/worst audio format
elif (all(f.get('acodec') != 'none' for f in available_formats) or
all(f.get('vcodec') != 'none' for f in available_formats)):
return available_formats[format_idx]
elif format_spec == 'bestaudio':
audio_formats = [
f for f in available_formats
@@ -1008,13 +1016,13 @@ class YoutubeDL(object):
info_dict['display_id'] = info_dict['id']
if info_dict.get('upload_date') is None and info_dict.get('timestamp') is not None:
# Working around negative timestamps in Windows
# (see http://bugs.python.org/issue1646728)
if info_dict['timestamp'] < 0 and os.name == 'nt':
info_dict['timestamp'] = 0
upload_date = datetime.datetime.utcfromtimestamp(
info_dict['timestamp'])
info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
# Working around out-of-range timestamp values (e.g. negative ones on Windows,
# see http://bugs.python.org/issue1646728)
try:
upload_date = datetime.datetime.utcfromtimestamp(info_dict['timestamp'])
info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
except (ValueError, OverflowError, OSError):
pass
if self.params.get('listsubtitles', False):
if 'automatic_captions' in info_dict:
@@ -1041,6 +1049,8 @@ class YoutubeDL(object):
if not formats:
raise ExtractorError('No video formats found!')
formats_dict = {}
# We check that all the formats have the format and format_id fields
for i, format in enumerate(formats):
if 'url' not in format:
@@ -1048,6 +1058,18 @@ class YoutubeDL(object):
if format.get('format_id') is None:
format['format_id'] = compat_str(i)
format_id = format['format_id']
if format_id not in formats_dict:
formats_dict[format_id] = []
formats_dict[format_id].append(format)
# Make sure all formats have unique format_id
for format_id, ambiguous_formats in formats_dict.items():
if len(ambiguous_formats) > 1:
for i, format in enumerate(ambiguous_formats):
format['format_id'] = '%s-%d' % (format_id, i)
for i, format in enumerate(formats):
if format.get('format') is None:
format['format'] = '{id} - {res}{note}'.format(
id=format['format_id'],
@@ -1063,12 +1085,6 @@ class YoutubeDL(object):
full_format_info.update(format)
format['http_headers'] = self._calc_headers(full_format_info)
format_limit = self.params.get('format_limit', None)
if format_limit:
formats = list(takewhile_inclusive(
lambda f: f['format_id'] != format_limit, formats
))
# TODO Central sorting goes here
if formats[0] is not info_dict:
@@ -1086,7 +1102,14 @@ class YoutubeDL(object):
req_format = self.params.get('format')
if req_format is None:
req_format = 'best'
req_format_list = []
if (self.params.get('outtmpl', DEFAULT_OUTTMPL) != '-' and
info_dict['extractor'] in ['youtube', 'ted']):
merger = FFmpegMergerPP(self)
if merger.available and merger.can_merge():
req_format_list.append('bestvideo+bestaudio')
req_format_list.append('best')
req_format = '/'.join(req_format_list)
formats_to_download = []
if req_format == 'all':
formats_to_download = formats
@@ -1268,7 +1291,7 @@ class YoutubeDL(object):
return
if self.params.get('writedescription', False):
descfn = filename + '.description'
descfn = replace_extension(filename, 'description', info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Video description is already present')
elif info_dict.get('description') is None:
@@ -1283,7 +1306,7 @@ class YoutubeDL(object):
return
if self.params.get('writeannotations', False):
annofn = filename + '.annotations.xml'
annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)):
self.to_screen('[info] Video annotations are already present')
else:
@@ -1330,13 +1353,13 @@ class YoutubeDL(object):
return
if self.params.get('writeinfojson', False):
infofn = os.path.splitext(filename)[0] + '.info.json'
infofn = replace_extension(filename, 'info.json', info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Video description metadata is already present')
else:
self.to_screen('[info] Writing video description metadata as JSON to: ' + infofn)
try:
write_json_file(info_dict, infofn)
write_json_file(self.filter_requested_info(info_dict), infofn)
except (OSError, IOError):
self.report_error('Cannot write metadata to JSON file ' + infofn)
return
@@ -1356,24 +1379,57 @@ class YoutubeDL(object):
if info_dict.get('requested_formats') is not None:
downloaded = []
success = True
merger = FFmpegMergerPP(self, not self.params.get('keepvideo'))
merger = FFmpegMergerPP(self)
if not merger.available:
postprocessors = []
self.report_warning('You have requested multiple '
'formats but ffmpeg or avconv are not installed.'
' The formats won\'t be merged')
' The formats won\'t be merged.')
else:
postprocessors = [merger]
for f in info_dict['requested_formats']:
new_info = dict(info_dict)
new_info.update(f)
fname = self.prepare_filename(new_info)
fname = prepend_extension(fname, 'f%s' % f['format_id'])
downloaded.append(fname)
partial_success = dl(fname, new_info)
success = success and partial_success
info_dict['__postprocessors'] = postprocessors
info_dict['__files_to_merge'] = downloaded
def compatible_formats(formats):
video, audio = formats
# Check extension
video_ext, audio_ext = audio.get('ext'), video.get('ext')
if video_ext and audio_ext:
COMPATIBLE_EXTS = (
('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v'),
('webm')
)
for exts in COMPATIBLE_EXTS:
if video_ext in exts and audio_ext in exts:
return True
# TODO: Check acodec/vcodec
return False
filename_real_ext = os.path.splitext(filename)[1][1:]
filename_wo_ext = (
os.path.splitext(filename)[0]
if filename_real_ext == info_dict['ext']
else filename)
requested_formats = info_dict['requested_formats']
if self.params.get('merge_output_format') is None and not compatible_formats(requested_formats):
info_dict['ext'] = 'mkv'
self.report_warning(
'Requested formats are incompatible for merge and will be merged into mkv.')
# Ensure filename always has a correct extension for successful merge
filename = '%s.%s' % (filename_wo_ext, info_dict['ext'])
if os.path.exists(encodeFilename(filename)):
self.to_screen(
'[download] %s has already been downloaded and '
'merged' % filename)
else:
for f in requested_formats:
new_info = dict(info_dict)
new_info.update(f)
fname = self.prepare_filename(new_info)
fname = prepend_extension(fname, 'f%s' % f['format_id'], new_info['ext'])
downloaded.append(fname)
partial_success = dl(fname, new_info)
success = success and partial_success
info_dict['__postprocessors'] = postprocessors
info_dict['__files_to_merge'] = downloaded
else:
# Just a single file
success = dl(filename, info_dict)
@@ -1460,7 +1516,7 @@ class YoutubeDL(object):
[info_filename], mode='r',
openhook=fileinput.hook_encoded('utf-8'))) as f:
# FileInput doesn't have a read method, we can't call json.load
info = json.loads('\n'.join(f))
info = self.filter_requested_info(json.loads('\n'.join(f)))
try:
self.process_ie_result(info, download=True)
except DownloadError:
@@ -1472,6 +1528,12 @@ class YoutubeDL(object):
raise
return self._download_retcode
@staticmethod
def filter_requested_info(info_dict):
return dict(
(k, v) for k, v in info_dict.items()
if k not in ['requested_formats', 'requested_subtitles'])
def post_process(self, filename, ie_info):
"""Run all the postprocessors on the given file."""
info = dict(ie_info)
@@ -1481,24 +1543,18 @@ class YoutubeDL(object):
pps_chain.extend(ie_info['__postprocessors'])
pps_chain.extend(self._pps)
for pp in pps_chain:
keep_video = None
old_filename = info['filepath']
files_to_delete = []
try:
keep_video_wish, info = pp.run(info)
if keep_video_wish is not None:
if keep_video_wish:
keep_video = keep_video_wish
elif keep_video is None:
# No clear decision yet, let IE decide
keep_video = keep_video_wish
files_to_delete, info = pp.run(info)
except PostProcessingError as e:
self.report_error(e.msg)
if keep_video is False and not self.params.get('keepvideo', False):
try:
if files_to_delete and not self.params.get('keepvideo', False):
for old_filename in files_to_delete:
self.to_screen('Deleting original file %s (pass -k to keep)' % old_filename)
os.remove(encodeFilename(old_filename))
except (IOError, OSError):
self.report_warning('Unable to remove downloaded video file')
try:
os.remove(encodeFilename(old_filename))
except (IOError, OSError):
self.report_warning('Unable to remove downloaded original file')
def _make_archive_id(self, info_dict):
# Future-proof against any change in case
@@ -1666,7 +1722,8 @@ class YoutubeDL(object):
if req_is_string:
req = url_escaped
else:
req = compat_urllib_request.Request(
req_type = HEADRequest if req.get_method() == 'HEAD' else compat_urllib_request.Request
req = req_type(
url_escaped, data=req.data, headers=req.headers,
origin_req_host=req.origin_req_host, unverifiable=req.unverifiable)
@@ -1701,10 +1758,10 @@ class YoutubeDL(object):
out = out.decode().strip()
if re.match('[0-9a-f]+', out):
self._write_string('[debug] Git HEAD: ' + out + '\n')
except:
except Exception:
try:
sys.exc_clear()
except:
except Exception:
pass
self._write_string('[debug] Python version %s - %s\n' % (
platform.python_version(), platform_name()))
@@ -1812,7 +1869,7 @@ class YoutubeDL(object):
thumb_ext = determine_ext(t['url'], 'jpg')
suffix = '_%s' % t['id'] if len(thumbnails) > 1 else ''
thumb_display_id = '%s ' % t['id'] if len(thumbnails) > 1 else ''
thumb_filename = os.path.splitext(filename)[0] + suffix + '.' + thumb_ext
t['filename'] = thumb_filename = os.path.splitext(filename)[0] + suffix + '.' + thumb_ext
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(thumb_filename)):
self.to_screen('[%s] %s: Thumbnail %sis already present' %

View File

@@ -189,10 +189,6 @@ def _real_main(argv=None):
if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True
if sys.version_info < (3,):
# In Python 2, sys.argv is a bytestring (also note http://bugs.python.org/issue2128 for Windows systems)
if opts.outtmpl is not None:
opts.outtmpl = opts.outtmpl.decode(preferredencoding())
outtmpl = ((opts.outtmpl is not None and opts.outtmpl) or
(opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') or
(opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') or
@@ -244,15 +240,18 @@ def _real_main(argv=None):
if opts.xattrs:
postprocessors.append({'key': 'XAttrMetadata'})
if opts.embedthumbnail:
if not opts.addmetadata:
postprocessors.append({'key': 'FFmpegAudioFix'})
postprocessors.append({'key': 'AtomicParsley'})
already_have_thumbnail = opts.writethumbnail or opts.write_all_thumbnails
postprocessors.append({
'key': 'EmbedThumbnail',
'already_have_thumbnail': already_have_thumbnail
})
if not already_have_thumbnail:
opts.writethumbnail = True
# Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
# So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
if opts.exec_cmd:
postprocessors.append({
'key': 'ExecAfterDownload',
'verboseOutput': opts.verbose,
'exec_cmd': opts.exec_cmd,
})
if opts.xattr_set_filesize:
@@ -289,7 +288,6 @@ def _real_main(argv=None):
'simulate': opts.simulate or any_getting,
'skip_download': opts.skip_download,
'format': opts.format,
'format_limit': opts.format_limit,
'listformats': opts.listformats,
'outtmpl': outtmpl,
'autonumber_size': opts.autonumber_size,
@@ -352,7 +350,6 @@ def _real_main(argv=None):
'default_search': opts.default_search,
'youtube_include_dash_manifest': opts.youtube_include_dash_manifest,
'encoding': opts.encoding,
'exec_cmd': opts.exec_cmd,
'extract_flat': opts.extract_flat,
'merge_output_format': opts.merge_output_format,
'postprocessors': postprocessors,

View File

@@ -152,7 +152,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
"""
NONCE_LENGTH_BYTES = 8
data = bytes_to_intlist(base64.b64decode(data))
data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
password = bytes_to_intlist(password.encode('utf-8'))
key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))

View File

@@ -46,11 +46,6 @@ try:
except ImportError: # Python 2
import htmlentitydefs as compat_html_entities
try:
import html.parser as compat_html_parser
except ImportError: # Python 2
import HTMLParser as compat_html_parser
try:
import http.client as compat_http_client
except ImportError: # Python 2
@@ -389,7 +384,7 @@ else:
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = sp.communicate()
lines, columns = map(int, out.split())
except:
except Exception:
pass
return _terminal_size(columns, lines)
@@ -404,7 +399,6 @@ __all__ = [
'compat_getenv',
'compat_getpass',
'compat_html_entities',
'compat_html_parser',
'compat_http_client',
'compat_http_server',
'compat_kwargs',

View File

@@ -6,7 +6,7 @@ from .f4m import F4mFD
from .hls import HlsFD
from .hls import NativeHlsFD
from .http import HttpFD
from .mplayer import MplayerFD
from .rtsp import RtspFD
from .rtmp import RtmpFD
from ..utils import (
@@ -17,8 +17,8 @@ PROTOCOL_MAP = {
'rtmp': RtmpFD,
'm3u8_native': NativeHlsFD,
'm3u8': HlsFD,
'mms': MplayerFD,
'rtsp': MplayerFD,
'mms': RtspFD,
'rtsp': RtspFD,
'f4m': F4mFD,
}

View File

@@ -8,6 +8,7 @@ import time
from ..compat import compat_str
from ..utils import (
encodeFilename,
decodeArgument,
format_bytes,
timeconvert,
)
@@ -204,7 +205,7 @@ class FileDownloader(object):
return
try:
os.utime(filename, (time.time(), filetime))
except:
except Exception:
pass
return filetime
@@ -318,7 +319,7 @@ class FileDownloader(object):
)
continuedl_and_exists = (
self.params.get('continuedl', False) and
self.params.get('continuedl', True) and
os.path.isfile(encodeFilename(filename)) and
not self.params.get('nopart', False)
)
@@ -353,19 +354,15 @@ class FileDownloader(object):
# this interface
self._progress_hooks.append(ph)
def _debug_cmd(self, args, subprocess_encoding, exe=None):
def _debug_cmd(self, args, exe=None):
if not self.params.get('verbose', False):
return
if exe is None:
exe = os.path.basename(args[0])
str_args = [decodeArgument(a) for a in args]
if exe is None:
exe = os.path.basename(str_args[0])
if subprocess_encoding:
str_args = [
a.decode(subprocess_encoding) if isinstance(a, bytes) else a
for a in args]
else:
str_args = args
try:
import pipes
shell_quote = lambda args: ' '.join(map(pipes.quote, str_args))

View File

@@ -2,11 +2,11 @@ from __future__ import unicode_literals
import os.path
import subprocess
import sys
from .common import FileDownloader
from ..utils import (
encodeFilename,
encodeArgument,
)
@@ -60,17 +60,9 @@ class ExternalFD(FileDownloader):
def _call_downloader(self, tmpfilename, info_dict):
""" Either overwrite this or implement _make_cmd """
cmd = self._make_cmd(tmpfilename, info_dict)
cmd = [encodeArgument(a) for a in self._make_cmd(tmpfilename, info_dict)]
if sys.platform == 'win32' and sys.version_info < (3, 0):
# Windows subprocess module does not actually support Unicode
# on Python 2.x
# See http://stackoverflow.com/a/9951851/35070
subprocess_encoding = sys.getfilesystemencoding()
cmd = [a.encode(subprocess_encoding, 'ignore') for a in cmd]
else:
subprocess_encoding = None
self._debug_cmd(cmd, subprocess_encoding)
self._debug_cmd(cmd)
p = subprocess.Popen(
cmd, stderr=subprocess.PIPE)

View File

@@ -389,6 +389,8 @@ class F4mFD(FileDownloader):
url = base_url + name
if akamai_pv:
url += '?' + akamai_pv.strip(';')
if info_dict.get('extra_param_to_segment_url'):
url += info_dict.get('extra_param_to_segment_url')
frag_filename = '%s-%s' % (tmpfilename, name)
try:
success = http_dl.download(frag_filename, {'url': url})

View File

@@ -28,13 +28,8 @@ class HttpFD(FileDownloader):
add_headers = info_dict.get('http_headers')
if add_headers:
headers.update(add_headers)
data = info_dict.get('http_post_data')
http_method = info_dict.get('http_method')
basic_request = compat_urllib_request.Request(url, data, headers)
request = compat_urllib_request.Request(url, data, headers)
if http_method is not None:
basic_request.get_method = lambda: http_method
request.get_method = lambda: http_method
basic_request = compat_urllib_request.Request(url, None, headers)
request = compat_urllib_request.Request(url, None, headers)
is_test = self.params.get('test', False)
@@ -49,7 +44,7 @@ class HttpFD(FileDownloader):
open_mode = 'wb'
if resume_len != 0:
if self.params.get('continuedl', False):
if self.params.get('continuedl', True):
self.report_resuming_byte(resume_len)
request.add_header('Range', 'bytes=%d-' % resume_len)
open_mode = 'ab'

View File

@@ -3,7 +3,6 @@ from __future__ import unicode_literals
import os
import re
import subprocess
import sys
import time
from .common import FileDownloader
@@ -11,6 +10,7 @@ from ..compat import compat_str
from ..utils import (
check_executable,
encodeFilename,
encodeArgument,
get_exe_version,
)
@@ -105,7 +105,7 @@ class RtmpFD(FileDownloader):
protocol = info_dict.get('rtmp_protocol', None)
real_time = info_dict.get('rtmp_real_time', False)
no_resume = info_dict.get('no_resume', False)
continue_dl = info_dict.get('continuedl', False)
continue_dl = info_dict.get('continuedl', True)
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
@@ -121,7 +121,7 @@ class RtmpFD(FileDownloader):
# possible. This is part of rtmpdump's normal usage, AFAIK.
basic_args = [
'rtmpdump', '--verbose', '-r', url,
'-o', encodeFilename(tmpfilename, True)]
'-o', tmpfilename]
if player_url is not None:
basic_args += ['--swfVfy', player_url]
if page_url is not None:
@@ -131,7 +131,7 @@ class RtmpFD(FileDownloader):
if play_path is not None:
basic_args += ['--playpath', play_path]
if tc_url is not None:
basic_args += ['--tcUrl', url]
basic_args += ['--tcUrl', tc_url]
if test:
basic_args += ['--stop', '1']
if flash_version is not None:
@@ -154,16 +154,9 @@ class RtmpFD(FileDownloader):
if not live and continue_dl:
args += ['--skip', '1']
if sys.platform == 'win32' and sys.version_info < (3, 0):
# Windows subprocess module does not actually support Unicode
# on Python 2.x
# See http://stackoverflow.com/a/9951851/35070
subprocess_encoding = sys.getfilesystemencoding()
args = [a.encode(subprocess_encoding, 'ignore') for a in args]
else:
subprocess_encoding = None
args = [encodeArgument(a) for a in args]
self._debug_cmd(args, subprocess_encoding, exe='rtmpdump')
self._debug_cmd(args, exe='rtmpdump')
RD_SUCCESS = 0
RD_FAILED = 1
@@ -180,7 +173,11 @@ class RtmpFD(FileDownloader):
prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] %s bytes' % prevsize)
time.sleep(5.0) # This seems to be needed
retval = run_rtmpdump(basic_args + ['-e'] + [[], ['-k', '1']][retval == RD_FAILED])
args = basic_args + ['--resume']
if retval == RD_FAILED:
args += ['--skip', '1']
args = [encodeArgument(a) for a in args]
retval = run_rtmpdump(args)
cursize = os.path.getsize(encodeFilename(tmpfilename))
if prevsize == cursize and retval == RD_FAILED:
break

View File

@@ -10,21 +10,23 @@ from ..utils import (
)
class MplayerFD(FileDownloader):
class RtspFD(FileDownloader):
def real_download(self, filename, info_dict):
url = info_dict['url']
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
args = [
'mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy',
'-dumpstream', '-dumpfile', tmpfilename, url]
# Check for mplayer first
if not check_executable('mplayer', ['-h']):
self.report_error('MMS or RTSP download detected but "%s" could not be run' % args[0])
if check_executable('mplayer', ['-h']):
args = [
'mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy',
'-dumpstream', '-dumpfile', tmpfilename, url]
elif check_executable('mpv', ['-h']):
args = [
'mpv', '-really-quiet', '--vo=null', '--stream-dump=' + tmpfilename, url]
else:
self.report_error('MMS or RTSP download detected but neither "mplayer" nor "mpv" could be run. Please install any.')
return False
# Download using mplayer.
retval = subprocess.call(args)
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
@@ -39,5 +41,5 @@ class MplayerFD(FileDownloader):
return True
else:
self.to_stderr('\n')
self.report_error('mplayer exited with code %d' % retval)
self.report_error('%s exited with code %d' % (args[0], retval))
return False

View File

@@ -32,6 +32,7 @@ from .atresplayer import AtresPlayerIE
from .atttechchannel import ATTTechChannelIE
from .audiomack import AudiomackIE, AudiomackAlbumIE
from .azubu import AzubuIE
from .baidu import BaiduVideoIE
from .bambuser import BambuserIE, BambuserChannelIE
from .bandcamp import BandcampIE, BandcampAlbumIE
from .bbccouk import BBCCoUkIE
@@ -70,6 +71,7 @@ from .chirbit import (
ChirbitProfileIE,
)
from .cinchcast import CinchcastIE
from .cinemassacre import CinemassacreIE
from .clipfish import ClipfishIE
from .cliphunter import CliphunterIE
from .clipsyndicate import ClipsyndicateIE
@@ -90,6 +92,7 @@ from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
from .condenast import CondeNastIE
from .cracked import CrackedIE
from .criterion import CriterionIE
from .crooksandliars import CrooksAndLiarsIE
from .crunchyroll import (
CrunchyrollIE,
CrunchyrollShowPlaylistIE
@@ -106,14 +109,20 @@ from .dbtv import DBTVIE
from .dctp import DctpTvIE
from .deezer import DeezerPlaylistIE
from .dfb import DFBIE
from .dhm import DHMIE
from .dotsub import DotsubIE
from .douyutv import DouyuTVIE
from .dramafever import (
DramaFeverIE,
DramaFeverSeriesIE,
)
from .dreisat import DreiSatIE
from .drbonanza import DRBonanzaIE
from .drtuber import DrTuberIE
from .drtv import DRTVIE
from .dvtv import DVTVIE
from .dump import DumpIE
from .dumpert import DumpertIE
from .defense import DefenseGouvFrIE
from .discovery import DiscoveryIE
from .divxstage import DivxStageIE
@@ -136,6 +145,7 @@ from .engadget import EngadgetIE
from .eporner import EpornerIE
from .eroprofile import EroProfileIE
from .escapist import EscapistIE
from .espn import ESPNIE
from .everyonesmixtape import EveryonesMixtapeIE
from .exfm import ExfmIE
from .expotv import ExpoTVIE
@@ -143,10 +153,10 @@ from .extremetube import ExtremeTubeIE
from .facebook import FacebookIE
from .faz import FazIE
from .fc2 import FC2IE
from .firedrive import FiredriveIE
from .firstpost import FirstpostIE
from .firsttv import FirstTVIE
from .fivemin import FiveMinIE
from .fivetv import FiveTVIE
from .fktv import (
FKTVIE,
FKTVPosteckeIE,
@@ -157,6 +167,7 @@ from .footyroom import FootyRoomIE
from .fourtube import FourTubeIE
from .foxgay import FoxgayIE
from .foxnews import FoxNewsIE
from .foxsports import FoxSportsIE
from .franceculture import FranceCultureIE
from .franceinter import FranceInterIE
from .francetv import (
@@ -175,12 +186,14 @@ from .gameone import (
GameOneIE,
GameOnePlaylistIE,
)
from .gamersyde import GamersydeIE
from .gamespot import GameSpotIE
from .gamestar import GameStarIE
from .gametrailers import GametrailersIE
from .gazeta import GazetaIE
from .gdcvault import GDCVaultIE
from .generic import GenericIE
from .gfycat import GfycatIE
from .giantbomb import GiantBombIE
from .giga import GigaIE
from .glide import GlideIE
@@ -192,7 +205,6 @@ from .googleplus import GooglePlusIE
from .googlesearch import GoogleSearchIE
from .gorillavid import GorillaVidIE
from .goshgay import GoshgayIE
from .grooveshark import GroovesharkIE
from .groupon import GrouponIE
from .hark import HarkIE
from .hearthisat import HearThisAtIE
@@ -222,6 +234,7 @@ from .infoq import InfoQIE
from .instagram import InstagramIE, InstagramUserIE
from .internetvideoarchive import InternetVideoArchiveIE
from .iprima import IPrimaIE
from .iqiyi import IqiyiIE
from .ivi import (
IviIE,
IviCompilationIE
@@ -236,6 +249,7 @@ from .kaltura import KalturaIE
from .kanalplay import KanalPlayIE
from .kankan import KankanIE
from .karaoketv import KaraoketvIE
from .karrierevideos import KarriereVideosIE
from .keezmovies import KeezMoviesIE
from .khanacademy import KhanAcademyIE
from .kickstarter import KickStarterIE
@@ -251,7 +265,10 @@ from .letv import (
LetvPlaylistIE
)
from .libsyn import LibsynIE
from .lifenews import LifeNewsIE
from .lifenews import (
LifeNewsIE,
LifeEmbedIE,
)
from .liveleak import LiveLeakIE
from .livestream import (
LivestreamIE,
@@ -269,11 +286,13 @@ from .macgamestore import MacGameStoreIE
from .mailru import MailRuIE
from .malemotion import MalemotionIE
from .mdr import MDRIE
from .megavideoz import MegaVideozIE
from .metacafe import MetacafeIE
from .metacritic import MetacriticIE
from .mgoon import MgoonIE
from .minhateca import MinhatecaIE
from .ministrygrid import MinistryGridIE
from .miomio import MioMioIE
from .mit import TechTVMITIE, MITIE, OCWMITIE
from .mitele import MiTeleIE
from .mixcloud import MixcloudIE
@@ -309,8 +328,13 @@ from .nba import NBAIE
from .nbc import (
NBCIE,
NBCNewsIE,
NBCSportsIE,
NBCSportsVPlayerIE,
)
from .ndr import (
NDRIE,
NJoyIE,
)
from .ndr import NDRIE
from .ndtv import NDTVIE
from .netzkino import NetzkinoIE
from .nerdcubed import NerdCubedFeedIE
@@ -320,8 +344,7 @@ from .newstube import NewstubeIE
from .nextmedia import (
NextMediaIE,
NextMediaActionNewsIE,
AppleDailyRealtimeNewsIE,
AppleDailyAnimationNewsIE
AppleDailyIE,
)
from .nfb import NFBIE
from .nfl import NFLIE
@@ -335,8 +358,10 @@ from .ninegag import NineGagIE
from .noco import NocoIE
from .normalboots import NormalbootsIE
from .nosvideo import NosVideoIE
from .nova import NovaIE
from .novamov import NovaMovIE
from .nowness import NownessIE
from .nowtv import NowTVIE
from .nowvideo import NowVideoIE
from .npo import (
NPOIE,
@@ -352,11 +377,17 @@ from .nrk import (
)
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
from .nytimes import NYTimesIE
from .nytimes import (
NYTimesIE,
NYTimesArticleIE,
)
from .nuvid import NuvidIE
from .odnoklassniki import OdnoklassnikiIE
from .oktoberfesttv import OktoberfestTVIE
from .ooyala import OoyalaIE
from .ooyala import (
OoyalaIE,
OoyalaExternalIE,
)
from .openfilm import OpenFilmIE
from .orf import (
ORFTVthekIE,
@@ -367,6 +398,7 @@ from .orf import (
from .parliamentliveuk import ParliamentLiveUKIE
from .patreon import PatreonIE
from .pbs import PBSIE
from .philharmoniedeparis import PhilharmonieDeParisIE
from .phoenix import PhoenixIE
from .photobucket import PhotobucketIE
from .planetaplay import PlanetaPlayIE
@@ -376,21 +408,30 @@ from .playfm import PlayFMIE
from .playvid import PlayvidIE
from .playwire import PlaywireIE
from .podomatic import PodomaticIE
from .porn91 import Porn91IE
from .pornhd import PornHdIE
from .pornhub import (
PornHubIE,
PornHubPlaylistIE,
)
from .pornotube import PornotubeIE
from .pornovoisines import PornoVoisinesIE
from .pornoxo import PornoXOIE
from .primesharetv import PrimeShareTVIE
from .promptfile import PromptFileIE
from .prosiebensat1 import ProSiebenSat1IE
from .puls4 import Puls4IE
from .pyvideo import PyvideoIE
from .qqmusic import (
QQMusicIE,
QQMusicSingerIE,
QQMusicAlbumIE,
QQMusicToplistIE,
)
from .quickvid import QuickVidIE
from .r7 import R7IE
from .radiode import RadioDeIE
from .radiojavan import RadioJavanIE
from .radiobremen import RadioBremenIE
from .radiofrance import RadioFranceIE
from .rai import RaiIE
@@ -405,7 +446,6 @@ from .roxwel import RoxwelIE
from .rtbf import RTBFIE
from .rte import RteIE
from .rtlnl import RtlNlIE
from .rtlnow import RTLnowIE
from .rtl2 import RTL2IE
from .rtp import RTPIE
from .rts import RTSIE
@@ -419,14 +459,20 @@ from .rutube import (
RutubePersonIE,
)
from .rutv import RUTVIE
from .ruutu import RuutuIE
from .sandia import SandiaIE
from .safari import (
SafariIE,
SafariCourseIE,
)
from .sapo import SapoIE
from .savefrom import SaveFromIE
from .sbs import SBSIE
from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
from .screenwavemedia import CinemassacreIE, ScreenwaveMediaIE, TeamFourIE
from .screenwavemedia import ScreenwaveMediaIE, TeamFourIE
from .senateisvp import SenateISVPIE
from .servingsys import ServingSysIE
from .sexu import SexuIE
from .sexykarma import SexyKarmaIE
@@ -442,8 +488,11 @@ from .smotri import (
SmotriBroadcastIE,
)
from .snotr import SnotrIE
from .sockshare import SockshareIE
from .sohu import SohuIE
from .soompi import (
SoompiIE,
SoompiShowIE,
)
from .soundcloud import (
SoundcloudIE,
SoundcloudSetIE,
@@ -456,16 +505,24 @@ from .soundgasm import (
)
from .southpark import (
SouthParkIE,
SouthparkDeIE,
SouthParkDeIE,
SouthParkDkIE,
SouthParkEsIE,
SouthParkNlIE
)
from .space import SpaceIE
from .spankbang import SpankBangIE
from .spankwire import SpankwireIE
from .spiegel import SpiegelIE, SpiegelArticleIE
from .spiegeltv import SpiegeltvIE
from .spike import SpikeIE
from .sport5 import Sport5IE
from .sportbox import SportBoxIE
from .sportbox import (
SportBoxIE,
SportBoxEmbedIE,
)
from .sportdeutschland import SportDeutschlandIE
from .srf import SrfIE
from .srmediathek import SRMediathekIE
from .ssa import SSAIE
from .stanfordoc import StanfordOpenClassroomIE
@@ -474,7 +531,10 @@ from .streamcloud import StreamcloudIE
from .streamcz import StreamCZIE
from .streetvoice import StreetVoiceIE
from .sunporno import SunPornoIE
from .svtplay import SVTPlayIE
from .svt import (
SVTIE,
SVTPlayIE,
)
from .swrmediathek import SWRMediathekIE
from .syfy import SyfyIE
from .sztvhu import SztvHuIE
@@ -503,7 +563,10 @@ from .thesixtyone import TheSixtyOneIE
from .thisav import ThisAVIE
from .tinypic import TinyPicIE
from .tlc import TlcIE, TlcDeIE
from .tmz import TMZIE
from .tmz import (
TMZIE,
TMZArticleIE,
)
from .tnaflix import TNAFlixIE
from .thvideo import (
THVideoIE,
@@ -515,17 +578,30 @@ from .traileraddict import TrailerAddictIE
from .trilulilu import TriluliluIE
from .trutube import TruTubeIE
from .tube8 import Tube8IE
from .tubitv import TubiTvIE
from .tudou import TudouIE
from .tumblr import TumblrIE
from .tunein import TuneInIE
from .turbo import TurboIE
from .tutv import TutvIE
from .tv2 import (
TV2IE,
TV2ArticleIE,
)
from .tv4 import TV4IE
from .tvc import (
TVCIE,
TVCArticleIE,
)
from .tvigle import TvigleIE
from .tvp import TvpIE, TvpSeriesIE
from .tvplay import TVPlayIE
from .tweakers import TweakersIE
from .twentyfourvideo import TwentyFourVideoIE
from .twentytwotracks import (
TwentyTwoTracksIE,
TwentyTwoTracksGenreIE
)
from .twitch import (
TwitchVideoIE,
TwitchChapterIE,
@@ -540,16 +616,23 @@ from .udemy import (
UdemyIE,
UdemyCourseIE
)
from .udn import UDNEmbedIE
from .ultimedia import UltimediaIE
from .unistra import UnistraIE
from .urort import UrortIE
from .ustream import UstreamIE, UstreamChannelIE
from .varzesh3 import Varzesh3IE
from .vbox7 import Vbox7IE
from .veehd import VeeHDIE
from .veoh import VeohIE
from .vessel import VesselIE
from .vesti import VestiIE
from .vevo import VevoIE
from .vgtv import VGTVIE
from .vgtv import (
BTArticleIE,
BTVestlendingenIE,
VGTVIE,
)
from .vh1 import VH1IE
from .vice import ViceIE
from .viddler import ViddlerIE
@@ -580,12 +663,16 @@ from .vine import (
VineIE,
VineUserIE,
)
from .viki import VikiIE
from .viki import (
VikiIE,
VikiChannelIE,
)
from .vk import (
VKIE,
VKUserVideosIE,
)
from .vodlocker import VodlockerIE
from .voicerepublic import VoiceRepublicIE
from .vporn import VpornIE
from .vrt import VRTIE
from .vube import VubeIE
@@ -612,9 +699,10 @@ from .xboxclips import XboxClipsIE
from .xhamster import XHamsterIE
from .xminus import XMinusIE
from .xnxx import XNXXIE
from .xvideos import XVideosIE
from .xstream import XstreamIE
from .xtube import XTubeUserIE, XTubeIE
from .xuite import XuiteIE
from .xvideos import XVideosIE
from .xxxymovies import XXXYMoviesIE
from .yahoo import (
YahooIE,

View File

@@ -11,12 +11,13 @@ from ..compat import (
)
from ..utils import (
ExtractorError,
qualities,
)
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'^http://(?:\w+\.)?add-anime\.net/watch_video\.php\?(?:.*?)v=(?P<id>[\w_]+)(?:.*)'
_TEST = {
_VALID_URL = r'http://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_TESTS = [{
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
'info_dict': {
@@ -25,7 +26,10 @@ class AddAnimeIE(InfoExtractor):
'description': 'One Piece 606',
'title': 'One Piece 606',
}
}
}, {
'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -63,8 +67,10 @@ class AddAnimeIE(InfoExtractor):
note='Confirming after redirect')
webpage = self._download_webpage(url, video_id)
FORMATS = ('normal', 'hq')
quality = qualities(FORMATS)
formats = []
for format_id in ('normal', 'hq'):
for format_id in FORMATS:
rex = r"var %s_video_file = '(.*?)';" % re.escape(format_id)
video_url = self._search_regex(rex, webpage, 'video file URLx',
fatal=False)
@@ -73,6 +79,7 @@ class AddAnimeIE(InfoExtractor):
formats.append({
'format_id': format_id,
'url': video_url,
'quality': quality(format_id),
})
self._sort_formats(formats)
video_title = self._og_search_title(webpage)

View File

@@ -1,21 +1,11 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
xpath_with_ns,
xpath_text,
find_xpath_attr,
)
class AftenpostenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/(?:#!/)?video/(?P<id>\d+)'
_TEST = {
'url': 'http://www.aftenposten.no/webtv/#!/video/21039/trailer-sweatshop-i-can-t-take-any-more',
'md5': 'fd828cd29774a729bf4d4425fe192972',
@@ -30,69 +20,4 @@ class AftenpostenIE(InfoExtractor):
}
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_xml(
'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)
NS_MAP = {
'atom': 'http://www.w3.org/2005/Atom',
'xt': 'http://xstream.dk/',
'media': 'http://search.yahoo.com/mrss/',
}
entry = data.find(xpath_with_ns('./atom:entry', NS_MAP))
title = xpath_text(
entry, xpath_with_ns('./atom:title', NS_MAP), 'title')
description = xpath_text(
entry, xpath_with_ns('./atom:summary', NS_MAP), 'description')
timestamp = parse_iso8601(xpath_text(
entry, xpath_with_ns('./atom:published', NS_MAP), 'upload date'))
formats = []
media_group = entry.find(xpath_with_ns('./media:group', NS_MAP))
for media_content in media_group.findall(xpath_with_ns('./media:content', NS_MAP)):
media_url = media_content.get('url')
if not media_url:
continue
tbr = int_or_none(media_content.get('bitrate'))
mobj = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', media_url)
if mobj:
formats.append({
'url': mobj.group('url'),
'play_path': 'mp4:%s' % mobj.group('playpath'),
'app': mobj.group('app'),
'ext': 'flv',
'tbr': tbr,
'format_id': 'rtmp-%d' % tbr,
})
else:
formats.append({
'url': media_url,
'tbr': tbr,
})
self._sort_formats(formats)
link = find_xpath_attr(
entry, xpath_with_ns('./atom:link', NS_MAP), 'rel', 'original')
if link is not None:
formats.append({
'url': link.get('href'),
'format_id': link.get('rel'),
})
thumbnails = [{
'url': splash.get('url'),
'width': int_or_none(splash.get('width')),
'height': int_or_none(splash.get('height')),
} for splash in media_group.findall(xpath_with_ns('./xt:splash', NS_MAP))]
return {
'id': video_id,
'title': title,
'description': description,
'timestamp': timestamp,
'formats': formats,
'thumbnails': thumbnails,
}
return self.url_result('xstream:ap:%s' % self._match_id(url), 'Xstream')

View File

@@ -2,14 +2,15 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
class AftonbladetIE(InfoExtractor):
_VALID_URL = r'^http://tv\.aftonbladet\.se/webbtv.+?(?P<video_id>article[0-9]+)\.ab(?:$|[?#])'
_VALID_URL = r'http://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://tv.aftonbladet.se/webbtv/nyheter/vetenskap/rymden/article36015.ab',
'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
'info_dict': {
'id': 'article36015',
'id': '36015',
'ext': 'mp4',
'title': 'Vulkanutbrott i rymden - nu släpper NASA bilderna',
'description': 'Jupiters måne mest aktiv av alla himlakroppar',
@@ -24,8 +25,9 @@ class AftonbladetIE(InfoExtractor):
# find internal video meta data
meta_url = 'http://aftonbladet-play.drlib.aptoma.no/video/%s.json'
internal_meta_id = self._html_search_regex(
r'data-aptomaId="([\w\d]+)"', webpage, 'internal_meta_id')
player_config = self._parse_json(self._html_search_regex(
r'data-player-config="([^"]+)"', webpage, 'player config'), video_id)
internal_meta_id = player_config['videoId']
internal_meta_url = meta_url % internal_meta_id
internal_meta_json = self._download_json(
internal_meta_url, video_id, 'Downloading video meta data')
@@ -43,9 +45,9 @@ class AftonbladetIE(InfoExtractor):
formats.append({
'url': 'http://%s:%d/%s/%s' % (p['address'], p['port'], p['path'], p['filename']),
'ext': 'mp4',
'width': fmt['width'],
'height': fmt['height'],
'tbr': fmt['bitrate'],
'width': int_or_none(fmt.get('width')),
'height': int_or_none(fmt.get('height')),
'tbr': int_or_none(fmt.get('bitrate')),
'protocol': 'http',
})
self._sort_formats(formats)
@@ -54,9 +56,9 @@ class AftonbladetIE(InfoExtractor):
'id': video_id,
'title': internal_meta_json['title'],
'formats': formats,
'thumbnail': internal_meta_json['imageUrl'],
'description': internal_meta_json['shortPreamble'],
'timestamp': internal_meta_json['timePublished'],
'duration': internal_meta_json['duration'],
'view_count': internal_meta_json['views'],
'thumbnail': internal_meta_json.get('imageUrl'),
'description': internal_meta_json.get('shortPreamble'),
'timestamp': int_or_none(internal_meta_json.get('timePublished')),
'duration': int_or_none(internal_meta_json.get('duration')),
'view_count': int_or_none(internal_meta_json.get('views')),
}

View File

@@ -33,7 +33,7 @@ class ArchiveOrgIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
json_url = url + ('?' if '?' in url else '&') + 'output=json'
json_url = url + ('&' if '?' in url else '?') + 'output=json'
data = self._download_json(json_url, video_id)
def get_optional(data_dict, field):

View File

@@ -7,7 +7,6 @@ from .common import InfoExtractor
from ..utils import (
find_xpath_attr,
unified_strdate,
get_element_by_id,
get_element_by_attribute,
int_or_none,
qualities,
@@ -195,7 +194,9 @@ class ArteTVFutureIE(ArteTVPlus7IE):
def _real_extract(self, url):
anchor_id, lang = self._extract_url_info(url)
webpage = self._download_webpage(url, anchor_id)
row = get_element_by_id(anchor_id, webpage)
row = self._search_regex(
r'(?s)id="%s"[^>]*>.+?(<div[^>]*arte_vp_url[^>]*>)' % anchor_id,
webpage, 'row')
return self._extract_from_webpage(row, anchor_id, lang)

View File

@@ -0,0 +1,68 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
class BaiduVideoIE(InfoExtractor):
_VALID_URL = r'http://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'
_TESTS = [{
'url': 'http://v.baidu.com/comic/1069.htm?frp=bdbrand&q=%E4%B8%AD%E5%8D%8E%E5%B0%8F%E5%BD%93%E5%AE%B6',
'info_dict': {
'id': '1069',
'title': '中华小当家 TV版 (全52集)',
'description': 'md5:395a419e41215e531c857bb037bbaf80',
},
'playlist_count': 52,
}, {
'url': 'http://v.baidu.com/show/11595.htm?frp=bdbrand',
'info_dict': {
'id': '11595',
'title': 're:^奔跑吧兄弟',
'description': 'md5:1bf88bad6d850930f542d51547c089b8',
},
'playlist_mincount': 3,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
category = category2 = mobj.group('type')
if category == 'show':
category2 = 'tvshow'
webpage = self._download_webpage(url, playlist_id)
playlist_title = self._html_search_regex(
r'title\s*:\s*(["\'])(?P<title>[^\']+)\1', webpage,
'playlist title', group='title')
playlist_description = self._html_search_regex(
r'<input[^>]+class="j-data-intro"[^>]+value="([^"]+)"/>', webpage,
playlist_id, 'playlist description')
site = self._html_search_regex(
r'filterSite\s*:\s*["\']([^"]*)["\']', webpage,
'primary provider site')
api_result = self._download_json(
'http://v.baidu.com/%s_intro/?dtype=%sPlayUrl&id=%s&site=%s' % (
category, category2, playlist_id, site),
playlist_id, 'Get playlist links')
entries = []
for episode in api_result[0]['episodes']:
episode_id = '%s_%s' % (playlist_id, episode['episode'])
redirect_page = self._download_webpage(
compat_urlparse.urljoin(url, episode['url']), episode_id,
note='Download Baidu redirect page')
real_url = self._html_search_regex(
r'location\.replace\("([^"]+)"\)', redirect_page, 'real URL')
entries.append(self.url_result(
real_url, video_title=episode['single_title']))
return self.playlist_result(
entries, playlist_id, playlist_title, playlist_description)

View File

@@ -1,12 +1,18 @@
from __future__ import unicode_literals
import re
import json
import itertools
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
compat_str,
)
from ..utils import (
ExtractorError,
int_or_none,
float_or_none,
)
@@ -14,6 +20,8 @@ class BambuserIE(InfoExtractor):
IE_NAME = 'bambuser'
_VALID_URL = r'https?://bambuser\.com/v/(?P<id>\d+)'
_API_KEY = '005f64509e19a868399060af746a00aa'
_LOGIN_URL = 'https://bambuser.com/user'
_NETRC_MACHINE = 'bambuser'
_TEST = {
'url': 'http://bambuser.com/v/4050584',
@@ -26,6 +34,9 @@ class BambuserIE(InfoExtractor):
'duration': 3741,
'uploader': 'pixelversity',
'uploader_id': '344706',
'timestamp': 1382976692,
'upload_date': '20131028',
'view_count': int,
},
'params': {
# It doesn't respect the 'Range' header, it would download the whole video
@@ -34,23 +45,60 @@ class BambuserIE(InfoExtractor):
},
}
def _login(self):
(username, password) = self._get_login_info()
if username is None:
return
login_form = {
'form_id': 'user_login',
'op': 'Log in',
'name': username,
'pass': password,
}
request = compat_urllib_request.Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in as %s' % username)
login_error = self._html_search_regex(
r'(?s)<div class="messages error">(.+?)</div>',
response, 'login error', default=None)
if login_error:
raise ExtractorError(
'Unable to login: %s' % login_error, expected=True)
def _real_initialize(self):
self._login()
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
info_url = ('http://player-c.api.bambuser.com/getVideo.json?'
'&api_key=%s&vid=%s' % (self._API_KEY, video_id))
info_json = self._download_webpage(info_url, video_id)
info = json.loads(info_json)['result']
video_id = self._match_id(url)
info = self._download_json(
'http://player-c.api.bambuser.com/getVideo.json?api_key=%s&vid=%s'
% (self._API_KEY, video_id), video_id)
error = info.get('error')
if error:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error), expected=True)
result = info['result']
return {
'id': video_id,
'title': info['title'],
'url': info['url'],
'thumbnail': info.get('preview'),
'duration': int(info['length']),
'view_count': int(info['views_total']),
'uploader': info['username'],
'uploader_id': info['owner']['uid'],
'title': result['title'],
'url': result['url'],
'thumbnail': result.get('preview'),
'duration': int_or_none(result.get('length')),
'uploader': result.get('username'),
'uploader_id': compat_str(result.get('owner', {}).get('uid')),
'timestamp': int_or_none(result.get('created')),
'fps': float_or_none(result.get('framerate')),
'view_count': int_or_none(result.get('views_total')),
'comment_count': int_or_none(result.get('comment_count')),
}

View File

@@ -72,7 +72,7 @@ class BandcampIE(InfoExtractor):
download_link = m_download.group(1)
video_id = self._search_regex(
r'(?ms)var TralbumData = {.*?id: (?P<id>\d+),?$',
r'(?ms)var TralbumData = .*?[{,]\s*id: (?P<id>\d+),?$',
webpage, 'video id')
download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')

View File

@@ -3,7 +3,10 @@ from __future__ import unicode_literals
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import ExtractorError
from ..utils import (
ExtractorError,
int_or_none,
)
from ..compat import compat_HTTPError
@@ -112,6 +115,20 @@ class BBCCoUkIE(InfoExtractor):
# rtmp download
'skip_download': True,
}
}, {
'url': 'http://www.bbc.co.uk/iplayer/episode/b054fn09/ad/natural-world-20152016-2-super-powered-owls',
'info_dict': {
'id': 'p02n76xf',
'ext': 'flv',
'title': 'Natural World, 2015-2016: 2. Super Powered Owls',
'description': 'md5:e4db5c937d0e95a7c6b5e654d429183d',
'duration': 3540,
},
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'geolocation',
}, {
'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
'only_matching': True,
@@ -326,16 +343,27 @@ class BBCCoUkIE(InfoExtractor):
webpage = self._download_webpage(url, group_id, 'Downloading video page')
programme_id = self._search_regex(
r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
programme_id = None
tviplayer = self._search_regex(
r'mediator\.bind\(({.+?})\s*,\s*document\.getElementById',
webpage, 'player', default=None)
if tviplayer:
player = self._parse_json(tviplayer, group_id).get('player', {})
duration = int_or_none(player.get('duration'))
programme_id = player.get('vpid')
if not programme_id:
programme_id = self._search_regex(
r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
if programme_id:
player = self._download_json(
'http://www.bbc.co.uk/iplayer/episode/%s.json' % group_id,
group_id)['jsConf']['player']
title = player['title']
description = player['subtitle']
duration = player['duration']
formats, subtitles = self._download_media_selector(programme_id)
title = self._og_search_title(webpage)
description = self._search_regex(
r'<p class="medium-description">([^<]+)</p>',
webpage, 'description', fatal=False)
else:
programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
@@ -345,6 +373,7 @@ class BBCCoUkIE(InfoExtractor):
'id': programme_id,
'title': title,
'description': description,
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'duration': duration,
'formats': formats,
'subtitles': subtitles,

View File

@@ -16,11 +16,11 @@ class BetIE(InfoExtractor):
{
'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
'info_dict': {
'id': '740ab250-bb94-4a8a-8787-fe0de7c74471',
'id': 'news/national/2014/a-conversation-with-president-obama',
'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
'ext': 'flv',
'title': 'BET News Presents: A Conversation With President Obama',
'description': 'md5:5a88d8ae912c1b33e090290af7ec33c6',
'title': 'A Conversation With President Obama',
'description': 'md5:699d0652a350cf3e491cd15cc745b5da',
'duration': 1534,
'timestamp': 1418075340,
'upload_date': '20141208',
@@ -35,7 +35,7 @@ class BetIE(InfoExtractor):
{
'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
'info_dict': {
'id': 'bcd1b1df-673a-42cf-8d01-b282db608f2d',
'id': 'news/national/2014/justice-for-ferguson-a-community-reacts',
'display_id': 'justice-for-ferguson-a-community-reacts',
'ext': 'flv',
'title': 'Justice for Ferguson: A Community Reacts',
@@ -61,6 +61,9 @@ class BetIE(InfoExtractor):
[r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
webpage, 'media URL'))
video_id = self._search_regex(
r'/video/(.*)/_jcr_content/', media_url, 'video id')
mrss = self._download_xml(media_url, display_id)
item = mrss.find('./channel/item')
@@ -75,8 +78,6 @@ class BetIE(InfoExtractor):
description = xpath_text(
item, './description', 'description', fatal=False)
video_id = xpath_text(item, './guid', 'video id', fatal=False)
timestamp = parse_iso8601(xpath_text(
item, xpath_with_ns('./dc:date', NS_MAP),
'upload date', fatal=False))

View File

@@ -2,7 +2,10 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import (
int_or_none,
fix_xml_ampersands,
)
class BildIE(InfoExtractor):
@@ -15,7 +18,7 @@ class BildIE(InfoExtractor):
'id': '38184146',
'ext': 'mp4',
'title': 'BILD hat sie getestet',
'thumbnail': 'http://bilder.bild.de/fotos/stand-das-koennen-die-neuen-ipads-38184138/Bild/1.bild.jpg',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 196,
'description': 'Mit dem iPad Air 2 und dem iPad Mini 3 hat Apple zwei neue Tablet-Modelle präsentiert. BILD-Reporter Sven Stein durfte die Geräte bereits testen. ',
}
@@ -25,7 +28,7 @@ class BildIE(InfoExtractor):
video_id = self._match_id(url)
xml_url = url.split(".bild.html")[0] + ",view=xml.bild.xml"
doc = self._download_xml(xml_url, video_id)
doc = self._download_xml(xml_url, video_id, transform_source=fix_xml_ampersands)
duration = int_or_none(doc.attrib.get('duration'), scale=1000)

View File

@@ -2,34 +2,47 @@
from __future__ import unicode_literals
import re
import itertools
import json
import xml.etree.ElementTree as ET
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_strdate,
ExtractorError,
)
class BiliBiliIE(InfoExtractor):
_VALID_URL = r'http://www\.bilibili\.(?:tv|com)/video/av(?P<id>[0-9]+)/'
_TEST = {
_TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/',
'md5': '2c301e4dab317596e837c3e7633e7d86',
'info_dict': {
'id': '1074402',
'id': '1074402_part1',
'ext': 'flv',
'title': '【金坷垃】金泡沫',
'duration': 308,
'upload_date': '20140420',
'thumbnail': 're:^https?://.+\.jpg',
},
}
}, {
'url': 'http://www.bilibili.com/video/av1041170/',
'info_dict': {
'id': '1041170',
'title': '【BD1080P】刀语【诸神&异域】',
},
'playlist_count': 9,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if self._search_regex(r'(此视频不存在或被删除)', webpage, 'error message', default=None):
raise ExtractorError('The video does not exist or was deleted', expected=True)
video_code = self._search_regex(
r'(?s)<div itemprop="video".*?>(.*?)</div>', webpage, 'video code')
@@ -54,19 +67,22 @@ class BiliBiliIE(InfoExtractor):
cid = self._search_regex(r'cid=(\d+)', webpage, 'cid')
lq_doc = self._download_xml(
entries = []
lq_page = self._download_webpage(
'http://interface.bilibili.com/v_cdn_play?appkey=1&cid=%s' % cid,
video_id,
note='Downloading LQ video info'
)
lq_durl = lq_doc.find('./durl')
formats = [{
'format_id': 'lq',
'quality': 1,
'url': lq_durl.find('./url').text,
'filesize': int_or_none(
lq_durl.find('./size'), get_attr='text'),
}]
try:
err_info = json.loads(lq_page)
raise ExtractorError(
'BiliBili said: ' + err_info['error_text'], expected=True)
except ValueError:
pass
lq_doc = ET.fromstring(lq_page)
lq_durls = lq_doc.findall('./durl')
hq_doc = self._download_xml(
'http://interface.bilibili.com/playurl?appkey=1&cid=%s' % cid,
@@ -75,22 +91,45 @@ class BiliBiliIE(InfoExtractor):
fatal=False,
)
if hq_doc is not False:
hq_durl = hq_doc.find('./durl')
formats.append({
'format_id': 'hq',
'quality': 2,
'ext': 'flv',
'url': hq_durl.find('./url').text,
hq_durls = hq_doc.findall('./durl')
assert len(lq_durls) == len(hq_durls)
else:
hq_durls = itertools.repeat(None)
i = 1
for lq_durl, hq_durl in zip(lq_durls, hq_durls):
formats = [{
'format_id': 'lq',
'quality': 1,
'url': lq_durl.find('./url').text,
'filesize': int_or_none(
hq_durl.find('./size'), get_attr='text'),
lq_durl.find('./size'), get_attr='text'),
}]
if hq_durl is not None:
formats.append({
'format_id': 'hq',
'quality': 2,
'ext': 'flv',
'url': hq_durl.find('./url').text,
'filesize': int_or_none(
hq_durl.find('./size'), get_attr='text'),
})
self._sort_formats(formats)
entries.append({
'id': '%s_part%d' % (video_id, i),
'title': title,
'formats': formats,
'duration': duration,
'upload_date': upload_date,
'thumbnail': thumbnail,
})
self._sort_formats(formats)
i += 1
return {
'_type': 'multi_video',
'entries': entries,
'id': video_id,
'title': title,
'formats': formats,
'duration': duration,
'upload_date': upload_date,
'thumbnail': thumbnail,
'title': title
}

View File

@@ -102,6 +102,15 @@ class BlipTVIE(InfoExtractor):
},
]
@staticmethod
def _extract_url(webpage):
mobj = re.search(r'<meta\s[^>]*https?://api\.blip\.tv/\w+/redirect/\w+/(\d+)', webpage)
if mobj:
return 'http://blip.tv/a/a-' + mobj.group(1)
mobj = re.search(r'<(?:iframe|embed|object)\s[^>]*(https?://(?:\w+\.)?blip\.tv/(?:play/|api\.swf#)[a-zA-Z0-9_]+)', webpage)
if mobj:
return mobj.group(1)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
lookup_id = mobj.group('lookup_id')
@@ -172,6 +181,7 @@ class BlipTVIE(InfoExtractor):
'width': int_or_none(media_content.get('width')),
'height': int_or_none(media_content.get('height')),
})
self._check_formats(formats, video_id)
self._sort_formats(formats)
subtitles = self.extract_subtitles(video_id, subtitles_urls)

View File

@@ -6,32 +6,39 @@ from .common import InfoExtractor
class BloombergIE(InfoExtractor):
_VALID_URL = r'https?://www\.bloomberg\.com/video/(?P<id>.+?)\.html'
_VALID_URL = r'https?://www\.bloomberg\.com/news/videos/[^/]+/(?P<id>[^/?#]+)'
_TEST = {
'url': 'http://www.bloomberg.com/video/shah-s-presentation-on-foreign-exchange-strategies-qurhIVlJSB6hzkVi229d8g.html',
'url': 'http://www.bloomberg.com/news/videos/b/aaeae121-5949-481e-a1ce-4562db6f5df2',
# The md5 checksum changes
'info_dict': {
'id': 'qurhIVlJSB6hzkVi229d8g',
'ext': 'flv',
'title': 'Shah\'s Presentation on Foreign-Exchange Strategies',
'description': 'md5:0681e0d30dcdfc6abf34594961d8ea88',
'description': 'md5:a8ba0302912d03d246979735c17d2761',
},
}
def _real_extract(self, url):
name = self._match_id(url)
webpage = self._download_webpage(url, name)
f4m_url = self._search_regex(
r'<source src="(https?://[^"]+\.f4m.*?)"', webpage,
'f4m url')
video_id = self._search_regex(r'"bmmrId":"(.+?)"', webpage, 'id')
title = re.sub(': Video$', '', self._og_search_title(webpage))
embed_info = self._download_json(
'http://www.bloomberg.com/api/embed?id=%s' % video_id, video_id)
formats = []
for stream in embed_info['streams']:
if stream["muxing_format"] == "TS":
formats.extend(self._extract_m3u8_formats(stream['url'], video_id))
else:
formats.extend(self._extract_f4m_formats(stream['url'], video_id))
self._sort_formats(formats)
return {
'id': name.split('-')[-1],
'id': video_id,
'title': title,
'formats': self._extract_f4m_formats(f4m_url, name),
'formats': formats,
'description': self._og_search_description(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
}

View File

@@ -16,27 +16,38 @@ class BRIE(InfoExtractor):
_TESTS = [
{
'url': 'http://www.br.de/mediathek/video/sendungen/heimatsound/heimatsound-festival-2014-trailer-100.html',
'md5': '93556dd2bcb2948d9259f8670c516d59',
'url': 'http://www.br.de/mediathek/video/sendungen/abendschau/betriebliche-altersvorsorge-104.html',
'md5': '83a0477cf0b8451027eb566d88b51106',
'info_dict': {
'id': '25e279aa-1ffd-40fd-9955-5325bd48a53a',
'id': '48f656ef-287e-486f-be86-459122db22cc',
'ext': 'mp4',
'title': 'Wenn das Traditions-Theater wackelt',
'description': 'Heimatsound-Festival 2014: Wenn das Traditions-Theater wackelt',
'duration': 34,
'uploader': 'BR',
'upload_date': '20140802',
'title': 'Die böse Überraschung',
'description': 'Betriebliche Altersvorsorge: Die böse Überraschung',
'duration': 180,
'uploader': 'Reinhard Weber',
'upload_date': '20150422',
}
},
{
'url': 'http://www.br.de/nachrichten/schaeuble-haushaltsentwurf-bundestag-100.html',
'md5': '3db0df1a9a9cd9fa0c70e6ea8aa8e820',
'url': 'http://www.br.de/nachrichten/oberbayern/inhalt/muenchner-polizeipraesident-schreiber-gestorben-100.html',
'md5': 'a44396d73ab6a68a69a568fae10705bb',
'info_dict': {
'id': 'c6aae3de-2cf9-43f2-957f-f17fef9afaab',
'id': 'a4b83e34-123d-4b81-9f4e-c0d3121a4e05',
'ext': 'mp4',
'title': 'Manfred Schreiber ist tot',
'description': 'Abendschau kompakt: Manfred Schreiber ist tot',
'duration': 26,
}
},
{
'url': 'http://www.br.de/radio/br-klassik/sendungen/allegro/premiere-urauffuehrung-the-land-2015-dance-festival-muenchen-100.html',
'md5': '8b5b27c0b090f3b35eac4ab3f7a73d3d',
'info_dict': {
'id': '74c603c9-26d3-48bb-b85b-079aeed66e0b',
'ext': 'aac',
'title': '"Keine neuen Schulden im nächsten Jahr"',
'description': 'Haushaltsentwurf: "Keine neuen Schulden im nächsten Jahr"',
'duration': 64,
'title': 'Kurzweilig und sehr bewegend',
'description': '"The Land" von Peeping Tom: Kurzweilig und sehr bewegend',
'duration': 296,
}
},
{

View File

@@ -117,7 +117,10 @@ class BrightcoveIE(InfoExtractor):
object_str = re.sub(r'(<object[^>]*)(xmlns=".*?")', r'\1', object_str)
object_str = fix_xml_ampersands(object_str)
object_doc = xml.etree.ElementTree.fromstring(object_str.encode('utf-8'))
try:
object_doc = xml.etree.ElementTree.fromstring(object_str.encode('utf-8'))
except xml.etree.ElementTree.ParseError:
return
fv_el = find_xpath_attr(object_doc, './param', 'name', 'flashVars')
if fv_el is not None:
@@ -153,6 +156,28 @@ class BrightcoveIE(InfoExtractor):
linkBase = find_param('linkBaseURL')
if linkBase is not None:
params['linkBaseURL'] = linkBase
return cls._make_brightcove_url(params)
@classmethod
def _build_brighcove_url_from_js(cls, object_js):
# The layout of JS is as follows:
# customBC.createVideo = function (width, height, playerID, playerKey, videoPlayer, VideoRandomID) {
# // build Brightcove <object /> XML
# }
m = re.search(
r'''(?x)customBC.\createVideo\(
.*? # skipping width and height
["\'](?P<playerID>\d+)["\']\s*,\s* # playerID
["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s* # playerKey begins with AQ and is 50 characters
# in length, however it's appended to itself
# in places, so truncate
["\'](?P<videoID>\d+)["\'] # @videoPlayer
''', object_js)
if m:
return cls._make_brightcove_url(m.groupdict())
@classmethod
def _make_brightcove_url(cls, params):
data = compat_urllib_parse.urlencode(params)
return cls._FEDERATED_URL_TEMPLATE % data
@@ -169,7 +194,7 @@ class BrightcoveIE(InfoExtractor):
"""Return a list of all Brightcove URLs from the webpage """
url_m = re.search(
r'<meta\s+property="og:video"\s+content="(https?://(?:secure|c)\.brightcove.com/[^"]+)"',
r'<meta\s+property=[\'"]og:video[\'"]\s+content=[\'"](https?://(?:secure|c)\.brightcove.com/[^\'"]+)[\'"]',
webpage)
if url_m:
url = unescapeHTML(url_m.group(1))
@@ -183,9 +208,14 @@ class BrightcoveIE(InfoExtractor):
(?:
[^>]+?class=[\'"][^>]*?BrightcoveExperience.*?[\'"] |
[^>]*?>\s*<param\s+name="movie"\s+value="https?://[^/]*brightcove\.com/
).+?</object>''',
).+?>\s*</object>''',
webpage)
return [cls._build_brighcove_url(m) for m in matches]
if matches:
return list(filter(None, [cls._build_brighcove_url(m) for m in matches]))
return list(filter(None, [
cls._build_brighcove_url_from_js(custom_bc)
for custom_bc in re.findall(r'(customBC\.createVideo\(.+?\);)', webpage)]))
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})

View File

@@ -16,7 +16,7 @@ class BYUtvIE(InfoExtractor):
'ext': 'mp4',
'description': 'md5:5438d33774b6bdc662f9485a340401cc',
'title': 'Season 5 Episode 5',
'thumbnail': 're:^https?://.*promo.*'
'thumbnail': 're:^https?://.*\.jpg$'
},
'params': {
'skip_download': True,

View File

@@ -25,14 +25,14 @@ class CanalplusIE(InfoExtractor):
}
_TESTS = [{
'url': 'http://www.canalplus.fr/c-infos-documentaires/pid1830-c-zapping.html?vid=922470',
'md5': '3db39fb48b9685438ecf33a1078023e4',
'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1263092',
'md5': 'b3481d7ca972f61e37420798d0a9d934',
'info_dict': {
'id': '922470',
'id': '1263092',
'ext': 'flv',
'title': 'Zapping - 26/08/13',
'description': 'Le meilleur de toutes les chaînes, tous les jours.\nEmission du 26 août 2013',
'upload_date': '20130826',
'title': 'Le Zapping - 13/05/15',
'description': 'md5:09738c0d06be4b5d06a0940edb0da73f',
'upload_date': '20150513',
},
}, {
'url': 'http://www.piwiplus.fr/videos-piwi/pid1405-le-labyrinthe-boing-super-ranger.html?vid=1108190',
@@ -56,7 +56,7 @@ class CanalplusIE(InfoExtractor):
'skip': 'videos get deleted after a while',
}, {
'url': 'http://www.itele.fr/france/video/aubervilliers-un-lycee-en-colere-111559',
'md5': '65aa83ad62fe107ce29e564bb8712580',
'md5': 'f3a46edcdf28006598ffaf5b30e6a2d4',
'info_dict': {
'id': '1213714',
'ext': 'flv',

View File

@@ -4,12 +4,13 @@ from .common import InfoExtractor
class CBSIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cbs\.com/shows/[^/]+/(?:video|artist)/(?P<id>[^/]+)/.*'
_VALID_URL = r'https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/(?:video|artist)|colbertlateshow\.com/(?:video|podcasts))/[^/]+/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
'info_dict': {
'id': '4JUVEwq3wUT7',
'display_id': 'connect-chat-feat-garth-brooks',
'ext': 'flv',
'title': 'Connect Chat feat. Garth Brooks',
'description': 'Connect with country music singer Garth Brooks, as he chats with fans on Wednesday November 27, 2013. Be sure to tune in to Garth Brooks: Live from Las Vegas, Friday November 29, at 9/8c on CBS!',
@@ -24,6 +25,7 @@ class CBSIE(InfoExtractor):
'url': 'http://www.cbs.com/shows/liveonletterman/artist/221752/st-vincent/',
'info_dict': {
'id': 'WWF_5KqY3PK1',
'display_id': 'st-vincent',
'ext': 'flv',
'title': 'Live on Letterman - St. Vincent',
'description': 'Live On Letterman: St. Vincent in concert from New York\'s Ed Sullivan Theater on Tuesday, July 16, 2014.',
@@ -34,12 +36,23 @@ class CBSIE(InfoExtractor):
'skip_download': True,
},
'_skip': 'Blocked outside the US',
}, {
'url': 'http://colbertlateshow.com/video/8GmB0oY0McANFvp2aEffk9jZZZ2YyXxy/the-colbeard/',
'only_matching': True,
}, {
'url': 'http://www.colbertlateshow.com/podcasts/dYSwjqPs_X1tvbV_P2FcPWRa_qT6akTC/in-the-bad-room-with-stephen/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
real_id = self._search_regex(
r"video\.settings\.pid\s*=\s*'([^']+)';",
[r"video\.settings\.pid\s*=\s*'([^']+)';", r"cbsplayer\.pid\s*=\s*'([^']+)';"],
webpage, 'real video ID')
return self.url_result('theplatform:%s' % real_id)
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': 'theplatform:%s' % real_id,
'display_id': display_id,
}

View File

@@ -32,7 +32,7 @@ class CBSNewsIE(InfoExtractor):
'id': 'fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack',
'ext': 'flv',
'title': 'Fort Hood shooting: Army downplays mental illness as cause of attack',
'thumbnail': 'http://cbsnews2.cbsistatic.com/hub/i/r/2014/04/04/0c9fbc66-576b-41ca-8069-02d122060dd2/thumbnail/140x90/6dad7a502f88875ceac38202984b6d58/en-0404-werner-replace-640x360.jpg',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 205,
},
'params': {

View File

@@ -16,7 +16,7 @@ class CCCIE(InfoExtractor):
_TEST = {
'url': 'http://media.ccc.de/browse/congress/2013/30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor.html#video',
'md5': '205a365d0d57c0b1e43a12c9ffe8f9be',
'md5': '3a1eda8f3a29515d27f5adb967d7e740',
'info_dict': {
'id': '20131228183',
'ext': 'mp4',
@@ -51,7 +51,7 @@ class CCCIE(InfoExtractor):
matches = re.finditer(r'''(?xs)
<(?:span|div)\s+class='label\s+filetype'>(?P<format>.*?)</(?:span|div)>\s*
<a\s+href='(?P<http_url>[^']+)'>\s*
<a\s+download\s+href='(?P<http_url>[^']+)'>\s*
(?:
.*?
<a\s+href='(?P<torrent_url>[^']+\.torrent)'

View File

@@ -57,7 +57,7 @@ class ChilloutzoneIE(InfoExtractor):
base64_video_info = self._html_search_regex(
r'var cozVidData = "(.+?)";', webpage, 'video data')
decoded_video_info = base64.b64decode(base64_video_info).decode("utf-8")
decoded_video_info = base64.b64decode(base64_video_info.encode('utf-8')).decode('utf-8')
video_info_dict = json.loads(decoded_video_info)
# get video information from dict

View File

@@ -0,0 +1,110 @@
# encoding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import ExtractorError
from .bliptv import BlipTVIE
class CinemassacreIE(InfoExtractor):
_VALID_URL = 'https?://(?:www\.)?cinemassacre\.com/(?P<date_y>[0-9]{4})/(?P<date_m>[0-9]{2})/(?P<date_d>[0-9]{2})/(?P<display_id>[^?#/]+)'
_TESTS = [
{
'url': 'http://cinemassacre.com/2012/11/10/avgn-the-movie-trailer/',
'md5': 'fde81fbafaee331785f58cd6c0d46190',
'info_dict': {
'id': 'Cinemassacre-19911',
'ext': 'mp4',
'upload_date': '20121110',
'title': '“Angry Video Game Nerd: The Movie” Trailer',
'description': 'md5:fb87405fcb42a331742a0dce2708560b',
},
},
{
'url': 'http://cinemassacre.com/2013/10/02/the-mummys-hand-1940',
'md5': 'd72f10cd39eac4215048f62ab477a511',
'info_dict': {
'id': 'Cinemassacre-521be8ef82b16',
'ext': 'mp4',
'upload_date': '20131002',
'title': 'The Mummys Hand (1940)',
},
},
{
# blip.tv embedded video
'url': 'http://cinemassacre.com/2006/12/07/chronologically-confused-about-bad-movie-and-video-game-sequel-titles/',
'md5': 'ca9b3c8dd5a66f9375daeb5135f5a3de',
'info_dict': {
'id': '4065369',
'ext': 'flv',
'title': 'AVGN: Chronologically Confused about Bad Movie and Video Game Sequel Titles',
'upload_date': '20061207',
'uploader': 'cinemassacre',
'uploader_id': '250778',
'timestamp': 1283233867,
'description': 'md5:0a108c78d130676b207d0f6d029ecffd',
}
},
{
# Youtube embedded video
'url': 'http://cinemassacre.com/2006/09/01/mckids/',
'md5': '6eb30961fa795fedc750eac4881ad2e1',
'info_dict': {
'id': 'FnxsNhuikpo',
'ext': 'mp4',
'upload_date': '20060901',
'uploader': 'Cinemassacre Extras',
'description': 'md5:de9b751efa9e45fbaafd9c8a1123ed53',
'uploader_id': 'Cinemassacre',
'title': 'AVGN: McKids',
}
},
{
'url': 'http://cinemassacre.com/2015/05/25/mario-kart-64-nintendo-64-james-mike-mondays/',
'md5': '1376908e49572389e7b06251a53cdd08',
'info_dict': {
'id': 'Cinemassacre-555779690c440',
'ext': 'mp4',
'description': 'Lets Play Mario Kart 64 !! Mario Kart 64 is a classic go-kart racing game released for the Nintendo 64 (N64). Today James & Mike do 4 player Battle Mode with Kyle and Bootsy!',
'title': 'Mario Kart 64 (Nintendo 64) James & Mike Mondays',
'upload_date': '20150525',
}
}
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('display_id')
video_date = mobj.group('date_y') + mobj.group('date_m') + mobj.group('date_d')
webpage = self._download_webpage(url, display_id)
playerdata_url = self._search_regex(
[
r'src="(http://(?:player2\.screenwavemedia\.com|player\.screenwavemedia\.com/play)/[a-zA-Z]+\.php\?[^"]*\bid=.+?)"',
r'<iframe[^>]+src="((?:https?:)?//(?:[^.]+\.)?youtube\.com/.+?)"',
],
webpage, 'player data URL', default=None)
if not playerdata_url:
playerdata_url = BlipTVIE._extract_url(webpage)
if not playerdata_url:
raise ExtractorError('Unable to find player data')
video_title = self._html_search_regex(
r'<title>(?P<title>.+?)\|', webpage, 'title')
video_description = self._html_search_regex(
r'<div class="entry-content">(?P<description>.+?)</div>',
webpage, 'description', flags=re.DOTALL, fatal=False)
video_thumbnail = self._og_search_thumbnail(webpage)
return {
'_type': 'url_transparent',
'display_id': display_id,
'title': video_title,
'description': video_description,
'upload_date': video_date,
'thumbnail': video_thumbnail,
'url': playerdata_url,
}

View File

@@ -11,7 +11,7 @@ from ..utils import (
class CNETIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cnet\.com/videos/(?P<id>[^/]+)/'
_TEST = {
_TESTS = [{
'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
'info_dict': {
'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
@@ -25,7 +25,20 @@ class CNETIE(InfoExtractor):
'params': {
'skip_download': 'requires rtmpdump',
}
}
}, {
'url': 'http://www.cnet.com/videos/whiny-pothole-tweets-at-local-government-when-hit-by-cars-tomorrow-daily-187/',
'info_dict': {
'id': '56527b93-d25d-44e3-b738-f989ce2e49ba',
'ext': 'flv',
'description': 'Khail and Ashley wonder what other civic woes can be solved by self-tweeting objects, investigate a new kind of VR camera and watch an origami robot self-assemble, walk, climb, dig and dissolve. #TDPothole',
'uploader_id': 'b163284d-6b73-44fc-b3e6-3da66c392d40',
'uploader': 'Ashley Esqueda',
'title': 'Whiny potholes tweet at local government when hit by cars (Tomorrow Daily 187)',
},
'params': {
'skip_download': True, # requires rtmpdump
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -42,7 +55,7 @@ class CNETIE(InfoExtractor):
raise ExtractorError('Cannot find video data')
mpx_account = data['config']['players']['default']['mpx_account']
vid = vdata['files']['rtmp']
vid = vdata['files'].get('rtmp', vdata['files']['hds'])
tp_link = 'http://link.theplatform.com/s/%s/%s' % (mpx_account, vid)
video_id = vdata['id']

View File

@@ -12,7 +12,7 @@ from ..utils import (
class CNNIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:(?:edition|www)\.)?cnn\.com/video/(?:data/.+?|\?)/
(?P<path>.+?/(?P<title>[^/]+?)(?:\.(?:cnn|hln)(?:-ap)?|(?=&)))'''
(?P<path>.+?/(?P<title>[^/]+?)(?:\.(?:[a-z\-]+)|(?=&)))'''
_TESTS = [{
'url': 'http://edition.cnn.com/video/?/video/sports/2013/06/09/nadal-1-on-1.cnn',
@@ -45,6 +45,12 @@ class CNNIE(InfoExtractor):
'description': 'md5:e7223a503315c9f150acac52e76de086',
'upload_date': '20141222',
}
}, {
'url': 'http://cnn.com/video/?/video/politics/2015/03/27/pkg-arizona-senator-church-attendance-mandatory.ktvk',
'only_matching': True,
}, {
'url': 'http://cnn.com/video/?/video/us/2015/04/06/dnt-baker-refuses-anti-gay-order.wkmg',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -201,7 +201,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
uri = mMovieParams[0][1]
# Correct cc.com in uri
uri = re.sub(r'(episode:[^.]+)(\.cc)?\.com', r'\1.cc.com', uri)
uri = re.sub(r'(episode:[^.]+)(\.cc)?\.com', r'\1.com', uri)
index_url = 'http://%s.cc.com/feeds/mrss?%s' % (show_name, compat_urllib_parse.urlencode({'uri': uri}))
idoc = self._download_xml(

View File

@@ -23,6 +23,7 @@ from ..compat import (
)
from ..utils import (
age_restricted,
bug_reports_message,
clean_html,
compiled_regex_type,
ExtractorError,
@@ -46,7 +47,7 @@ class InfoExtractor(object):
information possibly downloading the video to the file system, among
other possible outcomes.
The type field determines the the type of the result.
The type field determines the type of the result.
By far the most common value (and the default if _type is missing) is
"video", which indicates a single video.
@@ -110,11 +111,8 @@ class InfoExtractor(object):
(quality takes higher priority)
-1 for default (order by other properties),
-2 or smaller for less than default.
* http_method HTTP method to use for the download.
* http_headers A dictionary of additional HTTP headers
to add to the request.
* http_post_data Additional data to send with a POST
request.
* stretched_ratio If given and not 1, indicates that the
video's pixels are not square.
width : height ratio as float.
@@ -324,7 +322,7 @@ class InfoExtractor(object):
self._downloader.report_warning(errmsg)
return False
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True):
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None):
""" Returns a tuple (page content as string, URL handle) """
# Strip hashes from the URL (#1038)
if isinstance(url_or_request, (compat_str, str)):
@@ -334,14 +332,11 @@ class InfoExtractor(object):
if urlh is False:
assert not fatal
return False
content = self._webpage_read_content(urlh, url_or_request, video_id, note, errnote, fatal)
content = self._webpage_read_content(urlh, url_or_request, video_id, note, errnote, fatal, encoding=encoding)
return (content, urlh)
def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True, prefix=None):
content_type = urlh.headers.get('Content-Type', '')
webpage_bytes = urlh.read()
if prefix is not None:
webpage_bytes = prefix + webpage_bytes
@staticmethod
def _guess_encoding_from_content(content_type, webpage_bytes):
m = re.match(r'[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+\s*;\s*charset=(.+)', content_type)
if m:
encoding = m.group(1)
@@ -354,6 +349,16 @@ class InfoExtractor(object):
encoding = 'utf-16'
else:
encoding = 'utf-8'
return encoding
def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True, prefix=None, encoding=None):
content_type = urlh.headers.get('Content-Type', '')
webpage_bytes = urlh.read()
if prefix is not None:
webpage_bytes = prefix + webpage_bytes
if not encoding:
encoding = self._guess_encoding_from_content(content_type, webpage_bytes)
if self._downloader.params.get('dump_intermediate_pages', False):
try:
url = url_or_request.get_full_url()
@@ -410,13 +415,13 @@ class InfoExtractor(object):
return content
def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5):
def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5, encoding=None):
""" Returns the data of the page as a string """
success = False
try_count = 0
while success is False:
try:
res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal)
res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal, encoding=encoding)
success = True
except compat_http_client.IncompleteRead as e:
try_count += 1
@@ -431,10 +436,10 @@ class InfoExtractor(object):
def _download_xml(self, url_or_request, video_id,
note='Downloading XML', errnote='Unable to download XML',
transform_source=None, fatal=True):
transform_source=None, fatal=True, encoding=None):
"""Return the xml as an xml.etree.ElementTree.Element"""
xml_string = self._download_webpage(
url_or_request, video_id, note, errnote, fatal=fatal)
url_or_request, video_id, note, errnote, fatal=fatal, encoding=encoding)
if xml_string is False:
return xml_string
if transform_source:
@@ -445,9 +450,10 @@ class InfoExtractor(object):
note='Downloading JSON metadata',
errnote='Unable to download JSON metadata',
transform_source=None,
fatal=True):
fatal=True, encoding=None):
json_string = self._download_webpage(
url_or_request, video_id, note, errnote, fatal=fatal)
url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding)
if (not fatal) and json_string is False:
return None
return self._parse_json(
@@ -492,7 +498,7 @@ class InfoExtractor(object):
# Methods for following #608
@staticmethod
def url_result(url, ie=None, video_id=None):
def url_result(url, ie=None, video_id=None, video_title=None):
"""Returns a url that points to a page that should be processed"""
# TODO: ie should be the class used for getting the info
video_info = {'_type': 'url',
@@ -500,6 +506,8 @@ class InfoExtractor(object):
'ie_key': ie}
if video_id is not None:
video_info['id'] = video_id
if video_title is not None:
video_info['title'] = video_title
return video_info
@staticmethod
@@ -546,8 +554,7 @@ class InfoExtractor(object):
elif fatal:
raise RegexNotFoundError('Unable to extract %s' % _name)
else:
self._downloader.report_warning('unable to extract %s; '
'please report this issue on http://yt-dl.org/bug' % _name)
self._downloader.report_warning('unable to extract %s' % _name + bug_reports_message())
return None
def _html_search_regex(self, pattern, string, name, default=_NO_DEFAULT, fatal=True, flags=0, group=None):
@@ -562,7 +569,7 @@ class InfoExtractor(object):
def _get_login_info(self):
"""
Get the the login info as (username, password)
Get the login info as (username, password)
It will look in the netrc file using the _NETRC_MACHINE value
If there's no info available, return (None, None)
"""
@@ -698,7 +705,7 @@ class InfoExtractor(object):
return self._html_search_meta('twitter:player', html,
'twitter card player')
def _sort_formats(self, formats):
def _sort_formats(self, formats, field_preference=None):
if not formats:
raise ExtractorError('No video formats found')
@@ -708,6 +715,9 @@ class InfoExtractor(object):
if not f.get('ext') and 'url' in f:
f['ext'] = determine_ext(f['url'])
if isinstance(field_preference, (list, tuple)):
return tuple(f.get(field) if f.get(field) is not None else -1 for field in field_preference)
preference = f.get('preference')
if preference is None:
proto = f.get('protocol')
@@ -754,7 +764,7 @@ class InfoExtractor(object):
f.get('fps') if f.get('fps') is not None else -1,
f.get('filesize_approx') if f.get('filesize_approx') is not None else -1,
f.get('source_preference') if f.get('source_preference') is not None else -1,
f.get('format_id'),
f.get('format_id') if f.get('format_id') is not None else '',
)
formats.sort(key=_formats_key)
@@ -776,8 +786,8 @@ class InfoExtractor(object):
return True
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError):
self.report_warning(
'%s URL is invalid, skipping' % item, video_id)
self.to_screen(
'%s: %s URL is invalid, skipping' % (video_id, item))
return False
raise
@@ -822,7 +832,7 @@ class InfoExtractor(object):
(media_el.attrib.get('href') or media_el.attrib.get('url')))
tbr = int_or_none(media_el.attrib.get('bitrate'))
formats.append({
'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
'format_id': '-'.join(filter(None, [f4m_id, compat_str(i if tbr is None else tbr)])),
'url': manifest_url,
'ext': 'flv',
'tbr': tbr,
@@ -836,7 +846,7 @@ class InfoExtractor(object):
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
entry_protocol='m3u8', preference=None,
m3u8_id=None):
m3u8_id=None, note=None, errnote=None):
formats = [{
'format_id': '-'.join(filter(None, [m3u8_id, 'meta'])),
@@ -855,8 +865,8 @@ class InfoExtractor(object):
m3u8_doc = self._download_webpage(
m3u8_url, video_id,
note='Downloading m3u8 information',
errnote='Failed to download m3u8 information')
note=note or 'Downloading m3u8 information',
errnote=errnote or 'Failed to download m3u8 information')
last_info = None
last_media = None
kv_rex = re.compile(
@@ -886,7 +896,7 @@ class InfoExtractor(object):
format_id = []
if m3u8_id:
format_id.append(m3u8_id)
last_media_name = last_media.get('NAME') if last_media else None
last_media_name = last_media.get('NAME') if last_media and last_media.get('TYPE') != 'SUBTITLES' else None
format_id.append(last_media_name if last_media_name else '%d' % (tbr if tbr else len(formats)))
f = {
'format_id': '-'.join(format_id),
@@ -1062,9 +1072,6 @@ class InfoExtractor(object):
def _get_automatic_captions(self, *args, **kwargs):
raise NotImplementedError("This method must be implemented by subclasses")
def _subtitles_timecode(self, seconds):
return '%02d:%02d:%02d.%03d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 1000)
class SearchInfoExtractor(InfoExtractor):
"""

View File

@@ -11,39 +11,65 @@ from ..utils import (
class CrackedIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cracked\.com/video_(?P<id>\d+)_[\da-z-]+\.html'
_TEST = {
'url': 'http://www.cracked.com/video_19006_4-plot-holes-you-didnt-notice-in-your-favorite-movies.html',
'md5': '4b29a5eeec292cd5eca6388c7558db9e',
_TESTS = [{
'url': 'http://www.cracked.com/video_19070_if-animal-actors-got-e21-true-hollywood-stories.html',
'md5': '89b90b9824e3806ca95072c4d78f13f7',
'info_dict': {
'id': '19006',
'id': '19070',
'ext': 'mp4',
'title': '4 Plot Holes You Didn\'t Notice in Your Favorite Movies',
'description': 'md5:3b909e752661db86007d10e5ec2df769',
'timestamp': 1405659600,
'upload_date': '20140718',
'title': 'If Animal Actors Got E! True Hollywood Stories',
'timestamp': 1404954000,
'upload_date': '20140710',
}
}
}, {
# youtube embed
'url': 'http://www.cracked.com/video_19006_4-plot-holes-you-didnt-notice-in-your-favorite-movies.html',
'md5': 'ccd52866b50bde63a6ef3b35016ba8c7',
'info_dict': {
'id': 'EjI00A3rZD0',
'ext': 'mp4',
'title': "4 Plot Holes You Didn't Notice in Your Favorite Movies - The Spit Take",
'description': 'md5:c603708c718b796fe6079e2b3351ffc7',
'upload_date': '20140725',
'uploader_id': 'Cracked',
'uploader': 'Cracked',
}
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
youtube_url = self._search_regex(
r'<iframe[^>]+src="((?:https?:)?//www\.youtube\.com/embed/[^"]+)"',
webpage, 'youtube url', default=None)
if youtube_url:
return self.url_result(youtube_url, 'Youtube')
video_url = self._html_search_regex(
[r'var\s+CK_vidSrc\s*=\s*"([^"]+)"', r'<video\s+src="([^"]+)"'], webpage, 'video URL')
[r'var\s+CK_vidSrc\s*=\s*"([^"]+)"', r'<video\s+src="([^"]+)"'],
webpage, 'video URL')
title = self._og_search_title(webpage)
description = self._og_search_description(webpage)
title = self._search_regex(
[r'property="?og:title"?\s+content="([^"]+)"', r'class="?title"?>([^<]+)'],
webpage, 'title')
timestamp = self._html_search_regex(r'<time datetime="([^"]+)"', webpage, 'upload date', fatal=False)
description = self._search_regex(
r'name="?(?:og:)?description"?\s+content="([^"]+)"',
webpage, 'description', default=None)
timestamp = self._html_search_regex(
r'"date"\s*:\s*"([^"]+)"', webpage, 'upload date', fatal=False)
if timestamp:
timestamp = parse_iso8601(timestamp[:-6])
view_count = str_to_int(self._html_search_regex(
r'<span class="views" id="viewCounts">([\d,\.]+) Views</span>', webpage, 'view count', fatal=False))
r'<span\s+class="?views"? id="?viewCounts"?>([\d,\.]+) Views</span>',
webpage, 'view count', fatal=False))
comment_count = str_to_int(self._html_search_regex(
r'<span id="commentCounts">([\d,\.]+)</span>', webpage, 'comment count', fatal=False))
r'<span\s+id="?commentCounts"?>([\d,\.]+)</span>',
webpage, 'comment count', fatal=False))
m = re.search(r'_(?P<width>\d+)X(?P<height>\d+)\.mp4$', video_url)
if m:

View File

@@ -0,0 +1,60 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
qualities,
)
class CrooksAndLiarsIE(InfoExtractor):
_VALID_URL = r'https?://embed\.crooksandliars\.com/(?:embed|v)/(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'https://embed.crooksandliars.com/embed/8RUoRhRi',
'info_dict': {
'id': '8RUoRhRi',
'ext': 'mp4',
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
'thumbnail': 're:^https?://.*\.jpg',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',
'duration': 236,
}
}, {
'url': 'http://embed.crooksandliars.com/v/MTE3MjUtMzQ2MzA',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://embed.crooksandliars.com/embed/%s' % video_id, video_id)
manifest = self._parse_json(
self._search_regex(
r'var\s+manifest\s*=\s*({.+?})\n', webpage, 'manifest JSON'),
video_id)
quality = qualities(('webm_low', 'mp4_low', 'webm_high', 'mp4_high'))
formats = [{
'url': item['url'],
'format_id': item['type'],
'quality': quality(item['type']),
} for item in manifest['flavors'] if item['mime'].startswith('video/')]
self._sort_formats(formats)
return {
'url': url,
'id': video_id,
'title': manifest['title'],
'description': manifest.get('description'),
'thumbnail': self._proto_relative_url(manifest.get('poster')),
'timestamp': int_or_none(manifest.get('created')),
'uploader': manifest.get('author'),
'duration': int_or_none(manifest.get('duration')),
'formats': formats,
}

View File

@@ -76,8 +76,8 @@ class CrunchyrollIE(InfoExtractor):
self._login()
def _decrypt_subtitles(self, data, iv, id):
data = bytes_to_intlist(data)
iv = bytes_to_intlist(iv)
data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
iv = bytes_to_intlist(base64.b64decode(iv.encode('utf-8')))
id = int(id)
def obfuscate_key_aux(count, modulo, start):
@@ -179,6 +179,16 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
return output
def _extract_subtitles(self, subtitle):
sub_root = xml.etree.ElementTree.fromstring(subtitle)
return [{
'ext': 'srt',
'data': self._convert_subtitles_to_srt(sub_root),
}, {
'ext': 'ass',
'data': self._convert_subtitles_to_ass(sub_root),
}]
def _get_subtitles(self, video_id, webpage):
subtitles = {}
for sub_id, sub_name in re.findall(r'\?ssid=([0-9]+)" title="([^"]+)', webpage):
@@ -190,25 +200,11 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
data = self._search_regex(r'<data>([^<]+)', sub_page, 'subtitle_data', fatal=False)
if not id or not iv or not data:
continue
id = int(id)
iv = base64.b64decode(iv)
data = base64.b64decode(data)
subtitle = self._decrypt_subtitles(data, iv, id).decode('utf-8')
lang_code = self._search_regex(r'lang_code=["\']([^"\']+)', subtitle, 'subtitle_lang_code', fatal=False)
if not lang_code:
continue
sub_root = xml.etree.ElementTree.fromstring(subtitle)
subtitles[lang_code] = [
{
'ext': 'srt',
'data': self._convert_subtitles_to_srt(sub_root),
},
{
'ext': 'ass',
'data': self._convert_subtitles_to_ass(sub_root),
},
]
subtitles[lang_code] = self._extract_subtitles(subtitle)
return subtitles
def _real_extract(self, url):
@@ -263,8 +259,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
streamdata = self._download_xml(
streamdata_req, video_id,
note='Downloading media info for %s' % video_format)
video_url = streamdata.find('.//host').text
video_play_path = streamdata.find('.//file').text
video_url = streamdata.find('./host').text
video_play_path = streamdata.find('./file').text
formats.append({
'url': video_url,
'play_path': video_play_path,

View File

@@ -7,7 +7,10 @@ from ..utils import (
int_or_none,
unescapeHTML,
find_xpath_attr,
smuggle_url,
determine_ext,
)
from .senateisvp import SenateISVPIE
class CSpanIE(InfoExtractor):
@@ -35,11 +38,22 @@ class CSpanIE(InfoExtractor):
}
}, {
'url': 'http://www.c-span.org/video/?318608-1/gm-ignition-switch-recall',
'md5': '446562a736c6bf97118e389433ed88d4',
'info_dict': {
'id': '342759',
'ext': 'mp4',
'title': 'General Motors Ignition Switch Recall',
'duration': 14848,
'description': 'md5:70c7c3b8fa63fa60d42772440596034c'
},
'playlist_duration_sum': 14855,
}, {
# Video from senate.gov
'url': 'http://www.c-span.org/video/?104517-1/immigration-reforms-needed-protect-skilled-american-workers',
'info_dict': {
'id': 'judiciary031715',
'ext': 'flv',
'title': 'Immigration Reforms Needed to Protect Skilled American Workers',
}
}]
def _real_extract(self, url):
@@ -56,7 +70,7 @@ class CSpanIE(InfoExtractor):
# present, otherwise this is a stripped version
r'<p class=\'initial\'>(.*?)</p>'
],
webpage, 'description', flags=re.DOTALL)
webpage, 'description', flags=re.DOTALL, default=None)
info_url = 'http://c-spanvideo.org/videoLibrary/assets/player/ajax-player.php?os=android&html5=program&id=' + video_id
data = self._download_json(info_url, video_id)
@@ -68,7 +82,16 @@ class CSpanIE(InfoExtractor):
title = find_xpath_attr(doc, './/string', 'name', 'title').text
thumbnail = find_xpath_attr(doc, './/string', 'name', 'poster').text
senate_isvp_url = SenateISVPIE._search_iframe_url(webpage)
if senate_isvp_url:
surl = smuggle_url(senate_isvp_url, {'force_title': title})
return self.url_result(surl, 'SenateISVP', video_id, title)
files = data['video']['files']
try:
capfile = data['video']['capfile']['#text']
except KeyError:
capfile = None
entries = [{
'id': '%s_%d' % (video_id, partnum + 1),
@@ -79,11 +102,22 @@ class CSpanIE(InfoExtractor):
'description': description,
'thumbnail': thumbnail,
'duration': int_or_none(f.get('length', {}).get('#text')),
'subtitles': {
'en': [{
'url': capfile,
'ext': determine_ext(capfile, 'dfxp')
}],
} if capfile else None,
} for partnum, f in enumerate(files)]
return {
'_type': 'playlist',
'entries': entries,
'title': title,
'id': video_id,
}
if len(entries) == 1:
entry = dict(entries[0])
entry['id'] = video_id
return entry
else:
return {
'_type': 'playlist',
'entries': entries,
'title': title,
'id': video_id,
}

View File

@@ -25,8 +25,7 @@ class DailymotionBaseInfoExtractor(InfoExtractor):
def _build_request(url):
"""Build a request with the family filter disabled"""
request = compat_urllib_request.Request(url)
request.add_header('Cookie', 'family_filter=off')
request.add_header('Cookie', 'ff=off')
request.add_header('Cookie', 'family_filter=off; ff=off')
return request
@@ -53,6 +52,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'ext': 'mp4',
'uploader': 'IGN',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
'upload_date': '20150306',
}
},
# Vevo video
@@ -86,7 +86,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
url = 'http://www.dailymotion.com/video/%s' % video_id
url = 'https://www.dailymotion.com/video/%s' % video_id
# Retrieve video webpage to extract further information
request = self._build_request(url)
@@ -107,13 +107,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
age_limit = self._rta_search(webpage)
video_upload_date = None
mobj = re.search(r'<div class="[^"]*uploaded_cont[^"]*" title="[^"]*">([0-9]{2})-([0-9]{2})-([0-9]{4})</div>', webpage)
mobj = re.search(r'<meta property="video:release_date" content="([0-9]{4})-([0-9]{2})-([0-9]{2}).+?"/>', webpage)
if mobj is not None:
video_upload_date = mobj.group(3) + mobj.group(2) + mobj.group(1)
video_upload_date = mobj.group(1) + mobj.group(2) + mobj.group(3)
embed_url = 'http://www.dailymotion.com/embed/video/%s' % video_id
embed_page = self._download_webpage(embed_url, video_id,
'Downloading embed page')
embed_url = 'https://www.dailymotion.com/embed/video/%s' % video_id
embed_request = self._build_request(embed_url)
embed_page = self._download_webpage(
embed_request, video_id, 'Downloading embed page')
info = self._search_regex(r'var info = ({.*?}),$', embed_page,
'video info', flags=re.MULTILINE)
info = json.loads(info)
@@ -224,7 +225,7 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
class DailymotionUserIE(DailymotionPlaylistIE):
IE_NAME = 'dailymotion:user'
_VALID_URL = r'https?://(?:www\.)?dailymotion\.[a-z]{2,3}/user/(?P<user>[^/]+)'
_VALID_URL = r'https?://(?:www\.)?dailymotion\.[a-z]{2,3}/(?:(?:old/)?user/)?(?P<user>[^/]+)$'
_PAGE_TEMPLATE = 'http://www.dailymotion.com/user/%s/%s'
_TESTS = [{
'url': 'https://www.dailymotion.com/user/nqtv',
@@ -238,7 +239,8 @@ class DailymotionUserIE(DailymotionPlaylistIE):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
user = mobj.group('user')
webpage = self._download_webpage(url, user)
webpage = self._download_webpage(
'https://www.dailymotion.com/user/%s' % user, user)
full_user = unescapeHTML(self._html_search_regex(
r'<a class="nav-image" title="([^"]+)" href="/%s">' % re.escape(user),
webpage, 'user'))

View File

@@ -0,0 +1,73 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
xpath_text,
parse_duration,
)
class DHMIE(InfoExtractor):
IE_DESC = 'Filmarchiv - Deutsches Historisches Museum'
_VALID_URL = r'https?://(?:www\.)?dhm\.de/filmarchiv/(?:[^/]+/)+(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.dhm.de/filmarchiv/die-filme/the-marshallplan-at-work-in-west-germany/',
'md5': '11c475f670209bf6acca0b2b7ef51827',
'info_dict': {
'id': 'the-marshallplan-at-work-in-west-germany',
'ext': 'flv',
'title': 'MARSHALL PLAN AT WORK IN WESTERN GERMANY, THE',
'description': 'md5:1fabd480c153f97b07add61c44407c82',
'duration': 660,
'thumbnail': 're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.dhm.de/filmarchiv/02-mapping-the-wall/peter-g/rolle-1/',
'md5': '09890226332476a3e3f6f2cb74734aa5',
'info_dict': {
'id': 'rolle-1',
'ext': 'flv',
'title': 'ROLLE 1',
'thumbnail': 're:^https?://.*\.jpg$',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
playlist_url = self._search_regex(
r"file\s*:\s*'([^']+)'", webpage, 'playlist url')
playlist = self._download_xml(playlist_url, video_id)
track = playlist.find(
'./{http://xspf.org/ns/0/}trackList/{http://xspf.org/ns/0/}track')
video_url = xpath_text(
track, './{http://xspf.org/ns/0/}location',
'video url', fatal=True)
thumbnail = xpath_text(
track, './{http://xspf.org/ns/0/}image',
'thumbnail')
title = self._search_regex(
[r'dc:title="([^"]+)"', r'<title> &raquo;([^<]+)</title>'],
webpage, 'title').strip()
description = self._html_search_regex(
r'<p><strong>Description:</strong>(.+?)</p>',
webpage, 'description', default=None)
duration = parse_duration(self._search_regex(
r'<em>Length\s*</em>\s*:\s*</strong>([^<]+)',
webpage, 'duration', default=None))
return {
'id': video_id,
'url': video_url,
'title': title,
'description': description,
'duration': duration,
'thumbnail': thumbnail,
}

View File

@@ -2,19 +2,19 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
parse_duration,
parse_iso8601,
int_or_none,
)
from ..compat import compat_str
class DiscoveryIE(InfoExtractor):
_VALID_URL = r'http://www\.discovery\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9_\-]*)(?:\.htm)?'
_TEST = {
_TESTS = [{
'url': 'http://www.discovery.com/tv-shows/mythbusters/videos/mission-impossible-outtakes.htm',
'md5': '3c69d77d9b0d82bfd5e5932a60f26504',
'info_dict': {
'id': 'mission-impossible-outtakes',
'ext': 'flv',
'id': '20769',
'ext': 'mp4',
'title': 'Mission Impossible Outtakes',
'description': ('Watch Jamie Hyneman and Adam Savage practice being'
' each other -- to the point of confusing Jamie\'s dog -- and '
@@ -24,22 +24,36 @@ class DiscoveryIE(InfoExtractor):
'timestamp': 1303099200,
'upload_date': '20110418',
},
}
'params': {
'skip_download': True, # requires ffmpeg
}
}, {
'url': 'http://www.discovery.com/tv-shows/mythbusters/videos/mythbusters-the-simpsons',
'info_dict': {
'id': 'mythbusters-the-simpsons',
'title': 'MythBusters: The Simpsons',
},
'playlist_count': 9,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
info = self._download_json(url + '?flat=1', video_id)
info = self._parse_json(self._search_regex(
r'(?s)<script type="application/ld\+json">(.*?)</script>',
webpage, 'video info'), video_id)
video_title = info.get('playlist_title') or info.get('video_title')
return {
'id': video_id,
'title': info['name'],
'url': info['contentURL'],
'description': info.get('description'),
'thumbnail': info.get('thumbnailUrl'),
'timestamp': parse_iso8601(info.get('uploadDate')),
'duration': int_or_none(info.get('duration')),
}
entries = [{
'id': compat_str(video_info['id']),
'formats': self._extract_m3u8_formats(
video_info['src'], video_id, ext='mp4',
note='Download m3u8 information for video %d' % (idx + 1)),
'title': video_info['title'],
'description': video_info.get('description'),
'duration': parse_duration(video_info.get('video_length')),
'webpage_url': video_info.get('href'),
'thumbnail': video_info.get('thumbnailURL'),
'alt_title': video_info.get('secondary_title'),
'timestamp': parse_iso8601(video_info.get('publishedDate')),
} for idx, video_info in enumerate(info['playlist'])]
return self.playlist_result(entries, video_id, video_title)

View File

@@ -36,7 +36,8 @@ class DotsubIE(InfoExtractor):
if not video_url:
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'"file"\s*:\s*\'([^\']+)', webpage, 'video url')
[r'<source[^>]+src="([^"]+)"', r'"file"\s*:\s*\'([^\']+)'],
webpage, 'video url')
return {
'id': video_id,

View File

@@ -1,19 +1,23 @@
# coding: utf-8
from __future__ import unicode_literals
import hashlib
import time
from .common import InfoExtractor
from ..utils import ExtractorError
from ..utils import (ExtractorError, unescapeHTML)
from ..compat import (compat_str, compat_basestring)
class DouyuTVIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?douyutv\.com/(?P<id>[A-Za-z0-9]+)'
_TEST = {
_TESTS = [{
'url': 'http://www.douyutv.com/iseven',
'info_dict': {
'id': 'iseven',
'id': '17732',
'display_id': 'iseven',
'ext': 'flv',
'title': 're:^清晨醒脑T-ara根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:9e525642c25a0a24302869937cf69d17',
'description': 'md5:c93d6692dde6fe33809a46edcbecca44',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': '7师傅',
'uploader_id': '431925',
@@ -22,22 +26,52 @@ class DouyuTVIE(InfoExtractor):
'params': {
'skip_download': True,
}
}
}, {
'url': 'http://www.douyutv.com/85982',
'info_dict': {
'id': '85982',
'display_id': '85982',
'ext': 'flv',
'title': 're:^小漠从零单排记——CSOL2躲猫猫 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:746a2f7a253966a06755a912f0acc0d2',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'douyu小漠',
'uploader_id': '3769985',
'is_live': True,
},
'params': {
'skip_download': True,
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
if video_id.isdigit():
room_id = video_id
else:
page = self._download_webpage(url, video_id)
room_id = self._html_search_regex(
r'"room_id"\s*:\s*(\d+),', page, 'room id')
prefix = 'room/%s?aid=android&client_sys=android&time=%d' % (
room_id, int(time.time()))
auth = hashlib.md5((prefix + '1231').encode('ascii')).hexdigest()
config = self._download_json(
'http://www.douyutv.com/api/client/room/%s' % video_id, video_id)
'http://www.douyutv.com/api/v1/%s&auth=%s' % (prefix, auth),
video_id)
data = config['data']
error_code = config.get('error', 0)
show_status = data.get('show_status')
if error_code is not 0:
raise ExtractorError(
'Server reported error %i' % error_code, expected=True)
error_desc = 'Server reported error %i' % error_code
if isinstance(data, (compat_str, compat_basestring)):
error_desc += ': ' + data
raise ExtractorError(error_desc, expected=True)
show_status = data.get('show_status')
# 1 = live, 2 = offline
if show_status == '2':
raise ExtractorError(
@@ -46,7 +80,7 @@ class DouyuTVIE(InfoExtractor):
base_url = data['rtmp_url']
live_path = data['rtmp_live']
title = self._live_title(data['room_name'])
title = self._live_title(unescapeHTML(data['room_name']))
description = data.get('show_details')
thumbnail = data.get('room_src')
@@ -66,7 +100,8 @@ class DouyuTVIE(InfoExtractor):
self._sort_formats(formats)
return {
'id': video_id,
'id': room_id,
'display_id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,

View File

@@ -0,0 +1,160 @@
# encoding: utf-8
from __future__ import unicode_literals
import itertools
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_urlparse,
)
from ..utils import (
ExtractorError,
clean_html,
determine_ext,
int_or_none,
parse_iso8601,
)
class DramaFeverIE(InfoExtractor):
IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TEST = {
'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'info_dict': {
'id': '4512.1',
'ext': 'flv',
'title': 'Cooking with Shin 4512.1',
'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
'thumbnail': 're:^https?://.*\.jpg',
'timestamp': 1404336058,
'upload_date': '20140702',
'duration': 343,
}
}
def _real_extract(self, url):
video_id = self._match_id(url).replace('/', '.')
try:
feed = self._download_json(
'http://www.dramafever.com/amp/episode/feed.json?guid=%s' % video_id,
video_id, 'Downloading episode JSON')['channel']['item']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError):
raise ExtractorError(
'Currently unavailable in your country.', expected=True)
raise
media_group = feed.get('media-group', {})
formats = []
for media_content in media_group['media-content']:
src = media_content.get('@attributes', {}).get('url')
if not src:
continue
ext = determine_ext(src)
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
src, video_id, f4m_id='hds'))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
src, video_id, 'mp4', m3u8_id='hls'))
else:
formats.append({
'url': src,
})
self._sort_formats(formats)
title = media_group.get('media-title')
description = media_group.get('media-description')
duration = int_or_none(media_group['media-content'][0].get('@attributes', {}).get('duration'))
thumbnail = self._proto_relative_url(
media_group.get('media-thumbnail', {}).get('@attributes', {}).get('url'))
timestamp = parse_iso8601(feed.get('pubDate'), ' ')
subtitles = {}
for media_subtitle in media_group.get('media-subTitle', []):
lang = media_subtitle.get('@attributes', {}).get('lang')
href = media_subtitle.get('@attributes', {}).get('href')
if not lang or not href:
continue
subtitles[lang] = [{
'ext': 'ttml',
'url': href,
}]
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'duration': duration,
'formats': formats,
'subtitles': subtitles,
}
class DramaFeverSeriesIE(InfoExtractor):
IE_NAME = 'dramafever:series'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
'info_dict': {
'id': '4512',
'title': 'Cooking with Shin',
'description': 'md5:84a3f26e3cdc3fb7f500211b3593b5c1',
},
'playlist_count': 4,
}, {
'url': 'http://www.dramafever.com/drama/124/IRIS/',
'info_dict': {
'id': '124',
'title': 'IRIS',
'description': 'md5:b3a30e587cf20c59bd1c01ec0ee1b862',
},
'playlist_count': 20,
}]
_CONSUMER_SECRET = 'DA59dtVXYLxajktV'
_PAGE_SIZE = 60 # max is 60 (see http://api.drama9.com/#get--api-4-episode-series-)
def _get_consumer_secret(self, video_id):
mainjs = self._download_webpage(
'http://www.dramafever.com/static/51afe95/df2014/scripts/main.js',
video_id, 'Downloading main.js', fatal=False)
if not mainjs:
return self._CONSUMER_SECRET
return self._search_regex(
r"var\s+cs\s*=\s*'([^']+)'", mainjs,
'consumer secret', default=self._CONSUMER_SECRET)
def _real_extract(self, url):
series_id = self._match_id(url)
consumer_secret = self._get_consumer_secret(series_id)
series = self._download_json(
'http://www.dramafever.com/api/4/series/query/?cs=%s&series_id=%s'
% (consumer_secret, series_id),
series_id, 'Downloading series JSON')['series'][series_id]
title = clean_html(series['name'])
description = clean_html(series.get('description') or series.get('description_short'))
entries = []
for page_num in itertools.count(1):
episodes = self._download_json(
'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_size=%d&page_number=%d'
% (consumer_secret, series_id, self._PAGE_SIZE, page_num),
series_id, 'Downloading episodes JSON page #%d' % page_num)
for episode in episodes.get('value', []):
entries.append(self.url_result(
compat_urlparse.urljoin(url, episode['episode_url']),
'DramaFever', episode.get('guid')))
if page_num == episodes['num_pages']:
break
return self.playlist_result(entries, series_id, title, description)

View File

@@ -3,24 +3,33 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import unified_strdate
from ..utils import (
ExtractorError,
unified_strdate,
)
class DreiSatIE(InfoExtractor):
IE_NAME = '3sat'
_VALID_URL = r'(?:http://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
_TEST = {
'url': 'http://www.3sat.de/mediathek/index.php?obj=36983',
'md5': '9dcfe344732808dbfcc901537973c922',
'info_dict': {
'id': '36983',
'ext': 'mp4',
'title': 'Kaffeeland Schweiz',
'description': 'md5:cc4424b18b75ae9948b13929a0814033',
'uploader': '3sat',
'upload_date': '20130622'
}
}
_VALID_URL = r'(?:http://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php|mediathek\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
_TESTS = [
{
'url': 'http://www.3sat.de/mediathek/index.php?mode=play&obj=45918',
'md5': 'be37228896d30a88f315b638900a026e',
'info_dict': {
'id': '45918',
'ext': 'mp4',
'title': 'Waidmannsheil',
'description': 'md5:cce00ca1d70e21425e72c86a98a56817',
'uploader': '3sat',
'upload_date': '20140913'
}
},
{
'url': 'http://www.3sat.de/mediathek/mediathek.php?mode=play&obj=51066',
'only_matching': True,
},
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
@@ -28,6 +37,15 @@ class DreiSatIE(InfoExtractor):
details_url = 'http://www.3sat.de/mediathek/xmlservice/web/beitragsDetails?ak=web&id=%s' % video_id
details_doc = self._download_xml(details_url, video_id, 'Downloading video details')
status_code = details_doc.find('./status/statuscode')
if status_code is not None and status_code.text != 'ok':
code = status_code.text
if code == 'notVisibleAnymore':
message = 'Video %s is not available' % video_id
else:
message = '%s returned error: %s' % (self.IE_NAME, code)
raise ExtractorError(message, expected=True)
thumbnail_els = details_doc.findall('.//teaserimage')
thumbnails = [{
'width': int(te.attrib['key'].partition('x')[0]),

View File

@@ -1,23 +1,27 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor, ExtractorError
from ..utils import parse_iso8601
from .common import InfoExtractor
from ..utils import (
ExtractorError,
parse_iso8601,
)
class DRTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dr\.dk/tv/se/(?:[^/]+/)*(?P<id>[\da-z-]+)(?:[/#?]|$)'
_TEST = {
'url': 'http://www.dr.dk/tv/se/partiets-mand/partiets-mand-7-8',
'md5': '4a7e1dd65cdb2643500a3f753c942f25',
'url': 'https://www.dr.dk/tv/se/boern/ultra/panisk-paske/panisk-paske-5',
'md5': 'dc515a9ab50577fa14cc4e4b0265168f',
'info_dict': {
'id': 'partiets-mand-7-8',
'id': 'panisk-paske-5',
'ext': 'mp4',
'title': 'Partiets mand (7:8)',
'description': 'md5:a684b90a8f9336cd4aab94b7647d7862',
'timestamp': 1403047940,
'upload_date': '20140617',
'duration': 1299.040,
'title': 'Panisk Påske (5)',
'description': 'md5:ca14173c5ab24cd26b0fcc074dff391c',
'timestamp': 1426984612,
'upload_date': '20150322',
'duration': 1455,
},
}
@@ -26,6 +30,10 @@ class DRTVIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
if '>Programmet er ikke længere tilgængeligt' in webpage:
raise ExtractorError(
'Video %s is not available' % video_id, expected=True)
video_id = self._search_regex(
r'data-(?:material-identifier|episode-slug)="([^"]+)"',
webpage, 'video id')
@@ -55,19 +63,31 @@ class DRTVIE(InfoExtractor):
restricted_to_denmark = asset['RestrictedToDenmark']
spoken_subtitles = asset['Target'] == 'SpokenSubtitles'
for link in asset['Links']:
target = link['Target']
uri = link['Uri']
target = link['Target']
format_id = target
preference = -1 if target == 'HDS' else -2
preference = None
if spoken_subtitles:
preference -= 2
preference = -1
format_id += '-spoken-subtitles'
formats.append({
'url': uri + '?hdcore=3.3.0&plugin=aasp-3.3.0.99.43' if target == 'HDS' else uri,
'format_id': format_id,
'ext': link['FileFormat'],
'preference': preference,
})
if target == 'HDS':
formats.extend(self._extract_f4m_formats(
uri + '?hdcore=3.3.0&plugin=aasp-3.3.0.99.43',
video_id, preference, f4m_id=format_id))
elif target == 'HLS':
formats.extend(self._extract_m3u8_formats(
uri, video_id, 'mp4', preference=preference,
m3u8_id=format_id))
else:
bitrate = link.get('Bitrate')
if bitrate:
format_id += '-%s' % bitrate
formats.append({
'url': uri,
'format_id': format_id,
'tbr': bitrate,
'ext': link.get('FileFormat'),
})
subtitles_list = asset.get('SubtitlesList')
if isinstance(subtitles_list, list):
LANGS = {

View File

@@ -28,12 +28,12 @@ class DumpIE(InfoExtractor):
video_url = self._search_regex(
r's1.addVariable\("file",\s*"([^"]+)"', webpage, 'video URL')
thumb = self._og_search_thumbnail(webpage)
title = self._search_regex(r'<b>([^"]+)</b>', webpage, 'title')
title = self._og_search_title(webpage)
thumbnail = self._og_search_thumbnail(webpage)
return {
'id': video_id,
'title': title,
'url': video_url,
'thumbnail': thumb,
'thumbnail': thumbnail,
}

View File

@@ -0,0 +1,60 @@
# coding: utf-8
from __future__ import unicode_literals
import base64
from .common import InfoExtractor
from ..compat import compat_urllib_request
from ..utils import qualities
class DumpertIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dumpert\.nl/mediabase/(?P<id>[0-9]+/[0-9a-zA-Z]+)'
_TEST = {
'url': 'http://www.dumpert.nl/mediabase/6646981/951bc60f/',
'md5': '1b9318d7d5054e7dcb9dc7654f21d643',
'info_dict': {
'id': '6646981/951bc60f',
'ext': 'mp4',
'title': 'Ik heb nieuws voor je',
'description': 'Niet schrikken hoor',
'thumbnail': 're:^https?://.*\.jpg$',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
req = compat_urllib_request.Request(url)
req.add_header('Cookie', 'nsfw=1; cpc=10')
webpage = self._download_webpage(req, video_id)
files_base64 = self._search_regex(
r'data-files="([^"]+)"', webpage, 'data files')
files = self._parse_json(
base64.b64decode(files_base64.encode('utf-8')).decode('utf-8'),
video_id)
quality = qualities(['flv', 'mobile', 'tablet', '720p'])
formats = [{
'url': video_url,
'format_id': format_id,
'quality': quality(format_id),
} for format_id, video_url in files.items() if format_id != 'still']
self._sort_formats(formats)
title = self._html_search_meta(
'title', webpage) or self._og_search_title(webpage)
description = self._html_search_meta(
'description', webpage) or self._og_search_description(webpage)
thumbnail = files.get('still') or self._og_search_thumbnail(webpage)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'formats': formats
}

View File

@@ -45,6 +45,7 @@ class EaglePlatformIE(InfoExtractor):
'duration': 216,
'view_count': int,
},
'skip': 'Georestricted',
}]
def _handle_error(self, response):

View File

@@ -6,56 +6,42 @@ import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
parse_iso8601,
)
class EllenTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:ellentv|ellentube)\.com/videos/(?P<id>[a-z0-9_-]+)'
_TESTS = [{
'url': 'http://www.ellentv.com/videos/0-7jqrsr18/',
'md5': 'e4af06f3bf0d5f471921a18db5764642',
_TEST = {
'url': 'http://www.ellentv.com/videos/0-ipq1gsai/',
'md5': '8e3c576bf2e9bfff4d76565f56f94c9c',
'info_dict': {
'id': '0-7jqrsr18',
'id': '0_ipq1gsai',
'ext': 'mp4',
'title': 'What\'s Wrong with These Photos? A Whole Lot',
'description': 'md5:35f152dc66b587cf13e6d2cf4fa467f6',
'timestamp': 1406876400,
'upload_date': '20140801',
'title': 'Fast Fingers of Fate',
'description': 'md5:587e79fbbd0d73b148bc596d99ce48e6',
'timestamp': 1428035648,
'upload_date': '20150403',
'uploader_id': 'batchUser',
}
}, {
'url': 'http://ellentube.com/videos/0-dvzmabd5/',
'md5': '98238118eaa2bbdf6ad7f708e3e4f4eb',
'info_dict': {
'id': '0-dvzmabd5',
'ext': 'mp4',
'title': '1 year old twin sister makes her brother laugh',
'description': '1 year old twin sister makes her brother laugh',
'timestamp': 1419542075,
'upload_date': '20141225',
}
}]
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_meta('VideoURL', webpage, 'url')
title = self._og_search_title(webpage, default=None) or self._search_regex(
r'pageName\s*=\s*"([^"]+)"', webpage, 'title')
description = self._html_search_meta(
'description', webpage, 'description') or self._og_search_description(webpage)
timestamp = parse_iso8601(self._search_regex(
r'<span class="publish-date"><time datetime="([^"]+)">',
webpage, 'timestamp'))
webpage = self._download_webpage(
'http://widgets.ellentube.com/videos/%s' % video_id,
video_id)
return {
'id': video_id,
'url': video_url,
'title': title,
'description': description,
'timestamp': timestamp,
}
partner_id = self._search_regex(
r"var\s+partnerId\s*=\s*'([^']+)", webpage, 'partner id')
kaltura_id = self._search_regex(
[r'id="kaltura_player_([^"]+)"',
r"_wb_entry_id\s*:\s*'([^']+)",
r'data-kaltura-entry-id="([^"]+)'],
webpage, 'kaltura id')
return self.url_result('kaltura:%s:%s' % (partner_id, kaltura_id), 'Kaltura')
class EllenTVClipsIE(InfoExtractor):
@@ -67,7 +53,7 @@ class EllenTVClipsIE(InfoExtractor):
'id': 'meryl-streep-vanessa-hudgens',
'title': 'Meryl Streep, Vanessa Hudgens',
},
'playlist_mincount': 9,
'playlist_mincount': 7,
}
def _real_extract(self, url):
@@ -91,4 +77,8 @@ class EllenTVClipsIE(InfoExtractor):
raise ExtractorError('Failed to download JSON', cause=ve)
def _extract_entries(self, playlist):
return [self.url_result(item['url'], 'EllenTV') for item in playlist]
return [
self.url_result(
'kaltura:%s:%s' % (item['kaltura_partner_id'], item['kaltura_entry_id']),
'Kaltura')
for item in playlist]

View File

@@ -4,22 +4,28 @@ from .tnaflix import TNAFlixIE
class EMPFlixIE(TNAFlixIE):
_VALID_URL = r'^https?://www\.empflix\.com/videos/(?P<display_id>[0-9a-zA-Z-]+)-(?P<id>[0-9]+)\.html'
_VALID_URL = r'https?://(?:www\.)?empflix\.com/videos/(?P<display_id>.+?)-(?P<id>[0-9]+)\.html'
_TITLE_REGEX = r'name="title" value="(?P<title>[^"]*)"'
_DESCRIPTION_REGEX = r'name="description" value="([^"]*)"'
_CONFIG_REGEX = r'flashvars\.config\s*=\s*escape\("([^"]+)"'
_TEST = {
'url': 'http://www.empflix.com/videos/Amateur-Finger-Fuck-33051.html',
'md5': 'b1bc15b6412d33902d6e5952035fcabc',
'info_dict': {
'id': '33051',
'display_id': 'Amateur-Finger-Fuck',
'ext': 'mp4',
'title': 'Amateur Finger Fuck',
'description': 'Amateur solo finger fucking.',
'thumbnail': 're:https?://.*\.jpg$',
'age_limit': 18,
_TESTS = [
{
'url': 'http://www.empflix.com/videos/Amateur-Finger-Fuck-33051.html',
'md5': 'b1bc15b6412d33902d6e5952035fcabc',
'info_dict': {
'id': '33051',
'display_id': 'Amateur-Finger-Fuck',
'ext': 'mp4',
'title': 'Amateur Finger Fuck',
'description': 'Amateur solo finger fucking.',
'thumbnail': 're:https?://.*\.jpg$',
'age_limit': 18,
}
},
{
'url': 'http://www.empflix.com/videos/[AROMA][ARMD-718]-Aoi-Yoshino-Sawa-25826.html',
'only_matching': True,
}
}
]

View File

@@ -1,11 +1,20 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import (
ExtractorError,
unescapeHTML
)
class EroProfileIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?eroprofile\.com/m/videos/view/(?P<id>[^/]+)'
_TEST = {
_LOGIN_URL = 'http://www.eroprofile.com/auth/auth.php?'
_NETRC_MACHINE = 'eroprofile'
_TESTS = [{
'url': 'http://www.eroprofile.com/m/videos/view/sexy-babe-softcore',
'md5': 'c26f351332edf23e1ea28ce9ec9de32f',
'info_dict': {
@@ -16,19 +25,61 @@ class EroProfileIE(InfoExtractor):
'thumbnail': 're:https?://.*\.jpg',
'age_limit': 18,
}
}
}, {
'url': 'http://www.eroprofile.com/m/videos/view/Try-It-On-Pee_cut_2-wmv-4shared-com-file-sharing-download-movie-file',
'md5': '1baa9602ede46ce904c431f5418d8916',
'info_dict': {
'id': '1133519',
'ext': 'm4v',
'title': 'Try It On Pee_cut_2.wmv - 4shared.com - file sharing - download movie file',
'thumbnail': 're:https?://.*\.jpg',
'age_limit': 18,
},
'skip': 'Requires login',
}]
def _login(self):
(username, password) = self._get_login_info()
if username is None:
return
query = compat_urllib_parse.urlencode({
'username': username,
'password': password,
'url': 'http://www.eroprofile.com/',
})
login_url = self._LOGIN_URL + query
login_page = self._download_webpage(login_url, None, False)
m = re.search(r'Your username or password was incorrect\.', login_page)
if m:
raise ExtractorError(
'Wrong username and/or password.', expected=True)
self.report_login()
redirect_url = self._search_regex(
r'<script[^>]+?src="([^"]+)"', login_page, 'login redirect url')
self._download_webpage(redirect_url, None, False)
def _real_initialize(self):
self._login()
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
m = re.search(r'You must be logged in to view this video\.', webpage)
if m:
raise ExtractorError(
'This video requires login. Please specify a username and password and try again.', expected=True)
video_id = self._search_regex(
[r"glbUpdViews\s*\('\d*','(\d+)'", r'p/report/video/(\d+)'],
webpage, 'video id', default=None)
video_url = self._search_regex(
r'<source src="([^"]+)', webpage, 'video url')
video_url = unescapeHTML(self._search_regex(
r'<source src="([^"]+)', webpage, 'video url'))
title = self._html_search_regex(
r'Title:</th><td>([^<]+)</td>', webpage, 'title')
thumbnail = self._search_regex(

View File

@@ -1,128 +1,107 @@
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
)
from ..compat import compat_urllib_request
from ..utils import (
ExtractorError,
js_to_json,
parse_duration,
determine_ext,
clean_html,
int_or_none,
float_or_none,
)
def _decrypt_config(key, string):
a = ''
i = ''
r = ''
while len(a) < (len(string) / 2):
a += key
a = a[0:int(len(string) / 2)]
t = 0
while t < len(string):
i += chr(int(string[t] + string[t + 1], 16))
t += 2
icko = [s for s in i]
for t, c in enumerate(a):
r += chr(ord(c) ^ ord(icko[t]))
return r
class EscapistIE(InfoExtractor):
_VALID_URL = r'https?://?(www\.)?escapistmagazine\.com/videos/view/[^/?#]+/(?P<id>[0-9]+)-[^/?#]*(?:$|[?#])'
_USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko'
_TEST = {
_VALID_URL = r'https?://?(?:www\.)?escapistmagazine\.com/videos/view/[^/?#]+/(?P<id>[0-9]+)-[^/?#]*(?:$|[?#])'
_TESTS = [{
'url': 'http://www.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
'md5': 'ab3a706c681efca53f0a35f1415cf0d1',
'info_dict': {
'id': '6618',
'ext': 'mp4',
'description': "Baldur's Gate: Original, Modded or Enhanced Edition? I'll break down what you can expect from the new Baldur's Gate: Enhanced Edition.",
'uploader_id': 'the-escapist-presents',
'uploader': 'The Escapist Presents',
'title': "Breaking Down Baldur's Gate",
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 264,
'uploader': 'The Escapist',
}
}
}, {
'url': 'http://www.escapistmagazine.com/videos/view/zero-punctuation/10044-Evolve-One-vs-Multiplayer',
'md5': '9e8c437b0dbb0387d3bd3255ca77f6bf',
'info_dict': {
'id': '10044',
'ext': 'mp4',
'description': 'This week, Zero Punctuation reviews Evolve.',
'title': 'Evolve - One vs Multiplayer',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 304,
'uploader': 'The Escapist',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage_req = compat_urllib_request.Request(url)
webpage_req.add_header('User-Agent', self._USER_AGENT)
webpage = self._download_webpage(webpage_req, video_id)
webpage = self._download_webpage(url, video_id)
uploader_id = self._html_search_regex(
r"<h1\s+class='headline'>\s*<a\s+href='/videos/view/(.*?)'",
webpage, 'uploader ID', fatal=False)
uploader = self._html_search_regex(
r"<h1\s+class='headline'>(.*?)</a>",
webpage, 'uploader', fatal=False)
description = self._html_search_meta('description', webpage)
duration = parse_duration(self._html_search_meta('duration', webpage))
ims_video = self._parse_json(
self._search_regex(
r'imsVideo\.play\(({.+?})\);', webpage, 'imsVideo'),
video_id)
video_id = ims_video['videoID']
key = ims_video['hash']
raw_title = self._html_search_meta('title', webpage, fatal=True)
title = raw_title.partition(' : ')[2]
config_req = compat_urllib_request.Request(
'http://www.escapistmagazine.com/videos/'
'vidconfig.php?videoID=%s&hash=%s' % (video_id, key))
config_req.add_header('Referer', url)
config = self._download_webpage(config_req, video_id, 'Downloading video config')
config_url = compat_urllib_parse.unquote(self._html_search_regex(
r'''(?x)
(?:
<param\s+name="flashvars".*?\s+value="config=|
flashvars=&quot;config=
)
(https?://[^"&]+)
''',
webpage, 'config URL'))
data = json.loads(_decrypt_config(key, config))
formats = []
ad_formats = []
video_data = data['videoData']
def _add_format(name, cfg_url, quality):
cfg_req = compat_urllib_request.Request(cfg_url)
cfg_req.add_header('User-Agent', self._USER_AGENT)
config = self._download_json(
cfg_req, video_id,
'Downloading ' + name + ' configuration',
'Unable to download ' + name + ' configuration',
transform_source=js_to_json)
title = clean_html(video_data['title'])
duration = float_or_none(video_data.get('duration'), 1000)
uploader = video_data.get('publisher')
playlist = config['playlist']
for p in playlist:
if p.get('eventCategory') == 'Video':
ar = formats
elif p.get('eventCategory') == 'Video Postroll':
ar = ad_formats
else:
continue
ar.append({
'url': p['url'],
'format_id': name,
'quality': quality,
'http_headers': {
'User-Agent': self._USER_AGENT,
},
})
_add_format('normal', config_url, quality=0)
hq_url = (config_url +
('&hq=1' if '?' in config_url else config_url + '?hq=1'))
try:
_add_format('hq', hq_url, quality=1)
except ExtractorError:
pass # That's fine, we'll just use normal quality
formats = [{
'url': video['src'],
'format_id': '%s-%sp' % (determine_ext(video['src']), video['res']),
'height': int_or_none(video.get('res')),
} for video in data['files']['videos']]
self._sort_formats(formats)
if '/escapist/sales-marketing/' in formats[-1]['url']:
raise ExtractorError('This IP address has been blocked by The Escapist', expected=True)
res = {
return {
'id': video_id,
'formats': formats,
'uploader': uploader,
'uploader_id': uploader_id,
'title': title,
'thumbnail': self._og_search_thumbnail(webpage),
'description': description,
'description': self._og_search_description(webpage),
'duration': duration,
'uploader': uploader,
}
if self._downloader.params.get('include_ads') and ad_formats:
self._sort_formats(ad_formats)
ad_res = {
'id': '%s-ad' % video_id,
'title': '%s (Postroll)' % title,
'formats': ad_formats,
}
return {
'_type': 'playlist',
'entries': [res, ad_res],
'title': title,
'id': video_id,
}
return res

View File

@@ -0,0 +1,55 @@
from __future__ import unicode_literals
from .common import InfoExtractor
class ESPNIE(InfoExtractor):
_VALID_URL = r'https?://espn\.go\.com/(?:[^/]+/)*(?P<id>[^/]+)'
_WORKING = False
_TESTS = [{
'url': 'http://espn.go.com/video/clip?id=10365079',
'info_dict': {
'id': 'FkYWtmazr6Ed8xmvILvKLWjd4QvYZpzG',
'ext': 'mp4',
'title': 'dm_140128_30for30Shorts___JudgingJewellv2',
'description': '',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://espn.go.com/video/iframe/twitter/?cms=espn&id=10365079',
'only_matching': True,
}, {
'url': 'http://espn.go.com/nba/recap?gameId=400793786',
'only_matching': True,
}, {
'url': 'http://espn.go.com/blog/golden-state-warriors/post/_/id/593/how-warriors-rapidly-regained-a-winning-edge',
'only_matching': True,
}, {
'url': 'http://espn.go.com/sports/endurance/story/_/id/12893522/dzhokhar-tsarnaev-sentenced-role-boston-marathon-bombings',
'only_matching': True,
}, {
'url': 'http://espn.go.com/nba/playoffs/2015/story/_/id/12887571/john-wall-washington-wizards-no-swelling-left-hand-wrist-game-5-return',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_id = self._search_regex(
r'class="video-play-button"[^>]+data-id="(\d+)',
webpage, 'video id')
player = self._download_webpage(
'https://espn.go.com/video/iframe/twitter/?id=%s' % video_id, video_id)
pcode = self._search_regex(
r'["\']pcode=([^"\']+)["\']', player, 'pcode')
return self.url_result(
'ooyalaexternal:espn:%s:%s' % (video_id, pcode),
'OoyalaExternal')

View File

@@ -24,8 +24,12 @@ class FacebookIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://(?:\w+\.)?facebook\.com/
(?:[^#]*?\#!/)?
(?:video/video\.php|photo\.php|video\.php|video/embed)\?(?:.*?)
(?:v|video_id)=(?P<id>[0-9]+)
(?:
(?:video/video\.php|photo\.php|video\.php|video/embed)\?(?:.*?)
(?:v|video_id)=|
[^/]+/videos/(?:[^/]+/)?
)
(?P<id>[0-9]+)
(?:.*)'''
_LOGIN_URL = 'https://www.facebook.com/login.php?next=http%3A%2F%2Ffacebook.com%2Fhome.php&login_attempt=1'
_CHECKPOINT_URL = 'https://www.facebook.com/checkpoint/?next=http%3A%2F%2Ffacebook.com%2Fhome.php&_fb_noscript=1'
@@ -46,10 +50,19 @@ class FacebookIE(InfoExtractor):
'id': '274175099429670',
'ext': 'mp4',
'title': 'Facebook video #274175099429670',
}
},
'expected_warnings': [
'title'
]
}, {
'url': 'https://www.facebook.com/video.php?v=10204634152394104',
'only_matching': True,
}, {
'url': 'https://www.facebook.com/amogood/videos/1618742068337349/?fref=nf',
'only_matching': True,
}, {
'url': 'https://www.facebook.com/ChristyClarkForBC/videos/vb.22819070941/10153870694020942/?type=2&theater',
'only_matching': True,
}]
def _login(self):
@@ -139,12 +152,12 @@ class FacebookIE(InfoExtractor):
raise ExtractorError('Cannot find video formats')
video_title = self._html_search_regex(
r'<h2 class="uiHeaderTitle">([^<]*)</h2>', webpage, 'title',
fatal=False)
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage, 'title',
default=None)
if not video_title:
video_title = self._html_search_regex(
r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(.*?)</span>',
webpage, 'alternative title', default=None)
webpage, 'alternative title', fatal=False)
video_title = limit_length(video_title, 80)
if not video_title:
video_title = 'Facebook video #%s' % video_id

View File

@@ -1,80 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
)
class FiredriveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?firedrive\.com/' + \
'(?:file|embed)/(?P<id>[0-9a-zA-Z]+)'
_FILE_DELETED_REGEX = r'<div class="removed_file_image">'
_TESTS = [{
'url': 'https://www.firedrive.com/file/FEB892FA160EBD01',
'md5': 'd5d4252f80ebeab4dc2d5ceaed1b7970',
'info_dict': {
'id': 'FEB892FA160EBD01',
'ext': 'flv',
'title': 'bbb_theora_486kbit.flv',
'thumbnail': 're:^http://.*\.jpg$',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
url = 'http://firedrive.com/file/%s' % video_id
webpage = self._download_webpage(url, video_id)
if re.search(self._FILE_DELETED_REGEX, webpage) is not None:
raise ExtractorError('Video %s does not exist' % video_id,
expected=True)
fields = dict(re.findall(r'''(?x)<input\s+
type="hidden"\s+
name="([^"]+)"\s+
value="([^"]*)"
''', webpage))
post = compat_urllib_parse.urlencode(fields)
req = compat_urllib_request.Request(url, post)
req.add_header('Content-type', 'application/x-www-form-urlencoded')
# Apparently, this header is required for confirmation to work.
req.add_header('Host', 'www.firedrive.com')
webpage = self._download_webpage(req, video_id,
'Downloading video page')
title = self._search_regex(r'class="external_title_left">(.+)</div>',
webpage, 'title')
thumbnail = self._search_regex(r'image:\s?"(//[^\"]+)', webpage,
'thumbnail', fatal=False)
if thumbnail is not None:
thumbnail = 'http:' + thumbnail
ext = self._search_regex(r'type:\s?\'([^\']+)\',',
webpage, 'extension', fatal=False)
video_url = self._search_regex(
r'file:\s?loadURL\(\'(http[^\']+)\'\),', webpage, 'file url')
formats = [{
'format_id': 'sd',
'url': video_url,
'ext': ext,
}]
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}

View File

@@ -0,0 +1,88 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import int_or_none
class FiveTVIE(InfoExtractor):
_VALID_URL = r'''(?x)
http://
(?:www\.)?5-tv\.ru/
(?:
(?:[^/]+/)+(?P<id>\d+)|
(?P<path>[^/?#]+)(?:[/?#])?
)
'''
_TESTS = [{
'url': 'http://5-tv.ru/news/96814/',
'md5': 'bbff554ad415ecf5416a2f48c22d9283',
'info_dict': {
'id': '96814',
'ext': 'mp4',
'title': 'Россияне выбрали имя для общенациональной платежной системы',
'description': 'md5:a8aa13e2b7ad36789e9f77a74b6de660',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 180,
},
}, {
'url': 'http://5-tv.ru/video/1021729/',
'info_dict': {
'id': '1021729',
'ext': 'mp4',
'title': '3D принтер',
'description': 'md5:d76c736d29ef7ec5c0cf7d7c65ffcb41',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 180,
},
}, {
'url': 'http://www.5-tv.ru/glavnoe/#itemDetails',
'info_dict': {
'id': 'glavnoe',
'ext': 'mp4',
'title': 'Итоги недели с 8 по 14 июня 2015 года',
'thumbnail': 're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.5-tv.ru/glavnoe/broadcasts/508645/',
'only_matching': True,
}, {
'url': 'http://5-tv.ru/films/1507502/',
'only_matching': True,
}, {
'url': 'http://5-tv.ru/programs/broadcast/508713/',
'only_matching': True,
}, {
'url': 'http://5-tv.ru/angel/',
'only_matching': True,
}, {
'url': 'http://www.5-tv.ru/schedule/?iframe=true&width=900&height=450',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id') or mobj.group('path')
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'<a[^>]+?href="([^"]+)"[^>]+?class="videoplayer"',
webpage, 'video url')
title = self._og_search_title(webpage, default=None) or self._search_regex(
r'<title>([^<]+)</title>', webpage, 'title')
duration = int_or_none(self._og_search_property(
'video:duration', webpage, 'duration', default=None))
return {
'id': video_id,
'url': video_url,
'title': title,
'description': self._og_search_description(webpage, default=None),
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'duration': duration,
}

View File

@@ -3,9 +3,10 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_request
from ..utils import (
ExtractorError,
unescapeHTML,
find_xpath_attr,
)
@@ -29,25 +30,31 @@ class FlickrIE(InfoExtractor):
video_id = mobj.group('id')
video_uploader_id = mobj.group('uploader_id')
webpage_url = 'http://www.flickr.com/photos/' + video_uploader_id + '/' + video_id
webpage = self._download_webpage(webpage_url, video_id)
req = compat_urllib_request.Request(webpage_url)
req.add_header(
'User-Agent',
# it needs a more recent version
'Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20150101 Firefox/38.0 (Chrome)')
webpage = self._download_webpage(req, video_id)
secret = self._search_regex(r"photo_secret: '(\w+)'", webpage, 'secret')
secret = self._search_regex(r'secret"\s*:\s*"(\w+)"', webpage, 'secret')
first_url = 'https://secure.flickr.com/apps/video/video_mtl_xml.gne?v=x&photo_id=' + video_id + '&secret=' + secret + '&bitrate=700&target=_self'
first_xml = self._download_webpage(first_url, video_id, 'Downloading first data webpage')
first_xml = self._download_xml(first_url, video_id, 'Downloading first data webpage')
node_id = self._html_search_regex(r'<Item id="id">(\d+-\d+)</Item>',
first_xml, 'node_id')
node_id = find_xpath_attr(
first_xml, './/{http://video.yahoo.com/YEP/1.0/}Item', 'id',
'id').text
second_url = 'https://secure.flickr.com/video_playlist.gne?node_id=' + node_id + '&tech=flash&mode=playlist&bitrate=700&secret=' + secret + '&rd=video.yahoo.com&noad=1'
second_xml = self._download_webpage(second_url, video_id, 'Downloading second data webpage')
second_xml = self._download_xml(second_url, video_id, 'Downloading second data webpage')
self.report_extraction(video_id)
mobj = re.search(r'<STREAM APP="(.+?)" FULLPATH="(.+?)"', second_xml)
if mobj is None:
stream = second_xml.find('.//STREAM')
if stream is None:
raise ExtractorError('Unable to extract video url')
video_url = mobj.group(1) + unescapeHTML(mobj.group(2))
video_url = stream.attrib['APP'] + stream.attrib['FULLPATH']
return {
'id': video_id,

View File

@@ -6,14 +6,21 @@ from .common import InfoExtractor
class FootyRoomIE(InfoExtractor):
_VALID_URL = r'http://footyroom\.com/(?P<id>[^/]+)'
_TEST = {
_TESTS = [{
'url': 'http://footyroom.com/schalke-04-0-2-real-madrid-2015-02/',
'info_dict': {
'id': 'schalke-04-0-2-real-madrid-2015-02',
'title': 'Schalke 04 0 2 Real Madrid',
},
'playlist_count': 3,
}
}, {
'url': 'http://footyroom.com/georgia-0-2-germany-2015-03/',
'info_dict': {
'id': 'georgia-0-2-germany-2015-03',
'title': 'Georgia 0 2 Germany',
},
'playlist_count': 1,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
@@ -36,6 +43,7 @@ class FootyRoomIE(InfoExtractor):
r'data-config="([^"]+)"', payload,
'playwire url', default=None)
if playwire_url:
entries.append(self.url_result(playwire_url, 'Playwire'))
entries.append(self.url_result(self._proto_relative_url(
playwire_url, 'http:'), 'Playwire'))
return self.playlist_result(entries, playlist_id, playlist_title)

View File

@@ -0,0 +1,32 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import smuggle_url
class FoxSportsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?foxsports\.com/(?:[^/]+/)*(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.foxsports.com/video?vid=432609859715',
'info_dict': {
'id': 'gA0bHB3Ladz3',
'ext': 'flv',
'title': 'Courtney Lee on going up 2-0 in series vs. Blazers',
'description': 'Courtney Lee talks about Memphis being focused.',
},
'add_ie': ['ThePlatform'],
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
config = self._parse_json(
self._search_regex(
r"data-player-config='([^']+)'", webpage, 'data player config'),
video_id)
return self.url_result(smuggle_url(
config['releaseURL'] + '&manifest=f4m', {'force_smil_url': True}))

View File

@@ -14,7 +14,9 @@ from ..utils import (
clean_html,
ExtractorError,
int_or_none,
float_or_none,
parse_duration,
determine_ext,
)
@@ -50,7 +52,8 @@ class FranceTVBaseInfoExtractor(InfoExtractor):
if not video_url:
continue
format_id = video['format']
if video_url.endswith('.f4m'):
ext = determine_ext(video_url)
if ext == 'f4m':
if georestricted:
# See https://github.com/rg3/youtube-dl/issues/3963
# m3u8 urls work fine
@@ -60,12 +63,9 @@ class FranceTVBaseInfoExtractor(InfoExtractor):
'http://hdfauth.francetv.fr/esi/urltokengen2.html?url=%s' % video_url_parsed.path,
video_id, 'Downloading f4m manifest token', fatal=False)
if f4m_url:
f4m_formats = self._extract_f4m_formats(f4m_url, video_id)
for f4m_format in f4m_formats:
f4m_format['preference'] = 1
formats.extend(f4m_formats)
elif video_url.endswith('.m3u8'):
formats.extend(self._extract_m3u8_formats(video_url, video_id, 'mp4'))
formats.extend(self._extract_f4m_formats(f4m_url, video_id, 1, format_id))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(video_url, video_id, 'mp4', m3u8_id=format_id))
elif video_url.startswith('rtmp'):
formats.append({
'url': video_url,
@@ -86,7 +86,7 @@ class FranceTVBaseInfoExtractor(InfoExtractor):
'title': info['titre'],
'description': clean_html(info['synopsis']),
'thumbnail': compat_urlparse.urljoin('http://pluzz.francetv.fr', info['image']),
'duration': parse_duration(info['duree']),
'duration': float_or_none(info.get('real_duration'), 1000) or parse_duration(info['duree']),
'timestamp': int_or_none(info['diffusion']['timestamp']),
'formats': formats,
}
@@ -260,22 +260,28 @@ class CultureboxIE(FranceTVBaseInfoExtractor):
_VALID_URL = r'https?://(?:m\.)?culturebox\.francetvinfo\.fr/(?P<name>.*?)(\?|$)'
_TEST = {
'url': 'http://culturebox.francetvinfo.fr/festivals/dans-les-jardins-de-william-christie/dans-les-jardins-de-william-christie-le-camus-162553',
'md5': '5ad6dec1ffb2a3fbcb20cc4b744be8d6',
'url': 'http://culturebox.francetvinfo.fr/live/musique/musique-classique/le-livre-vermeil-de-montserrat-a-la-cathedrale-delne-214511',
'md5': '9b88dc156781c4dbebd4c3e066e0b1d6',
'info_dict': {
'id': 'EV_22853',
'id': 'EV_50111',
'ext': 'flv',
'title': 'Dans les jardins de William Christie - Le Camus',
'description': 'md5:4710c82315c40f0c865ca8b9a68b5299',
'upload_date': '20140829',
'timestamp': 1409317200,
'title': "Le Livre Vermeil de Montserrat à la Cathédrale d'Elne",
'description': 'md5:f8a4ad202e8fe533e2c493cc12e739d9',
'upload_date': '20150320',
'timestamp': 1426892400,
'duration': 2760.9,
},
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
name = mobj.group('name')
webpage = self._download_webpage(url, name)
if ">Ce live n'est plus disponible en replay<" in webpage:
raise ExtractorError('Video %s is not available' % name, expected=True)
video_id, catalogue = self._search_regex(
r'"http://videos\.francetv\.fr/video/([^@]+@[^"]+)"', webpage, 'video id').split('@')

View File

@@ -0,0 +1,70 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
js_to_json,
parse_duration,
remove_start,
)
class GamersydeIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gamersyde\.com/hqstream_(?P<display_id>[\da-z_]+)-(?P<id>\d+)_[a-z]{2}\.html'
_TEST = {
'url': 'http://www.gamersyde.com/hqstream_bloodborne_birth_of_a_hero-34371_en.html',
'md5': 'f38d400d32f19724570040d5ce3a505f',
'info_dict': {
'id': '34371',
'ext': 'mp4',
'duration': 372,
'title': 'Bloodborne - Birth of a hero',
'thumbnail': 're:^https?://.*\.jpg$',
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('display_id')
webpage = self._download_webpage(url, display_id)
playlist = self._parse_json(
self._search_regex(
r'(?s)playlist: \[({.+?})\]\s*}\);', webpage, 'files'),
display_id, transform_source=js_to_json)
formats = []
for source in playlist['sources']:
video_url = source.get('file')
if not video_url:
continue
format_id = source.get('label')
f = {
'url': video_url,
'format_id': format_id,
}
m = re.search(r'^(?P<height>\d+)[pP](?P<fps>\d+)fps', format_id)
if m:
f.update({
'height': int(m.group('height')),
'fps': int(m.group('fps')),
})
formats.append(f)
self._sort_formats(formats)
title = remove_start(playlist['title'], '%s - ' % video_id)
thumbnail = playlist.get('image')
duration = parse_duration(self._search_regex(
r'Length:</label>([^<]+)<', webpage, 'duration', fatal=False))
return {
'id': video_id,
'display_id': display_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}

View File

@@ -14,8 +14,8 @@ from ..utils import (
class GameSpotIE(InfoExtractor):
_VALID_URL = r'(?:http://)?(?:www\.)?gamespot\.com/.*-(?P<id>\d+)/?'
_TEST = {
_VALID_URL = r'http://(?:www\.)?gamespot\.com/.*-(?P<id>\d+)/?'
_TESTS = [{
'url': 'http://www.gamespot.com/videos/arma-3-community-guide-sitrep-i/2300-6410818/',
'md5': 'b2a30deaa8654fcccd43713a6b6a4825',
'info_dict': {
@@ -23,8 +23,16 @@ class GameSpotIE(InfoExtractor):
'ext': 'mp4',
'title': 'Arma 3 - Community Guide: SITREP I',
'description': 'Check out this video where some of the basics of Arma 3 is explained.',
}
}
},
}, {
'url': 'http://www.gamespot.com/videos/the-witcher-3-wild-hunt-xbox-one-now-playing/2300-6424837/',
'info_dict': {
'id': 'gs-2300-6424837',
'ext': 'flv',
'title': 'The Witcher 3: Wild Hunt [Xbox ONE] - Now Playing',
'description': 'Join us as we take a look at the early hours of The Witcher 3: Wild Hunt and more.',
},
}]
def _real_extract(self, url):
page_id = self._match_id(url)
@@ -32,25 +40,37 @@ class GameSpotIE(InfoExtractor):
data_video_json = self._search_regex(
r'data-video=["\'](.*?)["\']', webpage, 'data video')
data_video = json.loads(unescapeHTML(data_video_json))
streams = data_video['videoStreams']
# Transform the manifest url to a link to the mp4 files
# they are used in mobile devices.
f4m_url = data_video['videoStreams']['f4m_stream']
f4m_path = compat_urlparse.urlparse(f4m_url).path
QUALITIES_RE = r'((,\d+)+,?)'
qualities = self._search_regex(QUALITIES_RE, f4m_path, 'qualities').strip(',').split(',')
http_path = f4m_path[1:].split('/', 1)[1]
http_template = re.sub(QUALITIES_RE, r'%s', http_path)
http_template = http_template.replace('.csmil/manifest.f4m', '')
http_template = compat_urlparse.urljoin(
'http://video.gamespotcdn.com/', http_template)
formats = []
for q in qualities:
formats.append({
'url': http_template % q,
'ext': 'mp4',
'format_id': q,
})
f4m_url = streams.get('f4m_stream')
if f4m_url is not None:
# Transform the manifest url to a link to the mp4 files
# they are used in mobile devices.
f4m_path = compat_urlparse.urlparse(f4m_url).path
QUALITIES_RE = r'((,\d+)+,?)'
qualities = self._search_regex(QUALITIES_RE, f4m_path, 'qualities').strip(',').split(',')
http_path = f4m_path[1:].split('/', 1)[1]
http_template = re.sub(QUALITIES_RE, r'%s', http_path)
http_template = http_template.replace('.csmil/manifest.f4m', '')
http_template = compat_urlparse.urljoin(
'http://video.gamespotcdn.com/', http_template)
for q in qualities:
formats.append({
'url': http_template % q,
'ext': 'mp4',
'format_id': q,
})
else:
for quality in ['sd', 'hd']:
# It's actually a link to a flv file
flv_url = streams.get('f4m_{0}'.format(quality))
if flv_url is not None:
formats.append({
'url': flv_url,
'ext': 'flv',
'format_id': quality,
})
return {
'id': data_video['guid'],

View File

@@ -11,7 +11,7 @@ from ..utils import remove_end
class GDCVaultIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)/(?P<name>(\w|-)+)'
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)/(?P<name>(\w|-)+)?'
_NETRC_MACHINE = 'gdcvault'
_TESTS = [
{
@@ -19,6 +19,7 @@ class GDCVaultIE(InfoExtractor):
'md5': '7ce8388f544c88b7ac11c7ab1b593704',
'info_dict': {
'id': '1019721',
'display_id': 'Doki-Doki-Universe-Sweet-Simple',
'ext': 'mp4',
'title': 'Doki-Doki Universe: Sweet, Simple and Genuine (GDC Next 10)'
}
@@ -27,6 +28,7 @@ class GDCVaultIE(InfoExtractor):
'url': 'http://www.gdcvault.com/play/1015683/Embracing-the-Dark-Art-of',
'info_dict': {
'id': '1015683',
'display_id': 'Embracing-the-Dark-Art-of',
'ext': 'flv',
'title': 'Embracing the Dark Art of Mathematical Modeling in AI'
},
@@ -39,10 +41,15 @@ class GDCVaultIE(InfoExtractor):
'md5': 'a5eb77996ef82118afbbe8e48731b98e',
'info_dict': {
'id': '1015301',
'display_id': 'Thexder-Meets-Windows-95-or',
'ext': 'flv',
'title': 'Thexder Meets Windows 95, or Writing Great Games in the Windows 95 Environment',
},
'skip': 'Requires login',
},
{
'url': 'http://gdcvault.com/play/1020791/',
'only_matching': True,
}
]
@@ -90,7 +97,7 @@ class GDCVaultIE(InfoExtractor):
})
return video_formats
def _login(self, webpage_url, video_id):
def _login(self, webpage_url, display_id):
(username, password) = self._get_login_info()
if username is None or password is None:
self.report_warning('It looks like ' + webpage_url + ' requires a login. Try specifying a username and password and try again.')
@@ -107,9 +114,9 @@ class GDCVaultIE(InfoExtractor):
request = compat_urllib_request.Request(login_url, compat_urllib_parse.urlencode(login_form))
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
self._download_webpage(request, video_id, 'Logging in')
start_page = self._download_webpage(webpage_url, video_id, 'Getting authenticated video page')
self._download_webpage(logout_url, video_id, 'Logging out')
self._download_webpage(request, display_id, 'Logging in')
start_page = self._download_webpage(webpage_url, display_id, 'Getting authenticated video page')
self._download_webpage(logout_url, display_id, 'Logging out')
return start_page
@@ -117,8 +124,10 @@ class GDCVaultIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('name') or video_id
webpage_url = 'http://www.gdcvault.com/play/' + video_id
start_page = self._download_webpage(webpage_url, video_id)
start_page = self._download_webpage(webpage_url, display_id)
direct_url = self._search_regex(
r's1\.addVariable\("file",\s*encodeURIComponent\("(/[^"]+)"\)\);',
@@ -131,6 +140,7 @@ class GDCVaultIE(InfoExtractor):
return {
'id': video_id,
'display_id': display_id,
'url': video_url,
'ext': 'flv',
'title': title,
@@ -141,7 +151,7 @@ class GDCVaultIE(InfoExtractor):
start_page, 'xml root', default=None)
if xml_root is None:
# Probably need to authenticate
login_res = self._login(webpage_url, video_id)
login_res = self._login(webpage_url, display_id)
if login_res is None:
self.report_warning('Could not login.')
else:
@@ -159,7 +169,7 @@ class GDCVaultIE(InfoExtractor):
xml_name = self._html_search_regex(r'<iframe src=".*?\?xmlURL=xml/(?P<xml_file>.+?\.xml).*?".*?</iframe>', start_page, 'xml filename')
xml_decription_url = xml_root + 'xml/' + xml_name
xml_description = self._download_xml(xml_decription_url, video_id)
xml_description = self._download_xml(xml_decription_url, display_id)
video_title = xml_description.find('./metadata/title').text
video_formats = self._parse_mp4(xml_description)
@@ -168,6 +178,7 @@ class GDCVaultIE(InfoExtractor):
return {
'id': video_id,
'display_id': display_id,
'title': video_title,
'formats': video_formats,
}

View File

@@ -9,6 +9,8 @@ from .common import InfoExtractor
from .youtube import YoutubeIE
from ..compat import (
compat_urllib_parse,
compat_urllib_parse_unquote,
compat_urllib_request,
compat_urlparse,
compat_xml_parse_error,
)
@@ -29,10 +31,18 @@ from ..utils import (
xpath_text,
)
from .brightcove import BrightcoveIE
from .nbc import NBCSportsVPlayerIE
from .ooyala import OoyalaIE
from .rutv import RUTVIE
from .tvc import TVCIE
from .sportbox import SportBoxEmbedIE
from .smotri import SmotriIE
from .condenast import CondeNastIE
from .udn import UDNEmbedIE
from .senateisvp import SenateISVPIE
from .bliptv import BlipTVIE
from .svt import SVTIE
from .pornhub import PornHubIE
class GenericIE(InfoExtractor):
@@ -40,6 +50,97 @@ class GenericIE(InfoExtractor):
_VALID_URL = r'.*'
IE_NAME = 'generic'
_TESTS = [
# Direct link to a video
{
'url': 'http://media.w3.org/2010/05/sintel/trailer.mp4',
'md5': '67d406c2bcb6af27fa886f31aa934bbe',
'info_dict': {
'id': 'trailer',
'ext': 'mp4',
'title': 'trailer',
'upload_date': '20100513',
}
},
# Direct link to media delivered compressed (until Accept-Encoding is *)
{
'url': 'http://calimero.tk/muzik/FictionJunction-Parallel_Hearts.flac',
'md5': '128c42e68b13950268b648275386fc74',
'info_dict': {
'id': 'FictionJunction-Parallel_Hearts',
'ext': 'flac',
'title': 'FictionJunction-Parallel_Hearts',
'upload_date': '20140522',
},
'expected_warnings': [
'URL could be a direct video link, returning it as such.'
]
},
# Direct download with broken HEAD
{
'url': 'http://ai-radio.org:8000/radio.opus',
'info_dict': {
'id': 'radio',
'ext': 'opus',
'title': 'radio',
},
'params': {
'skip_download': True, # infinite live stream
},
'expected_warnings': [
r'501.*Not Implemented'
],
},
# Direct link with incorrect MIME type
{
'url': 'http://ftp.nluug.nl/video/nluug/2014-11-20_nj14/zaal-2/5_Lennart_Poettering_-_Systemd.webm',
'md5': '4ccbebe5f36706d85221f204d7eb5913',
'info_dict': {
'url': 'http://ftp.nluug.nl/video/nluug/2014-11-20_nj14/zaal-2/5_Lennart_Poettering_-_Systemd.webm',
'id': '5_Lennart_Poettering_-_Systemd',
'ext': 'webm',
'title': '5_Lennart_Poettering_-_Systemd',
'upload_date': '20141120',
},
'expected_warnings': [
'URL could be a direct video link, returning it as such.'
]
},
# RSS feed
{
'url': 'http://phihag.de/2014/youtube-dl/rss2.xml',
'info_dict': {
'id': 'http://phihag.de/2014/youtube-dl/rss2.xml',
'title': 'Zero Punctuation',
'description': 're:.*groundbreaking video review series.*'
},
'playlist_mincount': 11,
},
# RSS feed with enclosure
{
'url': 'http://podcastfeeds.nbcnews.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
'info_dict': {
'id': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
'ext': 'm4v',
'upload_date': '20150228',
'title': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
}
},
# google redirect
{
'url': 'http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCUQtwIwAA&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcmQHVoWB5FY&ei=F-sNU-LLCaXk4QT52ICQBQ&usg=AFQjCNEw4hL29zgOohLXvpJ-Bdh2bils1Q&bvm=bv.61965928,d.bGE',
'info_dict': {
'id': 'cmQHVoWB5FY',
'ext': 'mp4',
'upload_date': '20130224',
'uploader_id': 'TheVerge',
'description': 're:^Chris Ziegler takes a look at the\.*',
'uploader': 'The Verge',
'title': 'First Firefox OS phones side-by-side',
},
'params': {
'skip_download': False,
}
},
{
'url': 'http://www.hodiho.fr/2013/02/regis-plante-sa-jeep.html',
'md5': '85b90ccc9d73b4acd9138d3af4c27f89',
@@ -119,17 +220,6 @@ class GenericIE(InfoExtractor):
'skip_download': True, # m3u8 download
},
},
# Direct link to a video
{
'url': 'http://media.w3.org/2010/05/sintel/trailer.mp4',
'md5': '67d406c2bcb6af27fa886f31aa934bbe',
'info_dict': {
'id': 'trailer',
'ext': 'mp4',
'title': 'trailer',
'upload_date': '20100513',
}
},
# ooyala video
{
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
@@ -154,22 +244,6 @@ class GenericIE(InfoExtractor):
},
'add_ie': ['Ooyala'],
},
# google redirect
{
'url': 'http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCUQtwIwAA&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcmQHVoWB5FY&ei=F-sNU-LLCaXk4QT52ICQBQ&usg=AFQjCNEw4hL29zgOohLXvpJ-Bdh2bils1Q&bvm=bv.61965928,d.bGE',
'info_dict': {
'id': 'cmQHVoWB5FY',
'ext': 'mp4',
'upload_date': '20130224',
'uploader_id': 'TheVerge',
'description': 're:^Chris Ziegler takes a look at the\.*',
'uploader': 'The Verge',
'title': 'First Firefox OS phones side-by-side',
},
'params': {
'skip_download': False,
}
},
# embed.ly video
{
'url': 'http://www.tested.com/science/weird/460206-tested-grinding-coffee-2000-frames-second/',
@@ -219,6 +293,46 @@ class GenericIE(InfoExtractor):
'skip_download': True,
},
},
# TVC embed
{
'url': 'http://sch1298sz.mskobr.ru/dou_edu/karamel_ki/filial_galleries/video/iframe_src_http_tvc_ru_video_iframe_id_55304_isplay_false_acc_video_id_channel_brand_id_11_show_episodes_episode_id_32307_frameb/',
'info_dict': {
'id': '55304',
'ext': 'mp4',
'title': 'Дошкольное воспитание',
},
},
# SportBox embed
{
'url': 'http://www.vestifinance.ru/articles/25753',
'info_dict': {
'id': '25753',
'title': 'Вести Экономика ― Прямые трансляции с Форума-выставки "Госзаказ-2013"',
},
'playlist': [{
'info_dict': {
'id': '370908',
'title': 'Госзаказ. День 3',
'ext': 'mp4',
}
}, {
'info_dict': {
'id': '370905',
'title': 'Госзаказ. День 2',
'ext': 'mp4',
}
}, {
'info_dict': {
'id': '370902',
'title': 'Госзаказ. День 1',
'ext': 'mp4',
}
}],
'params': {
# m3u8 download
'skip_download': True,
},
},
# Embedded TED video
{
'url': 'http://en.support.wordpress.com/videos/ted-talks/',
@@ -370,16 +484,6 @@ class GenericIE(InfoExtractor):
'title': 'Busty Blonde Siri Tit Fuck While Wank at HandjobHub.com',
}
},
# RSS feed
{
'url': 'http://phihag.de/2014/youtube-dl/rss2.xml',
'info_dict': {
'id': 'http://phihag.de/2014/youtube-dl/rss2.xml',
'title': 'Zero Punctuation',
'description': 're:.*groundbreaking video review series.*'
},
'playlist_mincount': 11,
},
# Multiple brightcove videos
# https://github.com/rg3/youtube-dl/issues/2283
{
@@ -433,21 +537,6 @@ class GenericIE(InfoExtractor):
'uploader': 'thoughtworks.wistia.com',
},
},
# Direct download with broken HEAD
{
'url': 'http://ai-radio.org:8000/radio.opus',
'info_dict': {
'id': 'radio',
'ext': 'opus',
'title': 'radio',
},
'params': {
'skip_download': True, # infinite live stream
},
'expected_warnings': [
r'501.*Not Implemented'
],
},
# Soundcloud embed
{
'url': 'http://nakedsecurity.sophos.com/2014/10/29/sscc-171-are-you-sure-that-1234-is-a-bad-password-podcast/',
@@ -479,21 +568,6 @@ class GenericIE(InfoExtractor):
},
'playlist_mincount': 2,
},
# Direct link with incorrect MIME type
{
'url': 'http://ftp.nluug.nl/video/nluug/2014-11-20_nj14/zaal-2/5_Lennart_Poettering_-_Systemd.webm',
'md5': '4ccbebe5f36706d85221f204d7eb5913',
'info_dict': {
'url': 'http://ftp.nluug.nl/video/nluug/2014-11-20_nj14/zaal-2/5_Lennart_Poettering_-_Systemd.webm',
'id': '5_Lennart_Poettering_-_Systemd',
'ext': 'webm',
'title': '5_Lennart_Poettering_-_Systemd',
'upload_date': '20141120',
},
'expected_warnings': [
'URL could be a direct video link, returning it as such.'
]
},
# Cinchcast embed
{
'url': 'http://undergroundwellness.com/podcasts/306-5-steps-to-permanent-gut-healing/',
@@ -613,22 +687,131 @@ class GenericIE(InfoExtractor):
'info_dict': {
'id': '100183293',
'ext': 'mp4',
'title': 'Тайны перевала Дятлова • Тайна перевала Дятлова 1 серия 2 часть',
'title': 'Тайны перевала Дятлова • 1 серия 2 часть',
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 694,
'age_limit': 0,
},
},
# RSS feed with enclosure
# Playwire embed
{
'url': 'http://podcastfeeds.nbcnews.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
'url': 'http://www.cinemablend.com/new/First-Joe-Dirt-2-Trailer-Teaser-Stupid-Greatness-70874.html',
'info_dict': {
'id': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
'ext': 'm4v',
'upload_date': '20150228',
'title': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
'id': '3519514',
'ext': 'mp4',
'title': 'Joe Dirt 2 Beautiful Loser Teaser Trailer',
'thumbnail': 're:^https?://.*\.png$',
'duration': 45.115,
},
},
# 5min embed
{
'url': 'http://techcrunch.com/video/facebook-creates-on-this-day-crunch-report/518726732/',
'md5': '4c6f127a30736b59b3e2c19234ee2bf7',
'info_dict': {
'id': '518726732',
'ext': 'mp4',
'title': 'Facebook Creates "On This Day" | Crunch Report',
},
},
# SVT embed
{
'url': 'http://www.svt.se/sport/ishockey/jagr-tacklar-giroux-under-intervjun',
'info_dict': {
'id': '2900353',
'ext': 'flv',
'title': 'Här trycker Jagr till Giroux (under SVT-intervjun)',
'duration': 27,
'age_limit': 0,
},
},
# Crooks and Liars embed
{
'url': 'http://crooksandliars.com/2015/04/fox-friends-says-protecting-atheists',
'info_dict': {
'id': '8RUoRhRi',
'ext': 'mp4',
'title': "Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!",
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',
},
},
# Crooks and Liars external embed
{
'url': 'http://theothermccain.com/2010/02/02/video-proves-that-bill-kristol-has-been-watching-glenn-beck/comment-page-1/',
'info_dict': {
'id': 'MTE3MjUtMzQ2MzA',
'ext': 'mp4',
'title': 'md5:5e3662a81a4014d24c250d76d41a08d5',
'description': 'md5:9b8e9542d6c3c5de42d6451b7d780cec',
'timestamp': 1265032391,
'upload_date': '20100201',
'uploader': 'Heather',
},
},
# NBC Sports vplayer embed
{
'url': 'http://www.riderfans.com/forum/showthread.php?121827-Freeman&s=e98fa1ea6dc08e886b1678d35212494a',
'info_dict': {
'id': 'ln7x1qSThw4k',
'ext': 'flv',
'title': "PFT Live: New leader in the 'new-look' defense",
'description': 'md5:65a19b4bbfb3b0c0c5768bed1dfad74e',
},
},
# UDN embed
{
'url': 'http://www.udn.com/news/story/7314/822787',
'md5': 'fd2060e988c326991037b9aff9df21a6',
'info_dict': {
'id': '300346',
'ext': 'mp4',
'title': '中一中男師變性 全校師生力挺',
'thumbnail': 're:^https?://.*\.jpg$',
}
},
# Ooyala embed
{
'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T',
'info_dict': {
'id': '50YnY4czr4ms1vJ7yz3xzq0excz_pUMs',
'ext': 'mp4',
'description': 'VIDEO: Index/Match versus VLOOKUP.',
'title': 'This is what separates the Excel masters from the wannabes',
},
'params': {
# m3u8 downloads
'skip_download': True,
}
},
# Contains a SMIL manifest
{
'url': 'http://www.telewebion.com/fa/1263668/%D9%82%D8%B1%D8%B9%D9%87%E2%80%8C%DA%A9%D8%B4%DB%8C-%D9%84%DB%8C%DA%AF-%D9%82%D9%87%D8%B1%D9%85%D8%A7%D9%86%D8%A7%D9%86-%D8%A7%D8%B1%D9%88%D9%BE%D8%A7/%2B-%D9%81%D9%88%D8%AA%D8%A8%D8%A7%D9%84.html',
'info_dict': {
'id': 'file',
'ext': 'flv',
'title': '+ Football: Lottery Champions League Europe',
'uploader': 'www.telewebion.com',
},
'params': {
# rtmpe downloads
'skip_download': True,
}
},
# Brightcove URL in single quotes
{
'url': 'http://www.sportsnet.ca/baseball/mlb/sn-presents-russell-martin-world-citizen/',
'md5': '4ae374f1f8b91c889c4b9203c8c752af',
'info_dict': {
'id': '4255764656001',
'ext': 'mp4',
'title': 'SN Presents: Russell Martin, World Citizen',
'description': 'To understand why he was the Toronto Blue Jays top off-season priority is to appreciate his background and upbringing in Montreal, where he first developed his baseball skills. Written and narrated by Stephen Brunt.',
'uploader': 'Rogers Sportsnet',
},
}
]
@@ -750,7 +933,7 @@ class GenericIE(InfoExtractor):
force_videoid = smuggled_data['force_videoid']
video_id = force_videoid
else:
video_id = os.path.splitext(url.rstrip('/').split('/')[-1])[0]
video_id = compat_urllib_parse_unquote(os.path.splitext(url.rstrip('/').split('/')[-1])[0])
self.to_screen('%s: Requesting header' % video_id)
@@ -772,7 +955,9 @@ class GenericIE(InfoExtractor):
full_response = None
if head_response is False:
full_response = self._request_webpage(url, video_id)
request = compat_urllib_request.Request(url)
request.add_header('Accept-Encoding', '*')
full_response = self._request_webpage(request, video_id)
head_response = full_response
# Check for direct link to a video
@@ -783,7 +968,7 @@ class GenericIE(InfoExtractor):
head_response.headers.get('Last-Modified'))
return {
'id': video_id,
'title': os.path.splitext(url_basename(url))[0],
'title': compat_urllib_parse_unquote(os.path.splitext(url_basename(url))[0]),
'direct': True,
'formats': [{
'format_id': m.group('format_id'),
@@ -797,7 +982,17 @@ class GenericIE(InfoExtractor):
self._downloader.report_warning('Falling back on generic information extractor.')
if not full_response:
full_response = self._request_webpage(url, video_id)
request = compat_urllib_request.Request(url)
# Some webservers may serve compressed content of rather big size (e.g. gzipped flac)
# making it impossible to download only chunk of the file (yet we need only 512kB to
# test whether it's HTML or not). According to youtube-dl default Accept-Encoding
# that will always result in downloading the whole file that is not desirable.
# Therefore for extraction pass we have to override Accept-Encoding to any in order
# to accept raw bytes and being able to download only a chunk.
# It may probably better to solve this by checking Content-Type for application/octet-stream
# after HEAD request finishes, but not sure if we can rely on this.
request.add_header('Accept-Encoding', '*')
full_response = self._request_webpage(request, video_id)
# Maybe it's a direct link to a video?
# Be careful not to download the whole thing!
@@ -809,7 +1004,7 @@ class GenericIE(InfoExtractor):
head_response.headers.get('Last-Modified'))
return {
'id': video_id,
'title': os.path.splitext(url_basename(url))[0],
'title': compat_urllib_parse_unquote(os.path.splitext(url_basename(url))[0]),
'direct': True,
'url': url,
'upload_date': upload_date,
@@ -889,7 +1084,7 @@ class GenericIE(InfoExtractor):
# Look for embedded rtl.nl player
matches = re.findall(
r'<iframe\s+(?:[a-zA-Z-]+="[^"]+"\s+)*?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+video_embed[^"]+)"',
r'<iframe[^>]+?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+(?:video_)?embed[^"]+)"',
webpage)
if matches:
return _playlist_from_matches(matches, ie='RtlNl')
@@ -974,12 +1169,14 @@ class GenericIE(InfoExtractor):
}
# Look for embedded blip.tv player
mobj = re.search(r'<meta\s[^>]*https?://api\.blip\.tv/\w+/redirect/\w+/(\d+)', webpage)
if mobj:
return self.url_result('http://blip.tv/a/a-' + mobj.group(1), 'BlipTV')
mobj = re.search(r'<(?:iframe|embed|object)\s[^>]*(https?://(?:\w+\.)?blip\.tv/(?:play/|api\.swf#)[a-zA-Z0-9_]+)', webpage)
if mobj:
return self.url_result(mobj.group(1), 'BlipTV')
bliptv_url = BlipTVIE._extract_url(webpage)
if bliptv_url:
return self.url_result(bliptv_url, 'BlipTV')
# Look for SVT player
svt_url = SVTIE._extract_url(webpage)
if svt_url:
return self.url_result(svt_url, 'SVT')
# Look for embedded condenast player
matches = re.findall(
@@ -1033,7 +1230,8 @@ class GenericIE(InfoExtractor):
# Look for Ooyala videos
mobj = (re.search(r'player\.ooyala\.com/[^"?]+\?[^"]*?(?:embedCode|ec)=(?P<ec>[^"&]+)', webpage) or
re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage) or
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage))
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage) or
re.search(r'data-ooyala-video-id\s*=\s*[\'"](?P<ec>.{32})[\'"]', webpage))
if mobj is not None:
return OoyalaIE._build_url_result(mobj.group('ec'))
@@ -1114,6 +1312,27 @@ class GenericIE(InfoExtractor):
if rutv_url:
return self.url_result(rutv_url, 'RUTV')
# Look for embedded TVC player
tvc_url = TVCIE._extract_url(webpage)
if tvc_url:
return self.url_result(tvc_url, 'TVC')
# Look for embedded SportBox player
sportbox_urls = SportBoxEmbedIE._extract_urls(webpage)
if sportbox_urls:
return _playlist_from_matches(sportbox_urls, ie='SportBoxEmbed')
# Look for embedded PornHub player
pornhub_url = PornHubIE._extract_url(webpage)
if pornhub_url:
return self.url_result(pornhub_url, 'PornHub')
# Look for embedded Tvigle player
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//cloud\.tvigle\.ru/video/.+?)\1', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'), 'Tvigle')
# Look for embedded TED player
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>https?://embed(?:-ssl)?\.ted\.com/.+?)\1', webpage)
@@ -1191,6 +1410,10 @@ class GenericIE(InfoExtractor):
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>https?://m(?:lb)?\.mlb\.com/shared/video/embed/embed\.html\?.+?)\1',
webpage)
if not mobj:
mobj = re.search(
r'data-video-link=["\'](?P<url>http://m.mlb.com/video/[^"\']+)',
webpage)
if mobj is not None:
return self.url_result(mobj.group('url'), 'MLB')
@@ -1236,6 +1459,41 @@ class GenericIE(InfoExtractor):
if mobj is not None:
return self.url_result(mobj.group('url'), 'Pladform')
# Look for Playwire embeds
mobj = re.search(
r'<script[^>]+data-config=(["\'])(?P<url>(?:https?:)?//config\.playwire\.com/.+?)\1', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'))
# Look for 5min embeds
mobj = re.search(
r'<meta[^>]+property="og:video"[^>]+content="https?://embed\.5min\.com/(?P<id>[0-9]+)/?', webpage)
if mobj is not None:
return self.url_result('5min:%s' % mobj.group('id'), 'FiveMin')
# Look for Crooks and Liars embeds
mobj = re.search(
r'<(?:iframe[^>]+src|param[^>]+value)=(["\'])(?P<url>(?:https?:)?//embed\.crooksandliars\.com/(?:embed|v)/.+?)\1', webpage)
if mobj is not None:
return self.url_result(mobj.group('url'))
# Look for NBC Sports VPlayer embeds
nbc_sports_url = NBCSportsVPlayerIE._extract_url(webpage)
if nbc_sports_url:
return self.url_result(nbc_sports_url, 'NBCSportsVPlayer')
# Look for UDN embeds
mobj = re.search(
r'<iframe[^>]+src="(?P<url>%s)"' % UDNEmbedIE._VALID_URL, webpage)
if mobj is not None:
return self.url_result(
compat_urlparse.urljoin(url, mobj.group('url')), 'UDNEmbed')
# Look for Senate ISVP iframe
senate_isvp_url = SenateISVPIE._search_iframe_url(webpage)
if senate_isvp_url:
return self.url_result(senate_isvp_url, 'SenateISVP')
def check_video(vurl):
if YoutubeIE.suitable(vurl):
return True
@@ -1303,7 +1561,7 @@ class GenericIE(InfoExtractor):
if refresh_header:
found = re.search(REDIRECT_REGEX, refresh_header)
if found:
new_url = found.group(1)
new_url = compat_urlparse.urljoin(url, found.group(1))
self.report_following_redirect(new_url)
return {
'_type': 'url',
@@ -1325,13 +1583,22 @@ class GenericIE(InfoExtractor):
# here's a fun little line of code for you:
video_id = os.path.splitext(video_id)[0]
entries.append({
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'title': video_title,
'age_limit': age_limit,
})
if determine_ext(video_url) == 'smil':
entries.append({
'id': video_id,
'formats': self._extract_smil_formats(video_url, video_id),
'uploader': video_uploader,
'title': video_title,
'age_limit': age_limit,
})
else:
entries.append({
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'title': video_title,
'age_limit': age_limit,
})
if len(entries) == 1:
return entries[0]

View File

@@ -0,0 +1,90 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
float_or_none,
qualities,
)
class GfycatIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gfycat\.com/(?P<id>[^/?#]+)'
_TEST = {
'url': 'http://gfycat.com/DeadlyDecisiveGermanpinscher',
'info_dict': {
'id': 'DeadlyDecisiveGermanpinscher',
'ext': 'mp4',
'title': 'Ghost in the Shell',
'timestamp': 1410656006,
'upload_date': '20140914',
'uploader': 'anonymous',
'duration': 10.4,
'view_count': int,
'like_count': int,
'dislike_count': int,
'categories': list,
'age_limit': 0,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
gfy = self._download_json(
'http://gfycat.com/cajax/get/%s' % video_id,
video_id, 'Downloading video info')['gfyItem']
title = gfy.get('title') or gfy['gfyName']
description = gfy.get('description')
timestamp = int_or_none(gfy.get('createDate'))
uploader = gfy.get('userName')
view_count = int_or_none(gfy.get('views'))
like_count = int_or_none(gfy.get('likes'))
dislike_count = int_or_none(gfy.get('dislikes'))
age_limit = 18 if gfy.get('nsfw') == '1' else 0
width = int_or_none(gfy.get('width'))
height = int_or_none(gfy.get('height'))
fps = int_or_none(gfy.get('frameRate'))
num_frames = int_or_none(gfy.get('numFrames'))
duration = float_or_none(num_frames, fps) if num_frames and fps else None
categories = gfy.get('tags') or gfy.get('extraLemmas') or []
FORMATS = ('gif', 'webm', 'mp4')
quality = qualities(FORMATS)
formats = []
for format_id in FORMATS:
video_url = gfy.get('%sUrl' % format_id)
if not video_url:
continue
filesize = gfy.get('%sSize' % format_id)
formats.append({
'url': video_url,
'format_id': format_id,
'width': width,
'height': height,
'fps': fps,
'filesize': filesize,
'quality': quality(format_id),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'timestamp': timestamp,
'uploader': uploader,
'duration': duration,
'view_count': view_count,
'like_count': like_count,
'dislike_count': dislike_count,
'categories': categories,
'age_limit': age_limit,
'formats': formats,
}

View File

@@ -85,7 +85,8 @@ class GigaIE(InfoExtractor):
r'class="author">([^<]+)</a>', webpage, 'uploader', fatal=False)
view_count = str_to_int(self._search_regex(
r'<span class="views"><strong>([\d.]+)</strong>', webpage, 'view count', fatal=False))
r'<span class="views"><strong>([\d.,]+)</strong>',
webpage, 'view count', fatal=False))
return {
'id': video_id,

View File

@@ -15,10 +15,10 @@ from ..utils import (
class GorillaVidIE(InfoExtractor):
IE_DESC = 'GorillaVid.in, daclips.in, movpod.in and fastvideo.in'
IE_DESC = 'GorillaVid.in, daclips.in, movpod.in, fastvideo.in and realvid.net'
_VALID_URL = r'''(?x)
https?://(?P<host>(?:www\.)?
(?:daclips\.in|gorillavid\.in|movpod\.in|fastvideo\.in))/
(?:daclips\.in|gorillavid\.in|movpod\.in|fastvideo\.in|realvid\.net))/
(?:embed-)?(?P<id>[0-9a-zA-Z]+)(?:-[0-9]+x[0-9]+\.html)?
'''
@@ -35,13 +35,7 @@ class GorillaVidIE(InfoExtractor):
},
}, {
'url': 'http://gorillavid.in/embed-z08zf8le23c6-960x480.html',
'md5': 'c9e293ca74d46cad638e199c3f3fe604',
'info_dict': {
'id': 'z08zf8le23c6',
'ext': 'mp4',
'title': 'Say something nice',
'thumbnail': 're:http://.*\.jpg',
},
'only_matching': True,
}, {
'url': 'http://daclips.in/3rso4kdn6f9m',
'md5': '1ad8fd39bb976eeb66004d3a4895f106',
@@ -61,6 +55,15 @@ class GorillaVidIE(InfoExtractor):
'title': 'Man of Steel - Trailer',
'thumbnail': 're:http://.*\.jpg',
},
}, {
'url': 'http://realvid.net/ctn2y6p2eviw',
'md5': 'b2166d2cf192efd6b6d764c18fd3710e',
'info_dict': {
'id': 'ctn2y6p2eviw',
'ext': 'flv',
'title': 'rdx 1955',
'thumbnail': 're:http://.*\.jpg',
},
}, {
'url': 'http://movpod.in/0wguyyxi1yca',
'only_matching': True,
@@ -97,7 +100,7 @@ class GorillaVidIE(InfoExtractor):
webpage = self._download_webpage(req, video_id, 'Downloading video page')
title = self._search_regex(
r'style="z-index: [0-9]+;">([^<]+)</span>',
[r'style="z-index: [0-9]+;">([^<]+)</span>', r'>Watch (.+) '],
webpage, 'title', default=None) or self._og_search_title(webpage)
video_url = self._search_regex(
r'file\s*:\s*["\'](http[^"\']+)["\'],', webpage, 'file url')

View File

@@ -1,191 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import time
import math
import os.path
import re
from .common import InfoExtractor
from ..compat import (
compat_html_parser,
compat_urllib_parse,
compat_urllib_request,
compat_urlparse,
)
from ..utils import ExtractorError
class GroovesharkHtmlParser(compat_html_parser.HTMLParser):
def __init__(self):
self._current_object = None
self.objects = []
compat_html_parser.HTMLParser.__init__(self)
def handle_starttag(self, tag, attrs):
attrs = dict((k, v) for k, v in attrs)
if tag == 'object':
self._current_object = {'attrs': attrs, 'params': []}
elif tag == 'param':
self._current_object['params'].append(attrs)
def handle_endtag(self, tag):
if tag == 'object':
self.objects.append(self._current_object)
self._current_object = None
@classmethod
def extract_object_tags(cls, html):
p = cls()
p.feed(html)
p.close()
return p.objects
class GroovesharkIE(InfoExtractor):
_VALID_URL = r'https?://(www\.)?grooveshark\.com/#!/s/([^/]+)/([^/]+)'
_TEST = {
'url': 'http://grooveshark.com/#!/s/Jolene+Tenth+Key+Remix+Ft+Will+Sessions/6SS1DW?src=5',
'md5': '7ecf8aefa59d6b2098517e1baa530023',
'info_dict': {
'id': '6SS1DW',
'title': 'Jolene (Tenth Key Remix ft. Will Sessions)',
'ext': 'mp3',
'duration': 227,
}
}
do_playerpage_request = True
do_bootstrap_request = True
def _parse_target(self, target):
uri = compat_urlparse.urlparse(target)
hash = uri.fragment[1:].split('?')[0]
token = os.path.basename(hash.rstrip('/'))
return (uri, hash, token)
def _build_bootstrap_url(self, target):
(uri, hash, token) = self._parse_target(target)
query = 'getCommunicationToken=1&hash=%s&%d' % (compat_urllib_parse.quote(hash, safe=''), self.ts)
return (compat_urlparse.urlunparse((uri.scheme, uri.netloc, '/preload.php', None, query, None)), token)
def _build_meta_url(self, target):
(uri, hash, token) = self._parse_target(target)
query = 'hash=%s&%d' % (compat_urllib_parse.quote(hash, safe=''), self.ts)
return (compat_urlparse.urlunparse((uri.scheme, uri.netloc, '/preload.php', None, query, None)), token)
def _build_stream_url(self, meta):
return compat_urlparse.urlunparse(('http', meta['streamKey']['ip'], '/stream.php', None, None, None))
def _build_swf_referer(self, target, obj):
(uri, _, _) = self._parse_target(target)
return compat_urlparse.urlunparse((uri.scheme, uri.netloc, obj['attrs']['data'], None, None, None))
def _transform_bootstrap(self, js):
return re.split('(?m)^\s*try\s*\{', js)[0] \
.split(' = ', 1)[1].strip().rstrip(';')
def _transform_meta(self, js):
return js.split('\n')[0].split('=')[1].rstrip(';')
def _get_meta(self, target):
(meta_url, token) = self._build_meta_url(target)
self.to_screen('Metadata URL: %s' % meta_url)
headers = {'Referer': compat_urlparse.urldefrag(target)[0]}
req = compat_urllib_request.Request(meta_url, headers=headers)
res = self._download_json(req, token,
transform_source=self._transform_meta)
if 'getStreamKeyWithSong' not in res:
raise ExtractorError(
'Metadata not found. URL may be malformed, or Grooveshark API may have changed.')
if res['getStreamKeyWithSong'] is None:
raise ExtractorError(
'Metadata download failed, probably due to Grooveshark anti-abuse throttling. Wait at least an hour before retrying from this IP.',
expected=True)
return res['getStreamKeyWithSong']
def _get_bootstrap(self, target):
(bootstrap_url, token) = self._build_bootstrap_url(target)
headers = {'Referer': compat_urlparse.urldefrag(target)[0]}
req = compat_urllib_request.Request(bootstrap_url, headers=headers)
res = self._download_json(req, token, fatal=False,
note='Downloading player bootstrap data',
errnote='Unable to download player bootstrap data',
transform_source=self._transform_bootstrap)
return res
def _get_playerpage(self, target):
(_, _, token) = self._parse_target(target)
webpage = self._download_webpage(
target, token,
note='Downloading player page',
errnote='Unable to download player page',
fatal=False)
if webpage is not None:
# Search (for example German) error message
error_msg = self._html_search_regex(
r'<div id="content">\s*<h2>(.*?)</h2>', webpage,
'error message', default=None)
if error_msg is not None:
error_msg = error_msg.replace('\n', ' ')
raise ExtractorError('Grooveshark said: %s' % error_msg)
if webpage is not None:
o = GroovesharkHtmlParser.extract_object_tags(webpage)
return webpage, [x for x in o if x['attrs']['id'] == 'jsPlayerEmbed']
return webpage, None
def _real_initialize(self):
self.ts = int(time.time() * 1000) # timestamp in millis
def _real_extract(self, url):
(target_uri, _, token) = self._parse_target(url)
# 1. Fill cookiejar by making a request to the player page
swf_referer = None
if self.do_playerpage_request:
(_, player_objs) = self._get_playerpage(url)
if player_objs:
swf_referer = self._build_swf_referer(url, player_objs[0])
self.to_screen('SWF Referer: %s' % swf_referer)
# 2. Ask preload.php for swf bootstrap data to better mimic webapp
if self.do_bootstrap_request:
bootstrap = self._get_bootstrap(url)
self.to_screen('CommunicationToken: %s' % bootstrap['getCommunicationToken'])
# 3. Ask preload.php for track metadata.
meta = self._get_meta(url)
# 4. Construct stream request for track.
stream_url = self._build_stream_url(meta)
duration = int(math.ceil(float(meta['streamKey']['uSecs']) / 1000000))
post_dict = {'streamKey': meta['streamKey']['streamKey']}
post_data = compat_urllib_parse.urlencode(post_dict).encode('utf-8')
headers = {
'Content-Length': len(post_data),
'Content-Type': 'application/x-www-form-urlencoded'
}
if swf_referer is not None:
headers['Referer'] = swf_referer
return {
'id': token,
'title': meta['song']['Name'],
'http_method': 'POST',
'url': stream_url,
'ext': 'mp3',
'format': 'mp3 audio',
'duration': duration,
'http_post_data': post_data,
'http_headers': headers,
}

View File

@@ -25,7 +25,8 @@ class HistoricFilmsIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
tape_id = self._search_regex(
r'class="tapeId">([^<]+)<', webpage, 'tape id')
[r'class="tapeId"[^>]*>([^<]+)<', r'tapeId\s*:\s*"([^"]+)"'],
webpage, 'tape id')
title = self._og_search_title(webpage)
description = self._og_search_description(webpage)

View File

@@ -10,6 +10,7 @@ from ..utils import (
float_or_none,
int_or_none,
compat_str,
determine_ext,
)
@@ -42,7 +43,8 @@ class HitboxIE(InfoExtractor):
def _extract_metadata(self, url, video_id):
thumb_base = 'https://edge.sf.hitbox.tv'
metadata = self._download_json(
'%s/%s' % (url, video_id), video_id)
'%s/%s' % (url, video_id), video_id,
'Downloading metadata JSON')
date = 'media_live_since'
media_type = 'livestream'
@@ -87,21 +89,41 @@ class HitboxIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
player_config = self._download_json(
'https://www.hitbox.tv/api/player/config/video/%s' % video_id,
video_id, 'Downloading video JSON')
formats = []
for video in player_config['clip']['bitrates']:
label = video.get('label')
if label == 'Auto':
continue
video_url = video.get('url')
if not video_url:
continue
bitrate = int_or_none(video.get('bitrate'))
if determine_ext(video_url) == 'm3u8':
if not video_url.startswith('http'):
continue
formats.append({
'url': video_url,
'ext': 'mp4',
'tbr': bitrate,
'format_note': label,
'protocol': 'm3u8_native',
})
else:
formats.append({
'url': video_url,
'tbr': bitrate,
'format_note': label,
})
self._sort_formats(formats)
metadata = self._extract_metadata(
'https://www.hitbox.tv/api/media/video',
video_id)
player_config = self._download_json(
'https://www.hitbox.tv/api/player/config/video/%s' % video_id,
video_id)
clip = player_config.get('clip')
video_url = clip.get('url')
res = clip.get('bitrates', [])[0].get('label')
metadata['resolution'] = res
metadata['url'] = video_url
metadata['protocol'] = 'm3u8'
metadata['formats'] = formats
return metadata
@@ -129,10 +151,6 @@ class HitboxLiveIE(HitboxIE):
def _real_extract(self, url):
video_id = self._match_id(url)
metadata = self._extract_metadata(
'https://www.hitbox.tv/api/media/live',
video_id)
player_config = self._download_json(
'https://www.hitbox.tv/api/player/config/live/%s' % video_id,
video_id)
@@ -147,20 +165,39 @@ class HitboxLiveIE(HitboxIE):
servers.append(base_url)
for stream in cdn.get('bitrates'):
label = stream.get('label')
if label != 'Auto':
if label == 'Auto':
continue
stream_url = stream.get('url')
if not stream_url:
continue
bitrate = int_or_none(stream.get('bitrate'))
if stream.get('provider') == 'hls' or determine_ext(stream_url) == 'm3u8':
if not stream_url.startswith('http'):
continue
formats.append({
'url': '%s/%s' % (base_url, stream.get('url')),
'url': stream_url,
'ext': 'mp4',
'vbr': stream.get('bitrate'),
'resolution': label,
'tbr': bitrate,
'format_note': label,
'rtmp_live': True,
})
else:
formats.append({
'url': '%s/%s' % (base_url, stream_url),
'ext': 'mp4',
'tbr': bitrate,
'rtmp_live': True,
'format_note': host,
'page_url': url,
'player_url': 'http://www.hitbox.tv/static/player/flowplayer/flowplayer.commercial-3.2.16.swf',
})
self._sort_formats(formats)
metadata = self._extract_metadata(
'https://www.hitbox.tv/api/media/live',
video_id)
metadata['formats'] = formats
metadata['is_live'] = True
metadata['title'] = self._live_title(metadata.get('title'))
return metadata

View File

@@ -1,36 +1,75 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
class IconosquareIE(InfoExtractor):
_VALID_URL = r'https?://(www\.)?(?:iconosquare\.com|statigr\.am)/p/(?P<id>[^/]+)'
_VALID_URL = r'https?://(?:www\.)?(?:iconosquare\.com|statigr\.am)/p/(?P<id>[^/]+)'
_TEST = {
'url': 'http://statigr.am/p/522207370455279102_24101272',
'md5': '6eb93b882a3ded7c378ee1d6884b1814',
'info_dict': {
'id': '522207370455279102_24101272',
'ext': 'mp4',
'uploader_id': 'aguynamedpatrick',
'title': 'Instagram photo by @aguynamedpatrick (Patrick Janelle)',
'title': 'Instagram media by @aguynamedpatrick (Patrick Janelle)',
'description': 'md5:644406a9ec27457ed7aa7a9ebcd4ce3d',
'timestamp': 1376471991,
'upload_date': '20130814',
'uploader': 'aguynamedpatrick',
'uploader_id': '24101272',
'comment_count': int,
'like_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
media = self._parse_json(
self._search_regex(
r'window\.media\s*=\s*({.+?});\n', webpage, 'media'),
video_id)
formats = [{
'url': f['url'],
'format_id': format_id,
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height'))
} for format_id, f in media['videos'].items()]
self._sort_formats(formats)
title = self._html_search_regex(
r'<title>(.+?)(?: *\(Videos?\))? \| (?:Iconosquare|Statigram)</title>',
webpage, 'title')
uploader_id = self._html_search_regex(
r'@([^ ]+)', title, 'uploader name', fatal=False)
timestamp = int_or_none(media.get('created_time') or media.get('caption', {}).get('created_time'))
description = media.get('caption', {}).get('text')
uploader = media.get('user', {}).get('username')
uploader_id = media.get('user', {}).get('id')
comment_count = int_or_none(media.get('comments', {}).get('count'))
like_count = int_or_none(media.get('likes', {}).get('count'))
thumbnails = [{
'url': t['url'],
'id': thumbnail_id,
'width': int_or_none(t.get('width')),
'height': int_or_none(t.get('height'))
} for thumbnail_id, t in media.get('images', {}).items()]
return {
'id': video_id,
'url': self._og_search_video_url(webpage),
'title': title,
'description': self._og_search_description(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'uploader_id': uploader_id
'description': description,
'thumbnails': thumbnails,
'timestamp': timestamp,
'uploader': uploader,
'uploader_id': uploader_id,
'comment_count': comment_count,
'like_count': like_count,
'formats': formats,
}

View File

@@ -61,7 +61,7 @@ class IGNIE(InfoExtractor):
},
{
'url': 'http://www.ign.com/articles/2014/08/15/rewind-theater-wild-trailer-gamescom-2014?watch',
'md5': '4e9a0bda1e5eebd31ddcf86ec0b9b3c7',
'md5': '618fedb9c901fd086f6f093564ef8558',
'info_dict': {
'id': '078fdd005f6d3c02f63d795faa1b984f',
'ext': 'mp4',
@@ -77,10 +77,10 @@ class IGNIE(InfoExtractor):
def _find_video_id(self, webpage):
res_id = [
r'"video_id"\s*:\s*"(.*?)"',
r'class="hero-poster[^"]*?"[^>]*id="(.+?)"',
r'data-video-id="(.+?)"',
r'<object id="vid_(.+?)"',
r'<meta name="og:image" content=".*/(.+?)-(.+?)/.+.jpg"',
r'class="hero-poster[^"]*?"[^>]*id="(.+?)"',
]
return self._search_regex(res_id, webpage, 'video id')

View File

@@ -3,6 +3,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
int_or_none,
js_to_json,
@@ -12,7 +13,7 @@ from ..utils import (
class ImgurIE(InfoExtractor):
_VALID_URL = r'https?://(?:i\.)?imgur\.com/(?P<id>[a-zA-Z0-9]+)(?:\.mp4|\.gifv)?'
_VALID_URL = r'https?://(?:i\.)?imgur\.com/(?P<id>[a-zA-Z0-9]+)'
_TESTS = [{
'url': 'https://i.imgur.com/A61SaA1.gifv',
@@ -34,7 +35,8 @@ class ImgurIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage = self._download_webpage(
compat_urlparse.urljoin(url, video_id), video_id)
width = int_or_none(self._search_regex(
r'<param name="width" value="([0-9]+)"',

View File

@@ -5,13 +5,14 @@ import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
limit_length,
)
class InstagramIE(InfoExtractor):
_VALID_URL = r'http://instagram\.com/p/(?P<id>.*?)/'
_VALID_URL = r'https://instagram\.com/p/(?P<id>[\da-zA-Z]+)'
_TEST = {
'url': 'http://instagram.com/p/aye83DjauH/?foo=bar#abc',
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
'md5': '0d2da106a9d2631273e192b372806516',
'info_dict': {
'id': 'aye83DjauH',
@@ -23,8 +24,8 @@ class InstagramIE(InfoExtractor):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
uploader_id = self._search_regex(r'"owner":{"username":"(.+?)"',
webpage, 'uploader id', fatal=False)
@@ -43,11 +44,11 @@ class InstagramIE(InfoExtractor):
class InstagramUserIE(InfoExtractor):
_VALID_URL = r'http://instagram\.com/(?P<username>[^/]{2,})/?(?:$|[?#])'
_VALID_URL = r'https://instagram\.com/(?P<username>[^/]{2,})/?(?:$|[?#])'
IE_DESC = 'Instagram user profile'
IE_NAME = 'instagram:user'
_TEST = {
'url': 'http://instagram.com/porsche',
'url': 'https://instagram.com/porsche',
'info_dict': {
'id': 'porsche',
'title': 'porsche',
@@ -102,11 +103,13 @@ class InstagramUserIE(InfoExtractor):
thumbnails_el = it.get('images', {})
thumbnail = thumbnails_el.get('thumbnail', {}).get('url')
title = it.get('caption', {}).get('text', it['id'])
# In some cases caption is null, which corresponds to None
# in python. As a result, it.get('caption', {}) gives None
title = (it.get('caption') or {}).get('text', it['id'])
entries.append({
'id': it['id'],
'title': title,
'title': limit_length(title, 80),
'formats': formats,
'thumbnail': thumbnail,
'webpage_url': it.get('link'),

View File

@@ -11,11 +11,12 @@ from ..compat import (
)
from ..utils import (
ExtractorError,
remove_end,
)
class IPrimaIE(InfoExtractor):
_VALID_URL = r'https?://play\.iprima\.cz/[^?#]+/(?P<id>[^?#]+)'
_VALID_URL = r'https?://play\.iprima\.cz/(?:[^/]+/)*(?P<id>[^?#]+)'
_TESTS = [{
'url': 'http://play.iprima.cz/particka/particka-92',
@@ -23,7 +24,7 @@ class IPrimaIE(InfoExtractor):
'id': '39152',
'ext': 'flv',
'title': 'Partička (92)',
'description': 'md5:3740fda51464da35a2d4d0670b8e4fd6',
'description': 'md5:74e9617e51bca67c3ecfb2c6f9766f45',
'thumbnail': 'http://play.iprima.cz/sites/default/files/image_crops/image_620x349/3/491483_particka-92_image_620x349.jpg',
},
'params': {
@@ -35,13 +36,14 @@ class IPrimaIE(InfoExtractor):
'id': '9718337',
'ext': 'flv',
'title': 'Tchibo Partička - Jarní móda',
'description': 'md5:589f8f59f414220621ff8882eb3ce7be',
'thumbnail': 're:^http:.*\.jpg$',
},
'params': {
'skip_download': True, # requires rtmpdump
},
'skip': 'Do not have permission to access this page',
}, {
'url': 'http://play.iprima.cz/zpravy-ftv-prima-2752015',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -102,8 +104,10 @@ class IPrimaIE(InfoExtractor):
return {
'id': real_id,
'title': self._og_search_title(webpage),
'title': remove_end(self._og_search_title(webpage), ' | Prima PLAY'),
'thumbnail': self._og_search_thumbnail(webpage),
'formats': formats,
'description': self._og_search_description(webpage),
'description': self._search_regex(
r'<p[^>]+itemprop="description"[^>]*>([^<]+)',
webpage, 'description', default=None),
}

View File

@@ -0,0 +1,296 @@
# coding: utf-8
from __future__ import unicode_literals
import hashlib
import math
import os.path
import random
import re
import time
import uuid
import zlib
from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import (
ExtractorError,
url_basename,
)
class IqiyiIE(InfoExtractor):
IE_NAME = 'iqiyi'
_VALID_URL = r'http://(?:www\.)iqiyi.com/v_.+?\.html'
_TESTS = [{
'url': 'http://www.iqiyi.com/v_19rrojlavg.html',
'md5': '2cb594dc2781e6c941a110d8f358118b',
'info_dict': {
'id': '9c1fb1b99d192b21c559e5a1a2cb3c73',
'title': '美国德州空中惊现奇异云团 酷似UFO',
'ext': 'f4v',
}
}, {
'url': 'http://www.iqiyi.com/v_19rrhnnclk.html',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb',
'title': '名侦探柯南第752集',
},
'playlist': [{
'md5': '7e49376fecaffa115d951634917fe105',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part1',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': '41b75ba13bb7ac0e411131f92bc4f6ca',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part2',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': '0cee1dd0a3d46a83e71e2badeae2aab0',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part3',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': '4f8ad72373b0c491b582e7c196b0b1f9',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part4',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': 'd89ad028bcfad282918e8098e811711d',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part5',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': '9cb1e5c95da25dff0660c32ae50903b7',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part6',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': '155116e0ff1867bbc9b98df294faabc9',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part7',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}, {
'md5': '53f5db77622ae14fa493ed2a278a082b',
'info_dict': {
'id': 'e3f585b550a280af23c98b6cb2be19fb_part8',
'ext': 'f4v',
'title': '名侦探柯南第752集',
},
}],
}]
_FORMATS_MAP = [
('1', 'h6'),
('2', 'h5'),
('3', 'h4'),
('4', 'h3'),
('5', 'h2'),
('10', 'h1'),
]
def construct_video_urls(self, data, video_id, _uuid):
def do_xor(x, y):
a = y % 3
if a == 1:
return x ^ 121
if a == 2:
return x ^ 72
return x ^ 103
def get_encode_code(l):
a = 0
b = l.split('-')
c = len(b)
s = ''
for i in range(c - 1, -1, -1):
a = do_xor(int(b[c - i - 1], 16), i)
s += chr(a)
return s[::-1]
def get_path_key(x, format_id, segment_index):
mg = ')(*&^flash@#$%a'
tm = self._download_json(
'http://data.video.qiyi.com/t?tn=' + str(random.random()), video_id,
note='Download path key of segment %d for format %s' % (segment_index + 1, format_id)
)['t']
t = str(int(math.floor(int(tm) / (600.0))))
return hashlib.md5((t + mg + x).encode('utf8')).hexdigest()
video_urls_dict = {}
for format_item in data['vp']['tkl'][0]['vs']:
if 0 < int(format_item['bid']) <= 10:
format_id = self.get_format(format_item['bid'])
else:
continue
video_urls = []
video_urls_info = format_item['fs']
if not format_item['fs'][0]['l'].startswith('/'):
t = get_encode_code(format_item['fs'][0]['l'])
if t.endswith('mp4'):
video_urls_info = format_item['flvs']
for segment_index, segment in enumerate(video_urls_info):
vl = segment['l']
if not vl.startswith('/'):
vl = get_encode_code(vl)
key = get_path_key(
vl.split('/')[-1].split('.')[0], format_id, segment_index)
filesize = segment['b']
base_url = data['vp']['du'].split('/')
base_url.insert(-1, key)
base_url = '/'.join(base_url)
param = {
'su': _uuid,
'qyid': uuid.uuid4().hex,
'client': '',
'z': '',
'bt': '',
'ct': '',
'tn': str(int(time.time()))
}
api_video_url = base_url + vl + '?' + \
compat_urllib_parse.urlencode(param)
js = self._download_json(
api_video_url, video_id,
note='Download video info of segment %d for format %s' % (segment_index + 1, format_id))
video_url = js['l']
video_urls.append(
(video_url, filesize))
video_urls_dict[format_id] = video_urls
return video_urls_dict
def get_format(self, bid):
matched_format_ids = [_format_id for _bid, _format_id in self._FORMATS_MAP if _bid == str(bid)]
return matched_format_ids[0] if len(matched_format_ids) else None
def get_bid(self, format_id):
matched_bids = [_bid for _bid, _format_id in self._FORMATS_MAP if _format_id == format_id]
return matched_bids[0] if len(matched_bids) else None
def get_raw_data(self, tvid, video_id, enc_key, _uuid):
tm = str(int(time.time()))
param = {
'key': 'fvip',
'src': hashlib.md5(b'youtube-dl').hexdigest(),
'tvId': tvid,
'vid': video_id,
'vinfo': 1,
'tm': tm,
'enc': hashlib.md5(
(enc_key + tm + tvid).encode('utf8')).hexdigest(),
'qyid': _uuid,
'tn': random.random(),
'um': 0,
'authkey': hashlib.md5(
(tm + tvid).encode('utf8')).hexdigest()
}
api_url = 'http://cache.video.qiyi.com/vms' + '?' + \
compat_urllib_parse.urlencode(param)
raw_data = self._download_json(api_url, video_id)
return raw_data
def get_enc_key(self, swf_url, video_id):
filename, _ = os.path.splitext(url_basename(swf_url))
enc_key_json = self._downloader.cache.load('iqiyi-enc-key', filename)
if enc_key_json is not None:
return enc_key_json[0]
req = self._request_webpage(
swf_url, video_id, note='download swf content')
cn = req.read()
cn = zlib.decompress(cn[8:])
pt = re.compile(b'MixerRemote\x08(?P<enc_key>.+?)\$&vv')
enc_key = self._search_regex(pt, cn, 'enc_key').decode('utf8')
self._downloader.cache.store('iqiyi-enc-key', filename, [enc_key])
return enc_key
def _real_extract(self, url):
webpage = self._download_webpage(
url, 'temp_id', note='download video page')
tvid = self._search_regex(
r'data-player-tvid\s*=\s*[\'"](\d+)', webpage, 'tvid')
video_id = self._search_regex(
r'data-player-videoid\s*=\s*[\'"]([a-f\d]+)', webpage, 'video_id')
swf_url = self._search_regex(
r'(http://[^\'"]+MainPlayer[^.]+\.swf)', webpage, 'swf player URL')
_uuid = uuid.uuid4().hex
enc_key = self.get_enc_key(swf_url, video_id)
raw_data = self.get_raw_data(tvid, video_id, enc_key, _uuid)
if raw_data['code'] != 'A000000':
raise ExtractorError('Unable to load data. Error code: ' + raw_data['code'])
if not raw_data['data']['vp']['tkl']:
raise ExtractorError('No support iQiqy VIP video')
data = raw_data['data']
title = data['vi']['vn']
# generate video_urls_dict
video_urls_dict = self.construct_video_urls(
data, video_id, _uuid)
# construct info
entries = []
for format_id in video_urls_dict:
video_urls = video_urls_dict[format_id]
for i, video_url_info in enumerate(video_urls):
if len(entries) < i + 1:
entries.append({'formats': []})
entries[i]['formats'].append(
{
'url': video_url_info[0],
'filesize': video_url_info[-1],
'format_id': format_id,
'preference': int(self.get_bid(format_id))
}
)
for i in range(len(entries)):
self._sort_formats(entries[i]['formats'])
entries[i].update(
{
'id': '%s_part%d' % (video_id, i + 1),
'title': title,
}
)
if len(entries) > 1:
info = {
'_type': 'multi_video',
'id': video_id,
'title': title,
'entries': entries,
}
else:
info = entries[0]
info['id'] = video_id
info['title'] = title
return info

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
from ..utils import (
determine_ext,
float_or_none,
@@ -30,7 +31,7 @@ class IzleseneIE(InfoExtractor):
'description': 'md5:253753e2655dde93f59f74b572454f6d',
'thumbnail': 're:^http://.*\.jpg',
'uploader_id': 'pelikzzle',
'timestamp': 1404302298,
'timestamp': int,
'upload_date': '20140702',
'duration': 95.395,
'age_limit': 0,
@@ -46,7 +47,7 @@ class IzleseneIE(InfoExtractor):
'description': 'Tarkan Dortmund 2006 Konseri',
'thumbnail': 're:^http://.*\.jpg',
'uploader_id': 'parlayankiz',
'timestamp': 1163322193,
'timestamp': int,
'upload_date': '20061112',
'duration': 253.666,
'age_limit': 0,
@@ -67,9 +68,9 @@ class IzleseneIE(InfoExtractor):
uploader = self._html_search_regex(
r"adduserUsername\s*=\s*'([^']+)';",
webpage, 'uploader', fatal=False, default='')
webpage, 'uploader', fatal=False)
timestamp = parse_iso8601(self._html_search_meta(
'uploadDate', webpage, 'upload date', fatal=False))
'uploadDate', webpage, 'upload date'))
duration = float_or_none(self._html_search_regex(
r'"videoduration"\s*:\s*"([^"]+)"',
@@ -86,8 +87,7 @@ class IzleseneIE(InfoExtractor):
# Might be empty for some videos.
streams = self._html_search_regex(
r'"qualitylevel"\s*:\s*"([^"]+)"',
webpage, 'streams', fatal=False, default='')
r'"qualitylevel"\s*:\s*"([^"]+)"', webpage, 'streams', default='')
formats = []
if streams:
@@ -95,15 +95,15 @@ class IzleseneIE(InfoExtractor):
quality, url = re.search(r'\[(\w+)\](.+)', stream).groups()
formats.append({
'format_id': '%sp' % quality if quality else 'sd',
'url': url,
'url': compat_urllib_parse_unquote(url),
'ext': ext,
})
else:
stream_url = self._search_regex(
r'"streamurl"\s?:\s?"([^"]+)"', webpage, 'stream URL')
r'"streamurl"\s*:\s*"([^"]+)"', webpage, 'stream URL')
formats.append({
'format_id': 'sd',
'url': stream_url,
'url': compat_urllib_parse_unquote(stream_url),
'ext': ext,
})

View File

@@ -7,6 +7,7 @@ from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
srt_subtitles_timecode,
)
@@ -39,8 +40,8 @@ class KanalPlayIE(InfoExtractor):
'%s\r\n%s --> %s\r\n%s'
% (
num,
self._subtitles_timecode(item['startMillis'] / 1000.0),
self._subtitles_timecode(item['endMillis'] / 1000.0),
srt_subtitles_timecode(item['startMillis'] / 1000.0),
srt_subtitles_timecode(item['endMillis'] / 1000.0),
item['text'],
) for num, item in enumerate(subs, 1))

View File

@@ -0,0 +1,96 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
fix_xml_ampersands,
float_or_none,
xpath_with_ns,
xpath_text,
)
class KarriereVideosIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?karrierevideos\.at(?:/[^/]+)+/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.karrierevideos.at/berufsvideos/mittlere-hoehere-schulen/altenpflegerin',
'info_dict': {
'id': '32c91',
'ext': 'flv',
'title': 'AltenpflegerIn',
'description': 'md5:dbadd1259fde2159a9b28667cb664ae2',
'thumbnail': 're:^http://.*\.png',
},
'params': {
# rtmp download
'skip_download': True,
}
}, {
# broken ampersands
'url': 'http://www.karrierevideos.at/orientierung/vaeterkarenz-und-neue-chancen-fuer-muetter-baby-was-nun',
'info_dict': {
'id': '5sniu',
'ext': 'flv',
'title': 'Väterkarenz und neue Chancen für Mütter - "Baby - was nun?"',
'description': 'md5:97092c6ad1fd7d38e9d6a5fdeb2bcc33',
'thumbnail': 're:^http://.*\.png',
},
'params': {
# rtmp download
'skip_download': True,
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = (self._html_search_meta('title', webpage, default=None) or
self._search_regex(r'<h1 class="title">([^<]+)</h1>'))
video_id = self._search_regex(
r'/config/video/(.+?)\.xml', webpage, 'video id')
playlist = self._download_xml(
'http://www.karrierevideos.at/player-playlist.xml.php?p=%s' % video_id,
video_id, transform_source=fix_xml_ampersands)
NS_MAP = {
'jwplayer': 'http://developer.longtailvideo.com/trac/wiki/FlashFormats'
}
def ns(path):
return xpath_with_ns(path, NS_MAP)
item = playlist.find('./tracklist/item')
video_file = xpath_text(
item, ns('./jwplayer:file'), 'video url', fatal=True)
streamer = xpath_text(
item, ns('./jwplayer:streamer'), 'streamer', fatal=True)
uploader = xpath_text(
item, ns('./jwplayer:author'), 'uploader')
duration = float_or_none(
xpath_text(item, ns('./jwplayer:duration'), 'duration'))
description = self._html_search_regex(
r'(?s)<div class="leadtext">(.+?)</div>',
webpage, 'description')
thumbnail = self._html_search_meta(
'thumbnail', webpage, 'thumbnail')
if thumbnail:
thumbnail = compat_urlparse.urljoin(url, thumbnail)
return {
'id': video_id,
'url': streamer.replace('rtmpt', 'rtmp'),
'play_path': 'mp4:%s' % video_file,
'ext': 'flv',
'title': title,
'description': description,
'thumbnail': thumbnail,
'uploader': uploader,
'duration': duration,
}

Some files were not shown because too many files have changed in this diff Show More