Compare commits

138 Commits

Author SHA1 Message Date
Philipp Hagemeister
86be82610c release 2015.10.06.1 2015-10-06 17:43:50 +02:00
Philipp Hagemeister
4810c48d6d [compat] Do not compare None <= 0
The result is meaningless (and it emits a warning in cpython2 when called with -3), so handle None before making integer comparisons.
2015-10-06 14:30:43 +02:00
Philipp Hagemeister
c4af7684d8 release 2015.10.06 2015-10-06 09:08:10 +02:00
Sergey M
fcc2546269 [README.md] Markdown improvements 2015-10-06 02:31:49 +06:00
Sergey M․
40fbb05e1c [ustream] Fix tests 2015-10-05 22:52:51 +06:00
Sergey M․
dc5756fd77 [ustream] Fix typo 2015-10-05 22:51:04 +06:00
Sergey M․
41db733308 [ustream] Move filesize 2015-10-05 22:48:47 +06:00
Sergey M․
0bf219889e [ustream] Remove unused import 2015-10-05 22:44:59 +06:00
Sergey M․
f2a7ed77ef [tumblr] Remove redundant field 2015-10-05 22:44:36 +06:00
Sergey M․
4853eb63fe [ustream] Modernize 2015-10-05 22:40:20 +06:00
Sergey M․
5820c4a29e [ustream] Switch extraction to api 2015-10-05 22:30:38 +06:00
David Rabinowitz
7fd4ed9939 Fixed the ustream extractor to use the current ustream API 2015-10-05 22:30:14 +06:00
Sergey M․
88c86d211b [tumblr] Add missing fields for vidme test 2015-10-05 21:54:54 +06:00
Sergey M․
5d84b79a30 [tumblr] Remove redundant test 2015-10-05 21:53:59 +06:00
Sergey M․
140ac73965 [tumblr] Simplify and extract duration 2015-10-05 21:53:01 +06:00
Oli Allen
2a27e66234 [tumblr] Added support for HD video where available (#7036)
[tumblr] Replaced test URL for HD video as old one lead to 404

[tumblr] Don't make assumptions about video resolution, cleaner handling of no HD version available

[tumblr] Removed extraneous resolution key in HD video tests
2015-10-05 21:51:03 +06:00
Sergey M․
e759a00119 [appletrailers] Quotes consistency 2015-10-05 20:21:53 +06:00
Sergey M.
9d5332518c Merge pull request #6963 from remitamine/appledaily
[nextmedia] update AppleDailyIE tests
2015-10-05 20:12:24 +06:00
Sergey M․
90ab741e90 [pbs] Add test for #7059 2015-10-04 21:37:49 +06:00
Sergey M․
96229998c2 [pbs] Allow empty attribute in player regex 2015-10-04 21:19:47 +06:00
Sergey M․
0659dfccfe [pbs] Improve player regex (Closes #7059) 2015-10-04 21:13:13 +06:00
Sergey M․
9c544e2537 [limelight] Add test video with subtitles 2015-10-04 20:48:44 +06:00
Sergey M․
d7fc56318b [limelight] Fix python 2.6, simplify, make more robust (Closes #6734) 2015-10-04 20:42:35 +06:00
Sergey M․
4bba371644 [YoutubeDL] Autocalculate ext for subtitles when missing 2015-10-04 20:42:26 +06:00
remitamine
ef5acfe32d [limelight] Add new extractor 2015-10-04 20:42:18 +06:00
Sergey M.
85557f635a Merge pull request #7052 from remitamine/engadget
[engadget] accept short video urls
2015-10-03 22:36:49 +06:00
Naglis Jonaitis
60d23e5e59 [tapely] Improve _VALID_URL 2015-10-03 16:25:33 +03:00
remitamine
97d5bfcba6 [engadget] accept short video urls 2015-10-03 14:17:17 +01:00
Yen Chi Hsuan
bad84757eb [doc] Better formatting of youtube-dl.1 (closes #6510) 2015-10-03 00:01:10 +02:00
Yen Chi Hsuan
13118a50b8 [compat] Allow overriding by only COLUMNS or LINES in compat_get_terminal_size
Now the semantic of this function is identical to
shutil.get_terminal_size() in Python 3.3+. The new behavior also
corresponds to the old get_term_width(), which is removed in
003c69a84b
2015-10-03 00:00:33 +02:00
Yen Chi Hsuan
5495937f46 [options] Cleanup double spaces in help texts 2015-10-02 23:59:47 +02:00
Jaime Marquínez Ferrándiz
b203095d4c [europa] Style fix: add whitespace after comma 2015-10-02 22:40:35 +02:00
Sergey M․
f3b098fb90 [europa] Add support for audio URLs 2015-10-02 23:22:53 +06:00
Sergey M․
af17794c65 [europa] Improve extraction 2015-10-02 22:29:15 +06:00
ngld
3bb3f04108 [europa] Add new extractor 2015-10-02 21:30:07 +06:00
Sergey M․
59a9efe85b [ruutu] Limit resolution split to 2 pieces (Closes #7037, closes #7042) 2015-10-02 20:48:39 +06:00
fluks
0facd2af3e Fix ruutu extractor bug
If there's no resolution attribute in xml, only width gets a
value, height doesn't and ValueError is raised.
2015-10-02 20:48:14 +06:00
Jaime Marquínez Ferrándiz
7d0ada5ff9 [test/helper] Fix style
Use the correct indentation to please flake8
2015-10-02 13:42:11 +02:00
Jaime Marquínez Ferrándiz
44451f22d5 [naver] Remove unused import 2015-10-02 13:41:52 +02:00
Sergey M․
06c6efa970 [videolecturesnet] Add test video with broken direct format links 2015-10-01 23:10:36 +06:00
Sergey M․
e5851b963a [extractor/common] Make f4m extraction for SMIL non fatal 2015-10-01 23:04:56 +06:00
Sergey M․
4de6131090 [extractor/common] Add fatal to _extract_f4m_formats 2015-10-01 23:03:31 +06:00
Sergey M․
3a1341a7bc [extractor/common] Make m3u8 extraction for SMIL non fatal 2015-10-01 22:59:20 +06:00
Sergey M․
c78e48177c [extractor/common] Check validity of direct URLs 2015-10-01 22:54:54 +06:00
Sergey M․
6edaa0e25b [videolecturesnet] Add playlist test 2015-10-01 22:45:10 +06:00
Sergey M․
fb97809e64 [videolecturesnet] Improve playlist extraction 2015-10-01 22:44:51 +06:00
Sergey M․
0c996b9f48 [videolecturesnet] Add support for playlists (Closes #7031) 2015-10-01 22:39:38 +06:00
Sergey M․
acfb717a18 [videolecturesnet] Use generic SMIL extraction 2015-10-01 22:20:35 +06:00
Sergey M․
647eab4541 [extractor/common] Extract upload date from SMIL 2015-10-01 22:20:28 +06:00
Sergey M․
1e5bcdec02 [extractor/common] Extract images from SMIL 2015-10-01 22:20:21 +06:00
Sergey M․
e7d8e98a9f [extractor/common] Allow float bitrates 2015-10-01 22:20:15 +06:00
Sergey M․
2b3f951a2e [nrktv] Rework subtitles and eliminate downloading twice 2015-10-01 20:33:17 +06:00
Sergey M.
6751a1284d Merge pull request #7035 from jfremstad/nrk-fix-spelling
[nrk] Spelling
2015-10-01 19:56:21 +06:00
Joakim Fremstad
b83831df1f [nrk] Spelling 2015-10-01 14:58:49 +02:00
Sergey M․
f540b93706 [naver] Improve error regex 2015-10-01 02:33:48 +06:00
Sergey M․
8466336104 [vk] Detect vimeo embeds (Closes #7021) 2015-09-30 22:12:52 +06:00
Sergey M․
f88f1b40ce [test/helper] Clarify field for list length mismatch 2015-09-30 20:33:59 +06:00
Sergey M․
386a7b52d5 [test/helper] Spelling 2015-09-30 20:33:51 +06:00
Sergey M․
2e885de796 [test/helper] Formatting 2015-09-30 20:33:45 +06:00
Qijiang Fan
687c04cbb8 [test] use descriptive variable name 2015-09-30 20:33:35 +06:00
Qijiang Fan
40c931de4b [test] split expect_dict to two functions 2015-09-30 20:33:30 +06:00
Qijiang Fan
93bc7ef165 [test] recursively check dict and list in expect_info_dict
This allows to use md5:, re:, etc within the str inside a list
or dict.
2015-09-30 20:33:20 +06:00
Sergey M․
ee2d190253 [nfl] Add test for #7012 2015-09-30 20:06:21 +06:00
remitamine
aedb930cfc [nfl] fix content id regex(fixes #7012) 2015-09-30 20:03:10 +06:00
Philipp Hagemeister
c596ce91cd [comedycentral] Fix youtube-dl :thedailyshow
We'll let the generic IE follow the redirect and call back to us with the episode URL
2015-09-30 15:39:52 +02:00
Sergey M․
8a64969404 [adultswim] Prefer stream (Closes #7015) 2015-09-29 21:33:21 +06:00
Philipp Hagemeister
c254f75bbb release 2015.09.28 2015-09-28 04:42:11 +02:00
Sergey M․
86692c019c [keek] Strip title 2015-09-28 01:17:28 +06:00
Sergey M․
1ab1c4ef57 [keek] Improve uploader fields regexes 2015-09-28 01:15:13 +06:00
Sergey M․
926fb62eec [keek] Remove description
Since it equals title plus static suffix
2015-09-28 01:14:14 +06:00
Sergey M․
817690ff73 [keek] Make uploader fields non fatal 2015-09-28 01:05:24 +06:00
remitamine
98e1c935a1 [keek] extract uploader and uploader id with _search_regex 2015-09-28 01:03:22 +06:00
remitamine
f30e9976d6 [keek] add utf-8 coding cookie 2015-09-28 01:03:16 +06:00
remitamine
80e98aed69 [keek] fix test title 2015-09-28 01:03:12 +06:00
remitamine
6a24cb3d22 [keek] extract more info 2015-09-28 01:03:08 +06:00
remitamine
e13b9e7885 [keek] fix info extraction 2015-09-28 01:02:59 +06:00
Sergey M․
dd467d33d0 [extractor/generic] Add support for condenast script embeds (Closes #6885, closes #6991) 2015-09-27 05:55:48 +06:00
Sergey M․
c6b8f4d0c9 [condenast] Add support for JS embeds 2015-09-27 05:53:21 +06:00
Yen Chi Hsuan
95240b8093 Use insert for all sys.path manipulations
Closes #6867.
2015-09-26 22:04:41 +02:00
Sergey M․
2f962d0a91 [eagleplatform] Use http scheme for thumbnail 2015-09-27 01:17:44 +06:00
Sergey M․
3c63e1bb57 [eagleplatform] Make _handle_error staticmethod 2015-09-27 01:12:46 +06:00
Sergey M․
c471b34575 [eagleplatform] Simplify secure mp4 construction and clarify rationale 2015-09-27 01:10:39 +06:00
remitamine
d045f0bdb7 [eagleplatform] use http urls explicitly 2015-09-27 01:08:31 +06:00
remitamine
22becac4bd [eagleplatform] return the code to handle errors in all _download_json requests 2015-09-27 01:08:26 +06:00
remitamine
9d632b1b27 [eagleplatform] extract mp4 url and fix thumbnail url 2015-09-27 01:08:22 +06:00
Sergey M․
95c5e10103 [qqmusic] Allow [mm:ss] timestamps 2015-09-26 21:15:34 +06:00
Sergey M․
a5d09d684e [qqmusic] Use release_date 2015-09-26 21:08:23 +06:00
Sergey M․
8aab976bbd [extractor/common] Document release_date field 2015-09-26 21:07:54 +06:00
Sergey M․
26c6d1922e Credit @fqj1994 for qqmusic .lrc support 2015-09-26 21:01:28 +06:00
Sergey M․
cd1bb54990 [qqmusic] Add test for a song with non .lrc lyrics 2015-09-26 21:00:59 +06:00
Sergey M․
d4cd06138c [qqmusic] Do not capture braced text from the middle of the string 2015-09-26 20:54:41 +06:00
Sergey M․
961c5cbf17 [qqmusic] Eliminate _filter_lrc and use single quotes 2015-09-26 20:38:11 +06:00
Qijiang Fan
b65e5bb72f [qqmusic] Add subtitles for QQMusic
Use .lrc lyrics as subtitles if lyrics in lrc format exist.
2015-09-26 20:21:27 +06:00
Sergey M․
54914380c0 [bbc] Add test for programme that fails with iptv-all mediaset 2015-09-26 20:07:12 +06:00
Sergey M․
26ccc68bed [bbc] Clarify iptv-all mediaset rationale 2015-09-26 20:06:21 +06:00
Sergey M․
ee3d5a6d47 [bbc] Skip mediaselection on gelocation error (Closes #6983) 2015-09-26 19:57:17 +06:00
Sergey M․
46fde8a1a2 [extractor/generic] Use _extract_url for mtvservices 2015-09-26 19:47:20 +06:00
Sergey M․
fe1d858e35 [mtvservices:embedded] Add _extract_url 2015-09-26 19:46:42 +06:00
Sergey M․
fc42bc6ec9 [mtv] Look for sm4:video:embed (Closes #6936, closes #6970) 2015-09-26 19:45:43 +06:00
Yen Chi Hsuan
fe6ad195ae Merge pull request #6966 from remitamine/kuwo
[kuwo] fix title extraction and update test
2015-09-26 19:28:16 +08:00
remitamine
7193650641 [kuwo] treat the offline error as an expected ExtractorError 2015-09-26 11:44:35 +01:00
remitamine
5db34f680f [kuwo] check for the offline error page 2015-09-26 10:31:32 +01:00
Sergey M.
a82ba8d0ce Merge pull request #6978 from remitamine/fktv
[fktv] get format_id from video file ext
2015-09-26 13:01:21 +06:00
remitamine
3706fb5dc8 [fktv] get format_id from video file ext 2015-09-26 07:51:11 +01:00
Jaime Marquínez Ferrándiz
08bea4adde Also run tests with python 3.5 2015-09-25 22:34:02 +02:00
Jaime Marquínez Ferrándiz
4c917d0314 [README.md] Document the 'duration' field in the output template (#6929) 2015-09-25 22:02:48 +02:00
Jaime Marquínez Ferrándiz
4866b72eb2 [fktv] Don't redefine 'url' in list comprehension
Detected with flake8.
2015-09-25 21:58:45 +02:00
Yen Chi Hsuan
2d00be0477 Merge branch 'remitamine-fktv' 2015-09-25 19:28:26 +08:00
remitamine
3d09aa4c82 [kuwo] extract title inside element with class title exactly 2015-09-25 11:40:32 +01:00
remitamine
c44c7895b8 [kuwo] fix title extraction and update test 2015-09-25 11:28:26 +01:00
Yen Chi Hsuan
8de28761c4 [fktv] Fix a regex 2015-09-25 18:17:48 +08:00
Yen Chi Hsuan
711762f0b7 [fktv] Coding style 2015-09-25 18:01:08 +08:00
Yen Chi Hsuan
5773803961 [fktv] Correct thumbnail extraction and add the test 2015-09-25 17:58:44 +08:00
Yen Chi Hsuan
140359fc2c [fktv] Correct and improve some regexs 2015-09-25 17:51:48 +08:00
Yen Chi Hsuan
8ddf48d59f [fktv] Raise an error is no videos found 2015-09-25 17:48:51 +08:00
Yen Chi Hsuan
2e40a12225 [fktv] Correct spellings 2015-09-25 17:24:35 +08:00
Yen Chi Hsuan
dade7245af Merge branch 'fktv' of https://github.com/remitamine/youtube-dl into remitamine-fktv 2015-09-25 17:02:10 +08:00
remitamine
1f9fb20fcd [nextmedia] update AppleDailyIE tests 2015-09-25 07:39:22 +01:00
Sergey M․
0940c5b4c6 [condenast] Do not capture unused group in _VALID_URL 2015-09-25 05:18:45 +06:00
Sergey M․
42ca72dff3 [condenast] Keep acute accent 2015-09-25 05:15:21 +06:00
remitamine
2949a6cda9 [condenast] fix video info regex 2015-09-25 05:11:48 +06:00
remitamine
882fc9052e [condenast] fix extraction and add support for other sites 2015-09-25 05:11:38 +06:00
Sergey M․
9b166fc1f8 [iconosquare] Extract comments 2015-09-25 04:45:31 +06:00
Sergey M․
d4364f30bd [iconosquare] Revert title (Closes #6954) 2015-09-25 04:44:52 +06:00
remitamine
857421024d [iconosquare] fix info extraction 2015-09-25 04:36:15 +06:00
Sergey M.
80faa7a152 Merge pull request #6955 from atomic83/patch-1
More title extraction fixing.
2015-09-25 04:26:07 +06:00
atomic83
545a23f11b More title extraction fixing. 2015-09-24 23:05:32 +02:00
Sergey M.
caedb0721e Merge pull request #6952 from remitamine/hostingbulk
[hostingbulk] remove extractor
2015-09-25 02:08:26 +06:00
remitamine
47024eb564 [hostingbulk] remove extractor 2015-09-24 19:49:10 +01:00
Sergey M․
9c58885c70 [nhl:news] Add support for iframe embeds (Closes #6941) 2015-09-24 23:54:16 +06:00
Sergey M․
9fbd4b35a2 [nhl] Add support for embedded URLs 2015-09-24 23:48:23 +06:00
Sergey M․
05b476a270 [vidme] Prefer non clip (Closes #6924) 2015-09-24 23:38:53 +06:00
Sergey M․
4395ca2e04 [xhamster] Fix title extraction (Closes #6944) 2015-09-24 19:56:54 +06:00
Yen Chi Hsuan
19f93d906e [iqiyi] Use md5_text for all MD5 calls 2015-09-23 22:25:16 +08:00
Yen Chi Hsuan
57565375c8 [iqiyi] Fix extraction (fixes #6878) 2015-09-23 22:22:04 +08:00
Sergey M․
eb11cbe867 [soundcloud] Update client id (Closes #6930) 2015-09-23 19:54:40 +06:00
Sergey M․
f102819463 [downloader/hls] Pass http headers to downloader 2015-09-23 02:46:24 +06:00
remitamine
7b4137c351 [fktv] fix info extraction 2015-09-09 10:42:47 +01:00
54 changed files with 1331 additions and 626 deletions
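
Commit 4810c48d6d above handles `None` before making integer comparisons. A minimal sketch of that pattern follows; it is not the code from the commit itself, and the helper name `is_missing_or_nonpositive` is hypothetical:

```python
def is_missing_or_nonpositive(value):
    """Return True if value is None or an integer <= 0."""
    # Test for None first. In Python 3, `None <= 0` raises TypeError;
    # in Python 2 it evaluates to True, which is meaningless.
    return value is None or value <= 0
```

Because `or` short-circuits, the `<=` comparison is only evaluated when `value` is not `None`.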

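Commit 13118a50b8 above lets `COLUMNS` or `LINES` be overridden on its own, matching `shutil.get_terminal_size()` in Python 3.3+. A rough sketch of that rule, assuming the detected size is passed in as a plain `(columns, lines)` tuple; `resolve_terminal_size` and its parameters are hypothetical names, not youtube-dl's API:

```python
def resolve_terminal_size(environ, detected):
    """Combine COLUMNS/LINES overrides with a detected (columns, lines) pair."""

    def positive_int(name):
        # Treat missing, non-numeric, or non-positive values as "not set".
        try:
            value = int(environ.get(name, ''))
        except ValueError:
            return None
        return value if value > 0 else None

    columns = positive_int('COLUMNS')
    lines = positive_int('LINES')
    # Each dimension falls back on its own, so setting only one
    # variable overrides only that dimension.
    if columns is None:
        columns = detected[0]
    if lines is None:
        lines = detected[1]
    return columns, lines
```

For example, `resolve_terminal_size({'COLUMNS': '120'}, (80, 24))` returns `(120, 24)`: the width comes from the environment and the height from the detected size.
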

@@ -5,6 +5,7 @@ python:
- "3.2"
- "3.3"
- "3.4"
- "3.5"
sudo: false
script: nosetests test --verbose
notifications:


@@ -143,3 +143,4 @@ Shaun Walbridge
Lee Jenkins
Anssi Hannula
Lukáš Lalinský
Qijiang Fan


@@ -16,15 +16,15 @@ So please elaborate on what feature you are requesting, or what bug you want to
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a commiter myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the `-v` flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
If your server has multiple IPs or you suspect censorship, adding --call-home may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
### Are you using the latest version?
Before reporting any issue, type youtube-dl -U. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
### Is the issue already documented?

408
README.md

@@ -49,110 +49,220 @@ which means you can modify it, redistribute it or use it however you like.
# OPTIONS
-h, --help Print this help text and exit
--version Print program version and exit
-U, --update Update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)
-i, --ignore-errors Continue on download errors, for example to skip unavailable videos in a playlist
--abort-on-error Abort downloading of further videos (in the playlist or the command line) if an error occurs
-U, --update Update this program to latest version. Make
sure that you have sufficient permissions
(run with sudo if needed)
-i, --ignore-errors Continue on download errors, for example to
skip unavailable videos in a playlist
--abort-on-error Abort downloading of further videos (in the
playlist or the command line) if an error
occurs
--dump-user-agent Display the current browser identification
--list-extractors List all supported extractors
--extractor-descriptions Output descriptions of all supported extractors
--force-generic-extractor Force extraction to use the generic extractor
--default-search PREFIX Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple".
Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The
default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.
--ignore-config Do not read configuration files. When given in the global configuration file /etc/youtube-dl.conf: Do not read the user configuration
in ~/.config/youtube-dl/config (%APPDATA%/youtube-dl/config.txt on Windows)
--flat-playlist Do not extract the videos of a playlist, only list them.
--extractor-descriptions Output descriptions of all supported
extractors
--force-generic-extractor Force extraction to use the generic
extractor
--default-search PREFIX Use this prefix for unqualified URLs. For
example "gvsearch2:" downloads two videos
from google videos for youtube-dl "large
apple". Use the value "auto" to let
youtube-dl guess ("auto_warning" to emit a
warning when guessing). "error" just throws
an error. The default value "fixup_error"
repairs broken URLs, but emits an error if
this is not possible instead of searching.
--ignore-config Do not read configuration files. When given
in the global configuration file /etc
/youtube-dl.conf: Do not read the user
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
--flat-playlist Do not extract the videos of a playlist,
only list them.
--no-color Do not emit color codes in output
## Network Options:
--proxy URL Use the specified HTTP/HTTPS proxy. Pass in an empty string (--proxy "") for direct connection
--proxy URL Use the specified HTTP/HTTPS proxy. Pass in
an empty string (--proxy "") for direct
connection
--socket-timeout SECONDS Time to wait before giving up, in seconds
--source-address IP Client-side IP address to bind to (experimental)
-4, --force-ipv4 Make all connections via IPv4 (experimental)
-6, --force-ipv6 Make all connections via IPv6 (experimental)
--cn-verification-proxy URL Use this proxy to verify the IP address for some Chinese sites. The default proxy specified by --proxy (or none, if the options is
not present) is used for the actual downloading. (experimental)
--source-address IP Client-side IP address to bind to
(experimental)
-4, --force-ipv4 Make all connections via IPv4
(experimental)
-6, --force-ipv6 Make all connections via IPv6
(experimental)
--cn-verification-proxy URL Use this proxy to verify the IP address for
some Chinese sites. The default proxy
specified by --proxy (or none, if the
options is not present) is used for the
actual downloading. (experimental)
## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1)
--playlist-end NUMBER Playlist video to end at (default is last)
--playlist-items ITEM_SPEC Playlist video items to download. Specify indices of the videos in the playlist separated by commas like: "--playlist-items 1,2,5,8"
if you want to download videos indexed 1, 2, 5, 8 in the playlist. You can specify range: "--playlist-items 1-3,7,10-13", it will
download the videos at index 1, 2, 3, 7, 10, 11, 12 and 13.
--match-title REGEX Download only matching titles (regex or caseless sub-string)
--reject-title REGEX Skip download for matching titles (regex or caseless sub-string)
--playlist-items ITEM_SPEC Playlist video items to download. Specify
indices of the videos in the playlist
separated by commas like: "--playlist-items
1,2,5,8" if you want to download videos
indexed 1, 2, 5, 8 in the playlist. You can
specify range: "--playlist-items
1-3,7,10-13", it will download the videos
at index 1, 2, 3, 7, 10, 11, 12 and 13.
--match-title REGEX Download only matching titles (regex or
caseless sub-string)
--reject-title REGEX Skip download for matching titles (regex or
caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
--min-filesize SIZE Do not download any videos smaller than SIZE (e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE (e.g. 50k or 44.6m)
--min-filesize SIZE Do not download any videos smaller than
SIZE (e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE
(e.g. 50k or 44.6m)
--date DATE Download only videos uploaded in this date
--datebefore DATE Download only videos uploaded on or before this date (i.e. inclusive)
--dateafter DATE Download only videos uploaded on or after this date (i.e. inclusive)
--min-views COUNT Do not download any videos with less than COUNT views
--max-views COUNT Do not download any videos with more than COUNT views
--match-filter FILTER Generic video filter (experimental). Specify any key (see help for -o for a list of available keys) to match if the key is present,
!key to check if the key is not present,key > NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to compare against
a number, and & to require multiple matches. Values which are not known are excluded unless you put a question mark (?) after the
operator.For example, to only match videos that have been liked more than 100 times and disliked less than 50 times (or the dislike
functionality is not available at the given service), but who also have a description, use --match-filter "like_count > 100 &
--datebefore DATE Download only videos uploaded on or before
this date (i.e. inclusive)
--dateafter DATE Download only videos uploaded on or after
this date (i.e. inclusive)
--min-views COUNT Do not download any videos with less than
COUNT views
--max-views COUNT Do not download any videos with more than
COUNT views
--match-filter FILTER Generic video filter (experimental).
Specify any key (see help for -o for a list
of available keys) to match if the key is
present, !key to check if the key is not
present,key > NUMBER (like "comment_count >
12", also works with >=, <, <=, !=, =) to
compare against a number, and & to require
multiple matches. Values which are not
known are excluded unless you put a
question mark (?) after the operator.For
example, to only match videos that have
been liked more than 100 times and disliked
less than 50 times (or the dislike
functionality is not available at the given
service), but who also have a description,
use --match-filter "like_count > 100 &
dislike_count <? 50 & description" .
--no-playlist Download only the video, if the URL refers to a video and a playlist.
--yes-playlist Download the playlist, if the URL refers to a video and a playlist.
--age-limit YEARS Download only videos suitable for the given age
--download-archive FILE Download only videos not listed in the archive file. Record the IDs of all downloaded videos in it.
--include-ads Download advertisements as well (experimental)
--no-playlist Download only the video, if the URL refers
to a video and a playlist.
--yes-playlist Download the playlist, if the URL refers to
a video and a playlist.
--age-limit YEARS Download only videos suitable for the given
age
--download-archive FILE Download only videos not listed in the
archive file. Record the IDs of all
downloaded videos in it.
--include-ads Download advertisements as well
(experimental)
## Download Options:
-r, --rate-limit LIMIT Maximum download rate in bytes per second (e.g. 50K or 4.2M)
-R, --retries RETRIES Number of retries (default is 10), or "infinite".
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K) (default is 1024)
--no-resize-buffer Do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.
-r, --rate-limit LIMIT Maximum download rate in bytes per second
(e.g. 50K or 4.2M)
-R, --retries RETRIES Number of retries (default is 10), or
"infinite".
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
(default is 1024)
--no-resize-buffer Do not automatically adjust the buffer
size. By default, the buffer size is
automatically resized from an initial value
of SIZE.
--playlist-reverse Download playlist videos in reverse order
--xattr-set-filesize Set file xattribute ytdl.filesize with expected filesize (experimental)
--hls-prefer-native Use the native HLS downloader instead of ffmpeg (experimental)
--external-downloader COMMAND Use the specified external downloader. Currently supports aria2c,axel,curl,httpie,wget
--external-downloader-args ARGS Give these arguments to the external downloader
--xattr-set-filesize Set file xattribute ytdl.filesize with
expected filesize (experimental)
--hls-prefer-native Use the native HLS downloader instead of
ffmpeg (experimental)
--external-downloader COMMAND Use the specified external downloader.
Currently supports
aria2c,axel,curl,httpie,wget
--external-downloader-args ARGS Give these arguments to the external
downloader
## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for stdin)
-a, --batch-file FILE File containing URLs to download ('-' for
stdin)
--id Use only video ID in file name
-o, --output TEMPLATE Output filename template. Use %(title)s to get the title, %(uploader)s for the uploader name, %(uploader_id)s for the uploader
nickname if different, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(format)s for
the format description (like "22 - 1280x720" or "HD"), %(format_id)s for the unique id of the format (like YouTube's itags: "137"),
%(upload_date)s for the upload date (YYYYMMDD), %(extractor)s for the provider (youtube, metacafe, etc), %(id)s for the video id,
%(playlist_title)s, %(playlist_id)s, or %(playlist)s (=title if present, ID otherwise) for the playlist the video is in,
%(playlist_index)s for the position in the playlist. %(height)s and %(width)s for the width and height of the video format.
%(resolution)s for a textual description of the resolution of the video format. %% for a literal percent. Use - to output to stdout.
Can also be used to download to a different directory, for example with -o '/my/downloads/%(uploader)s/%(title)s-%(id)s.%(ext)s' .
--autonumber-size NUMBER Specify the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given
--restrict-filenames Restrict filenames to only ASCII characters, and avoid "&" and spaces in filenames
-A, --auto-number [deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] Number downloaded files starting from 00000
-t, --title [deprecated] Use title in file name (default)
-o, --output TEMPLATE Output filename template. Use %(title)s to
get the title, %(uploader)s for the
uploader name, %(uploader_id)s for the
uploader nickname if different,
%(autonumber)s to get an automatically
incremented number, %(ext)s for the
filename extension, %(format)s for the
format description (like "22 - 1280x720" or
"HD"), %(format_id)s for the unique id of
the format (like YouTube's itags: "137"),
%(upload_date)s for the upload date
(YYYYMMDD), %(extractor)s for the provider
(youtube, metacafe, etc), %(id)s for the
video id, %(playlist_title)s,
%(playlist_id)s, or %(playlist)s (=title if
present, ID otherwise) for the playlist the
video is in, %(playlist_index)s for the
position in the playlist. %(height)s and
%(width)s for the width and height of the
video format. %(resolution)s for a textual
description of the resolution of the video
format. %% for a literal percent. Use - to
output to stdout. Can also be used to
download to a different directory, for
example with -o '/my/downloads/%(uploader)s
/%(title)s-%(id)s.%(ext)s' .
--autonumber-size NUMBER Specify the number of digits in
%(autonumber)s when it is present in output
filename template or --auto-number option
is given
--restrict-filenames Restrict filenames to only ASCII
characters, and avoid "&" and spaces in
filenames
-A, --auto-number [deprecated; use -o
"%(autonumber)s-%(title)s.%(ext)s" ] Number
downloaded files starting from 00000
-t, --title [deprecated] Use title in file name
(default)
-l, --literal [deprecated] Alias of --title
-w, --no-overwrites Do not overwrite files
-c, --continue Force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
--no-continue Do not resume partially downloaded files (restart from beginning)
--no-part Do not use .part files - write directly into output file
--no-mtime Do not use the Last-modified header to set the file modification time
--write-description Write video description to a .description file
-c, --continue Force resume of partially downloaded files.
By default, youtube-dl will resume
downloads if possible.
--no-continue Do not resume partially downloaded files
(restart from beginning)
--no-part Do not use .part files - write directly
into output file
--no-mtime Do not use the Last-modified header to set
the file modification time
--write-description Write video description to a .description
file
--write-info-json Write video metadata to a .info.json file
--write-annotations Write video annotations to a .annotations.xml file
--load-info FILE JSON file containing the video information (created with the "--write-info-json" option)
--cookies FILE File to read cookies from and dump cookie jar in
--cache-dir DIR Location in the filesystem where youtube-dl can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl
or ~/.cache/youtube-dl . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may
change.
--write-annotations Write video annotations to a
.annotations.xml file
--load-info FILE JSON file containing the video information
(created with the "--write-info-json"
option)
--cookies FILE File to read cookies from and dump cookie
jar in
--cache-dir DIR Location in the filesystem where youtube-dl
can store some downloaded information
permanently. By default $XDG_CACHE_HOME
/youtube-dl or ~/.cache/youtube-dl . At the
moment, only YouTube player files (for
videos with obfuscated signatures) are
cached, but that may change.
--no-cache-dir Disable filesystem caching
--rm-cache-dir Delete all filesystem cache files
## Thumbnail images:
--write-thumbnail Write thumbnail image to disk
--write-all-thumbnails Write all thumbnail image formats to disk
--list-thumbnails Simulate and list all available thumbnail formats
--list-thumbnails Simulate and list all available thumbnail
formats
## Verbosity / Simulation Options:
-q, --quiet Activate quiet mode
--no-warnings Ignore warnings
-s, --simulate Do not download the video and do not write anything to disk
-s, --simulate Do not download the video and do not write
anything to disk
--skip-download Do not download the video
-g, --get-url Simulate, quiet but print URL
-e, --get-title Simulate, quiet but print title
@@ -162,78 +272,135 @@ which means you can modify it, redistribute it or use it however you like.
--get-duration Simulate, quiet but print video length
--get-filename Simulate, quiet but print output filename
--get-format Simulate, quiet but print output format
-j, --dump-json Simulate, quiet but print JSON information. See --output for a description of available keys.
-J, --dump-single-json Simulate, quiet but print JSON information for each command-line argument. If the URL refers to a playlist, dump the whole playlist
information in a single line.
--print-json Be quiet and print the video information as JSON (video is still being downloaded).
-j, --dump-json Simulate, quiet but print JSON information.
See --output for a description of available
keys.
-J, --dump-single-json Simulate, quiet but print JSON information
for each command-line argument. If the URL
refers to a playlist, dump the whole
playlist information in a single line.
--print-json Be quiet and print the video information as
JSON (video is still being downloaded).
--newline Output progress bar as new lines
--no-progress Do not print progress bar
--console-title Display progress in console titlebar
-v, --verbose Print various debugging information
--dump-pages Print downloaded pages encoded using base64 to debug problems (very verbose)
--write-pages Write downloaded intermediary pages to files in the current directory to debug problems
--dump-pages Print downloaded pages encoded using base64
to debug problems (very verbose)
--write-pages Write downloaded intermediary pages to
files in the current directory to debug
problems
--print-traffic Display sent and read HTTP traffic
-C, --call-home Contact the youtube-dl server for debugging
--no-call-home Do NOT contact the youtube-dl server for debugging
--no-call-home Do NOT contact the youtube-dl server for
debugging
## Workarounds:
--encoding ENCODING Force the specified encoding (experimental)
--no-check-certificate Suppress HTTPS certificate validation
--prefer-insecure Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)
--prefer-insecure Use an unencrypted connection to retrieve
information about the video. (Currently
supported only for YouTube)
--user-agent UA Specify a custom user agent
--referer URL Specify a custom referer, use if the video access is restricted to one domain
--add-header FIELD:VALUE Specify a custom HTTP header and its value, separated by a colon ':'. You can use this option multiple times
--bidi-workaround Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH
--sleep-interval SECONDS Number of seconds to sleep before each download.
--referer URL Specify a custom referer, use if the video
access is restricted to one domain
--add-header FIELD:VALUE Specify a custom HTTP header and its value,
separated by a colon ':'. You can use this
option multiple times
--bidi-workaround Work around terminals that lack
bidirectional text support. Requires bidiv
or fribidi executable in PATH
--sleep-interval SECONDS Number of seconds to sleep before each
download.
## Video Format Options:
-f, --format FORMAT Video format code, see the "FORMAT SELECTION" for all the info
-f, --format FORMAT Video format code, see the "FORMAT
SELECTION" for all the info
--all-formats Download all available video formats
--prefer-free-formats Prefer free video formats unless a specific one is requested
--prefer-free-formats Prefer free video formats unless a specific
one is requested
-F, --list-formats List all available formats
--youtube-skip-dash-manifest Do not download the DASH manifests and related data on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g. bestvideo+bestaudio), output to given container format. One of mkv, mp4, ogg, webm, flv. Ignored if no
merge is required
--youtube-skip-dash-manifest Do not download the DASH manifests and
related data on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g.
bestvideo+bestaudio), output to given
container format. One of mkv, mp4, ogg,
webm, flv. Ignored if no merge is required
## Subtitle Options:
--write-sub Write subtitle file
--write-auto-sub Write automatic subtitle file (YouTube only)
--all-subs Download all the available subtitles of the video
--write-auto-sub Write automatic subtitle file (YouTube
only)
--all-subs Download all the available subtitles of the
video
--list-subs List all available subtitles for the video
--sub-format FORMAT Subtitle format, accepts formats preference, for example: "srt" or "ass/srt/best"
--sub-lang LANGS Languages of the subtitles to download (optional) separated by commas, use IETF language tags like 'en,pt'
--sub-format FORMAT Subtitle format, accepts formats
preference, for example: "srt" or
"ass/srt/best"
--sub-lang LANGS Languages of the subtitles to download
(optional) separated by commas, use IETF
language tags like 'en,pt'
## Authentication Options:
-u, --username USERNAME Login with this account ID
-p, --password PASSWORD Account password. If this option is left out, youtube-dl will ask interactively.
-p, --password PASSWORD Account password. If this option is left
out, youtube-dl will ask interactively.
-2, --twofactor TWOFACTOR Two-factor auth code
-n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, smotri, youku)
## Post-processing Options:
-x, --extract-audio Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "best" by default
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default
5)
--recode-video FORMAT Encode the video to another format if necessary (currently supported: mp4|flv|ogg|webm|mkv|avi)
-x, --extract-audio Convert video files to audio-only files
(requires ffmpeg or avconv and ffprobe or
avprobe)
--audio-format FORMAT Specify audio format: "best", "aac",
"vorbis", "mp3", "m4a", "opus", or "wav";
"best" by default
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
a value between 0 (better) and 9 (worse)
for VBR or a specific bitrate like 128K
(default 5)
--recode-video FORMAT Encode the video to another format if
necessary (currently supported:
mp4|flv|ogg|webm|mkv|avi)
--postprocessor-args ARGS Give these arguments to the postprocessor
-k, --keep-video Keep the video file on disk after the post-processing; the video is erased by default
--no-post-overwrites Do not overwrite post-processed files; the post-processed files are overwritten by default
--embed-subs Embed subtitles in the video (only for mkv and mp4 videos)
-k, --keep-video Keep the video file on disk after the post-
processing; the video is erased by default
--no-post-overwrites Do not overwrite post-processed files; the
post-processed files are overwritten by
default
--embed-subs Embed subtitles in the video (only for mkv
and mp4 videos)
--embed-thumbnail Embed thumbnail in the audio as cover art
--add-metadata Write metadata to the video file
--metadata-from-title FORMAT Parse additional metadata like song title / artist from the video title. The format syntax is the same as --output, the parsed
parameters replace existing values. Additional templates: %(album)s, %(artist)s. Example: --metadata-from-title "%(artist)s -
%(title)s" matches a title like "Coldplay - Paradise"
--xattrs Write metadata to the video file's xattrs (using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn (the default;
fix file if we can, warn otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the postprocessors (default)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the postprocessors
--ffmpeg-location PATH Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory.
--exec CMD Execute a command on the file after downloading, similar to find's -exec syntax. Example: --exec 'adb push {} /sdcard/Music/ && rm
{}'
--convert-subtitles FORMAT Convert the subtitles to other format (currently supported: srt|ass|vtt)
--metadata-from-title FORMAT Parse additional metadata like song title /
artist from the video title. The format
syntax is the same as --output, the parsed
parameters replace existing values.
Additional templates: %(album)s,
%(artist)s. Example: --metadata-from-title
"%(artist)s - %(title)s" matches a title
like "Coldplay - Paradise"
--xattrs Write metadata to the video file's xattrs
(using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the
file. One of never (do nothing), warn (only
emit a warning), detect_or_warn (the
default; fix file if we can, warn
otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the
postprocessors (default)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
postprocessors
--ffmpeg-location PATH Location of the ffmpeg/avconv binary;
either the path to the binary or its
containing directory.
--exec CMD Execute a command on the file after
downloading, similar to find's -exec
syntax. Example: --exec 'adb push {}
/sdcard/Music/ && rm {}'
--convert-subtitles FORMAT Convert the subtitles to other format
(currently supported: srt|ass|vtt)
# CONFIGURATION
@@ -281,6 +448,7 @@ The `-o` option allows users to indicate a template for the output file names. T
- `playlist`: The sequence will be replaced by the name or the id of the playlist that contains the video.
- `playlist_index`: The sequence will be replaced by the index of the video in the playlist padded with leading zeros according to the total length of the playlist.
- `format_id`: The sequence will be replaced by the format code specified by `--format`.
- `duration`: The sequence will be replaced by the length of the video in seconds.
The current default template is `%(title)s-%(id)s.%(ext)s`.
@@ -358,7 +526,7 @@ If you have installed youtube-dl with a package manager, pip, setup.py or a tarb
By default, youtube-dl intends to have the best options (incidentally, if you have a convincing case that these should be different, [please file an issue where you explain that](https://yt-dl.org/bug)). Therefore, it is unnecessary and sometimes harmful to copy long option strings from webpages. In particular, the only option out of `-citw` that is regularly useful is `-i`.
### Can you please put the -b option back?
### Can you please put the `-b` option back?
Most people asking this question are not aware that youtube-dl now defaults to downloading the highest available quality as reported by YouTube, which will be 1080p or 720p in some cases, so you no longer need the `-b` option. For some specific videos, maybe YouTube does not report them to be available in a specific high quality format you're interested in. In that case, simply request it with the `-f` option and youtube-dl will try to download it.
@@ -370,13 +538,13 @@ Apparently YouTube requires you to pass a CAPTCHA test if you download too much.
Once the video is fully downloaded, use any video player, such as [vlc](http://www.videolan.org) or [mplayer](http://www.mplayerhq.hu/).
### I extracted a video URL with -g, but it does not play on another machine / in my webbrowser.
### I extracted a video URL with `-g`, but it does not play on another machine / in my webbrowser.
It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl.
It may be beneficial to use IPv6; in some cases, the restrictions are only applied to IPv4. Some services (sometimes only for a subset of videos) do not restrict the video URL by IP address, cookie, or user-agent, but these are the exception rather than the rule.
Please bear in mind that some URL protocols are **not** supported by browsers out of the box, including RTMP. If you are using -g, your own downloader must support these as well.
Please bear in mind that some URL protocols are **not** supported by browsers out of the box, including RTMP. If you are using `-g`, your own downloader must support these as well.
If you want to play the video on a machine that is not running youtube-dl, you can relay the video content from the machine that runs youtube-dl. You can use `-o -` to let youtube-dl stream a video to stdout, or simply allow the player to download the files written by youtube-dl in turn.
@@ -642,15 +810,15 @@ So please elaborate on what feature you are requesting, or what bug you want to
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the `-v` flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
If your server has multiple IPs or you suspect censorship, adding --call-home may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
### Are you using the latest version?
Before reporting any issue, type youtube-dl -U. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
### Is the issue already documented?

@@ -5,7 +5,7 @@ import os
from os.path import dirname as dirn
import sys
sys.path.append(dirn(dirn((os.path.abspath(__file__)))))
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
import youtube_dl
BASH_COMPLETION_FILE = "youtube-dl.bash-completion"
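
The change from `sys.path.append` to `sys.path.insert(0, ...)` matters because Python searches `sys.path` in order, so an appended checkout directory loses to any copy of the package found earlier on the path. A minimal sketch of that effect, assuming two temporary directories that each hold a hypothetical module named `demo_mod` (none of these names come from the repository):

```python
import importlib
import os
import sys
import tempfile


def make_module(directory, value):
    # Write a tiny hypothetical module named demo_mod that exposes ORIGIN.
    with open(os.path.join(directory, 'demo_mod.py'), 'w') as f:
        f.write('ORIGIN = %r\n' % value)


def import_origin(front_entries):
    # Import demo_mod with front_entries placed ahead of the normal search
    # path, report which copy was picked up, then restore the old state.
    saved = list(sys.path)
    sys.modules.pop('demo_mod', None)
    importlib.invalidate_caches()
    sys.path[:] = front_entries + saved
    try:
        return importlib.import_module('demo_mod').ORIGIN
    finally:
        sys.path[:] = saved
        sys.modules.pop('demo_mod', None)


installed = tempfile.mkdtemp()  # stands in for a system-wide install
checkout = tempfile.mkdtemp()   # stands in for the source checkout
make_module(installed, 'installed')
make_module(checkout, 'checkout')

# Checkout listed after the installed copy, as with sys.path.append(checkout).
appended = import_origin([installed, checkout])
# Checkout listed first, as with sys.path.insert(0, checkout).
inserted = import_origin([checkout, installed])
```

With the checkout listed last, the "installed" copy is imported; with it listed first, the checkout wins, which is presumably what these helper scripts want.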

@@ -6,7 +6,7 @@ import os
from os.path import dirname as dirn
import sys
sys.path.append(dirn(dirn((os.path.abspath(__file__)))))
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
import youtube_dl
from youtube_dl.utils import shell_quote

@@ -6,7 +6,7 @@ import os
import textwrap
# We must be able to import youtube_dl
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
import youtube_dl

@@ -9,7 +9,7 @@ import sys
# Import youtube_dl
ROOT_DIR = os.path.join(os.path.dirname(__file__), '..')
sys.path.append(ROOT_DIR)
sys.path.insert(0, ROOT_DIR)
import youtube_dl

@@ -8,6 +8,35 @@ import re
ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
README_FILE = os.path.join(ROOT_DIR, 'README.md')
def filter_options(readme):
ret = ''
in_options = False
for line in readme.split('\n'):
if line.startswith('# '):
if line[2:].startswith('OPTIONS'):
in_options = True
else:
in_options = False
if in_options:
if line.lstrip().startswith('-'):
option, description = re.split(r'\s{2,}', line.lstrip())
split_option = option.split(' ')
if not split_option[-1].startswith('-'): # metavar
option = ' '.join(split_option[:-1] + ['*%s*' % split_option[-1]])
# Pandoc's definition_lists. See http://pandoc.org/README.html
# for more information.
ret += '\n%s\n: %s\n' % (option, description)
else:
ret += line.lstrip() + '\n'
else:
ret += line + '\n'
return ret
with io.open(README_FILE, encoding='utf-8') as f:
readme = f.read()
@@ -26,6 +55,8 @@ readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
readme = PREFIX + readme
readme = filter_options(readme)
if sys.version_info < (3, 0):
print(readme.encode('utf-8'))
else:
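
The per-line step of `filter_options` turns an option line from the README into a Pandoc definition-list entry and italicises a trailing metavar. A minimal sketch of just that step, using a hypothetical helper name (`option_to_definition`) and assuming the option and its description are separated by a single run of two or more spaces:

```python
import re


def option_to_definition(line):
    # Hypothetical helper: mirrors the per-line conversion in filter_options.
    option, description = re.split(r'\s{2,}', line.lstrip())
    parts = option.split(' ')
    if not parts[-1].startswith('-'):  # trailing metavar such as FILE
        option = ' '.join(parts[:-1] + ['*%s*' % parts[-1]])
    # Pandoc definition list: term on one line, ": definition" on the next.
    return '\n%s\n: %s\n' % (option, description)


plain = option_to_definition(
    '    -q, --quiet                      Activate quiet mode')
with_metavar = option_to_definition(
    '    --load-info FILE                 JSON file containing the video information')
```

Wrapped continuation lines do not start with `-`, so in `filter_options` they take the other branch and are only left-stripped; the sketch does not cover them.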

@@ -5,7 +5,7 @@ import os
from os.path import dirname as dirn
import sys
sys.path.append(dirn(dirn((os.path.abspath(__file__)))))
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
import youtube_dl
ZSH_COMPLETION_FILE = "youtube-dl.zsh"

@@ -101,7 +101,7 @@
- **ComCarCoff**
- **ComedyCentral**
- **ComedyCentralShows**: The Daily Show / The Colbert Report
- **CondeNast**: Condé Nast media group: Condé Nast, GQ, Glamour, Vanity Fair, Vogue, W Magazine, WIRED
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **Cracked**
- **Criterion**
- **CrooksAndLiars**
@@ -150,6 +150,7 @@
- **Escapist**
- **ESPN** (Currently broken)
- **EsriVideo**
- **Europa**
- **EveryonesMixtape**
- **exfm**: ex.fm
- **ExpoTV**
@@ -158,7 +159,6 @@
- **faz.net**
- **fc2**
- **fernsehkritik.tv**
- **fernsehkritik.tv:postecke**
- **Firstpost**
- **FiveTV**
- **Flickr**
@@ -208,7 +208,6 @@
- **hitbox**
- **hitbox:live**
- **HornBunny**
- **HostingBulk**
- **HotNewHipHop**
- **Howcast**
- **HowStuffWorks**
@@ -265,6 +264,9 @@
- **Libsyn**
- **life:embed**
- **lifenews**: LIFE | NEWS
- **limelight**
- **limelight:channel**
- **limelight:channel_list**
- **LiveLeak**
- **livestream**
- **livestream:original**

@@ -89,66 +89,81 @@ def gettestcases(include_onlymatching=False):
md5 = lambda s: hashlib.md5(s.encode('utf-8')).hexdigest()
def expect_info_dict(self, got_dict, expected_dict):
def expect_value(self, got, expected, field):
if isinstance(expected, compat_str) and expected.startswith('re:'):
match_str = expected[len('re:'):]
match_rex = re.compile(match_str)
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, field))
self.assertTrue(
match_rex.match(got),
'field %s (value: %r) should match %r' % (field, got, match_str))
elif isinstance(expected, compat_str) and expected.startswith('startswith:'):
start_str = expected[len('startswith:'):]
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, field))
self.assertTrue(
got.startswith(start_str),
'field %s (value: %r) should start with %r' % (field, got, start_str))
elif isinstance(expected, compat_str) and expected.startswith('contains:'):
contains_str = expected[len('contains:'):]
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, field))
self.assertTrue(
contains_str in got,
'field %s (value: %r) should contain %r' % (field, got, contains_str))
elif isinstance(expected, type):
self.assertTrue(
isinstance(got, expected),
'Expected type %r for field %s, but got value %r of type %r' % (expected, field, got, type(got)))
elif isinstance(expected, dict) and isinstance(got, dict):
expect_dict(self, got, expected)
elif isinstance(expected, list) and isinstance(got, list):
self.assertEqual(
len(expected), len(got),
'Expect a list of length %d, but got a list of length %d for field %s' % (
len(expected), len(got), field))
for index, (item_got, item_expected) in enumerate(zip(got, expected)):
type_got = type(item_got)
type_expected = type(item_expected)
self.assertEqual(
type_expected, type_got,
'Type mismatch for list item at index %d for field %s, expected %r, got %r' % (
index, field, type_expected, type_got))
expect_value(self, item_got, item_expected, field)
else:
if isinstance(expected, compat_str) and expected.startswith('md5:'):
got = 'md5:' + md5(got)
elif isinstance(expected, compat_str) and expected.startswith('mincount:'):
self.assertTrue(
isinstance(got, (list, dict)),
'Expected field %s to be a list or a dict, but it is of type %s' % (
field, type(got).__name__))
expected_num = int(expected.partition(':')[2])
assertGreaterEqual(
self, len(got), expected_num,
'Expected %d items in field %s, but only got %d' % (expected_num, field, len(got)))
return
self.assertEqual(
expected, got,
'Invalid value for field %s, expected %r, got %r' % (field, expected, got))
def expect_dict(self, got_dict, expected_dict):
for info_field, expected in expected_dict.items():
if isinstance(expected, compat_str) and expected.startswith('re:'):
got = got_dict.get(info_field)
match_str = expected[len('re:'):]
match_rex = re.compile(match_str)
got = got_dict.get(info_field)
expect_value(self, got, expected, info_field)
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, info_field))
self.assertTrue(
match_rex.match(got),
'field %s (value: %r) should match %r' % (info_field, got, match_str))
elif isinstance(expected, compat_str) and expected.startswith('startswith:'):
got = got_dict.get(info_field)
start_str = expected[len('startswith:'):]
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, info_field))
self.assertTrue(
got.startswith(start_str),
'field %s (value: %r) should start with %r' % (info_field, got, start_str))
elif isinstance(expected, compat_str) and expected.startswith('contains:'):
got = got_dict.get(info_field)
contains_str = expected[len('contains:'):]
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, info_field))
self.assertTrue(
contains_str in got,
'field %s (value: %r) should contain %r' % (info_field, got, contains_str))
elif isinstance(expected, type):
got = got_dict.get(info_field)
self.assertTrue(isinstance(got, expected),
'Expected type %r for field %s, but got value %r of type %r' % (expected, info_field, got, type(got)))
else:
if isinstance(expected, compat_str) and expected.startswith('md5:'):
got = 'md5:' + md5(got_dict.get(info_field))
elif isinstance(expected, compat_str) and expected.startswith('mincount:'):
got = got_dict.get(info_field)
self.assertTrue(
isinstance(got, (list, dict)),
'Expected field %s to be a list or a dict, but it is of type %s' % (
info_field, type(got).__name__))
expected_num = int(expected.partition(':')[2])
assertGreaterEqual(
self, len(got), expected_num,
'Expected %d items in field %s, but only got %d' % (
expected_num, info_field, len(got)
)
)
continue
else:
got = got_dict.get(info_field)
self.assertEqual(expected, got,
'invalid value for field %s, expected %r, got %r' % (info_field, expected, got))
def expect_info_dict(self, got_dict, expected_dict):
expect_dict(self, got_dict, expected_dict)
# Check for the presence of mandatory fields
if got_dict.get('_type') not in ('playlist', 'multi_video'):
for key in ('id', 'url', 'title', 'ext'):
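
The refactored `expect_value` dispatches on string prefixes (`re:`, `startswith:`, `contains:`, `md5:`, `mincount:`), on types, and recursively on dicts and lists. The following is a simplified sketch of that dispatch which returns a boolean instead of raising assertions. The function name `matches` is hypothetical, and plain `str` stands in for `compat_str` (a Python 3 assumption):

```python
import hashlib
import re


def matches(got, expected):
    # Hypothetical boolean version of the checks performed by expect_value.
    if isinstance(expected, str):
        if expected.startswith('re:'):
            return isinstance(got, str) and re.match(expected[len('re:'):], got) is not None
        if expected.startswith('startswith:'):
            return isinstance(got, str) and got.startswith(expected[len('startswith:'):])
        if expected.startswith('contains:'):
            return isinstance(got, str) and expected[len('contains:'):] in got
        if expected.startswith('md5:'):
            return expected == 'md5:' + hashlib.md5(got.encode('utf-8')).hexdigest()
        if expected.startswith('mincount:'):
            return isinstance(got, (list, dict)) and len(got) >= int(expected.partition(':')[2])
    if isinstance(expected, type):
        return isinstance(got, expected)
    if isinstance(expected, dict) and isinstance(got, dict):
        # Only the expected keys are checked, as in expect_dict.
        return all(matches(got.get(key), value) for key, value in expected.items())
    if isinstance(expected, list) and isinstance(got, list):
        # Same length, same item types, then an item-by-item comparison.
        return len(expected) == len(got) and all(
            type(g) is type(e) and matches(g, e) for g, e in zip(got, expected))
    return expected == got


info = {
    'id': 'abc',
    'title': 'Trailer 4',
    'formats': [{'url': 'http://example.com/v.mp4'}],
    'duration': 111,
}
expected = {
    'id': 're:^[a-z]+$',
    'title': 'startswith:Trailer',
    'formats': 'mincount:1',
    'duration': int,
}
result = matches(info, expected)
```

The nested dict and list branches are what the new code adds over the old flat `expect_info_dict` loop.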

@@ -1,5 +1,5 @@
[tox]
envlist = py26,py27,py33,py34
envlist = py26,py27,py33,py34,py35
[testenv]
deps =
nose

@@ -1232,13 +1232,20 @@ class YoutubeDL(object):
except (ValueError, OverflowError, OSError):
pass
subtitles = info_dict.get('subtitles')
if subtitles:
for _, subtitle in subtitles.items():
for subtitle_format in subtitle:
if 'ext' not in subtitle_format:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
if self.params.get('listsubtitles', False):
if 'automatic_captions' in info_dict:
self.list_subtitles(info_dict['id'], info_dict.get('automatic_captions'), 'automatic captions')
self.list_subtitles(info_dict['id'], info_dict.get('subtitles'), 'subtitles')
self.list_subtitles(info_dict['id'], subtitles, 'subtitles')
return
info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], info_dict.get('subtitles'),
info_dict['id'], subtitles,
info_dict.get('automatic_captions'))
# We now pick which formats have to be downloaded
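
The new block fills in a missing `ext` for every subtitle format from its URL before subtitles are listed or selected. A rough sketch of the same idea outside `YoutubeDL`: `guess_ext` is a hypothetical, much simpler stand-in for `determine_ext`, and `fill_subtitle_exts` is likewise an illustrative name:

```python
import posixpath
from urllib.parse import urlparse


def guess_ext(url):
    # Hypothetical stand-in for determine_ext: suffix of the URL path only.
    return posixpath.splitext(urlparse(url).path)[1].lstrip('.')


def fill_subtitle_exts(subtitles):
    # subtitles maps a language code to a list of format dicts.
    for formats in (subtitles or {}).values():
        for fmt in formats:
            if 'ext' not in fmt:
                fmt['ext'] = guess_ext(fmt['url']).lower()
    return subtitles


subs = fill_subtitle_exts({
    'en': [
        {'url': 'http://example.com/subs/en.VTT?token=1'},
        {'url': 'http://example.com/x', 'ext': 'srt'},
    ],
})
```

An `ext` supplied by the extractor is left untouched; only missing ones are derived.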

@@ -11,7 +11,7 @@ if __package__ is None and not hasattr(sys, "frozen"):
# direct call of __main__.py
import os.path
path = os.path.realpath(os.path.abspath(__file__))
sys.path.append(os.path.dirname(os.path.dirname(path)))
sys.path.insert(0, os.path.dirname(os.path.dirname(path)))
import youtube_dl

@@ -416,26 +416,32 @@ if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3
else:
_terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines'])
def compat_get_terminal_size():
columns = compat_getenv('COLUMNS', None)
def compat_get_terminal_size(fallback=(80, 24)):
columns = compat_getenv('COLUMNS')
if columns:
columns = int(columns)
else:
columns = None
lines = compat_getenv('LINES', None)
lines = compat_getenv('LINES')
if lines:
lines = int(lines)
else:
lines = None
try:
sp = subprocess.Popen(
['stty', 'size'],
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = sp.communicate()
lines, columns = map(int, out.split())
except Exception:
pass
if columns is None or lines is None or columns <= 0 or lines <= 0:
try:
sp = subprocess.Popen(
['stty', 'size'],
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = sp.communicate()
_lines, _columns = map(int, out.split())
except Exception:
_columns, _lines = _terminal_size(*fallback)
if columns is None or columns <= 0:
columns = _columns
if lines is None or lines <= 0:
lines = _lines
return _terminal_size(columns, lines)
try:
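
The reworked `compat_get_terminal_size` resolves each dimension in three steps: the `COLUMNS`/`LINES` environment variables, then a probe of the terminal, then the `fallback` tuple, and it never compares `None` with an integer. A self-contained sketch of that order, in which the environment and the probe are passed in as arguments (an assumption made here for testability, not how the real function is written):

```python
import collections

TerminalSize = collections.namedtuple('TerminalSize', ['columns', 'lines'])


def resolve_size(env, probe, fallback=(80, 24)):
    # env: mapping such as os.environ.
    # probe: callable returning (columns, lines), or raising on failure.
    def parse(name):
        value = env.get(name)
        return int(value) if value else None

    columns, lines = parse('COLUMNS'), parse('LINES')
    # None is tested before any "<= 0" comparison, so None is never ordered.
    if columns is None or lines is None or columns <= 0 or lines <= 0:
        try:
            probed_columns, probed_lines = probe()
        except Exception:
            probed_columns, probed_lines = fallback
        if columns is None or columns <= 0:
            columns = probed_columns
        if lines is None or lines <= 0:
            lines = probed_lines
    return TerminalSize(columns, lines)


def failing_probe():
    raise OSError('no tty')


from_env = resolve_size({'COLUMNS': '120', 'LINES': '40'}, failing_probe)
partial = resolve_size({'COLUMNS': '120'}, lambda: (100, 30))
from_fallback = resolve_size({}, failing_probe)
```

Note that `stty size` prints rows before columns, so a probe built on it has to swap the two values before returning them in this order.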

@@ -28,9 +28,18 @@ class HlsFD(FileDownloader):
return False
ffpp.check_version()
args = [
encodeArgument(opt)
for opt in (ffpp.executable, '-y', '-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc')]
args = [ffpp.executable, '-y']
if info_dict['http_headers']:
# Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
# [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
args += [
'-headers',
''.join('%s: %s\r\n' % (key, val) for key, val in info_dict['http_headers'].items())]
args += ['-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc']
args = [encodeArgument(opt) for opt in args]
args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
self._debug_cmd(args)
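
The hunk above passes the extractor's HTTP headers to ffmpeg/avconv through `-headers`, with every header terminated by CRLF. A sketch of the argument assembly on its own: `build_args` is a hypothetical helper, and the `encodeArgument`/`encodeFilename` steps are left out:

```python
def build_args(executable, url, http_headers, output):
    # Hypothetical helper mirroring the argument list built in the hunk above.
    args = [executable, '-y']
    if http_headers:
        # One "Name: value\r\n" entry per header, joined into a single string.
        args += [
            '-headers',
            ''.join('%s: %s\r\n' % (key, val) for key, val in http_headers.items())]
    args += ['-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc']
    args.append(output)
    return args


with_headers = build_args(
    'ffmpeg', 'http://example.com/index.m3u8', {'User-Agent': 'demo'}, 'out.mp4')
without_headers = build_args(
    'ffmpeg', 'http://example.com/index.m3u8', {}, 'out.mp4')
```

The `-headers` option has to come before `-i` because it is an input option for the URL that follows it.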

@@ -158,6 +158,7 @@ from .eroprofile import EroProfileIE
from .escapist import EscapistIE
from .espn import ESPNIE
from .esri import EsriVideoIE
from .europa import EuropaIE
from .everyonesmixtape import EveryonesMixtapeIE
from .exfm import ExfmIE
from .expotv import ExpoTVIE
@@ -169,10 +170,7 @@ from .firstpost import FirstpostIE
from .firsttv import FirstTVIE
from .fivemin import FiveMinIE
from .fivetv import FiveTVIE
from .fktv import (
FKTVIE,
FKTVPosteckeIE,
)
from .fktv import FKTVIE
from .flickr import FlickrIE
from .folketinget import FolketingetIE
from .footyroom import FootyRoomIE
@@ -228,7 +226,6 @@ from .historicfilms import HistoricFilmsIE
from .history import HistoryIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hornbunny import HornBunnyIE
from .hostingbulk import HostingBulkIE
from .hotnewhiphop import HotNewHipHopIE
from .howcast import HowcastIE
from .howstuffworks import HowStuffWorksIE
@@ -298,6 +295,11 @@ from .lifenews import (
LifeNewsIE,
LifeEmbedIE,
)
from .limelight import (
LimelightMediaIE,
LimelightChannelIE,
LimelightChannelListIE,
)
from .liveleak import LiveLeakIE
from .livestream import (
LivestreamIE,

@@ -5,6 +5,7 @@ import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
ExtractorError,
float_or_none,
xpath_text,
@@ -123,7 +124,6 @@ class AdultSwimIE(InfoExtractor):
else:
collections = bootstrapped_data['show']['collections']
collection, video_info = self.find_collection_containing_video(collections, episode_path)
# Video wasn't found in the collections, let's try `slugged_video`.
if video_info is None:
if bootstrapped_data.get('slugged_video', {}).get('slug') == episode_path:
@@ -133,7 +133,9 @@ class AdultSwimIE(InfoExtractor):
show = bootstrapped_data['show']
show_title = show['title']
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
stream = video_info.get('stream')
clips = [stream] if stream else video_info['clips']
segment_ids = [clip['videoPlaybackID'] for clip in clips]
episode_id = video_info['id']
episode_title = video_info['title']
@@ -142,7 +144,7 @@ class AdultSwimIE(InfoExtractor):
entries = []
for part_num, segment_id in enumerate(segment_ids):
segment_url = 'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=mobile' % segment_id
segment_url = 'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=desktop' % segment_id
segment_title = '%s - %s' % (show_title, episode_title)
if len(segment_ids) > 1:
@@ -158,17 +160,30 @@ class AdultSwimIE(InfoExtractor):
formats = []
file_els = idoc.findall('.//files/file') or idoc.findall('./files/file')
unique_urls = []
unique_file_els = []
for file_el in file_els:
media_url = file_el.text
if not media_url or determine_ext(media_url) == 'f4m':
continue
if file_el.text not in unique_urls:
unique_urls.append(file_el.text)
unique_file_els.append(file_el)
for file_el in unique_file_els:
bitrate = file_el.attrib.get('bitrate')
ftype = file_el.attrib.get('type')
formats.append({
'format_id': '%s_%s' % (bitrate, ftype),
'url': file_el.text.strip(),
# The bitrate may not be a number (for example: 'iphone')
'tbr': int(bitrate) if bitrate.isdigit() else None,
'quality': 1 if ftype == 'hd' else -1
})
media_url = file_el.text
if determine_ext(media_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
media_url, segment_title, 'mp4', 'm3u8_native', preference=0, m3u8_id='hls'))
else:
formats.append({
'format_id': '%s_%s' % (bitrate, ftype),
'url': file_el.text.strip(),
# The bitrate may not be a number (for example: 'iphone')
'tbr': int(bitrate) if bitrate.isdigit() else None,
})
self._sort_formats(formats)
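
Note on the hunk above: the URL de-duplication step can be sketched in isolation as follows. This is a minimal standalone illustration, not the extractor's code. `dedupe_media_urls` is a hypothetical helper, and a plain suffix check stands in for `determine_ext`.

```python
def dedupe_media_urls(urls):
    """Return URLs in first-seen order, skipping empty entries and f4m manifests."""
    seen = set()
    unique = []
    for url in urls:
        # Hypothetical stand-in for determine_ext(url) == 'f4m':
        # drop any query string, then look at the suffix.
        if not url or url.split('?')[0].endswith('.f4m'):
            continue
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique
```

A set is used for membership tests here only to keep the lookup cheap; the list in the diff behaves the same for the handful of URLs a manifest usually carries.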


@@ -13,53 +13,53 @@ from ..utils import (
class AppleTrailersIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)'
_TESTS = [{
"url": "http://trailers.apple.com/trailers/wb/manofsteel/",
'url': 'http://trailers.apple.com/trailers/wb/manofsteel/',
'info_dict': {
'id': 'manofsteel',
},
"playlist": [
'playlist': [
{
"md5": "d97a8e575432dbcb81b7c3acb741f8a8",
"info_dict": {
"id": "manofsteel-trailer4",
"ext": "mov",
"duration": 111,
"title": "Trailer 4",
"upload_date": "20130523",
"uploader_id": "wb",
'md5': 'd97a8e575432dbcb81b7c3acb741f8a8',
'info_dict': {
'id': 'manofsteel-trailer4',
'ext': 'mov',
'duration': 111,
'title': 'Trailer 4',
'upload_date': '20130523',
'uploader_id': 'wb',
},
},
{
"md5": "b8017b7131b721fb4e8d6f49e1df908c",
"info_dict": {
"id": "manofsteel-trailer3",
"ext": "mov",
"duration": 182,
"title": "Trailer 3",
"upload_date": "20130417",
"uploader_id": "wb",
'md5': 'b8017b7131b721fb4e8d6f49e1df908c',
'info_dict': {
'id': 'manofsteel-trailer3',
'ext': 'mov',
'duration': 182,
'title': 'Trailer 3',
'upload_date': '20130417',
'uploader_id': 'wb',
},
},
{
"md5": "d0f1e1150989b9924679b441f3404d48",
"info_dict": {
"id": "manofsteel-trailer",
"ext": "mov",
"duration": 148,
"title": "Trailer",
"upload_date": "20121212",
"uploader_id": "wb",
'md5': 'd0f1e1150989b9924679b441f3404d48',
'info_dict': {
'id': 'manofsteel-trailer',
'ext': 'mov',
'duration': 148,
'title': 'Trailer',
'upload_date': '20121212',
'uploader_id': 'wb',
},
},
{
"md5": "5fe08795b943eb2e757fa95cb6def1cb",
"info_dict": {
"id": "manofsteel-teaser",
"ext": "mov",
"duration": 93,
"title": "Teaser",
"upload_date": "20120721",
"uploader_id": "wb",
'md5': '5fe08795b943eb2e757fa95cb6def1cb',
'info_dict': {
'id': 'manofsteel-teaser',
'ext': 'mov',
'duration': 93,
'title': 'Teaser',
'upload_date': '20120721',
'uploader_id': 'wb',
},
},
]


@@ -21,6 +21,9 @@ class BBCCoUkIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:(?:programmes|iplayer(?:/[^/]+)?/(?:episode|playlist))/)|music/clips[/#])(?P<id>[\da-z]{8})'
_MEDIASELECTOR_URLS = [
# Provides HQ HLS streams with even better quality than the pc mediaset, but fails
# with geolocation in some cases even when the programme is not geo restricted at all (e.g.
# http://www.bbc.co.uk/programmes/b06bp7lf)
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
]
@@ -153,6 +156,21 @@ class BBCCoUkIE(InfoExtractor):
'skip_download': True,
},
'skip': 'geolocation',
}, {
# iptv-all mediaset fails with geolocation however there is no geo restriction
# for this programme at all
'url': 'http://www.bbc.co.uk/programmes/b06bp7lf',
'info_dict': {
'id': 'b06bp7kf',
'ext': 'flv',
'title': "Annie Mac's Friday Night, B.Traits sits in for Annie",
'description': 'B.Traits sits in for Annie Mac with a Mini-Mix from Disclosure.',
'duration': 10800,
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
'only_matching': True,
@@ -294,7 +312,7 @@ class BBCCoUkIE(InfoExtractor):
return self._download_media_selector_url(
mediaselector_url % programme_id, programme_id)
except BBCCoUkIE.MediaSelectionError as e:
if e.id == 'notukerror':
if e.id in ('notukerror', 'geolocation'):
last_exception = e
continue
self._raise_extractor_error(e)
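
Note on the hunk above: the fallback across mediaselector URLs can be sketched on its own as below. This is a hedged illustration under stated assumptions. `select_media`, `RETRYABLE_IDS`, and the `fetch` callback are hypothetical names, and the local `MediaSelectionError` only mimics the `id` attribute used in the diff.

```python
class MediaSelectionError(Exception):
    """Hypothetical stand-in carrying the selector's error id."""

    def __init__(self, error_id):
        super().__init__(error_id)
        self.id = error_id


# Error ids that mean "try the next mediaset" rather than "give up".
RETRYABLE_IDS = ('notukerror', 'geolocation')


def select_media(url_templates, programme_id, fetch):
    """Try each template in order; return the first successful result."""
    last_exception = None
    for template in url_templates:
        try:
            return fetch(template % programme_id)
        except MediaSelectionError as e:
            if e.id in RETRYABLE_IDS:
                last_exception = e
                continue
            raise
    if last_exception is None:
        raise ValueError('no mediaselector URLs given')
    # Every mediaset failed with a retryable error: surface the last one.
    raise last_exception
```

Ordering the templates from the highest-quality mediaset to the most permissive one means a spurious geolocation failure on the first costs only one extra request.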

View File

@@ -151,12 +151,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
mobj = re.match(self._VALID_URL, url)
if mobj.group('shortname'):
if mobj.group('shortname') in ('tds', 'thedailyshow'):
url = 'http://thedailyshow.cc.com/full-episodes/'
else:
url = 'http://thecolbertreport.cc.com/full-episodes/'
mobj = re.match(self._VALID_URL, url, re.VERBOSE)
assert mobj is not None
return self.url_result('http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes')
if mobj.group('clip'):
if mobj.group('videotitle'):

View File

@@ -39,6 +39,7 @@ from ..utils import (
RegexNotFoundError,
sanitize_filename,
unescapeHTML,
unified_strdate,
url_basename,
xpath_text,
xpath_with_ns,
@@ -152,6 +153,7 @@ class InfoExtractor(object):
description: Full video description.
uploader: Full name of the video uploader.
creator: The main artist who created the video.
release_date: The date (YYYYMMDD) when the video was released.
timestamp: UNIX timestamp of the moment the video became available.
upload_date: Video upload date (YYYYMMDD).
If not explicitly set, calculated from timestamp.
@@ -163,6 +165,7 @@ class InfoExtractor(object):
with the "ext" entry and one of:
* "data": The subtitles file contents
* "url": A URL pointing to the subtitles file
"ext" will be calculated from URL if missing
automatic_captions: Like 'subtitles', used by the YoutubeIE for
automatically generated captions
duration: Length of the video in seconds, as an integer.
@@ -868,13 +871,18 @@ class InfoExtractor(object):
time.sleep(timeout)
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None,
transform_source=lambda s: fix_xml_ampersands(s).strip()):
transform_source=lambda s: fix_xml_ampersands(s).strip(),
fatal=True):
manifest = self._download_xml(
manifest_url, video_id, 'Downloading f4m manifest',
'Unable to download f4m manifest',
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244)
transform_source=transform_source)
transform_source=transform_source,
fatal=fatal)
if manifest is False:
return manifest
formats = []
manifest_version = '1.0'
@@ -895,7 +903,10 @@ class InfoExtractor(object):
# may differ leading to inability to resolve the format by requested
# bitrate in f4m downloader
if determine_ext(manifest_url) == 'f4m':
formats.extend(self._extract_f4m_formats(manifest_url, video_id, preference, f4m_id))
f4m_formats = self._extract_f4m_formats(
manifest_url, video_id, preference, f4m_id, fatal=fatal)
if f4m_formats:
formats.extend(f4m_formats)
continue
tbr = int_or_none(media_el.attrib.get('bitrate'))
formats.append({
@@ -1043,6 +1054,7 @@ class InfoExtractor(object):
video_id = os.path.splitext(url_basename(smil_url))[0]
title = None
description = None
upload_date = None
for meta in smil.findall(self._xpath_ns('./head/meta', namespace)):
name = meta.attrib.get('name')
content = meta.attrib.get('content')
@@ -1052,11 +1064,22 @@ class InfoExtractor(object):
title = content
elif not description and name in ('description', 'abstract'):
description = content
elif not upload_date and name == 'date':
upload_date = unified_strdate(content)
thumbnails = [{
'id': image.get('type'),
'url': image.get('src'),
'width': int_or_none(image.get('width')),
'height': int_or_none(image.get('height')),
} for image in smil.findall(self._xpath_ns('.//image', namespace)) if image.get('src')]
return {
'id': video_id,
'title': title or video_id,
'description': description,
'upload_date': upload_date,
'thumbnails': thumbnails,
'formats': formats,
'subtitles': subtitles,
}
@@ -1083,7 +1106,7 @@ class InfoExtractor(object):
if not src:
continue
bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
bitrate = float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
filesize = int_or_none(video.get('size') or video.get('fileSize'))
width = int_or_none(video.get('width'))
height = int_or_none(video.get('height'))
@@ -1115,8 +1138,10 @@ class InfoExtractor(object):
src_url = src if src.startswith('http') else compat_urlparse.urljoin(base, src)
if proto == 'm3u8' or src_ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
src_url, video_id, ext or 'mp4', m3u8_id='hls'))
m3u8_formats = self._extract_m3u8_formats(
src_url, video_id, ext or 'mp4', m3u8_id='hls', fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
continue
if src_ext == 'f4m':
@@ -1128,10 +1153,12 @@ class InfoExtractor(object):
}
f4m_url += '&' if '?' in f4m_url else '?'
f4m_url += compat_urllib_parse.urlencode(f4m_params)
formats.extend(self._extract_f4m_formats(f4m_url, video_id, f4m_id='hds'))
f4m_formats = self._extract_f4m_formats(f4m_url, video_id, f4m_id='hds', fatal=False)
if f4m_formats:
formats.extend(f4m_formats)
continue
if src_url.startswith('http'):
if src_url.startswith('http') and self._is_valid_url(src, video_id):
http_count += 1
formats.append({
'url': src_url,

View File

@@ -2,7 +2,6 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..compat import (
@@ -12,6 +11,7 @@ from ..compat import (
)
from ..utils import (
orderedSet,
remove_end,
)
@@ -24,21 +24,33 @@ class CondeNastIE(InfoExtractor):
# The keys are the supported sites and the values are the name to be shown
# to the user and in the extractor description.
_SITES = {
'wired': 'WIRED',
'gq': 'GQ',
'vogue': 'Vogue',
'glamour': 'Glamour',
'wmagazine': 'W Magazine',
'vanityfair': 'Vanity Fair',
'allure': 'Allure',
'architecturaldigest': 'Architectural Digest',
'arstechnica': 'Ars Technica',
'bonappetit': 'Bon Appétit',
'brides': 'Brides',
'cnevids': 'Condé Nast',
'cntraveler': 'Condé Nast Traveler',
'details': 'Details',
'epicurious': 'Epicurious',
'glamour': 'Glamour',
'golfdigest': 'Golf Digest',
'gq': 'GQ',
'newyorker': 'The New Yorker',
'self': 'SELF',
'teenvogue': 'Teen Vogue',
'vanityfair': 'Vanity Fair',
'vogue': 'Vogue',
'wired': 'WIRED',
'wmagazine': 'W Magazine',
}
_VALID_URL = r'http://(video|www|player)\.(?P<site>%s)\.com/(?P<type>watch|series|video|embed)/(?P<id>[^/?#]+)' % '|'.join(_SITES.keys())
_VALID_URL = r'http://(?:video|www|player)\.(?P<site>%s)\.com/(?P<type>watch|series|video|embed(?:js)?)/(?P<id>[^/?#]+)' % '|'.join(_SITES.keys())
IE_DESC = 'Condé Nast media group: %s' % ', '.join(sorted(_SITES.values()))
EMBED_URL = r'(?:https?:)?//player\.(?P<site>%s)\.com/(?P<type>embed)/.+?' % '|'.join(_SITES.keys())
EMBED_URL = r'(?:https?:)?//player\.(?P<site>%s)\.com/(?P<type>embed(?:js)?)/.+?' % '|'.join(_SITES.keys())
_TEST = {
_TESTS = [{
'url': 'http://video.wired.com/watch/3d-printed-speakers-lit-with-led',
'md5': '1921f713ed48aabd715691f774c451f7',
'info_dict': {
@@ -47,7 +59,16 @@ class CondeNastIE(InfoExtractor):
'title': '3D Printed Speakers Lit With LED',
'description': 'Check out these beautiful 3D printed LED speakers. You can\'t actually buy them, but LumiGeek is working on a board that will let you make you\'re own.',
}
}
}, {
# JS embed
'url': 'http://player.cnevids.com/embedjs/55f9cf8b61646d1acf00000c/5511d76261646d5566020000.js',
'md5': 'f1a6f9cafb7083bab74a710f65d08999',
'info_dict': {
'id': '55f9cf8b61646d1acf00000c',
'ext': 'mp4',
'title': '3D printed TSA Travel Sentry keys really do open TSA locks',
}
}]
def _extract_series(self, url, webpage):
title = self._html_search_regex(r'<div class="cne-series-info">.*?<h1>(.+?)</h1>',
@@ -86,8 +107,8 @@ class CondeNastIE(InfoExtractor):
info_url = base_info_url + data
info_page = self._download_webpage(info_url, video_id,
'Downloading video info')
video_info = self._search_regex(r'var video = ({.+?});', info_page, 'video info')
video_info = json.loads(video_info)
video_info = self._search_regex(r'var\s+video\s*=\s*({.+?});', info_page, 'video info')
video_info = self._parse_json(video_info, video_id)
formats = [{
'format_id': '%s-%s' % (fdata['type'].split('/')[-1], fdata['quality']),
@@ -111,6 +132,13 @@ class CondeNastIE(InfoExtractor):
url_type = mobj.group('type')
item_id = mobj.group('id')
# Convert JS embed to regular embed
if url_type == 'embedjs':
parsed_url = compat_urlparse.urlparse(url)
url = compat_urlparse.urlunparse(parsed_url._replace(
path=remove_end(parsed_url.path, '.js').replace('/embedjs/', '/embed/')))
url_type = 'embed'
self.to_screen('Extracting from %s with the Condé Nast extractor' % self._SITES[site])
webpage = self._download_webpage(url, item_id)
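
Note on the hunk above: the JS-embed to regular-embed rewrite can be tried standalone with the sketch below. It assumes Python 3's `urllib.parse` in place of `compat_urlparse`; the local `remove_end` is a simplified stand-in for the helper imported from `..utils`, and `embedjs_to_embed` is a hypothetical name.

```python
from urllib.parse import urlparse, urlunparse


def remove_end(s, end):
    """Strip a trailing suffix if present (simplified stand-in)."""
    return s[:-len(end)] if end and s.endswith(end) else s


def embedjs_to_embed(url):
    """Rewrite /embedjs/<id>/<id>.js to the matching /embed/ URL."""
    parsed = urlparse(url)
    path = remove_end(parsed.path, '.js').replace('/embedjs/', '/embed/')
    # ParseResult is a namedtuple, so _replace returns a modified copy.
    return urlunparse(parsed._replace(path=path))
```

Editing only the parsed path leaves the scheme, host, and any query string untouched.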

View File

@@ -21,7 +21,7 @@ class EaglePlatformIE(InfoExtractor):
_TESTS = [{
# http://lenta.ru/news/2015/03/06/navalny/
'url': 'http://lentaru.media.eagleplatform.com/index/player?player=new&record_id=227304&player_template_id=5201',
'md5': '0b7994faa2bd5c0f69a3db6db28d078d',
'md5': '70f5187fb620f2c1d503b3b22fd4efe3',
'info_dict': {
'id': '227304',
'ext': 'mp4',
@@ -36,7 +36,7 @@ class EaglePlatformIE(InfoExtractor):
# http://muz-tv.ru/play/7129/
# http://media.clipyou.ru/index/player?record_id=12820&width=730&height=415&autoplay=true
'url': 'eagleplatform:media.clipyou.ru:12820',
'md5': '6c2ebeab03b739597ce8d86339d5a905',
'md5': '90b26344ba442c8e44aa4cf8f301164a',
'info_dict': {
'id': '12820',
'ext': 'mp4',
@@ -48,7 +48,8 @@ class EaglePlatformIE(InfoExtractor):
'skip': 'Georestricted',
}]
def _handle_error(self, response):
@staticmethod
def _handle_error(response):
status = int_or_none(response.get('status', 200))
if status != 200:
raise ExtractorError(' '.join(response['errors']), expected=True)
@@ -58,6 +59,9 @@ class EaglePlatformIE(InfoExtractor):
self._handle_error(response)
return response
def _get_video_url(self, url_or_request, video_id, note='Downloading JSON metadata'):
return self._download_json(url_or_request, video_id, note)['data'][0]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host, video_id = mobj.group('custom_host') or mobj.group('host'), mobj.group('id')
@@ -69,7 +73,7 @@ class EaglePlatformIE(InfoExtractor):
title = media['title']
description = media.get('description')
thumbnail = media.get('snapshot')
thumbnail = self._proto_relative_url(media.get('snapshot'), 'http:')
duration = int_or_none(media.get('duration'))
view_count = int_or_none(media.get('views'))
@@ -78,13 +82,20 @@ class EaglePlatformIE(InfoExtractor):
if age_restriction:
age_limit = 0 if age_restriction == 'allow_all' else 18
m3u8_data = self._download_json(
self._proto_relative_url(media['sources']['secure_m3u8']['auto'], 'http:'),
video_id, 'Downloading m3u8 JSON')
secure_m3u8 = self._proto_relative_url(media['sources']['secure_m3u8']['auto'], 'http:')
m3u8_url = self._get_video_url(secure_m3u8, video_id, 'Downloading m3u8 JSON')
formats = self._extract_m3u8_formats(
m3u8_data['data'][0], video_id,
m3u8_url, video_id,
'mp4', entry_protocol='m3u8_native')
mp4_url = self._get_video_url(
# Secure mp4 URL is constructed according to Player.prototype.mp4 from
# http://lentaru.media.eagleplatform.com/player/player.js
re.sub(r'm3u8|hlsvod|hls|f4m', 'mp4', secure_m3u8),
video_id, 'Downloading mp4 JSON')
formats.append({'url': mp4_url, 'format_id': 'mp4'})
self._sort_formats(formats)
return {

View File

@@ -10,7 +10,7 @@ from ..utils import (
class EngadgetIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://www.engadget.com/
(?:video/5min/(?P<id>\d+)|
(?:video(?:/5min)?/(?P<id>\d+)|
[\d/]+/.*?)
'''

View File

@@ -0,0 +1,93 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
int_or_none,
orderedSet,
parse_duration,
qualities,
unified_strdate,
xpath_text
)
class EuropaIE(InfoExtractor):
_VALID_URL = r'https?://ec\.europa\.eu/avservices/(?:video/player|audio/audioDetails)\.cfm\?.*?\bref=(?P<id>[A-Za-z0-9-]+)'
_TESTS = [{
'url': 'http://ec.europa.eu/avservices/video/player.cfm?ref=I107758',
'md5': '574f080699ddd1e19a675b0ddf010371',
'info_dict': {
'id': 'I107758',
'ext': 'mp4',
'title': 'TRADE - Wikileaks on TTIP',
'description': 'NEW LIVE EC Midday press briefing of 11/08/2015',
'thumbnail': 're:^https?://.*\.jpg$',
'upload_date': '20150811',
'duration': 34,
'view_count': int,
'formats': 'mincount:3',
}
}, {
'url': 'http://ec.europa.eu/avservices/video/player.cfm?sitelang=en&ref=I107786',
'only_matching': True,
}, {
'url': 'http://ec.europa.eu/avservices/audio/audioDetails.cfm?ref=I-109295&sitelang=en',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
playlist = self._download_xml(
'http://ec.europa.eu/avservices/video/player/playlist.cfm?ID=%s' % video_id, video_id)
def get_item(type_, preference):
items = {}
for item in playlist.findall('./info/%s/item' % type_):
lang, label = xpath_text(item, 'lg', default=None), xpath_text(item, 'label', default=None)
if lang and label:
items[lang] = label.strip()
for p in preference:
if items.get(p):
return items[p]
query = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
preferred_lang = query.get('sitelang', ('en', ))[0]
preferred_langs = orderedSet((preferred_lang, 'en', 'int'))
title = get_item('title', preferred_langs) or video_id
description = get_item('description', preferred_langs)
thumbnmail = xpath_text(playlist, './info/thumburl', 'thumbnail')
upload_date = unified_strdate(xpath_text(playlist, './info/date', 'upload date'))
duration = parse_duration(xpath_text(playlist, './info/duration', 'duration'))
view_count = int_or_none(xpath_text(playlist, './info/views', 'views'))
language_preference = qualities(preferred_langs[::-1])
formats = []
for file_ in playlist.findall('./files/file'):
video_url = xpath_text(file_, './url')
if not video_url:
continue
lang = xpath_text(file_, './lg')
formats.append({
'url': video_url,
'format_id': lang,
'format_note': xpath_text(file_, './lglabel'),
'language_preference': language_preference(lang)
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnmail,
'upload_date': upload_date,
'duration': duration,
'view_count': view_count,
'formats': formats
}
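
Note on the new extractor above: its language fallback (requested `sitelang`, then `en`, then `int`) can be sketched standalone as below. `ordered_unique` is a simplified stand-in for `orderedSet`, and `pick_by_language` is a hypothetical name for the inner `get_item` selection step.

```python
def ordered_unique(values):
    """Drop duplicates while keeping first-seen order."""
    result = []
    for value in values:
        if value not in result:
            result.append(value)
    return result


def pick_by_language(items, preferred_langs):
    """Return the first non-empty label among the preferred languages."""
    for lang in preferred_langs:
        if items.get(lang):
            return items[lang]
    return None
```

Returning `None` when nothing matches lets the caller fall back to another value, as the extractor does with `or video_id` for the title.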

View File

@@ -1,13 +1,12 @@
from __future__ import unicode_literals
import re
import random
import json
from .common import InfoExtractor
from ..utils import (
get_element_by_id,
clean_html,
determine_ext,
ExtractorError,
)
@@ -17,66 +16,40 @@ class FKTVIE(InfoExtractor):
_TEST = {
'url': 'http://fernsehkritik.tv/folge-1',
'md5': '21f0b0c99bce7d5b524eb1b17b1c6d79',
'info_dict': {
'id': '00011',
'ext': 'flv',
'id': '1',
'ext': 'mp4',
'title': 'Folge 1 vom 10. April 2007',
'description': 'md5:fb4818139c7cfe6907d4b83412a6864f',
'thumbnail': 're:^https?://.*\.jpg$',
},
}
def _real_extract(self, url):
episode = int(self._match_id(url))
episode = self._match_id(url)
video_thumbnail = 'http://fernsehkritik.tv/images/magazin/folge%s.jpg' % episode
start_webpage = self._download_webpage('http://fernsehkritik.tv/folge-%s/Start' % episode,
episode)
playlist = self._search_regex(r'playlist = (\[.*?\]);', start_webpage,
'playlist', flags=re.DOTALL)
files = json.loads(re.sub('{[^{}]*?}', '{}', playlist))
webpage = self._download_webpage(
'http://fernsehkritik.tv/folge-%s/play' % episode, episode)
title = clean_html(self._html_search_regex(
'<h3>([^<]+)</h3>', webpage, 'title'))
matches = re.search(
r'(?s)<video(?:(?!poster)[^>])+(?:poster="([^"]+)")?[^>]*>(.*)</video>',
webpage)
if matches is None:
raise ExtractorError('Unable to extract the video')
videos = []
for i, _ in enumerate(files, 1):
video_id = '%04d%d' % (episode, i)
video_url = 'http://fernsehkritik.tv/js/directme.php?file=%s%s.flv' % (episode, '' if i == 1 else '-%d' % i)
videos.append({
'ext': 'flv',
'id': video_id,
'url': video_url,
'title': clean_html(get_element_by_id('eptitle', start_webpage)),
'description': clean_html(get_element_by_id('contentlist', start_webpage)),
'thumbnail': video_thumbnail
})
poster, sources = matches.groups()
if poster is None:
self.report_warning('unable to extract thumbnail')
urls = re.findall(r'<source[^>]+src="([^"]+)"', sources)
formats = [{
'url': furl,
'format_id': determine_ext(furl),
} for furl in urls]
return {
'_type': 'multi_video',
'entries': videos,
'id': 'folge-%s' % episode,
}
class FKTVPosteckeIE(InfoExtractor):
IE_NAME = 'fernsehkritik.tv:postecke'
_VALID_URL = r'http://(?:www\.)?fernsehkritik\.tv/inline-video/postecke\.php\?(.*&)?ep=(?P<ep>[0-9]+)(&|$)'
_TEST = {
'url': 'http://fernsehkritik.tv/inline-video/postecke.php?iframe=true&width=625&height=440&ep=120',
'md5': '262f0adbac80317412f7e57b4808e5c4',
'info_dict': {
'id': '0120',
'ext': 'flv',
'title': 'Postecke 120',
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
episode = int(mobj.group('ep'))
server = random.randint(2, 4)
video_id = '%04d' % episode
video_url = 'http://dl%d.fernsehkritik.tv/postecke/postecke%d.flv' % (server, episode)
video_title = 'Postecke %d' % episode
return {
'id': video_id,
'url': video_url,
'title': video_title,
'id': episode,
'title': title,
'formats': formats,
'thumbnail': poster,
}


@@ -50,6 +50,7 @@ from .dailymotion import DailymotionCloudIE
from .onionstudios import OnionStudiosIE
from .snagfilms import SnagFilmsEmbedIE
from .screenwavemedia import ScreenwaveMediaIE
from .mtv import MTVServicesEmbeddedIE
class GenericIE(InfoExtractor):
@@ -1611,12 +1612,9 @@ class GenericIE(InfoExtractor):
return self.url_result(url, ie='Vulture')
# Look for embedded mtvservices player
mobj = re.search(
r'<iframe src="(?P<url>https?://media\.mtvnservices\.com/embed/[^"]+)"',
webpage)
if mobj is not None:
url = unescapeHTML(mobj.group('url'))
return self.url_result(url, ie='MTVServicesEmbedded')
mtvservices_url = MTVServicesEmbeddedIE._extract_url(webpage)
if mtvservices_url:
return self.url_result(mtvservices_url, ie='MTVServicesEmbedded')
# Look for embedded yahoo player
mobj = re.search(
@@ -1655,7 +1653,7 @@ class GenericIE(InfoExtractor):
return self.url_result(mobj.group('url'), 'MLB')
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>%s)\1' % CondeNastIE.EMBED_URL,
r'<(?:iframe|script)[^>]+?src=(["\'])(?P<url>%s)\1' % CondeNastIE.EMBED_URL,
webpage)
if mobj is not None:
return self.url_result(self._proto_relative_url(mobj.group('url'), scheme='http:'), 'CondeNast')

View File

@@ -1,80 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_request,
)
from ..utils import (
ExtractorError,
int_or_none,
urlencode_postdata,
)
class HostingBulkIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://(?:www\.)?hostingbulk\.com/
(?:embed-)?(?P<id>[A-Za-z0-9]{12})(?:-\d+x\d+)?\.html'''
_FILE_DELETED_REGEX = r'<b>File Not Found</b>'
_TEST = {
'url': 'http://hostingbulk.com/n0ulw1hv20fm.html',
'md5': '6c8653c8ecf7ebfa83b76e24b7b2fe3f',
'info_dict': {
'id': 'n0ulw1hv20fm',
'ext': 'mp4',
'title': 'md5:5afeba33f48ec87219c269e054afd622',
'filesize': 6816081,
'thumbnail': 're:^http://.*\.jpg$',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
url = 'http://hostingbulk.com/{0:}.html'.format(video_id)
# Custom request with cookie to set language to English, so our file
# deleted regex would work.
request = compat_urllib_request.Request(
url, headers={'Cookie': 'lang=english'})
webpage = self._download_webpage(request, video_id)
if re.search(self._FILE_DELETED_REGEX, webpage) is not None:
raise ExtractorError('Video %s does not exist' % video_id,
expected=True)
title = self._html_search_regex(r'<h3>(.*?)</h3>', webpage, 'title')
filesize = int_or_none(
self._search_regex(
r'<small>\((\d+)\sbytes?\)</small>',
webpage,
'filesize',
fatal=False
)
)
thumbnail = self._search_regex(
r'<img src="([^"]+)".+?class="pic"',
webpage, 'thumbnail', fatal=False)
fields = self._hidden_inputs(webpage)
request = compat_urllib_request.Request(url, urlencode_postdata(fields))
request.add_header('Content-type', 'application/x-www-form-urlencoded')
response = self._request_webpage(request, video_id,
'Submitting download request')
video_url = response.geturl()
formats = [{
'format_id': 'sd',
'filesize': filesize,
'url': video_url,
}]
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}

View File

@@ -1,7 +1,11 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import (
int_or_none,
get_element_by_id,
remove_end,
)
class IconosquareIE(InfoExtractor):
@@ -12,7 +16,7 @@ class IconosquareIE(InfoExtractor):
'info_dict': {
'id': '522207370455279102_24101272',
'ext': 'mp4',
'title': 'Instagram media by @aguynamedpatrick (Patrick Janelle)',
'title': 'Instagram photo by @aguynamedpatrick (Patrick Janelle)',
'description': 'md5:644406a9ec27457ed7aa7a9ebcd4ce3d',
'timestamp': 1376471991,
'upload_date': '20130814',
@@ -29,8 +33,7 @@ class IconosquareIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
media = self._parse_json(
self._search_regex(
r'window\.media\s*=\s*({.+?});\n', webpage, 'media'),
get_element_by_id('mediaJson', webpage),
video_id)
formats = [{
@@ -41,9 +44,7 @@ class IconosquareIE(InfoExtractor):
} for format_id, f in media['videos'].items()]
self._sort_formats(formats)
title = self._html_search_regex(
r'<title>(.+?)(?: *\(Videos?\))? \| (?:Iconosquare|Statigram)</title>',
webpage, 'title')
title = remove_end(self._og_search_title(webpage), ' - via Iconosquare')
timestamp = int_or_none(media.get('created_time') or media.get('caption', {}).get('created_time'))
description = media.get('caption', {}).get('text')
@@ -61,6 +62,14 @@ class IconosquareIE(InfoExtractor):
'height': int_or_none(t.get('height'))
} for thumbnail_id, t in media.get('images', {}).items()]
comments = [{
'id': comment.get('id'),
'text': comment['text'],
'timestamp': int_or_none(comment.get('created_time')),
'author': comment.get('from', {}).get('full_name'),
'author_id': comment.get('from', {}).get('username'),
} for comment in media.get('comments', {}).get('data', []) if 'text' in comment]
return {
'id': video_id,
'title': title,
@@ -72,4 +81,5 @@ class IconosquareIE(InfoExtractor):
'comment_count': comment_count,
'like_count': like_count,
'formats': formats,
'comments': comments,
}

View File

@@ -95,6 +95,10 @@ class IqiyiIE(InfoExtractor):
('10', 'h1'),
]
@staticmethod
def md5_text(text):
return hashlib.md5(text.encode('utf-8')).hexdigest()
def construct_video_urls(self, data, video_id, _uuid):
def do_xor(x, y):
a = y % 3
@@ -121,7 +125,7 @@ class IqiyiIE(InfoExtractor):
note='Download path key of segment %d for format %s' % (segment_index + 1, format_id)
)['t']
t = str(int(math.floor(int(tm) / (600.0))))
return hashlib.md5((t + mg + x).encode('utf8')).hexdigest()
return self.md5_text(t + mg + x)
video_urls_dict = {}
for format_item in data['vp']['tkl'][0]['vs']:
@@ -179,20 +183,19 @@ class IqiyiIE(InfoExtractor):
def get_raw_data(self, tvid, video_id, enc_key, _uuid):
tm = str(int(time.time()))
tail = tm + tvid
param = {
'key': 'fvip',
'src': hashlib.md5(b'youtube-dl').hexdigest(),
'src': self.md5_text('youtube-dl'),
'tvId': tvid,
'vid': video_id,
'vinfo': 1,
'tm': tm,
'enc': hashlib.md5(
(enc_key + tm + tvid).encode('utf8')).hexdigest(),
'enc': self.md5_text((enc_key + tail)[1:64:2] + tail),
'qyid': _uuid,
'tn': random.random(),
'um': 0,
'authkey': hashlib.md5(
(tm + tvid).encode('utf8')).hexdigest()
'authkey': self.md5_text(self.md5_text('') + tail),
}
api_url = 'http://cache.video.qiyi.com/vms' + '?' + \
@@ -201,7 +204,8 @@ class IqiyiIE(InfoExtractor):
return raw_data
def get_enc_key(self, swf_url, video_id):
enc_key = '3601ba290e4f4662848c710e2122007e' # last update at 2015-08-10 for Zombie
# TODO: automatic key extraction
enc_key = 'eac64f22daf001da6ba9aa8da4d501508bbe90a4d4091fea3b0582a85b38c2cc' # last update at 2015-09-23-23 for Zombie::bite
return enc_key
def _real_extract(self, url):
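
Note on the hunks above: the new request signing can be reproduced outside the extractor roughly as below. This is a sketch under stated assumptions. `build_signatures` is a hypothetical helper that only mirrors the `enc` and `authkey` expressions from the diff, and the inputs are placeholders, not real keys.

```python
import hashlib


def md5_text(text):
    """Hex MD5 digest of a text string, encoded as UTF-8."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def build_signatures(enc_key, tm, tvid):
    """Compute the 'enc' and 'authkey' request parameters."""
    tail = tm + tvid
    return {
        # Every second character (indices 1, 3, ..., 63) of key + tail, then tail.
        'enc': md5_text((enc_key + tail)[1:64:2] + tail),
        # MD5 of the empty string, followed by tail, hashed again.
        'authkey': md5_text(md5_text('') + tail),
    }
```

Routing every digest through one helper keeps the UTF-8 encoding step in a single place, which is what the `md5_text` static method in the diff does.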

View File

@@ -1,46 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class KeekIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?keek\.com/(?:!|\w+/keeks/)(?P<id>\w+)'
_VALID_URL = r'https?://(?:www\.)?keek\.com/keek/(?P<id>\w+)'
IE_NAME = 'keek'
_TEST = {
'url': 'https://www.keek.com/ytdl/keeks/NODfbab',
'md5': '09c5c109067536c1cec8bac8c21fea05',
'url': 'https://www.keek.com/keek/NODfbab',
'md5': '9b0636f8c0f7614afa4ea5e4c6e57e83',
'info_dict': {
'id': 'NODfbab',
'ext': 'mp4',
'uploader': 'youtube-dl project',
'uploader_id': 'ytdl',
'title': 'test chars: "\'/\\\u00e4<>This is a test video for youtube-dl.For more information, contact phihag@phihag.de .',
'title': 'md5:35d42050a3ece241d5ddd7fdcc6fd896',
'uploader': 'ytdl',
'uploader_id': 'eGT5bab',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_url = 'http://cdn.keek.com/keek/video/%s' % video_id
thumbnail = 'http://cdn.keek.com/keek/thumbnail/%s/w100/h75' % video_id
webpage = self._download_webpage(url, video_id)
raw_desc = self._html_search_meta('description', webpage)
if raw_desc:
uploader = self._html_search_regex(
r'Watch (.*?)\s+\(', raw_desc, 'uploader', fatal=False)
uploader_id = self._html_search_regex(
r'Watch .*?\(@(.+?)\)', raw_desc, 'uploader_id', fatal=False)
else:
uploader = None
uploader_id = None
return {
'id': video_id,
'url': video_url,
'url': self._og_search_video_url(webpage),
'ext': 'mp4',
'title': self._og_search_title(webpage),
'thumbnail': thumbnail,
'uploader': uploader,
'uploader_id': uploader_id,
'title': self._og_search_description(webpage).strip(),
'thumbnail': self._og_search_thumbnail(webpage),
'uploader': self._search_regex(
r'data-username=(["\'])(?P<uploader>.+?)\1', webpage,
'uploader', fatal=False, group='uploader'),
'uploader_id': self._search_regex(
r'data-user-id=(["\'])(?P<uploader_id>.+?)\1', webpage,
'uploader id', fatal=False, group='uploader_id'),
}

View File

@@ -57,6 +57,7 @@ class KuwoIE(KuwoBaseIE):
'upload_date': '20080122',
'description': 'md5:ed13f58e3c3bf3f7fd9fbc4e5a7aa75c'
},
'skip': 'this song has been offline because of copyright issues',
}, {
'url': 'http://www.kuwo.cn/yinyue/6446136/',
'info_dict': {
@@ -76,9 +77,11 @@ class KuwoIE(KuwoBaseIE):
webpage = self._download_webpage(
url, song_id, note='Download song detail info',
errnote='Unable to get song detail info')
if '对不起,该歌曲由于版权问题已被下线,将返回网站首页' in webpage:
raise ExtractorError('this song has been offline because of copyright issues', expected=True)
song_name = self._html_search_regex(
r'<h1[^>]+title="([^"]+)">', webpage, 'song name')
r'(?s)class="(?:[^"\s]+\s+)*title(?:\s+[^"\s]+)*".*?<h1[^>]+title="([^"]+)"', webpage, 'song name')
singer_name = self._html_search_regex(
r'<div[^>]+class="s_img">\s*<a[^>]+title="([^>]+)"',
webpage, 'singer name', fatal=False)

View File

@@ -0,0 +1,229 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
)
class LimelightBaseIE(InfoExtractor):
_PLAYLIST_SERVICE_URL = 'http://production-ps.lvp.llnw.net/r/PlaylistService/%s/%s/%s'
_API_URL = 'http://api.video.limelight.com/rest/organizations/%s/%s/%s/%s.json'
def _call_playlist_service(self, item_id, method, fatal=True):
return self._download_json(
self._PLAYLIST_SERVICE_URL % (self._PLAYLIST_SERVICE_PATH, item_id, method),
item_id, 'Downloading PlaylistService %s JSON' % method, fatal=fatal)
def _call_api(self, organization_id, item_id, method):
return self._download_json(
self._API_URL % (organization_id, self._API_PATH, item_id, method),
item_id, 'Downloading API %s JSON' % method)
def _extract(self, item_id, pc_method, mobile_method, meta_method):
pc = self._call_playlist_service(item_id, pc_method)
metadata = self._call_api(pc['orgId'], item_id, meta_method)
mobile = self._call_playlist_service(item_id, mobile_method, fatal=False)
return pc, mobile, metadata
def _extract_info(self, streams, mobile_urls, properties):
video_id = properties['media_id']
formats = []
for stream in streams:
stream_url = stream.get('url')
if not stream_url:
continue
if '.f4m' in stream_url:
formats.extend(self._extract_f4m_formats(stream_url, video_id))
else:
fmt = {
'url': stream_url,
'abr': float_or_none(stream.get('audioBitRate')),
'vbr': float_or_none(stream.get('videoBitRate')),
'fps': float_or_none(stream.get('videoFrameRate')),
'width': int_or_none(stream.get('videoWidthInPixels')),
'height': int_or_none(stream.get('videoHeightInPixels')),
'ext': determine_ext(stream_url)
}
rtmp = re.search(r'^(?P<url>rtmpe?://[^/]+/(?P<app>.+))/(?P<playpath>mp4:.+)$', stream_url)
if rtmp:
format_id = 'rtmp'
if stream.get('videoBitRate'):
format_id += '-%d' % int_or_none(stream['videoBitRate'])
fmt.update({
'url': rtmp.group('url'),
'play_path': rtmp.group('playpath'),
'app': rtmp.group('app'),
'ext': 'flv',
'format_id': format_id,
})
formats.append(fmt)
for mobile_url in mobile_urls:
media_url = mobile_url.get('mobileUrl')
if not media_url:
continue
format_id = mobile_url.get('targetMediaPlatform')
if determine_ext(media_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
media_url, video_id, 'mp4', entry_protocol='m3u8_native',
preference=-1, m3u8_id=format_id))
else:
formats.append({
'url': media_url,
'format_id': format_id,
'preference': -1,
})
self._sort_formats(formats)
title = properties['title']
description = properties.get('description')
timestamp = int_or_none(properties.get('publish_date') or properties.get('create_date'))
duration = float_or_none(properties.get('duration_in_milliseconds'), 1000)
filesize = int_or_none(properties.get('total_storage_in_bytes'))
categories = [properties.get('category')]
tags = properties.get('tags', [])
thumbnails = [{
'url': thumbnail['url'],
'width': int_or_none(thumbnail.get('width')),
'height': int_or_none(thumbnail.get('height')),
} for thumbnail in properties.get('thumbnails', []) if thumbnail.get('url')]
subtitles = {}
for caption in properties.get('captions', {}):
lang = caption.get('language_code')
subtitles_url = caption.get('url')
if lang and subtitles_url:
subtitles[lang] = [{
'url': subtitles_url,
}]
return {
'id': video_id,
'title': title,
'description': description,
'formats': formats,
'timestamp': timestamp,
'duration': duration,
'filesize': filesize,
'categories': categories,
'tags': tags,
'thumbnails': thumbnails,
'subtitles': subtitles,
}
class LimelightMediaIE(LimelightBaseIE):
IE_NAME = 'limelight'
_VALID_URL = r'(?:limelight:media:|http://link\.videoplatform\.limelight\.com/media/\??\bmediaId=)(?P<id>[a-z0-9]{32})'
_TESTS = [{
'url': 'http://link.videoplatform.limelight.com/media/?mediaId=3ffd040b522b4485b6d84effc750cd86',
'info_dict': {
'id': '3ffd040b522b4485b6d84effc750cd86',
'ext': 'flv',
'title': 'HaP and the HB Prince Trailer',
'description': 'md5:8005b944181778e313d95c1237ddb640',
'thumbnail': 're:^https?://.*\.jpeg$',
'duration': 144.23,
'timestamp': 1244136834,
'upload_date': '20090604',
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
# video with subtitles
'url': 'limelight:media:a3e00274d4564ec4a9b29b9466432335',
'info_dict': {
'id': 'a3e00274d4564ec4a9b29b9466432335',
'ext': 'flv',
'title': '3Play Media Overview Video',
'description': '',
'thumbnail': 're:^https?://.*\.jpeg$',
'duration': 78.101,
'timestamp': 1338929955,
'upload_date': '20120605',
'subtitles': 'mincount:9',
},
'params': {
# rtmp download
'skip_download': True,
},
}]
_PLAYLIST_SERVICE_PATH = 'media'
_API_PATH = 'media'
def _real_extract(self, url):
video_id = self._match_id(url)
pc, mobile, metadata = self._extract(
video_id, 'getPlaylistByMediaId', 'getMobilePlaylistByMediaId', 'properties')
return self._extract_info(
pc['playlistItems'][0].get('streams', []),
mobile['mediaList'][0].get('mobileUrls', []) if mobile else [],
metadata)
class LimelightChannelIE(LimelightBaseIE):
IE_NAME = 'limelight:channel'
_VALID_URL = r'(?:limelight:channel:|http://link\.videoplatform\.limelight\.com/media/\??\bchannelId=)(?P<id>[a-z0-9]{32})'
_TEST = {
'url': 'http://link.videoplatform.limelight.com/media/?channelId=ab6a524c379342f9b23642917020c082',
'info_dict': {
'id': 'ab6a524c379342f9b23642917020c082',
'title': 'Javascript Sample Code',
},
'playlist_mincount': 3,
}
_PLAYLIST_SERVICE_PATH = 'channel'
_API_PATH = 'channels'
def _real_extract(self, url):
channel_id = self._match_id(url)
pc, mobile, medias = self._extract(
channel_id, 'getPlaylistByChannelId',
'getMobilePlaylistWithNItemsByChannelId?begin=0&count=-1', 'media')
entries = [
self._extract_info(
pc['playlistItems'][i].get('streams', []),
mobile['mediaList'][i].get('mobileUrls', []) if mobile else [],
medias['media_list'][i])
for i in range(len(medias['media_list']))]
return self.playlist_result(entries, channel_id, pc['title'])
class LimelightChannelListIE(LimelightBaseIE):
IE_NAME = 'limelight:channel_list'
_VALID_URL = r'(?:limelight:channel_list:|http://link\.videoplatform\.limelight\.com/media/\?.*?\bchannelListId=)(?P<id>[a-z0-9]{32})'
_TEST = {
'url': 'http://link.videoplatform.limelight.com/media/?channelListId=301b117890c4465c8179ede21fd92e2b',
'info_dict': {
'id': '301b117890c4465c8179ede21fd92e2b',
'title': 'Website - Hero Player',
},
'playlist_mincount': 2,
}
_PLAYLIST_SERVICE_PATH = 'channel_list'
def _real_extract(self, url):
channel_list_id = self._match_id(url)
channel_list = self._call_playlist_service(channel_list_id, 'getMobileChannelListById')
entries = [
self.url_result('limelight:channel:%s' % channel['id'], 'LimelightChannel')
for channel in channel_list['channelList']]
return self.playlist_result(entries, channel_list_id, channel_list['title'])
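
Review note: the RTMP handling in `_extract_info` hinges on one regular expression that splits a stream URL into connection URL, app and play path. A minimal standalone sketch follows. The `split_rtmp` helper and the sample URL are hypothetical; only the pattern is taken from the diff.

```python
import re

# Pattern copied from LimelightBaseIE._extract_info in the diff above.
RTMP_RE = r'^(?P<url>rtmpe?://[^/]+/(?P<app>.+))/(?P<playpath>mp4:.+)$'


def split_rtmp(stream_url):
    """Return the RTMP-specific format fields, or None for non-RTMP URLs."""
    mobj = re.search(RTMP_RE, stream_url)
    if not mobj:
        return None
    return {
        'url': mobj.group('url'),
        'app': mobj.group('app'),
        'play_path': mobj.group('playpath'),
        'ext': 'flv',
    }


# Hypothetical stream URL, for illustration only.
parts = split_rtmp('rtmp://cdn.example.com/a123/r456/mp4:videos/clip.mp4')
```

Because `app` is matched greedily, everything between the host and the final `/mp4:` segment ends up in `app`, and the `mp4:` prefix stays part of the play path.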

@@ -200,7 +200,13 @@ class MTVServicesInfoExtractor(InfoExtractor):
if mgid is None or ':' not in mgid:
mgid = self._search_regex(
[r'data-mgid="(.*?)"', r'swfobject.embedSWF\(".*?(mgid:.*?)"'],
webpage, 'mgid')
webpage, 'mgid', default=None)
if not mgid:
sm4_embed = self._html_search_meta(
'sm4:video:embed', webpage, 'sm4 embed', default='')
mgid = self._search_regex(
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid')
videos_info = self._get_videos_info(mgid)
return videos_info
@@ -222,6 +228,13 @@ class MTVServicesEmbeddedIE(MTVServicesInfoExtractor):
},
}
@staticmethod
def _extract_url(webpage):
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media.mtvnservices.com/embed/.+?)\1', webpage)
if mobj:
return mobj.group('url')
def _get_feed_url(self, uri):
video_id = self._id_from_uri(uri)
site_id = uri.replace(video_id, '')
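
Review note: the new `_extract_url` static method can be exercised without the extractor class. The sketch below copies its pattern into a free function. The function name and the sample markup are assumptions made for the example.

```python
import re

# Pattern copied from MTVServicesEmbeddedIE._extract_url in the diff above.
EMBED_RE = r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media.mtvnservices.com/embed/.+?)\1'


def extract_embed_url(webpage):
    """Return the embedded player URL, or None if the page has no such iframe."""
    mobj = re.search(EMBED_RE, webpage)
    if mobj:
        return mobj.group('url')
    return None


# Hypothetical markup with a protocol-relative embed URL.
SAMPLE = ('<iframe width="512" '
          'src="//media.mtvnservices.com/embed/mgid:uma:video:mtv.com:1234" '
          'frameborder="0"></iframe>')
```

The backreference `\1` makes the closing quote match the opening one, so both single- and double-quoted `src` attributes are handled by the same pattern.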

@@ -10,7 +10,6 @@ from ..compat import (
)
from ..utils import (
ExtractorError,
clean_html,
)
@@ -46,11 +45,11 @@ class NaverIE(InfoExtractor):
m_id = re.search(r'var rmcPlayer = new nhn.rmcnmv.RMCVideoPlayer\("(.+?)", "(.+?)"',
webpage)
if m_id is None:
m_error = re.search(
r'(?s)<div class="(?:nation_error|nation_box)">\s*(?:<!--.*?-->)?\s*<p class="[^"]+">(?P<msg>.+?)</p>\s*</div>',
webpage)
if m_error:
raise ExtractorError(clean_html(m_error.group('msg')), expected=True)
error = self._html_search_regex(
r'(?s)<div class="(?:nation_error|nation_box|error_box)">\s*(?:<!--.*?-->)?\s*<p class="[^"]+">(?P<msg>.+?)</p>\s*</div>',
webpage, 'error', default=None)
if error:
raise ExtractorError(error, expected=True)
raise ExtractorError('couldn\'t extract vid and key')
vid = m_id.group(1)
key = m_id.group(2)

@@ -126,7 +126,8 @@ class AppleDailyIE(NextMediaIE):
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'md5:23c0aac567dc08c9c16a3161a2c2e3cd',
'upload_date': '20150128',
}
},
'skip': 'redirect to http://www.appledaily.com.tw/animation/',
}, {
# No thumbnail
'url': 'http://www.appledaily.com.tw/animation/realtimenews/new/20150128/5003673/',
@@ -140,10 +141,19 @@ class AppleDailyIE(NextMediaIE):
},
'expected_warnings': [
'video thumbnail',
]
],
'skip': 'redirect to http://www.appledaily.com.tw/animation/',
}, {
'url': 'http://www.appledaily.com.tw/appledaily/article/supplement/20140417/35770334/',
'only_matching': True,
'md5': 'eaa20e6b9df418c912d7f5dec2ba734d',
'info_dict': {
'id': '35770334',
'ext': 'mp4',
'title': '咖啡占卜測 XU裝熟指數',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'md5:7b859991a6a4fedbdf3dd3b66545c748',
'upload_date': '20140417',
},
}]
_URL_PATTERN = r'\{url: \'(.+)\'\}'

@@ -107,6 +107,20 @@ class NFLIE(InfoExtractor):
'timestamp': 1442618809,
'upload_date': '20150918',
},
}, {
# lowercase data-contentid
'url': 'http://www.steelers.com/news/article-1/Tomlin-on-Ben-getting-Vick-ready/56399c96-4160-48cf-a7ad-1d17d4a3aef7',
'info_dict': {
'id': '12693586-6ea9-4743-9c1c-02c59e4a5ef2',
'ext': 'mp4',
'title': 'Tomlin looks ahead to Ravens on a short week',
'description': 'md5:32f3f7b139f43913181d5cbb24ecad75',
'timestamp': 1443459651,
'upload_date': '20150928',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.nfl.com/videos/nfl-network-top-ten/09000d5d810a6bd4/Top-10-Gutsiest-Performances-Jack-Youngblood',
'only_matching': True,
@@ -151,7 +165,7 @@ class NFLIE(InfoExtractor):
group='config'))
# For articles, the id in the url is not the video id
video_id = self._search_regex(
r'(?:<nflcs:avplayer[^>]+data-contentId\s*=\s*|contentId\s*:\s*)(["\'])(?P<id>.+?)\1',
r'(?:<nflcs:avplayer[^>]+data-content[Ii]d\s*=\s*|content[Ii]d\s*:\s*)(["\'])(?P<id>.+?)\1',
webpage, 'video id', default=video_id, group='id')
config = self._download_json(config_url, video_id, 'Downloading player config')
url_template = NFLIE.prepend_host(
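
Review note: the only functional change in this hunk is `content[Ii]d` in place of `contentId`. A short sketch of the difference follows, with `re.search` standing in for `_search_regex`. The `find_video_id` helper and the markup samples are hypothetical.

```python
import re

# Both patterns are copied from the hunk above.
OLD_RE = r'(?:<nflcs:avplayer[^>]+data-contentId\s*=\s*|contentId\s*:\s*)(["\'])(?P<id>.+?)\1'
NEW_RE = r'(?:<nflcs:avplayer[^>]+data-content[Ii]d\s*=\s*|content[Ii]d\s*:\s*)(["\'])(?P<id>.+?)\1'


def find_video_id(pattern, webpage, default=None):
    """Return the content id found in the page, else the given default."""
    mobj = re.search(pattern, webpage)
    return mobj.group('id') if mobj else default


# Hypothetical markup using the lowercase attribute name.
LOWERCASE = ('<nflcs:avplayer class="video" '
             'data-contentid="12693586-6ea9-4743-9c1c-02c59e4a5ef2"></nflcs:avplayer>')
# Hypothetical inline script using the camel-case key.
CAMELCASE = "contentId: 'abc123'"
```

With the old pattern the lowercase variant falls through to the default, which in the extractor is the id taken from the article URL rather than the actual video id.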

@@ -72,7 +72,7 @@ class NHLBaseInfoExtractor(InfoExtractor):
class NHLIE(NHLBaseInfoExtractor):
IE_NAME = 'nhl.com'
_VALID_URL = r'https?://video(?P<team>\.[^.]*)?\.nhl\.com/videocenter/(?:console)?(?:\?(?:.*?[?&])?)(?:id|hlg)=(?P<id>[-0-9a-zA-Z,]+)'
_VALID_URL = r'https?://video(?P<team>\.[^.]*)?\.nhl\.com/videocenter/(?:console|embed)?(?:\?(?:.*?[?&])?)(?:id|hlg|playlist)=(?P<id>[-0-9a-zA-Z,]+)'
_TESTS = [{
'url': 'http://video.canucks.nhl.com/videocenter/console?catid=6?id=453614',
@@ -136,6 +136,9 @@ class NHLIE(NHLBaseInfoExtractor):
'params': {
'skip_download': True, # Requires rtmpdump
}
}, {
'url': 'http://video.nhl.com/videocenter/embed?playlist=836127',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -146,9 +149,9 @@ class NHLIE(NHLBaseInfoExtractor):
class NHLNewsIE(NHLBaseInfoExtractor):
IE_NAME = 'nhl.com:news'
IE_DESC = 'NHL news'
_VALID_URL = r'https?://(?:www\.)?nhl\.com/ice/news\.html?(?:\?(?:.*?[?&])?)id=(?P<id>[-0-9a-zA-Z]+)'
_VALID_URL = r'https?://(?:.+?\.)?nhl\.com/(?:ice|club)/news\.html?(?:\?(?:.*?[?&])?)id=(?P<id>[-0-9a-zA-Z]+)'
_TEST = {
_TESTS = [{
'url': 'http://www.nhl.com/ice/news.htm?id=750727',
'md5': '4b3d1262e177687a3009937bd9ec0be8',
'info_dict': {
@@ -159,13 +162,26 @@ class NHLNewsIE(NHLBaseInfoExtractor):
'duration': 37,
'upload_date': '20150128',
},
}
}, {
# iframe embed
'url': 'http://sabres.nhl.com/club/news.htm?id=780189',
'md5': '9f663d1c006c90ac9fb82777d4294e12',
'info_dict': {
'id': '836127',
'ext': 'mp4',
'title': 'Morning Skate: OTT vs. BUF (9/23/15)',
'description': "Brian Duff chats with Tyler Ennis prior to Buffalo's first preseason home game.",
'duration': 93,
'upload_date': '20150923',
},
}]
def _real_extract(self, url):
news_id = self._match_id(url)
webpage = self._download_webpage(url, news_id)
video_id = self._search_regex(
[r'pVid(\d+)', r"nlid\s*:\s*'(\d+)'"],
[r'pVid(\d+)', r"nlid\s*:\s*'(\d+)'",
r'<iframe[^>]+src=["\']https?://video.*?\.nhl\.com/videocenter/embed\?.*\bplaylist=(\d+)'],
webpage, 'video id')
return self._real_extract_video(video_id)
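
Review note: the widened `NHLIE._VALID_URL` can be compared against the previous one with the two test URLs that appear in the diff. The `match_id` helper below is a hypothetical stand-in for `_match_id`.

```python
import re

# Old and new NHLIE._VALID_URL, copied from the hunk above.
OLD_VALID_URL = r'https?://video(?P<team>\.[^.]*)?\.nhl\.com/videocenter/(?:console)?(?:\?(?:.*?[?&])?)(?:id|hlg)=(?P<id>[-0-9a-zA-Z,]+)'
NEW_VALID_URL = r'https?://video(?P<team>\.[^.]*)?\.nhl\.com/videocenter/(?:console|embed)?(?:\?(?:.*?[?&])?)(?:id|hlg|playlist)=(?P<id>[-0-9a-zA-Z,]+)'

EMBED_URL = 'http://video.nhl.com/videocenter/embed?playlist=836127'
CONSOLE_URL = 'http://video.canucks.nhl.com/videocenter/console?catid=6?id=453614'


def match_id(pattern, url):
    """Return the 'id' group when the URL matches from its start, else None."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None
```

The embed URL is only accepted by the new pattern, while the existing console URL keeps matching under both.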

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
ExtractorError,
float_or_none,
@@ -49,7 +50,7 @@ class NRKIE(InfoExtractor):
if data['usageRights']['isGeoBlocked']:
raise ExtractorError(
'NRK har ikke rettig-heter til å vise dette programmet utenfor Norge',
'NRK har ikke rettigheter til å vise dette programmet utenfor Norge',
expected=True)
video_url = data['mediaUrl'] + '?hdcore=3.5.0&plugin=aasp-3.5.0.151.81'
@@ -196,20 +197,6 @@ class NRKTVIE(InfoExtractor):
}
]
def _debug_print(self, txt):
if self._downloader.params.get('verbose', False):
self.to_screen('[debug] %s' % txt)
def _get_subtitles(self, subtitlesurl, video_id, baseurl):
url = "%s%s" % (baseurl, subtitlesurl)
self._debug_print('%s: Subtitle url: %s' % (video_id, url))
captions = self._download_xml(
url, video_id, 'Downloading subtitles')
lang = captions.get('lang', 'no')
return {lang: [
{'ext': 'ttml', 'url': url},
]}
def _extract_f4m(self, manifest_url, video_id):
return self._extract_f4m_formats(
manifest_url + '?hdcore=3.1.1&plugin=aasp-3.1.1.69.124', video_id, f4m_id='hds')
@@ -218,7 +205,7 @@ class NRKTVIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
part_id = mobj.group('part_id')
baseurl = mobj.group('baseurl')
base_url = mobj.group('baseurl')
webpage = self._download_webpage(url, video_id)
@@ -278,11 +265,14 @@ class NRKTVIE(InfoExtractor):
self._sort_formats(formats)
subtitles_url = self._html_search_regex(
r'data-subtitlesurl[ ]*=[ ]*"([^"]+)"',
webpage, 'subtitle URL', default=None)
subtitles = None
r'data-subtitlesurl\s*=\s*(["\'])(?P<url>.+?)\1',
webpage, 'subtitle URL', default=None, group='url')
subtitles = {}
if subtitles_url:
subtitles = self.extract_subtitles(subtitles_url, video_id, baseurl)
subtitles['no'] = [{
'ext': 'ttml',
'url': compat_urlparse.urljoin(base_url, subtitles_url),
}]
return {
'id': video_id,
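
Review note: the subtitle URL is now built with `urljoin` instead of string concatenation. The sketch below uses Python 3's `urllib.parse.urljoin`, assuming it behaves like the `compat_urlparse.urljoin` used in the diff. The base URL and subtitle paths are invented for the example.

```python
from urllib.parse import urljoin


def build_subtitles(base_url, subtitles_url):
    """Mirror the new NRKTVIE logic: one Norwegian TTML track, or no tracks."""
    subtitles = {}
    if subtitles_url:
        subtitles['no'] = [{
            'ext': 'ttml',
            'url': urljoin(base_url, subtitles_url),
        }]
    return subtitles


# Hypothetical values; the real ones come from _VALID_URL and the page markup.
relative = build_subtitles('http://tv.nrk.no', '/programsubtitles/ABCD12345678')
absolute = build_subtitles('http://tv.nrk.no', 'http://subs.example.com/a.ttml')
```

Unlike `"%s%s" % (baseurl, subtitlesurl)`, `urljoin` leaves an already absolute subtitle URL untouched, and returning an empty dict instead of `None` keeps the `subtitles` field a dict in every case.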

@@ -134,6 +134,24 @@ class PBSIE(InfoExtractor):
'params': {
'skip_download': True, # requires ffmpeg
},
},
{
# Video embedded in iframe containing angle brackets as attribute's value (e.g.
# "<iframe style='position: absolute;<br />\ntop: 0; left: 0;' ...", see
# https://github.com/rg3/youtube-dl/issues/7059)
'url': 'http://www.pbs.org/food/features/a-chefs-life-season-3-episode-5-prickly-business/',
'info_dict': {
'id': '2365546844',
'display_id': 'a-chefs-life-season-3-episode-5-prickly-business',
'ext': 'mp4',
'title': "A Chef's Life - Season 3, Ep. 5: Prickly Business",
'description': 'md5:61db2ddf27c9912f09c241014b118ed1',
'duration': 1480,
'thumbnail': 're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True, # requires ffmpeg
},
}
]
@@ -167,7 +185,7 @@ class PBSIE(InfoExtractor):
return media_id, presumptive_id, upload_date
url = self._search_regex(
r'<iframe\s+[^>]*\s+src=["\']([^\'"]+partnerplayer[^\'"]+)["\']',
r'(?s)<iframe[^>]+?(?:[a-z-]+?=["\'].*?["\'][^>]+?)*?\bsrc=["\']([^\'"]+partnerplayer[^\'"]+)["\']',
webpage, 'player URL')
mobj = re.match(self._VALID_URL, url)
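
Review note: the case described in the new test comment (an attribute value containing `<br />`) can be reproduced with a small snippet. The sample iframe and its player URL are made up, and `re.search` stands in for `_search_regex`.

```python
import re

# Old and new player-URL patterns, copied from the hunk above.
OLD_RE = r'<iframe\s+[^>]*\s+src=["\']([^\'"]+partnerplayer[^\'"]+)["\']'
NEW_RE = r'(?s)<iframe[^>]+?(?:[a-z-]+?=["\'].*?["\'][^>]+?)*?\bsrc=["\']([^\'"]+partnerplayer[^\'"]+)["\']'

# Hypothetical iframe whose style attribute contains angle brackets and a newline.
SAMPLE = (
    "<iframe style='position: absolute;<br />\ntop: 0; left: 0;' "
    'src="http://player.pbs.org/partnerplayer/abc123/" width="512"></iframe>'
)


def find_player_url(pattern, webpage):
    """Return the partner player URL, or None when the pattern does not match."""
    mobj = re.search(pattern, webpage)
    return mobj.group(1) if mobj else None
```

The old pattern stops at the first `>` inside the `style` value and never reaches `src`. The new one skips whole quoted attribute values, so the stray `>` no longer ends the tag prematurely.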

@@ -25,7 +25,7 @@ class QQMusicIE(InfoExtractor):
'id': '004295Et37taLD',
'ext': 'mp3',
'title': '可惜没如果',
'upload_date': '20141227',
'release_date': '20141227',
'creator': '林俊杰',
'description': 'md5:d327722d0361576fde558f1ac68a7065',
'thumbnail': 're:^https?://.*\.jpg$',
@@ -38,11 +38,26 @@ class QQMusicIE(InfoExtractor):
'id': '004MsGEo3DdNxV',
'ext': 'mp3',
'title': '如果',
'upload_date': '20050626',
'release_date': '20050626',
'creator': '李季美',
'description': 'md5:46857d5ed62bc4ba84607a805dccf437',
'thumbnail': 're:^https?://.*\.jpg$',
}
}, {
'note': 'lyrics not in .lrc format',
'url': 'http://y.qq.com/#type=song&mid=001JyApY11tIp6',
'info_dict': {
'id': '001JyApY11tIp6',
'ext': 'mp3',
'title': 'Shadows Over Transylvania',
'release_date': '19970225',
'creator': 'Dark Funeral',
'description': 'md5:ed14d5bd7ecec19609108052c25b2c11',
'thumbnail': 're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True,
},
}]
_FORMATS = {
@@ -112,15 +127,27 @@ class QQMusicIE(InfoExtractor):
self._check_formats(formats, mid)
self._sort_formats(formats)
return {
actual_lrc_lyrics = ''.join(
line + '\n' for line in re.findall(
r'(?m)^(\[[0-9]{2}:[0-9]{2}(?:\.[0-9]{2,})?\][^\n]*|\[[^\]]*\])', lrc_content))
info_dict = {
'id': mid,
'formats': formats,
'title': song_name,
'upload_date': publish_time,
'release_date': publish_time,
'creator': singer,
'description': lrc_content,
'thumbnail': thumbnail_url,
'thumbnail': thumbnail_url
}
if actual_lrc_lyrics:
info_dict['subtitles'] = {
'origin': [{
'ext': 'lrc',
'data': actual_lrc_lyrics,
}]
}
return info_dict
class QQPlaylistBaseIE(InfoExtractor):
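
Review note: the LRC filter added here keeps only lines that look like LRC tags or timestamped lyrics. The following sketch runs the same expression on invented lyric text. `extract_lrc` is a hypothetical wrapper around the generator expression from the diff.

```python
import re

# Pattern copied from the hunk above: timestamped lines or bare [tag] lines.
LRC_LINE_RE = r'(?m)^(\[[0-9]{2}:[0-9]{2}(?:\.[0-9]{2,})?\][^\n]*|\[[^\]]*\])'


def extract_lrc(lrc_content):
    """Return only the LRC-formatted lines, each terminated by a newline."""
    return ''.join(
        line + '\n' for line in re.findall(LRC_LINE_RE, lrc_content))


# Invented sample mixing LRC lines with a plain-text line.
MIXED = '[ti:Title]\n[00:01.50]First line\nplain text\n[00:05]Second\n'
PLAIN = 'Just some prose lyrics\nwith no timestamps'
```

For text that is not in `.lrc` format the result is an empty string, which is why the extractor only sets `subtitles` when `actual_lrc_lyrics` is non-empty.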

@@ -74,7 +74,7 @@ class RuutuIE(InfoExtractor):
preference = -1 if proto == 'rtmp' else 1
label = child.get('label')
tbr = int_or_none(child.get('bitrate'))
width, height = [int_or_none(x) for x in child.get('resolution', '').split('x')]
width, height = [int_or_none(x) for x in child.get('resolution', 'x').split('x')[:2]]
formats.append({
'format_id': '%s-%s' % (proto, label if label else tbr),
'url': video_url,
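
Review note: the one-line change above guards the tuple unpacking against a missing or over-long `resolution` attribute. A minimal sketch follows. `int_or_none` here is a simplified stand-in for the helper in `..utils`, and `parse_resolution` is a hypothetical wrapper.

```python
def int_or_none(value):
    """Simplified stand-in for utils.int_or_none: int(value) or None."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_resolution(attrs):
    """New behaviour: always yields exactly two values."""
    width, height = [
        int_or_none(x) for x in attrs.get('resolution', 'x').split('x')[:2]]
    return width, height


def parse_resolution_old(attrs):
    """Old behaviour: ''.split('x') yields one item, so unpacking fails."""
    width, height = [
        int_or_none(x) for x in attrs.get('resolution', '').split('x')]
    return width, height
```

With the `'x'` default the split always produces two parts, and the `[:2]` slice drops anything beyond width and height.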

@@ -113,7 +113,7 @@ class SoundcloudIE(InfoExtractor):
},
]
_CLIENT_ID = 'b45b1aa10f1ac2941910a7f0d10f8e28'
_CLIENT_ID = '02gUJC0hH2ct1EGOcYXQIzRFU91c72Ea'
_IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'
def report_resolve(self, video_id):

@@ -16,7 +16,7 @@ from ..utils import (
class TapelyIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tape\.ly/(?P<id>[A-Za-z0-9\-_]+)(?:/(?P<songnr>\d+))?'
_VALID_URL = r'https?://(?:www\.)?(?:tape\.ly|tapely\.com)/(?P<id>[A-Za-z0-9\-_]+)(?:/(?P<songnr>\d+))?'
_API_URL = 'http://tape.ly/showtape?id={0:}'
_S3_SONG_URL = 'http://mytape.s3.amazonaws.com/{0:}'
_SOUNDCLOUD_SONG_URL = 'http://api.soundcloud.com{0:}'
@@ -42,6 +42,10 @@ class TapelyIE(InfoExtractor):
'ext': 'm4a',
},
},
{
'url': 'https://tapely.com/my-grief-as-told-by-water',
'only_matching': True,
},
]
def _real_extract(self, url):

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import int_or_none
class TumblrIE(InfoExtractor):
@@ -28,6 +29,19 @@ class TumblrIE(InfoExtractor):
'description': 'md5:dba62ac8639482759c8eb10ce474586a',
'thumbnail': 're:http://.*\.jpg',
}
}, {
'url': 'http://hdvideotest.tumblr.com/post/130323439814/test-description-for-my-hd-video',
'md5': '7ae503065ad150122dc3089f8cf1546c',
'info_dict': {
'id': '130323439814',
'ext': 'mp4',
'title': 'HD Video Testing \u2014 Test description for my HD video',
'description': 'md5:97cc3ab5fcd27ee4af6356701541319c',
'thumbnail': 're:http://.*\.jpg',
},
'params': {
'format': 'hd',
},
}, {
'url': 'http://naked-yogi.tumblr.com/post/118312946248/naked-smoking-stretching',
'md5': 'de07e5211d60d4f3a2c3df757ea9f6ab',
@@ -37,6 +51,9 @@ class TumblrIE(InfoExtractor):
'title': 'naked smoking & stretching',
'upload_date': '20150506',
'timestamp': 1430931613,
'age_limit': 18,
'uploader_id': '1638622',
'uploader': 'naked-yogi',
},
'add_ie': ['Vidme'],
}, {
@@ -66,10 +83,38 @@ class TumblrIE(InfoExtractor):
if iframe_url is None:
return self.url_result(urlh.geturl(), 'Generic')
iframe = self._download_webpage(iframe_url, video_id,
'Downloading iframe page')
video_url = self._search_regex(r'<source src="([^"]+)"',
iframe, 'video url')
iframe = self._download_webpage(iframe_url, video_id, 'Downloading iframe page')
duration = None
sources = []
sd_url = self._search_regex(
r'<source[^>]+src=(["\'])(?P<url>.+?)\1', iframe,
'sd video url', default=None, group='url')
if sd_url:
sources.append((sd_url, 'sd'))
options = self._parse_json(
self._search_regex(
r'data-crt-options=(["\'])(?P<options>.+?)\1', iframe,
'hd video url', default='', group='options'),
video_id, fatal=False)
if options:
duration = int_or_none(options.get('duration'))
hd_url = options.get('hdUrl')
if hd_url:
sources.append((hd_url, 'hd'))
formats = [{
'url': video_url,
'ext': 'mp4',
'format_id': format_id,
'height': int_or_none(self._search_regex(
r'/(\d{3,4})$', video_url, 'height', default=None)),
'quality': quality,
} for quality, (video_url, format_id) in enumerate(sources)]
self._sort_formats(formats)
# The only place where you can get a title, it's not complete,
# but searching in other places doesn't work for all videos
@@ -79,9 +124,9 @@ class TumblrIE(InfoExtractor):
return {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': video_title,
'description': self._og_search_description(webpage, default=None),
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'duration': duration,
'formats': formats,
}
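
Review note: the new format list relies on the order of `sources` (SD first, HD second) to assign `quality`, and reads the height from the trailing path segment of each URL. The sketch below reproduces that comprehension outside the extractor. The URLs are invented and `height_from_url` is a hypothetical helper replacing the inline `_search_regex` call.

```python
import re


def height_from_url(video_url):
    """Return the trailing 3-4 digit path segment as an int, if present."""
    mobj = re.search(r'/(\d{3,4})$', video_url)
    return int(mobj.group(1)) if mobj else None


def build_formats(sources):
    """sources is a list of (url, format_id) pairs, lowest quality first."""
    return [{
        'url': video_url,
        'ext': 'mp4',
        'format_id': format_id,
        'height': height_from_url(video_url),
        'quality': quality,
    } for quality, (video_url, format_id) in enumerate(sources)]


# Hypothetical source URLs, for illustration only.
formats = build_formats([
    ('https://vt.example.com/tumblr_abc123/480', 'sd'),
    ('https://vt.example.com/tumblr_abc123', 'hd'),
])
```

A URL without a numeric suffix simply gets `height: None`, which matches the commit message's intent of not making assumptions about the resolution.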

@@ -1,17 +1,20 @@
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..compat import (
compat_urlparse,
)
from ..utils import ExtractorError
from ..utils import (
ExtractorError,
int_or_none,
float_or_none,
)
class UstreamIE(InfoExtractor):
_VALID_URL = r'https?://www\.ustream\.tv/(?P<type>recorded|embed|embed/recorded)/(?P<videoID>\d+)'
_VALID_URL = r'https?://www\.ustream\.tv/(?P<type>recorded|embed|embed/recorded)/(?P<id>\d+)'
IE_NAME = 'ustream'
_TESTS = [{
'url': 'http://www.ustream.tv/recorded/20274954',
@@ -19,8 +22,12 @@ class UstreamIE(InfoExtractor):
'info_dict': {
'id': '20274954',
'ext': 'flv',
'uploader': 'Young Americans for Liberty',
'title': 'Young Americans for Liberty February 7, 2012 2:28 AM',
'description': 'Young Americans for Liberty February 7, 2012 2:28 AM',
'timestamp': 1328577035,
'upload_date': '20120207',
'uploader': 'yaliberty',
'uploader_id': '6780869',
},
}, {
# From http://sportscanada.tv/canadagames/index.php/week2/figure-skating/444
@@ -32,20 +39,21 @@ class UstreamIE(InfoExtractor):
'ext': 'flv',
'title': '-CG11- Canada Games Figure Skating',
'uploader': 'sportscanadatv',
}
},
'skip': 'This Pro Broadcaster has chosen to remove this video from the ustream.tv site.',
}]
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
video_id = m.group('videoID')
video_id = m.group('id')
# some sites use this embed format (see: http://github.com/rg3/youtube-dl/issues/2990)
if m.group('type') == 'embed/recorded':
video_id = m.group('videoID')
video_id = m.group('id')
desktop_url = 'http://www.ustream.tv/recorded/' + video_id
return self.url_result(desktop_url, 'Ustream')
if m.group('type') == 'embed':
video_id = m.group('videoID')
video_id = m.group('id')
webpage = self._download_webpage(url, video_id)
desktop_video_id = self._html_search_regex(
r'ContentVideoIds=\["([^"]*?)"\]', webpage, 'desktop_video_id')
@@ -53,52 +61,50 @@ class UstreamIE(InfoExtractor):
return self.url_result(desktop_url, 'Ustream')
params = self._download_json(
'http://cdngw.ustream.tv/rgwjson/Viewer.getVideo/' + json.dumps({
'brandId': 1,
'videoId': int(video_id),
'autoplay': False,
}), video_id)
'https://api.ustream.tv/videos/%s.json' % video_id, video_id)
if 'error' in params:
raise ExtractorError(params['error']['message'], expected=True)
error = params.get('error')
if error:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error), expected=True)
video_url = params['flv']
video = params['video']
webpage = self._download_webpage(url, video_id)
title = video['title']
filesize = float_or_none(video.get('file_size'))
self.report_extraction(video_id)
formats = [{
'id': video_id,
'url': video_url,
'ext': format_id,
'filesize': filesize,
} for format_id, video_url in video['media_urls'].items()]
self._sort_formats(formats)
video_title = self._html_search_regex(r'data-title="(?P<title>.+)"',
webpage, 'title', default=None)
description = video.get('description')
timestamp = int_or_none(video.get('created_at'))
duration = float_or_none(video.get('length'))
view_count = int_or_none(video.get('views'))
if not video_title:
try:
video_title = params['moduleConfig']['meta']['title']
except KeyError:
pass
uploader = video.get('owner', {}).get('username')
uploader_id = video.get('owner', {}).get('id')
if not video_title:
video_title = 'Ustream video ' + video_id
uploader = self._html_search_regex(r'data-content-type="channel".*?>(?P<uploader>.*?)</a>',
webpage, 'uploader', fatal=False, flags=re.DOTALL, default=None)
if not uploader:
try:
uploader = params['moduleConfig']['meta']['userName']
except KeyError:
uploader = None
thumbnail = self._html_search_regex(r'<link rel="image_src" href="(?P<thumb>.*?)"',
webpage, 'thumbnail', fatal=False)
thumbnails = [{
'id': thumbnail_id,
'url': thumbnail_url,
} for thumbnail_id, thumbnail_url in video.get('thumbnail', {}).items()]
return {
'id': video_id,
'url': video_url,
'ext': 'flv',
'title': video_title,
'title': title,
'description': description,
'thumbnails': thumbnails,
'timestamp': timestamp,
'duration': duration,
'view_count': view_count,
'uploader': uploader,
'thumbnail': thumbnail,
'uploader_id': uploader_id,
'formats': formats,
}
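
Review note: after the switch to the JSON API, formats and thumbnails are both derived from dictionaries in the `video` object. The sketch below shows that mapping on an invented payload. Only the key names (`media_urls`, `file_size`, `thumbnail`) come from the diff. The values are assumptions, the `'id': video_id` entry of each format is left out, and `sorted()` is added here purely to make the output order deterministic.

```python
def build_formats(video):
    """One format per entry in media_urls, keyed by container extension."""
    filesize = video.get('file_size')
    return [{
        'url': video_url,
        'ext': format_id,
        'filesize': filesize,
    } for format_id, video_url in sorted(video['media_urls'].items())]


def build_thumbnails(video):
    """One thumbnail per entry in the optional 'thumbnail' mapping."""
    return [{
        'id': thumbnail_id,
        'url': thumbnail_url,
    } for thumbnail_id, thumbnail_url in sorted(video.get('thumbnail', {}).items())]


# Invented payload shaped like the fields the extractor reads.
SAMPLE_VIDEO = {
    'title': 'Sample recording',
    'file_size': 1024,
    'media_urls': {'flv': 'http://cdn.example.com/1.flv'},
    'thumbnail': {'default': 'http://cdn.example.com/1.jpg'},
}
```

Since the same `file_size` value is attached to every format, it is only accurate when the API lists a single media URL.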

@@ -3,11 +3,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_urlparse,
)
from ..utils import (
find_xpath_attr,
int_or_none,
ExtractorError,
parse_duration,
unified_strdate,
)
@@ -15,7 +17,7 @@ class VideoLecturesNetIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?videolectures\.net/(?P<id>[^/#?]+)/*(?:[#?].*)?$'
IE_NAME = 'videolectures.net'
_TEST = {
_TESTS = [{
'url': 'http://videolectures.net/promogram_igor_mekjavic_eng/',
'info_dict': {
'id': 'promogram_igor_mekjavic_eng',
@@ -26,61 +28,55 @@ class VideoLecturesNetIE(InfoExtractor):
'duration': 565,
'thumbnail': 're:http://.*\.jpg',
},
}
}, {
# video with invalid direct format links (HTTP 403)
'url': 'http://videolectures.net/russir2010_filippova_nlp/',
'info_dict': {
'id': 'russir2010_filippova_nlp',
'ext': 'flv',
'title': 'NLP at Google',
'description': 'md5:fc7a6d9bf0302d7cc0e53f7ca23747b3',
'duration': 5352,
'thumbnail': 're:http://.*\.jpg',
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
'url': 'http://videolectures.net/deeplearning2015_montreal/',
'info_dict': {
'id': 'deeplearning2015_montreal',
'title': 'Deep Learning Summer School, Montreal 2015',
'description': 'md5:90121a40cc6926df1bf04dcd8563ed3b',
},
'playlist_count': 30,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
smil_url = 'http://videolectures.net/%s/video/1/smil.xml' % video_id
smil = self._download_xml(smil_url, video_id)
title = find_xpath_attr(smil, './/meta', 'name', 'title').attrib['content']
description_el = find_xpath_attr(smil, './/meta', 'name', 'abstract')
description = (
None if description_el is None
else description_el.attrib['content'])
upload_date = unified_strdate(
find_xpath_attr(smil, './/meta', 'name', 'date').attrib['content'])
try:
smil = self._download_smil(smil_url, video_id)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
# Probably a playlist
webpage = self._download_webpage(url, video_id)
entries = [
self.url_result(compat_urlparse.urljoin(url, video_url), 'VideoLecturesNet')
for _, video_url in re.findall(r'<a[^>]+href=(["\'])(.+?)\1[^>]+id=["\']lec=\d+', webpage)]
playlist_title = self._html_search_meta('title', webpage, 'title', fatal=True)
playlist_description = self._html_search_meta('description', webpage, 'description')
return self.playlist_result(entries, video_id, playlist_title, playlist_description)
info = self._parse_smil(smil, smil_url, video_id)
info['id'] = video_id
switch = smil.find('.//switch')
duration = parse_duration(switch.attrib.get('dur'))
thumbnail_el = find_xpath_attr(switch, './image', 'type', 'thumbnail')
thumbnail = (
None if thumbnail_el is None else thumbnail_el.attrib.get('src'))
if switch is not None:
info['duration'] = parse_duration(switch.attrib.get('dur'))
formats = []
for v in switch.findall('./video'):
proto = v.attrib.get('proto')
if proto not in ['http', 'rtmp']:
continue
f = {
'width': int_or_none(v.attrib.get('width')),
'height': int_or_none(v.attrib.get('height')),
'filesize': int_or_none(v.attrib.get('size')),
'tbr': int_or_none(v.attrib.get('systemBitrate')) / 1000.0,
'ext': v.attrib.get('ext'),
}
src = v.attrib['src']
if proto == 'http':
if self._is_valid_url(src, video_id):
f['url'] = src
formats.append(f)
elif proto == 'rtmp':
f.update({
'url': v.attrib['streamer'],
'play_path': src,
'rtmp_real_time': True,
})
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'upload_date': upload_date,
'duration': duration,
'thumbnail': thumbnail,
'formats': formats,
}
return info
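
Review note: when the SMIL request returns 404, the page is treated as a playlist and lecture links are collected with `re.findall`. The sketch below isolates that step. The markup and lecture slugs are invented, and `urllib.parse.urljoin` is assumed to behave like `compat_urlparse.urljoin`.

```python
import re
from urllib.parse import urljoin

# Pattern copied from the playlist branch in the hunk above.
LECTURE_LINK_RE = r'<a[^>]+href=(["\'])(.+?)\1[^>]+id=["\']lec=\d+'


def find_lecture_urls(page_url, webpage):
    """Return absolute URLs of all links whose id attribute starts with lec=N."""
    return [
        urljoin(page_url, video_url)
        for _, video_url in re.findall(LECTURE_LINK_RE, webpage)]


# Hypothetical playlist markup; one link uses single quotes around href.
SAMPLE_PAGE = (
    '<a href="/deeplearning2015_lecture_one/" id="lec=1">Lecture 1</a>\n'
    "<a href='/deeplearning2015_lecture_two/' id=\"lec=2\">Lecture 2</a>\n"
    '<a href="/about/">About</a>'
)

urls = find_lecture_urls(
    'http://videolectures.net/deeplearning2015_montreal/', SAMPLE_PAGE)
```

`re.findall` returns a `(quote, url)` tuple per match because the pattern has two groups, which is why the original loop discards the first element with `_`.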

@@ -119,6 +119,7 @@ class VidmeIE(InfoExtractor):
'url': f['uri'],
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'preference': 0 if f.get('type', '').endswith('clip') else 1,
} for f in video.get('formats', []) if f.get('uri')]
self._sort_formats(formats)

@@ -17,6 +17,7 @@ from ..utils import (
unescapeHTML,
unified_strdate,
)
from .vimeo import VimeoIE
class VKIE(InfoExtractor):
@@ -249,6 +250,10 @@ class VKIE(InfoExtractor):
if youtube_url:
return self.url_result(youtube_url, 'Youtube')
vimeo_url = VimeoIE._extract_vimeo_url(url, info_page)
if vimeo_url is not None:
return self.url_result(vimeo_url)
m_rutube = re.search(
r'\ssrc="((?:https?:)?//rutube\.ru\\?/video\\?/embed(?:.*?))\\?"', info_page)
if m_rutube is not None:

@@ -63,7 +63,9 @@ class XHamsterIE(InfoExtractor):
mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo)
webpage = self._download_webpage(mrss_url, video_id)
title = self._html_search_regex(r'<title>(?P<title>.+?) - xHamster\.com</title>', webpage, 'title')
title = self._html_search_regex(
[r'<title>(?P<title>.+?)(?:, (?:[^,]+? )?Porn: xHamster| - xHamster\.com)</title>',
r'<h1>([^<]+)</h1>'], webpage, 'title')
# Only a few videos have a description
mobj = re.search(r'<span>Description: </span>([^<]+)', webpage)

@@ -276,7 +276,7 @@ def parseOpts(overrideArguments=None):
'For example, to only match videos that have been liked more than '
'100 times and disliked less than 50 times (or the dislike '
'functionality is not available at the given service), but who '
'also have a description, use --match-filter '
'also have a description, use --match-filter '
'"like_count > 100 & dislike_count <? 50 & description" .'
))
selection.add_option(
@@ -602,7 +602,7 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'-A', '--auto-number',
action='store_true', dest='autonumber', default=False,
help='[deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] Number downloaded files starting from 00000')
help='[deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] Number downloaded files starting from 00000')
filesystem.add_option(
'-t', '--title',
action='store_true', dest='usetitle', default=False,

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2015.09.22'
__version__ = '2015.10.06.1'