Compare commits

...

237 Commits

Author SHA1 Message Date
Filippo Valsorda
7b107eea51 release 2012.10.09 2012-10-09 15:53:20 +02:00
Filippo Valsorda
646b885cbf Added missing dependencies to Makefile 2012-10-09 15:49:24 +02:00
Filippo Valsorda
0bfd0b598a Re-engineered Dailymotion qualities selection (thanks @knagano, sort of merges #176) 2012-10-09 12:28:44 +02:00
Filippo Valsorda
fd873c69a4 Merge PR #422 from 'kevinamadeus/master'
Add InfoExtractor for Google Plus video
(with fixes)
2012-10-09 10:48:49 +02:00
Filippo Valsorda
d64db7409b Merge pull request #458 from grimreaper/patch-1
There is nothing bash specific in release.sh, switch to /bin/sh
2012-10-09 01:16:40 -07:00
Philipp Hagemeister
27fec0e3bd Merge branch 'master' of github.com:rg3/youtube-dl 2012-10-08 22:14:28 +02:00
Philipp Hagemeister
65f934dc93 Correct detect_executables on Windows (Closes #447, #457) 2012-10-08 22:14:19 +02:00
grimreaper
d51d784f85 There is nothing bash specific here
/bin/bash is always wrong. Since there is nothing bash specific here, switch to /bin/sh
2012-10-06 10:00:40 -03:00
Filippo Valsorda
aa85963987 Merge pull request #452 from Tailszefox/local
Added uploaded date for Dailymotion
2012-10-03 11:29:51 -07:00
Tailszefox
413575f7a5 Added uploaded date for Dailymotion 2012-10-03 10:57:46 +02:00
Philipp Hagemeister
b7b4796bf2 Fix docs 2012-10-01 18:39:24 +02:00
Philipp Hagemeister
fcbc8c830e Merge branch 'master' of github.com:rg3/youtube-dl 2012-10-01 18:38:19 +02:00
Philipp Hagemeister
f48ce130c7 Fix doc of extractor field 2012-10-01 18:38:10 +02:00
Filippo Valsorda
13e69f546c Merged, modified and compiled Dailymotion pull request #446 by @Steap 2012-09-30 21:45:43 +02:00
Cyril Roelandt
63ec7b7479 DailymotionIE: There is not necessarily an underscore in a Dailymotion URL. 2012-09-30 15:47:37 +02:00
Cyril Roelandt
7b6d7001d8 DailymotionIE: some videos do not use the "hqURL", "sdURL", "ldURL" keywords. In this case, the "video_url" keyword should be looked for. 2012-09-30 15:47:29 +02:00
Filippo Valsorda
39ce6e79e7 Updated youtube-dl.exe 2012-09-29 19:12:56 +02:00
Filippo Valsorda
5c961d89df Merge pull request #403 from FiloSottile/re_VERBOSE 2012-09-29 17:05:40 +02:00
Filippo Valsorda
3c4d6c9eba Not all Dailymotion videos have an hqURL, now downloads highest quality available 2012-09-29 16:53:06 +02:00
Filippo Valsorda
349e2e3e21 Fixed DailymotionIE, now downloads high-def mp4s, which might be too much (?) 2012-09-29 16:38:38 +02:00
Filippo Valsorda
551fa9dfbf adding new --output replacements. Thanks @danut007ro (closes #442) 2012-09-29 15:49:10 +02:00
Filippo Valsorda
ce3674430b added new FAQ on exe dependency 2012-09-29 15:35:07 +02:00
Filippo Valsorda
5cdfaeb37b New FAQ: What is this binary file? (+ small fix to other one) 2012-09-28 19:55:18 +02:00
Philipp Hagemeister
38612b4edc update default UA string (Closes #390) 2012-09-27 23:38:11 +02:00
Philipp Hagemeister
6c5b442a9b Add recent breakage to FAQ (Closes #433) 2012-09-27 23:30:17 +02:00
Philipp Hagemeister
5a5523698d Add new field "extractor" to the info dictionary 2012-09-27 20:48:16 +02:00
Philipp Hagemeister
05a2c206be Merge pull request #425 from danut007ro/master
Provider (youtube, etc) is now saved in info_dict
2012-09-27 11:45:07 -07:00
Philipp Hagemeister
8ca21983d8 Merge pull request #432 from cryzed/master
Fixed YouTube playlist parsing
2012-09-27 11:42:58 -07:00
Philipp Hagemeister
20326b8b1b Let Makefile use youtube-dl source code instead of compiled binary 2012-09-27 20:21:20 +02:00
Philipp Hagemeister
5d534e2fe6 Improve option definitions 2012-09-27 20:19:27 +02:00
Philipp Hagemeister
234e230c87 Merge remote-tracking branch 'FiloSottille/vbr'
Conflicts:
	youtube-dl
	youtube-dl.exe
2012-09-27 20:18:29 +02:00
Philipp Hagemeister
34ae0f9d20 Merge branch 'master' of github.com:rg3/youtube-dl 2012-09-27 19:56:29 +02:00
Philipp Hagemeister
df09e5f9e1 Merge pull request #405 from hdclark/master
Support for custom user agent
2012-09-27 10:56:25 -07:00
cryzed
3af2f7656c Fixed YouTube playlist parsing 2012-09-27 19:48:29 +02:00
Philipp Hagemeister
74e716bb64 original test video 2012-09-27 19:44:44 +02:00
Philipp Hagemeister
85f76ac90b Merge remote-tracking branch 'FiloSottille/automation' 2012-09-27 19:41:51 +02:00
Philipp Hagemeister
7f36e39676 Merge remote-tracking branch 'FiloSottille/supports'
Conflicts:
	youtube-dl
2012-09-27 19:24:41 +02:00
Philipp Hagemeister
ebe3f89ea4 Merge xnxx.com Support (NSFW). Test URL (SFW): http://video.xnxx.com/video1443330/youtube-dl_testvid_a_and_9829_._and_amp_and_38_ 2012-09-27 18:55:56 +02:00
Philipp Hagemeister
b5de8af234 Release 2012.09.27 2012-09-27 11:25:46 +02:00
Philipp Hagemeister
eb817499b0 Compile updated youtube-dl 2012-09-27 11:23:44 +02:00
Philipp Hagemeister
e2af9232b2 Merge pull request #428 from virtulis/master
A quick fix to #427
2012-09-27 02:22:05 -07:00
Danko Alexeyev
9ca667065e Add 'signature' to YouTube URLs, fixes #427 2012-09-27 09:44:49 +03:00
danut007ro
ae16f68f4a Provider (youtube, etc) is now saved in info_dict, so template filename can be something like %(provider)s_%(id)s.%(ext)s
This can be useful because videos should also be identified by their providers since id's can be the same on multiple providers.
2012-09-27 00:35:31 +03:00
danut007ro
3cd98c7894 Removed provider (mistake) and add provider parameter to process_info 2012-09-27 00:07:20 +03:00
danut007ro
2866e68838 Merge branch 'master' of https://github.com/rg3/youtube-dl 2012-09-26 21:09:44 +03:00
danut007ro
be8786a6a4 Every extractor also return it's name. 2012-09-26 21:00:28 +03:00
Filippo Valsorda
0e841bdc54 add PREFIX option to make install 2012-09-26 00:10:39 +02:00
Filippo Valsorda
225dceb046 moved make release to devscripts/release.sh 2012-09-25 23:56:01 +02:00
Philipp Hagemeister
b0d4f95899 Merge pull request #391 from rbrito/support-tube.majestyc.net
Support downloading Youtube videos via tube.majestyc.net
2012-09-25 14:17:13 -07:00
Kevin Kwan
d443aca863 Add InfoExtractor for Google Plus video 2012-09-25 16:21:02 +08:00
hdclark
ea46fe2dd4 Added support for custom user agents.
Added a few simple lines to add support for the flag "--user-agent" to pass a custom string to std_header['User-Agent'].
2012-08-22 23:40:35 -07:00
Filippo Valsorda
202e76cfb0 Made the YouTubeIE regex verbose/commented 2012-08-20 00:58:10 +02:00
Filippo Valsorda
3a68d7b467 tweaked the --audio-quality input validation/specification 2012-08-19 23:25:16 +02:00
Filippo Valsorda
795cc5059a Re-engineered XNXXIE to actually exit on ERRORs even with -i 2012-08-19 18:46:23 +02:00
Filippo Valsorda
5dc846fad0 Merge pull request #398 from tempname/master 2012-08-19 18:39:43 +02:00
Filippo Valsorda
d5c4c4c10e bugfix and standarize the youku.com support 2012-08-19 17:44:34 +02:00
Filippo Valsorda
1ac3e3315e Merge pull request #395 from thesues/master 2012-08-19 17:08:39 +02:00
Filippo Valsorda
0e4dc2fc74 Merge 'rbrito/support-tube.majestyc.net' (PR #391) with small fix 2012-08-19 17:00:20 +02:00
Filippo Valsorda
9bb8dc8e42 Python 2.6 compatibility fix. Thanks @Jamesc359 - closes #400 2012-08-19 16:06:33 +02:00
tempname
154b55dae3 added InfoExtractor for XNXX 2012-08-15 20:57:27 -03:00
tempname
6de7ef9b8d added InfoExtractor for XNXX 2012-08-15 20:54:03 -03:00
dongmao zhang
392105265c Merge branch 'master' of github.com:thesues/youtube-dl
Conflicts:
	youtube-dl
	youtube_dl/InfoExtractors.py
2012-08-10 18:32:28 +08:00
dongmao zhang
51661d8600 add www.youku.com support 2012-08-09 13:54:19 +08:00
dongmao zhang
b5809a68bf merge 2012-08-09 12:26:26 +08:00
dongmao zhang
7733d455c8 fix 0a->0A bug 2012-08-09 03:14:02 +08:00
dongmao zhang
0a98b09bc2 youku default to download hd2 video 2012-08-09 02:53:21 +08:00
dongmao zhang
302efc19ea add youku support 2012-08-09 02:04:02 +08:00
Rogério Brito
55a1fa8a56 Support downloading Youtube videos via tube.majestyc.net
A user requested (in Debian's bug tracking system) that support for
tube.majestyc.net, a frontend for Youtube with accessibility functions
(and other support for other assistive technologies), be added.

This patch adds support for this.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2012-08-05 23:37:33 -03:00
Filippo Valsorda
dce1088450 A more "make-esque" Makefile with file targets and dependencies 2012-08-03 20:10:54 +02:00
Philipp Hagemeister
a171dbfc27 Merge pull request #386 from FiloSottile/blip
Blip.tv
2012-08-01 12:26:00 -07:00
Filippo Valsorda
11a141dec9 BlipTVUserIE fix 2012-08-01 21:11:04 +02:00
Filippo Valsorda
818282710b moved the User-Agent workaround to the BlipTV IE 2012-08-01 20:51:56 +02:00
Filippo Valsorda
7a7c093ab0 added one-step realese script 'make release version=nn' - closes #158 2012-08-01 18:40:27 +02:00
Filippo Valsorda
ce7b2a40d0 added automatically generated bash-completion; closes #191 2012-08-01 17:26:50 +02:00
Filippo Valsorda
cfcec69331 auto-generating manpage from README.md (closes #151); redesigned Makefile 2012-08-01 11:54:27 +02:00
Filippo Valsorda
91645066e2 Merge branch 'joehillen/master' - pull request #381 2012-08-01 11:35:04 +02:00
Filippo Valsorda
dee5d76923 changed YouTube closed captions URL; closes #382 2012-07-31 15:56:35 +02:00
Filippo Valsorda
363a4e1114 xvideos patch by @pocoimporta - closes #370 2012-07-31 01:40:29 +02:00
joehillen
ef0c08cdfe Added install target to Makefile. 2012-07-22 13:36:22 -07:00
Philipp Hagemeister
3210735c49 Fix EscapistMagazine IE 2012-07-18 21:17:51 +02:00
Filippo Valsorda
b24676ce88 changed --audio-quality behaviour to support both CBR and VBR 2012-07-14 19:43:24 +02:00
Filippo Valsorda
cca4828ac9 fixed a logic bug in post-processing 2012-07-14 14:35:57 +02:00
Filippo Valsorda
d4e16d3e97 YouTube playlist fix; closes #365 and #331 2012-06-30 15:04:30 +02:00
Filippo Valsorda
65dc7d0272 Merge pull request #363 from chalet16/master
Change a number of subtitle sequence to begin with one - closes #362
2012-06-26 05:35:37 -07:00
Witchakorn Kamolpornwijit
5404179338 Change a number of subtitle sequence to begin with one (instead of zero) for ffmpeg,avcodec, and Matroska compatibility 2012-06-26 19:24:30 +07:00
Filippo Valsorda
7df97fb59f display a meaningful error message on rental videos (#359) 2012-06-22 13:57:17 +02:00
Filippo Valsorda
3187e42a23 Merge pull requests #356 #357 #358 by jcarlosgarciasegovia 2012-06-06 20:51:29 +02:00
Juan Carlos Garcia Segovia
f1927d71e4 Some blip.tv URLs use Unicode characters. urllib2 breaks when passing a Unicode string. it needs a UTF-8 byte buffer 2012-06-06 16:24:29 +00:00
Juan Carlos Garcia Segovia
eeeb4daabc Information Extractor for blip.tv users 2012-06-06 16:16:16 +00:00
Juan Carlos Garcia Segovia
3c4fc580bb Use an User-Agent that will allow downloading from blip.tv fixes #325 2012-06-06 13:24:12 +00:00
Filippo Valsorda
17f3c40a31 Merge pull request #353 from FiloSottile/avconv
check for avconv and ffmpeg, use as available; closes #344
2012-06-03 03:39:16 -07:00
Filippo Valsorda
505ed3088f normalize ffmpeg/avconv names printing 2012-06-03 12:11:39 +02:00
Filippo Valsorda
0b976545c7 check for avconv and ffmpeg, use as available; closes #344 2012-06-03 12:10:15 +02:00
Philipp Hagemeister
a047951477 Merge pull request #352 from chocolateboy/decontaminate_stdout
don't corrupt stdout (-o -) in verbose mode
2012-05-31 00:04:32 -07:00
chocolateboy
6ab92c8b62 don't corrupt stdout (-o -) in verbose mode 2012-05-30 11:50:13 +01:00
Filippo Valsorda
f36cd07685 fixed a couple of Windows exe update bugs 2012-05-27 23:03:45 +02:00
Philipp Hagemeister
668d975039 quiet zip in make compile 2012-05-23 19:19:53 +02:00
Philipp Hagemeister
9ab3406ddb Fix Escapist IE 2012-05-23 19:19:31 +02:00
Philipp Hagemeister
1b91a2e2cf Merge pull request #342 from FiloSottile/master
Re-organized code and a lot of other stuff.
2012-05-22 04:35:59 -07:00
Filippo Valsorda
2c288bda42 reorganized the titles sanitizing: now title is the untouched title
and stitle is created in process_info() and is cross-filesystem sanitized by sanitize_filename();
closes #164
2012-05-09 14:47:28 +02:00
Filippo Valsorda
0b8c922da9 Introduced Trouble(Exception) for more elegant non-fatal errors handling 2012-05-09 09:43:11 +00:00
Filippo Valsorda
3fe294e4ef merge upstream 2012-05-01 18:22:08 +02:00
Filippo Valsorda
921a145592 dropped the support for Python 2.5
let's elaborate the decision: Python 2.5 is a 6 years old release
and "under the current release policy, no security issues in Python
2.5 will be fixed anymore" (!!); also, it doesn't support the new
zipfile distribution format.
2012-05-01 17:01:51 +02:00
Philipp Hagemeister
0c24eed73a merge #336 2012-04-19 09:46:01 +02:00
Philipp Hagemeister
29ce2c1201 Merge git://git.jankratochvil.net/youtube-dl 2012-04-19 09:44:25 +02:00
Jan Kratochvil
532c74ae86 Add format #46 - WebM 1920x1080. 2012-04-16 17:13:01 +02:00
Filippo Valsorda
9beb5af82e some HTMLParser bugfixes 2012-04-13 22:09:24 +02:00
Filippo Valsorda
9e6dd23876 merged unescapeHTML branch; removed lxml dependency 2012-04-11 00:22:51 +02:00
Filippo Valsorda
7a8501e307 ignore parsing errors in get_element_by_id() 2012-04-10 23:08:53 +02:00
Filippo Valsorda - Campagna
781cc523af removed the undocumented HTMLParser.unescape, replaced with _unescapeHTML; fixed a bug in the use of _unescapeHTML (missing _, from d6a9615347) 2012-04-10 18:54:40 +02:00
Filippo Valsorda - Campagna
c6f45d4314 removed dependency from lxml: added IDParser 2012-04-10 18:21:00 +02:00
Filippo Valsorda - Campagna
d11d05d07a better naming for the sub-modules 2012-04-10 16:46:36 +02:00
Filippo Valsorda - Campagna
e179aadfdf moved trivialjson to a separate file 2012-04-10 16:37:40 +02:00
Filippo Valsorda - Campagna
d6a9615347 standardized the use of unescapeHTML; added clean_html() 2012-04-10 16:31:46 +02:00
Filippo Valsorda - Campagna
c6306eb798 wine-py2exe.sh to create the exe under linux (!!) 2012-04-07 20:07:42 +02:00
Filippo Valsorda
bcfde70d73 py2exe -U fix for Windows XP 2012-03-31 01:27:47 +02:00
Filippo Valsorda
53e893615d corrected -U to support new zipfile and exe (#153) formats 2012-03-31 01:19:30 +02:00
Filippo Valsorda
303692b5ed 's/ /\t/' 2012-03-30 23:54:16 +02:00
Filippo Valsorda
58ca755f40 moved increment_downloads and process_info calls from IEs to FD.download (#296) (follows current doclines); a small step towards importability #217 2012-03-30 23:45:27 +02:00
Filippo Valsorda
770234afa2 Added py2exe script 2012-03-25 23:48:53 +02:00
Filippo Valsorda
d77c3dfd02 Split code as a package, compiled into an executable zip 2012-03-25 03:07:37 +02:00
Filippo Valsorda
c23d8a74dc Merge branch 'next-url' 2012-03-25 01:07:47 +01:00
Filippo Valsorda
74a5ff5f43 transplant ceba827e9a, d891ff9fd9, 69d3b2d824, 071940680f 2012-03-24 01:23:19 +01:00
Filippo Valsorda
071940680f Always extract original URL from next_url (#318) 2012-03-24 01:17:36 +01:00
Witold Baryluk
69d3b2d824 Extract original URL from next_url parameter of verify_age page, before actual extract 2012-03-23 06:17:29 +01:00
Witold Baryluk
d891ff9fd9 Ignore leading spaces (and trailing also) in all URL from url list or command line 2012-03-23 06:15:57 +01:00
Filippo Valsorda
6af22cf0ef added support for HTTP redirects. Closes #315 2012-03-18 22:15:58 +01:00
Philipp Hagemeister
fff24d5e35 Clean up superfluous whitespace 2012-03-15 20:52:35 +01:00
Philipp Hagemeister
ceba827e9a Credit Filippo Valsorda 2012-03-15 20:47:27 +01:00
Filippo Valsorda
a0432a1e80 added --srt-lang; updated README; extended the -g FAQ 2012-03-15 14:56:08 +01:00
Filippo Valsorda
cfcf32d038 Merge branch 'master' of git://github.com/rg3/youtube-dl into closed-captions 2012-03-15 14:05:34 +01:00
Philipp Hagemeister
a67bdc34fa transplant gist of 7151f63a5f 2012-03-15 08:36:31 +01:00
Philipp Hagemeister
b3a653c245 Merge commit '7151f63a5f3820a322ba8bf61eebe8d9f75d6ee5' 2012-03-15 08:26:44 +01:00
Philipp Hagemeister
4a34b7252e transplant 2934c2ce43 and afbaa80b8b 2012-03-15 08:05:21 +01:00
Philipp Hagemeister
7e45ec57a8 transplant 0f6e296a8e 2012-03-15 07:56:32 +01:00
Filippo Valsorda
afbaa80b8b switched ytsearch to more robust Youtube Data API (fixes #307) 2012-03-14 22:44:45 +01:00
Filippo Valsorda
115d243428 added youtube closed captions .srt support (see #90) 2012-03-13 23:49:33 +01:00
cryzed
7151f63a5f Fixed downloading of unrelated videos when downloading a YouTube playlist 2012-03-09 22:05:35 +01:00
Filippo Valsorda
597e7b1805 Vimeo: Added support for flv only videos 2012-03-07 21:02:12 +01:00
Filippo Valsorda
2934c2ce43 Switch Vimeo to scraping: fixes #285 2012-03-05 17:51:16 +01:00
Filippo Valsorda
0f6e296a8e Fixed gvsearch 2012-03-02 00:35:56 +01:00
Philipp Hagemeister
9c228928b6 release 2012.02.27 2012-02-27 20:19:28 +01:00
Philipp Hagemeister
ff3a2b8eab Always determine youtube description 2012-02-27 20:19:03 +01:00
Philipp Hagemeister
c4105fa035 release 2012.02.26 2012-02-27 00:42:26 +01:00
Philipp Hagemeister
871dbd3c92 Output RTMP command line if verbose is set 2012-02-27 00:41:53 +01:00
Philipp Hagemeister
c9ed14e6d6 Move imports to top (Closes #283) 2012-02-26 23:53:56 +01:00
Philipp Hagemeister
1ad85e5061 Set default continue behavior to true, no breakage observed in the wild 2012-02-26 23:44:32 +01:00
Philipp Hagemeister
09fbc6c952 verbose flag, and output proxies if it is set 2012-02-26 23:33:19 +01:00
Philipp Hagemeister
895ec266bb Merge pull request #292 from rbrito/fixes/vimeo-ie
VimeoIE: Allow matches taken from embedded videos.
2012-02-26 14:25:12 -08:00
Rogério Brito
d85448f3bb VimeoIE: Allow matches taken from embedded videos.
With this change, I can directly cut and paste URLs embedded in 3rd-party
pages as `youtube-dl`'s arguments.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2012-02-24 07:12:21 -02:00
Philipp Hagemeister
99d46e8c27 Merge pull request #275 from grawity/winnt-unicode
Support Unicode in file names on Windows NT
2012-01-16 03:23:22 -08:00
Mantas Mikulėnas
4afdff39d7 Support Unicode in file names on Windows NT 2012-01-16 12:08:01 +02:00
Philipp Hagemeister
661a807c65 Release 2012.01.08b 2012-01-08 17:23:10 +01:00
Philipp Hagemeister
6d58c4546e correct to_screen prints 2012-01-08 17:22:48 +01:00
Philipp Hagemeister
38ffbc0222 Release 2012.01.08 2012-01-08 17:20:55 +01:00
Philipp Hagemeister
fefb166c52 Leave out characters the filesystem cannot encode (Closes: #264) 2012-01-08 17:20:18 +01:00
Philipp Hagemeister
dcb3c22e0b MTV IE 2012-01-07 01:30:30 +01:00
Philipp Hagemeister
47a53c9e46 release 2012.01.05 2012-01-05 11:08:50 +01:00
Philipp Hagemeister
1413cd87eb Correct distinction between unicode and bytes (Closes: #257) 2012-01-05 10:46:21 +01:00
Philipp Hagemeister
c92e184f75 Correct comedycentral flash URL regex 2012-01-05 00:39:47 +01:00
Philipp Hagemeister
3906e6ce60 correct epydoc 2012-01-05 00:36:47 +01:00
Philipp Hagemeister
c7d3c3db0d Fix tds RTMP url extraction 2012-01-04 14:08:17 +01:00
Philipp Hagemeister
d6639d05c2 release 2011.12.18 2011-12-17 01:35:05 +01:00
Philipp Hagemeister
633cf7cbad Add wav audio output 2011-12-17 01:32:28 +01:00
Philipp Hagemeister
a5647b79ce Only skip download if files exists; convert audio 2011-12-16 23:33:46 +01:00
Philipp Hagemeister
ba5059dd66 Release 2011.12.15 2011-12-15 20:32:37 +01:00
Philipp Hagemeister
bb8abbbbae Dailymotion: Use og:title instead of <title> to find title (Closes: #253) 2011-12-15 20:32:05 +01:00
Philipp Hagemeister
561504fffa Release 2011.12.08 2011-12-08 21:39:13 +01:00
Philipp Hagemeister
23e6b8adc8 --prefer-free-formats (Closes #231) 2011-12-08 21:38:28 +01:00
Philipp Hagemeister
3e0ea7d07a m4a: aac in mp4 container (Closes #240) 2011-12-08 21:21:25 +01:00
Philipp Hagemeister
94fd3201b2 Abort when --max-downloads is reached. 2011-12-08 20:59:02 +01:00
Philipp Hagemeister
0b3f3e1ad9 Merge pull request #245 from rbrito/fix-makefile
Makefile: Don't use `echo`'s `-e` option for portability.
2011-12-08 11:39:56 -08:00
Philipp Hagemeister
a05d2a0c05 Merge branch 'master' of github.com:rg3/youtube-dl 2011-12-08 20:39:22 +01:00
Philipp Hagemeister
0b14e0b367 OpenClassRoom IE (Closes: #234) 2011-12-08 20:36:00 +01:00
Rogério Brito
66e8777769 Makefile: Don't use echo's -e option for portability.
Many systems (including Debian, Ubuntu and derivatives like Linux Mint) use
Dash as a noninteractive version of `/bin/sh`, invoked by `make`.

Dash's `echo` command doesn't understand the `-e` option and this generates
spurious output when running `make`.  See [a bugreport][0] for one of the
many instances of this bug/feature in action.

[0]: https://bugs.launchpad.net/ubuntu/+source/dash/+bug/72167
2011-12-08 13:18:29 -02:00
Philipp Hagemeister
348486ced4 Merge pull request #238 from rbrito/add-to-gitignore
Add list of files to ignore for `youtube-dl`.
2011-11-30 11:45:17 -08:00
Rogério Brito
f1f300e629 Add list of files to ignore for youtube-dl. 2011-11-30 14:17:20 -02:00
Philipp Hagemeister
dd17922afc OpenClassRoom videos (#234) 2011-11-30 10:52:04 +01:00
Philipp Hagemeister
40fd4cb86a Move merged code to dev version 2011-11-30 10:00:36 +01:00
Philipp Hagemeister
9e9b75ae4d Merge pull request #236 from lra/dailymotion-fix
Fix the DailymotionIE to parse the new title of a webpage
2011-11-30 00:57:09 -08:00
Laurent Raufaste
8abf76ddb9 Fix the DailymotionIE to parse the new title of a webpage 2011-11-29 22:30:42 -05:00
Philipp Hagemeister
c95da745bc Mention -o - in doc (Closes #204) 2011-11-29 20:22:27 +01:00
Philipp Hagemeister
0cd235eef6 Use freedesktop.org mandated user config file location (Suggested by Tyll in #231) 2011-11-29 20:13:13 +01:00
Philipp Hagemeister
77315556f1 Do not count unmatched or skipped videos towards max-downloads (Closes #232) 2011-11-29 20:08:01 +01:00
Philipp Hagemeister
c379c181e0 Preliminary implementation of configuration files 2011-11-28 01:29:46 +01:00
Philipp Hagemeister
31a2ec2d88 Document -o %(upload_date)s (Closes #228) 2011-11-28 01:00:01 +01:00
Philipp Hagemeister
b88a52504e --max-downloads option (Closes #230) 2011-11-28 00:55:44 +01:00
Philipp Hagemeister
a95567af99 Credit shizeeg for Mixcloud IE 2011-11-24 18:58:19 +01:00
Philipp Hagemeister
849edab8ec Move MixcloudIE to __init__.py 2011-11-24 18:02:12 +01:00
sh!zeeg
b158a1d946 Mixcloud IE 2011-11-24 20:45:14 +04:00
Philipp Hagemeister
fa2672f9fc Release 2011.11.23 2011-11-23 10:36:52 +01:00
Philipp Hagemeister
28e3614bc0 Fix vimeo error (Closes #224) 2011-11-23 10:35:55 +01:00
Philipp Hagemeister
208e095f72 Correct simplify_title call in ComedyCentral IE 2011-11-22 21:26:38 +01:00
Philipp Hagemeister
0ae7abe57c Release 2011.11.22 2011-11-22 15:32:53 +01:00
Philipp Hagemeister
dc0a294a73 Make exception handling 2.5-compatible (Closes #223) 2011-11-22 15:31:30 +01:00
Philipp Hagemeister
468c99257c Release 2011.11.21 2011-11-21 21:51:24 +01:00
Philipp Hagemeister
af8e8d63f9 Allow non-ASCII characters in simplified titles(Closes #220) 2011-11-21 21:50:39 +01:00
Philipp Hagemeister
e092418d8b Simplify simplify_title 2011-11-21 20:17:07 +01:00
Philipp Hagemeister
e33e3045c6 First tests 2011-11-21 20:07:24 +01:00
Philipp Hagemeister
cb6568bf21 Use the dev version in Makefile 2011-11-21 20:00:54 +01:00
Philipp Hagemeister
235b3ba479 Move code into a separate Python module 2011-11-21 19:59:59 +01:00
Philipp Hagemeister
5b3330e0cf Remove empty real_initialize defs 2011-11-21 19:31:20 +01:00
Philipp Hagemeister
aab771fbdf Credit authors of Soundclound and InfoQ extractors 2011-11-16 09:33:03 +01:00
Philipp Hagemeister
00f95a93f5 InfoQ IE (Closes #216) 2011-11-15 23:00:31 +01:00
Philipp Hagemeister
1724e7c461 Merge remote-tracking branch 'ngokevin/soundcloud' 2011-11-15 22:37:49 +01:00
Ori Avtalion
3b98a5ddac InfoQ IE 2011-11-15 23:20:29 +02:00
Philipp Hagemeister
8b59cc93d5 Merge pull request #211 from techtonik/patch-1
Fix duplicated downloads from YouTube user page where watch URLs are not. Thanks to anatoly techtonik.
2011-11-15 01:39:17 -08:00
Philipp Hagemeister
c3e4e7c182 Fix youtube playlist IE match (Closes: #210) 2011-11-15 10:35:59 +01:00
Kevin Ngo
38348005b3 removed weird indent 2011-11-12 17:28:26 -08:00
Kevin Ngo
208c4b9128 added whitespace below soundcloudIE class 2011-11-12 17:10:21 -08:00
Kevin Ngo
ec574c2c41 extracts full title from source 2011-11-12 17:08:40 -08:00
Kevin Ngo
871be928a8 now downloads soundcloud songs, need to polish title grabbing and file naming 2011-11-12 16:48:43 -08:00
Kevin Ngo
b20d4f8626 changed spaces to tabs (by yt-dl standards), fixed bugs, but still won't download. need to figure out how the whole process works to integrate correctly 2011-11-10 01:04:33 -08:00
Kevin Ngo
073d7a5985 extracted all of the soundcloud information including description (not tested), need to hook into filedownloader 2011-11-09 01:52:36 -08:00
Kevin Ngo
40306424b1 work on soundcloud information extractor...need to talk to youtube-dl guys 2011-11-08 00:03:35 -08:00
Kevin Ngo
ecb3bfe543 going home and need to upload what little i did 2011-11-07 18:02:10 -08:00
anatoly techtonik
abeac45abe Fix duplicated downloads from YouTube user page where watch URLs are not always end with &. Stop scan on closing bracker prevents regexp to capture two links instead of one. 2011-11-06 16:42:43 +03:00
Philipp Hagemeister
0fca93ac60 Merge pull request #206 from rbrito/fixes/facebook-ie-2
FacebookIE: Fix regexp to recognize videos that weren't considered.
2011-11-02 10:56:18 -07:00
Rogério Brito
857e5f329a FacebookIE: Fix regexp to recognize videos that weren't considered. 2011-11-01 12:07:05 -02:00
Rogério Brito
053419cd24 FacebookIE: The date doesn't seem to be available anymore.
The current regular expression is likely to match a lot of stuff, as each
comment that a video has has one of those and it is not clear which one is
the date of the video *upload* itself.
2011-10-20 20:28:34 -02:00
Rogério Brito
99e207bab0 FacebookIE: Fix extraction of title as Facebook has changed stuff. 2011-10-20 20:27:48 -02:00
Rogério Brito
0067bbe7a7 FacebookIE: Not all videos are available in all formats. 2011-10-20 20:26:42 -02:00
Philipp Hagemeister
45aa690868 Release 2011.10.19 2011-10-19 00:40:13 +02:00
Philipp Hagemeister
beb245e92f Merge branch 'vimeo' of https://github.com/rbrito/youtube-dl 2011-10-19 00:38:41 +02:00
Rogério Brito
c424df0d2f vimeo: Add the ability to detect if a video is available in HD. (Closes: #194) 2011-10-19 00:37:45 +02:00
Rogério Brito
87929e4b35 vimeo: Add the ability to detect if a video is available in HD. 2011-10-18 19:47:19 -02:00
Philipp Hagemeister
d76736fc5e Merge pull request #195 from rbrito/xvideos
Fixes for the xvideos IE
2011-10-18 13:49:21 -07:00
Rogério Brito
0f9b77223e xvideos: Capture only the video title, not the name of the site. 2011-10-18 18:42:01 -02:00
Rogério Brito
9f47175a40 xvideos: Fix misleading error message when extracting the URL.
We said that we were trying to extract the title of the video.
2011-10-18 18:41:02 -02:00
Rogério Brito
a1a8713aad xvideos: Normalize the URL or it will fail with some inputs.
For instance, if we give it <www.xvideos.com/video1384059>, we would
end up passing that to urllib2, which would complain about an unknown
URL type.
2011-10-18 18:38:17 -02:00
Rogério Brito
6501a06d46 Quick and dirty IE for xvideos.com. 2011-10-13 16:44:20 -03:00
Philipp Hagemeister
8d89fbae5a CollegeHumor IE 2011-10-12 21:13:43 +02:00
Philipp Hagemeister
7a2cf5455c Fix recognition of http://www.youtube.com/course?list=PL41FDABC6AA085E78&category=University/Mathematics/Topology%20%26%20Geometry 2011-10-04 03:25:20 +02:00
Rogério Brito
7125a7ca8b Support "newstyle" Youtube playlist IDs.
Many playlists reported by Youtube now have the form like in:

    http://www.youtube.com/playlist?list=PL520044A3524F5E5D

where `PL` is prefixed to what youtube-dl used to use as playlist IDs. So,
while matching it, we adapt the regular expression so as to discard the `PL`
and only get the `520044A3524F5E5D` in the case of the playlist of the
example.
2011-10-03 21:41:33 -03:00
Philipp Hagemeister
54d47874f7 release 2011.09.30 2011-09-30 09:07:59 +02:00
Philipp Hagemeister
2761012f69 Cosmetic changes to --list-formats 2011-09-30 09:07:36 +02:00
Francois du Toit
3de2a1e635 Added option -L to list available formats 2011-09-28 01:28:37 +02:00
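
Commit 7125a7ca8b above describes adapting a regular expression so that the `PL` prefix of "newstyle" playlist IDs is discarded. A minimal sketch of that idea follows; the pattern and the `playlist_id` helper are hypothetical illustrations, not the expression youtube-dl actually uses, and the sketch assumes that bare IDs never begin with `PL` themselves.

```python
import re

# Hypothetical pattern for illustration only; not taken from youtube-dl.
# The optional non-capturing group swallows a leading "PL" so that only
# the remaining ID ends up in group 1.
_PLAYLIST_ID = re.compile(r"[?&]list=(?:PL)?([0-9A-Za-z]+)")


def playlist_id(url):
    """Return the playlist ID without a leading `PL`, or None if absent."""
    match = _PLAYLIST_ID.search(url)
    return match.group(1) if match else None
```

With the URL from the commit message, `playlist_id("http://www.youtube.com/playlist?list=PL520044A3524F5E5D")` returns `"520044A3524F5E5D"`, and the same URL without the `PL` prefix yields the same ID.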
22 changed files with 5627 additions and 34 deletions

.gitignore (new file)

@@ -0,0 +1,5 @@
+*.pyc
+*.pyo
+*~
+wine-py2exe/
+py2exe.log

@@ -1 +1 @@
-2011.09.27
+2012.10.09


@@ -1,20 +1,47 @@
-default: update
+all: youtube-dl README.md youtube-dl.1 youtube-dl.bash-completion LATEST_VERSION
+# TODO: re-add youtube-dl.exe, and make sure it's 1. safe and 2. doesn't need sudo
-update: update-readme update-latest
+clean:
+	rm -f youtube-dl youtube-dl.exe youtube-dl.1 LATEST_VERSION
-update-latest:
-	./youtube-dl --version > LATEST_VERSION
+PREFIX=/usr/local
+install: youtube-dl youtube-dl.1 youtube-dl.bash-completion
+	install -m 755 --owner root --group root youtube-dl $(PREFIX)/bin/
+	install -m 644 --owner root --group root youtube-dl.1 $(PREFIX)/man/man1
+	install -m 644 --owner root --group root youtube-dl.bash-completion /etc/bash_completion.d/youtube-dl
-update-readme:
-	@options=$$(COLUMNS=80 ./youtube-dl --help | sed -e '1,/.*General Options.*/ d' -e 's/^\W\{2\}\(\w\)/### \1/') && \
-		header=$$(sed -e '/.*## OPTIONS/,$$ d' README.md) && \
-		footer=$$(sed -e '1,/.*## FAQ/ d' README.md) && \
+.PHONY: all clean install README.md youtube-dl.bash-completion
+# TODO un-phony README.md and youtube-dl.bash_completion by reading from .in files and generating from them
+youtube-dl: youtube_dl/*.py
+	zip --quiet --junk-paths youtube-dl youtube_dl/*.py
+	echo '#!/usr/bin/env python' > youtube-dl
+	cat youtube-dl.zip >> youtube-dl
+	rm youtube-dl.zip
+	chmod a+x youtube-dl
+youtube-dl.exe: youtube_dl/*.py
+	bash devscripts/wine-py2exe.sh build_exe.py
+README.md: youtube_dl/*.py
+	@options=$$(COLUMNS=80 python -m youtube_dl --help | sed -e '1,/.*General Options.*/ d' -e 's/^\W\{2\}\(\w\)/## \1/') && \
+		header=$$(sed -e '/.*# OPTIONS/,$$ d' README.md) && \
+		footer=$$(sed -e '1,/.*# FAQ/ d' README.md) && \
 		echo "$${header}" > README.md && \
-		echo -e '\n## OPTIONS' >> README.md && \
+		echo >> README.md && \
+		echo '# OPTIONS' >> README.md && \
 		echo "$${options}" >> README.md&& \
-		echo -e '\n## FAQ' >> README.md && \
+		echo >> README.md && \
+		echo '# FAQ' >> README.md && \
 		echo "$${footer}" >> README.md
+youtube-dl.1: README.md
+	pandoc -s -w man README.md -o youtube-dl.1
+youtube-dl.bash-completion: README.md
+	@options=`egrep -o '(--[a-z-]+) ' README.md | sort -u | xargs echo` && \
+		content=`sed "s/opts=\"[^\"]*\"/opts=\"$${options}\"/g" youtube-dl.bash-completion` && \
+		echo "$${content}" > youtube-dl.bash-completion
-.PHONY: default update update-latest update-readme
+LATEST_VERSION: youtube_dl/__init__.py
+	python -m youtube_dl --version > LATEST_VERSION
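
The new `youtube-dl` target in the Makefile diff above builds the executable by zipping the package sources, writing a shebang line, appending the archive, and marking the result executable. The sketch below reproduces those four steps in Python to show why the result is runnable: Python can execute a zip archive that contains a `__main__.py` at its root, even when other bytes precede the archive. The helper name `build_executable_zip` and its parameters are hypothetical and are not part of youtube-dl.

```python
import io
import os
import stat
import zipfile


def build_executable_zip(sources, target, interpreter="/usr/bin/env python"):
    """Write `target` as a shebang line followed by a zip of `sources`.

    `sources` maps archive member names to their text content. It should
    include `__main__.py`, which Python runs when given a zip archive.
    (Hypothetical helper, mirroring the Makefile recipe step by step.)
    """
    # Step 1: zip the sources (the Makefile uses `zip --quiet --junk-paths`).
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in sources.items():
            archive.writestr(name, text)

    # Steps 2 and 3: write the shebang, then append the archive bytes
    # (the Makefile uses `echo ... >` followed by `cat ... >>`).
    with open(target, "wb") as out:
        out.write(("#!" + interpreter + "\n").encode("ascii"))
        out.write(buffer.getvalue())

    # Step 4: the equivalent of `chmod a+x`.
    mode = os.stat(target).st_mode
    os.chmod(target, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Running the resulting file with an interpreter (for example `python ./target`) executes the `__main__.py` stored inside it; on Unix-like systems the shebang line lets the file be started directly.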


@@ -1,43 +1,51 @@
# youtube-dl % youtube-dl(1)
## USAGE # NAME
youtube-dl [options] url [url...] youtube-dl
## DESCRIPTION # SYNOPSIS
**youtube-dl** [OPTIONS] URL [URL...]
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from **youtube-dl** is a small command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version YouTube.com and a few more sites. It requires the Python interpreter, version
2.x (x being at least 5), and it is not platform specific. It should work in 2.x (x being at least 6), and it is not platform specific. It should work in
your Unix box, in Windows or in Mac OS X. It is released to the public domain, your Unix box, in Windows or in Mac OS X. It is released to the public domain,
which means you can modify it, redistribute it or use it however you like. which means you can modify it, redistribute it or use it however you like.
## OPTIONS # OPTIONS
-h, --help print this help text and exit -h, --help print this help text and exit
-v, --version print program version and exit --version print program version and exit
-U, --update update this program to latest version -U, --update update this program to latest version
-i, --ignore-errors continue on download errors -i, --ignore-errors continue on download errors
-r, --rate-limit LIMIT download rate limit (e.g. 50k or 44.6m) -r, --rate-limit LIMIT download rate limit (e.g. 50k or 44.6m)
-R, --retries RETRIES number of retries (default is 10) -R, --retries RETRIES number of retries (default is 10)
--dump-user-agent display the current browser identification --dump-user-agent display the current browser identification
--user-agent UA specify a custom user agent
--list-extractors List all supported extractors and the URLs they --list-extractors List all supported extractors and the URLs they
would handle would handle
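The `--rate-limit` values shown above (`50k`, `44.6m`) are byte quantities with an optional binary suffix. The sketch below is a Python 3 port of the `parse_bytes` helper that appears in the `FileDownloader` code of this diff. The port is an assumption made for illustration (`long` is replaced by `int`); it is not the shipped code.

```python
import re


def parse_bytes(bytestr):
    """Parse a string such as '50k' or '44.6m' into a number of bytes.

    Returns None when the string does not look like a byte quantity.
    """
    matchobj = re.match(r'(?i)^(\d+(?:\.\d+)?)([kMGTPEZY]?)$', bytestr)
    if matchobj is None:
        return None
    number = float(matchobj.group(1))
    # An empty suffix gives index 0, i.e. a multiplier of 1024 ** 0 == 1.
    multiplier = 1024.0 ** 'bkmgtpezy'.index(matchobj.group(2).lower())
    return int(round(number * multiplier))


print(parse_bytes('50k'))    # 51200
print(parse_bytes('10'))     # 10
print(parse_bytes('fast'))   # None
```

Suffixes are powers of 1024, so `50k` means 51200 bytes per second rather than 50000.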
### Video Selection: ## Video Selection:
--playlist-start NUMBER playlist video to start at (default is 1) --playlist-start NUMBER playlist video to start at (default is 1)
--playlist-end NUMBER playlist video to end at (default is last) --playlist-end NUMBER playlist video to end at (default is last)
--match-title REGEX download only matching titles (regex or caseless --match-title REGEX download only matching titles (regex or caseless
sub-string) sub-string)
--reject-title REGEX skip download for matching titles (regex or --reject-title REGEX skip download for matching titles (regex or
caseless sub-string) caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
### Filesystem Options: ## Filesystem Options:
-t, --title use title in file name -t, --title use title in file name
-l, --literal use literal title in file name -l, --literal use literal title in file name
-A, --auto-number number downloaded files starting from 00000 -A, --auto-number number downloaded files starting from 00000
-o, --output TEMPLATE output filename template. Use %(stitle)s to get the -o, --output TEMPLATE output filename template. Use %(stitle)s to get the
title, %(uploader)s for the uploader name, title, %(uploader)s for the uploader name,
%(autonumber)s to get an automatically incremented %(autonumber)s to get an automatically incremented
number, %(ext)s for the filename extension, and %% number, %(ext)s for the filename extension,
for a literal percent %(upload_date)s for the upload date (YYYYMMDD),
%(extractor)s for the provider (youtube, metacafe,
etc), %(id)s for the video id and %% for a literal
percent. Use - to output to stdout.
-a, --batch-file FILE file containing URLs to download ('-' for stdin) -a, --batch-file FILE file containing URLs to download ('-' for stdin)
-w, --no-overwrites do not overwrite files -w, --no-overwrites do not overwrite files
-c, --continue resume partially downloaded files -c, --continue resume partially downloaded files
@@ -50,7 +58,7 @@ which means you can modify it, redistribute it or use it however you like.
--write-description write video description to a .description file --write-description write video description to a .description file
--write-info-json write video metadata to a .info.json file --write-info-json write video metadata to a .info.json file
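The `-o`/`--output` template described above uses Python's `%(name)s` mapping-key syntax, with `%%` for a literal percent sign. A minimal Python 3 sketch of how such a template expands is shown below. The `info` dictionary and its values are made up for this example, and `is_fixed_template` mirrors the `fixed_template` check from the `FileDownloader` code in this diff rather than reproducing it.

```python
import re


def is_fixed_template(template):
    """Return True when the template contains no %(field)s placeholders."""
    return re.search(r'%\(.+?\)s', template) is None


# Hypothetical metadata for one video; field names follow the README.
info = {
    'stitle': 'My_Video',
    'uploader': 'alice',
    'id': 'abc123',
    'ext': 'mp4',
    'upload_date': '20121009',
    'extractor': 'youtube',
}

template = '%(uploader)s-%(stitle)s-%(id)s.%(ext)s'
filename = template % info
print(filename)                    # alice-My_Video-abc123.mp4
print('100%% %(id)s' % info)       # 100% abc123
print(is_fixed_template('video.mp4'))  # True
```

A fixed template names a single output file, so it only makes sense when one video is downloaded.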
### Verbosity / Simulation Options: ## Verbosity / Simulation Options:
-q, --quiet activates quiet mode -q, --quiet activates quiet mode
-s, --simulate do not download the video and do not write anything -s, --simulate do not download the video and do not write anything
to disk to disk
@@ -63,26 +71,37 @@ which means you can modify it, redistribute it or use it however you like.
--get-format simulate, quiet but print output format --get-format simulate, quiet but print output format
--no-progress do not print progress bar --no-progress do not print progress bar
--console-title display progress in console titlebar --console-title display progress in console titlebar
-v, --verbose print various debugging information
### Video Format Options: ## Video Format Options:
-f, --format FORMAT video format code -f, --format FORMAT video format code
--all-formats download all available video formats --all-formats download all available video formats
--prefer-free-formats prefer free video formats unless a specific one is
requested
--max-quality FORMAT highest quality format to download --max-quality FORMAT highest quality format to download
-F, --list-formats list all available formats (currently youtube only)
--write-srt write video closed captions to a .srt file
(currently youtube only)
--srt-lang LANG language of the closed captions to download
(optional) use IETF language tags like 'en'
### Authentication Options: ## Authentication Options:
-u, --username USERNAME account username -u, --username USERNAME account username
-p, --password PASSWORD account password -p, --password PASSWORD account password
-n, --netrc use .netrc authentication data -n, --netrc use .netrc authentication data
### Post-processing Options: ## Post-processing Options:
--extract-audio convert video files to audio-only files (requires --extract-audio convert video files to audio-only files (requires
ffmpeg and ffprobe) ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT "best", "aac", "vorbis" or "mp3"; best by default --audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a", or "wav";
--audio-quality QUALITY ffmpeg audio bitrate specification, 128k by default best by default
--audio-quality QUALITY ffmpeg/avconv audio quality specification, insert a
value between 0 (better) and 9 (worse) for VBR or a
specific bitrate like 128K (default 5)
-k, --keep-video keeps the video file on disk after the post- -k, --keep-video keeps the video file on disk after the post-
processing; the video is erased by default processing; the video is erased by default
## FAQ # FAQ
### Can you please put the -b option back? ### Can you please put the -b option back?
@@ -98,19 +117,48 @@ Once the video is fully downloaded, use any video player, such as [vlc](http://w
### The links provided by youtube-dl -g are not working anymore ### The links provided by youtube-dl -g are not working anymore
The URLs youtube-dl outputs require the downloader to have the correct cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. The URLs youtube-dl outputs require the downloader to have the correct cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl.
### ERROR: no fmt_url_map or conn information found in video info ### ERROR: no fmt_url_map or conn information found in video info
youtube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`. youtube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
## COPYRIGHT ### ERROR: unable to download video ###
youtube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
### SyntaxError: Non-ASCII character ###
The error
File "youtube-dl", line 2
SyntaxError: Non-ASCII character '\x93' ...
means you're using an outdated version of Python. Please update to Python 2.6 or 2.7.
To run youtube-dl under Python 2.5, you'll have to manually check it out like this:
git clone git://github.com/rg3/youtube-dl.git
cd youtube-dl
python -m youtube_dl --help
Please note that Python 2.5 is not supported anymore.
### What is this binary file? Where has the code gone?
Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repo to see the code. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make compile`.
### The exe throws a *Runtime error from Visual C++*
To run the exe you need to install first the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).
# COPYRIGHT
youtube-dl is released into the public domain by the copyright holders. youtube-dl is released into the public domain by the copyright holders.
This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain. This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
## BUGS # BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>

48
build_exe.py Normal file

@@ -0,0 +1,48 @@
from distutils.core import setup
import py2exe
import sys, os

"""This will create an exe that needs Microsoft Visual C++ 2008 Redistributable Package"""

# If run without args, build executables
if len(sys.argv) == 1:
    sys.argv.append("py2exe")

# os.chdir(os.path.dirname(os.path.abspath(sys.argv[0]))) # conflict with wine-py2exe.sh
sys.path.append('./youtube_dl')

options = {
    "bundle_files": 1,
    "compressed": 1,
    "optimize": 2,
    "dist_dir": '.',
    "dll_excludes": ['w9xpopen.exe']
}

console = [{
    "script":"./youtube_dl/__main__.py",
    "dest_base": "youtube-dl",
}]

init_file = open('./youtube_dl/__init__.py')
for line in init_file.readlines():
    if line.startswith('__version__'):
        version = line[11:].strip(" ='\n")
        break
else:
    version = ''

setup(name='youtube-dl',
      version=version,
      description='Small command-line program to download videos from YouTube.com and other video sites',
      url='https://github.com/rg3/youtube-dl',
      packages=['youtube_dl'],
      console = console,
      options = {"py2exe": options},
      zipfile = None,
)

import shutil
shutil.rmtree("build")
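The version lookup in `build_exe.py` relies on `line[11:]`, where 11 is the length of the string `__version__`, followed by stripping spaces, `=`, quotes and the newline. The sketch below isolates that step in Python 3. The helper name `read_version` and the sample lines are assumptions made for illustration.

```python
def read_version(lines):
    """Return the value assigned to __version__, or '' if none is found."""
    prefix = '__version__'
    for line in lines:
        if line.startswith(prefix):
            # Drop the name, then strip the " = '...'" decoration.
            return line[len(prefix):].strip(" ='\n")
    return ''


# Hypothetical contents of youtube_dl/__init__.py.
sample_lines = [
    "#!/usr/bin/env python\n",
    "__version__ = '2012.10.09'\n",
]

print(read_version(sample_lines))  # 2012.10.09
print(repr(read_version([])))      # ''
```

Returning early from the loop plays the same role as the `for ... else` construct in the script: the fallback is only used when no line matched.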

Binary file not shown.

6
devscripts/posix-locale.sh Executable file

@@ -0,0 +1,6 @@
# source this file in your shell to get a POSIX locale (which will break many programs, but that's kind of the point)
export LC_ALL=POSIX
export LANG=POSIX
export LANGUAGE=POSIX

11
devscripts/release.sh Executable file

@@ -0,0 +1,11 @@
#!/bin/sh
if [ -z "$1" ]; then echo "ERROR: specify version number like this: $0 1994.09.06"; exit 1; fi
version="$1"
if [ ! -z "`git tag | grep "$version"`" ]; then echo 'ERROR: version already present'; exit 1; fi
if [ ! -z "`git status --porcelain`" ]; then echo 'ERROR: the working directory is not clean; commit or stash changes'; exit 1; fi
sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/__init__.py
make all
git add -A
git commit -m "release $version"
git tag -m "Release $version" "$version"
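One detail of `release.sh` worth noting: `git tag | grep "$version"` treats the version as a regular expression and matches it anywhere in a tag name, so a new version that is a prefix of an existing tag would be reported as already present. The Python 3 sketch below contrasts that behaviour with an exact comparison. Both helper names are hypothetical, and the regex comparison is only claimed to mirror `grep` for version strings made of digits and dots.

```python
import re


def tag_matches_like_grep(tags, version):
    """Mimic `git tag | grep "$version"`: regex search within each tag."""
    return any(re.search(version, tag) for tag in tags)


def tag_exists_exactly(tags, version):
    """Stricter check: the version must equal an existing tag."""
    return version in tags


# Hypothetical tag list for this sketch.
tags = ['2012.09.27', '2012.10.09']

print(tag_matches_like_grep(tags, '2012.10.0'))  # True (substring match)
print(tag_exists_exactly(tags, '2012.10.0'))     # False
print(tag_exists_exactly(tags, '2012.10.09'))    # True
```

In the shell script itself, a whole-line fixed-string match such as `grep -Fx` would be one way to get the stricter behaviour, if that is desired.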

56
devscripts/wine-py2exe.sh Executable file

@@ -0,0 +1,56 @@
#!/bin/bash
# Run with as parameter a setup.py that works in the current directory
# e.g. no os.chdir()
# It will run twice, the first time will crash
set -e
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
if [ ! -d wine-py2exe ]; then
    sudo apt-get install wine1.3 axel bsdiff
    mkdir wine-py2exe
    cd wine-py2exe
    export WINEPREFIX=`pwd`
    axel -a "http://www.python.org/ftp/python/2.7/python-2.7.msi"
    axel -a "http://downloads.sourceforge.net/project/py2exe/py2exe/0.6.9/py2exe-0.6.9.win32-py2.7.exe"
    #axel -a "http://winetricks.org/winetricks"
    # http://appdb.winehq.org/objectManager.php?sClass=version&iId=21957
    echo "Follow python setup on screen"
    wine msiexec /i python-2.7.msi
    echo "Follow py2exe setup on screen"
    wine py2exe-0.6.9.win32-py2.7.exe
    #echo "Follow Microsoft Visual C++ 2008 Redistributable Package setup on screen"
    #bash winetricks vcrun2008
    rm py2exe-0.6.9.win32-py2.7.exe
    rm python-2.7.msi
    #rm winetricks
    # http://bugs.winehq.org/show_bug.cgi?id=3591
    mv drive_c/Python27/Lib/site-packages/py2exe/run.exe drive_c/Python27/Lib/site-packages/py2exe/run.exe.backup
    bspatch drive_c/Python27/Lib/site-packages/py2exe/run.exe.backup drive_c/Python27/Lib/site-packages/py2exe/run.exe "$SCRIPT_DIR/SizeOfImage.patch"
    mv drive_c/Python27/Lib/site-packages/py2exe/run_w.exe drive_c/Python27/Lib/site-packages/py2exe/run_w.exe.backup
    bspatch drive_c/Python27/Lib/site-packages/py2exe/run_w.exe.backup drive_c/Python27/Lib/site-packages/py2exe/run_w.exe "$SCRIPT_DIR/SizeOfImage_w.patch"
    cd -
else
    export WINEPREFIX="$( cd wine-py2exe && pwd )"
fi
wine "C:\\Python27\\python.exe" "$1" py2exe > "py2exe.log" 2>&1 || true
echo '# Copying python27.dll' >> "py2exe.log"
cp "$WINEPREFIX/drive_c/windows/system32/python27.dll" build/bdist.win32/winexe/bundle-2.7/
wine "C:\\Python27\\python.exe" "$1" py2exe >> "py2exe.log" 2>&1

29
test/test_div.py Normal file

@@ -0,0 +1,29 @@
# -*- coding: utf-8 -*-

# Various small unit tests

import os,sys
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

import youtube_dl

def test_simplify_title():
    assert youtube_dl._simplify_title(u'abc') == u'abc'
    assert youtube_dl._simplify_title(u'abc_d-e') == u'abc_d-e'

    assert youtube_dl._simplify_title(u'123') == u'123'

    assert u'/' not in youtube_dl._simplify_title(u'abc/de')
    assert u'abc' in youtube_dl._simplify_title(u'abc/de')
    assert u'de' in youtube_dl._simplify_title(u'abc/de')
    assert u'/' not in youtube_dl._simplify_title(u'abc/de///')

    assert u'\\' not in youtube_dl._simplify_title(u'abc\\de')
    assert u'abc' in youtube_dl._simplify_title(u'abc\\de')
    assert u'de' in youtube_dl._simplify_title(u'abc\\de')

    assert youtube_dl._simplify_title(u'ä') == u'ä'
    assert youtube_dl._simplify_title(u'кириллица') == u'кириллица'

    # Strip underlines
    assert youtube_dl._simplify_title(u'\'a_') == u'a'
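The assertions above pin down the expected behaviour of `_simplify_title` without showing it. As a rough illustration only, the Python 3 function below is one hypothetical implementation that satisfies them; the name `simplify_title` and the regex are assumptions, and the real function in `youtube_dl` may differ.

```python
import re


def simplify_title(title):
    """Collapse unsafe characters into underscores (illustrative only).

    Runs of characters that are neither word characters (Unicode-aware
    in Python 3) nor hyphens become a single underscore, and leading or
    trailing underscores are stripped.
    """
    return re.sub(r'[^\w\-]+', '_', title).strip('_')


print(simplify_title('abc/de///'))  # abc_de
print(simplify_title("'a_"))        # a
print(simplify_title('кириллица'))  # кириллица
```

Because `\w` matches Unicode letters for `str` patterns in Python 3, non-ASCII titles pass through unchanged, which is what the `ä` and Cyrillic cases check.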

BIN
test/testvideo-original.mp4 Normal file

Binary file not shown.

239
youtube-dl.1 Normal file

@@ -0,0 +1,239 @@
.TH youtube-dl 1 ""
.SH NAME
.PP
youtube-dl
.SH SYNOPSIS
.PP
\f[B]youtube-dl\f[] [OPTIONS] URL [URL...]
.SH DESCRIPTION
.PP
\f[B]youtube-dl\f[] is a small command-line program to download videos
from YouTube.com and a few more sites.
It requires the Python interpreter, version 2.x (x being at least 6),
and it is not platform specific.
It should work in your Unix box, in Windows or in Mac OS X.
It is released to the public domain, which means you can modify it,
redistribute it or use it however you like.
.SH OPTIONS
.IP
.nf
\f[C]
-h,\ --help\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ print\ this\ help\ text\ and\ exit
--version\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ print\ program\ version\ and\ exit
-U,\ --update\ \ \ \ \ \ \ \ \ \ \ \ \ update\ this\ program\ to\ latest\ version
-i,\ --ignore-errors\ \ \ \ \ \ continue\ on\ download\ errors
-r,\ --rate-limit\ LIMIT\ \ \ download\ rate\ limit\ (e.g.\ 50k\ or\ 44.6m)
-R,\ --retries\ RETRIES\ \ \ \ number\ of\ retries\ (default\ is\ 10)
--dump-user-agent\ \ \ \ \ \ \ \ display\ the\ current\ browser\ identification
--user-agent\ UA\ \ \ \ \ \ \ \ \ \ specify\ a\ custom\ user\ agent
--list-extractors\ \ \ \ \ \ \ \ List\ all\ supported\ extractors\ and\ the\ URLs\ they
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ would\ handle
\f[]
.fi
.SS Video Selection:
.IP
.nf
\f[C]
--playlist-start\ NUMBER\ \ playlist\ video\ to\ start\ at\ (default\ is\ 1)
--playlist-end\ NUMBER\ \ \ \ playlist\ video\ to\ end\ at\ (default\ is\ last)
--match-title\ REGEX\ \ \ \ \ \ download\ only\ matching\ titles\ (regex\ or\ caseless
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ sub-string)
--reject-title\ REGEX\ \ \ \ \ skip\ download\ for\ matching\ titles\ (regex\ or
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ caseless\ sub-string)
--max-downloads\ NUMBER\ \ \ Abort\ after\ downloading\ NUMBER\ files
\f[]
.fi
.SS Filesystem Options:
.IP
.nf
\f[C]
-t,\ --title\ \ \ \ \ \ \ \ \ \ \ \ \ \ use\ title\ in\ file\ name
-l,\ --literal\ \ \ \ \ \ \ \ \ \ \ \ use\ literal\ title\ in\ file\ name
-A,\ --auto-number\ \ \ \ \ \ \ \ number\ downloaded\ files\ starting\ from\ 00000
-o,\ --output\ TEMPLATE\ \ \ \ output\ filename\ template.\ Use\ %(stitle)s\ to\ get\ the
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ title,\ %(uploader)s\ for\ the\ uploader\ name,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ %(autonumber)s\ to\ get\ an\ automatically\ incremented
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ number,\ %(ext)s\ for\ the\ filename\ extension,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ %(upload_date)s\ for\ the\ upload\ date\ (YYYYMMDD),
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ %(extractor)s\ for\ the\ provider\ (youtube,\ metacafe,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ etc),\ %(id)s\ for\ the\ video\ id\ and\ %%\ for\ a\ literal
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ percent.\ Use\ -\ to\ output\ to\ stdout.
-a,\ --batch-file\ FILE\ \ \ \ file\ containing\ URLs\ to\ download\ (\[aq]-\[aq]\ for\ stdin)
-w,\ --no-overwrites\ \ \ \ \ \ do\ not\ overwrite\ files
-c,\ --continue\ \ \ \ \ \ \ \ \ \ \ resume\ partially\ downloaded\ files
--no-continue\ \ \ \ \ \ \ \ \ \ \ \ do\ not\ resume\ partially\ downloaded\ files\ (restart
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ from\ beginning)
--cookies\ FILE\ \ \ \ \ \ \ \ \ \ \ file\ to\ read\ cookies\ from\ and\ dump\ cookie\ jar\ in
--no-part\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ do\ not\ use\ .part\ files
--no-mtime\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ do\ not\ use\ the\ Last-modified\ header\ to\ set\ the\ file
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ modification\ time
--write-description\ \ \ \ \ \ write\ video\ description\ to\ a\ .description\ file
--write-info-json\ \ \ \ \ \ \ \ write\ video\ metadata\ to\ a\ .info.json\ file
\f[]
.fi
.SS Verbosity / Simulation Options:
.IP
.nf
\f[C]
-q,\ --quiet\ \ \ \ \ \ \ \ \ \ \ \ \ \ activates\ quiet\ mode
-s,\ --simulate\ \ \ \ \ \ \ \ \ \ \ do\ not\ download\ the\ video\ and\ do\ not\ write\ anything
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ to\ disk
--skip-download\ \ \ \ \ \ \ \ \ \ do\ not\ download\ the\ video
-g,\ --get-url\ \ \ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ URL
-e,\ --get-title\ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ title
--get-thumbnail\ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ thumbnail\ URL
--get-description\ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ video\ description
--get-filename\ \ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ output\ filename
--get-format\ \ \ \ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ output\ format
--no-progress\ \ \ \ \ \ \ \ \ \ \ \ do\ not\ print\ progress\ bar
--console-title\ \ \ \ \ \ \ \ \ \ display\ progress\ in\ console\ titlebar
-v,\ --verbose\ \ \ \ \ \ \ \ \ \ \ \ print\ various\ debugging\ information
\f[]
.fi
.SS Video Format Options:
.IP
.nf
\f[C]
-f,\ --format\ FORMAT\ \ \ \ \ \ video\ format\ code
--all-formats\ \ \ \ \ \ \ \ \ \ \ \ download\ all\ available\ video\ formats
--prefer-free-formats\ \ \ \ prefer\ free\ video\ formats\ unless\ a\ specific\ one\ is
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ requested
--max-quality\ FORMAT\ \ \ \ \ highest\ quality\ format\ to\ download
-F,\ --list-formats\ \ \ \ \ \ \ list\ all\ available\ formats\ (currently\ youtube\ only)
--write-srt\ \ \ \ \ \ \ \ \ \ \ \ \ \ write\ video\ closed\ captions\ to\ a\ .srt\ file
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (currently\ youtube\ only)
--srt-lang\ LANG\ \ \ \ \ \ \ \ \ \ language\ of\ the\ closed\ captions\ to\ download
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (optional)\ use\ IETF\ language\ tags\ like\ \[aq]en\[aq]
\f[]
.fi
.SS Authentication Options:
.IP
.nf
\f[C]
-u,\ --username\ USERNAME\ \ account\ username
-p,\ --password\ PASSWORD\ \ account\ password
-n,\ --netrc\ \ \ \ \ \ \ \ \ \ \ \ \ \ use\ .netrc\ authentication\ data
\f[]
.fi
.SS Post-processing Options:
.IP
.nf
\f[C]
--extract-audio\ \ \ \ \ \ \ \ \ \ convert\ video\ files\ to\ audio-only\ files\ (requires
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ ffmpeg\ or\ avconv\ and\ ffprobe\ or\ avprobe)
--audio-format\ FORMAT\ \ \ \ "best",\ "aac",\ "vorbis",\ "mp3",\ "m4a",\ or\ "wav";
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ best\ by\ default
--audio-quality\ QUALITY\ \ ffmpeg/avconv\ audio\ quality\ specification,\ insert\ a
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ value\ between\ 0\ (better)\ and\ 9\ (worse)\ for\ VBR\ or\ a
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ specific\ bitrate\ like\ 128K\ (default\ 5)
-k,\ --keep-video\ \ \ \ \ \ \ \ \ keeps\ the\ video\ file\ on\ disk\ after\ the\ post-
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ processing;\ the\ video\ is\ erased\ by\ default
\f[]
.fi
.SH FAQ
.SS Can you please put the -b option back?
.PP
Most people asking this question are not aware that youtube-dl now
defaults to downloading the highest available quality as reported by
YouTube, which will be 1080p or 720p in some cases, so you no longer
need the -b option.
For some specific videos, maybe YouTube does not report them to be
available in a specific high quality format you\[aq]re interested
in.
In that case, simply request it with the -f option and youtube-dl will
try to download it.
.SS I get HTTP error 402 when trying to download a video. What\[aq]s
this?
.PP
Apparently YouTube requires you to pass a CAPTCHA test if you download
too much.
We\[aq]re considering providing a way to let you solve the
CAPTCHA (https://github.com/rg3/youtube-dl/issues/154), but at the
moment, your best course of action is pointing a web browser to the
youtube URL, solving the CAPTCHA, and restarting youtube-dl.
.SS I have downloaded a video but how can I play it?
.PP
Once the video is fully downloaded, use any video player, such as
vlc (http://www.videolan.org) or mplayer (http://www.mplayerhq.hu/).
.SS The links provided by youtube-dl -g are not working anymore
.PP
The URLs youtube-dl outputs require the downloader to have the correct
cookies.
Use the \f[C]--cookies\f[] option to write the required cookies into a
file, and advise your downloader to read cookies from that file.
Some sites also require a common user agent to be used, use
\f[C]--dump-user-agent\f[] to see the one in use by youtube-dl.
.SS ERROR: no fmt_url_map or conn information found in video info
.PP
youtube has switched to a new video info format in July 2011 which is
not supported by old versions of youtube-dl.
You can update youtube-dl with \f[C]sudo\ youtube-dl\ --update\f[].
.SS ERROR: unable to download video
.PP
youtube requires an additional signature since September 2012 which is
not supported by old versions of youtube-dl.
You can update youtube-dl with \f[C]sudo\ youtube-dl\ --update\f[].
.SS SyntaxError: Non-ASCII character
.PP
The error
.IP
.nf
\f[C]
File\ "youtube-dl",\ line\ 2
SyntaxError:\ Non-ASCII\ character\ \[aq]\\x93\[aq]\ ...
\f[]
.fi
.PP
means you\[aq]re using an outdated version of Python.
Please update to Python 2.6 or 2.7.
.PP
To run youtube-dl under Python 2.5, you\[aq]ll have to manually check it
out like this:
.IP
.nf
\f[C]
git\ clone\ git://github.com/rg3/youtube-dl.git
cd\ youtube-dl
python\ -m\ youtube_dl\ --help
\f[]
.fi
.PP
Please note that Python 2.5 is not supported anymore.
.SS What is this binary file? Where has the code gone?
.PP
Since June 2012 (#342) youtube-dl is packed as an executable zipfile,
simply unzip it (might need renaming to \f[C]youtube-dl.zip\f[] first on
some systems) or clone the git repo to see the code.
If you modify the code, you can run it by executing the
\f[C]__main__.py\f[] file.
To recompile the executable, run \f[C]make\ compile\f[].
.SS The exe throws a \f[I]Runtime error from Visual C++\f[]
.PP
To run the exe you need to install first the Microsoft Visual C++ 2008
Redistributable
Package (http://www.microsoft.com/en-us/download/details.aspx?id=29).
.SH COPYRIGHT
.PP
youtube-dl is released into the public domain by the copyright holders.
.PP
This README file was originally written by Daniel Bolton
(<https://github.com/dbbolton>) and is likewise released into the public
domain.
.SH BUGS
.PP
Bugs and suggestions should be reported at:
<https://github.com/rg3/youtube-dl/issues>
.PP
Please include:
.IP \[bu] 2
Your exact command line, like
\f[C]youtube-dl\ -t\ "http://www.youtube.com/watch?v=uHlDtZ6Oc3s&feature=channel_video_title"\f[].
A common mistake is not to escape the \f[C]&\f[].
Putting URLs in quotes should solve this problem.
.IP \[bu] 2
The output of \f[C]youtube-dl\ --version\f[]
.IP \[bu] 2
The output of \f[C]python\ --version\f[]
.IP \[bu] 2
The name and version of your Operating System ("Ubuntu 11.04 x64" or
"Windows 7 x64" is usually enough).


@@ -0,0 +1,14 @@
__youtube-dl()
{
    local cur prev opts
    COMPREPLY=()
    cur="${COMP_WORDS[COMP_CWORD]}"
opts="--all-formats --audio-format --audio-quality --auto-number --batch-file --console-title --continue --cookies --dump-user-agent --extract-audio --format --get-description --get-filename --get-format --get-thumbnail --get-title --get-url --help --ignore-errors --keep-video --list-extractors --list-formats --literal --match-title --max-downloads --max-quality --netrc --no-continue --no-mtime --no-overwrites --no-part --no-progress --output --password --playlist-end --playlist-start --prefer-free-formats --quiet --rate-limit --reject-title --retries --simulate --skip-download --srt-lang --title --update --user-agent --username --verbose --version --write-description --write-info-json --write-srt"
    if [[ ${cur} == * ]] ; then
        COMPREPLY=( $(compgen -W "${opts}" -- ${cur}) )
        return 0
    fi
}
complete -F __youtube-dl youtube-dl
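In the completion function above, `compgen -W "${opts}" -- ${cur}` keeps the words from `opts` that start with the text typed so far, and the `[[ ${cur} == * ]]` test is always true because `*` matches any string. The filtering can be sketched in Python 3 as below. The function name `complete_options` is hypothetical and the option list is shortened for the example.

```python
def complete_options(opts, cur):
    """Return the whitespace-separated words in opts that start with cur."""
    return [word for word in opts.split() if word.startswith(cur)]


# Shortened option list, taken from the opts string in the script.
opts = "--get-title --get-url --help --quiet"

print(complete_options(opts, '--get'))  # ['--get-title', '--get-url']
print(complete_options(opts, '--x'))    # []
```

An empty `cur` matches every word, which corresponds to pressing Tab before typing anything.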

BIN
youtube-dl.exe Executable file

Binary file not shown.


@@ -0,0 +1,690 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import httplib
import math
import os
import re
import socket
import subprocess
import sys
import time
import urllib2
if os.name == 'nt':
    import ctypes

from utils import *

class FileDownloader(object):
    """File Downloader class.

    File downloader objects are the ones responsible for downloading the
    actual video file and writing it to disk if the user has requested
    it, among some other tasks. In most cases there should be one per
    program. As, given a video URL, the downloader doesn't know how to
    extract all the needed information, a task that InfoExtractors
    perform, it has to pass the URL to one of them.

    For this, file downloader objects have a method that allows
    InfoExtractors to be registered in a given order. When it is passed
    a URL, the file downloader hands it to the first InfoExtractor it
    finds that reports being able to handle it. The InfoExtractor extracts
    all the information about the video or videos the URL refers to, and
    asks the FileDownloader to process the video information, possibly
    downloading the video.

    File downloaders accept a lot of parameters. In order not to saturate
    the object constructor with arguments, it receives a dictionary of
    options instead. These options are available through the params
    attribute for the InfoExtractors to use. The FileDownloader also
    registers itself as the downloader in charge for the InfoExtractors
    that are added to it, so this is a "mutual registration".

    Available options:

    username:         Username for authentication purposes.
    password:         Password for authentication purposes.
    usenetrc:         Use netrc for authentication instead.
    quiet:            Do not print messages to stdout.
    forceurl:         Force printing final URL.
    forcetitle:       Force printing title.
    forcethumbnail:   Force printing thumbnail URL.
    forcedescription: Force printing description.
    forcefilename:    Force printing final filename.
    simulate:         Do not download the video files.
    format:           Video format code.
    format_limit:     Highest quality format to try.
    outtmpl:          Template for output names.
    ignoreerrors:     Do not stop on download errors.
    ratelimit:        Download speed limit, in bytes/sec.
    nooverwrites:     Prevent overwriting files.
    retries:          Number of times to retry for HTTP error 5xx
    continuedl:       Try to continue downloads if possible.
    noprogress:       Do not print the progress bar.
    playliststart:    Playlist item to start at.
    playlistend:      Playlist item to end at.
    matchtitle:       Download only matching titles.
    rejecttitle:      Reject downloads for matching titles.
    logtostderr:      Log messages to stderr instead of stdout.
    consoletitle:     Display progress in console window's titlebar.
    nopart:           Do not use temporary .part files.
    updatetime:       Use the Last-modified header to set output file timestamps.
    writedescription: Write the video description to a .description file
    writeinfojson:    Write the video metadata to a .info.json file
    writesubtitles:   Write the video subtitles to a .srt file
    subtitleslang:    Language of the subtitles to download
    """

    params = None
    _ies = []
    _pps = []
    _download_retcode = None
    _num_downloads = None
    _screen_file = None

    def __init__(self, params):
        """Create a FileDownloader object with the given options."""
        self._ies = []
        self._pps = []
        self._download_retcode = 0
        self._num_downloads = 0
        self._screen_file = [sys.stdout, sys.stderr][params.get('logtostderr', False)]
        self.params = params

    @staticmethod
    def format_bytes(bytes):
        if bytes is None:
            return 'N/A'
        if type(bytes) is str:
            bytes = float(bytes)
        if bytes == 0.0:
            exponent = 0
        else:
            exponent = long(math.log(bytes, 1024.0))
        suffix = 'bkMGTPEZY'[exponent]
        converted = float(bytes) / float(1024 ** exponent)
        return '%.2f%s' % (converted, suffix)

    @staticmethod
    def calc_percent(byte_counter, data_len):
        if data_len is None:
            return '---.-%'
        return '%6s' % ('%3.1f%%' % (float(byte_counter) / float(data_len) * 100.0))

    @staticmethod
    def calc_eta(start, now, total, current):
        if total is None:
            return '--:--'
        dif = now - start
        if current == 0 or dif < 0.001: # One millisecond
            return '--:--'
        rate = float(current) / dif
        eta = long((float(total) - float(current)) / rate)
        (eta_mins, eta_secs) = divmod(eta, 60)
        if eta_mins > 99:
            return '--:--'
        return '%02d:%02d' % (eta_mins, eta_secs)

    @staticmethod
    def calc_speed(start, now, bytes):
        dif = now - start
        if bytes == 0 or dif < 0.001: # One millisecond
            return '%10s' % '---b/s'
        return '%10s' % ('%s/s' % FileDownloader.format_bytes(float(bytes) / dif))

    @staticmethod
    def best_block_size(elapsed_time, bytes):
        new_min = max(bytes / 2.0, 1.0)
        new_max = min(max(bytes * 2.0, 1.0), 4194304) # Do not surpass 4 MB
        if elapsed_time < 0.001:
            return long(new_max)
        rate = bytes / elapsed_time
        if rate > new_max:
            return long(new_max)
        if rate < new_min:
            return long(new_min)
        return long(rate)

    @staticmethod
    def parse_bytes(bytestr):
        """Parse a string indicating a byte quantity into a long integer."""
        matchobj = re.match(r'(?i)^(\d+(?:\.\d+)?)([kMGTPEZY]?)$', bytestr)
        if matchobj is None:
            return None
        number = float(matchobj.group(1))
        multiplier = 1024.0 ** 'bkmgtpezy'.index(matchobj.group(2).lower())
        return long(round(number * multiplier))

    def add_info_extractor(self, ie):
        """Add an InfoExtractor object to the end of the list."""
        self._ies.append(ie)
        ie.set_downloader(self)

    def add_post_processor(self, pp):
        """Add a PostProcessor object to the end of the chain."""
        self._pps.append(pp)
        pp.set_downloader(self)

    def to_screen(self, message, skip_eol=False):
        """Print message to stdout if not in quiet mode."""
        assert type(message) == type(u'')
        if not self.params.get('quiet', False):
            terminator = [u'\n', u''][skip_eol]
            output = message + terminator

            if 'b' not in self._screen_file.mode or sys.version_info[0] < 3: # Python 2 lies about the mode of sys.stdout/sys.stderr
                output = output.encode(preferredencoding(), 'ignore')
            self._screen_file.write(output)
            self._screen_file.flush()

    def to_stderr(self, message):
        """Print message to stderr."""
        print >>sys.stderr, message.encode(preferredencoding())
def to_cons_title(self, message):
"""Set console/terminal window title to message."""
if not self.params.get('consoletitle', False):
return
if os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
# c_wchar_p() might not be necessary if `message` is
# already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
elif 'TERM' in os.environ:
sys.stderr.write('\033]0;%s\007' % message.encode(preferredencoding()))
def fixed_template(self):
"""Checks if the output template is fixed."""
return (re.search(ur'(?u)%\(.+?\)s', self.params['outtmpl']) is None)
def trouble(self, message=None):
"""Determine action to take when a download problem appears.
Depending on whether the downloader has been configured to ignore
download errors, this method either raises DownloadError or only
records a failure return code, after printing the message.
"""
if message is not None:
self.to_stderr(message)
if not self.params.get('ignoreerrors', False):
raise DownloadError(message)
self._download_retcode = 1
def slow_down(self, start_time, byte_counter):
"""Sleep if the download speed is over the rate limit."""
rate_limit = self.params.get('ratelimit', None)
if rate_limit is None or byte_counter == 0:
return
now = time.time()
elapsed = now - start_time
if elapsed <= 0.0:
return
speed = float(byte_counter) / elapsed
if speed > rate_limit:
time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
def temp_name(self, filename):
"""Returns a temporary filename for the given filename."""
if self.params.get('nopart', False) or filename == u'-' or \
(os.path.exists(encodeFilename(filename)) and not os.path.isfile(encodeFilename(filename))):
return filename
return filename + u'.part'
def undo_temp_name(self, filename):
if filename.endswith(u'.part'):
return filename[:-len(u'.part')]
return filename
def try_rename(self, old_filename, new_filename):
try:
if old_filename == new_filename:
return
os.rename(encodeFilename(old_filename), encodeFilename(new_filename))
except (IOError, OSError), err:
self.trouble(u'ERROR: unable to rename file')
def try_utime(self, filename, last_modified_hdr):
"""Try to set the last-modified time of the given file."""
if last_modified_hdr is None:
return
if not os.path.isfile(encodeFilename(filename)):
return
timestr = last_modified_hdr
if timestr is None:
return
filetime = timeconvert(timestr)
if filetime is None:
return filetime
try:
os.utime(encodeFilename(filename), (time.time(), filetime))
except:
pass
return filetime
def report_writedescription(self, descfn):
""" Report that the description file is being written """
self.to_screen(u'[info] Writing video description to: ' + descfn)
def report_writesubtitles(self, srtfn):
""" Report that the subtitles file is being written """
self.to_screen(u'[info] Writing video subtitles to: ' + srtfn)
def report_writeinfojson(self, infofn):
""" Report that the metadata file has been written """
self.to_screen(u'[info] Writing video description metadata as JSON to: ' + infofn)
def report_destination(self, filename):
"""Report destination filename."""
self.to_screen(u'[download] Destination: ' + filename)
def report_progress(self, percent_str, data_len_str, speed_str, eta_str):
"""Report download progress."""
if self.params.get('noprogress', False):
return
self.to_screen(u'\r[download] %s of %s at %s ETA %s' %
(percent_str, data_len_str, speed_str, eta_str), skip_eol=True)
self.to_cons_title(u'youtube-dl - %s of %s at %s ETA %s' %
(percent_str.strip(), data_len_str.strip(), speed_str.strip(), eta_str.strip()))
def report_resuming_byte(self, resume_len):
"""Report attempt to resume at given byte."""
self.to_screen(u'[download] Resuming download at byte %s' % resume_len)
def report_retry(self, count, retries):
"""Report retry in case of HTTP error 5xx"""
self.to_screen(u'[download] Got server HTTP error. Retrying (attempt %d of %d)...' % (count, retries))
def report_file_already_downloaded(self, file_name):
"""Report file has already been fully downloaded."""
try:
self.to_screen(u'[download] %s has already been downloaded' % file_name)
except (UnicodeEncodeError), err:
self.to_screen(u'[download] The file has already been downloaded')
def report_unable_to_resume(self):
"""Report it was impossible to resume download."""
self.to_screen(u'[download] Unable to resume')
def report_finish(self):
"""Report download finished."""
if self.params.get('noprogress', False):
self.to_screen(u'[download] Download completed')
else:
self.to_screen(u'')
def increment_downloads(self):
"""Increment the ordinal that assigns a number to each file."""
self._num_downloads += 1
def prepare_filename(self, info_dict):
"""Generate the output filename."""
try:
template_dict = dict(info_dict)
template_dict['epoch'] = unicode(long(time.time()))
template_dict['autonumber'] = unicode('%05d' % self._num_downloads)
filename = self.params['outtmpl'] % template_dict
return filename
except (ValueError, KeyError), err:
self.trouble(u'ERROR: invalid system charset or erroneous output template')
return None
def _match_entry(self, info_dict):
""" Returns None iff the file should be downloaded """
title = info_dict['title']
matchtitle = self.params.get('matchtitle', False)
if matchtitle and not re.search(matchtitle, title, re.IGNORECASE):
return u'"' + title + '" title did not match pattern "' + matchtitle + '"'
rejecttitle = self.params.get('rejecttitle', False)
if rejecttitle and re.search(rejecttitle, title, re.IGNORECASE):
return u'"' + title + '" title matched reject pattern "' + rejecttitle + '"'
return None
def process_info(self, info_dict):
"""Process a single dictionary returned by an InfoExtractor."""
info_dict['stitle'] = sanitize_filename(info_dict['title'])
reason = self._match_entry(info_dict)
if reason is not None:
self.to_screen(u'[download] ' + reason)
return
max_downloads = self.params.get('max_downloads')
if max_downloads is not None:
if self._num_downloads > int(max_downloads):
raise MaxDownloadsReached()
filename = self.prepare_filename(info_dict)
# Forced printings
if self.params.get('forcetitle', False):
print info_dict['title'].encode(preferredencoding(), 'xmlcharrefreplace')
if self.params.get('forceurl', False):
print info_dict['url'].encode(preferredencoding(), 'xmlcharrefreplace')
if self.params.get('forcethumbnail', False) and 'thumbnail' in info_dict:
print info_dict['thumbnail'].encode(preferredencoding(), 'xmlcharrefreplace')
if self.params.get('forcedescription', False) and 'description' in info_dict:
print info_dict['description'].encode(preferredencoding(), 'xmlcharrefreplace')
if self.params.get('forcefilename', False) and filename is not None:
print filename.encode(preferredencoding(), 'xmlcharrefreplace')
if self.params.get('forceformat', False):
print info_dict['format'].encode(preferredencoding(), 'xmlcharrefreplace')
# Do nothing else if in simulate mode
if self.params.get('simulate', False):
return
if filename is None:
return
try:
dn = os.path.dirname(encodeFilename(filename))
if dn != '' and not os.path.exists(dn): # dn is already encoded
os.makedirs(dn)
except (OSError, IOError), err:
self.trouble(u'ERROR: unable to create directory ' + unicode(err))
return
if self.params.get('writedescription', False):
try:
descfn = filename + u'.description'
self.report_writedescription(descfn)
descfile = open(encodeFilename(descfn), 'wb')
try:
descfile.write(info_dict['description'].encode('utf-8'))
finally:
descfile.close()
except (OSError, IOError):
self.trouble(u'ERROR: Cannot write description file ' + descfn)
return
if self.params.get('writesubtitles', False) and 'subtitles' in info_dict and info_dict['subtitles']:
# subtitles download errors are already managed as troubles in relevant IE
# that way it will silently go on when used with unsupporting IE
try:
srtfn = filename.rsplit('.', 1)[0] + u'.srt'
self.report_writesubtitles(srtfn)
srtfile = open(encodeFilename(srtfn), 'wb')
try:
srtfile.write(info_dict['subtitles'].encode('utf-8'))
finally:
srtfile.close()
except (OSError, IOError):
self.trouble(u'ERROR: Cannot write subtitles file ' + srtfn)
return
if self.params.get('writeinfojson', False):
infofn = filename + u'.info.json'
self.report_writeinfojson(infofn)
try:
json.dump
except (NameError,AttributeError):
self.trouble(u'ERROR: No JSON encoder found. Update to Python 2.6+, setup a json module, or leave out --write-info-json.')
return
try:
infof = open(encodeFilename(infofn), 'wb')
try:
json_info_dict = dict((k,v) for k,v in info_dict.iteritems() if not k in ('urlhandle',))
json.dump(json_info_dict, infof)
finally:
infof.close()
except (OSError, IOError):
self.trouble(u'ERROR: Cannot write metadata to JSON file ' + infofn)
return
if not self.params.get('skip_download', False):
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(filename)):
success = True
else:
try:
success = self._do_download(filename, info_dict)
except (OSError, IOError), err:
raise UnavailableVideoError
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self.trouble(u'ERROR: unable to download video data: %s' % str(err))
return
except (ContentTooShortError, ), err:
self.trouble(u'ERROR: content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
return
if success:
try:
self.post_process(filename, info_dict)
except (PostProcessingError), err:
self.trouble(u'ERROR: postprocessing: %s' % str(err))
return
def download(self, url_list):
"""Download a given list of URLs."""
if len(url_list) > 1 and self.fixed_template():
raise SameFileError(self.params['outtmpl'])
for url in url_list:
suitable_found = False
for ie in self._ies:
# Go to next InfoExtractor if not suitable
if not ie.suitable(url):
continue
# Suitable InfoExtractor found
suitable_found = True
# Extract information from URL and process it
videos = ie.extract(url)
for video in videos or []:
video['extractor'] = ie.IE_NAME
try:
self.increment_downloads()
self.process_info(video)
except UnavailableVideoError:
self.trouble(u'\nERROR: unable to download video')
# Suitable InfoExtractor had been found; go to next URL
break
if not suitable_found:
self.trouble(u'ERROR: no suitable InfoExtractor: %s' % url)
return self._download_retcode
def post_process(self, filename, ie_info):
"""Run the postprocessing chain on the given file."""
info = dict(ie_info)
info['filepath'] = filename
for pp in self._pps:
info = pp.run(info)
if info is None:
break
def _download_with_rtmpdump(self, filename, url, player_url):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
# Check for rtmpdump first
try:
subprocess.call(['rtmpdump', '-h'], stdout=(file(os.path.devnull, 'w')), stderr=subprocess.STDOUT)
except (OSError, IOError):
self.trouble(u'ERROR: RTMP download detected but "rtmpdump" could not be run')
return False
# Download using rtmpdump. rtmpdump returns exit code 2 when
# the connection was interrupted and resuming appears to be
# possible. This is part of rtmpdump's normal usage, AFAIK.
basic_args = ['rtmpdump', '-q'] + [[], ['-W', player_url]][player_url is not None] + ['-r', url, '-o', tmpfilename]
args = basic_args + [[], ['-e', '-k', '1']][self.params.get('continuedl', False)]
if self.params.get('verbose', False):
try:
import pipes
shell_quote = lambda args: ' '.join(map(pipes.quote, args))
except ImportError:
shell_quote = repr
self.to_screen(u'[debug] rtmpdump command line: ' + shell_quote(args))
retval = subprocess.call(args)
while retval == 2 or retval == 1:
prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(u'\r[rtmpdump] %s bytes' % prevsize, skip_eol=True)
time.sleep(5.0) # This seems to be needed
retval = subprocess.call(basic_args + ['-e'] + [[], ['-k', '1']][retval == 1])
cursize = os.path.getsize(encodeFilename(tmpfilename))
if prevsize == cursize and retval == 1:
break
# Some rtmp streams seem to abort after ~ 99.8%. Don't complain for those
if prevsize == cursize and retval == 2 and cursize > 1024:
self.to_screen(u'\r[rtmpdump] Could not download the whole video. This can happen for some advertisements.')
retval = 0
break
if retval == 0:
self.to_screen(u'\r[rtmpdump] %s bytes' % os.path.getsize(encodeFilename(tmpfilename)))
self.try_rename(tmpfilename, filename)
return True
else:
self.trouble(u'\nERROR: rtmpdump exited with code %d' % retval)
return False
def _do_download(self, filename, info_dict):
url = info_dict['url']
player_url = info_dict.get('player_url', None)
# Check file already present
if self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)) and not self.params.get('nopart', False):
self.report_file_already_downloaded(filename)
return True
# Attempt to download using rtmpdump
if url.startswith('rtmp'):
return self._download_with_rtmpdump(filename, url, player_url)
tmpfilename = self.temp_name(filename)
stream = None
# Do not include the Accept-Encoding header
headers = {'Youtubedl-no-compression': 'True'}
basic_request = urllib2.Request(url, None, headers)
request = urllib2.Request(url, None, headers)
# Establish possible resume length
if os.path.isfile(encodeFilename(tmpfilename)):
resume_len = os.path.getsize(encodeFilename(tmpfilename))
else:
resume_len = 0
open_mode = 'wb'
if resume_len != 0:
if self.params.get('continuedl', False):
self.report_resuming_byte(resume_len)
request.add_header('Range','bytes=%d-' % resume_len)
open_mode = 'ab'
else:
resume_len = 0
count = 0
retries = self.params.get('retries', 0)
while count <= retries:
# Establish connection
try:
if count == 0 and 'urlhandle' in info_dict:
data = info_dict['urlhandle']
data = urllib2.urlopen(request)
break
except (urllib2.HTTPError, ), err:
if (err.code < 500 or err.code >= 600) and err.code != 416:
# Unexpected HTTP error
raise
elif err.code == 416:
# Unable to resume (requested range not satisfiable)
try:
# Open the connection again without the range header
data = urllib2.urlopen(basic_request)
content_length = data.info()['Content-Length']
except (urllib2.HTTPError, ), err:
if err.code < 500 or err.code >= 600:
raise
else:
# Examine the reported length
if (content_length is not None and
(resume_len - 100 < long(content_length) < resume_len + 100)):
# The file had already been fully downloaded.
# Explanation to the above condition: in issue #175 it was revealed that
# YouTube sometimes adds or removes a few bytes from the end of the file,
# changing the file size slightly and causing problems for some users. So
# I decided to implement a suggested change and consider the file
# completely downloaded if the file size differs by less than 100 bytes from
# the one on the hard drive.
self.report_file_already_downloaded(filename)
self.try_rename(tmpfilename, filename)
return True
else:
# The length does not match, we start the download over
self.report_unable_to_resume()
open_mode = 'wb'
break
# Retry
count += 1
if count <= retries:
self.report_retry(count, retries)
if count > retries:
self.trouble(u'ERROR: giving up after %s retries' % retries)
return False
data_len = data.info().get('Content-length', None)
if data_len is not None:
data_len = long(data_len) + resume_len
data_len_str = self.format_bytes(data_len)
byte_counter = 0 + resume_len
block_size = 1024
start = time.time()
while True:
# Download and write
before = time.time()
data_block = data.read(block_size)
after = time.time()
if len(data_block) == 0:
break
byte_counter += len(data_block)
# Open file just in time
if stream is None:
try:
(stream, tmpfilename) = sanitize_open(tmpfilename, open_mode)
assert stream is not None
filename = self.undo_temp_name(tmpfilename)
self.report_destination(filename)
except (OSError, IOError), err:
self.trouble(u'ERROR: unable to open for writing: %s' % str(err))
return False
try:
stream.write(data_block)
except (IOError, OSError), err:
self.trouble(u'\nERROR: unable to write data: %s' % str(err))
return False
block_size = self.best_block_size(after - before, len(data_block))
# Progress message
speed_str = self.calc_speed(start, time.time(), byte_counter - resume_len)
if data_len is None:
self.report_progress('Unknown %', data_len_str, speed_str, 'Unknown ETA')
else:
percent_str = self.calc_percent(byte_counter, data_len)
eta_str = self.calc_eta(start, time.time(), data_len - resume_len, byte_counter - resume_len)
self.report_progress(percent_str, data_len_str, speed_str, eta_str)
# Apply rate limit
self.slow_down(start, byte_counter - resume_len)
if stream is None:
self.trouble(u'\nERROR: Did not get any data blocks')
return False
stream.close()
self.report_finish()
if data_len is not None and byte_counter != data_len:
raise ContentTooShortError(byte_counter, long(data_len))
self.try_rename(tmpfilename, filename)
# Update file modification time
if self.params.get('updatetime', True):
info_dict['filetime'] = self.try_utime(filename, data.info().get('last-modified', None))
return True
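
Review note on the rate limit path in the file above: `parse_bytes` turns a value such as `50k` into a byte count, and `slow_down` sleeps long enough to bring the average speed back under that limit. The following is a minimal Python 3 sketch of both steps, not the project's code. `throttle_delay` is a hypothetical helper name: it returns the sleep time instead of calling `time.sleep`, which makes the arithmetic easy to check.

```python
import re

# Position in this string is the power of 1024. 'b' only occupies index 0;
# the regex below does not accept it as a suffix.
_SUFFIXES = 'bkmgtpezy'


def parse_bytes(bytestr):
    """Parse strings such as '50k' or '44.6m' into a byte count, or None."""
    match = re.match(r'(?i)^(\d+(?:\.\d+)?)([kMGTPEZY]?)$', bytestr)
    if match is None:
        return None
    number = float(match.group(1))
    # An empty suffix is found at index 0, so the multiplier is 1024 ** 0.
    multiplier = 1024.0 ** _SUFFIXES.index(match.group(2).lower())
    return int(round(number * multiplier))


def throttle_delay(byte_counter, elapsed, rate_limit):
    """Seconds to sleep so the average speed falls back to rate_limit.

    Hypothetical helper: same formula as slow_down, but it returns the
    delay instead of sleeping.
    """
    if rate_limit is None or byte_counter == 0 or elapsed <= 0.0:
        return 0.0
    if byte_counter / elapsed <= rate_limit:
        return 0.0
    # Time the transfer should have taken at rate_limit, minus time spent.
    return (byte_counter - rate_limit * elapsed) / rate_limit


limit = parse_bytes('50k')
print(limit)
print(throttle_delay(102400, 1.0, limit))
```

With a 50 KiB/s limit, 100 KiB received in one second should have taken two seconds, so the sketch asks for one more second of sleep.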

3307
youtube_dl/InfoExtractors.py Normal file

File diff suppressed because it is too large
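
Review note: the extractor sources are not shown here, but `FileDownloader.download` above only relies on three things from each extractor: `suitable(url)`, `extract(url)` and `IE_NAME`. The sketch below illustrates that "first suitable extractor wins" dispatch in Python 3. `StubExtractor` and `dispatch` are hypothetical names for illustration, not classes or functions from InfoExtractors.py, and the real loop also handles errors and download counting.

```python
class StubExtractor(object):
    """Hypothetical stand-in for an InfoExtractor; not the real class."""

    def __init__(self, name, prefix):
        self.IE_NAME = name
        self._prefix = prefix

    def suitable(self, url):
        # The real extractors decide this their own way; a prefix check
        # is only an assumption made for this sketch.
        return url.startswith(self._prefix)

    def extract(self, url):
        return [{'url': url, 'title': u'stub title'}]


def dispatch(extractors, url):
    """Return the tagged results of the first suitable extractor, or None."""
    for ie in extractors:
        if not ie.suitable(url):
            continue
        videos = ie.extract(url) or []
        for video in videos:
            video['extractor'] = ie.IE_NAME
        return videos  # first match wins; later extractors are not consulted
    return None


ies = [
    StubExtractor('specific', 'http://example.com/watch'),
    StubExtractor('generic', 'http://'),
]
print(dispatch(ies, 'http://example.com/watch?v=1'))
```

Because the list is scanned in order, a catch-all extractor only makes sense at the end of the list, which matches `add_info_extractor` appending to the end.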

198
youtube_dl/PostProcessor.py Normal file

@@ -0,0 +1,198 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import subprocess
import sys
import time
from utils import *
class PostProcessor(object):
"""Post Processor class.
PostProcessor objects can be added to downloaders with their
add_post_processor() method. When the downloader has finished a
successful download, it will take its internal chain of PostProcessors
and start calling the run() method on each one of them, first with
an initial argument and then with the returned value of the previous
PostProcessor.
The chain will be stopped if one of them ever returns None or the end
of the chain is reached.
PostProcessor objects follow a "mutual registration" process similar
to InfoExtractor objects.
"""
_downloader = None
def __init__(self, downloader=None):
self._downloader = downloader
def set_downloader(self, downloader):
"""Sets the downloader for this PP."""
self._downloader = downloader
def run(self, information):
"""Run the PostProcessor.
The "information" argument is a dictionary like the ones
composed by InfoExtractors. The only difference is that this
one has an extra field called "filepath" that points to the
downloaded file.
When this method returns None, the postprocessing chain is
stopped. However, this method may return an information
dictionary that will be passed to the next postprocessing
object in the chain. It can be the one it received after
changing some fields.
In addition, this method may raise a PostProcessingError
exception that will be taken into account by the downloader
it was called from.
"""
return information # by default, do nothing
class AudioConversionError(Exception):
def __init__(self, message):
self.message = message
class FFmpegExtractAudioPP(PostProcessor):
def __init__(self, downloader=None, preferredcodec=None, preferredquality=None, keepvideo=False):
PostProcessor.__init__(self, downloader)
if preferredcodec is None:
preferredcodec = 'best'
self._preferredcodec = preferredcodec
self._preferredquality = preferredquality
self._keepvideo = keepvideo
self._exes = self.detect_executables()
@staticmethod
def detect_executables():
def executable(exe):
try:
subprocess.check_output([exe, '-version'])
except OSError:
return False
return exe
programs = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']
return dict((program, executable(program)) for program in programs)
def get_audio_codec(self, path):
if not self._exes['ffprobe'] and not self._exes['avprobe']: return None
try:
cmd = [self._exes['avprobe'] or self._exes['ffprobe'], '-show_streams', '--', encodeFilename(path)]
handle = subprocess.Popen(cmd, stderr=file(os.path.devnull, 'w'), stdout=subprocess.PIPE)
output = handle.communicate()[0]
if handle.wait() != 0:
return None
except (IOError, OSError):
return None
audio_codec = None
for line in output.split('\n'):
if line.startswith('codec_name='):
audio_codec = line.split('=')[1].strip()
elif line.strip() == 'codec_type=audio' and audio_codec is not None:
return audio_codec
return None
def run_ffmpeg(self, path, out_path, codec, more_opts):
if not self._exes['ffmpeg'] and not self._exes['avconv']:
raise AudioConversionError('ffmpeg or avconv not found. Please install one.')
if codec is None:
acodec_opts = []
else:
acodec_opts = ['-acodec', codec]
cmd = ([self._exes['avconv'] or self._exes['ffmpeg'], '-y', '-i', encodeFilename(path), '-vn']
+ acodec_opts + more_opts +
['--', encodeFilename(out_path)])
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout,stderr = p.communicate()
if p.returncode != 0:
msg = stderr.strip().split('\n')[-1]
raise AudioConversionError(msg)
def run(self, information):
path = information['filepath']
filecodec = self.get_audio_codec(path)
if filecodec is None:
self._downloader.to_stderr(u'WARNING: unable to obtain file audio codec with ffprobe or avprobe')
return None
more_opts = []
if self._preferredcodec == 'best' or self._preferredcodec == filecodec or (self._preferredcodec == 'm4a' and filecodec == 'aac'):
if self._preferredcodec == 'm4a' and filecodec == 'aac':
# Lossless, but in another container
acodec = 'copy'
extension = self._preferredcodec
more_opts = [self._exes['avconv'] and '-bsf:a' or '-absf', 'aac_adtstoasc']
elif filecodec in ['aac', 'mp3', 'vorbis']:
# Lossless if possible
acodec = 'copy'
extension = filecodec
if filecodec == 'aac':
more_opts = ['-f', 'adts']
if filecodec == 'vorbis':
extension = 'ogg'
else:
# MP3 otherwise.
acodec = 'libmp3lame'
extension = 'mp3'
more_opts = []
if self._preferredquality is not None:
if int(self._preferredquality) < 10:
more_opts += [self._exes['avconv'] and '-q:a' or '-aq', self._preferredquality]
else:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality]
else:
# We convert the audio (lossy)
acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
extension = self._preferredcodec
more_opts = []
if self._preferredquality is not None:
if int(self._preferredquality) < 10:
more_opts += [self._exes['avconv'] and '-q:a' or '-aq', self._preferredquality]
else:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality]
if self._preferredcodec == 'aac':
more_opts += ['-f', 'adts']
if self._preferredcodec == 'm4a':
more_opts += [self._exes['avconv'] and '-bsf:a' or '-absf', 'aac_adtstoasc']
if self._preferredcodec == 'vorbis':
extension = 'ogg'
if self._preferredcodec == 'wav':
extension = 'wav'
more_opts += ['-f', 'wav']
prefix, sep, ext = path.rpartition(u'.') # not os.path.splitext, since the latter does not work on unicode in all setups
new_path = prefix + sep + extension
self._downloader.to_screen(u'[' + (self._exes['avconv'] and 'avconv' or 'ffmpeg') + '] Destination: ' + new_path)
try:
self.run_ffmpeg(path, new_path, acodec, more_opts)
except:
etype,e,tb = sys.exc_info()
if isinstance(e, AudioConversionError):
self._downloader.to_stderr(u'ERROR: audio conversion failed: ' + e.message)
else:
self._downloader.to_stderr(u'ERROR: error running ' + (self._exes['avconv'] and 'avconv' or 'ffmpeg'))
return None
# Try to update the date time for extracted audio file.
if information.get('filetime') is not None:
try:
os.utime(encodeFilename(new_path), (time.time(), information['filetime']))
except:
self._downloader.to_stderr(u'WARNING: Cannot update utime of audio file')
if not self._keepvideo:
try:
os.remove(encodeFilename(path))
except (IOError, OSError):
self._downloader.to_stderr(u'WARNING: Unable to remove downloaded video file')
return None
information['filepath'] = new_path
return information
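
Review note: the `PostProcessor` docstring describes a chain in which each `run()` receives the previous result and a `None` return stops everything after it. The Python 3 sketch below walks through that contract. The three processor classes are hypothetical examples, and `run_chain` mirrors `FileDownloader.post_process` except that it returns the final value so the behaviour can be inspected.

```python
class UppercaseTitlePP(object):
    """Hypothetical processor: edits a field and passes the dict on."""

    def run(self, information):
        information['title'] = information['title'].upper()
        return information


class StopPP(object):
    """Hypothetical processor: returning None halts the chain."""

    def run(self, information):
        return None


class RecordingPP(object):
    """Hypothetical processor: counts how often it was reached."""

    def __init__(self):
        self.calls = 0

    def run(self, information):
        self.calls += 1
        return information


def run_chain(processors, filename, ie_info):
    info = dict(ie_info)  # shallow copy; the caller's dict keeps its keys
    info['filepath'] = filename
    for pp in processors:
        info = pp.run(info)
        if info is None:
            break
    return info


recorder = RecordingPP()
print(run_chain([UppercaseTitlePP(), recorder], 'a.flv', {'title': 'clip'}))
print(run_chain([StopPP(), recorder], 'a.flv', {'title': 'clip'}))
```

This is also why `FFmpegExtractAudioPP.run` returning `None` on a failed conversion prevents any later processor from seeing a file that was never produced.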

554
youtube_dl/__init__.py Normal file

@@ -0,0 +1,554 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__authors__ = (
'Ricardo Garcia Gonzalez',
'Danny Colligan',
'Benjamin Johnson',
'Vasyl\' Vavrychuk',
'Witold Baryluk',
'Paweł Paprota',
'Gergely Imreh',
'Rogério Brito',
'Philipp Hagemeister',
'Sören Schulze',
'Kevin Ngo',
'Ori Avtalion',
'shizeeg',
'Filippo Valsorda',
)
__license__ = 'Public Domain'
__version__ = '2012.10.09'
UPDATE_URL = 'https://raw.github.com/rg3/youtube-dl/master/youtube-dl'
UPDATE_URL_VERSION = 'https://raw.github.com/rg3/youtube-dl/master/LATEST_VERSION'
UPDATE_URL_EXE = 'https://raw.github.com/rg3/youtube-dl/master/youtube-dl.exe'
import cookielib
import getpass
import optparse
import os
import re
import shlex
import socket
import subprocess
import sys
import urllib2
import warnings
from utils import *
from FileDownloader import *
from InfoExtractors import *
from PostProcessor import *
def updateSelf(downloader, filename):
''' Update the program file with the latest version from the repository '''
# Note: downloader only used for options
if not os.access(filename, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % filename)
downloader.to_screen(u'Updating to latest version...')
urlv = urllib2.urlopen(UPDATE_URL_VERSION)
newversion = urlv.read().strip()
urlv.close() # close before the early return so the handle is not leaked
if newversion == __version__:
downloader.to_screen(u'youtube-dl is up-to-date (' + __version__ + ')')
return
if hasattr(sys, "frozen"): #py2exe
exe = os.path.abspath(filename)
directory = os.path.dirname(exe)
if not os.access(directory, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % directory)
try:
urlh = urllib2.urlopen(UPDATE_URL_EXE)
newcontent = urlh.read()
urlh.close()
with open(exe + '.new', 'wb') as outf:
outf.write(newcontent)
except (IOError, OSError), err:
sys.exit('ERROR: unable to download latest version')
try:
bat = os.path.join(directory, 'youtube-dl-updater.bat')
b = open(bat, 'w')
print >> b, """
echo Updating youtube-dl...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s"
del "%s"
""" %(exe, exe, bat)
b.close()
os.startfile(bat)
except (IOError, OSError), err:
sys.exit('ERROR: unable to overwrite current version')
else:
try:
urlh = urllib2.urlopen(UPDATE_URL)
newcontent = urlh.read()
urlh.close()
except (IOError, OSError), err:
sys.exit('ERROR: unable to download latest version')
try:
with open(filename, 'wb') as outf:
outf.write(newcontent)
except (IOError, OSError), err:
sys.exit('ERROR: unable to overwrite current version')
downloader.to_screen(u'Updated youtube-dl. Restart youtube-dl to use the new version.')
def parseOpts():
def _readOptions(filename_bytes):
try:
optionf = open(filename_bytes)
except IOError:
return [] # silently skip if file is not present
try:
res = []
for l in optionf:
res += shlex.split(l, comments=True)
finally:
optionf.close()
return res
def _format_option_string(option):
''' ('-o', '--option') -> -o, --option METAVAR'''
opts = []
if option._short_opts: opts.append(option._short_opts[0])
if option._long_opts: opts.append(option._long_opts[0])
if len(opts) > 1: opts.insert(1, ', ')
if option.takes_value(): opts.append(' %s' % option.metavar)
return "".join(opts)
def _find_term_columns():
columns = os.environ.get('COLUMNS', None)
if columns:
return int(columns)
try:
sp = subprocess.Popen(['stty', 'size'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out,err = sp.communicate()
return int(out.split()[1])
except:
pass
return None
max_width = 80
max_help_position = 80
# No need to wrap help messages if we're on a wide console
columns = _find_term_columns()
if columns: max_width = columns
fmt = optparse.IndentedHelpFormatter(width=max_width, max_help_position=max_help_position)
fmt.format_option_strings = _format_option_string
kw = {
'version' : __version__,
'formatter' : fmt,
'usage' : '%prog [options] url [url...]',
'conflict_handler' : 'resolve',
}
parser = optparse.OptionParser(**kw)
# option groups
general = optparse.OptionGroup(parser, 'General Options')
selection = optparse.OptionGroup(parser, 'Video Selection')
authentication = optparse.OptionGroup(parser, 'Authentication Options')
video_format = optparse.OptionGroup(parser, 'Video Format Options')
postproc = optparse.OptionGroup(parser, 'Post-processing Options')
filesystem = optparse.OptionGroup(parser, 'Filesystem Options')
verbosity = optparse.OptionGroup(parser, 'Verbosity / Simulation Options')
general.add_option('-h', '--help',
action='help', help='print this help text and exit')
general.add_option('-v', '--version',
action='version', help='print program version and exit')
general.add_option('-U', '--update',
action='store_true', dest='update_self', help='update this program to latest version')
general.add_option('-i', '--ignore-errors',
action='store_true', dest='ignoreerrors', help='continue on download errors', default=False)
general.add_option('-r', '--rate-limit',
dest='ratelimit', metavar='LIMIT', help='download rate limit (e.g. 50k or 44.6m)')
general.add_option('-R', '--retries',
dest='retries', metavar='RETRIES', help='number of retries (default is %default)', default=10)
general.add_option('--dump-user-agent',
action='store_true', dest='dump_user_agent',
help='display the current browser identification', default=False)
general.add_option('--user-agent',
dest='user_agent', help='specify a custom user agent', metavar='UA')
general.add_option('--list-extractors',
action='store_true', dest='list_extractors',
help='List all supported extractors and the URLs they would handle', default=False)
selection.add_option('--playlist-start',
dest='playliststart', metavar='NUMBER', help='playlist video to start at (default is %default)', default=1)
selection.add_option('--playlist-end',
dest='playlistend', metavar='NUMBER', help='playlist video to end at (default is last)', default=-1)
selection.add_option('--match-title', dest='matchtitle', metavar='REGEX',help='download only matching titles (regex or caseless sub-string)')
selection.add_option('--reject-title', dest='rejecttitle', metavar='REGEX',help='skip download for matching titles (regex or caseless sub-string)')
selection.add_option('--max-downloads', metavar='NUMBER', dest='max_downloads', help='Abort after downloading NUMBER files', default=None)
authentication.add_option('-u', '--username',
dest='username', metavar='USERNAME', help='account username')
authentication.add_option('-p', '--password',
dest='password', metavar='PASSWORD', help='account password')
authentication.add_option('-n', '--netrc',
action='store_true', dest='usenetrc', help='use .netrc authentication data', default=False)
video_format.add_option('-f', '--format',
action='store', dest='format', metavar='FORMAT', help='video format code')
video_format.add_option('--all-formats',
action='store_const', dest='format', help='download all available video formats', const='all')
video_format.add_option('--prefer-free-formats',
action='store_true', dest='prefer_free_formats', default=False, help='prefer free video formats unless a specific one is requested')
video_format.add_option('--max-quality',
action='store', dest='format_limit', metavar='FORMAT', help='highest quality format to download')
video_format.add_option('-F', '--list-formats',
action='store_true', dest='listformats', help='list all available formats (currently youtube only)')
video_format.add_option('--write-srt',
action='store_true', dest='writesubtitles',
help='write video closed captions to a .srt file (currently youtube only)', default=False)
video_format.add_option('--srt-lang',
action='store', dest='subtitleslang', metavar='LANG',
help='language of the closed captions to download (optional); use IETF language tags like \'en\'')
verbosity.add_option('-q', '--quiet',
action='store_true', dest='quiet', help='activates quiet mode', default=False)
verbosity.add_option('-s', '--simulate',
action='store_true', dest='simulate', help='do not download the video and do not write anything to disk', default=False)
verbosity.add_option('--skip-download',
action='store_true', dest='skip_download', help='do not download the video', default=False)
verbosity.add_option('-g', '--get-url',
action='store_true', dest='geturl', help='simulate, quiet but print URL', default=False)
verbosity.add_option('-e', '--get-title',
action='store_true', dest='gettitle', help='simulate, quiet but print title', default=False)
verbosity.add_option('--get-thumbnail',
action='store_true', dest='getthumbnail',
help='simulate, quiet but print thumbnail URL', default=False)
verbosity.add_option('--get-description',
action='store_true', dest='getdescription',
help='simulate, quiet but print video description', default=False)
verbosity.add_option('--get-filename',
action='store_true', dest='getfilename',
help='simulate, quiet but print output filename', default=False)
verbosity.add_option('--get-format',
action='store_true', dest='getformat',
help='simulate, quiet but print output format', default=False)
verbosity.add_option('--no-progress',
action='store_true', dest='noprogress', help='do not print progress bar', default=False)
verbosity.add_option('--console-title',
action='store_true', dest='consoletitle',
help='display progress in console titlebar', default=False)
verbosity.add_option('-v', '--verbose',
action='store_true', dest='verbose', help='print various debugging information', default=False)
filesystem.add_option('-t', '--title',
action='store_true', dest='usetitle', help='use title in file name', default=False)
filesystem.add_option('-l', '--literal',
action='store_true', dest='useliteral', help='use literal title in file name', default=False)
filesystem.add_option('-A', '--auto-number',
action='store_true', dest='autonumber',
help='number downloaded files starting from 00000', default=False)
filesystem.add_option('-o', '--output',
dest='outtmpl', metavar='TEMPLATE', help='output filename template. Use %(stitle)s to get the title, %(uploader)s for the uploader name, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(upload_date)s for the upload date (YYYYMMDD), %(extractor)s for the provider (youtube, metacafe, etc), %(id)s for the video id and %% for a literal percent. Use - to output to stdout.')
filesystem.add_option('-a', '--batch-file',
dest='batchfile', metavar='FILE', help='file containing URLs to download (\'-\' for stdin)')
filesystem.add_option('-w', '--no-overwrites',
action='store_true', dest='nooverwrites', help='do not overwrite files', default=False)
filesystem.add_option('-c', '--continue',
action='store_true', dest='continue_dl', help='resume partially downloaded files', default=True)
filesystem.add_option('--no-continue',
action='store_false', dest='continue_dl',
help='do not resume partially downloaded files (restart from beginning)')
filesystem.add_option('--cookies',
dest='cookiefile', metavar='FILE', help='file to read cookies from and dump cookie jar in')
filesystem.add_option('--no-part',
action='store_true', dest='nopart', help='do not use .part files', default=False)
filesystem.add_option('--no-mtime',
action='store_false', dest='updatetime',
help='do not use the Last-modified header to set the file modification time', default=True)
filesystem.add_option('--write-description',
action='store_true', dest='writedescription',
help='write video description to a .description file', default=False)
filesystem.add_option('--write-info-json',
action='store_true', dest='writeinfojson',
help='write video metadata to a .info.json file', default=False)
postproc.add_option('--extract-audio', action='store_true', dest='extractaudio', default=False,
help='convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
postproc.add_option('--audio-format', metavar='FORMAT', dest='audioformat', default='best',
help='"best", "aac", "vorbis", "mp3", "m4a", or "wav"; best by default')
postproc.add_option('--audio-quality', metavar='QUALITY', dest='audioquality', default='5',
help='ffmpeg/avconv audio quality specification, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default 5)')
postproc.add_option('-k', '--keep-video', action='store_true', dest='keepvideo', default=False,
help='keeps the video file on disk after the post-processing; the video is erased by default')
parser.add_option_group(general)
parser.add_option_group(selection)
parser.add_option_group(filesystem)
parser.add_option_group(verbosity)
parser.add_option_group(video_format)
parser.add_option_group(authentication)
parser.add_option_group(postproc)
xdg_config_home = os.environ.get('XDG_CONFIG_HOME')
if xdg_config_home:
userConf = os.path.join(xdg_config_home, 'youtube-dl.conf')
else:
userConf = os.path.join(os.path.expanduser('~'), '.config', 'youtube-dl.conf')
argv = _readOptions('/etc/youtube-dl.conf') + _readOptions(userConf) + sys.argv[1:]
opts, args = parser.parse_args(argv)
return parser, opts, args
def gen_extractors():
""" Return a list of an instance of every supported extractor.
The order does matter; the first extractor matched is the one handling the URL.
"""
return [
YoutubePlaylistIE(),
YoutubeUserIE(),
YoutubeSearchIE(),
YoutubeIE(),
MetacafeIE(),
DailymotionIE(),
GoogleIE(),
GoogleSearchIE(),
PhotobucketIE(),
YahooIE(),
YahooSearchIE(),
DepositFilesIE(),
FacebookIE(),
BlipTVUserIE(),
BlipTVIE(),
VimeoIE(),
MyVideoIE(),
ComedyCentralIE(),
EscapistIE(),
CollegeHumorIE(),
XVideosIE(),
SoundcloudIE(),
InfoQIE(),
MixcloudIE(),
StanfordOpenClassroomIE(),
MTVIE(),
YoukuIE(),
XNXXIE(),
GooglePlusIE(),
GenericIE()
]
def _real_main():
parser, opts, args = parseOpts()
# Open appropriate CookieJar
if opts.cookiefile is None:
jar = cookielib.CookieJar()
else:
try:
jar = cookielib.MozillaCookieJar(opts.cookiefile)
if os.path.isfile(opts.cookiefile) and os.access(opts.cookiefile, os.R_OK):
jar.load()
except (IOError, OSError), err:
sys.exit(u'ERROR: unable to open cookie file')
# Set user agent
if opts.user_agent is not None:
std_headers['User-Agent'] = opts.user_agent
# Dump user agent
if opts.dump_user_agent:
print std_headers['User-Agent']
sys.exit(0)
# Batch file verification
batchurls = []
if opts.batchfile is not None:
try:
if opts.batchfile == '-':
batchfd = sys.stdin
else:
batchfd = open(opts.batchfile, 'r')
batchurls = batchfd.readlines()
batchurls = [x.strip() for x in batchurls]
batchurls = [x for x in batchurls if len(x) > 0 and not re.search(r'^[#/;]', x)]
except IOError:
sys.exit(u'ERROR: batch file could not be read')
all_urls = batchurls + args
all_urls = map(lambda url: url.strip(), all_urls)
# General configuration
cookie_processor = urllib2.HTTPCookieProcessor(jar)
proxy_handler = urllib2.ProxyHandler()
opener = urllib2.build_opener(proxy_handler, cookie_processor, YoutubeDLHandler())
urllib2.install_opener(opener)
socket.setdefaulttimeout(300) # 5 minutes should be enough (famous last words)
extractors = gen_extractors()
if opts.list_extractors:
for ie in extractors:
print(ie.IE_NAME)
matchedUrls = filter(lambda url: ie.suitable(url), all_urls)
all_urls = filter(lambda url: url not in matchedUrls, all_urls)
for mu in matchedUrls:
print(u' ' + mu)
sys.exit(0)
# Conflicting, missing and erroneous options
if opts.usenetrc and (opts.username is not None or opts.password is not None):
parser.error(u'using .netrc conflicts with giving username/password')
if opts.password is not None and opts.username is None:
parser.error(u'account username missing')
if opts.outtmpl is not None and (opts.useliteral or opts.usetitle or opts.autonumber):
parser.error(u'using output template conflicts with using title, literal title or auto number')
if opts.usetitle and opts.useliteral:
parser.error(u'using title conflicts with using literal title')
if opts.username is not None and opts.password is None:
opts.password = getpass.getpass(u'Type account password and press return:')
if opts.ratelimit is not None:
numeric_limit = FileDownloader.parse_bytes(opts.ratelimit)
if numeric_limit is None:
parser.error(u'invalid rate limit specified')
opts.ratelimit = numeric_limit
if opts.retries is not None:
try:
opts.retries = long(opts.retries)
except (TypeError, ValueError), err:
parser.error(u'invalid retry count specified')
try:
opts.playliststart = int(opts.playliststart)
if opts.playliststart <= 0:
raise ValueError(u'Playlist start must be positive')
except (TypeError, ValueError), err:
parser.error(u'invalid playlist start number specified')
try:
opts.playlistend = int(opts.playlistend)
if opts.playlistend != -1 and (opts.playlistend <= 0 or opts.playlistend < opts.playliststart):
raise ValueError(u'Playlist end must be greater than playlist start')
except (TypeError, ValueError), err:
parser.error(u'invalid playlist end number specified')
if opts.extractaudio:
if opts.audioformat not in ['best', 'aac', 'mp3', 'vorbis', 'm4a', 'wav']:
parser.error(u'invalid audio format specified')
if opts.audioquality:
opts.audioquality = opts.audioquality.strip('k').strip('K')
if not opts.audioquality.isdigit():
parser.error(u'invalid audio quality specified')
# File downloader
fd = FileDownloader({
'usenetrc': opts.usenetrc,
'username': opts.username,
'password': opts.password,
'quiet': (opts.quiet or opts.geturl or opts.gettitle or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat),
'forceurl': opts.geturl,
'forcetitle': opts.gettitle,
'forcethumbnail': opts.getthumbnail,
'forcedescription': opts.getdescription,
'forcefilename': opts.getfilename,
'forceformat': opts.getformat,
'simulate': opts.simulate,
'skip_download': (opts.skip_download or opts.simulate or opts.geturl or opts.gettitle or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat),
'format': opts.format,
'format_limit': opts.format_limit,
'listformats': opts.listformats,
'outtmpl': ((opts.outtmpl is not None and opts.outtmpl.decode(preferredencoding()))
or (opts.format == '-1' and opts.usetitle and u'%(stitle)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and opts.useliteral and u'%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and u'%(id)s-%(format)s.%(ext)s')
or (opts.usetitle and opts.autonumber and u'%(autonumber)s-%(stitle)s-%(id)s.%(ext)s')
or (opts.useliteral and opts.autonumber and u'%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and u'%(stitle)s-%(id)s.%(ext)s')
or (opts.useliteral and u'%(title)s-%(id)s.%(ext)s')
or (opts.autonumber and u'%(autonumber)s-%(id)s.%(ext)s')
or u'%(id)s.%(ext)s'),
'ignoreerrors': opts.ignoreerrors,
'ratelimit': opts.ratelimit,
'nooverwrites': opts.nooverwrites,
'retries': opts.retries,
'continuedl': opts.continue_dl,
'noprogress': opts.noprogress,
'playliststart': opts.playliststart,
'playlistend': opts.playlistend,
'logtostderr': opts.outtmpl == '-',
'consoletitle': opts.consoletitle,
'nopart': opts.nopart,
'updatetime': opts.updatetime,
'writedescription': opts.writedescription,
'writeinfojson': opts.writeinfojson,
'writesubtitles': opts.writesubtitles,
'subtitleslang': opts.subtitleslang,
'matchtitle': opts.matchtitle,
'rejecttitle': opts.rejecttitle,
'max_downloads': opts.max_downloads,
'prefer_free_formats': opts.prefer_free_formats,
'verbose': opts.verbose,
})
if opts.verbose:
fd.to_screen(u'[debug] Proxy map: ' + str(proxy_handler.proxies))
for extractor in extractors:
fd.add_info_extractor(extractor)
# PostProcessors
if opts.extractaudio:
fd.add_post_processor(FFmpegExtractAudioPP(preferredcodec=opts.audioformat, preferredquality=opts.audioquality, keepvideo=opts.keepvideo))
# Update version
if opts.update_self:
updateSelf(fd, sys.argv[0])
# Maybe do nothing
if len(all_urls) < 1:
if not opts.update_self:
parser.error(u'you must provide at least one URL')
else:
sys.exit()
try:
retcode = fd.download(all_urls)
except MaxDownloadsReached:
fd.to_screen(u'--max-downloads limit reached, aborting.')
retcode = 101
# Dump cookie jar if requested
if opts.cookiefile is not None:
try:
jar.save()
except (IOError, OSError), err:
sys.exit(u'ERROR: unable to save cookie jar')
sys.exit(retcode)
def main():
try:
_real_main()
except DownloadError:
sys.exit(1)
except SameFileError:
sys.exit(u'ERROR: fixed output name but more than one file to download')
except KeyboardInterrupt:
sys.exit(u'\nERROR: Interrupted by user')
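
The batch-file handling in `_real_main()` strips every line, then drops empty lines and lines that start with `#`, `/` or `;`. Below is a minimal standalone sketch of that filter. It is written for Python 3 so it can be run as-is, whereas the code in this diff targets Python 2. The function name `filter_batch_lines` and the sample URLs are hypothetical and not part of youtube-dl.

```python
import re


def filter_batch_lines(lines):
    """Return the URLs in `lines`, skipping blank lines and comment lines."""
    stripped = [line.strip() for line in lines]
    # A line counts as a comment when it starts with '#', '/' or ';'.
    return [s for s in stripped if s and not re.search(r'^[#/;]', s)]


if __name__ == '__main__':
    sample = [
        '  http://example.com/a \n',
        '# a comment\n',
        '\n',
        '; another comment\n',
        'http://example.com/b\n',
    ]
    for url in filter_batch_lines(sample):
        print(url)
```

Stripping before filtering means an indented comment line is still recognised as a comment.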

7
youtube_dl/__main__.py Executable file

@@ -0,0 +1,7 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import __init__

if __name__ == '__main__':
    __init__.main()

354
youtube_dl/utils.py Normal file

@@ -0,0 +1,354 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import gzip
import htmlentitydefs
import HTMLParser
import locale
import os
import re
import sys
import zlib
import urllib2
import email.utils
import json
try:
    import cStringIO as StringIO
except ImportError:
    import StringIO

std_headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-us,en;q=0.5',
}
def preferredencoding():
    """Get preferred encoding.

    Returns the best encoding scheme for the system, based on
    locale.getpreferredencoding() and some further tweaks.
    """
    def yield_preferredencoding():
        try:
            pref = locale.getpreferredencoding()
            u'TEST'.encode(pref)
        except:
            pref = 'UTF-8'
        while True:
            yield pref
    return yield_preferredencoding().next()
def htmlentity_transform(matchobj):
    """Transforms an HTML entity to a Unicode character.

    This function receives a match object and is intended to be used with
    the re.sub() function.
    """
    entity = matchobj.group(1)

    # Known non-numeric HTML entity
    if entity in htmlentitydefs.name2codepoint:
        return unichr(htmlentitydefs.name2codepoint[entity])

    # Numeric character reference: decimal (#233) or hexadecimal (#xe9).
    # The hexadecimal branch must accept the digits a-f, which \d does not.
    mobj = re.match(ur'(?u)#(x[0-9a-fA-F]+|\d+)', entity)
    if mobj is not None:
        numstr = mobj.group(1)
        if numstr.startswith(u'x'):
            base = 16
            numstr = u'0%s' % numstr
        else:
            base = 10
        return unichr(long(numstr, base))

    # Unknown entity in name, return its literal representation
    return (u'&%s;' % entity)
HTMLParser.locatestarttagend = re.compile(r"""<[a-zA-Z][-.a-zA-Z0-9:_]*(?:\s+(?:(?<=['"\s])[^\s/>][^\s/=>]*(?:\s*=+\s*(?:'[^']*'|"[^"]*"|(?!['"])[^>\s]*))?\s*)*)?\s*""", re.VERBOSE) # backport bugfix
class IDParser(HTMLParser.HTMLParser):
    """Modified HTMLParser that isolates a tag with the specified id"""
    def __init__(self, id):
        self.id = id
        self.result = None
        self.started = False
        self.depth = {}
        self.html = None
        self.watch_startpos = False
        self.error_count = 0
        HTMLParser.HTMLParser.__init__(self)

    def error(self, message):
        print >> sys.stderr, self.getpos()
        if self.error_count > 10 or self.started:
            raise HTMLParser.HTMLParseError(message, self.getpos())
        self.rawdata = '\n'.join(self.html.split('\n')[self.getpos()[0]:]) # skip one line
        self.error_count += 1
        self.goahead(1)

    def loads(self, html):
        self.html = html
        self.feed(html)
        self.close()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.started:
            self.find_startpos(None)
        if 'id' in attrs and attrs['id'] == self.id:
            self.result = [tag]
            self.started = True
            self.watch_startpos = True
        if self.started:
            if not tag in self.depth: self.depth[tag] = 0
            self.depth[tag] += 1

    def handle_endtag(self, tag):
        if self.started:
            if tag in self.depth: self.depth[tag] -= 1
            if self.depth[self.result[0]] == 0:
                self.started = False
                self.result.append(self.getpos())

    def find_startpos(self, x):
        """Needed to put the start position of the result (self.result[1])
        after the opening tag with the requested id"""
        if self.watch_startpos:
            self.watch_startpos = False
            self.result.append(self.getpos())
    handle_entityref = handle_charref = handle_data = handle_comment = \
    handle_decl = handle_pi = unknown_decl = find_startpos

    def get_result(self):
        if self.result is None: return None
        if len(self.result) != 3: return None
        lines = self.html.split('\n')
        lines = lines[self.result[1][0]-1:self.result[2][0]]
        lines[0] = lines[0][self.result[1][1]:]
        if len(lines) == 1:
            lines[-1] = lines[-1][:self.result[2][1]-self.result[1][1]]
        lines[-1] = lines[-1][:self.result[2][1]]
        return '\n'.join(lines).strip()
def get_element_by_id(id, html):
    """Return the content of the tag with the specified id in the passed HTML document"""
    parser = IDParser(id)
    try:
        parser.loads(html)
    except HTMLParser.HTMLParseError:
        pass
    return parser.get_result()

def clean_html(html):
    """Clean an HTML snippet into a readable string"""
    # Newline vs <br />
    html = html.replace('\n', ' ')
    html = re.sub(r'\s*<\s*br\s*/?\s*>\s*', '\n', html)
    # Strip html tags
    html = re.sub('<.*?>', '', html)
    # Replace html entities
    html = unescapeHTML(html)
    return html
def sanitize_open(filename, open_mode):
    """Try to open the given filename, and slightly tweak it if this fails.

    Attempts to open the given filename. If this fails, it tries to change
    the filename slightly, step by step, until it's either able to open it
    or it fails and raises a final exception, like the standard open()
    function.

    It returns the tuple (stream, definitive_file_name).
    """
    try:
        if filename == u'-':
            if sys.platform == 'win32':
                import msvcrt
                msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
            return (sys.stdout, filename)
        stream = open(encodeFilename(filename), open_mode)
        return (stream, filename)
    except (IOError, OSError), err:
        # In case of error, try to remove win32 forbidden chars
        filename = re.sub(ur'[/<>:"\|\?\*]', u'#', filename)

        # An exception here should be caught in the caller
        stream = open(encodeFilename(filename), open_mode)
        return (stream, filename)
def timeconvert(timestr):
    """Convert RFC 2822 defined time string into system timestamp"""
    timestamp = None
    timetuple = email.utils.parsedate_tz(timestr)
    if timetuple is not None:
        timestamp = email.utils.mktime_tz(timetuple)
    return timestamp

def sanitize_filename(s):
    """Sanitizes a string so it could be used as part of a filename."""
    def replace_insane(char):
        if char in u' .\\/|?*<>:"' or ord(char) < 32:
            return '_'
        return char
    return u''.join(map(replace_insane, s)).strip('_')

def orderedSet(iterable):
    """ Remove all duplicates from the input iterable """
    res = []
    for el in iterable:
        if el not in res:
            res.append(el)
    return res

def unescapeHTML(s):
    """
    @param s a string (of type unicode)
    """
    assert type(s) == type(u'')

    result = re.sub(ur'(?u)&(.+?);', htmlentity_transform, s)
    return result
def encodeFilename(s):
    """
    @param s The name of the file (of type unicode)
    """

    assert type(s) == type(u'')

    if sys.platform == 'win32' and sys.getwindowsversion()[0] >= 5:
        # Pass u'' directly to use Unicode APIs on Windows 2000 and up
        # (Detecting Windows NT 4 is tricky because 'major >= 4' would
        # match Windows 9x series as well. Besides, NT 4 is obsolete.)
        return s
    else:
        return s.encode(sys.getfilesystemencoding(), 'ignore')
class DownloadError(Exception):
    """Download Error exception.

    This exception may be thrown by FileDownloader objects if they are not
    configured to continue on errors. They will contain the appropriate
    error message.
    """
    pass

class SameFileError(Exception):
    """Same File exception.

    This exception will be thrown by FileDownloader objects if they detect
    multiple files would have to be downloaded to the same file on disk.
    """
    pass

class PostProcessingError(Exception):
    """Post Processing exception.

    This exception may be raised by PostProcessor's .run() method to
    indicate an error in the postprocessing task.
    """
    pass

class MaxDownloadsReached(Exception):
    """ --max-downloads limit has been reached. """
    pass

class UnavailableVideoError(Exception):
    """Unavailable Format exception.

    This exception will be thrown when a video is requested
    in a format that is not available for that video.
    """
    pass

class ContentTooShortError(Exception):
    """Content Too Short exception.

    This exception may be raised by FileDownloader objects when a file they
    download is too small for what the server announced first, indicating
    the connection was probably interrupted.
    """
    # Both in bytes
    downloaded = None
    expected = None

    def __init__(self, downloaded, expected):
        self.downloaded = downloaded
        self.expected = expected

class Trouble(Exception):
    """Trouble helper exception

    This is an exception to be handled with
    FileDownloader.trouble
    """
class YoutubeDLHandler(urllib2.HTTPHandler):
    """Handler for HTTP requests and responses.

    This class, when installed with an OpenerDirector, automatically adds
    the standard headers to every HTTP request and handles gzipped and
    deflated responses from web servers. If compression is to be avoided in
    a particular request, the original request in the program code only has
    to include the HTTP header "Youtubedl-No-Compression", which will be
    removed before making the real request.

    Part of this code was copied from:

    http://techknack.net/python-urllib2-handlers/

    Andrew Rowls, the author of that code, agreed to release it to the
    public domain.
    """

    @staticmethod
    def deflate(data):
        try:
            return zlib.decompress(data, -zlib.MAX_WBITS)
        except zlib.error:
            return zlib.decompress(data)

    @staticmethod
    def addinfourl_wrapper(stream, headers, url, code):
        if hasattr(urllib2.addinfourl, 'getcode'):
            return urllib2.addinfourl(stream, headers, url, code)
        ret = urllib2.addinfourl(stream, headers, url)
        ret.code = code
        return ret

    def http_request(self, req):
        for h in std_headers:
            if h in req.headers:
                del req.headers[h]
            req.add_header(h, std_headers[h])
        if 'Youtubedl-no-compression' in req.headers:
            if 'Accept-encoding' in req.headers:
                del req.headers['Accept-encoding']
            del req.headers['Youtubedl-no-compression']
        return req

    def http_response(self, req, resp):
        old_resp = resp
        # gzip
        if resp.headers.get('Content-encoding', '') == 'gzip':
            gz = gzip.GzipFile(fileobj=StringIO.StringIO(resp.read()), mode='r')
            resp = self.addinfourl_wrapper(gz, old_resp.headers, old_resp.url, old_resp.code)
            resp.msg = old_resp.msg
        # deflate
        if resp.headers.get('Content-encoding', '') == 'deflate':
            gz = StringIO.StringIO(self.deflate(resp.read()))
            resp = self.addinfourl_wrapper(gz, old_resp.headers, old_resp.url, old_resp.code)
            resp.msg = old_resp.msg
        return resp
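
The `deflate()` helper in `YoutubeDLHandler` first tries to decode the body as a raw deflate stream and falls back to the zlib-wrapped format when that fails, presumably because servers are not consistent about which of the two they send for `Content-Encoding: deflate`. Below is a minimal standalone sketch of that fallback. It is written for Python 3 so it can be run as-is, whereas the code in this diff targets Python 2. The function name `inflate` and the sample payload are hypothetical.

```python
import zlib


def inflate(data):
    """Decompress `data`, trying raw deflate first and zlib-wrapped second."""
    try:
        # A negative wbits value tells zlib to expect a raw deflate stream
        # without the zlib header and checksum.
        return zlib.decompress(data, -zlib.MAX_WBITS)
    except zlib.error:
        # Fall back to the zlib-wrapped format.
        return zlib.decompress(data)


if __name__ == '__main__':
    payload = b'hello hello hello'

    # zlib-wrapped stream, as produced by zlib.compress().
    wrapped = zlib.compress(payload)

    # Raw deflate stream, produced with a negative wbits value.
    compressor = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS)
    raw = compressor.compress(payload) + compressor.flush()

    print(inflate(wrapped) == payload)
    print(inflate(raw) == payload)
```

Trying the raw form first keeps the common case to a single call. The fallback only costs a second attempt when the first one raises `zlib.error`.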