Compare commits

1262 Commits

Author SHA1 Message Date
Philipp Hagemeister
fd87ff26b9 release 2013.07.11 2013-07-11 21:04:59 +02:00
Jaime Marquínez Ferrándiz
85347e1cb6 YoutubeIE: a new algo for length 83 2013-07-11 20:21:45 +02:00
Jaime Marquínez Ferrándiz
41897817cc GametrailersIE: support multipart videos
Use xml.etree.ElementTree instead of re when possible
2013-07-11 18:24:53 +02:00
Philipp Hagemeister
45ff2d51d0 [brightcove] add import 2013-07-11 16:31:29 +02:00
Philipp Hagemeister
5de3ece225 [brightcove] fix on Python 2.6 2013-07-11 16:16:02 +02:00
Philipp Hagemeister
df50a41289 [arte] Fix on 2.6 2013-07-11 16:12:16 +02:00
Philipp Hagemeister
59ae56fad5 Add helper function find_path_attr 2013-07-11 16:12:08 +02:00
Philipp Hagemeister
690e872c51 Remove video_result helper method
Calling it was more complex than actually including the type in the video info
2013-07-11 12:12:30 +02:00
Philipp Hagemeister
81082e046e [ehow] improve minor bits 2013-07-11 12:11:00 +02:00
Philipp Hagemeister
3fa9550837 Merge remote-tracking branch 'yasoob/master' 2013-07-11 12:02:16 +02:00
M.Yasoob Khalid
b1082f01a6 added test for ehow 2013-07-11 14:30:25 +05:00
M.Yasoob Khalid
f35b84c807 added an IE for Ehow videos 2013-07-11 14:25:14 +05:00
Jaime Marquínez Ferrándiz
117adb0f0f GenericIE: detect more Brightcove videos
On some sites "class" contains more than BrightcoveExperience
2013-07-11 00:25:38 +02:00
Jaime Marquínez Ferrándiz
abb285fb1b BrightcoveIE: add support for playlists 2013-07-11 00:04:33 +02:00
Jaime Marquínez Ferrándiz
a431154706 Set the playlist_index and playlist fields for already resolved video results. 2013-07-10 23:36:30 +02:00
Jaime Marquínez Ferrándiz
cfe50f04ed GenericIE: Detect videos from Brightcove
Brightcove video info is usually found in an <object class="BrightcoveExperience"></object> node; this node is passed to a new method of BrightcoveIE that builds a url to extract the video.
2013-07-10 17:49:11 +02:00
Jaime Marquínez Ferrándiz
a7055eb956 YoutubeIE: show a more meaningful error when it finds an rtmpe download (related #343) 2013-07-10 14:35:11 +02:00
Philipp Hagemeister
0a1be1e997 release 2013.07.10 2013-07-10 11:36:11 +02:00
Jaime Marquínez Ferrándiz
c93898dae9 YoutubeIE: new algo for length 83 (closes #1017 and closes #1016) 2013-07-10 10:44:04 +02:00
Jaime Marquínez Ferrándiz
ebdf2af727 GameSpotIE: support more urls and download videos in the best quality 2013-07-09 20:07:52 +02:00
Jaime Marquínez Ferrándiz
c108eb73cc YoutubeIE: Fix vevo explicit videos (closes #956)
When an age restricted video is detected it simulates accessing the video from www.youtube.com/v/{video_id}
2013-07-09 15:43:44 +02:00
Jaime Marquínez Ferrándiz
3a1375dacf VeohIE: remove debug logging 2013-07-09 11:11:55 +02:00
Jaime Marquínez Ferrándiz
41bece30b4 DotsubIE: simplify and extract the upload date
Do not declare variables for fields in the info dictionary.
2013-07-08 22:40:42 +02:00
Jaime Marquínez Ferrándiz
16ea58cbda Merge pull request #1009 from yasoob/master
Added an IE and test for dotsub.com videos. ( closes #1008 )
2013-07-08 22:21:06 +02:00
Jaime Marquínez Ferrándiz
99e350d902 Add VeohIE (closes #1006) 2013-07-08 22:02:23 +02:00
M.Yasoob Khalid
13e06d298c added an IE and test for dotsub. 2013-07-09 00:05:52 +05:00
Jaime Marquínez Ferrándiz
81f0259b9e YoutubeSubscriptionsIE: raise an error if there's no login information. 2013-07-08 11:24:11 +02:00
Jaime Marquínez Ferrándiz
fefcb5d314 YoutubeIE: use the new method in the base IE for getting the login info 2013-07-08 11:24:11 +02:00
Philipp Hagemeister
345b0c9b46 Remove dead code 2013-07-08 02:13:50 +02:00
Philipp Hagemeister
20c3893f0e Do not redefine variables in list comprehensions 2013-07-08 02:12:20 +02:00
Philipp Hagemeister
29293c1e09 release 2013.07.08.1 2013-07-08 02:05:22 +02:00
Philipp Hagemeister
5fe3a3c3fb [archive.org] Add extractor (Fixes #1003) 2013-07-08 02:05:02 +02:00
Philipp Hagemeister
b04621d155 release 2013.07.08 2013-07-08 01:29:16 +02:00
Philipp Hagemeister
b227060388 [arte] Always look for the JSON URL (Fixes #1002) 2013-07-08 01:28:19 +02:00
Philipp Hagemeister
d93e4dcbb7 Merge branch 'master' of github.com:rg3/youtube-dl 2013-07-08 01:15:19 +02:00
Philipp Hagemeister
73e79f2a1b [3sat] Add support (Fixes #1001) 2013-07-08 01:13:55 +02:00
Jaime Marquínez Ferrándiz
fc79158de2 VimeoIE: authentication support (closes #885) and add a method in the base InfoExtractor to get the login info 2013-07-07 23:24:34 +02:00
Jaime Marquínez Ferrándiz
7763b04e5f YoutubeIE: extract the thumbnail in the best possible quality 2013-07-07 21:21:15 +02:00
Philipp Hagemeister
9d7b44b4cc release 2013.07.07.01 2013-07-07 17:13:56 +02:00
Philipp Hagemeister
897f36d179 [youtube:subscriptions] Use colon for differentiation of shortcuts 2013-07-07 17:13:26 +02:00
Philipp Hagemeister
94c3637f6d release 2013.07.07 2013-07-07 16:55:06 +02:00
Jaime Marquínez Ferrándiz
04cc96173c [youtube] Add an extractor for the subscriptions feed (closes #498)
It can be downloaded using the ytsubscriptions keyword.
It needs the login information.
2013-07-07 13:58:23 +02:00
Jaime Marquínez Ferrándiz
fbaaad49d7 Add BrightcoveIE (closes #832)
It only accepts the urls that are used for embedding the video; it doesn't search generic webpages to find Brightcove videos
2013-07-05 21:31:50 +02:00
Jaime Marquínez Ferrándiz
b29f3b250d DailymotionIE: extract thumbnail 2013-07-05 19:39:37 +02:00
Philipp Hagemeister
fa343954d4 release 2013.07.05 2013-07-05 14:46:24 +02:00
Jaime Marquínez Ferrándiz
2491f5898e DailymotionIE: simplify the extraction of the title and remove an unused assignment of video_uploader 2013-07-05 14:20:15 +02:00
Jaime Marquínez Ferrándiz
b27c856fbc Dailymotion: fix the download of the video in the max quality (closes #986) 2013-07-05 14:15:26 +02:00
Jaime Marquínez Ferrándiz
9941ceb331 ArteTVIE: support emission urls that don't contain the video id
Like http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
2013-07-05 12:56:41 +02:00
Philipp Hagemeister
c536d38059 release 2013.07.04 2013-07-04 18:07:34 +02:00
Philipp Hagemeister
8de64cac98 [arte] Fix language selection (Fixes #988) 2013-07-04 18:07:03 +02:00
Philipp Hagemeister
6d6d286539 Merge branch 'master' of github.com:rg3/youtube-dl 2013-07-03 16:36:42 +02:00
Philipp Hagemeister
5d2eac9eba [auengine] Add tests (Fixes #985) 2013-07-03 16:36:36 +02:00
Jaime Marquínez Ferrándiz
9826925a20 ArteTVIE: extract the video with the correct language
Some urls from the French version of the page could download the German version.

Also instead of extracting the json url from the webpage, build it to skip the download
2013-07-02 17:34:40 +02:00
Jaime Marquínez Ferrándiz
24a267b562 TudouIE: extract all the segments of the video and download the best quality (closes #975)
Also simplify the extraction of the id from the url a bit and write the title directly for the test video
2013-07-02 12:38:24 +02:00
Jaime Marquínez Ferrándiz
d4da3d6116 BlipTVIE: download the video in the best quality (closes #215) 2013-07-02 10:40:23 +02:00
Philipp Hagemeister
d5a62e4f5f release 2013.07.02 2013-07-02 09:14:09 +02:00
Philipp Hagemeister
9a82b2389f Do not show bug report for errors that are to be expected (Closes #973) 2013-07-02 08:40:21 +02:00
Philipp Hagemeister
8dba13f7e8 Squelch git not found exception (#973) 2013-07-02 08:36:20 +02:00
Philipp Hagemeister
deacef651f Improve formatting 2013-07-02 08:35:39 +02:00
Philipp Hagemeister
2e1b3afeca README.md: Fix markup and some of the text.
(Originally from Rogério Brito <rbrito@ime.usp.br>)
2013-07-02 07:39:54 +02:00
Rogério Brito
652e776893 setup: PEP-8 fixes.
Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-07-01 23:17:48 -03:00
Rogério Brito
d055fe4cb0 setup: cosmetics: Add/remove some whitespace for readability.
This also fixes some long lines.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-07-01 23:17:48 -03:00
Rogério Brito
131842bb0b setup: Move pseudo-docstring to a proper comment.
A string statement is not a docstring if it doesn't occur right at the top
of modules, functions, class definitions etc.

This patch fixes it.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-07-01 23:17:48 -03:00
Jaime Marquínez Ferrándiz
59fc531f78 Add InstagramIE (related #904) 2013-07-01 21:08:54 +02:00
Jaime Marquínez Ferrándiz
5c44c15438 GenericIE: match titles that spread across multiple lines (related #904) 2013-07-01 20:50:50 +02:00
Philipp Hagemeister
62067cb9b8 Shorten --list-extractor-descriptions to --extractor-descriptions 2013-07-01 18:59:29 +02:00
Philipp Hagemeister
0f81866329 Add --list-extractor-descriptions (human-readable list of IEs) 2013-07-01 18:52:19 +02:00
Philipp Hagemeister
2db67bc0f4 Merge branch 'master' of github.com:rg3/youtube-dl 2013-07-01 18:21:36 +02:00
Philipp Hagemeister
7dba9cd039 Sort IEs alphabetically in --list-extractors 2013-07-01 18:21:29 +02:00
Jaime Marquínez Ferrándiz
75dff0eef7 [youtube]: add YoutubeShowIE (closes #14)
It just extracts the playlists urls for each season
2013-07-01 17:59:28 +02:00
Jaime Marquínez Ferrándiz
d828f3a550 YoutubeIE: use a negative index when accessing the last element of the format list 2013-07-01 17:19:33 +02:00
Jaime Marquínez Ferrándiz
bcd6e4bd07 YoutubeIE: extract the correct video id for movie URLs (closes #597) 2013-07-01 16:51:18 +02:00
Philipp Hagemeister
53936f3d57 Merge remote-tracking branch 'yasoob/master'
Conflicts:
	youtube_dl/extractor/__init__.py
2013-07-01 15:19:45 +02:00
Philipp Hagemeister
0beb3add18 Separate downloader options 2013-07-01 14:53:25 +02:00
Philipp Hagemeister
f9bd64c098 [update] Add package manager to error message (#959) 2013-07-01 02:36:49 +02:00
Philipp Hagemeister
d7f44b5bdb [youtube] Warn if URL is most likely wrong (#969) 2013-07-01 02:29:29 +02:00
Philipp Hagemeister
48bfb5f238 [instagram] Fix title 2013-06-30 14:07:32 +02:00
Jaime Marquínez Ferrándiz
97ebe8dcaf StatigramIE: update the title of the test video 2013-06-30 13:57:57 +02:00
Jaime Marquínez Ferrándiz
d4409747ba TumblrIE: update test
The video (once more) is no longer available
2013-06-30 13:52:20 +02:00
Jaime Marquínez Ferrándiz
37b6a6617f ArteTvIE: support videos from videos.arte.tv
Each source of videos has a different extraction process; each is handled in a separate method of the extractor.
Changed the extension of videos from mp4 to flv.
2013-06-30 13:38:22 +02:00
Philipp Hagemeister
ca1c9cfe11 release 2013.06.34.4 2013-06-29 20:22:08 +02:00
Philipp Hagemeister
adeb4d7469 Merge remote-tracking branch 'origin/master' 2013-06-29 20:21:13 +02:00
Philipp Hagemeister
50587ee8ec [vimeo] fix detection for http://vimeo.com/groups/124584/videos/24973060 2013-06-29 20:20:20 +02:00
Jaime Marquínez Ferrándiz
8244288dfe WatIE: support videos divided in multiple parts (closes #222 and #659)
The id for the videos is now the full id, not the one in the webpage url.
Also extract more information: description, view_count and upload_date
2013-06-29 18:22:03 +02:00
Philipp Hagemeister
6ffe72835a [tutv] Fix URL type (for Python 3) 2013-06-29 17:42:15 +02:00
Philipp Hagemeister
8ba5e990a5 release 2013.06.34.3 2013-06-29 17:30:11 +02:00
Philipp Hagemeister
9afb1afcc6 [tutv] Add IE (Fixes #965) 2013-06-29 17:29:40 +02:00
Philipp Hagemeister
0e21093a8f Merge branch 'master' of github.com:rg3/youtube-dl 2013-06-29 16:57:34 +02:00
Philipp Hagemeister
9c5cd0948f [ted] Fix test checksum 2013-06-29 16:45:56 +02:00
Jaime Marquínez Ferrándiz
1083705fe8 Update the default output template in the README
It was changed in 08b2ac745a
2013-06-29 16:35:28 +02:00
Philipp Hagemeister
f3d294617f Document view_count (Closes #963) 2013-06-29 16:32:28 +02:00
Philipp Hagemeister
de33a30858 Merge pull request #962 from jaimeMF/TF1
Add TF1IE
2013-06-29 07:30:49 -07:00
M.Yasoob Khalid
887a227953 added an IE and test for traileraddict.com 2013-06-29 19:17:27 +05:00
Jaime Marquínez Ferrándiz
705f6f35bc Move TF1IE to its own file 2013-06-29 15:18:19 +02:00
Jaime Marquínez Ferrándiz
e648b22dbd Add TF1IE 2013-06-29 15:07:25 +02:00
Filippo Valsorda
257a2501fa keep track of the dates and html5player versions of working YT signature algos 2013-06-29 01:05:36 +02:00
Jaime Marquínez Ferrándiz
99afb3ddd4 Add WatIE 2013-06-28 22:01:47 +02:00
Philipp Hagemeister
a3c776203f Rewrote error message a bit to clarify 2013-06-28 18:53:31 +02:00
M.Yasoob Ullah Khalid
53f350c165 Changed the error message.
I changed the ExtractorError from ```msg = msg + u'; please report this issue on http://yt-dl.org/bug'``` to ```msg = msg + u'; please report this issue on http://yt-dl.org/bug with the complete output by running the same command with --verbose flag'```
Hopefully this will tell the users to report bugs with the complete output.
2013-06-28 18:51:54 +02:00
M.Yasoob Khalid
f46d31f948 Add RingTVIE (Thanks @yasoob) 2013-06-28 18:51:00 +02:00
M.Yasoob Khalid
bf64ff72db Added an IE for gamespot. Although gamespot allows downloading, it is only available to registered users. With this IE no registration is required. 2013-06-28 18:42:45 +02:00
Jaime Marquínez Ferrándiz
bc2884afc1 Print which IE is being skipped in test_download 2013-06-28 11:20:00 +02:00
Jaime Marquínez Ferrándiz
023fa8c440 Add function add_default_info_extractors to YoutubeDL
It adds to the list the IEs returned by gen_extractors
2013-06-27 23:51:06 +02:00
Philipp Hagemeister
427023a1e6 Merge branch 'generate-ie-list' 2013-06-27 22:44:02 +02:00
Philipp Hagemeister
a924876fed Make sure that IEs only accept their own URLs 2013-06-27 21:25:51 +02:00
Philipp Hagemeister
3f223f7b2e [tumblr] Fix title 2013-06-27 21:19:42 +02:00
Philipp Hagemeister
fc2c063e1e Move testcase generator to helper 2013-06-27 21:15:16 +02:00
Philipp Hagemeister
20db33e299 Make sure SoundcloudIE does not match soundcloud sets 2013-06-27 21:11:23 +02:00
Philipp Hagemeister
c0109aa497 release 2013.06.34.2 2013-06-27 20:50:57 +02:00
Philipp Hagemeister
ba7a1de04d Credit @gitprojs for auengine 2013-06-27 20:50:34 +02:00
Philipp Hagemeister
4269e78a80 Merge branch 'master' of github.com:rg3/youtube-dl 2013-06-27 20:47:03 +02:00
Philipp Hagemeister
6f5ac90cf3 Move tests to the IE definitions 2013-06-27 20:46:46 +02:00
Philipp Hagemeister
de282fc217 Merge pull request #954 from gitprojs/generic
Augmented Generic IE
2013-06-27 11:44:46 -07:00
Philipp Hagemeister
ddbd903576 Tests: Add coding to files 2013-06-27 20:32:02 +02:00
Philipp Hagemeister
0c56a3f773 [googleplus] move tests 2013-06-27 20:31:27 +02:00
Philipp Hagemeister
9d069c4778 [infoq] move tests 2013-06-27 20:27:08 +02:00
Philipp Hagemeister
0d843f796b Remove superfluous name declarations 2013-06-27 20:25:56 +02:00
Philipp Hagemeister
67f51b3d8c [youku] move tests 2013-06-27 20:25:46 +02:00
Philipp Hagemeister
5c5de1c79a [eighttracks] move test 2013-06-27 20:22:00 +02:00
Philipp Hagemeister
0821771466 [steam] move test 2013-06-27 20:20:00 +02:00
Philipp Hagemeister
83f6f68e79 [metacafe] move tests 2013-06-27 20:18:35 +02:00
Albert Kim
27473d18da Made 'video' the default title for generic IE 2013-06-27 19:18:15 +01:00
Philipp Hagemeister
0c6c096c20 [soundcloud] Move tests 2013-06-27 20:17:21 +02:00
Albert Kim
52c8ade4ad Made generic IE handle more cases
Added a possible quote after file, so it can now handle cases like:
'file': 'http://www.a.com/b.mp4'
2013-06-27 19:16:09 +01:00
Philipp Hagemeister
0e853ca4c4 [youtube] Fix tests in 2.x 2013-06-27 19:55:39 +02:00
Philipp Hagemeister
41beccbab0 Use str every time 2013-06-27 19:43:43 +02:00
Philipp Hagemeister
2eb88d953f Allow _TESTS attribute for IEs with multiple tests
This also improves the numbering of duplicate tests
2013-06-27 19:13:11 +02:00
Philipp Hagemeister
1f0483b4b1 Generate the list of IEs automatically
It seems like GenericIE needs to be last, but other than that, the order really does not matter anymore.
To cut down on merge conflicts, generate the list of IEs automatically.
2013-06-27 18:43:32 +02:00
Philipp Hagemeister
6b47c7f24e Allow moving tests into IE files
Allow adding download tests right in the IE file.
This will cut down on merge conflicts and make it more likely that new IE authors will add tests right away.
2013-06-27 18:28:45 +02:00
Philipp Hagemeister
d798e1c7a9 [auengine] Rename to official capitalization 2013-06-27 18:19:19 +02:00
Philipp Hagemeister
3a8736bd74 Merge remote-tracking branch 'gitprojs/master'
Conflicts:
	youtube_dl/extractor/__init__.py
2013-06-27 18:16:41 +02:00
Philipp Hagemeister
c8c5163618 release 2013.06.34.1 2013-06-27 17:58:58 +02:00
Philipp Hagemeister
500f3d2432 Merge remote-tracking branch 'origin/HEAD' 2013-06-27 17:58:42 +02:00
Philipp Hagemeister
ed4a915e08 Add tests and improve for HotNewHipHop 2013-06-27 17:56:48 +02:00
Philipp Hagemeister
b8f7b1579a Merge remote-tracking branch 'JohnyMoSwag/master' 2013-06-27 17:52:41 +02:00
Johny Mo Swag
ed54491c60 fix for detecting youtube embedded videos. 2013-06-27 08:39:32 -07:00
Albert Kim
e4decf2750 Updated auengine IE to use compat_urllib* utils 2013-06-27 13:48:28 +01:00
Jaime Marquínez Ferrándiz
c90f13d106 YoutubeIE: update the docstrings and the error message of _decrypt_signature
Now it doesn't check the size of the two parts of the key.
2013-06-27 14:37:45 +02:00
Albert Kim
62008f69c1 Added an IE for auengine.com 2013-06-27 12:58:09 +01:00
Philipp Hagemeister
e88f5e0b4e release 2013.06.34 2013-06-27 13:02:57 +02:00
Filippo Valsorda
769fda3c5a print more encrypted signature info on -v (rel: #948) 2013-06-27 12:54:07 +02:00
Filippo Valsorda
23300d7149 a new day, a new s algo - fix #946 2013-06-27 12:24:46 +02:00
Philipp Hagemeister
f5756f388a Check in signature generator 2013-06-27 11:15:29 +02:00
Philipp Hagemeister
ee313cdcbf simplify youtube signature generation 2013-06-27 11:15:01 +02:00
Johny Mo Swag
8b50fed04b removed print statement 2013-06-26 19:04:05 -07:00
Johny Mo Swag
5b66de8859 Added HotNewHipHop IE 2013-06-26 18:38:48 -07:00
Philipp Hagemeister
e38af9e00c Merge branch 'master' of github.com:rg3/youtube-dl 2013-06-27 01:52:13 +02:00
Philipp Hagemeister
6b37f0be55 Add a clean-room implementation for youtube signatures 2013-06-27 01:51:10 +02:00
Jaime Marquínez Ferrándiz
6e5d5f2fc1 Merge branch 'master' of github.com:rg3/youtube-dl 2013-06-27 00:16:02 +02:00
Jaime Marquínez Ferrándiz
75c9481224 ArteTvIE: rewrite the extract process to support the new site (fixes #875)
The video can be downloaded with rtmp or http, but the best quality format seems to always use rtmp.
Deleted the old methods.
2013-06-27 00:09:51 +02:00
Philipp Hagemeister
5746f9da99 Add test for youtube signature algorithm 2013-06-27 00:09:25 +02:00
Philipp Hagemeister
112da0a0ce Simplify FakeYDL 2013-06-27 00:09:05 +02:00
Jaime Marquínez Ferrándiz
bcd606c0fe ComedycentralIE: Force conversion of the description to unicode (close #941)
When writing to a file it would fail.
2013-06-26 21:38:01 +02:00
Philipp Hagemeister
ed92bc9f6e [wimp] minor readability improvements (#940) 2013-06-26 18:22:42 +02:00
Philipp Hagemeister
9b0756f8f2 [vevo] remove unused import 2013-06-26 18:05:01 +02:00
Jaime Marquínez Ferrándiz
aa0c87391c Add CSpanIE (closes #312) 2013-06-26 17:55:54 +02:00
M.Yasoob Khalid
b1dfdc51b1 added .decode('ascii') 2013-06-26 19:41:55 +05:00
Jaime Marquínez Ferrándiz
2e32528012 FileDownloader: fixed call to "report_error" of YoutubeDL
It was being called as "error"
2013-06-26 16:32:47 +02:00
M.Yasoob Khalid
f64e7695a1 added b'' to my regex expression in order to solve the error on python 3 2013-06-26 18:46:05 +05:00
M.Yasoob Khalid
5abeaf0650 changed wimp.py according to the changes suggested by jaime 2013-06-26 17:26:59 +05:00
M.Yasoob Khalid
8bcc355972 removed trailing ',' and corrected the title in test 2013-06-26 15:51:25 +05:00
M.Yasoob Khalid
6b4642fae3 added test for wimp.com 2013-06-26 15:40:24 +05:00
M.Yasoob Khalid
d1bd37deac Merge branch 'master' of github.com:rg3/youtube-dl 2013-06-26 15:30:21 +05:00
M.Yasoob Khalid
405ec05cb2 added an IE for wimp.com 2013-06-26 15:25:53 +05:00
Jaime Marquínez Ferrándiz
52e8e1dc88 Merge pull request #936 from iemejia/master
Added option for vtt WebVTT subtitle format for Youtube
2013-06-26 03:06:06 -07:00
Ismael Mejia
b98a6b2f72 Fixed typo in subtitle format option (from: sbt => sbv) 2013-06-26 11:59:29 +02:00
Ismael Mejia
0ca45b233f Added missing write-auto-sub option in README file 2013-06-26 11:34:38 +02:00
Ismael Mejia
65cceef8f4 Added support for additional vtt subtitle format (WebVTT) in youtube-dl. 2013-06-26 11:28:47 +02:00
Jaime Marquínez Ferrándiz
b004821fa9 Add the option "--write-auto-sub" to download automatic subtitles from Youtube
Now automatic subtitles are only downloaded if the option is given.
(closes #903)
2013-06-25 23:46:24 +02:00
Philipp Hagemeister
81b42336ad release 2013.06.33 2013-06-25 22:42:02 +02:00
Jaime Marquínez Ferrándiz
c6c1974672 Add "--video-password" option (related #889)
Used only for accessing a private video

Restore the error when the account is missing
2013-06-25 22:22:32 +02:00
Jaime Marquínez Ferrándiz
a545d1d262 Merge pull request #922 from JohnyMoSwag/master
Added embedded youtube detection to WorldstarIE
2013-06-25 22:08:58 +02:00
Jaime Marquínez Ferrándiz
037fcd0047 JukeboxIE: support more countries 2013-06-25 22:04:44 +02:00
Philipp Hagemeister
318452bc0c Sort IEs alphabetically 2013-06-25 21:11:57 +02:00
Philipp Hagemeister
d746cd88c2 Merge remote-tracking branch 'yasoob/master' 2013-06-25 21:09:15 +02:00
Philipp Hagemeister
9c42603b5a release 2013.06.32 2013-06-25 20:55:47 +02:00
Philipp Hagemeister
ea93cce4f6 Directly call update_latest 2013-06-25 20:50:54 +02:00
M.Yasoob Khalid
f4daa18152 added test for tudou.com 2013-06-25 22:52:21 +05:00
M.Yasoob Khalid
9caa687d81 Added an IE for tudou 2013-06-25 22:48:08 +05:00
Philipp Hagemeister
3b58c6fb54 Update latest files on release 2013-06-25 18:48:57 +02:00
Philipp Hagemeister
5926c10690 release 2013.06.31 2013-06-25 18:40:58 +02:00
Philipp Hagemeister
df725153d2 Credit mc2avr for JukeboxIE (#924) 2013-06-25 17:57:47 +02:00
Philipp Hagemeister
d662896090 [googleplus] Adapt to new detail URL format 2013-06-25 17:52:32 +02:00
Philipp Hagemeister
db241e8645 Add encoding to jukebox IE and simplify it a little bit 2013-06-25 17:16:38 +02:00
Philipp Hagemeister
ead28ff30a Make upload atomic (#925) 2013-06-25 17:14:25 +02:00
Philipp Hagemeister
515d7a5e73 Add Jukebox IE 2013-06-25 17:12:35 +02:00
mc2avr
14fbdc9cdd [jukebox] call YoutubeIE if necessary 2013-06-25 16:51:09 +02:00
Filippo Valsorda
98bcd2834a improve generic and encrypted signature error messages 2013-06-25 16:47:16 +02:00
Filippo Valsorda
f7ab6cbe16 add tests for use_cipher_signature videos (#897) and the ability to test multiple videos per IE 2013-06-25 14:38:00 +02:00
mc2avr
28ef06f7c2 add JukeboxIE 2013-06-25 13:28:59 +02:00
Philipp Hagemeister
577d02370d release 2013.06.30 2013-06-25 12:28:40 +02:00
Philipp Hagemeister
50be92c11c Handle video pages without vevo IDs (Fixes #923) 2013-06-25 12:28:17 +02:00
Johny Mo Swag
d18596baf4 added Youtube embed detection to WorldstarIE 2013-06-24 18:58:49 -07:00
Jaime Marquínez Ferrándiz
7ce7e39476 YoutubeIE: Extend decryption of signatures to all videos that have the 's' field in the url_encoded_fmt_stream_map (related #920) 2013-06-24 21:25:12 +02:00
Filippo Valsorda
93eb15c573 clean up printing in __init__.py 2013-06-24 15:57:53 +02:00
Philipp Hagemeister
9f4d83e3b1 release 2013.06.29 2013-06-24 14:51:24 +02:00
Jaime Marquínez Ferrándiz
1c251cd948 MTVIE: add support for Vevo videos (related #913) 2013-06-24 13:54:19 +02:00
Jaime Marquínez Ferrándiz
70d1924f8b Add VevoIE 2013-06-24 12:31:41 +02:00
Philipp Hagemeister
7b4948b05f release 2013.06.28 2013-06-24 11:11:33 +02:00
Philipp Hagemeister
878b5d9f0d Merge remote-tracking branch 'jaimeMF/youtubedl_class' 2013-06-24 10:48:41 +02:00
Philipp Hagemeister
2bc1820660 release 2013.06.27 2013-06-24 10:32:08 +02:00
Jaime Marquínez Ferrándiz
8bf8b5a577 Use the new class in the tests 2013-06-24 10:21:44 +02:00
Jaime Marquínez Ferrándiz
8222d8de88 Split FileDownloader in two classes: FileDownloader and YoutubeDL
YoutubeDL is the class that coordinates everything
FileDownloader gets a filename and an info dict and downloads the video.
2013-06-24 10:21:43 +02:00
Jaime Marquínez Ferrándiz
c7253e2e8c [youtube] fix condition always being evaluated to true 2013-06-24 09:42:46 +02:00
Philipp Hagemeister
d69cf69a6a [youtube] Use mp4 as extension for format 38 (Fixes #892) 2013-06-24 01:22:59 +02:00
Philipp Hagemeister
d02ecdefab release 2013.06.26 2013-06-24 01:01:53 +02:00
Philipp Hagemeister
bc857bfce0 Remove includes from setup.py for windows build 2013-06-24 01:01:17 +02:00
Philipp Hagemeister
f8bf74575a release 2013.06.25 2013-06-24 00:20:36 +02:00
Philipp Hagemeister
964ac8b584 Fix release script once more 2013-06-24 00:09:57 +02:00
Philipp Hagemeister
a3522dfddd Merge branch 'master' of github.com:rg3/youtube-dl 2013-06-24 00:09:11 +02:00
Philipp Hagemeister
d3a8613b6e Improve test skipping functionality 2013-06-24 00:05:02 +02:00
Philipp Hagemeister
200b388752 Correct comparison test 2013-06-24 00:02:49 +02:00
Philipp Hagemeister
dabcaf3b06 release 2013.06.24 2013-06-24 00:02:20 +02:00
Philipp Hagemeister
e646ffe795 Add included files for Windows build 2013-06-24 00:01:41 +02:00
Jaime Marquínez Ferrándiz
b0dcc3c47f setup.py: include the new extractor module 2013-06-23 23:54:08 +02:00
Philipp Hagemeister
b07d9c23c5 release 2013.06.23 2013-06-23 23:42:21 +02:00
Philipp Hagemeister
d71cae62cc allow skipping tests when releasing
(YouTube Subtitles are currently flaky in Germany, especially via IPv6)
2013-06-23 23:41:54 +02:00
Philipp Hagemeister
633a50cf4b Update Makefile to packaged paths 2013-06-23 23:27:28 +02:00
Philipp Hagemeister
825e0984e2 [break] adapt to new paths 2013-06-23 22:59:51 +02:00
Philipp Hagemeister
d1cade5ade Correct module name 2013-06-23 22:53:42 +02:00
Philipp Hagemeister
190717e31f [justin.tv] Clarify variable content 2013-06-23 22:52:43 +02:00
Philipp Hagemeister
0824c28c8b Remove mentions of old InfoExtractors module 2013-06-23 22:42:59 +02:00
Philipp Hagemeister
c59b4aaeef Fix imports and restrict available legacy imports 2013-06-23 22:38:59 +02:00
Philipp Hagemeister
f9c6cbf002 Move extractor imports and functions into extractor/__init__.py 2013-06-23 22:36:24 +02:00
Philipp Hagemeister
b8fe71ab86 Remove unused imports from InfoExtractor 2013-06-23 22:34:23 +02:00
Philipp Hagemeister
cb10cded2a [xhamster] Move into own file 2013-06-23 22:32:44 +02:00
Philipp Hagemeister
cd8b830292 [Teamcoco] Move into own file 2013-06-23 22:31:50 +02:00
Philipp Hagemeister
1ac4004f3a [flickr] Move into own file 2013-06-23 22:31:12 +02:00
Philipp Hagemeister
e17d368ae2 [howcast] Move into own file 2013-06-23 22:30:16 +02:00
Philipp Hagemeister
27110b0567 [hypem] Move into own file 2013-06-23 22:29:27 +02:00
Philipp Hagemeister
9fe4de3471 [ina] Move into own file 2013-06-23 22:28:19 +02:00
Philipp Hagemeister
d26d440e19 [redtube] Simplify 2013-06-23 22:27:34 +02:00
Philipp Hagemeister
9f5daf0006 [redtube] move into own file 2013-06-23 22:27:16 +02:00
Philipp Hagemeister
eb1634cbf8 [Vine] move into own file 2013-06-23 22:26:30 +02:00
Philipp Hagemeister
01c10ca26e [VBox7] move into own file 2013-06-23 22:25:46 +02:00
Philipp Hagemeister
45aef47281 [Bandcamp] move into own file 2013-06-23 22:24:58 +02:00
Philipp Hagemeister
ae287755b7 [Tumblr] move into own file 2013-06-23 22:24:07 +02:00
Philipp Hagemeister
a37f27ae99 [LiveLeak] move into own file 2013-06-23 22:23:19 +02:00
Philipp Hagemeister
49f5f315fd [Spiegel] move into own file 2013-06-23 22:22:08 +02:00
Philipp Hagemeister
97d2db017c [myspass] Move into own file and default to mp4 ext 2013-06-23 22:20:45 +02:00
Philipp Hagemeister
2c64df0399 [keek] move into own file 2013-06-23 22:16:41 +02:00
Philipp Hagemeister
828400422a [8tracks] Move into own file 2013-06-23 22:15:50 +02:00
Philipp Hagemeister
c3c77cec30 [youjizz] move into own file 2013-06-23 22:14:22 +02:00
Philipp Hagemeister
1183b85f50 [pornotube] move into own file 2013-06-23 22:13:32 +02:00
Philipp Hagemeister
0143dc029c [YouPorn] move into own file 2013-06-23 22:12:14 +02:00
Philipp Hagemeister
e10e576fed [RBMARadio] move into own file 2013-06-23 22:09:32 +02:00
Philipp Hagemeister
78af8eb1d1 [ustream] move into its own file 2013-06-23 22:08:28 +02:00
Philipp Hagemeister
79e93125d0 [justin.tv] move into own file 2013-06-23 22:07:27 +02:00
Philipp Hagemeister
48db0b1f4a [FunnyOrDie] Remove unused import 2013-06-23 22:07:17 +02:00
Philipp Hagemeister
8f0578f0fc Move FunnyOrDie into its own file 2013-06-23 22:05:23 +02:00
Philipp Hagemeister
250f557872 Move WorldStarHipHop into its own file 2013-06-23 22:04:08 +02:00
Philipp Hagemeister
462dc88b17 Move Steam IE into its own file 2013-06-23 22:02:56 +02:00
Philipp Hagemeister
570fa151fc Move XNXX into its own file 2013-06-23 22:01:57 +02:00
Philipp Hagemeister
9c286cfa00 Move Youku IE into its own file 2013-06-23 22:01:02 +02:00
Philipp Hagemeister
80cbb6ddbb Move MixCloud into its own file 2013-06-23 21:59:15 +02:00
Philipp Hagemeister
9fd5ce0cbe Move TED IE into its own file 2013-06-23 21:55:53 +02:00
Philipp Hagemeister
1736dec629 Mark MTV as broken for now (#913) 2013-06-23 21:52:41 +02:00
Philipp Hagemeister
b8a360837a Fix Statigram test 2013-06-23 21:34:40 +02:00
Philipp Hagemeister
fc28721960 Add MTV IE file (oops) 2013-06-23 21:34:03 +02:00
Philipp Hagemeister
51ce3a75c9 Improve error reporting for downloads 2013-06-23 21:33:11 +02:00
Philipp Hagemeister
335056663a Move MTV IE into its own file 2013-06-23 21:27:38 +02:00
Philipp Hagemeister
5b286728de Move NBA IE into its own file 2013-06-23 21:18:00 +02:00
Philipp Hagemeister
291a168bcc Move StanfordOC IE into its own file 2013-06-23 21:16:32 +02:00
Philipp Hagemeister
fda7d31aa0 Move infoq into its own file 2013-06-23 21:14:19 +02:00
Philipp Hagemeister
cbf46c737c Move XVideos IE into its own file (and simplify it a bit) 2013-06-23 21:11:47 +02:00
Philipp Hagemeister
7beb36a529 Move Collegehumor IE into its own file 2013-06-23 21:10:21 +02:00
Philipp Hagemeister
153697660d Move Escapist into its own file 2013-06-23 21:08:17 +02:00
Philipp Hagemeister
60a72e8d45 Simplify EscapistIE 2013-06-23 21:06:49 +02:00
Philipp Hagemeister
426ff04282 Move DepositFiles into its own IE 2013-06-23 21:06:20 +02:00
Philipp Hagemeister
a50e1b32e4 Add facebook import 2013-06-23 21:00:34 +02:00
Philipp Hagemeister
9eae41ddef Move Facebook into its own file 2013-06-23 20:59:45 +02:00
Philipp Hagemeister
aad0d6d5ba Move Soundcloud into its own file 2013-06-23 20:57:44 +02:00
Philipp Hagemeister
7aca14a1ec Move G+ IE into its own file, and move google search into a more descriptive module 2013-06-23 20:55:15 +02:00
Philipp Hagemeister
d1596ef439 Add import for google search 2013-06-23 20:51:42 +02:00
Philipp Hagemeister
ea63e4998b Move comedycentral into its own file 2013-06-23 20:51:04 +02:00
Philipp Hagemeister
a08dfd27a8 Move MyVideo into its own file 2013-06-23 20:48:32 +02:00
Philipp Hagemeister
f58848011e Move blip.tv extractors into their own file 2013-06-23 20:44:48 +02:00
Philipp Hagemeister
934858ad86 Move YahooSearchIE to youtube_dl.extractor.yahoo 2013-06-23 20:41:54 +02:00
Philipp Hagemeister
3c25b9abae Remove useless headers 2013-06-23 20:35:50 +02:00
Philipp Hagemeister
3fc03845a1 Move GoogleSearchIE into its own file 2013-06-23 20:32:49 +02:00
Philipp Hagemeister
9b122384e9 Move GenericIE into its own file 2013-06-23 20:31:45 +02:00
Philipp Hagemeister
9f4e6bbaeb Move gametrailers IE into its own file 2013-06-23 20:29:56 +02:00
Philipp Hagemeister
b05654f0e3 Move YoutubeSearchIE to the other youtube IEs 2013-06-23 20:28:15 +02:00
Philipp Hagemeister
9b3a760bbb [arte] Mark dead code as such 2013-06-23 20:26:35 +02:00
Philipp Hagemeister
d5822b96b0 Move ARD, Arte, ZDF into their own files 2013-06-23 20:24:07 +02:00
Philipp Hagemeister
b3d14cbfa7 Move Vimeo into its own file 2013-06-23 20:18:21 +02:00
Philipp Hagemeister
d6039175e5 Move yahoo into its own file 2013-06-23 20:13:52 +02:00
Philipp Hagemeister
97d6faaced Move Photobucket into its own file 2013-06-23 20:12:18 +02:00
Philipp Hagemeister
219b8130df Move DailyMotion into its own file 2013-06-23 20:12:03 +02:00
Philipp Hagemeister
38cbc40a64 Move Metacafe and Statigram into their own files, and remove absolute import 2013-06-23 20:07:51 +02:00
Philipp Hagemeister
93d3a642a9 [youtube] remove dead code 2013-06-23 19:59:40 +02:00
Philipp Hagemeister
c5e8d7af0e Move youtube extractors to youtube_dl.extractor.youtube 2013-06-23 19:58:33 +02:00
Philipp Hagemeister
d6983cb460 Fix generic class move (add all files) 2013-06-23 19:57:38 +02:00
Philipp Hagemeister
dd9829292e Improve vevo message 2013-06-23 19:45:42 +02:00
Philipp Hagemeister
89cb0eb0b6 Use new signature calculation method only if sig is not present 2013-06-23 19:43:18 +02:00
M.Yasoob Khalid
9b5fffb149 added an IE and test for break.com 2013-06-23 22:42:51 +05:00
Philipp Hagemeister
1f90438025 Merge remote-tracking branch 'jaimeMF/vevo_fix' 2013-06-23 19:42:27 +02:00
Philipp Hagemeister
a130adb25b [Statigr.am] Correct uploader id 2013-06-23 19:41:28 +02:00
Philipp Hagemeister
8756c5fe7a Merge remote-tracking branch 'origin/vimeo_passworded_videos' 2013-06-23 19:00:16 +02:00
Philipp Hagemeister
828dba2983 Improve error reporting 2013-06-23 18:59:01 +02:00
Philipp Hagemeister
6b3f5a329b Improve Statigr.am IE 2013-06-23 18:58:53 +02:00
Philipp Hagemeister
63ef586b05 Merge remote-tracking branch 'yasoob/master' 2013-06-23 18:45:50 +02:00
Philipp Hagemeister
383a6a61b1 Merge pull request #905 from rbrito/manpage-apropos
README: Add brief description for manpages/apropos.
2013-06-23 09:41:59 -07:00
M.Yasoob Khalid
4fdd4e6f6f added test for Statigr 2013-06-23 18:56:26 +05:00
M.Yasoob Khalid
01ba4b80a7 added StatigrIE 2013-06-23 18:02:55 +05:00
M.Yasoob Khalid
de66764e4e added StatigrIE 2013-06-23 17:46:14 +05:00
Jaime Marquínez Ferrándiz
1037d53988 GenericIE: look for Open Graph info
Only if there is a direct link to the file, don't try if it points to a Flash player
2013-06-23 13:26:49 +02:00
Jaime Marquínez Ferrándiz
c3ab8f866c Change metavar of "--sub-format" from LANG to FORMAT 2013-06-23 12:59:20 +02:00
Rogério Brito
94eb2dd1fe README: Add brief description for manpages/apropos.
Trying to mimic the manpage of (GNU) `ls`, we don't conjugate the verb as
"downloads" or something else.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-06-22 19:16:11 -03:00
Jaime Marquínez Ferrándiz
346b5ce8fd YoutubeIE: report warnings instead of errors if the subtitles are not found (related #901)
For example when downloading a playlist some videos may not have subtitles but the download shouldn't stop.
2013-06-22 14:15:33 +02:00
Jaime Marquínez Ferrándiz
b37fbb990b Move the decrypting function to a static method 2013-06-22 13:20:06 +02:00
Jaime Marquínez Ferrándiz
ef75f76f5c Detect more vevo videos 2013-06-22 13:13:40 +02:00
Jaime Marquínez Ferrándiz
e296100005 Merge pull request #888 from rg3/youtube_playlists_fix_886
YoutubePlaylistIE: try to extract the url of the entries from the media$group dictionary (closes #886)
2013-06-22 03:35:32 -07:00
Jaime Marquínez Ferrándiz
953dd93a48 YoutubePlaylistIE: don't look into entry['content']['src'], according to the docs this can return live stream urls 2013-06-22 12:32:27 +02:00
Jaime Marquínez Ferrándiz
e704f4d378 YoutubeIE: If no subtitles language is given, default to English for automatic captions (related #901) 2013-06-22 12:14:24 +02:00
Jaime Marquínez Ferrándiz
77d0f05f71 YoutubeIE: Detect new Vevo style videos
The url_encoded_fmt_stream_map can be found in the video page, but the signature must be decrypted. We get it from the webpage instead of the `get_video_info` pages because we have only discovered the algorithm for keys with both sub keys of size 43.
2013-06-21 21:51:10 +02:00
Philipp Hagemeister
50d2376769 Leave out sig if not present (#896) 2013-06-21 01:22:47 +02:00
Philipp Hagemeister
759d525301 release 2013.06.21 2013-06-21 00:33:44 +02:00
Philipp Hagemeister
fcfa188548 Show which IEs are slow during release 2013-06-21 00:29:31 +02:00
Jaime Marquínez Ferrándiz
f4c8bbcfc2 TEDIE: download the best quality video and use the new _search_regex functions
Also extracts the description.
2013-06-20 20:51:20 +02:00
Jaime Marquínez Ferrándiz
31eead52e7 YoutubePlaylistIE: try to extract the url of the entries from the media$group dictionary
Extracting it from content can return rtsp urls.
2013-06-20 17:23:27 +02:00
Jaime Marquínez Ferrándiz
038a3a1a61 RBMARadioIE: fix the extraction of the JSON data 2013-06-20 14:37:43 +02:00
Jaime Marquínez Ferrándiz
587c68b2cd DailymotionIE: fix the extraction of the video uploader and use _search_regex for getting it 2013-06-20 14:15:29 +02:00
Jaime Marquínez Ferrándiz
377fdf5dde Update the TumblrIE: the video is no longer available 2013-06-20 14:02:21 +02:00
Jaime Marquínez Ferrándiz
5c67601931 Revert "Fix GooglePlusIE: the video_page url has changed of place"
The old method is working again.

This reverts commit 449d5c910c.
2013-06-20 13:53:04 +02:00
Jaime Marquínez Ferrándiz
68f54207a3 SteamIE: only verify the age if needed
Also use the _html_search_regex function
2013-06-20 13:43:44 +02:00
Philipp Hagemeister
bb47437686 Ignore invalid dates (Fixes #894) 2013-06-19 22:13:16 +02:00
Jaime Marquínez Ferrándiz
213b715893 Merge pull request #887 from anisse/master
Fetch all entries that are in a youtube playlist

Also add a test.
2013-06-19 12:52:44 +02:00
Jaime Marquínez Ferrándiz
449d5c910c Fix GooglePlusIE: the video_page url has changed of place 2013-06-18 14:22:16 +02:00
Filippo Valsorda
0251f9c9c0 add _search_regex to the new IEs 2013-06-17 19:47:44 +02:00
Filippo Valsorda
8bc7c3d858 Merge branch 'search_regex' - PR #872 - closes #847 2013-06-17 19:28:18 +02:00
Filippo Valsorda
af44c94862 use _search_regex in GenericIE 2013-06-17 19:25:35 +02:00
Jaime Marquínez Ferrándiz
36ed7177f0 Fix HypemIE test: the song name has been changed 2013-06-16 20:42:28 +02:00
Jaime Marquínez Ferrándiz
32aa88bcae Add GametrailersIE 2013-06-16 20:34:45 +02:00
Jaime Marquínez Ferrándiz
51090d636b VimeoIE: allow to download password protected videos 2013-06-15 11:35:14 +02:00
Jaime Marquínez Ferrándiz
31513ea6b9 Update test_issue_673 in Youtube Lists
Some videos have been removed.
Delete the title check, it's not the purpose of that test.
2013-06-15 11:20:22 +02:00
Anisse Astier
88cebbd7b8 YoutubePlaylistIE: get *all* videos
For that, we add the parameter safeSearch=none, which asks youtube not to filter
results before sending them to us.

Note: this parameter could be added to YoutubeSearchIE and YoutubeUserIE
as well, but I don't know what would be the impact in term of unwanted
results. Maybe expose that as a parameter? For a playlist it's different
since the user chose what she put in the playlist.
2013-06-13 23:45:32 +02:00
Jaime Marquínez Ferrándiz
fb8f7280bc GenericIE: try to find videos from twitter cards info 2013-06-13 08:26:39 +02:00
Jaime Marquínez Ferrándiz
f380401bbd YoutubeSearchIE: the query is a str, in python 3 it fails if decode is called 2013-06-11 19:15:07 +02:00
Jaime Marquínez Ferrándiz
9abc6c8b31 Update YahooIE test
The old test video is no longer available.
2013-06-10 19:42:02 +02:00
Philipp Hagemeister
8cd252f115 Use long rtmpdump options
Note that we accidentally called rtmpdump with -v (--live) instead of -V (--verbose) because we missed this.
2013-06-10 18:14:45 +02:00
Philipp Hagemeister
53f72b11e5 Allow unsetting the proxy with the --proxy option 2013-06-09 23:43:18 +02:00
Filippo Valsorda
ee55fcbe12 switch long info_dict fields checking to md5 2013-06-09 15:03:54 +02:00
Filippo Valsorda
78d3442b12 test: extend the reach of info_dict checking
* print the info_dict in a format suitable to easy adding to tests.json during tests if un-tested fields are detected
* make it possible to put the crc32 in tests.json if the field is too long
* complete the "info_dict" fields in existing tests
* fixed the bugs caught doing this
2013-06-09 14:21:42 +02:00
Filippo Valsorda
979a9dd4c4 _html_search_regex with clean_html superpowers 2013-06-09 11:57:13 +02:00
Filippo Valsorda
d5979c5d55 do not ask the user to report network errors 2013-06-09 11:55:08 +02:00
Jaime Marquínez Ferrándiz
8027175600 Set the extractor key in playlists entries
If they were videos the extractor key wasn't being set anywhere else
Closes 877
2013-06-08 12:08:44 +02:00
Jaime Marquínez Ferrándiz
3054ff0cbe Merge pull request #853 from mc2avr/master
add ZDFIE
2013-06-08 11:44:01 +02:00
Jaime Marquínez Ferrándiz
cd453d38bb Merge pull request #878 from yasoob/master
Added Vbox7.com InfoExtractor and tests.
2013-06-08 10:54:47 +02:00
Filippo Valsorda
f5a290eed9 print "please report this issue on GitHub" on every ExtractorError 2013-06-08 09:56:34 +02:00
M.Yasoob Khalid
ecb3e676a5 Added Vbox7 Infoextractor 2013-06-08 12:44:38 +05:00
Filippo Valsorda
8b59a98610 XHamster: Can't see the description anywhere in the UI 2013-06-07 12:47:12 +02:00
Filippo Valsorda
8409501206 use search_regex in new IEs 2013-06-07 12:47:12 +02:00
Filippo Valsorda
be95cac157 raise exceptions on warnings during tests - and solve a couple of them 2013-06-07 12:46:23 +02:00
Filippo Valsorda
476203d025 print WARNINGs during test + minor fix to NBAIE 2013-06-06 15:07:05 +02:00
Filippo Valsorda
468e2e926b implement fallbacks and defaults in _search_regex 2013-06-06 14:35:08 +02:00
Anna Bernardi
ac3e9394e7 Implement search_regex from #847 2013-06-06 14:01:44 +02:00
Filippo Valsorda
868d62a509 style and error handling edits to HypemIE 2013-06-06 12:02:36 +02:00
M.Yasoob Khalid
157b864a01 added HypemIE
rebased, closes PR #871
2013-06-06 12:01:07 +02:00
Filippo Valsorda
951b9dfd94 Merge pull request #866 from yasoob/master
Added support for XHamster - closes #841
2013-06-04 10:39:31 -07:00
Filippo Valsorda
1142d31164 Merge pull request #863 from davidcl/master
Add some tests to match Justin.tv / Twitch.tv URLs
2013-06-04 10:36:36 -07:00
Jaime Marquínez Ferrándiz
9131bde941 SpiegelIE: the page layout has changed a bit 2013-06-04 19:31:06 +02:00
Jaime Marquínez Ferrándiz
1132c10dc2 Merge pull request #864 from jacobian/vimeopro
Fixed an error downloading vimeo pro videos.
2013-06-04 10:15:12 -07:00
M.Yasoob Ullah Khalid
c978a96c02 Added test for XHamster.com 2013-06-04 17:33:02 +05:00
M.Yasoob Ullah Khalid
71e458d437 Added support for xhamster in infoextractors 2013-06-04 17:30:54 +05:00
Clément DAVID
57bde0d9c7 Fix the test_all_urls (Import issue) 2013-06-04 13:10:12 +02:00
Clément DAVID
50b4d25980 Merge within test_all_urls 2013-06-04 13:06:49 +02:00
Jaime Marquínez Ferrándiz
eda60e8251 VimeoIE: support videos from vimeopro.com 2013-06-04 12:04:54 +02:00
Jacob Kaplan-Moss
c794cbbb19 Fixed an error downloading vimeo pro videos. 2013-06-03 18:03:59 -05:00
Clément DAVID
4a76d1dbe5 Add tests for justin.tv and twitch.tv 2013-06-03 22:16:55 +02:00
Jaime Marquínez Ferrándiz
418f734a58 Merge pull request #854 from rg3/youtube_automatic_captions
YoutubeIE: fallback to automatic captions when subtitles aren't found
2013-06-01 14:18:27 -07:00
Jaime Marquínez Ferrándiz
dc1c355b72 YoutubeIE: fallback to automatic captions when subtitles aren't found (closes #843)
Also modify test_youtube_subtitles to support running the tests in any order.
2013-05-31 17:03:40 +02:00
Jaime Marquínez Ferrándiz
1b2b22ed9f BlipTV: accept urls in the format http://a.blip.tv/api.swf#{id} (closes #857)
Tweak the regex so that BlipTV can be before BlipTVUser.
2013-05-28 15:12:39 +02:00
mc2avr
f2cd958c0a add ZDFIE and _download_with_mplayer(mms://,rtsp://) 2013-05-23 21:42:03 +02:00
Philipp Hagemeister
57adeaea87 release 2013.05.23 2013-05-23 13:37:19 +02:00
Philipp Hagemeister
8f3f1aef05 Fix HowCast IE 2013-05-23 13:34:33 +02:00
Filippo Valsorda
51d2453c7a small tweaks 2013-05-21 16:07:27 +02:00
Jaime Marquínez Ferrándiz
45014296be Add TeamcocoIE (closes #212) 2013-05-21 14:37:32 +02:00
Anna Bernardi
afef36c950 add support for Flickr videos - closes #261 2013-05-20 23:19:38 +02:00
Filippo Valsorda
b31756c18e Python 2 compat fixes for MyVideo.de rtmpdump downloads 2013-05-20 11:57:10 +02:00
Filippo Valsorda
f008688520 make rtmpdump inherit the verbose option for debugging 2013-05-20 11:54:21 +02:00
Filippo Valsorda
5b68ea215b Merge pull request #842 - myvideo, rtmp support
@dersphere code, from dersphere/plugin.video.myvideo_de.git
rewritten by @mc2avr
released in the Public Domain by the author
ref: https://github.com/rg3/youtube-dl/pull/842
2013-05-20 09:49:58 +02:00
Jaime Marquínez Ferrándiz
b1d568f0bc HowcastIE: extract thumbnail 2013-05-20 08:39:41 +02:00
Jaime Marquínez Ferrándiz
17bd1b2f41 VineIE: extract more information and minor style changes 2013-05-20 08:31:03 +02:00
Anna Bernardi
5b0d3cc0cd Add support for Vine - closes #845 2013-05-20 00:33:14 +02:00
Filippo Valsorda
d4f76f1674 Add support for Howcast.com - closes #835 2013-05-18 19:17:19 +02:00
Jaime Marquínez Ferrándiz
340fa21198 UstreamIE: get thumbnail and uploader name 2013-05-18 11:54:18 +02:00
mc2avr
de5d66d431 MyVideoIE: add rtmp support 2013-05-15 23:38:44 +02:00
Jaime Marquínez Ferrándiz
7bdb17d4d5 Add extra_info argument to extract_info and process_ie_result
It allows to update the info_dicts with other values

(closes #840)
2013-05-14 14:40:40 +02:00
Philipp Hagemeister
419c64b107 Throw a better error if the protocol is invalid 2013-05-13 19:54:07 +02:00
Philipp Hagemeister
99a5ae3f8e Simplify generic search IE (Closes #839) 2013-05-13 19:53:52 +02:00
Philipp Hagemeister
c7563c528b Merge remote-tracking branch 'jaimeMF/SearchIE' 2013-05-13 19:43:35 +02:00
Jaime Marquínez Ferrándiz
e30e9318da Add base class SearchInfoExtractor for search queries IEs 2013-05-13 14:58:44 +02:00
Philipp Hagemeister
5c51028d38 release 2013.05.14 2013-05-13 13:50:05 +02:00
Philipp Hagemeister
c1d58e1c67 Merge pull request #834 from chocolateboy/install_prefix_fix
only install to /etc if PREFIX is /usr or /usr/local
2013-05-13 00:42:24 -07:00
Philipp Hagemeister
02030ff7fe release 2013.05.13 2013-05-13 09:38:27 +02:00
Philipp Hagemeister
f45c185fa9 Do not re-encode / to # if / is a platform separator, and correctly handle permission errors (Fixes #831) 2013-05-13 09:20:08 +02:00
Philipp Hagemeister
1bd96c3a60 Deprecate --only-sub 2013-05-13 09:06:18 +02:00
Jaime Marquínez Ferrándiz
929f85d851 Remove a print call used for debugging 2013-05-12 20:56:54 +02:00
Jaime Marquínez Ferrándiz
98d4a4e6bc YoutubeSearchIE: return a playlist (related #838) 2013-05-12 20:53:37 +02:00
Jaime Marquínez Ferrándiz
fb2f83360c FFmpegPostProcessor: decode stderr first and then get the last line (closes #837) 2013-05-12 19:08:32 +02:00
Jaime Marquínez Ferrándiz
3c5e7729e1 GoogleSearchIE: change query urls to http://www.google.com/search
The old one was giving HTTP 404 errors
2013-05-12 18:44:56 +02:00
Jaime Marquínez Ferrándiz
5a853e1423 Fix YahooSearchIE: (closes #300) 2013-05-12 17:49:35 +02:00
Jaime Marquínez Ferrándiz
2f58b12dad YahooIE: support more videos 2013-05-12 17:05:43 +02:00
Jaime Marquínez Ferrándiz
59f4fd4dc6 YahooIE: remove old code and accept screen.yahoo.com videos (#300)
Videos require rtmpdump
2013-05-12 14:05:14 +02:00
chocolateboy
5738240ee8 only install to /etc if PREFIX is /usr or /usr/local 2013-05-10 23:05:58 +01:00
Philipp Hagemeister
86fd453ea8 Merge remote-tracking branch 'origin/master' 2013-05-10 09:21:24 +02:00
Philipp Hagemeister
c83411b9ee Skip bandcamp tests for now - free limit has been exceeded 2013-05-10 09:10:34 +02:00
Jaime Marquínez Ferrándiz
057c9938a1 Import FileDownloader in test_youtube_subtitles
Fix last commit
2013-05-10 08:37:49 +02:00
Jaime Marquínez Ferrándiz
9259966132 test_youtube_subtitles: FakeDownloader inherits from FileDownloader 2013-05-10 08:31:30 +02:00
Philipp Hagemeister
b08980412e Merge pull request #826 from jakeogh/master
Added --get-id option to print video IDs
2013-05-09 16:52:54 -07:00
Philipp Hagemeister
532a1e0429 release 2013.05.10 2013-05-10 01:45:21 +02:00
Filippo Valsorda
2a36c352a0 Retry to disable YT ratelimit to unlock full bandwidth
This is the second attempt: a60b854d90
Sometimes the ratelimit=yes is already in the URL, and doubling it
leads to a 403. Now should work on all videos, at least works on all
I could test.

Closes #648
2013-05-09 00:39:10 +02:00
jakeogh
1a2adf3f49 added --get-id option to print video IDs 2013-05-05 22:30:07 -07:00
Jaime Marquínez Ferrándiz
43b62accbb GoogleSearchIE: rename _download_n_results to _get_n_results 2013-05-05 22:12:41 +02:00
Jaime Marquínez Ferrándiz
be74864ace Credit @JohnyMoSwag for WorldstarhiphopIE (#730) 2013-05-05 21:56:38 +02:00
Philipp Hagemeister
0ae456f08a Credit @julienfr112 for Ina IE (#823) 2013-05-05 21:35:50 +02:00
Philipp Hagemeister
0f75d25991 release 2013.05.07 2013-05-05 21:13:16 +02:00
Philipp Hagemeister
67129e4a15 release 2013.05.06 2013-05-05 21:01:46 +02:00
Philipp Hagemeister
dfb9323cf9 Clean up InaIE (Closes #823) 2013-05-05 20:57:19 +02:00
julien
7f5bd09baf Add support to www.ina.fr 2013-05-05 20:54:36 +02:00
Philipp Hagemeister
02d5eb935f Merge remote-tracking branch 'origin/master'
Conflicts:
	youtube_dl/InfoExtractors.py
2013-05-05 20:51:27 +02:00
Philipp Hagemeister
94ca71b7cc Fix GoogleSearchIE (Fixes #822) 2013-05-05 20:49:57 +02:00
Philipp Hagemeister
b338f1b154 FileDownloader: Simplify and document 2013-05-05 20:49:42 +02:00
Jaime Marquínez Ferrándiz
486f0c9476 More callbacks changed to raise ExtractorError 2013-05-05 13:59:25 +02:00
Jaime Marquínez Ferrándiz
d96680f58d PhotobucketIE: accept new format of urls and add a test 2013-05-05 13:07:00 +02:00
Jaime Marquínez Ferrándiz
f8602d3242 ArteTvIE: Fix format of upload date 2013-05-05 11:48:47 +02:00
Jaime Marquínez Ferrándiz
0c021ad171 More callbacks changed to raise ExtractorError 2013-05-04 14:23:16 +02:00
Philipp Hagemeister
086d7b4500 Merge pull request #802 from joeframbach/master
If path and new_path are the same, then dont delete the file
2013-05-04 03:35:19 -07:00
Philipp Hagemeister
891629c84a release 2013.05.05 2013-05-04 12:31:17 +02:00
Philipp Hagemeister
ea6d901e51 Add --no-check-certificate (#814) 2013-05-04 12:22:56 +02:00
Philipp Hagemeister
4539dd30e6 twitch.tv chapters (#810): print out start and end time 2013-05-04 12:02:18 +02:00
Philipp Hagemeister
c43e57242e twitch.tv chapters: Include uploader (#810) 2013-05-04 11:44:59 +02:00
Philipp Hagemeister
db8fd71ca9 twitch.tv chapters: Use API for title and other metadata 2013-05-04 11:42:44 +02:00
Philipp Hagemeister
f4f316881d Improve Twitch.tv chapter support (#810) 2013-05-04 11:27:39 +02:00
Philipp Hagemeister
0e16f09474 Work on twitch.tv chapters (#810) 2013-05-04 10:36:37 +02:00
Philipp Hagemeister
09dd418f53 Experimentally whitelist Escapist test 2013-05-04 09:11:38 +02:00
Philipp Hagemeister
decd1d1737 raise ExtractorError instead of calling back 2013-05-04 08:38:28 +02:00
Philipp Hagemeister
180e689f7e Simplify WorldStarHipHop 2013-05-04 08:06:56 +02:00
Johny Mo Swag
7da5556ac2 Better fix for getting source url's 2013-05-04 08:04:28 +02:00
Johny Mo Swag
f23a03a89b updated regular expression for possible future updates to source url 2013-05-04 07:59:33 +02:00
Philipp Hagemeister
84e4682f0e Always use HTTPS for youtube (Fixes #691) 2013-05-04 07:49:25 +02:00
Philipp Hagemeister
1f99511210 release 2013.05.04 2013-05-04 07:12:33 +02:00
Philipp Hagemeister
0d94f2474c Work around a Python bug on Windows with UTF-8 configuration (#820) 2013-05-04 07:09:50 +02:00
Philipp Hagemeister
480b6c1e8b Fix comedycentral: newest 2013-05-04 02:53:26 +02:00
Philipp Hagemeister
95464f14d1 Credit @yasoob for IE 2013-05-03 20:08:16 +02:00
Philipp Hagemeister
c34407d16c Simplify RedTube 2013-05-03 20:07:35 +02:00
M.Yasoob Ullah Khalid
5e34d2ebbf Moved redtube info extractor to the end 2013-05-03 23:57:16 +06:00
M.Yasoob Ullah Khalid
815dd2ffa8 Redtube test now works
I just did a little makeover by changing redtube tests. Now they are passed.
2013-05-03 23:51:27 +06:00
M.Yasoob Ullah Khalid
ecd5fb49c5 added redtube.com in InfoExtractors (2nd pull request with the required amendments)
Now this script can also download redtube.com videos
2013-05-03 22:44:34 +06:00
M.Yasoob Ullah Khalid
b86174e7a3 added test for redtube.com
I just added the test for redtube.com
2013-05-03 22:40:56 +06:00
Jaime Marquínez Ferrándiz
2e2038dc35 TEDIE: report the correct talk title when a link with the language code is given 2013-05-02 18:28:07 +02:00
Jaime Marquínez Ferrándiz
46bfb42258 InfoExtractors: use _download_webpage in more IEs
IEs without tests are intact.
2013-05-02 18:18:27 +02:00
Jaime Marquínez Ferrándiz
feecf22511 InfoExtractors: fix some regular expressions where dots weren't escaped 2013-05-02 13:39:56 +02:00
Jaime Marquínez Ferrándiz
4c4f15eb78 Merge pull request #815 from JohnyMoSwag/master
Update for new source links on worldstarhiphop.com
2013-05-02 13:23:32 +02:00
Jaime Marquínez Ferrándiz
104ccdb8b4 TumblrIE: fix title matching 2013-05-02 13:12:41 +02:00
Johny Mo Swag
6ccff79594 Small update for addition of new video source links 2013-05-01 20:30:14 -07:00
Jaime Marquínez Ferrándiz
aed523ecc1 Add BandcampIE (closes #568) 2013-05-01 15:55:46 +02:00
Philipp Hagemeister
d496a75d0a release 2013.05.01 2013-05-01 14:07:23 +02:00
Philipp Hagemeister
5c01dd1e73 Merge remote-tracking branch 'origin/master' 2013-05-01 14:05:02 +02:00
Philipp Hagemeister
11d9224e3b add --write-thumbnail option to download thumbnail (Suggested by `) 2013-05-01 14:04:33 +02:00
Jaime Marquínez Ferrándiz
34c29ba1d7 Add test for SoundcloudSet 2013-04-30 21:23:38 +02:00
Philipp Hagemeister
6cd657f9f2 release 2013.04.31 2013-04-30 19:50:20 +02:00
Philipp Hagemeister
4ae9e55822 Correctly clear the line before writing a new status line 2013-04-30 19:42:58 +02:00
Philipp Hagemeister
8749b71273 Fix FakeDownloaders 2013-04-30 19:42:13 +02:00
Philipp Hagemeister
dbc50fdf82 Fix help for --proxy 2013-04-30 18:27:54 +02:00
Philipp Hagemeister
b1d2ef9255 release 2013.04.30 2013-04-30 18:00:56 +02:00
Philipp Hagemeister
5fb16555af --proxy option 2013-04-30 17:57:13 +02:00
Jaime Marquínez Ferrándiz
ba7c775a04 Remove a commented line I forgot.
[ci skip]
2013-04-30 14:21:46 +02:00
Jaime Marquínez Ferrándiz
fe348844d9 SoundcloudSetIE: Use upload_date in the unified format (fixes #812) 2013-04-29 23:57:36 +02:00
Jaime Marquínez Ferrándiz
767e00277f Use report_warning when a non-working IE will be used 2013-04-28 17:12:07 +02:00
Philipp Hagemeister
6ce533a220 release 2013.04.28 2013-04-28 16:32:05 +02:00
Philipp Hagemeister
08b2ac745a Default to --title (Fixes #499) 2013-04-28 16:26:11 +02:00
Philipp Hagemeister
46a127eecb Fix print_notes 2013-04-28 16:21:29 +02:00
Philipp Hagemeister
fc63faf070 release 2013.04.27 2013-04-28 15:53:14 +02:00
Philipp Hagemeister
9665577802 Adapt tests to changes in youtube's "Most Popular" channel 2013-04-28 15:50:29 +02:00
Philipp Hagemeister
434aca5b14 Automatically set HTTPS proxy if given (Fixes #805) 2013-04-28 15:41:05 +02:00
Jaime Marquínez Ferrándiz
e31852aba9 Document the video selection using the upload date 2013-04-28 12:02:30 +02:00
Jaime Marquínez Ferrándiz
37254abc36 Allow to use relative dates in the format (now|today)[+-][0-9](day|week|month|year)(s)? (Closes #137)
Also fix DateRange not accepting ranges of one day.
2013-04-28 11:39:37 +02:00
Philipp Hagemeister
a11ea50319 Re-enable Dailymotion (tests pass) 2013-04-27 21:53:21 +02:00
Philipp Hagemeister
81df121dd3 Merge branch 'master' of github.com:rg3/youtube-dl 2013-04-27 20:26:42 +02:00
Philipp Hagemeister
50f6412eb8 Rename soundcloud to soundcloud:set 2013-04-27 20:12:46 +02:00
Jaime Marquínez Ferrándiz
bf50b0383e Fix some IEs that didn't return the upload_date in the YYYYMMDD format
Create a function unified_strdate in utils.py to fix these problems
2013-04-27 15:14:20 +02:00
Jaime Marquínez Ferrándiz
bd55852517 Allow to select videos to download by their upload dates (related #137)
Only absolute dates.
2013-04-27 14:01:55 +02:00
Jaime Marquínez Ferrándiz
4c9f7a9988 SteamIE: accept urls with agecheck 2013-04-27 11:03:34 +02:00
Jaime Marquínez Ferrándiz
aba8df23ed YoutubePlaylistIE: don't crash with empty lists (related #808)
The playlist_title wasn't initialized.
2013-04-27 10:41:52 +02:00
Jaime Marquínez Ferrándiz
3820df0106 Merge pull request #801 from expleo/add_referer_support 2013-04-26 19:34:32 +02:00
Joe Frambach
e74c504f91 Don't delete source file when source file and post-processed file are the same 2013-04-24 21:59:10 +00:00
Jaime Marquínez Ferrándiz
fa70605db2 IEs: clean __init__ methods
They are not needed
2013-04-24 23:05:43 +02:00
Jaime Marquínez Ferrándiz
0d173446ff InfoExtractors: use report_download_webpage in _request_webpage
Allows to show the warning when falling back on GenericIE
2013-04-24 22:11:57 +02:00
Jaime Marquínez Ferrándiz
320e26a0af Clean duplicate method report_download_webpage in InfoExtractors 2013-04-24 22:02:20 +02:00
Jaime Marquínez Ferrándiz
a3d689cfb3 Fix InfoQ 2013-04-24 21:16:10 +02:00
Bjorn Heesakkers
59cc5d9380 Updated README 2013-04-24 14:12:33 +02:00
Bjorn Heesakkers
28535652ab Adds support for passing a referer. 2013-04-24 13:56:04 +02:00
Philipp Hagemeister
7b670a4483 YouTube: Fall back to <meta> description if video is rated (Fixes #800) 2013-04-23 13:54:17 +02:00
Jaime Marquínez Ferrándiz
69fc019f26 YoutubeIE when no description is found use an empty unicode string (closes #800) 2013-04-23 12:24:08 +02:00
Jaime Marquínez Ferrándiz
613bf66939 More calls to trouble changed to report_error 2013-04-23 11:31:37 +02:00
Jaime Marquínez Ferrándiz
9edb0916f4 Disable colored messages in Windows (related #794) 2013-04-23 11:09:22 +02:00
Jaime Marquínez Ferrándiz
f4b659f782 Document order of preference for format selection (closes #798) 2013-04-23 10:33:54 +02:00
Philipp Hagemeister
c70446c7df Merge branch 'master' of github.com:rg3/youtube-dl 2013-04-22 23:15:15 +02:00
Philipp Hagemeister
c76cb6d548 Correct indentation 2013-04-22 23:15:05 +02:00
Philipp Hagemeister
71f37e90ef Merge pull request #797 from AI0867/patch-1
Use standard unit symbols in format_bytes
2013-04-22 14:13:52 -07:00
Philipp Hagemeister
75b5c590a8 Do not read configuration files if explicit arguments are given by a host program (#792) 2013-04-22 23:05:14 +02:00
Jaime Marquínez Ferrándiz
4469666780 Merge pull request #792 from fp7/master
Parameters as arguments to main
2013-04-22 13:44:05 -07:00
Jaime Marquínez Ferrándiz
c15e024141 TumblrIE
I haven't found many videos to test, so it may not work for all.
2013-04-22 21:27:27 +02:00
Philipp Hagemeister
8cb94542f4 release 2013.04.22 2013-04-22 20:01:56 +02:00
Philipp Hagemeister
c681a03918 Fix --list-formats (Closes #799) 2013-04-22 19:51:56 +02:00
Finn Petersen
30f2999962 Added parentheses for explicitness 2013-04-22 10:15:58 +02:00
Jaime Marquínez Ferrándiz
74e3452b9e Add playlist and playlist_index to the help string for the output option
Also split the help string in different lines to make editing easier.
2013-04-22 10:06:07 +02:00
Jaime Marquínez Ferrándiz
9e1cf0c200 SteamIE returns a playlist
With the game name as title.
2013-04-21 22:05:21 +02:00
Jaime Marquínez Ferrándiz
e11eb11906 Allow to download videos with age check from Steam
Also move method report_age_confirmation to the base IE class.
2013-04-21 21:56:13 +02:00
Philipp Hagemeister
c04bca6f60 release 2013.04.21 2013-04-21 12:52:45 +02:00
Alexander van Gessel
b0936ef423 Use standard unit symbols in format_bytes 2013-04-21 02:38:37 +03:00
Jaime Marquínez Ferrándiz
41a6eb949a Clean duplicate method report_extraction in InfoExtractors
A lot of IEs had implemented the method in the same way.
2013-04-20 21:12:29 +02:00
Jaime Marquínez Ferrándiz
f17ce13a92 Write the method to_screen in InfoExtractor (related #608)
Except the ones in youtube subtypes (user, channels ..) all calls to _downloader.to_screen have been changed.
The calls not prefixed with the IE name haven't been touched.
2013-04-20 20:55:40 +02:00
Jaime Marquínez Ferrándiz
8c416ad29a Remove calls to _downloader.download in Youtube searches
Instead, return the urls of the videos.
2013-04-20 19:22:45 +02:00
Jaime Marquínez Ferrándiz
c72938240e Get the title of Youtube playlists 2013-04-20 18:57:05 +02:00
Jaime Marquínez Ferrándiz
e905b6f80e TEDIE can now return a playlist 2013-04-20 13:31:21 +02:00
Jaime Marquínez Ferrándiz
6de8f1afb7 Allows to specify which IE should be used for extracting info for a result of type url 2013-04-20 12:58:35 +02:00
Jaime Marquínez Ferrándiz
9341212642 Create a function in InfoExtractors that returns the InfoExtractor class with the given name 2013-04-20 12:42:57 +02:00
Jaime Marquínez Ferrándiz
f7a9721e16 Fix some metacafe videos, closes #562 2013-04-20 12:06:58 +02:00
Jaime Marquínez Ferrándiz
089e843b0f Use _download_webpage in MetacafeIE 2013-04-20 11:40:05 +02:00
Jaime Marquínez Ferrándiz
c8056d866a Add myself to travis notifications 2013-04-20 11:17:03 +02:00
Jaime Marquínez Ferrándiz
49da66e459 The test video for subtitles has added a new language 2013-04-20 10:39:02 +02:00
ispedals
fb6c319904 Add tests for YoutubeChannelIE
- tests for identifying channel urls
- test retrieval of paginated channel
- test retrieval of autogenerated channel
2013-04-19 18:11:05 -04:00
ispedals
5a8d13199c Fix YoutubeChannelIE
- urls with query parameters now match
- fixes regex for identifying videos
- fixes pagination
2013-04-19 18:05:35 -04:00
Jaime Marquínez Ferrándiz
dce9027045 Merge branch 'extract_info_rewrite' 2013-04-19 21:57:08 +02:00
Philipp Hagemeister
feba604e92 Fix playlists with size 50i ∀ i∉ℕ (Closes #782) 2013-04-18 07:28:43 +02:00
Philipp Hagemeister
d22f65413a release 2013.04.18 2013-04-18 06:29:32 +02:00
Philipp Hagemeister
0599ef8c08 Limit titles to 200 characters (Closes #789) 2013-04-18 06:27:11 +02:00
Philipp Hagemeister
bfdf469295 Fix FunnyOrDie extraction for a special video (#789) 2013-04-18 06:21:46 +02:00
Philipp Hagemeister
32c96387c1 Fix facebook IE 2013-04-18 04:41:48 +02:00
Philipp Hagemeister
c8c5443bb5 Revert "disable YT ratelimit; this should enable to max out the connection bandwidth"
Although cool, that seems to break a lot of youtube videos.

This reverts commit a60b854d90.
2013-04-17 23:22:25 +02:00
Filippo Valsorda
a60b854d90 disable YT ratelimit; this should enable to max out the connection bandwidth 2013-04-17 19:48:35 +02:00
Finn Petersen
b8ad4f02a2 Arguments as parameter to function _real_main so it can be used programmatically 2013-04-16 19:26:48 +02:00
Jaime Marquínez Ferrándiz
d281274bf2 Add a playlist_index key to the info_dict, can be used in the output template 2013-04-16 15:13:29 +02:00
Philipp Hagemeister
b625bc2c31 release 2013.04.11 2013-04-11 18:42:57 +02:00
Philipp Hagemeister
f4381ab88a Fix keek title extraction 2013-04-11 18:39:13 +02:00
Philipp Hagemeister
744435f2a4 Show whole diff in error cases 2013-04-11 18:38:43 +02:00
Philipp Hagemeister
855703e55e Option to dump intermediate pages 2013-04-11 18:31:35 +02:00
Philipp Hagemeister
927c8c4924 Use download_webpage in youtube IE 2013-04-11 18:18:15 +02:00
Philipp Hagemeister
0ba994e9e3 Skip ARD test as it requires rtmpdump 2013-04-11 17:20:17 +02:00
Philipp Hagemeister
af9ad45cd4 Re-enable Stanford OC test 2013-04-11 17:20:05 +02:00
Philipp Hagemeister
e0fee250c3 Fix default for variable-size autonumbering 2013-04-11 17:07:55 +02:00
Philipp Hagemeister
72ca05016d Merge remote-tracking branch 'sagittarian/vimeo-no-desc' 2013-04-11 10:56:01 +02:00
Philipp Hagemeister
844d1f9fa1 Removed overly verbose options and arguments (Should be obvious from the previous lines) 2013-04-11 10:54:37 +02:00
Stanislav Kupryakhin
213c31ae16 Added option --autonumber-size:
Specifies the number of digits in %(autonumber)s when it is present in output filename template or --autonumber option is given
2013-04-11 10:53:57 +02:00
Philipp Hagemeister
04f3d551a0 Merge remote-tracking branch 'sagittarian/resolve-symlinks' 2013-04-11 10:51:13 +02:00
Philipp Hagemeister
e8600d69fd Credit @catch22 for ARD IE 2013-04-11 10:48:37 +02:00
Philipp Hagemeister
b03d65c237 Minor improvements for ARD IE 2013-04-11 10:47:21 +02:00
Adam Mesha
8743974189 Resolve the symlink if __main__.py is invoked as a symlink. 2013-04-11 08:02:17 +03:00
Adam Mesha
dc36bc9434 Fix bug when the vimeo description is empty on Python 2.x. 2013-04-11 07:27:04 +03:00
Jaime Marquínez Ferrándiz
bce878a7c1 Implement the playlist/start options in FileDownloader
It makes it available for all the InfoExtractors
2013-04-10 14:32:03 +02:00
Jaime Marquínez Ferrándiz
532d797824 In MetacafeIE return a url if YoutubeIE should do the job 2013-04-10 00:06:03 +02:00
Jaime Marquínez Ferrándiz
146c12a2da Change the order for extracting/downloading
Now it gets a video's info and downloads it directly, then it moves on to the next video found.
2013-04-10 00:05:04 +02:00
Jaime Marquínez Ferrándiz
d39919c03e Add progress counter for playlists
Closes #276
2013-04-09 13:45:52 +02:00
Michael Walter
df2dedeefb added ARD InfoExtractor (german state television) 2013-04-07 15:23:48 +02:00
Michael Walter
adb029ed81 added --playpath/-y support to RTMP downloads (via 'play_path' entry in 'info_dict') 2013-04-07 15:17:36 +02:00
Ricardo Garcia
43ff1a347d Change rg3.github.com to rg3.github.io almost everywhere 2013-04-06 10:46:17 +02:00
Jaime Marquínez Ferrándiz
14294236bf Merge branch 'master' into extract_info_rewrite 2013-04-05 12:39:51 +02:00
Philipp Hagemeister
c2b293ba30 release 2013.04.03 2013-04-03 19:43:53 +02:00
Philipp Hagemeister
37cd9f522f Restore youtube-dl (update) binary (#770) 2013-04-01 23:43:20 +02:00
Filippo Valsorda
f33154cd39 Merge pull request #764 from jaimeMF/subtitles_not_found
Fix crash when subtitles are not found
2013-03-31 19:02:18 -07:00
Jaime Marquínez Ferrándiz
bafeed9f5d Don't crash in FileDownloader if subtitles couldn't be found and errors are ignored 2013-03-31 12:21:35 +02:00
Jaime Marquínez Ferrándiz
ef767f9fd5 Fix crash when subtitles are not found and the option --all-subs is given 2013-03-31 12:19:13 +02:00
Jaime Marquínez Ferrándiz
bc97f6d60c Use report_error in subtitles error handling 2013-03-31 12:10:12 +02:00
Filippo Valsorda
90a99c1b5e retry on UnavailableVideoError 2013-03-31 03:29:34 +02:00
Filippo Valsorda
f375d4b7de import all IEs when testing to resemble more closely the real env 2013-03-31 03:12:28 +02:00
Filippo Valsorda
fa41fbd318 don't catch YT user URLs in YoutubePlaylistIE (fix #754, fix #763) 2013-03-31 03:02:49 +02:00
Jaime Marquínez Ferrándiz
6a205c8876 More fixes on subtitles errors handling 2013-03-30 14:17:12 +01:00
Jaime Marquínez Ferrándiz
0fb3756409 Fix crash when subtitles are not found 2013-03-30 14:11:33 +01:00
Philipp Hagemeister
fbbdf475b1 Different feed file name 2013-03-29 21:44:11 +01:00
Philipp Hagemeister
c238be3e3a Correct feed title 2013-03-29 21:41:20 +01:00
Philipp Hagemeister
1bf2801e6a release 2013.03.29 2013-03-29 21:22:57 +01:00
Philipp Hagemeister
c9c8402093 Merge pull request #758 from jaimeMF/atom-feed
Add an Atom feed generator in devscripts
2013-03-29 12:50:20 -07:00
Jaime Marquínez Ferrándiz
6060788083 Write a new feed each time, reading from versions.json 2013-03-29 19:42:33 +01:00
Filippo Valsorda
e3700fc9e4 Merge pull request #736 from rg3/retry
Exception stacking and test retry
2013-03-29 09:01:27 -07:00
Filippo Valsorda
b693216d8d Merge pull request #752 from dodo/master
SoundcloudSetIE
2013-03-29 08:40:22 -07:00
Filippo Valsorda
46b9d8295d Merge pull request #730 by @JohnyMoSwag
Support for Worldstarhiphop.com
2013-03-29 16:14:49 +01:00
Filippo Valsorda
7decf8951c fix FunnyOrDieIE, MyVideoIE, TEDIE 2013-03-29 15:59:13 +01:00
Filippo Valsorda
1f46c15262 fix SpiegelIE 2013-03-29 15:31:38 +01:00
Filippo Valsorda
0cd358676c Rebased, fixed and extended LiveLeak.com support
close #757 - close #761
2013-03-29 15:13:24 +01:00
kkalpakloglou
43113d92cc Update InfoExtractors.py 2013-03-29 14:23:09 +01:00
Jaime Marquínez Ferrándiz
7eab8dc750 Pass the playlist info_dict to process_info
the playlist value can be used in the output template
2013-03-29 12:32:42 +01:00
Johny Mo Swag
44e939514e Added test for WorldStarHipHop 2013-03-28 20:05:28 -07:00
Philipp Hagemeister
95506f1235 Merge remote-tracking branch 'jaimeMF/color_error_messages' 2013-03-29 00:25:48 +01:00
Philipp Hagemeister
a91556fd74 Add a note on MaxDownloadsReached (#732, thanks to CBGoodBuddy) 2013-03-29 00:20:13 +01:00
Philipp Hagemeister
1447f728b5 Merge branch 'master' of github.com:rg3/youtube-dl 2013-03-29 00:06:48 +01:00
Jaime Marquínez Ferrándiz
d2c690828a Add title and id to playlist results
Not all IE give both. They are not used yet.
2013-03-28 13:39:00 +01:00
Jaime Marquínez Ferrándiz
cfa90f4adc Merge branch 'master' into extract_info_rewrite 2013-03-28 13:20:33 +01:00
Filippo Valsorda
898280a056 use sys.stdout.buffer only on Python3 2013-03-28 13:13:03 +01:00
Filippo Valsorda
59b4a2f0e4 Merge pull request #762 from jynnantonix/master
Use sys.stdout.buffer when writing to standard out
2013-03-28 05:11:51 -07:00
Chirantan Ekbote
1ee9778405 Use sys.stdout.buffer instead of sys.stdout
sys.stdout defaults to text mode, we need to use the underlying buffer
instead when writing binary data.

Signed-off-by: Chirantan Ekbote <chirantan.ekbote@gmail.com>
2013-03-27 15:57:11 -04:00
Jaime Marquínez Ferrándiz
db74c11d2b Add an Atom feed generator in devscripts 2013-03-26 18:13:52 +01:00
dodo
5011cded16 SoundcloudSetIE
info extractor for soundcloud sets
2013-03-24 02:24:07 +01:00
Filippo Valsorda
f10b2a9c14 fix KeekIE 2013-03-20 12:13:52 +01:00
Filippo Valsorda
5cb3c0b319 Merge pull request #699 by @iemejia
Removed unnecessary function to convert subtitles, improved use of the youtube api
2013-03-20 11:35:55 +01:00
Filippo Valsorda
b9fc428494 add '--write-srt' and '--srt-lang' aliases for backwards compatibility 2013-03-20 11:29:07 +01:00
Ismael Mejia
c0ba104674 Fixed typo in error message when no subtitles were available. 2013-03-20 08:41:54 +01:00
Ismael Mejia
2a4093eaf3 Added new option '--list-subs' to show the available subtitle languages 2013-03-20 08:41:54 +01:00
Ismael Mejia
9e62bc4439 Added new option '--sub-format' to choose the format of the subtitles to download (default=srt) 2013-03-20 08:41:54 +01:00
Ismael Mejia
553d097442 Refactor subtitle options from srt to the more generic 'sub'.
In order to be more consistent with different subtitle formats.
From:
* --write-srt to --write-sub
* --only-srt to --only-sub
* --all-srt to --all-subs
* --srt-lang to --sub-lang

Also replaced all mentions of srt with sub throughout the source code.
2013-03-20 08:41:53 +01:00
Ismael Mejia
ae608b8076 Added new option '--all-srt' to download all the subtitles of a video.
Only works in youtube for the moment.
2013-03-20 08:41:53 +01:00
Philipp Hagemeister
c397187061 Spiegel: Support hash at end of URL 2013-03-16 23:52:17 +01:00
Philipp Hagemeister
e32b06e977 Spiegel IE 2013-03-12 01:08:54 +01:00
Philipp Hagemeister
8c42c506cd Add configuration to -v output 2013-03-12 00:10:05 +01:00
Filippo Valsorda
8cc83b8dbe Bubble up all the stack of exceptions and retry download tests on timeout errors 2013-03-09 10:05:43 +01:00
Johny Mo Swag
51af426d89 forgot to fix this. 2013-03-08 22:52:17 -08:00
Johny Mo Swag
08ec0af7c6 catch fatal error 2013-03-08 22:48:05 -08:00
Johny Mo Swag
3b221c5406 removed str used for other project. 2013-03-08 22:39:45 -08:00
Philipp Hagemeister
3d3423574d Fix Unicode handling GenericIE (Fixes #734) 2013-03-08 20:47:06 +01:00
Philipp Hagemeister
e5edd51de4 Clear up error messages (#734) 2013-03-08 20:12:05 +01:00
Johny Mo Swag
64c78d50cc working - worldstarhiphop IE
Support for WorldStarHipHop
2013-03-07 16:27:21 -08:00
Johny Mo Swag
b3bcca0844 clean up 2013-03-07 15:39:17 -08:00
Johny Mo Swag
61e40c88a9 fixed typo 2013-03-06 21:14:46 -08:00
Johny Mo Swag
40634747f7 Support for WorldStarHipHop.com 2013-03-06 21:09:55 -08:00
Philipp Hagemeister
c2e21f2f0d Merge pull request #728 from timdoug/fix-escapist-extension
Escapist videos are actually .mp4, not .flv
2013-03-06 10:26:18 -08:00
Tim Douglas
47dcd621c0 Escapist videos are actually .mp4, not .flv 2013-03-06 12:46:45 -05:00
Jaime Marquínez Ferrándiz
a0d6fe7b92 When a redirect is found return the new url using the new style 2013-03-05 22:33:32 +01:00
Jaime Marquínez Ferrándiz
c9fa1cbab6 More trouble calls changed in InfoExtractors.py
The calls with the message starting with 'WARNING' have been changed to report_warning instead of report_error
2013-03-05 21:13:17 +01:00
Jaime Marquínez Ferrándiz
8a38a194fb Add auxiliary methods to InfoExtractor to set the '_type' key and use them for some playlist IEs 2013-03-05 20:55:48 +01:00
Jaime Marquínez Ferrándiz
6ac7f082c4 extract_info now expects ie.extract to return a list in the format proposed in issue 608.
Each element should have a '_type' key specifying if it's a video, an url or a playlist.
`extract_info` will process each element to get the full info
2013-03-05 20:14:32 +01:00
Jaime Marquínez Ferrándiz
f6e6da9525 Use extract_info in BlipTV User and Youtube Channel 2013-03-05 12:26:18 +01:00
Jaime Marquínez Ferrándiz
597cc8a455 Use extract_info in YoutubePlaylist and YoutubeSearch 2013-03-05 11:58:01 +01:00
Jaime Marquínez Ferrándiz
3370abd509 Merge branch 'master' into extract_info_rewrite 2013-03-04 22:25:46 +01:00
Jaime Marquínez Ferrándiz
631f73978c Add a method for extracting info from a list of urls 2013-03-04 22:16:42 +01:00
Jaime Marquínez Ferrándiz
e5f30ade10 Use report_error in InfoExtractors.py
Some calls haven't been changed
2013-03-04 15:56:14 +01:00
Jaime Marquínez Ferrándiz
6622d22c79 Use report_error in FileDownloader.py 2013-03-04 11:47:58 +01:00
Jaime Marquínez Ferrándiz
4e1582f372 Use red color when printing error messages 2013-03-04 11:27:25 +01:00
Philipp Hagemeister
967897fd22 Fix Python 3 errors with rtmp downloads 2013-03-03 22:38:38 +01:00
Philipp Hagemeister
f918ec7ea2 Clarify rate limit documentation (Closes #723) 2013-03-03 22:35:26 +01:00
Philipp Hagemeister
a2ae43a55f Remove changed playlist test (#661) 2013-03-03 22:19:19 +01:00
Philipp Hagemeister
7ae153ee9c Remove tweetreel - it has shut down 2013-03-03 22:15:06 +01:00
Philipp Hagemeister
f7b567ff84 Use proper urlparse functions and simplify a bit 2013-03-03 22:09:44 +01:00
Philipp Hagemeister
f2e237adc8 Merge remote-tracking branch 'jcarlosgarciasegovia/master' 2013-03-03 22:04:06 +01:00
Jaime Marquínez Ferrándiz
2e5457be1d Use report_warning in InfoExtractors 2013-03-02 11:24:07 +01:00
Juan Carlos Garcia Segovia
7f9d41a55e Allow downloading http://blip.tv/play/ embedded URLs 2013-03-01 10:22:16 +00:00
Jaime Marquínez Ferrándiz
8207626bbe Use color when printing warning messages 2013-02-28 22:07:29 +01:00
Jaime Marquínez Ferrándiz
df8db1aa21 Create extract_info method 2013-02-26 23:33:58 +01:00
Philipp Hagemeister
691db5ba02 Don't be too clever (Fixes Python 3) 2013-02-26 22:03:43 +01:00
Philipp Hagemeister
acb8752f80 fix tests in Python3, and make them parallelizable 2013-02-26 22:03:33 +01:00
Philipp Hagemeister
679790eee1 Do not use upper-case for non-constants 2013-02-26 20:03:19 +01:00
Philipp Hagemeister
6bf48bd866 Merge remote-tracking branch 'origin/API_YT_playlists' 2013-02-26 19:58:04 +01:00
Philipp Hagemeister
790d4fcbe1 Merge pull request #715 from joksnet/no_video_results
[YT Search] No results if items is not in response
2013-02-26 10:43:35 -08:00
Filippo Valsorda
89de9eb125 Modified Youtube video/playlist matching; fixes #668; fixes #585 2013-02-26 19:06:41 +01:00
Filippo Valsorda
6324fd1d74 Switch YTPlaylistIE to API (relevant: #586); fixes #651; fixes #673; fixes #661 2013-02-26 19:06:28 +01:00
Juan M
9e07cf2955 [YT Search] No results if items is not in response
When a query returns 0 items, the key items is not present in the
api_response dictionary, raising a KeyError.

Instead, look for the key and call trouble if it's not present.
2013-02-26 18:06:43 +01:00
Philipp Hagemeister
f03b88b3fb Merge remote-tracking branch 'joksnet/not_keep_video_message' 2013-02-25 00:35:12 +01:00
Philipp Hagemeister
97d0365f49 release 2013.02.25 2013-02-25 00:28:19 +01:00
Philipp Hagemeister
12887875a2 Fix typo 2013-02-25 00:22:55 +01:00
Philipp Hagemeister
450e709972 Formalize URL creation (prepare for some cleanup in blip.tv:users) 2013-02-24 23:23:50 +01:00
Philipp Hagemeister
9befce2b8c Merge remote-tracking branch 'joksnet/ytsearch_decode_request' 2013-02-24 23:14:34 +01:00
Philipp Hagemeister
cb99797798 Test TED thumbnail 2013-02-24 01:01:20 +01:00
Philipp Hagemeister
f82b28146a Merge remote-tracking branch 'jaimeMF/TED' 2013-02-24 00:59:22 +01:00
Philipp Hagemeister
4dc72b830c Merge remote-tracking branch 'jaimeMF/Steam' 2013-02-24 00:59:03 +01:00
Philipp Hagemeister
ea05129ebd release 2013.02.22 2013-02-24 00:47:08 +01:00
Juan M
35d217133f The message about deleting the video is not an error.
When using youtube-dl from another Python script with the quiet option
on and a post-processor to extract the audio, the message about deleting
the video shows up in the first script's logs (as it goes to stderr).

There is no way to keep this quiet, as it's treated as an error even though,
for me, it's not.
2013-02-23 22:52:52 +01:00
Juan M
d1b7a24354 Decode the data requested to the api in utf-8. 2013-02-23 22:47:22 +01:00
Jaime Marquínez Ferrándiz
c85538dba1 TED: get thumbnails 2013-02-23 17:27:49 +01:00
Jaime Marquínez Ferrándiz
60bd48b175 Steam: get thumbnails 2013-02-23 16:48:15 +01:00
Philipp Hagemeister
4be0aa3539 release 2012.02.22 2013-02-22 16:41:36 +01:00
Philipp Hagemeister
f636c34481 Stop early in nosetests (in release script) 2013-02-22 16:40:19 +01:00
Philipp Hagemeister
3bf79c752e Print *all* release notes 2013-02-22 00:36:23 +01:00
Ismael Mejia
cdb130b09a Added new option '--only-srt' to download only the subtitles of a video
Improved option '--srt-lang'
 - it shows the argument in case of missing subtitles
 - added language suffix for non-english languages (e.g. video.it.srt)
2013-02-21 22:12:36 +01:00
Ismael Mejia
2e5d60b7db Removed conversion from youtube closed caption format to srt since youtube api supports the 'srt' format 2013-02-21 20:51:35 +01:00
Philipp Hagemeister
8271226a55 Fix --match-title and --reject-title decoding (Closes #690) 2013-02-21 17:09:39 +01:00
Philipp Hagemeister
1013186a17 Also check for JSLoader of JWSPlayer (thanks to @maximeg, Closes #685) 2013-02-21 16:56:48 +01:00
Philipp Hagemeister
7c038b3c32 Import HTTPErrorProcessor from the correct module (Closes #696) 2013-02-21 16:49:05 +01:00
Philipp Hagemeister
c8cd8e5f55 release 2013.02.19 2013-02-19 00:06:04 +01:00
Philipp Hagemeister
471cf47796 include bash completion and manpage in PyPi dist 2013-02-18 23:56:13 +01:00
Philipp Hagemeister
d8f64574a4 release 2013.02.18 2013-02-18 23:37:20 +01:00
Philipp Hagemeister
e711babbd1 Fix YP IE 2013-02-18 23:30:33 +01:00
Philipp Hagemeister
a72b0f2b6f Use proper echo commands 2013-02-18 23:22:01 +01:00
Philipp Hagemeister
434eb6f26b Include man and bash completion in PyPi release 2013-02-18 23:19:57 +01:00
Philipp Hagemeister
197080b10b Merge remote-tracking branch 'jaimeMF/TED' 2013-02-18 23:12:56 +01:00
Philipp Hagemeister
7796e8c2cb facebook: also download lq videos 2013-02-18 23:12:48 +01:00
Philipp Hagemeister
6d4363368a Fix MyVideo IE 2013-02-18 22:32:56 +01:00
Jaime Marquínez Ferrándiz
414638cd50 TED: Add support for playlists 2013-02-18 21:42:06 +01:00
Philipp Hagemeister
2a9983b78f Fix 8tracks 2013-02-18 19:11:32 +01:00
Philipp Hagemeister
b17c974a88 Mark DailyMotion as broken for now (#680) 2013-02-18 18:53:40 +01:00
Philipp Hagemeister
5717d91ab7 Correct --newline and give it a more meaningful title 2013-02-18 18:52:06 +01:00
Philipp Hagemeister
79eb0287ab Merge remote-tracking branch 'glisignoli/master' 2013-02-18 18:47:35 +01:00
Philipp Hagemeister
58994225bc Add tests to MySpass 2013-02-18 18:45:09 +01:00
Jaime Marquínez Ferrándiz
59d4c2fe1b fix some titles in TED 2013-02-17 17:25:02 +01:00
Jaime Marquínez Ferrándiz
3a468f2d8b Basic support for TED 2013-02-17 17:13:06 +01:00
bastik
1ad5d872b9 added new InfoExtractor for myspass.de 2013-02-16 13:46:13 +01:00
glisignoli
355fc8e944 Update README.md 2013-02-15 15:57:40 +13:00
glisignoli
380a29dbf7 Update youtube_dl/__init__.py 2013-02-15 15:55:11 +13:00
Gino Lisignoli
1528d6642d Forgot to remove \r 2013-02-13 16:43:08 +13:00
Gino Lisignoli
7311fef854 Modified youtube-dl to write new lines with the --newline switch. This
enables easier process monitoring when being called with external
scripts.
2013-02-13 14:02:31 +13:00
Mantas Mikulėnas
906417c7c5 Fix delayed title display in --console-title
With Python 3, the titlebar wouldn't get updated for a long time (due to
stderr buffering), and when it did, the title would be shown as b'...'
representation.
2013-02-09 22:58:12 +02:00
Philipp Hagemeister
6aabe82035 Credit Osama Khalid for Keek support 2013-02-08 11:01:09 +01:00
Philipp Hagemeister
f0877a445e Add tests for keek 2013-02-08 11:00:28 +01:00
Osama Khalid
da06e2daf8 Add KeekIE() 2013-02-08 10:25:55 +03:00
Philipp Hagemeister
d3f5f9f6b9 Fix login (Closes #658) 2013-02-06 21:22:53 +01:00
Philipp Hagemeister
bfc6ea7935 Ignore PyPi metadata 2013-02-05 13:42:52 +01:00
Philipp Hagemeister
8edc2cf8ca Support direct vimeo links (Closes #666) 2013-02-05 13:42:08 +01:00
Philipp Hagemeister
fb778e66df Fix encoding in youtube subtitle download (Closes #669) 2013-02-05 13:30:02 +01:00
Philipp Hagemeister
3a9918d37f Escapist continues to be flaky on travis 2013-02-02 14:53:34 +01:00
Philipp Hagemeister
ccb0cae134 Fix automatic release (oops) 2013-02-02 14:52:38 +01:00
Philipp Hagemeister
085c8b75a6 release 2013.02.02 2013-02-02 14:45:38 +01:00
Philipp Hagemeister
dbf2ba3d61 Better help for new options 2013-02-02 14:44:22 +01:00
Philipp Hagemeister
b47bbac393 Disable Stanford OC test for now, and enable escapist 2013-02-02 14:40:41 +01:00
Philipp Hagemeister
229cac754a Improve cookie error handling 2013-02-02 13:51:54 +01:00
Philipp Hagemeister
0e33684194 Switch to m4a by default (Closes #240) 2013-02-01 18:23:20 +01:00
Jeff Crouse
9e982f9e4e Added "min-filesize" and "max-filesize" options 2013-02-01 18:09:34 +01:00
Philipp Hagemeister
c7a725cfad Merge remote-tracking branch 'dcoppa/master' 2013-02-01 18:05:42 +01:00
Philipp Hagemeister
450a30cae8 Add PyPi upload to release script 2013-02-01 18:01:53 +01:00
Philipp Hagemeister
9cd5e4fce8 release 2013.02.01 2013-02-01 17:57:32 +01:00
Philipp Hagemeister
edba5137b8 Fix Facebook IE 2013-02-01 17:56:22 +01:00
Philipp Hagemeister
233a22960a Switch ComedyCentral test to a permanent URL (They delete full episodes older than a month) 2013-02-01 17:46:03 +01:00
Philipp Hagemeister
3b024e17af Work around buggy HTML Parser in Python < 2.7.3 (Closes #662) 2013-02-01 17:29:50 +01:00
David Coppa
a32b573ccb Try setuptools first, then fallback to distutils.core 2013-01-30 15:31:38 +01:00
Philipp Hagemeister
ec71c13ab8 release 2013.01.28 2013-01-27 18:33:58 +01:00
Philipp Hagemeister
f0bad2b026 Fix Stanford (Closes #653) 2013-01-27 15:23:26 +01:00
Philipp Hagemeister
25580f3251 8tracks: Ignore hashes 2013-01-27 04:15:12 +01:00
Philipp Hagemeister
da4de959df 8tracks: Better default titles 2013-01-27 04:05:53 +01:00
Philipp Hagemeister
d0d51a8afa 8tracks: Include performer as uploader 2013-01-27 03:27:46 +01:00
Philipp Hagemeister
c67598c3e1 Remove space before shebang 2013-01-27 03:07:07 +01:00
Philipp Hagemeister
811d253bc2 Merge remote-tracking branch 'jaimeMF/makefilePythonversion' 2013-01-27 03:06:32 +01:00
Philipp Hagemeister
c3a1642ead release 2013.01.27 2013-01-27 03:03:02 +01:00
Philipp Hagemeister
ccf65f9dee 8tracks IE (Closes #652) 2013-01-27 03:01:23 +01:00
Philipp Hagemeister
b954070d70 Fix Facebook (Closes #375) 2013-01-25 16:54:48 +01:00
Philipp Hagemeister
30e9f4496b Drop md5: spec for now (unused and breaks int values) 2013-01-25 16:54:25 +01:00
Jaime Marquínez Ferrándiz
271d3fbdaa Option in makefile to select python interpreter 2013-01-25 15:11:03 +01:00
Philipp Hagemeister
6df40dcbe0 Guard against sys.getfilesystemencoding() == None (#503) 2013-01-20 01:48:05 +01:00
Philipp Hagemeister
97f194c1fb twitch.tv: Use id as title if no title is present (Closes #638) 2013-01-16 09:55:45 +01:00
Philipp Hagemeister
4da769ccca Do not backup version.py (under version control and frankly, not that complex) 2013-01-12 23:04:46 +01:00
Philipp Hagemeister
253d96f2e2 Force build removal 2013-01-12 22:25:54 +01:00
Philipp Hagemeister
bbc3e2753a release 2013.01.13 2013-01-12 22:18:13 +01:00
Philipp Hagemeister
67353612ba Revert "Move update to front"
This reverts commit db30f02b50.
2013-01-12 22:10:36 +01:00
Philipp Hagemeister
bffbd5f038 Download progress hooks 2013-01-12 20:34:50 +01:00
Philipp Hagemeister
d8bbf2018e Aggressive test timeout to catch hanging servers 2013-01-12 20:33:03 +01:00
Philipp Hagemeister
187f491ad2 [RBMA] Do not fail if thumbnail is empty 2013-01-12 18:45:50 +01:00
Philipp Hagemeister
335959e778 Correct Blip.tv on 2.6, where HTTP headers are case-sensitive (wtf?) 2013-01-12 18:38:23 +01:00
Philipp Hagemeister
3b83bf8f6a correct pushes in release script 2013-01-12 18:37:21 +01:00
Philipp Hagemeister
51719893bf Default to py3 in sign-versions 2013-01-12 18:14:07 +01:00
Philipp Hagemeister
1841f65e64 Python 2-proof versions.py 2013-01-12 18:12:24 +01:00
Philipp Hagemeister
bb28998920 fix location of updates_key in devscripts/release 2013-01-12 18:07:31 +01:00
Philipp Hagemeister
fbc5f99db9 release 2013.01.12 2013-01-12 17:59:58 +01:00
Philipp Hagemeister
ca0a0bbeec RBMA IE (Closes #630) 2013-01-12 17:58:39 +01:00
Philipp Hagemeister
6119f78cb9 Add location field 2013-01-12 17:34:31 +01:00
Philipp Hagemeister
539679c7f9 Make uploader and upload_date fields optional 2013-01-12 17:34:09 +01:00
Philipp Hagemeister
b642cd44c1 restore youtube-dl (update) binary 2013-01-12 17:07:12 +01:00
Philipp Hagemeister
fffec3b9d9 Credit jefftimesten for YouPornIE, PornoTubeIE, YouJizzIE 2013-01-12 16:51:20 +01:00
Philipp Hagemeister
3446dfb7cb Proper support for changing User-Agents from IEs 2013-01-12 16:49:13 +01:00
Philipp Hagemeister
db16276b7c Improve YouJizz 2013-01-12 16:41:04 +01:00
Philipp Hagemeister
629fcdd135 Add agecheck and various improvements to YouPorn IE 2013-01-12 16:10:35 +01:00
Philipp Hagemeister
64ce2aada8 _request_webpage helper methods for queries that need the final URL 2013-01-12 16:10:16 +01:00
Philipp Hagemeister
565f751967 Clean up porno IEs 2013-01-12 15:17:04 +01:00
Philipp Hagemeister
6017964580 Merge remote-tracking branch 'jefftimesten/master' 2013-01-12 15:12:50 +01:00
Philipp Hagemeister
1d16b0c3fe Keep file without any PPs (oops, missed the obvious case) 2013-01-12 15:12:28 +01:00
Philipp Hagemeister
7851b37993 --recode-video option (Closes #18) 2013-01-12 15:09:09 +01:00
Philipp Hagemeister
d81edc573e Merge 'jaimeMF/videoconversion' (sans actual option for now) 2013-01-12 14:04:30 +01:00
Philipp Hagemeister
ef0c8d5f9f Make ustream IE more robust 2013-01-12 13:49:14 +01:00
Philipp Hagemeister
db30f02b50 Move update to front 2013-01-12 13:45:39 +01:00
Philipp Hagemeister
4ba7262467 Less confusing player version 2013-01-12 13:35:16 +01:00
Jaime Marquínez Ferrándiz
67d0c25eab Add a PostProcessor for converting video format 2013-01-11 20:50:49 +01:00
Philipp Hagemeister
09f9552b40 Less git acrobatics in devscripts/release.sh 2013-01-11 08:28:37 +01:00
Philipp Hagemeister
142d38f776 release 2013.01.11 2013-01-11 08:05:30 +01:00
Philipp Hagemeister
6dd3471900 Add Makefile in tarball (Closes #626) 2013-01-11 08:00:27 +01:00
Philipp Hagemeister
280d67896a Correct documentation (Closes #625) 2013-01-10 23:20:26 +01:00
Philipp Hagemeister
510e6f6dc1 Support --audio-format=opus 2013-01-10 19:15:04 +01:00
Philipp Hagemeister
712e86b999 Fix broken ffmpeg (Closes #623) 2013-01-09 14:46:19 +01:00
Philipp Hagemeister
74fdba620d release 2013.01.08 2013-01-08 10:29:53 +01:00
Philipp Hagemeister
dc1c479a6f Merge pull request #621 from atomizer/master
justin.tv tweaks
2013-01-08 00:57:46 -08:00
atomizer
119d536e07 Merge branch 'my-origin/master' 2013-01-07 17:03:58 +04:00
atomizer
fa1bf9c653 justin.tv tweaks
- download all parts of a broadcast, fixes #614
- set "uploader" variable to channel_name if available
- catch api errors even if http status is 200
2013-01-07 16:59:39 +04:00
Philipp Hagemeister
814eed0ea1 Fix tar target (--exclude-vcs is not supported everywhere, and reading . while writing to it can fail randomly) 2013-01-07 12:48:07 +01:00
Philipp Hagemeister
0aa3068e9e Do not check in test_coverage 2013-01-06 23:38:36 +01:00
Philipp Hagemeister
db2d6124b1 correct quoting 2013-01-06 23:14:56 +01:00
Philipp Hagemeister
039dc61bd2 Simplify Makefile 2013-01-06 23:02:31 +01:00
Philipp Hagemeister
4b879984ea release 2013.01.06 2013-01-06 22:52:04 +01:00
Philipp Hagemeister
55e286ba55 read -n is bash-specific 2013-01-06 22:50:20 +01:00
Jeff Crouse
9450bfa26e fixed tests (used the --test option) so that they pass. go figure 2013-01-06 16:33:37 -05:00
Jeff Crouse
18be482a6f oops - didn't remove some reminders 2013-01-06 15:52:33 -05:00
Jeff Crouse
ca6710ee41 made changes recommended in pull request 2013-01-06 15:40:50 -05:00
Philipp Hagemeister
9314810243 fix ComedyCentral IE in Python3 2013-01-06 21:36:01 +01:00
Philipp Hagemeister
7717ae19fa Add tests for ComedyCentral IE 2013-01-06 21:35:20 +01:00
Philipp Hagemeister
32635ec685 Switch comedycentral IE to http downloads 2013-01-06 21:26:31 +01:00
Jeff Crouse
caec7618a1 re-fixed XNXX regex problem 2013-01-05 16:05:23 -05:00
Jeff Crouse
7e7ab2815c Merge branch 'master' of https://github.com/jefftimesten/youtube-dl 2013-01-05 16:01:03 -05:00
Jeff Crouse
d7744f2219 Merge branch 'master' of https://github.com/jefftimesten/youtube-dl 2013-01-05 16:00:50 -05:00
Jeff Crouse
7161829de5 Merge branch 'master' of https://github.com/jefftimesten/youtube-dl 2013-01-05 15:59:28 -05:00
Jeff Crouse
991ba7fae3 Added extractors for 3 porn sites 2013-01-05 15:59:01 -05:00
Jeff Crouse
a7539296ce Added extractors for 3 porn sites 2013-01-05 15:42:35 -05:00
Jeff Crouse
258d5850c9 Merge branch 'master' of https://github.com/rg3/youtube-dl
Conflicts:
	.gitignore
	LATEST_VERSION
	Makefile
	youtube-dl
	youtube-dl.exe
	youtube_dl/InfoExtractors.py
	youtube_dl/__init__.py
2013-01-05 15:03:54 -05:00
Philipp Hagemeister
20759b340a Disable travis irc notifications
travis is much too verbose for that, with random IEs constantly failing
2013-01-04 00:34:02 +01:00
Philipp Hagemeister
8e5f761870 Merge pull request #617 from jaimeMF/steamIE
[steamIE] Allow downloading videos with other characters in their titles
2013-01-03 15:16:27 -08:00
Jaime Marquínez Ferrándiz
26714799c9 steamIE remove the HTMLparser object 2013-01-03 23:56:02 +01:00
Jaime Marquínez Ferrándiz
5e9d042d8f steamIE follow @phihag suggestions 2013-01-03 23:51:48 +01:00
Jaime Marquínez Ferrándiz
9cf98a2bcc Allow downloading videos with other characters in their titles
Especially html entities
2013-01-03 21:17:35 +01:00
Philipp Hagemeister
f5ebb61495 Support page URL in RTMP downloads 2013-01-03 20:26:38 +01:00
Philipp Hagemeister
431d88dd31 Also generate SHA2-256 2013-01-03 19:49:06 +01:00
Philipp Hagemeister
876f1a86af Also publish hashsums 2013-01-03 19:18:55 +01:00
Philipp Hagemeister
01951dda7a Make ExtractorError usable for other causes 2013-01-03 15:39:55 +01:00
Filippo Valsorda
6e3dba168b release.sh edits based on 2013.01.02 experience 2013-01-02 23:40:24 +01:00
Filippo Valsorda
d851e895d5 release 2013.01.02 2013-01-02 22:21:45 +01:00
Filippo Valsorda
b962b76f43 re-worked release workflow, it is one-step and creates GPG signatures now 2013-01-02 21:52:27 +01:00
Philipp Hagemeister
26cf040827 Support youtube videos of google+ users 2013-01-02 19:12:44 +01:00
Philipp Hagemeister
8e241d1a1a Simplify DailyMotion IE 2013-01-01 21:22:30 +01:00
Philipp Hagemeister
3a648b209c Remove .part files before and after tests 2013-01-01 21:16:03 +01:00
Philipp Hagemeister
c80f0a417a Better name for InfoQ IE 2013-01-01 21:10:45 +01:00
Philipp Hagemeister
4fcca4bb18 Fix infoQ in Python3 2013-01-01 21:07:37 +01:00
Philipp Hagemeister
511eda8eda add test for infoq 2013-01-01 21:01:49 +01:00
Philipp Hagemeister
5f9551719c Simplify some IEs 2013-01-01 20:52:59 +01:00
Philipp Hagemeister
d830b7c297 _download_webpage helper function 2013-01-01 20:43:43 +01:00
Philipp Hagemeister
1c256f7047 ExtractorError for errors during extraction 2013-01-01 20:27:53 +01:00
Philipp Hagemeister
a34dd63beb Remove superfluous IE names 2013-01-01 19:40:48 +01:00
Philipp Hagemeister
4aeae91f86 Move gen_extractors to InfoExtractors 2013-01-01 19:37:07 +01:00
Philipp Hagemeister
c073e35b1e Simplify test parameter initialization 2013-01-01 19:34:54 +01:00
Philipp Hagemeister
5c892b0ba9 Adapt test_download to support playlists, and remove race conditions 2013-01-01 19:30:29 +01:00
Philipp Hagemeister
6985325e01 Revert "In tests.json file and md5 join in a 'files' list to handle multiple-file IEs"
This made the JSON structure really unreadable and was a quick fix.

This reverts commit 6535e9511f.
2013-01-01 19:07:06 +01:00
Philipp Hagemeister
911ee27e83 typo 2013-01-01 19:07:01 +01:00
Philipp Hagemeister
2069acc6a4 credit @jaimeMF 2013-01-01 18:29:43 +01:00
Jaime Marquínez Ferrándiz
278986ea0f ustreamIE 2013-01-01 18:14:20 +01:00
Filippo Valsorda
6535e9511f In tests.json file and md5 join in a 'files' list to handle multiple-file IEs 2013-01-01 16:07:26 +01:00
Filippo Valsorda
60c7520a51 Merge pull request #612 from jaimeMF/steamIE
SteamIE
2013-01-01 06:49:30 -08:00
Jaime Marquínez Ferrándiz
deb594a9a0 Test for steam 2013-01-01 15:41:55 +01:00
Jaime Marquínez Ferrándiz
e314ba675b SteamIE 2013-01-01 14:12:14 +01:00
Filippo Valsorda
0214ce7c75 Ok, the Escapist test was passing only in my Travis repo, do not ask me why; also, a small bugfix to the latest commit 2012-12-31 19:21:28 +01:00
Filippo Valsorda
95fedbf86b three small edits
* ask for a --verbose log when reporting bugs in README.md
* re-enable Escapist test, seems stable now
* check that we are not downloading multiple videos when the template is fixed (NOT a complete fix: not detecting playlists)
2012-12-31 19:12:57 +01:00
Filippo Valsorda
b7769a05ec added a serious Public Domain dedication, see http://unlicense.org/ 2012-12-31 15:32:26 +01:00
Filippo Valsorda
067f6a3536 moved docs and updates generation scripts from gh-pages branch to devscripts 2012-12-30 21:02:19 +01:00
Filippo Valsorda
8cad53e84c Removed a spurious increment_downloads, this time cleanly 2012-12-30 19:53:07 +01:00
Filippo Valsorda
d5ed35b664 moved updating code to update.py 2012-12-30 19:50:33 +01:00
Filippo Valsorda
f427df17ab some fixes, pulled the codename from the code 2012-12-30 19:50:33 +01:00
Filippo Valsorda
4e38899e97 print some version and environment info on --verbose (+ py3 fixes) 2012-12-30 19:50:33 +01:00
Filippo Valsorda
cb6ff87fbb The new updates system, relies on gh-pages, secured by RSA, uses external web servers 2012-12-30 19:50:33 +01:00
Philipp Hagemeister
0deac3a2d8 Revert "Removed a spurious increment_downloads"
This reverts commit 92e3e18a1d.
2012-12-29 16:56:52 +01:00
Filippo Valsorda
92e3e18a1d Removed a spurious increment_downloads 2012-12-29 16:49:49 +01:00
Philipp Hagemeister
3bb6165927 Allow ampersand right after ? in youtube URLs (Closes #602) 2012-12-27 05:31:36 +01:00
Philipp Hagemeister
d0d4f277da TweetReel IE 2012-12-27 01:38:41 +01:00
Filippo Valsorda
99b0a1292b add --no-post-overwrites to README.md; + minor style fixes 2012-12-26 20:39:33 +01:00
Philipp Hagemeister
dc23886a77 Merge pull request #601 from paullik/no-post-overwrites
No post-processing overwrites
2012-12-24 03:18:48 -08:00
Barbu Paul - Gheorghe
b7298b6e2a not relying on ffmpeg to do the post-processed file checking, instead doing it directly in youtube-dl 2012-12-24 12:53:28 +02:00
Barbu Paul - Gheorghe
3e6c3f52a9 apparently the -n option is available only in ffmpeg 2012-12-23 20:20:19 +02:00
Barbu Paul - Gheorghe
0c0074328b modified FFmpegExtractAudioPP to accept whether it should overwrite post-processed files or not 2012-12-23 19:51:41 +02:00
Barbu Paul - Gheorghe
f0648fc18c added the --no-post-overwrites argument 2012-12-23 19:36:48 +02:00
Philipp Hagemeister
a7c0f8602e Merge branch 'master' of github.com:rg3/youtube-dl 2012-12-20 21:28:32 +01:00
Philipp Hagemeister
21a9c6aaac FunnyOrDie IE (Fixes #599) 2012-12-20 21:28:27 +01:00
Filippo Valsorda
162e3c5261 Temporarily skip Escapist test as it fails only on Travis; we'll make a more specific workaround later if we can't fix it 2012-12-20 17:21:46 +01:00
Filippo Valsorda
6b3aef80ce better Vimeo tests; fixed a couple of VimeoIE fields 2012-12-20 16:30:55 +01:00
Filippo Valsorda
77c4beab8a new info_dict field: uploader_id 2012-12-20 16:28:16 +01:00
Filippo Valsorda
1a2c3c0f3e some py3 fixes, both needed and recommended; we should pass 2to3 as cleanly as possible now 2012-12-20 14:20:24 +01:00
Filippo Valsorda
0eaf520d77 add info_dict testing to test_download 2012-12-20 14:20:24 +01:00
Filippo Valsorda
056d857571 refactor YouTube subtitles code, it was ugly (my bad) 2012-12-20 14:20:24 +01:00
Philipp Hagemeister
69a3883199 Enable 3.3 in Travis (works; see https://travis-ci.org/phihag/youtube-dl/jobs/3757443 ) 2012-12-20 13:48:39 +01:00
Nick Daniels
0dcfb234ed Update Vimeo Info Extractor to pull in the description properly 2012-12-20 13:27:44 +01:00
Nick Daniels
43e8fafd49 Refactor IDParser to search for elements by any attribute not just ID 2012-12-20 13:27:38 +01:00
Philipp Hagemeister
314d506b96 Do not use deprecated method 2012-12-20 13:26:37 +01:00
Philipp Hagemeister
af42895612 Extend json info data / description file test 2012-12-20 13:26:21 +01:00
Philipp Hagemeister
bfa6389b74 Clean up legacy code 2012-12-20 13:25:54 +01:00
Philipp Hagemeister
9b14f51a3e Remove legacy code 2012-12-20 13:14:27 +01:00
Philipp Hagemeister
f4bfd65ff2 Correct JSON writing (Closes #596) 2012-12-20 13:13:24 +01:00
Philipp Hagemeister
3cc687d486 test write_info_json 2012-12-20 13:11:52 +01:00
Nick Daniels
cdb3076445 Sublime space formatting 2012-12-19 14:19:08 +00:00
Nick Daniels
8a2f13c304 Ignore DS_Store files in Git 2012-12-19 14:17:21 +00:00
Philipp Hagemeister
77bd7968ea Switch test to metacafe.com, whose DNS seems to be fine atm 2012-12-17 20:32:05 +01:00
Philipp Hagemeister
993693aa79 Merge remote-tracking branch 'origin/master' 2012-12-17 20:21:41 +01:00
Philipp Hagemeister
ce4be3a91d Remove some antipatterns and ensure that we always write the JSON file with UTF-8 2012-12-17 19:48:10 +01:00
Filippo Valsorda
937021133f a number of new tests and fixes; all tests green on 3.3 2012-12-17 18:33:11 +01:00
Filippo Valsorda
f7b111b7d1 Google Video has been shut down as of 11/15/2012. All videos on Google Video will be migrated to YouTube by the end of 2012. 2012-12-17 16:33:49 +01:00
Filippo Valsorda
80d3177e5c various py3 fixes; all tests green on 3.3 2012-12-17 16:25:03 +01:00
Filippo Valsorda
5e5ddcfbcf test subtitles 2012-12-17 16:23:55 +01:00
Philipp Hagemeister
5910e210f4 Fix --extract-audio on Python 3 2012-12-16 12:29:03 +01:00
Philipp Hagemeister
b375c8b946 Tests for justin.tv 2012-12-16 11:17:10 +01:00
Philipp Hagemeister
88f6c78b02 Credit vasi for justin.tv 2012-12-16 11:16:57 +01:00
Dave Vasilevsky
4096b60948 Misc justin.tv fixes 2012-12-16 04:45:46 -05:00
Dave Vasilevsky
2ab1c5ed1a Support more than 100 videos for justin.tv 2012-12-16 04:26:22 -05:00
Dave Vasilevsky
0b40544f29 Preliminary support for twitch.tv and justin.tv 2012-12-16 03:50:41 -05:00
Jeff Crouse
187da2c093 added YouJizz extractor 2012-12-16 00:26:27 -05:00
Jeff Crouse
9a2cf56d51 Fixed a problem with the XNXXIE Regex 2012-12-15 23:22:07 -05:00
Philipp Hagemeister
0be41ec241 Do not decode None 2012-12-15 23:55:13 +01:00
Philipp Hagemeister
f1171f7c2d Fix VimeoIE in Python 3 2012-12-15 18:25:00 +01:00
Philipp Hagemeister
28ca6b5afa Fix Dailymotion in Python 3 2012-12-15 18:23:17 +01:00
Philipp Hagemeister
bec102a843 Fix XNXX in Python 3 2012-12-15 18:19:25 +01:00
Philipp Hagemeister
8f6f40d991 More Youku Python 3 fixing 2012-12-15 17:59:09 +01:00
Philipp Hagemeister
e2a8ff24a9 Fix YoukuIE in Python3 (and in general) 2012-12-15 17:57:13 +01:00
Philipp Hagemeister
8588a86f9e Fix xvideo IE in Python 3 2012-12-15 17:50:45 +01:00
Philipp Hagemeister
5cb9c3129b restrict sys.argv craziness to Python 2 (Fixes #591) 2012-12-15 17:44:48 +01:00
Philipp Hagemeister
4cc3d07426 NBA IE (Closes #590) 2012-12-13 21:27:57 +01:00
Philipp Hagemeister
5d01a64719 Revert "Don't be too clever"
This reverts commit a276e06080.
2012-12-12 15:14:58 +01:00
Philipp Hagemeister
a276e06080 Don't be too clever 2012-12-12 15:00:03 +01:00
Filippo Valsorda
fd5ff02042 streamlined and simplified dynamic tests generation; readded a couple of test features 2012-12-12 14:15:21 +01:00
Filippo Valsorda
2b5b2cb84c Merge remote-tracking branch 'gcmalloc/master' into fork_master 2012-12-12 14:11:40 +01:00
nto
ca6849e65d Add support for comedycentral clips (closes #233)
Support individual clips, not just full episodes.
break up now monstrous _VALID_URL regex over multiple lines to improve readability,
pass re.VERBOSE flag when using regex to ignore the whitespace
2012-12-11 21:38:16 -06:00
gcmalloc
1535ac2ae9 test automation 2012-12-12 04:03:35 +01:00
gcmalloc
a4680a590f changing the template file extension 2012-12-11 20:49:54 +01:00
Filippo Valsorda
fedb6816cd rollback tests multiprocess, Travis and OSX don't support it 2012-12-11 20:07:35 +01:00
gcmalloc
f6152b4b64 changing the template file extension 2012-12-11 19:17:02 +01:00
Philipp Hagemeister
4b618047ce Speed up testing (<10s instead of 25s) 2012-12-11 18:52:50 +01:00
Philipp Hagemeister
2c6945be30 Fix TestYoutubeLists.test_youtube_user 2012-12-11 18:07:38 +01:00
Philipp Hagemeister
9a6f4429a0 Fix test selection in Python 2.6 2012-12-11 18:03:22 +01:00
Philipp Hagemeister
4c21c56bfe Merge branch 'master' of github.com:rg3/youtube-dl 2012-12-11 17:07:13 +01:00
Filippo Valsorda
2a298b72eb Release 2012.12.11 2012-12-11 17:00:13 +01:00
Philipp Hagemeister
55c0539872 Fix blip.tv in python3 2012-12-11 17:00:11 +01:00
Filippo Valsorda
9789a05c20 fix playlist pagination and add YT playlist tests (closes #569) 2012-12-11 16:58:36 +01:00
Philipp Hagemeister
d050de77f9 Merge pull request #580 from FiloSottile/master
The new shiny build system
2012-12-11 07:52:44 -08:00
Filippo Valsorda
95eb771dcd Merge branch 'master' into fork_master
Conflicts:
	.travis.yml
2012-12-11 12:15:16 +01:00
Filippo Valsorda
4fb1acc212 use the new --test option to speed up tests (fetch only first 10K)
now all tests working and passing
2012-12-11 12:12:02 +01:00
Filippo Valsorda
d3d3199870 gentests: allow test-specific FileDownloader params override from tests.json 2012-12-11 12:09:22 +01:00
Filippo Valsorda
1ca63e3ae3 the test didn't load our Gzip opener
this was blocking the Vimeo test

+ some more gentest fixes
2012-12-11 11:33:15 +01:00
Filippo Valsorda
59ce201915 print traceback on trouble if --verbose (why didn't I think of this before!?) 2012-12-11 11:02:21 +01:00
Filippo Valsorda
8d5d3a5d00 exposing the test mode as --test (hidden and undocumented) 2012-12-11 09:57:40 +01:00
Filippo Valsorda
37c8fd4842 added a test mode to FileDownloader that fetches only first 10K 2012-12-11 09:49:27 +01:00
Filippo Valsorda
3c6ffbaedb Merge 'rg3/master' into fork_master 2012-12-08 01:57:43 +01:00
Filippo Valsorda
c7287a3caf ATTENTION DO NOT USE THESE: new binaries in the Downloads section
placed fake binaries that update themselves where old versions updating will search for the new version
2012-12-08 01:52:39 +01:00
Filippo Valsorda
5a304a7637 new updating scheme, based on GH downloads; also, check if not updateable (pip installed) 2012-12-08 00:48:07 +01:00
Filippo Valsorda
4c1d273e88 it's curious but bash-completion is with - and not _ 2012-12-08 00:37:26 +01:00
gcmalloc
a9d2f7e894 making the script compatible with python3 2012-12-07 22:01:02 +01:00
gcmalloc
682407f2d5 little correction on the readme 2012-12-07 21:40:06 +01:00
gcmalloc
bdff345529 adding a proper bash-completion generation 2012-12-07 21:38:45 +01:00
Filippo Valsorda
23109d6a9c youtube-dl.tar.gz make target 2012-12-07 14:46:14 +01:00
Filippo Valsorda
4bb028f48e devscripts/make_readme.py in place of all that sedding, that has porting problems 2012-12-07 14:45:16 +01:00
Filippo Valsorda
fec89790b1 and now, also py2exe compiles fine :) (on Windows) 2012-12-07 12:04:52 +01:00
Filippo Valsorda
a5741a3f5e pip installs fine! 2012-12-07 11:39:08 +01:00
Philipp Hagemeister
863baa16ec SoundCloud IDs have changed, fix tests 2012-12-07 01:34:40 +01:00
Philipp Hagemeister
c7214f9a6f Use Soundcloud API (Closes #579) 2012-12-07 01:30:03 +01:00
Philipp Hagemeister
8fd3afd56c More work on soundcloud IE 2012-12-07 01:24:51 +01:00
Philipp Hagemeister
f9b2f2b955 Correct accidental rename 2012-12-07 00:57:06 +01:00
Philipp Hagemeister
633b4a5ff6 Mark SoundCloud IE as nonfunctional for now (#579) 2012-12-07 00:50:56 +01:00
Philipp Hagemeister
b4cd069d5e Better error reporting for SoundCloud IE 2012-12-07 00:40:13 +01:00
Philipp Hagemeister
0f8d03f81c Let YoutubeDLHandler (transparent gzip) handle HTTPS URLs as well (Needed for #579) 2012-12-07 00:39:44 +01:00
Philipp Hagemeister
077174f4ed Add an example to the -o documentation (#573) 2012-12-04 11:05:38 +01:00
Philipp Hagemeister
e387eb5aba Let youtube IE handle IDs starting with PL (Closes #572) 2012-12-04 10:59:38 +01:00
Philipp Hagemeister
4083bf81a0 Correct metacafe test filename (happens to start with an underscore) 2012-12-03 20:17:47 +01:00
Philipp Hagemeister
796173d08b Keep video IDs verbatim if possible (Closes #571) 2012-12-03 15:36:41 +01:00
Philipp Hagemeister
e575b6821e Improve execution tests 2012-12-01 15:52:34 +01:00
Philipp Hagemeister
d78be7e331 Add test for Youku (Mentioned in #314) 2012-11-30 08:42:11 +01:00
Philipp Hagemeister
15c8d83358 Fix Soundcloud IE (+ Python3 support) 2012-11-29 20:40:12 +01:00
Philipp Hagemeister
e91d2338d8 Fix MD5 calculation 2012-11-29 20:38:16 +01:00
Philipp Hagemeister
4b235346d6 Add irc channel notice 2012-11-29 19:45:07 +01:00
Philipp Hagemeister
ad348291bb Enable travis notifications 2012-11-29 19:41:09 +01:00
Filippo Valsorda
2f1765c4ea setup.py Python3 fix, PyPi classifiers 2012-11-29 19:21:19 +01:00
Philipp Hagemeister
3c5b63d2d6 Merge branch 'master' of github.com:rg3/youtube-dl 2012-11-29 18:14:43 +01:00
Filippo Valsorda
cc51a7d4e0 New repo skeleton, getting ready for PyPi 2012-11-29 16:51:55 +01:00
Philipp Hagemeister
8af4ed7b4f Fix 2.6 nosetests 2012-11-29 16:35:57 +01:00
Filippo Valsorda
8192ebe1f8 Merge remote-tracking branch 'origin/master' into fork_master
New tests - merged with md5 correction
2012-11-29 15:38:07 +01:00
Filippo Valsorda
20ba04267c removed __main__.py from the root of the repo 2012-11-29 15:20:20 +01:00
Philipp Hagemeister
743b28ce11 Allow youtube_dl/__main__.py to be called directly 2012-11-29 15:11:24 +01:00
gcmalloc
caaa47d372 adding the script hook 2012-11-29 14:12:06 +01:00
gcmalloc
10f100ac8a cleaning binaries 2012-11-28 19:38:37 +01:00
Philipp Hagemeister
8176041605 Check during test runtime instead of test generation for _WORKING, and add 2.6 compat 2012-11-28 19:03:11 +01:00
gcmalloc
87bec4c715 getting version from git or failing 2012-11-28 18:49:56 +01:00
gcmalloc
190e8e27d8 removing the zip option, this can be done with python setup.py bdist --format=zip 2012-11-28 18:33:58 +01:00
gcmalloc
4efe62a016 moving to setup.py 2012-11-28 18:24:16 +01:00
gcmalloc
c64de2c980 correction on the test 2012-11-28 18:21:53 +01:00
Philipp Hagemeister
6ad98fb3fd Correct exception raising 2012-11-28 18:21:06 +01:00
Philipp Hagemeister
b08e09c370 Mark broken IEs in --list-extractors 2012-11-28 17:58:55 +01:00
Philipp Hagemeister
cdab8aa389 Update download tests 2012-11-28 15:09:56 +01:00
Philipp Hagemeister
3cd69a54b2 Merge branch 'master' of github.com:rg3/youtube-dl 2012-11-28 12:59:55 +01:00
Philipp Hagemeister
627dcfff39 Restrict more characters (Closes #566) 2012-11-28 12:59:27 +01:00
Filippo Valsorda
df5cff3751 make tests skip on not _WORKING 2012-11-28 11:54:20 +01:00
Filippo Valsorda
79ae0a06d5 comment out 3.3 testing until Travis implements it 2012-11-28 11:46:56 +01:00
gcmalloc
2d2fa229ec making the metacafe test pass 2012-11-28 11:46:03 +01:00
Filippo Valsorda
5a59fd6392 new .travis.yml with notifications and 3.3 2012-11-28 11:46:03 +01:00
Filippo Valsorda
0eb0faa26f Mark CollegeHumorIE not working until phihag finishes 2012-11-28 11:43:35 +01:00
Filippo Valsorda
32761d863c fix YouTubeIE on 2.6, sorry 2012-11-28 11:28:59 +01:00
Philipp Hagemeister
799c076384 collegehumor: able to download a single f4f file (not yet playable) 2012-11-28 04:51:27 +01:00
Philipp Hagemeister
f1cb5bcad2 Make __main__ work in all scenarios with relative imports 2012-11-28 03:55:35 +01:00
Philipp Hagemeister
9e8056d5a7 Use relative imports 2012-11-28 03:34:40 +01:00
Philipp Hagemeister
c6f3620859 Drop 2.5 support 2012-11-28 03:30:35 +01:00
Philipp Hagemeister
59ae15a507 Convert all tabs to 4 spaces (PEP8) 2012-11-28 02:04:46 +01:00
Philipp Hagemeister
40b35b4aa6 hack for apparently broken parse_qs in python2 2012-11-28 02:01:09 +01:00
Philipp Hagemeister
be0f77d075 test import 2012-11-28 02:00:45 +01:00
Philipp Hagemeister
0f00efed4c Woooohooo! python3 youtube_dl BaW_jenozKc -t works! 2012-11-28 00:56:20 +01:00
Philipp Hagemeister
e6137fd61d Remove superfluous encodings 2012-11-28 00:53:09 +01:00
Philipp Hagemeister
8cd10ac4ef Fix printing title etc. 2012-11-28 00:46:21 +01:00
Philipp Hagemeister
64a57846d3 correct to_stderr 2012-11-28 00:33:38 +01:00
Philipp Hagemeister
72f976701a youtube IE: Correct bytes vs str 2012-11-28 00:31:59 +01:00
Philipp Hagemeister
5bd9cc7a6a typo 2012-11-28 00:22:55 +01:00
Philipp Hagemeister
f660c89d51 Use list comprehension instead of map 2012-11-28 00:19:24 +01:00
Philipp Hagemeister
73dce4b2e4 Import from the correct module 2012-11-28 00:17:59 +01:00
Philipp Hagemeister
9f37a95941 Py2/3 parse_qs compatibility 2012-11-28 00:17:12 +01:00
Philipp Hagemeister
a130bc6d02 One more except..as 2012-11-28 00:13:40 +01:00
Philipp Hagemeister
348d0a7a18 Py2/3 compatibility for http.client 2012-11-28 00:13:00 +01:00
Philipp Hagemeister
03f9daab34 Use io.BytesIO instead of StringIO 2012-11-28 00:09:17 +01:00
Philipp Hagemeister
a8156c1d2e Python 3 version of HTMLParser 2012-11-28 00:06:28 +01:00
Philipp Hagemeister
3e669f369f Py3 compat for unichr and htmlentitydefs 2012-11-28 00:02:55 +01:00
Philipp Hagemeister
da779b4924 Fall back to urllib instead of urllib2 for Python 3 urllib.parse 2012-11-27 23:58:47 +01:00
Philipp Hagemeister
89fb51dd2d Remove ur references for Python 3.3 support 2012-11-27 23:56:10 +01:00
Philipp Hagemeister
01ba00ca42 Prepare urllib references for 2/3 compatibility 2012-11-27 23:54:09 +01:00
Philipp Hagemeister
e08bee320e Use except .. as everywhere (#180) 2012-11-27 23:31:55 +01:00
Philipp Hagemeister
96731798db Rename util.u to util.compat_str 2012-11-27 23:29:18 +01:00
Philipp Hagemeister
c116339ddb Merge branch 'master' of github.com:rg3/youtube-dl 2012-11-27 23:23:37 +01:00
Filippo Valsorda
e643e2c6b7 Merge pull request #563 from FiloSottile/IE_cleanup
General IE docs and return dicts cleanup
2012-11-27 14:22:40 -08:00
Filippo Valsorda
c63cc10ffa Merge remote-tracking branch 'origin/master' into IE_cleanup
Conflicts:
	youtube_dl/FileDownloader.py
2012-11-27 23:20:32 +01:00
Philipp Hagemeister
dae7c920f6 Make test_utils.py run on Python 3 2012-11-27 23:20:29 +01:00
Filippo Valsorda
f462df021a Use None on missing required info_dict fields 2012-11-27 23:15:33 +01:00
Philipp Hagemeister
1a84d8675b Use u instead of str in Python 2 2012-11-27 23:11:44 +01:00
Philipp Hagemeister
18ea0cefc3 Merge pull request #560 from phihag/fix-to_screen-mode
to_screen: Only encode when output stream is binary
2012-11-27 13:10:57 -08:00
Philipp Hagemeister
c806f804d8 Only encode when output stream is binary 2012-11-27 21:07:25 +01:00
Filippo Valsorda
03c5b0fbd4 IE._WORKING attribute in order to warn the users and skip the tests on broken IEs 2012-11-27 19:30:09 +01:00
Philipp Hagemeister
95649b3936 Replace long with int (see PEP 237) 2012-11-27 19:05:03 +01:00
Philipp Hagemeister
3aeb78ea4e Better formatting (PEP 8) 2012-11-27 19:03:37 +01:00
Philipp Hagemeister
dd109dee8e Remove mentions of unicode 2012-11-27 19:02:37 +01:00
Philipp Hagemeister
b514df2034 Clean up with the help of pep8 2012-11-27 18:55:35 +01:00
Philipp Hagemeister
0969bdd305 unify spacing 2012-11-27 18:49:18 +01:00
Philipp Hagemeister
1a9c655e3b Merge remote-tracking branch 'Asido/master' 2012-11-27 18:48:43 +01:00
Philipp Hagemeister
88db5ef279 2012.11.29 2012-11-27 18:36:43 +01:00
Philipp Hagemeister
f8d8b39bba Prepare 2012.11.29 release 2012-11-27 18:30:34 +01:00
Philipp Hagemeister
dcd60025f8 Fix filename sanitation (Closes #555) 2012-11-27 18:27:46 +01:00
Filippo Valsorda
7e4674830e document info_dict['subtitles'] and info_dict['urlhandle'] 2012-11-27 18:08:07 +01:00
Filippo Valsorda
9ce5d9ee75 make all IEs return 'upload_date' and 'uploader', even if only u'NA' 2012-11-27 17:57:12 +01:00
Filippo Valsorda
b49e75ff9a info_dict['upload_date'] is documented in --output, IEs MUST specify it 2012-11-27 17:38:22 +01:00
Filippo Valsorda
abe7a3ac2a info_dict['player_url'] is used only for rtmpdump, indicate it as optional in the info_dict 2012-11-27 17:32:25 +01:00
Filippo Valsorda
717b1f72ed default info_dict['format'] to info_dict['ext'] and make the YT one more verbose 2012-11-27 17:20:25 +01:00
Philipp Hagemeister
26396311b5 Add Christian Albrecht (Arte.tv IE) to authors 2012-11-27 17:16:49 +01:00
Philipp Hagemeister
dffe658bac Remove exclamation mark in --restrict-filenames mode 2012-11-27 17:15:33 +01:00
Philipp Hagemeister
33d94a6c99 Merge remote-tracking branch 'alab1001101/master' 2012-11-27 17:14:29 +01:00
Philipp Hagemeister
4d47921c9e ignore kate swap files 2012-11-27 17:01:12 +01:00
Philipp Hagemeister
d94adc2638 Actually fix manpage (#473) 2012-11-27 16:58:50 +01:00
Philipp Hagemeister
5c5d06d31d Merge pull request #473 from grimreaper/master
fix mdoc nits
2012-11-27 07:52:58 -08:00
Philipp Hagemeister
cc872b68a8 Actually merge #379 2012-11-27 16:42:50 +01:00
Philipp Hagemeister
17cb14a336 Merge remote-tracking branch 'joelverhagen/master' 2012-11-27 16:41:16 +01:00
Philipp Hagemeister
877f4c45d3 Fix output format doc 2012-11-27 16:28:29 +01:00
Philipp Hagemeister
02531431f2 Extended documentation for output format in README (Closes #268) 2012-11-27 16:27:35 +01:00
Philipp Hagemeister
e02066e7ff Windows build for 2012.11.28 2012-11-27 16:15:15 +01:00
Philipp Hagemeister
c9128b353d Bump version number to a numeric-only one to appease py2exe 2012-11-27 16:12:08 +01:00
Philipp Hagemeister
e7c6f1a2dc Bump version number 2012-11-27 16:08:39 +01:00
Philipp Hagemeister
1a911e60a4 Add test for asian characters (#551) 2012-11-27 16:07:52 +01:00
Philipp Hagemeister
46cbda0be4 Minor filename encoding improvement in a common case 2012-11-27 15:07:10 +01:00
Philipp Hagemeister
fa59f4b6a9 Merge remote-tracking branch 'chrisjrn/master' 2012-11-27 14:55:18 +01:00
Christopher Neugebauer
4a702f3819 Fixes the InfoExtractor for the Colbert Report. 2012-11-27 23:54:43 +11:00
Philipp Hagemeister
6bac102a4d Fix spacing in comedycentral IE 2012-11-27 13:24:10 +01:00
Philipp Hagemeister
958a22b7cf Merge remote-tracking branch 'chrisjrn/master' 2012-11-27 13:19:18 +01:00
Philipp Hagemeister
97cd3afc75 warn if %(stitle)s is being used 2012-11-27 13:11:06 +01:00
Philipp Hagemeister
aa2a94ed81 Encode the entire filename 2012-11-27 13:01:32 +01:00
Philipp Hagemeister
c7032546f1 Clean up test 2012-11-27 12:46:27 +01:00
Philipp Hagemeister
56781d3d2e Switch back to underline for invalid characters, and make restricted ASCII-only 2012-11-27 12:46:09 +01:00
Christopher Neugebauer
feb22fe5fe Fixed indentation error 2012-11-27 22:32:24 +11:00
Christopher Neugebauer
d8dddb7c02 Removes extraneous debugging info :) 2012-11-27 22:30:07 +11:00
Christopher Neugebauer
4408d996fb Adds format listing/selection support to the Comedy Central extractor. 2012-11-27 22:28:16 +11:00
Philipp Hagemeister
ed7516c69d Merge remote-tracking branch 'chrisjrn/master' 2012-11-27 12:25:51 +01:00
Christopher Neugebauer
89af8e9d32 Removes extraneous debug message. 2012-11-27 21:51:30 +11:00
Christopher Neugebauer
36a9c0b5ff Points the ComedyCentral extractor at a CDN which works with more RTMPDump versions. 2012-11-27 21:49:27 +11:00
Philipp Hagemeister
9fb3bfb45a Merge remote-tracking branch 'gcmalloc/master' 2012-11-27 00:42:47 +01:00
Filippo Valsorda
d479e34043 release 2012.11.27 2012-11-27 00:22:39 +01:00
Philipp Hagemeister
240089e5df remove accidental remnants 2012-11-27 00:14:12 +01:00
Philipp Hagemeister
1c469a9480 New option --restrict-filenames 2012-11-26 23:58:46 +01:00
Philipp Hagemeister
71f36332dd Remove redundancy in instructions 2012-11-26 23:40:51 +01:00
Philipp Hagemeister
8179d2ba74 Merge branch 'master' of github.com:rg3/youtube-dl 2012-11-26 23:25:04 +01:00
Philipp Hagemeister
df4bad3245 Document configuration 2012-11-26 23:24:55 +01:00
Filippo Valsorda
a7b5c8d6a8 fix FAQ on how to compile (also, strange fix in the Makefile) 2012-11-26 22:35:12 +01:00
Philipp Hagemeister
92b91c1878 Use character instead of byte strings 2012-11-26 04:23:20 +01:00
Philipp Hagemeister
7ec1a206ea Remove longs (int does the right thing since Python 2.2, see PEP 237) 2012-11-26 04:13:43 +01:00
Philipp Hagemeister
51937c0869 Add some parentheses around print for #180 2012-11-26 04:05:54 +01:00
Philipp Hagemeister
6b50761222 Merge pull request #538 from zejn/patch-1
Also enable album URLs on Vimeo.
2012-11-25 18:04:11 -08:00
Philipp Hagemeister
6571408dc6 Merge pull request #545 from FiloSottile/alias
Kill (alias) --literal and %(title)
2012-11-25 15:57:57 -08:00
Filippo Valsorda
b6fab35b9f alias %(title)s to %(stitle)s 2012-11-25 20:39:42 +01:00
Filippo Valsorda
baec15387c aliased --literal to --title 2012-11-25 20:28:49 +01:00
zejn
297d7fd9c0 Also enable album URLs on Vimeo. 2012-11-21 13:24:14 +01:00
Filippo Valsorda
5002aea371 release 2012.11.17 2012-11-17 14:02:31 +01:00
Jeff Crouse
5f7ad21633 Strip HTML out of uploader name 2012-11-13 17:48:30 -05:00
Jeff Crouse
089d47f8d5 Removed the README.md build target in the makefile. It is broken... 2012-11-13 17:48:10 -05:00
Filippo Valsorda
74033a662d Reworked Vimeo file selection logic (quality, codec) - closes #530 2012-11-13 21:53:18 +01:00
Jeff Crouse
fdef722fa1 Added YouPorn infoExtractor 2012-11-13 13:10:56 -05:00
Jeff Crouse
110d4f4c91 Added Pornotube support (for Laborers of Love) 2012-11-12 16:17:55 -05:00
Filippo Valsorda
0526e4f55a Merge pull request #522 from art-zhitnik/master
--(match|reject)-title utf8 fix
2012-11-11 06:22:10 -08:00
Art Zhitnik
39973a0236 Solve the bug of parsing titles with unicode (cyrillic) 2012-11-11 14:09:12 +10:00
Filippo Valsorda
5d40a470a2 quiet the HTMLParser debug info - closes #517 2012-11-09 12:32:07 +01:00
Filippo Valsorda
4cc391461a fix DailyMotion official users videos - closes #281 - by @yvestan 2012-11-07 14:44:10 +01:00
Filippo Valsorda
bf95333e5e fixed MetacafeIE (uploader nickname regex) - closes #515 2012-11-06 23:08:10 +01:00
Philipp Hagemeister
b7a34316d2 -x for --extract-audio, one of the most popular options 2012-10-30 17:41:38 +01:00
Philipp Hagemeister
74e453bdea New --id option for the old default filename pattern 2012-10-30 17:37:53 +01:00
Philipp Hagemeister
156a59e7a9 Additional tests in file name sanitation 2012-10-29 08:19:54 +01:00
Philipp Hagemeister
aeca861f22 Merge pull request #502 from FiloSottile/new_sanitize_filename
My sanitize_filename proposal
2012-10-28 15:33:59 -07:00
Filippo Valsorda
42cb53fcfa modified filename escaping to a "smarter" one 2012-10-28 22:47:02 +01:00
Filippo Valsorda
fe4d68e196 slight change to Dailymotion uploader regex (fix) 2012-10-28 21:43:43 +01:00
Philipp Hagemeister
25b7fd9c01 Merge pull request #491 from tyll/master
Update install target
2012-10-26 01:10:25 -07:00
Till Maas
e79e8b7dc4 Update install target
- Allow to configure destination directories to fulfill the needs of
  different distributions
- Support DESTDIR variable for staging installation when packaging
- Do not set user/group to root. It requires 'make install' to run as
  root, but then this is the default behaviour anyways.
2012-10-25 21:19:13 +02:00
Filippo Valsorda
965a8b2bc4 Merge pull request #488 from Tailszefox/local
Fix audio bitrate quality for ffmpeg/avconv (closes #487)
2012-10-24 11:42:31 -07:00
gcmalloc
a8ac2f8664 adding second vimeo url 2012-10-24 15:57:19 +02:00
gcmalloc
fb0e99b884 skipping vimeo for the moment 2012-10-24 00:32:23 +02:00
gcmalloc
9c6e9a4532 adding xnxx test 2012-10-24 00:13:16 +02:00
gcmalloc
67af74992e adding collegehumor test 2012-10-24 00:05:45 +02:00
gcmalloc
103c508ffa adding stanford open class courses 2012-10-23 23:59:12 +02:00
gcmalloc
2876773381 adding test for vimeo, xvideo and soundcloud 2012-10-23 23:53:33 +02:00
Tailszefox
f06eaa873e Fix audio bitrate quality for ffmpeg/avconv 2012-10-23 16:37:12 +02:00
Philipp Hagemeister
ece34e8951 Merge pull request #486 from Tailszefox/local
Added duration for YouTube videos
2012-10-23 05:53:28 -07:00
Tailszefox
2262a32dd7 Added duration for YouTube videos 2012-10-22 18:32:42 +02:00
Philipp Hagemeister
c6c0e23a32 Support raw playlist parameters (Closes #482) 2012-10-22 13:01:36 +02:00
Philipp Hagemeister
02b324a23d Restore 2.5 compat by activating with_statement future 2012-10-22 12:51:20 +02:00
Filippo Valsorda
b8005afc20 handle YT urls with #/ redirects (closes #484) 2012-10-22 09:15:27 +02:00
Philipp Hagemeister
073522bc6c Don't use 2.7+ check_output 2012-10-19 23:28:37 +02:00
Philipp Hagemeister
9248cb0549 Merge pull request #472 from gcmalloc/master
Test proposal
2012-10-19 05:48:12 -07:00
gcmalloc
6b41b61119 correcting travis 2012-10-19 12:53:20 +02:00
gcmalloc
591bbe9c90 changing test from md5 to filesize, the file changed between download 2012-10-19 12:53:20 +02:00
gcmalloc
fc7376016c cleaning the test that doesn't work with the api for the moment 2012-10-19 12:53:20 +02:00
gcmalloc
97a37c2319 some assertion on the file downloaded 2012-10-19 12:53:20 +02:00
gcmalloc
3afed78a6a removing testing video 2012-10-19 12:53:20 +02:00
gcmalloc
4279a0ca98 correcting test to be compatible with python2.6 2012-10-19 12:53:20 +02:00
gcmalloc
edcc7d2dd3 StringIO used by nosetests does not merge with the way youtube-dl handles sys.stdout and sys.stderr 2012-10-19 12:53:19 +02:00
gcmalloc
7f60b5aa40 correction on the test 2012-10-19 12:53:19 +02:00
Eitan Adler
65adb79fb6 Fix mandoc nits 2012-10-15 21:45:56 -04:00
gcmalloc
aeeb29a356 adding travis support 2012-10-15 10:58:35 +02:00
Filippo Valsorda
902b2a0a45 New IE: YouTube channels (closes #396) 2012-10-14 13:48:18 +02:00
gcmalloc
6d9c22cd26 correcting the makefile according to the new one 2012-10-12 20:30:01 +02:00
gcmalloc
729baf58b2 removing extended globbing for the find utility 2012-10-12 20:25:22 +02:00
gcmalloc
4c9afeca34 adding xvideo 2012-10-12 20:25:22 +02:00
gcmalloc
6da7877bf5 adding facebook test 2012-10-12 20:25:22 +02:00
gcmalloc
b4e5de51ec adding photobucket test 2012-10-12 20:25:22 +02:00
gcmalloc
a4b5f22554 adding metacafe test 2012-10-12 20:25:22 +02:00
gcmalloc
ff08984246 adding dailymotion test 2012-10-12 20:25:22 +02:00
gcmalloc
137c5803c3 some changes to keep the same standard 2012-10-12 20:25:22 +02:00
gcmalloc
3eec021a1f removing unused global modifier 2012-10-12 20:25:22 +02:00
gcmalloc
5a33b73309 correcting the makefile 2012-10-12 20:25:22 +02:00
gcmalloc
0b4e98490b changing test video 2012-10-12 20:24:58 +02:00
gcmalloc
80a846e119 correction on the test for the utils.py 2012-10-12 20:24:58 +02:00
gcmalloc
434d60cd95 adding clean rule in the makefile 2012-10-12 20:24:58 +02:00
gcmalloc
efe8902f0b adding download test with md5 check 2012-10-12 20:24:58 +02:00
gcmalloc
44fb345437 adding TestCase class and corresponding test 2012-10-12 20:24:58 +02:00
gcmalloc
9993976ae4 correction on the sanitize title method, change in title resulting 2012-10-12 20:24:58 +02:00
gcmalloc
b387fb0385 adding test rule in the Makefile 2012-10-12 20:24:58 +02:00
Filippo Valsorda
10daa766a1 support EDU YouTube playlists (closes #407) 2012-10-11 08:27:19 +02:00
Filippo Valsorda
7b107eea51 release 2012.10.09 2012-10-09 15:53:20 +02:00
Filippo Valsorda
646b885cbf Added missing dependencies to Makefile 2012-10-09 15:49:24 +02:00
Filippo Valsorda
0bfd0b598a Re-engineered Dailymotion qualities selection (thanks @knagano, sort of merges #176) 2012-10-09 12:28:44 +02:00
Filippo Valsorda
fd873c69a4 Merge PR #422 from 'kevinamadeus/master'
Add InfoExtractor for Google Plus video
(with fixes)
2012-10-09 10:48:49 +02:00
Filippo Valsorda
d64db7409b Merge pull request #458 from grimreaper/patch-1
There is nothing bash specific in release.sh, switch to /bin/sh
2012-10-09 01:16:40 -07:00
Philipp Hagemeister
27fec0e3bd Merge branch 'master' of github.com:rg3/youtube-dl 2012-10-08 22:14:28 +02:00
Philipp Hagemeister
65f934dc93 Correct detect_executables on Windows (Closes #447, #457) 2012-10-08 22:14:19 +02:00
grimreaper
d51d784f85 There is nothing bash specific here
/bin/bash is always wrong. Since there is nothing bash specific here, switch to /bin/sh
2012-10-06 10:00:40 -03:00
Filippo Valsorda
aa85963987 Merge pull request #452 from Tailszefox/local
Added uploaded date for Dailymotion
2012-10-03 11:29:51 -07:00
Tailszefox
413575f7a5 Added uploaded date for Dailymotion 2012-10-03 10:57:46 +02:00
Philipp Hagemeister
b7b4796bf2 Fix docs 2012-10-01 18:39:24 +02:00
Philipp Hagemeister
fcbc8c830e Merge branch 'master' of github.com:rg3/youtube-dl 2012-10-01 18:38:19 +02:00
Philipp Hagemeister
f48ce130c7 Fix doc of extractor field 2012-10-01 18:38:10 +02:00
Filippo Valsorda
13e69f546c Merged, modified and compiled Dailymotion pull request #446 by @Steap 2012-09-30 21:45:43 +02:00
Cyril Roelandt
63ec7b7479 DailymotionIE: There is not necessarily an underscore in a Dailymotion URL. 2012-09-30 15:47:37 +02:00
Cyril Roelandt
7b6d7001d8 DailymotionIE: some videos do not use the "hqURL", "sdURL", "ldURL" keywords. In this case, the "video_url" keyword should be looked for. 2012-09-30 15:47:29 +02:00
Filippo Valsorda
39ce6e79e7 Updated youtube-dl.exe 2012-09-29 19:12:56 +02:00
Filippo Valsorda
5c961d89df Merge pull request #403 from FiloSottile/re_VERBOSE 2012-09-29 17:05:40 +02:00
Filippo Valsorda
3c4d6c9eba Not all Dailymotion videos have an hqURL, now downloads highest quality available 2012-09-29 16:53:06 +02:00
Filippo Valsorda
349e2e3e21 Fixed DailymotionIE, now downloads high-def mp4s, which might be too much (?) 2012-09-29 16:38:38 +02:00
Filippo Valsorda
551fa9dfbf adding new --output replacements. Thanks @danut007ro (closes #442) 2012-09-29 15:49:10 +02:00
Filippo Valsorda
ce3674430b added new FAQ on exe dependency 2012-09-29 15:35:07 +02:00
Filippo Valsorda
5cdfaeb37b New FAQ: What is this binary file? (+ small fix to other one) 2012-09-28 19:55:18 +02:00
Philipp Hagemeister
38612b4edc update default UA string (Closes #390) 2012-09-27 23:38:11 +02:00
Philipp Hagemeister
6c5b442a9b Add recent breakage to FAQ (Closes #433) 2012-09-27 23:30:17 +02:00
Philipp Hagemeister
5a5523698d Add new field "extractor" to the info dictionary 2012-09-27 20:48:16 +02:00
Philipp Hagemeister
05a2c206be Merge pull request #425 from danut007ro/master
Provider (youtube, etc) is now saved in info_dict
2012-09-27 11:45:07 -07:00
Philipp Hagemeister
8ca21983d8 Merge pull request #432 from cryzed/master
Fixed YouTube playlist parsing
2012-09-27 11:42:58 -07:00
Philipp Hagemeister
20326b8b1b Let Makefile use youtube-dl source code instead of compiled binary 2012-09-27 20:21:20 +02:00
Philipp Hagemeister
5d534e2fe6 Improve option definitions 2012-09-27 20:19:27 +02:00
Philipp Hagemeister
234e230c87 Merge remote-tracking branch 'FiloSottille/vbr'
Conflicts:
	youtube-dl
	youtube-dl.exe
2012-09-27 20:18:29 +02:00
Philipp Hagemeister
34ae0f9d20 Merge branch 'master' of github.com:rg3/youtube-dl 2012-09-27 19:56:29 +02:00
Philipp Hagemeister
df09e5f9e1 Merge pull request #405 from hdclark/master
Support for custom user agent
2012-09-27 10:56:25 -07:00
cryzed
3af2f7656c Fixed YouTube playlist parsing 2012-09-27 19:48:29 +02:00
Philipp Hagemeister
74e716bb64 original test video 2012-09-27 19:44:44 +02:00
Philipp Hagemeister
85f76ac90b Merge remote-tracking branch 'FiloSottille/automation' 2012-09-27 19:41:51 +02:00
Philipp Hagemeister
7f36e39676 Merge remote-tracking branch 'FiloSottille/supports'
Conflicts:
	youtube-dl
2012-09-27 19:24:41 +02:00
Philipp Hagemeister
ebe3f89ea4 Merge xnxx.com Support (NSFW). Test URL (SFW): http://video.xnxx.com/video1443330/youtube-dl_testvid_a_and_9829_._and_amp_and_38_ 2012-09-27 18:55:56 +02:00
Philipp Hagemeister
b5de8af234 Release 2012.09.27 2012-09-27 11:25:46 +02:00
Philipp Hagemeister
eb817499b0 Compile updated youtube-dl 2012-09-27 11:23:44 +02:00
Philipp Hagemeister
e2af9232b2 Merge pull request #428 from virtulis/master
A quick fix to #427
2012-09-27 02:22:05 -07:00
Danko Alexeyev
9ca667065e Add 'signature' to YouTube URLs, fixes #427 2012-09-27 09:44:49 +03:00
danut007ro
ae16f68f4a Provider (youtube, etc) is now saved in info_dict, so template filename can be something like %(provider)s_%(id)s.%(ext)s
This can be useful because videos should also be identified by their providers since id's can be the same on multiple providers.
2012-09-27 00:35:31 +03:00
danut007ro
3cd98c7894 Removed provider (mistake) and add provider parameter to process_info 2012-09-27 00:07:20 +03:00
danut007ro
2866e68838 Merge branch 'master' of https://github.com/rg3/youtube-dl 2012-09-26 21:09:44 +03:00
danut007ro
be8786a6a4 Every extractor also return it's name. 2012-09-26 21:00:28 +03:00
Filippo Valsorda
0e841bdc54 add PREFIX option to make install 2012-09-26 00:10:39 +02:00
Filippo Valsorda
225dceb046 moved make release to devscripts/release.sh 2012-09-25 23:56:01 +02:00
Philipp Hagemeister
b0d4f95899 Merge pull request #391 from rbrito/support-tube.majestyc.net
Support downloading Youtube videos via tube.majestyc.net
2012-09-25 14:17:13 -07:00
Kevin Kwan
d443aca863 Add InfoExtractor for Google Plus video 2012-09-25 16:21:02 +08:00
Christian Albrecht
2ebc6e6a92 Make youtube-dl 2012-08-26 09:57:49 +02:00
Christian Albrecht
f2ad10a97d Add arte.tv Info Extractor 2012-08-26 09:47:19 +02:00
hdclark
ea46fe2dd4 Added support for custom user agents.
Added a few simple lines to add support for the flag "--user-agent" to pass a custom string to std_header['User-Agent'].
2012-08-22 23:40:35 -07:00
Filippo Valsorda
202e76cfb0 Made the YouTubeIE regex verbose/commented 2012-08-20 00:58:10 +02:00
Filippo Valsorda
3a68d7b467 tweaked the --audio-quality input validation/specification 2012-08-19 23:25:16 +02:00
Filippo Valsorda
795cc5059a Re-engineered XNXXIE to actually exit on ERRORs even with -i 2012-08-19 18:46:23 +02:00
Filippo Valsorda
5dc846fad0 Merge pull request #398 from tempname/master 2012-08-19 18:39:43 +02:00
Filippo Valsorda
d5c4c4c10e bugfix and standarize the youku.com support 2012-08-19 17:44:34 +02:00
Filippo Valsorda
1ac3e3315e Merge pull request #395 from thesues/master 2012-08-19 17:08:39 +02:00
Filippo Valsorda
0e4dc2fc74 Merge 'rbrito/support-tube.majestyc.net' (PR #391) with small fix 2012-08-19 17:00:20 +02:00
Filippo Valsorda
9bb8dc8e42 Python 2.6 compatibility fix. Thanks @Jamesc359 - closes #400 2012-08-19 16:06:33 +02:00
tempname
154b55dae3 added InfoExtractor for XNXX 2012-08-15 20:57:27 -03:00
tempname
6de7ef9b8d added InfoExtractor for XNXX 2012-08-15 20:54:03 -03:00
dongmao zhang
392105265c Merge branch 'master' of github.com:thesues/youtube-dl
Conflicts:
	youtube-dl
	youtube_dl/InfoExtractors.py
2012-08-10 18:32:28 +08:00
dongmao zhang
51661d8600 add www.youku.com support 2012-08-09 13:54:19 +08:00
dongmao zhang
b5809a68bf merge 2012-08-09 12:26:26 +08:00
dongmao zhang
7733d455c8 fix 0a->0A bug 2012-08-09 03:14:02 +08:00
dongmao zhang
0a98b09bc2 youku default to download hd2 video 2012-08-09 02:53:21 +08:00
dongmao zhang
302efc19ea add youku support 2012-08-09 02:04:02 +08:00
Rogério Brito
55a1fa8a56 Support downloading Youtube videos via tube.majestyc.net
A user requested (in Debian's bug tracking system) that support for
tube.majestyc.net, a frontend for Youtube with accessibility functions
(and other support for other assistive technologies), be added.

This patch adds support for this.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2012-08-05 23:37:33 -03:00
Filippo Valsorda
dce1088450 A more "make-esque" Makefile with file targets and dependencies 2012-08-03 20:10:54 +02:00
Philipp Hagemeister
a171dbfc27 Merge pull request #386 from FiloSottile/blip
Blip.tv
2012-08-01 12:26:00 -07:00
Filippo Valsorda
11a141dec9 BlipTVUserIE fix 2012-08-01 21:11:04 +02:00
Filippo Valsorda
818282710b moved the User-Agent workaround to the BlipTV IE 2012-08-01 20:51:56 +02:00
Filippo Valsorda
7a7c093ab0 added one-step realese script 'make release version=nn' - closes #158 2012-08-01 18:40:27 +02:00
Filippo Valsorda
ce7b2a40d0 added automatically generated bash-completion; closes #191 2012-08-01 17:26:50 +02:00
Filippo Valsorda
cfcec69331 auto-generating manpage from README.md (closes #151); redesigned Makefile 2012-08-01 11:54:27 +02:00
Filippo Valsorda
91645066e2 Merge branch 'joehillen/master' - pull request #381 2012-08-01 11:35:04 +02:00
Filippo Valsorda
dee5d76923 changed YouTube closed captions URL; closes #382 2012-07-31 15:56:35 +02:00
Filippo Valsorda
363a4e1114 xvideos patch by @pocoimporta - closes #370 2012-07-31 01:40:29 +02:00
joehillen
ef0c08cdfe Added install target to Makefile. 2012-07-22 13:36:22 -07:00
Philipp Hagemeister
3210735c49 Fix EscapistMagazine IE 2012-07-18 21:17:51 +02:00
Joel Verhagen
aab4fca422 Updated --no-resize-buffer docs, removed -b option 2012-07-16 10:59:21 -04:00
Joel Verhagen
891d7f2329 Added options to set download buffer size and disable automatic buffer resizing. 2012-07-14 16:47:19 -04:00
Filippo Valsorda
b24676ce88 changed --audio-quality behaviour to support both CBR and VBR 2012-07-14 19:43:24 +02:00
Filippo Valsorda
cca4828ac9 fixed a logic bug in post-processing 2012-07-14 14:35:57 +02:00
Arvydas Sidorenko
bae611f216 Simplified preferredencoding()
Not sure what is the point to use yield to return encoding, thus
it will simplify the whole function.

Signed-off-by: Arvydas Sidorenko <asido4@gmail.com>
2012-07-01 18:21:27 +02:00
Filippo Valsorda
d4e16d3e97 YouTube playlist fix; closes #365 and #331 2012-06-30 15:04:30 +02:00
Filippo Valsorda
65dc7d0272 Merge pull request #363 from chalet16/master
Change a number of subtitle sequence to begin with one - closes #362
2012-06-26 05:35:37 -07:00
Witchakorn Kamolpornwijit
5404179338 Change a number of subtitle sequence to begin with one (instead of zero) for ffmpeg,avcodec, and Matroska compatibility 2012-06-26 19:24:30 +07:00
Filippo Valsorda
7df97fb59f display a meaningful error message on rental videos (#359) 2012-06-22 13:57:17 +02:00
Filippo Valsorda
3187e42a23 Merge pull requests #356 #357 #358 by jcarlosgarciasegovia 2012-06-06 20:51:29 +02:00
Juan Carlos Garcia Segovia
f1927d71e4 Some blip.tv URLs use Unicode characters. urllib2 breaks when passing a Unicode string. it needs a UTF-8 byte buffer 2012-06-06 16:24:29 +00:00
Juan Carlos Garcia Segovia
eeeb4daabc Information Extractor for blip.tv users 2012-06-06 16:16:16 +00:00
Juan Carlos Garcia Segovia
3c4fc580bb Use an User-Agent that will allow downloading from blip.tv fixes #325 2012-06-06 13:24:12 +00:00
Filippo Valsorda
17f3c40a31 Merge pull request #353 from FiloSottile/avconv
check for avconv and ffmpeg, use as available; closes #344
2012-06-03 03:39:16 -07:00
Filippo Valsorda
505ed3088f normalize ffmpeg/avconv names printing 2012-06-03 12:11:39 +02:00
Filippo Valsorda
0b976545c7 check for avconv and ffmpeg, use as available; closes #344 2012-06-03 12:10:15 +02:00
Philipp Hagemeister
a047951477 Merge pull request #352 from chocolateboy/decontaminate_stdout
don't corrupt stdout (-o -) in verbose mode
2012-05-31 00:04:32 -07:00
chocolateboy
6ab92c8b62 don't corrupt stdout (-o -) in verbose mode 2012-05-30 11:50:13 +01:00
Filippo Valsorda
f36cd07685 fixed a couple of Windows exe update bugs 2012-05-27 23:03:45 +02:00
Philipp Hagemeister
668d975039 quiet zip in make compile 2012-05-23 19:19:53 +02:00
Philipp Hagemeister
9ab3406ddb Fix Escapist IE 2012-05-23 19:19:31 +02:00
Philipp Hagemeister
1b91a2e2cf Merge pull request #342 from FiloSottile/master
Re-organized code and a lot of other stuff.
2012-05-22 04:35:59 -07:00
Filippo Valsorda
2c288bda42 reorganized the titles sanitizing: now title is the untouched title
and stitle is created in process_info() and is cross-filesystem sanitized by sanitize_filename();
closes #164
2012-05-09 14:47:28 +02:00
Filippo Valsorda
0b8c922da9 Introduced Trouble(Exception) for more elegant non-fatal errors handling 2012-05-09 09:43:11 +00:00
Filippo Valsorda
3fe294e4ef merge upstream 2012-05-01 18:22:08 +02:00
Filippo Valsorda
921a145592 dropped the support for Python 2.5
let's elaborate the decision: Python 2.5 is a 6 years old release
and "under the current release policy, no security issues in Python
2.5 will be fixed anymore" (!!); also, it doesn't support the new
zipfile distribution format.
2012-05-01 17:01:51 +02:00
Philipp Hagemeister
0c24eed73a merge #336 2012-04-19 09:46:01 +02:00
Philipp Hagemeister
29ce2c1201 Merge git://git.jankratochvil.net/youtube-dl 2012-04-19 09:44:25 +02:00
Jan Kratochvil
532c74ae86 Add format #46 - WebM 1920x1080. 2012-04-16 17:13:01 +02:00
Filippo Valsorda
9beb5af82e some HTMLParser bugfixes 2012-04-13 22:09:24 +02:00
Filippo Valsorda
9e6dd23876 merged unescapeHTML branch; removed lxml dependency 2012-04-11 00:22:51 +02:00
Filippo Valsorda
7a8501e307 ignore parsing errors in get_element_by_id() 2012-04-10 23:08:53 +02:00
Filippo Valsorda - Campagna
781cc523af removed the undocumented HTMLParser.unescape, replaced with _unescapeHTML; fixed a bug in the use of _unescapeHTML (missing _, from d6a9615347) 2012-04-10 18:54:40 +02:00
Filippo Valsorda - Campagna
c6f45d4314 removed dependency from lxml: added IDParser 2012-04-10 18:21:00 +02:00
Filippo Valsorda - Campagna
d11d05d07a better naming for the sub-modules 2012-04-10 16:46:36 +02:00
Filippo Valsorda - Campagna
e179aadfdf moved trivialjson to a separate file 2012-04-10 16:37:40 +02:00
Filippo Valsorda - Campagna
d6a9615347 standardized the use of unescapeHTML; added clean_html() 2012-04-10 16:31:46 +02:00
Filippo Valsorda - Campagna
c6306eb798 wine-py2exe.sh to create the exe under linux (!!) 2012-04-07 20:07:42 +02:00
Filippo Valsorda
bcfde70d73 py2exe -U fix for Windows XP 2012-03-31 01:27:47 +02:00
Filippo Valsorda
53e893615d corrected -U to support new zipfile and exe (#153) formats 2012-03-31 01:19:30 +02:00
Filippo Valsorda
303692b5ed 's/ /\t/' 2012-03-30 23:54:16 +02:00
Filippo Valsorda
58ca755f40 moved increment_downloads and process_info calls from IEs to FD.download (#296) (follows current doclines); a small step towards importability #217 2012-03-30 23:45:27 +02:00
Filippo Valsorda
770234afa2 Added py2exe script 2012-03-25 23:48:53 +02:00
Filippo Valsorda
d77c3dfd02 Split code as a package, compiled into an executable zip 2012-03-25 03:07:37 +02:00
Filippo Valsorda
c23d8a74dc Merge branch 'next-url' 2012-03-25 01:07:47 +01:00
Filippo Valsorda
74a5ff5f43 transplant ceba827e9a, d891ff9fd9, 69d3b2d824, 071940680f 2012-03-24 01:23:19 +01:00
Filippo Valsorda
071940680f Always extract original URL from next_url (#318) 2012-03-24 01:17:36 +01:00
Witold Baryluk
69d3b2d824 Extract original URL from next_url parameter of verify_age page, before actual extract 2012-03-23 06:17:29 +01:00
Witold Baryluk
d891ff9fd9 Ignore leading spaces (and trailing also) in all URL from url list or command line 2012-03-23 06:15:57 +01:00
Filippo Valsorda
6af22cf0ef added support for HTTP redirects. Closes #315 2012-03-18 22:15:58 +01:00
Philipp Hagemeister
fff24d5e35 Clean up superfluous whitespace 2012-03-15 20:52:35 +01:00
Philipp Hagemeister
ceba827e9a Credit Filippo Valsorda 2012-03-15 20:47:27 +01:00
Filippo Valsorda
a0432a1e80 added --srt-lang; updated README; extended the -g FAQ 2012-03-15 14:56:08 +01:00
Filippo Valsorda
cfcf32d038 Merge branch 'master' of git://github.com/rg3/youtube-dl into closed-captions 2012-03-15 14:05:34 +01:00
Philipp Hagemeister
a67bdc34fa transplant gist of 7151f63a5f 2012-03-15 08:36:31 +01:00
Philipp Hagemeister
b3a653c245 Merge commit '7151f63a5f3820a322ba8bf61eebe8d9f75d6ee5' 2012-03-15 08:26:44 +01:00
Philipp Hagemeister
4a34b7252e transplant 2934c2ce43 and afbaa80b8b 2012-03-15 08:05:21 +01:00
Philipp Hagemeister
7e45ec57a8 transplant 0f6e296a8e 2012-03-15 07:56:32 +01:00
Filippo Valsorda
afbaa80b8b switched ytsearch to more robust Youtube Data API (fixes #307) 2012-03-14 22:44:45 +01:00
Filippo Valsorda
115d243428 added youtube closed captions .srt support (see #90) 2012-03-13 23:49:33 +01:00
cryzed
7151f63a5f Fixed downloading of unrelated videos when downloading a YouTube playlist 2012-03-09 22:05:35 +01:00
Filippo Valsorda
597e7b1805 Vimeo: Added support for flv only videos 2012-03-07 21:02:12 +01:00
Filippo Valsorda
2934c2ce43 Switch Vimeo to scraping: fixes #285 2012-03-05 17:51:16 +01:00
Filippo Valsorda
0f6e296a8e Fixed gvsearch 2012-03-02 00:35:56 +01:00
129 changed files with 12012 additions and 9405 deletions

17
.gitignore vendored

@@ -1,3 +1,20 @@
*.pyc
*.pyo
*~
*.DS_Store
wine-py2exe/
py2exe.log
*.kate-swp
build/
dist/
MANIFEST
README.txt
youtube-dl.1
youtube-dl.bash-completion
youtube-dl
youtube-dl.exe
youtube-dl.tar.gz
.coverage
cover/
updates_key.pem
*.egg-info

15
.travis.yml Normal file

@@ -0,0 +1,15 @@
language: python
python:
- "2.6"
- "2.7"
- "3.3"
script: nosetests test --verbose
notifications:
email:
- filippo.valsorda@gmail.com
- phihag@phihag.de
- jaime.marquinez.ferrandiz+travis@gmail.com
# irc:
# channels:
# - "irc.freenode.org#youtube-dl"
# skip_join: true

14
CHANGELOG Normal file

@@ -0,0 +1,14 @@
2013.01.02 Codename: GIULIA
* Add support for ComedyCentral clips <nto>
* Corrected Vimeo description fetching <Nick Daniels>
* Added the --no-post-overwrites argument <Barbu Paul - Gheorghe>
* --verbose offers more environment info
* New info_dict field: uploader_id
* New updates system, with signature checking
* New IEs: NBA, JustinTV, FunnyOrDie, TweetReel, Steam, Ustream
* Fixed IEs: BlipTv
* Fixed for Python 3 IEs: Xvideo, Youku, XNXX, Dailymotion, Vimeo, InfoQ
* Simplified IEs and test code
* Various (Python 3 and other) fixes
* Revamped and expanded tests


@@ -1 +1 @@
2012.02.27
2012.12.99

24
LICENSE Normal file

@@ -0,0 +1,24 @@
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>

5
MANIFEST.in Normal file

@@ -0,0 +1,5 @@
include README.md
include test/*.py
include test/*.json
include youtube-dl.bash-completion
include youtube-dl.1


@@ -1,23 +1,78 @@
default: update
all: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion
update: compile update-readme update-latest
clean:
rm -rf youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz
update-latest:
./youtube-dl.dev --version > LATEST_VERSION
cleanall: clean
rm -f youtube-dl youtube-dl.exe
update-readme:
@options=$$(COLUMNS=80 ./youtube-dl.dev --help | sed -e '1,/.*General Options.*/ d' -e 's/^\W\{2\}\(\w\)/### \1/') && \
header=$$(sed -e '/.*## OPTIONS/,$$ d' README.md) && \
footer=$$(sed -e '1,/.*## FAQ/ d' README.md) && \
echo "$${header}" > README.md && \
echo >> README.md && \
echo '## OPTIONS' >> README.md && \
echo "$${options}" >> README.md&& \
echo >> README.md && \
echo '## FAQ' >> README.md && \
echo "$${footer}" >> README.md
PREFIX=/usr/local
BINDIR=$(PREFIX)/bin
MANDIR=$(PREFIX)/man
PYTHON=/usr/bin/env python
compile:
cp youtube_dl/__init__.py youtube-dl
# set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
ifeq ($(PREFIX),/usr)
SYSCONFDIR=/etc
else
ifeq ($(PREFIX),/usr/local)
SYSCONFDIR=/etc
else
SYSCONFDIR=$(PREFIX)/etc
endif
endif
.PHONY: default compile update update-latest update-readme
install: youtube-dl youtube-dl.1 youtube-dl.bash-completion
install -d $(DESTDIR)$(BINDIR)
install -m 755 youtube-dl $(DESTDIR)$(BINDIR)
install -d $(DESTDIR)$(MANDIR)/man1
install -m 644 youtube-dl.1 $(DESTDIR)$(MANDIR)/man1
install -d $(DESTDIR)$(SYSCONFDIR)/bash_completion.d
install -m 644 youtube-dl.bash-completion $(DESTDIR)$(SYSCONFDIR)/bash_completion.d/youtube-dl
test:
#nosetests --with-coverage --cover-package=youtube_dl --cover-html --verbose --processes 4 test
nosetests --verbose test
tar: youtube-dl.tar.gz
.PHONY: all clean install test tar bash-completion pypi-files
pypi-files: youtube-dl.bash-completion README.txt youtube-dl.1
youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
zip --quiet youtube-dl youtube_dl/*.py youtube_dl/*/*.py
zip --quiet --junk-paths youtube-dl youtube_dl/__main__.py
echo '#!$(PYTHON)' > youtube-dl
cat youtube-dl.zip >> youtube-dl
rm youtube-dl.zip
chmod a+x youtube-dl
README.md: youtube_dl/*.py youtube_dl/*/*.py
COLUMNS=80 python -m youtube_dl --help | python devscripts/make_readme.py
README.txt: README.md
pandoc -f markdown -t plain README.md -o README.txt
youtube-dl.1: README.md
pandoc -s -f markdown -t man README.md -o youtube-dl.1
youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in
python devscripts/bash-completion.py
bash-completion: youtube-dl.bash-completion
youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion
@tar -czf youtube-dl.tar.gz --transform "s|^|youtube-dl/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \
--exclude '*.kate-swp' \
--exclude '*.pyc' \
--exclude '*.pyo' \
--exclude '*~' \
--exclude '__pycache' \
--exclude '.git' \
-- \
bin devscripts test youtube_dl \
CHANGELOG LICENSE README.md README.txt \
Makefile MANIFEST.in youtube-dl.1 youtube-dl.bash-completion setup.py \
youtube-dl

297
README.md

@@ -1,104 +1,207 @@
# youtube-dl
% YOUTUBE-DL(1)
## USAGE
youtube-dl [options] url [url...]
# NAME
youtube-dl - download videos from youtube.com or other video platforms
## DESCRIPTION
# SYNOPSIS
**youtube-dl** [OPTIONS] URL [URL...]
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.x (x being at least 5), and it is not platform specific. It should work in
your Unix box, in Windows or in Mac OS X. It is released to the public domain,
2.6, 2.7, or 3.3+, and it is not platform specific. It should work on
your Unix box, on Windows or on Mac OS X. It is released to the public domain,
which means you can modify it, redistribute it or use it however you like.
## OPTIONS
-h, --help print this help text and exit
--version print program version and exit
-U, --update update this program to latest version
-i, --ignore-errors continue on download errors
-r, --rate-limit LIMIT download rate limit (e.g. 50k or 44.6m)
-R, --retries RETRIES number of retries (default is 10)
--dump-user-agent display the current browser identification
--list-extractors List all supported extractors and the URLs they
would handle
# OPTIONS
-h, --help print this help text and exit
--version print program version and exit
-U, --update update this program to latest version
-i, --ignore-errors continue on download errors
--dump-user-agent display the current browser identification
--user-agent UA specify a custom user agent
--referer REF specify a custom referer, use if the video access
is restricted to one domain
--list-extractors List all supported extractors and the URLs they
would handle
--extractor-descriptions Output descriptions of all supported extractors
--proxy URL Use the specified HTTP/HTTPS proxy
--no-check-certificate Suppress HTTPS certificate validation.
### Video Selection:
--playlist-start NUMBER playlist video to start at (default is 1)
--playlist-end NUMBER playlist video to end at (default is last)
--match-title REGEX download only matching titles (regex or caseless
sub-string)
--reject-title REGEX skip download for matching titles (regex or
caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
## Video Selection:
--playlist-start NUMBER playlist video to start at (default is 1)
--playlist-end NUMBER playlist video to end at (default is last)
--match-title REGEX download only matching titles (regex or caseless
sub-string)
--reject-title REGEX skip download for matching titles (regex or
caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
--min-filesize SIZE Do not download any videos smaller than SIZE
(e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE (e.g.
50k or 44.6m)
--date DATE download only videos uploaded in this date
--datebefore DATE download only videos uploaded before this date
--dateafter DATE download only videos uploaded after this date
### Filesystem Options:
-t, --title use title in file name
-l, --literal use literal title in file name
-A, --auto-number number downloaded files starting from 00000
-o, --output TEMPLATE output filename template. Use %(stitle)s to get the
title, %(uploader)s for the uploader name,
%(autonumber)s to get an automatically incremented
number, %(ext)s for the filename extension,
%(upload_date)s for the upload date (YYYYMMDD), and
%% for a literal percent. Use - to output to
stdout.
-a, --batch-file FILE file containing URLs to download ('-' for stdin)
-w, --no-overwrites do not overwrite files
-c, --continue resume partially downloaded files
--no-continue do not resume partially downloaded files (restart
from beginning)
--cookies FILE file to read cookies from and dump cookie jar in
--no-part do not use .part files
--no-mtime do not use the Last-modified header to set the file
modification time
--write-description write video description to a .description file
--write-info-json write video metadata to a .info.json file
## Download Options:
-r, --rate-limit LIMIT maximum download rate (e.g. 50k or 44.6m)
-R, --retries RETRIES number of retries (default is 10)
--buffer-size SIZE size of download buffer (e.g. 1024 or 16k)
(default is 1024)
--no-resize-buffer do not automatically adjust the buffer size. By
default, the buffer size is automatically resized
from an initial value of SIZE.
### Verbosity / Simulation Options:
-q, --quiet activates quiet mode
-s, --simulate do not download the video and do not write anything
to disk
--skip-download do not download the video
-g, --get-url simulate, quiet but print URL
-e, --get-title simulate, quiet but print title
--get-thumbnail simulate, quiet but print thumbnail URL
--get-description simulate, quiet but print video description
--get-filename simulate, quiet but print output filename
--get-format simulate, quiet but print output format
--no-progress do not print progress bar
--console-title display progress in console titlebar
-v, --verbose print various debugging information
## Filesystem Options:
-t, --title use title in file name (default)
--id use only video ID in file name
-l, --literal [deprecated] alias of --title
-A, --auto-number number downloaded files starting from 00000
-o, --output TEMPLATE output filename template. Use %(title)s to get
the title, %(uploader)s for the uploader name,
%(uploader_id)s for the uploader nickname if
different, %(autonumber)s to get an automatically
incremented number, %(ext)s for the filename
extension, %(upload_date)s for the upload date
(YYYYMMDD), %(extractor)s for the provider
(youtube, metacafe, etc), %(id)s for the video id
, %(playlist)s for the playlist the video is in,
%(playlist_index)s for the position in the
playlist and %% for a literal percent. Use - to
output to stdout. Can also be used to download to
a different directory, for example with -o '/my/d
ownloads/%(uploader)s/%(title)s-%(id)s.%(ext)s' .
--autonumber-size NUMBER Specifies the number of digits in %(autonumber)s
when it is present in output filename template or
--autonumber option is given
--restrict-filenames Restrict filenames to only ASCII characters, and
avoid "&" and spaces in filenames
-a, --batch-file FILE file containing URLs to download ('-' for stdin)
-w, --no-overwrites do not overwrite files
-c, --continue resume partially downloaded files
--no-continue do not resume partially downloaded files (restart
from beginning)
--cookies FILE file to read cookies from and dump cookie jar in
--no-part do not use .part files
--no-mtime do not use the Last-modified header to set the
file modification time
--write-description write video description to a .description file
--write-info-json write video metadata to a .info.json file
--write-thumbnail write thumbnail image to disk
### Video Format Options:
-f, --format FORMAT video format code
--all-formats download all available video formats
--prefer-free-formats prefer free video formats unless a specific one is
requested
--max-quality FORMAT highest quality format to download
-F, --list-formats list all available formats (currently youtube only)
## Verbosity / Simulation Options:
-q, --quiet activates quiet mode
-s, --simulate do not download the video and do not write
anything to disk
--skip-download do not download the video
-g, --get-url simulate, quiet but print URL
-e, --get-title simulate, quiet but print title
--get-id simulate, quiet but print id
--get-thumbnail simulate, quiet but print thumbnail URL
--get-description simulate, quiet but print video description
--get-filename simulate, quiet but print output filename
--get-format simulate, quiet but print output format
--newline output progress bar as new lines
--no-progress do not print progress bar
--console-title display progress in console titlebar
-v, --verbose print various debugging information
--dump-intermediate-pages print downloaded pages to debug problems(very
verbose)
### Authentication Options:
-u, --username USERNAME account username
-p, --password PASSWORD account password
-n, --netrc use .netrc authentication data
## Video Format Options:
-f, --format FORMAT video format code, specifiy the order of
preference using slashes: "-f 22/17/18"
--all-formats download all available video formats
--prefer-free-formats prefer free video formats unless a specific one
is requested
--max-quality FORMAT highest quality format to download
-F, --list-formats list all available formats (currently youtube
only)
--write-sub write subtitle file (currently youtube only)
--write-auto-sub write automatic subtitle file (currently youtube
only)
--only-sub [deprecated] alias of --skip-download
--all-subs downloads all the available subtitles of the
video (currently youtube only)
--list-subs lists all available subtitles for the video
(currently youtube only)
--sub-format FORMAT subtitle format [srt/sbv/vtt] (default=srt)
(currently youtube only)
--sub-lang LANG language of the subtitles to download (optional)
use IETF language tags like 'en'
### Post-processing Options:
--extract-audio convert video files to audio-only files (requires
ffmpeg and ffprobe)
--audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a", or "wav";
best by default
--audio-quality QUALITY ffmpeg audio bitrate specification, 128k by default
-k, --keep-video keeps the video file on disk after the post-
processing; the video is erased by default
## Authentication Options:
-u, --username USERNAME account username
-p, --password PASSWORD account password
-n, --netrc use .netrc authentication data
--video-password PASSWORD video password (vimeo only)
## FAQ
## Post-processing Options:
-x, --extract-audio convert video files to audio-only files (requires
ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a", "opus", or
"wav"; best by default
--audio-quality QUALITY ffmpeg/avconv audio quality specification, insert
a value between 0 (better) and 9 (worse) for VBR
or a specific bitrate like 128K (default 5)
--recode-video FORMAT Encode the video to another format if necessary
(currently supported: mp4|flv|ogg|webm)
-k, --keep-video keeps the video file on disk after the post-
processing; the video is erased by default
--no-post-overwrites do not overwrite post-processed files; the post-
processed files are overwritten by default
# CONFIGURATION
You can configure youtube-dl by placing default arguments (such as `--extract-audio --no-mtime` to always extract the audio and not copy the mtime) into `/etc/youtube-dl.conf` and/or `~/.config/youtube-dl.conf`.
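As a rough illustration of how default arguments stored in such a file can be turned into an argument list, here is a minimal sketch. It is not taken from the youtube-dl source: the function name `read_options` and the sample text are assumptions made for the example, and it simply relies on the standard `shlex` module to split each line the way a shell would.

```python
import shlex

def read_options(text):
    """Split configuration text into a flat list of command-line arguments."""
    args = []
    for line in text.splitlines():
        # comments=True drops everything after an unquoted '#'
        args += shlex.split(line, comments=True)
    return args

# Hypothetical configuration file contents
sample = "--extract-audio --no-mtime\n# comments are ignored\n-o '%(title)s.%(ext)s'\n"
print(read_options(sample))
# ['--extract-audio', '--no-mtime', '-o', '%(title)s.%(ext)s']
```

Arguments read this way would typically be placed before the ones given on the command line, so that explicit options can still override the defaults.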
# OUTPUT TEMPLATE
The `-o` option allows users to indicate a template for the output file names. The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are:
- `id`: The sequence will be replaced by the video identifier.
- `url`: The sequence will be replaced by the video URL.
- `uploader`: The sequence will be replaced by the nickname of the person who uploaded the video.
- `upload_date`: The sequence will be replaced by the upload date in YYYYMMDD format.
- `title`: The sequence will be replaced by the video title.
- `ext`: The sequence will be replaced by the appropriate extension (like flv or mp4).
- `epoch`: The sequence will be replaced by the Unix epoch when creating the file.
- `autonumber`: The sequence will be replaced by a five-digit number that will be increased with each download, starting at zero.
- `playlist`: The name or the id of the playlist that contains the video.
- `playlist_index`: The index of the video in the playlist, a five-digit number.
The current default template is `%(title)s-%(id)s.%(ext)s`.
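The `%(NAME)s` sequences have the same shape as Python's dictionary-based string formatting. The following sketch only illustrates that mechanism and is not youtube-dl's own code: the helper `expand_template` and the values in `info` (in particular the uploader and upload date) are made up for the example.

```python
def expand_template(template, info):
    """Return the file name obtained by substituting info into template."""
    return template % info

# Hypothetical metadata for a single video
info = {
    'id': 'BaW_jenozKc',
    'title': 'youtube-dl test video',
    'ext': 'mp4',
    'uploader': 'example_uploader',  # assumed value
    'upload_date': '20121002',       # assumed value
}

print(expand_template('%(title)s-%(id)s.%(ext)s', info))
# youtube-dl test video-BaW_jenozKc.mp4
print(expand_template('100%% %(uploader)s.%(ext)s', info))
# 100% example_uploader.mp4
```

A name that is missing from the dictionary raises a `KeyError` with plain `%` formatting, so a real implementation has to decide how to handle fields an extractor does not provide.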
In some cases, you don't want special characters such as 中, spaces, or &, for example when transferring the downloaded file to a Windows system or passing the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:
$ youtube-dl --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.mp4 # All kinds of weird characters
$ youtube-dl --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.mp4 # A simple file name
# VIDEO SELECTION
Videos can be filtered by their upload date using the options `--date`, `--datebefore` or `--dateafter`. They accept dates in two formats:
- Absolute dates: Dates in the format `YYYYMMDD`.
- Relative dates: Dates in the format `(now|today)[+-][0-9](day|week|month|year)(s)?`
Examples:
$ youtube-dl --dateafter now-6months #will only download the videos uploaded in the last 6 months
$ youtube-dl --date 19700101 #will only download the videos uploaded on January 1, 1970
$ youtube-dl --dateafter 20000101 --datebefore 20100101 #will only download the videos uploaded between 2000 and 2010
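To make the two date formats concrete, here is a small sketch of how such strings could be resolved to calendar dates. It is an illustration under stated assumptions rather than youtube-dl's implementation: the function name `parse_date` is hypothetical, the sketch accepts multi-digit amounts, and months and years are approximated as 30 and 365 days.

```python
import datetime
import re

# Relative form: now or today, optionally followed by +/-N and a unit
_RELATIVE = re.compile(r'(now|today)(?:([+-])(\d+)(day|week|month|year)s?)?$')

def parse_date(spec, today=None):
    """Turn an absolute (YYYYMMDD) or relative date string into a date."""
    today = today or datetime.date.today()
    match = _RELATIVE.match(spec)
    if match is None:
        # Absolute form: YYYYMMDD
        return datetime.datetime.strptime(spec, '%Y%m%d').date()
    _, sign, amount, unit = match.groups()
    if sign is None:
        return today
    # Assumption: a month is 30 days and a year is 365 days
    days_per_unit = {'day': 1, 'week': 7, 'month': 30, 'year': 365}
    delta = datetime.timedelta(days=int(amount) * days_per_unit[unit])
    return today - delta if sign == '-' else today + delta

reference = datetime.date(2013, 7, 11)
print(parse_date('19700101'))                # 1970-01-01
print(parse_date('now-6months', reference))  # 2013-01-12
print(parse_date('today', reference))        # 2013-07-11
```

With dates resolved like this, `--dateafter` and `--datebefore` amount to simple comparisons against a video's upload date.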
# FAQ
### Can you please put the -b option back?
Most people asking this question are not aware that youtube-dl now defaults to downloading the highest available quality as reported by YouTube, which will be 1080p or 720p in some cases, so you no longer need the `-b` option. For some specific videos, maybe YouTube does not report them to be available in a specific high quality format you're interested in. In that case, simply request it with the `-f` option and youtube-dl will try to download it.
### I get HTTP error 402 when trying to download a video. What's this?
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering providing a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is to point a web browser to the YouTube URL, solve the CAPTCHA, and restart youtube-dl.
### I have downloaded a video but how can I play it?
@@ -106,25 +209,49 @@ Once the video is fully downloaded, use any video player, such as [vlc](http://w
### The links provided by youtube-dl -g are not working anymore
The URLs youtube-dl outputs require the downloader to have the correct cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used; use `--dump-user-agent` to see the one in use by youtube-dl.
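If the downloader is itself a Python script, the saved cookies and the user agent can be replayed as sketched below. This assumes Python 3 and assumes the cookie file is in the Netscape/Mozilla `cookies.txt` format; the function name `opener_from_cookie_file` is hypothetical.

```python
import http.cookiejar
import urllib.request


def opener_from_cookie_file(cookie_path, user_agent):
    """Build a urllib opener that replays saved cookies and a fixed User-Agent."""
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    # Also keep session cookies and cookies whose expiry time has passed.
    jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replace the default "Python-urllib/x.y" header with the given user agent.
    opener.addheaders = [('User-Agent', user_agent)]
    return opener
```

The returned opener is used like `opener.open(url)`, where `url` is one of the links printed by `youtube-dl -g` and the user agent string is the one reported by `--dump-user-agent`.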
### ERROR: no fmt_url_map or conn information found in video info
YouTube switched to a new video info format in July 2011, which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
### ERROR: unable to download video ###
YouTube has required an additional signature since September 2012, which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
### SyntaxError: Non-ASCII character ###
The error
File "youtube-dl", line 2
SyntaxError: Non-ASCII character '\x93' ...
means you're using an outdated version of Python. Please update to Python 2.6 or 2.7.
### What is this binary file? Where has the code gone?
Since June 2012 (#342), youtube-dl has been packed as an executable zipfile. Simply unzip it (you might need to rename it to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
### The exe throws a *Runtime error from Visual C++*
To run the exe, you first need to install the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).
# COPYRIGHT
youtube-dl is released into the public domain by the copyright holders.
This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
# BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>
Please include:
* Your exact command line, like `youtube-dl -t "http://www.youtube.com/watch?v=uHlDtZ6Oc3s&feature=channel_video_title"`. A common mistake is not to escape the `&`. Putting URLs in quotes should solve this problem.
* If possible, re-run the command with `--verbose` and include the full output; it is really helpful to us.
* The output of `youtube-dl --version`
* The output of `python --version`
* The name and version of your Operating System ("Ubuntu 11.04 x64" or "Windows 7 x64" is usually enough).
For discussions, join us in the IRC channel #youtube-dl on freenode.

6
bin/youtube-dl Executable file

@@ -0,0 +1,6 @@
#!/usr/bin/env python
import youtube_dl
if __name__ == '__main__':
youtube_dl.main()

Binary file not shown.

Binary file not shown.


@@ -0,0 +1,14 @@
__youtube-dl()
{
local cur prev opts
COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"
opts="{{flags}}"
if [[ ${cur} == * ]] ; then
COMPREPLY=( $(compgen -W "${opts}" -- ${cur}) )
return 0
fi
}
complete -F __youtube-dl youtube-dl

26
devscripts/bash-completion.py Executable file

@@ -0,0 +1,26 @@
#!/usr/bin/env python
import os
from os.path import dirname as dirn
import sys
sys.path.append(dirn(dirn((os.path.abspath(__file__)))))
import youtube_dl
BASH_COMPLETION_FILE = "youtube-dl.bash-completion"
BASH_COMPLETION_TEMPLATE = "devscripts/bash-completion.in"
def build_completion(opt_parser):
opts_flag = []
for group in opt_parser.option_groups:
for option in group.option_list:
#for every long flag
opts_flag.append(option.get_opt_string())
with open(BASH_COMPLETION_TEMPLATE) as f:
template = f.read()
with open(BASH_COMPLETION_FILE, "w") as f:
#just using the special char
filled_template = template.replace("{{flags}}", " ".join(opts_flag))
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)


@@ -0,0 +1,33 @@
#!/usr/bin/env python3
import json
import sys
import hashlib
import urllib.request
if len(sys.argv) <= 1:
print('Specify the version number as parameter')
sys.exit()
version = sys.argv[1]
with open('update/LATEST_VERSION', 'w') as f:
f.write(version)
versions_info = json.load(open('update/versions.json'))
if 'signature' in versions_info:
del versions_info['signature']
new_version = {}
filenames = {'bin': 'youtube-dl', 'exe': 'youtube-dl.exe', 'tar': 'youtube-dl-%s.tar.gz' % version}
for key, filename in filenames.items():
print('Downloading and checksumming %s...' %filename)
url = 'http://youtube-dl.org/downloads/%s/%s' % (version, filename)
data = urllib.request.urlopen(url).read()
sha256sum = hashlib.sha256(data).hexdigest()
new_version[key] = (url, sha256sum)
versions_info['versions'][version] = new_version
versions_info['latest'] = version
json.dump(versions_info, open('update/versions.json', 'w'), indent=4, sort_keys=True)


@@ -0,0 +1,32 @@
#!/usr/bin/env python3
import hashlib
import shutil
import subprocess
import tempfile
import urllib.request
import json
versions_info = json.load(open('update/versions.json'))
version = versions_info['latest']
URL = versions_info['versions'][version]['bin'][0]
data = urllib.request.urlopen(URL).read()
# Read template page
with open('download.html.in', 'r', encoding='utf-8') as tmplf:
template = tmplf.read()
md5sum = hashlib.md5(data).hexdigest()
sha1sum = hashlib.sha1(data).hexdigest()
sha256sum = hashlib.sha256(data).hexdigest()
template = template.replace('@PROGRAM_VERSION@', version)
template = template.replace('@PROGRAM_URL@', URL)
template = template.replace('@PROGRAM_MD5SUM@', md5sum)
template = template.replace('@PROGRAM_SHA1SUM@', sha1sum)
template = template.replace('@PROGRAM_SHA256SUM@', sha256sum)
template = template.replace('@EXE_URL@', versions_info['versions'][version]['exe'][0])
template = template.replace('@EXE_SHA256SUM@', versions_info['versions'][version]['exe'][1])
template = template.replace('@TAR_URL@', versions_info['versions'][version]['tar'][0])
template = template.replace('@TAR_SHA256SUM@', versions_info['versions'][version]['tar'][1])
with open('download.html', 'w', encoding='utf-8') as dlf:
dlf.write(template)


@@ -0,0 +1,32 @@
#!/usr/bin/env python3
import rsa
import json
from binascii import hexlify
try:
input = raw_input
except NameError:
pass
versions_info = json.load(open('update/versions.json'))
if 'signature' in versions_info:
del versions_info['signature']
print('Enter the PKCS1 private key, followed by a blank line:')
privkey = b''
while True:
try:
line = input()
except EOFError:
break
if line == '':
break
privkey += line.encode('ascii') + b'\n'
privkey = rsa.PrivateKey.load_pkcs1(privkey)
signature = hexlify(rsa.pkcs1.sign(json.dumps(versions_info, sort_keys=True).encode('utf-8'), privkey, 'SHA-256')).decode()
print('signature: ' + signature)
versions_info['signature'] = signature
json.dump(versions_info, open('update/versions.json', 'w'), indent=4, sort_keys=True)


@@ -0,0 +1,21 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import with_statement
import datetime
import glob
import io # For Python 2 compatibility
import os
import re
year = str(datetime.datetime.now().year)
for fn in glob.glob('*.html*'):
with io.open(fn, encoding='utf-8') as f:
content = f.read()
newc = re.sub(u'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', u'Copyright © 2006-' + year, content)
if content != newc:
tmpFn = fn + '.part'
with io.open(tmpFn, 'wt', encoding='utf-8') as outf:
outf.write(newc)
os.rename(tmpFn, fn)


@@ -0,0 +1,57 @@
#!/usr/bin/env python3
import datetime
import textwrap
import json
atom_template=textwrap.dedent("""\
<?xml version='1.0' encoding='utf-8'?>
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
<atom:title>youtube-dl releases</atom:title>
<atom:id>youtube-dl-updates-feed</atom:id>
<atom:updated>@TIMESTAMP@</atom:updated>
@ENTRIES@
</atom:feed>""")
entry_template=textwrap.dedent("""
<atom:entry>
<atom:id>youtube-dl-@VERSION@</atom:id>
<atom:title>New version @VERSION@</atom:title>
<atom:link href="http://rg3.github.io/youtube-dl" />
<atom:content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
Downloads available at <a href="http://youtube-dl.org/downloads/@VERSION@/">http://youtube-dl.org/downloads/@VERSION@/</a>
</div>
</atom:content>
<atom:author>
<atom:name>The youtube-dl maintainers</atom:name>
</atom:author>
<atom:updated>@TIMESTAMP@</atom:updated>
</atom:entry>
""")
now = datetime.datetime.now()
now_iso = now.isoformat()
atom_template = atom_template.replace('@TIMESTAMP@',now_iso)
entries=[]
versions_info = json.load(open('update/versions.json'))
versions = list(versions_info['versions'].keys())
versions.sort()
for v in versions:
entry = entry_template.replace('@TIMESTAMP@',v.replace('.','-'))
entry = entry.replace('@VERSION@',v)
entries.append(entry)
entries_str = textwrap.indent(''.join(entries), '\t')
atom_template = atom_template.replace('@ENTRIES@', entries_str)
with open('update/releases.atom','w',encoding='utf-8') as atom_file:
atom_file.write(atom_template)

20
devscripts/make_readme.py Executable file

@@ -0,0 +1,20 @@
import sys
import re
README_FILE = 'README.md'
helptext = sys.stdin.read()
with open(README_FILE) as f:
oldreadme = f.read()
header = oldreadme[:oldreadme.index('# OPTIONS')]
footer = oldreadme[oldreadme.index('# CONFIGURATION'):]
options = helptext[helptext.index(' General Options:')+19:]
options = re.sub(r'^ (\w.+)$', r'## \1', options, flags=re.M)
options = '# OPTIONS\n' + options + '\n'
with open(README_FILE, 'w') as f:
f.write(header)
f.write(options)
f.write(footer)

0
devscripts/posix-locale.sh Normal file → Executable file

104
devscripts/release.sh Executable file

@@ -0,0 +1,104 @@
#!/bin/bash
# IMPORTANT: the following assumptions are made
# * the GH repo is on the origin remote
# * the gh-pages branch is named so locally
# * the git config user.signingkey is properly set
# You will need
# pip install coverage nose rsa
# TODO
# release notes
# make hash on local files
set -e
skip_tests=false
if [ "$1" = '--skip-test' ]; then
skip_tests=true
shift
fi
if [ -z "$1" ]; then echo "ERROR: specify version number like this: $0 1994.09.06"; exit 1; fi
version="$1"
if [ ! -z "`git tag | grep "$version"`" ]; then echo 'ERROR: version already present'; exit 1; fi
if [ ! -z "`git status --porcelain | grep -v CHANGELOG`" ]; then echo 'ERROR: the working directory is not clean; commit or stash changes'; exit 1; fi
if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi
/bin/echo -e "\n### First of all, testing..."
make cleanall
if $skip_tests ; then
echo 'SKIPPING TESTS'
else
nosetests --verbose --with-coverage --cover-package=youtube_dl --cover-html test --stop || exit 1
fi
/bin/echo -e "\n### Changing version in version.py..."
sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
/bin/echo -e "\n### Committing CHANGELOG README.md and youtube_dl/version.py..."
make README.md
git add CHANGELOG README.md youtube_dl/version.py
git commit -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."
git tag -s -m "Release $version" "$version"
git show "$version"
read -p "Is it good, can I push? (y/n) " -n 1
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
echo
MASTER=$(git rev-parse --abbrev-ref HEAD)
git push origin $MASTER:master
git push origin "$version"
/bin/echo -e "\n### OK, now it is time to build the binaries..."
REV=$(git rev-parse HEAD)
make youtube-dl youtube-dl.tar.gz
wget "http://jeromelaheurte.net:8142/download/rg3/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe || \
wget "http://jeromelaheurte.net:8142/build/rg3/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe
mkdir -p "build/$version"
mv youtube-dl youtube-dl.exe "build/$version"
mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz"
RELEASE_FILES="youtube-dl youtube-dl.exe youtube-dl-$version.tar.gz"
(cd build/$version/ && md5sum $RELEASE_FILES > MD5SUMS)
(cd build/$version/ && sha1sum $RELEASE_FILES > SHA1SUMS)
(cd build/$version/ && sha256sum $RELEASE_FILES > SHA2-256SUMS)
(cd build/$version/ && sha512sum $RELEASE_FILES > SHA2-512SUMS)
git checkout HEAD -- youtube-dl youtube-dl.exe
/bin/echo -e "\n### Signing and uploading the new binaries to youtube-dl.org..."
for f in $RELEASE_FILES; do gpg --detach-sig "build/$version/$f"; done
scp -r "build/$version" ytdl@yt-dl.org:html/tmp/
ssh ytdl@yt-dl.org "mv html/tmp/$version html/downloads/"
ssh ytdl@yt-dl.org "sh html/update_latest.sh $version"
/bin/echo -e "\n### Now switching to gh-pages..."
git clone --branch gh-pages --single-branch . build/gh-pages
ROOT=$(pwd)
(
set -e
ORIGIN_URL=$(git config --get remote.origin.url)
cd build/gh-pages
"$ROOT/devscripts/gh-pages/add-version.py" $version
"$ROOT/devscripts/gh-pages/update-feed.py"
"$ROOT/devscripts/gh-pages/sign-versions.py" < "$ROOT/updates_key.pem"
"$ROOT/devscripts/gh-pages/generate-download.py"
"$ROOT/devscripts/gh-pages/update-copyright.py"
git add *.html *.html.in update
git commit -m "release $version"
git show HEAD
read -p "Is it good, can I push? (y/n) " -n 1
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
echo
git push "$ROOT" gh-pages
git push "$ORIGIN_URL" gh-pages
)
rm -rf build
make pypi-files
echo "Uploading to PyPi ..."
python setup.py sdist upload
make clean
/bin/echo -e "\n### DONE!"


@@ -0,0 +1,40 @@
#!/usr/bin/env python
import sys, os
try:
import urllib.request as compat_urllib_request
except ImportError: # Python 2
import urllib2 as compat_urllib_request
sys.stderr.write(u'Hi! We changed distribution method and now youtube-dl needs to update itself one more time.\n')
sys.stderr.write(u'This will only happen once. Simply press enter to go on. Sorry for the trouble!\n')
sys.stderr.write(u'The new location of the binaries is https://github.com/rg3/youtube-dl/downloads, not the git repository.\n\n')
try:
raw_input()
except NameError: # Python 3
input()
filename = sys.argv[0]
API_URL = "https://api.github.com/repos/rg3/youtube-dl/downloads"
BIN_URL = "https://github.com/downloads/rg3/youtube-dl/youtube-dl"
if not os.access(filename, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % filename)
try:
urlh = compat_urllib_request.urlopen(BIN_URL)
newcontent = urlh.read()
urlh.close()
except (IOError, OSError) as err:
sys.exit('ERROR: unable to download latest version')
try:
with open(filename, 'wb') as outf:
outf.write(newcontent)
except (IOError, OSError) as err:
sys.exit('ERROR: unable to overwrite current version')
sys.stderr.write(u'Done! Now you can run youtube-dl.\n')


@@ -0,0 +1,12 @@
from distutils.core import setup
import py2exe
py2exe_options = {
"bundle_files": 1,
"compressed": 1,
"optimize": 2,
"dist_dir": '.',
"dll_excludes": ['w9xpopen.exe']
}
setup(console=['youtube-dl.py'], options={ "py2exe": py2exe_options }, zipfile=None)


@@ -0,0 +1,102 @@
#!/usr/bin/env python
import sys, os
import urllib2
import json, hashlib
def rsa_verify(message, signature, key):
from struct import pack
from hashlib import sha256
from sys import version_info
def b(x):
if version_info[0] == 2: return x
else: return x.encode('latin1')
assert(type(message) == type(b('')))
block_size = 0
n = key[0]
while n:
block_size += 1
n >>= 8
signature = pow(int(signature, 16), key[1], key[0])
raw_bytes = []
while signature:
raw_bytes.insert(0, pack("B", signature & 0xFF))
signature >>= 8
signature = (block_size - len(raw_bytes)) * b('\x00') + b('').join(raw_bytes)
if signature[0:2] != b('\x00\x01'): return False
signature = signature[2:]
if not b('\x00') in signature: return False
signature = signature[signature.index(b('\x00'))+1:]
if not signature.startswith(b('\x30\x31\x30\x0D\x06\x09\x60\x86\x48\x01\x65\x03\x04\x02\x01\x05\x00\x04\x20')): return False
signature = signature[19:]
if signature != sha256(message).digest(): return False
return True
sys.stderr.write(u'Hi! We changed distribution method and now youtube-dl needs to update itself one more time.\n')
sys.stderr.write(u'This will only happen once. Simply press enter to go on. Sorry for the trouble!\n')
sys.stderr.write(u'From now on, get the binaries from http://rg3.github.com/youtube-dl/download.html, not from the git repository.\n\n')
raw_input()
filename = sys.argv[0]
UPDATE_URL = "http://rg3.github.io/youtube-dl/update/"
VERSION_URL = UPDATE_URL + 'LATEST_VERSION'
JSON_URL = UPDATE_URL + 'versions.json'
UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
if not os.access(filename, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % filename)
exe = os.path.abspath(filename)
directory = os.path.dirname(exe)
if not os.access(directory, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % directory)
try:
versions_info = urllib2.urlopen(JSON_URL).read().decode('utf-8')
versions_info = json.loads(versions_info)
except:
sys.exit(u'ERROR: can\'t obtain versions info. Please try again later.')
if not 'signature' in versions_info:
sys.exit(u'ERROR: the versions file is not signed or corrupted. Aborting.')
signature = versions_info['signature']
del versions_info['signature']
if not rsa_verify(json.dumps(versions_info, sort_keys=True), signature, UPDATES_RSA_KEY):
sys.exit(u'ERROR: the versions file signature is invalid. Aborting.')
version = versions_info['versions'][versions_info['latest']]
try:
urlh = urllib2.urlopen(version['exe'][0])
newcontent = urlh.read()
urlh.close()
except (IOError, OSError) as err:
sys.exit('ERROR: unable to download latest version')
newcontent_hash = hashlib.sha256(newcontent).hexdigest()
if newcontent_hash != version['exe'][1]:
sys.exit(u'ERROR: the downloaded file hash does not match. Aborting.')
try:
with open(exe + '.new', 'wb') as outf:
outf.write(newcontent)
except (IOError, OSError) as err:
sys.exit(u'ERROR: unable to write the new version')
try:
bat = os.path.join(directory, 'youtube-dl-updater.bat')
b = open(bat, 'w')
b.write("""
echo Updating youtube-dl...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s"
del "%s"
\n""" %(exe, exe, bat))
b.close()
os.startfile(bat)
except (IOError, OSError) as err:
sys.exit('ERROR: unable to overwrite current version')
sys.stderr.write(u'Done! Now you can run youtube-dl.\n')

56
devscripts/wine-py2exe.sh Executable file

@@ -0,0 +1,56 @@
#!/bin/bash
# Run with a setup.py that works in the current directory as the parameter
# e.g. no os.chdir()
# It will run twice, the first time will crash
set -e
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
if [ ! -d wine-py2exe ]; then
sudo apt-get install wine1.3 axel bsdiff
mkdir wine-py2exe
cd wine-py2exe
export WINEPREFIX=`pwd`
axel -a "http://www.python.org/ftp/python/2.7/python-2.7.msi"
axel -a "http://downloads.sourceforge.net/project/py2exe/py2exe/0.6.9/py2exe-0.6.9.win32-py2.7.exe"
#axel -a "http://winetricks.org/winetricks"
# http://appdb.winehq.org/objectManager.php?sClass=version&iId=21957
echo "Follow python setup on screen"
wine msiexec /i python-2.7.msi
echo "Follow py2exe setup on screen"
wine py2exe-0.6.9.win32-py2.7.exe
#echo "Follow Microsoft Visual C++ 2008 Redistributable Package setup on screen"
#bash winetricks vcrun2008
rm py2exe-0.6.9.win32-py2.7.exe
rm python-2.7.msi
#rm winetricks
# http://bugs.winehq.org/show_bug.cgi?id=3591
mv drive_c/Python27/Lib/site-packages/py2exe/run.exe drive_c/Python27/Lib/site-packages/py2exe/run.exe.backup
bspatch drive_c/Python27/Lib/site-packages/py2exe/run.exe.backup drive_c/Python27/Lib/site-packages/py2exe/run.exe "$SCRIPT_DIR/SizeOfImage.patch"
mv drive_c/Python27/Lib/site-packages/py2exe/run_w.exe drive_c/Python27/Lib/site-packages/py2exe/run_w.exe.backup
bspatch drive_c/Python27/Lib/site-packages/py2exe/run_w.exe.backup drive_c/Python27/Lib/site-packages/py2exe/run_w.exe "$SCRIPT_DIR/SizeOfImage_w.patch"
cd -
else
export WINEPREFIX="$( cd wine-py2exe && pwd )"
fi
wine "C:\\Python27\\python.exe" "$1" py2exe > "py2exe.log" 2>&1 || true
echo '# Copying python27.dll' >> "py2exe.log"
cp "$WINEPREFIX/drive_c/windows/system32/python27.dll" build/bdist.win32/winexe/bundle-2.7/
wine "C:\\Python27\\python.exe" "$1" py2exe >> "py2exe.log" 2>&1


@@ -0,0 +1,83 @@
#!/usr/bin/env python
# Generate youtube signature algorithm from test cases
import sys
tests = [
# 88
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[]}|:;?/>.<",
"J:|}][{=+-_)(*&;%$#@>MNBVCXZASDFGH^KLPOIUYTREWQ0987654321mnbvcxzasdfghrklpoiuytej"),
# 87
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$^&*()_-+={[]}|:;?/>.<",
"!?;:|}][{=+-_)(*&^$#@/MNBVCXZASqFGHJKLPOIUYTREWQ0987654321mnbvcxzasdfghjklpoiuytr"),
# 86 - vfl_ymO4Z 2013/06/27
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[|};?/>.<",
"ertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!/#$%^&*()_-+={[|};?@"),
# 85
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[};?/>.<",
"{>/?;}[.=+-_)(*&^%$#@!MqBVCXZASDFwHJKLPOIUYTREWQ0987654321mnbvcxzasdfghjklpoiuytr"),
# 84
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[};?>.<",
"<.>?;}[{=+-_)(*&^%$#@!MNBVCXZASDFGHJKLPOIUYTREWe098765432rmnbvcxzasdfghjklpoiuyt1"),
# 83 - vflcaqGO8 2013/07/11
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!#$%^&*()_+={[};?/>.<",
"urty8ioplkjhgfdsazxcvbqm1234567S90QWERTYUIOPLKJHGFDnAZXCVBNM!#$%^&*()_+={[};?/>.<"),
# 82
("qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKHGFDSAZXCVBNM!@#$%^&*(-+={[};?/>.<",
"Q>/?;}[{=+-(*<^%$#@!MNBVCXZASDFGHKLPOIUY8REWT0q&7654321mnbvcxzasdfghjklpoiuytrew9"),
]
def find_matching(wrong, right):
idxs = [wrong.index(c) for c in right]
return compress(idxs)
return ('s[%d]' % i for i in idxs)
def compress(idxs):
def _genslice(start, end, step):
starts = '' if start == 0 else str(start)
ends = ':%d' % (end+step)
steps = '' if step == 1 else (':%d' % step)
return 's[%s%s%s]' % (starts, ends, steps)
step = None
for i, prev in zip(idxs[1:], idxs[:-1]):
if step is not None:
if i - prev == step:
continue
yield _genslice(start, prev, step)
step = None
continue
if i - prev in [-1, 1]:
step = i - prev
start = prev
continue
else:
yield 's[%d]' % prev
if step is None:
yield 's[%d]' % i
else:
yield _genslice(start, i, step)
def _assert_compress(inp, exp):
res = list(compress(inp))
if res != exp:
print('Got %r, expected %r' % (res, exp))
assert res == exp
_assert_compress([0,2,4,6], ['s[0]', 's[2]', 's[4]', 's[6]'])
_assert_compress([0,1,2,4,6,7], ['s[:3]', 's[4]', 's[6:8]'])
_assert_compress([8,0,1,2,4,7,6,9], ['s[8]', 's[:3]', 's[4]', 's[7:5:-1]', 's[9]'])
def gen(wrong, right, indent):
code = ' + '.join(find_matching(wrong, right))
return 'if len(s) == %d:\n%s return %s\n' % (len(wrong), indent, code)
def genall(tests):
indent = ' ' * 8
return indent + (indent + 'el').join(gen(wrong, right, indent) for wrong,right in tests)
def main():
print(genall(tests))
if __name__ == '__main__':
main()

86
setup.py Normal file

@@ -0,0 +1,86 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import pkg_resources
import sys
try:
from setuptools import setup
except ImportError:
from distutils.core import setup
try:
# This will create an exe that needs Microsoft Visual C++ 2008
# Redistributable Package
import py2exe
except ImportError:
if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
print("Cannot import py2exe", file=sys.stderr)
exit(1)
py2exe_options = {
"bundle_files": 1,
"compressed": 1,
"optimize": 2,
"dist_dir": '.',
"dll_excludes": ['w9xpopen.exe'],
}
py2exe_console = [{
"script": "./youtube_dl/__main__.py",
"dest_base": "youtube-dl",
}]
py2exe_params = {
'console': py2exe_console,
'options': {"py2exe": py2exe_options},
'zipfile': None
}
if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
params = py2exe_params
else:
params = {
'scripts': ['bin/youtube-dl'],
'data_files': [ # Installing system-wide would require sudo...
('etc/bash_completion.d', ['youtube-dl.bash-completion']),
('share/doc/youtube_dl', ['README.txt']),
('share/man/man1/', ['youtube-dl.1'])
]
}
# Get the version from youtube_dl/version.py without importing the package
exec(compile(open('youtube_dl/version.py').read(),
'youtube_dl/version.py', 'exec'))
setup(
name='youtube_dl',
version=__version__,
description='YouTube video downloader',
long_description='Small command-line program to download videos from'
' YouTube.com and other video sites.',
url='https://github.com/rg3/youtube-dl',
author='Ricardo Garcia',
maintainer='Philipp Hagemeister',
maintainer_email='phihag@phihag.de',
packages=['youtube_dl', 'youtube_dl.extractor'],
# Provokes warning on most systems (why?!)
# test_suite = 'nose.collector',
# test_requires = ['nosetest'],
classifiers=[
"Topic :: Multimedia :: Video",
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: Public Domain",
"Programming Language :: Python :: 2.6",
"Programming Language :: Python :: 2.7",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.3"
],
**params
)

44
test/helper.py Normal file

@@ -0,0 +1,44 @@
import io
import json
import os.path
import youtube_dl.extractor
from youtube_dl import YoutubeDL, YoutubeDLHandler
from youtube_dl.utils import (
compat_cookiejar,
compat_urllib_request,
)
# General configuration (from __init__, not very elegant...)
jar = compat_cookiejar.CookieJar()
cookie_processor = compat_urllib_request.HTTPCookieProcessor(jar)
proxy_handler = compat_urllib_request.ProxyHandler()
opener = compat_urllib_request.build_opener(proxy_handler, cookie_processor, YoutubeDLHandler())
compat_urllib_request.install_opener(opener)
PARAMETERS_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), "parameters.json")
with io.open(PARAMETERS_FILE, encoding='utf-8') as pf:
parameters = json.load(pf)
class FakeYDL(YoutubeDL):
def __init__(self):
self.result = []
# Different instances of the downloader can't share the same dictionary
# some tests set the "sublang" parameter, which would break the md5 checks.
self.params = dict(parameters)
def to_screen(self, s):
print(s)
def trouble(self, s, tb=None):
raise Exception(s)
def download(self, x):
self.result.append(x)
def get_testcases():
for ie in youtube_dl.extractor.gen_extractors():
t = getattr(ie, '_TEST', None)
if t:
t['name'] = type(ie).__name__[:-len('IE')]
yield t
for t in getattr(ie, '_TESTS', []):
t['name'] = type(ie).__name__[:-len('IE')]
yield t

44
test/parameters.json Normal file

@@ -0,0 +1,44 @@
{
"consoletitle": false,
"continuedl": true,
"forcedescription": false,
"forcefilename": false,
"forceformat": false,
"forcethumbnail": false,
"forcetitle": false,
"forceurl": false,
"format": null,
"format_limit": null,
"ignoreerrors": false,
"listformats": null,
"logtostderr": false,
"matchtitle": null,
"max_downloads": null,
"nooverwrites": false,
"nopart": false,
"noprogress": false,
"outtmpl": "%(id)s.%(ext)s",
"password": null,
"playlistend": -1,
"playliststart": 1,
"prefer_free_formats": false,
"quiet": false,
"ratelimit": null,
"rejecttitle": null,
"retries": 10,
"simulate": false,
"skip_download": false,
"subtitleslang": null,
"subtitlesformat": "srt",
"test": true,
"updatetime": true,
"usenetrc": false,
"username": null,
"verbose": true,
"writedescription": false,
"writeinfojson": true,
"writesubtitles": false,
"onlysubtitles": false,
"allsubtitles": false,
"listssubtitles": false
}

77
test/test_all_urls.py Normal file

@@ -0,0 +1,77 @@
#!/usr/bin/env python
import sys
import unittest
# Allow direct execution
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.extractor import YoutubeIE, YoutubePlaylistIE, YoutubeChannelIE, JustinTVIE, gen_extractors
from helper import get_testcases
class TestAllURLsMatching(unittest.TestCase):
def test_youtube_playlist_matching(self):
self.assertTrue(YoutubePlaylistIE.suitable(u'ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8'))
self.assertTrue(YoutubePlaylistIE.suitable(u'UUBABnxM4Ar9ten8Mdjj1j0Q')) #585
self.assertTrue(YoutubePlaylistIE.suitable(u'PL63F0C78739B09958'))
self.assertTrue(YoutubePlaylistIE.suitable(u'https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q'))
self.assertTrue(YoutubePlaylistIE.suitable(u'https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8'))
self.assertTrue(YoutubePlaylistIE.suitable(u'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC'))
self.assertTrue(YoutubePlaylistIE.suitable(u'https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012')) #668
self.assertFalse(YoutubePlaylistIE.suitable(u'PLtS2H6bU1M'))
def test_youtube_matching(self):
self.assertTrue(YoutubeIE.suitable(u'PLtS2H6bU1M'))
self.assertFalse(YoutubeIE.suitable(u'https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012')) #668
def test_youtube_channel_matching(self):
self.assertTrue(YoutubeChannelIE.suitable('https://www.youtube.com/channel/HCtnHdj3df7iM'))
self.assertTrue(YoutubeChannelIE.suitable('https://www.youtube.com/channel/HCtnHdj3df7iM?feature=gb_ch_rec'))
self.assertTrue(YoutubeChannelIE.suitable('https://www.youtube.com/channel/HCtnHdj3df7iM/videos'))
def test_justin_tv_channelid_matching(self):
self.assertTrue(JustinTVIE.suitable(u"justin.tv/vanillatv"))
self.assertTrue(JustinTVIE.suitable(u"twitch.tv/vanillatv"))
self.assertTrue(JustinTVIE.suitable(u"www.justin.tv/vanillatv"))
self.assertTrue(JustinTVIE.suitable(u"www.twitch.tv/vanillatv"))
self.assertTrue(JustinTVIE.suitable(u"http://www.justin.tv/vanillatv"))
self.assertTrue(JustinTVIE.suitable(u"http://www.twitch.tv/vanillatv"))
self.assertTrue(JustinTVIE.suitable(u"http://www.justin.tv/vanillatv/"))
self.assertTrue(JustinTVIE.suitable(u"http://www.twitch.tv/vanillatv/"))
def test_justintv_videoid_matching(self):
self.assertTrue(JustinTVIE.suitable(u"http://www.twitch.tv/vanillatv/b/328087483"))
def test_justin_tv_chapterid_matching(self):
self.assertTrue(JustinTVIE.suitable(u"http://www.twitch.tv/tsm_theoddone/c/2349361"))
def test_youtube_extract(self):
self.assertEqual(YoutubeIE()._extract_id('http://www.youtube.com/watch?&v=BaW_jenozKc'), 'BaW_jenozKc')
self.assertEqual(YoutubeIE()._extract_id('https://www.youtube.com/watch?&v=BaW_jenozKc'), 'BaW_jenozKc')
self.assertEqual(YoutubeIE()._extract_id('https://www.youtube.com/watch?feature=player_embedded&v=BaW_jenozKc'), 'BaW_jenozKc')
def test_no_duplicates(self):
ies = gen_extractors()
for tc in get_testcases():
url = tc['url']
for ie in ies:
if type(ie).__name__ in ['GenericIE', tc['name'] + 'IE']:
self.assertTrue(ie.suitable(url), '%s should match URL %r' % (type(ie).__name__, url))
else:
self.assertFalse(ie.suitable(url), '%s should not match URL %r' % (type(ie).__name__, url))
def test_keywords(self):
ies = gen_extractors()
matching_ies = lambda url: [ie.IE_NAME for ie in ies
if ie.suitable(url) and ie.IE_NAME != 'generic']
self.assertEqual(matching_ies(':ytsubs'), ['youtube:subscriptions'])
self.assertEqual(matching_ies(':ytsubscriptions'), ['youtube:subscriptions'])
self.assertEqual(matching_ies(':thedailyshow'), ['ComedyCentral'])
self.assertEqual(matching_ies(':tds'), ['ComedyCentral'])
self.assertEqual(matching_ies(':colbertreport'), ['ComedyCentral'])
self.assertEqual(matching_ies(':cr'), ['ComedyCentral'])
if __name__ == '__main__':
unittest.main()
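
The matching tests above depend on every extractor exposing a `suitable()` classmethod and on only one specialised extractor claiming a given URL. A minimal sketch of that idea, assuming each class carries a `_VALID_URL` pattern; the `Toy*` classes, the `example.com` patterns and `matching_extractors` are hypothetical names for illustration, not the ones used by youtube_dl:

```python
import re


class ToyExtractor(object):
    """Hypothetical base class; not part of youtube_dl."""
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        # An extractor claims a URL when its pattern matches from the start.
        return re.match(cls._VALID_URL, url) is not None


class ToyChannelIE(ToyExtractor):
    _VALID_URL = r'^(?:https?://)?(?:www\.)?example\.com/channel/([0-9A-Za-z_-]+)'


class ToyVideoIE(ToyExtractor):
    _VALID_URL = r'^(?:https?://)?(?:www\.)?example\.com/watch\?v=([0-9A-Za-z_-]{11})$'


def matching_extractors(url, extractors):
    """Return the class names of all extractors that claim the URL."""
    return [ie.__name__ for ie in extractors if ie.suitable(url)]
```

A duplicate-match check like `test_no_duplicates` then reduces to asserting that this list has exactly one entry for each test URL.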

@@ -1,29 +0,0 @@
# -*- coding: utf-8 -*-
# Various small unit tests
import os,sys
sys.path.append(os.path.dirname(os.path.dirname(__file__)))
import youtube_dl
def test_simplify_title():
assert youtube_dl._simplify_title(u'abc') == u'abc'
assert youtube_dl._simplify_title(u'abc_d-e') == u'abc_d-e'
assert youtube_dl._simplify_title(u'123') == u'123'
assert u'/' not in youtube_dl._simplify_title(u'abc/de')
assert u'abc' in youtube_dl._simplify_title(u'abc/de')
assert u'de' in youtube_dl._simplify_title(u'abc/de')
assert u'/' not in youtube_dl._simplify_title(u'abc/de///')
assert u'\\' not in youtube_dl._simplify_title(u'abc\\de')
assert u'abc' in youtube_dl._simplify_title(u'abc\\de')
assert u'de' in youtube_dl._simplify_title(u'abc\\de')
assert youtube_dl._simplify_title(u'ä') == u'ä'
assert youtube_dl._simplify_title(u'кириллица') == u'кириллица'
# Strip underlines
assert youtube_dl._simplify_title(u'\'a_') == u'a'

169
test/test_download.py Normal file

@@ -0,0 +1,169 @@
#!/usr/bin/env python
import errno
import hashlib
import io
import os
import json
import unittest
import sys
import socket
import binascii
# Allow direct execution
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import youtube_dl.YoutubeDL
from youtube_dl.utils import *
PARAMETERS_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), "parameters.json")
RETRIES = 3
# General configuration (from __init__, not very elegant...)
jar = compat_cookiejar.CookieJar()
cookie_processor = compat_urllib_request.HTTPCookieProcessor(jar)
proxy_handler = compat_urllib_request.ProxyHandler()
opener = compat_urllib_request.build_opener(proxy_handler, cookie_processor, YoutubeDLHandler())
compat_urllib_request.install_opener(opener)
socket.setdefaulttimeout(10)
def _try_rm(filename):
""" Remove a file if it exists """
try:
os.remove(filename)
except OSError as ose:
if ose.errno != errno.ENOENT:
raise
md5 = lambda s: hashlib.md5(s.encode('utf-8')).hexdigest()
class YoutubeDL(youtube_dl.YoutubeDL):
def __init__(self, *args, **kwargs):
self.to_stderr = self.to_screen
self.processed_info_dicts = []
super(YoutubeDL, self).__init__(*args, **kwargs)
def report_warning(self, message):
# Don't accept warnings during tests
raise ExtractorError(message)
def process_info(self, info_dict):
self.processed_info_dicts.append(info_dict)
return super(YoutubeDL, self).process_info(info_dict)
def _file_md5(fn):
with open(fn, 'rb') as f:
return hashlib.md5(f.read()).hexdigest()
from helper import get_testcases
defs = get_testcases()
with io.open(PARAMETERS_FILE, encoding='utf-8') as pf:
parameters = json.load(pf)
class TestDownload(unittest.TestCase):
maxDiff = None
def setUp(self):
self.parameters = parameters
self.defs = defs
### Dynamically generate tests
def generator(test_case):
def test_template(self):
ie = youtube_dl.extractor.get_info_extractor(test_case['name'])
def print_skipping(reason):
print('Skipping %s: %s' % (test_case['name'], reason))
if not ie._WORKING:
print_skipping('IE marked as not _WORKING')
return
if 'playlist' not in test_case and not test_case['file']:
print_skipping('No output file specified')
return
if 'skip' in test_case:
print_skipping(test_case['skip'])
return
params = self.parameters.copy()
params.update(test_case.get('params', {}))
ydl = YoutubeDL(params)
ydl.add_default_info_extractors()
finished_hook_called = set()
def _hook(status):
if status['status'] == 'finished':
finished_hook_called.add(status['filename'])
ydl.fd.add_progress_hook(_hook)
test_cases = test_case.get('playlist', [test_case])
for tc in test_cases:
_try_rm(tc['file'])
_try_rm(tc['file'] + '.part')
_try_rm(tc['file'] + '.info.json')
try:
for retry in range(1, RETRIES + 1):
try:
ydl.download([test_case['url']])
except (DownloadError, ExtractorError) as err:
if retry == RETRIES: raise
# Check if the exception is not a network related one
if not err.exc_info[0] in (compat_urllib_error.URLError, socket.timeout, UnavailableVideoError):
raise
print('Retrying: {0} failed tries\n\n##########\n\n'.format(retry))
else:
break
for tc in test_cases:
if not test_case.get('params', {}).get('skip_download', False):
self.assertTrue(os.path.exists(tc['file']), msg='Missing file ' + tc['file'])
self.assertTrue(tc['file'] in finished_hook_called)
self.assertTrue(os.path.exists(tc['file'] + '.info.json'))
if 'md5' in tc:
md5_for_file = _file_md5(tc['file'])
self.assertEqual(md5_for_file, tc['md5'])
with io.open(tc['file'] + '.info.json', encoding='utf-8') as infof:
info_dict = json.load(infof)
for (info_field, expected) in tc.get('info_dict', {}).items():
if isinstance(expected, compat_str) and expected.startswith('md5:'):
self.assertEqual(expected, 'md5:' + md5(info_dict.get(info_field)))
else:
got = info_dict.get(info_field)
self.assertEqual(
expected, got,
u'invalid value for field %s, expected %r, got %r' % (info_field, expected, got))
# If checkable fields are missing from the test case, print the info_dict
test_info_dict = dict((key, value if not isinstance(value, compat_str) or len(value) < 250 else 'md5:' + md5(value))
for key, value in info_dict.items()
if value and key in ('title', 'description', 'uploader', 'upload_date', 'uploader_id', 'location'))
if not all(key in tc.get('info_dict', {}).keys() for key in test_info_dict.keys()):
sys.stderr.write(u'\n"info_dict": ' + json.dumps(test_info_dict, ensure_ascii=False, indent=2) + u'\n')
# Check for the presence of mandatory fields
for key in ('id', 'url', 'title', 'ext'):
self.assertTrue(key in info_dict.keys() and info_dict[key])
finally:
for tc in test_cases:
_try_rm(tc['file'])
_try_rm(tc['file'] + '.part')
_try_rm(tc['file'] + '.info.json')
return test_template
### And add them to TestDownload
for n, test_case in enumerate(defs):
test_method = generator(test_case)
tname = 'test_' + str(test_case['name'])
i = 1
while hasattr(TestDownload, tname):
tname = 'test_' + str(test_case['name']) + '_' + str(i)
i += 1
test_method.__name__ = tname
setattr(TestDownload, test_method.__name__, test_method)
del test_method
if __name__ == '__main__':
unittest.main()
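
The test-generation loop at the end of this file can be hard to follow without its indentation. A minimal sketch of the same pattern, assuming a hypothetical `DEFS` list in place of `get_testcases()` and a trivial assertion in place of the real download check:

```python
import unittest

# Hypothetical test definitions; the real ones come from get_testcases().
DEFS = [
    {'name': 'Foo', 'value': 1},
    {'name': 'Foo', 'value': 2},
    {'name': 'Bar', 'value': 3},
]


class TestGenerated(unittest.TestCase):
    pass


def generator(test_case):
    # The closure binds test_case now, so each method keeps its own definition.
    def test_template(self):
        self.assertTrue(test_case['value'] > 0)
    return test_template


for test_case in DEFS:
    test_method = generator(test_case)
    tname = 'test_' + test_case['name']
    i = 1
    # Append a numeric suffix when several definitions share a name.
    while hasattr(TestGenerated, tname):
        tname = 'test_' + test_case['name'] + '_' + str(i)
        i += 1
    test_method.__name__ = tname
    setattr(TestGenerated, tname, test_method)
    del test_method


def generated_test_names():
    """List the test methods that the loop attached to TestGenerated."""
    return sorted(n for n in dir(TestGenerated) if n.startswith('test_'))
```

Going through a `generator` function rather than defining the method directly in the loop matters: a function defined inline would see only the last value of `test_case`.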

26
test/test_execution.py Normal file

@@ -0,0 +1,26 @@
import unittest

import sys
import os
import subprocess

rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

try:
    _DEV_NULL = subprocess.DEVNULL
except AttributeError:
    _DEV_NULL = open(os.devnull, 'wb')

class TestExecution(unittest.TestCase):
    def test_import(self):
        subprocess.check_call([sys.executable, '-c', 'import youtube_dl'], cwd=rootDir)

    def test_module_exec(self):
        if sys.version_info >= (2,7): # Python 2.6 doesn't support package execution
            subprocess.check_call([sys.executable, '-m', 'youtube_dl', '--version'], cwd=rootDir, stdout=_DEV_NULL)

    def test_main_exec(self):
        subprocess.check_call([sys.executable, 'youtube_dl/__main__.py', '--version'], cwd=rootDir, stdout=_DEV_NULL)

if __name__ == '__main__':
    unittest.main()

131
test/test_utils.py Normal file

@@ -0,0 +1,131 @@
#!/usr/bin/env python
# Various small unit tests
import sys
import unittest
import xml.etree.ElementTree
# Allow direct execution
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
#from youtube_dl.utils import htmlentity_transform
from youtube_dl.utils import timeconvert
from youtube_dl.utils import sanitize_filename
from youtube_dl.utils import unescapeHTML
from youtube_dl.utils import orderedSet
from youtube_dl.utils import DateRange
from youtube_dl.utils import unified_strdate
from youtube_dl.utils import find_xpath_attr
if sys.version_info < (3, 0):
_compat_str = lambda b: b.decode('unicode-escape')
else:
_compat_str = lambda s: s
class TestUtil(unittest.TestCase):
def test_timeconvert(self):
self.assertTrue(timeconvert('') is None)
self.assertTrue(timeconvert('bougrg') is None)
def test_sanitize_filename(self):
self.assertEqual(sanitize_filename('abc'), 'abc')
self.assertEqual(sanitize_filename('abc_d-e'), 'abc_d-e')
self.assertEqual(sanitize_filename('123'), '123')
self.assertEqual('abc_de', sanitize_filename('abc/de'))
self.assertFalse('/' in sanitize_filename('abc/de///'))
self.assertEqual('abc_de', sanitize_filename('abc/<>\\*|de'))
self.assertEqual('xxx', sanitize_filename('xxx/<>\\*|'))
self.assertEqual('yes no', sanitize_filename('yes? no'))
self.assertEqual('this - that', sanitize_filename('this: that'))
self.assertEqual(sanitize_filename('AT&T'), 'AT&T')
aumlaut = _compat_str('\xe4')
self.assertEqual(sanitize_filename(aumlaut), aumlaut)
tests = _compat_str('\u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430')
self.assertEqual(sanitize_filename(tests), tests)
forbidden = '"\0\\/'
for fc in forbidden:
for fbc in forbidden:
self.assertTrue(fbc not in sanitize_filename(fc))
def test_sanitize_filename_restricted(self):
self.assertEqual(sanitize_filename('abc', restricted=True), 'abc')
self.assertEqual(sanitize_filename('abc_d-e', restricted=True), 'abc_d-e')
self.assertEqual(sanitize_filename('123', restricted=True), '123')
self.assertEqual('abc_de', sanitize_filename('abc/de', restricted=True))
self.assertFalse('/' in sanitize_filename('abc/de///', restricted=True))
self.assertEqual('abc_de', sanitize_filename('abc/<>\\*|de', restricted=True))
self.assertEqual('xxx', sanitize_filename('xxx/<>\\*|', restricted=True))
self.assertEqual('yes_no', sanitize_filename('yes? no', restricted=True))
self.assertEqual('this_-_that', sanitize_filename('this: that', restricted=True))
tests = _compat_str('a\xe4b\u4e2d\u56fd\u7684c')
self.assertEqual(sanitize_filename(tests, restricted=True), 'a_b_c')
self.assertTrue(sanitize_filename(_compat_str('\xf6'), restricted=True) != '') # No empty filename
forbidden = '"\0\\/&!: \'\t\n()[]{}$;`^,#'
for fc in forbidden:
for fbc in forbidden:
self.assertTrue(fbc not in sanitize_filename(fc, restricted=True))
# Handle a common case more neatly
self.assertEqual(sanitize_filename(_compat_str('\u5927\u58f0\u5e26 - Song'), restricted=True), 'Song')
self.assertEqual(sanitize_filename(_compat_str('\u603b\u7edf: Speech'), restricted=True), 'Speech')
# .. but make sure the file name is never empty
self.assertTrue(sanitize_filename('-', restricted=True) != '')
self.assertTrue(sanitize_filename(':', restricted=True) != '')
def test_sanitize_ids(self):
self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw')
self.assertEqual(sanitize_filename('_BD_eEpuzXw', is_id=True), '_BD_eEpuzXw')
self.assertEqual(sanitize_filename('N0Y__7-UOdI', is_id=True), 'N0Y__7-UOdI')
def test_ordered_set(self):
self.assertEqual(orderedSet([1, 1, 2, 3, 4, 4, 5, 6, 7, 3, 5]), [1, 2, 3, 4, 5, 6, 7])
self.assertEqual(orderedSet([]), [])
self.assertEqual(orderedSet([1]), [1])
#keep the list ordered
self.assertEqual(orderedSet([135, 1, 1, 1]), [135, 1])
def test_unescape_html(self):
self.assertEqual(unescapeHTML(_compat_str('%20;')), _compat_str('%20;'))
def test_daterange(self):
_20century = DateRange("19000101","20000101")
self.assertFalse("17890714" in _20century)
_ac = DateRange("00010101")
self.assertTrue("19690721" in _ac)
_firstmilenium = DateRange(end="10000101")
self.assertTrue("07110427" in _firstmilenium)
def test_unified_dates(self):
self.assertEqual(unified_strdate('December 21, 2010'), '20101221')
self.assertEqual(unified_strdate('8/7/2009'), '20090708')
self.assertEqual(unified_strdate('Dec 14, 2012'), '20121214')
self.assertEqual(unified_strdate('2012/10/11 01:56:38 +0000'), '20121011')
def test_find_xpath_attr(self):
testxml = u'''<root>
<node/>
<node x="a"/>
<node x="a" y="c" />
<node x="b" y="d" />
</root>'''
doc = xml.etree.ElementTree.fromstring(testxml)
self.assertEqual(find_xpath_attr(doc, './/fourohfour', 'n', 'v'), None)
self.assertEqual(find_xpath_attr(doc, './/node', 'x', 'a'), doc[1])
self.assertEqual(find_xpath_attr(doc, './/node', 'y', 'c'), doc[2])
if __name__ == '__main__':
unittest.main()
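
`test_find_xpath_attr` pins down the expected behaviour of the new helper: return the first element under an XPath whose attribute has a given value, or `None`. A sketch of one way to satisfy that contract, using only `xml.etree.ElementTree`; `find_xpath_attr_sketch` is a hypothetical stand-in, and the helper in `youtube_dl.utils` may be written differently:

```python
import xml.etree.ElementTree as ET


def find_xpath_attr_sketch(node, xpath, key, val):
    """Return the first element matching xpath whose attribute key equals val.

    Illustrative only; not the implementation from youtube_dl.utils.
    """
    for element in node.findall(xpath):
        if element.attrib.get(key) == val:
            return element
    return None


TEST_XML = '''<root>
    <node/>
    <node x="a"/>
    <node x="a" y="c" />
    <node x="b" y="d" />
</root>'''
doc = ET.fromstring(TEST_XML)
```

Filtering in Python instead of using an attribute predicate in the XPath expression is a deliberate choice in this sketch, because it does not depend on how much XPath the running ElementTree version supports.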

@@ -0,0 +1,77 @@
#!/usr/bin/env python
# coding: utf-8
import io
import json
import os
import sys
import unittest
# Allow direct execution
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import youtube_dl.YoutubeDL
import youtube_dl.extractor
from youtube_dl.utils import *
PARAMETERS_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), "parameters.json")
# General configuration (from __init__, not very elegant...)
jar = compat_cookiejar.CookieJar()
cookie_processor = compat_urllib_request.HTTPCookieProcessor(jar)
proxy_handler = compat_urllib_request.ProxyHandler()
opener = compat_urllib_request.build_opener(proxy_handler, cookie_processor, YoutubeDLHandler())
compat_urllib_request.install_opener(opener)
class YoutubeDL(youtube_dl.YoutubeDL):
def __init__(self, *args, **kwargs):
super(YoutubeDL, self).__init__(*args, **kwargs)
self.to_stderr = self.to_screen
with io.open(PARAMETERS_FILE, encoding='utf-8') as pf:
params = json.load(pf)
params['writeinfojson'] = True
params['skip_download'] = True
params['writedescription'] = True
TEST_ID = 'BaW_jenozKc'
INFO_JSON_FILE = TEST_ID + '.mp4.info.json'
DESCRIPTION_FILE = TEST_ID + '.mp4.description'
EXPECTED_DESCRIPTION = u'''test chars: "'/\ä↭𝕐
This is a test video for youtube-dl.
For more information, contact phihag@phihag.de .'''
class TestInfoJSON(unittest.TestCase):
def setUp(self):
# Clear old files
self.tearDown()
def test_info_json(self):
ie = youtube_dl.extractor.YoutubeIE()
ydl = YoutubeDL(params)
ydl.add_info_extractor(ie)
ydl.download([TEST_ID])
self.assertTrue(os.path.exists(INFO_JSON_FILE))
with io.open(INFO_JSON_FILE, 'r', encoding='utf-8') as jsonf:
jd = json.load(jsonf)
self.assertEqual(jd['upload_date'], u'20121002')
self.assertEqual(jd['description'], EXPECTED_DESCRIPTION)
self.assertEqual(jd['id'], TEST_ID)
self.assertEqual(jd['extractor'], 'youtube')
self.assertEqual(jd['title'], u'''youtube-dl test video "'/\ä↭𝕐''')
self.assertEqual(jd['uploader'], 'Philipp Hagemeister')
self.assertTrue(os.path.exists(DESCRIPTION_FILE))
with io.open(DESCRIPTION_FILE, 'r', encoding='utf-8') as descf:
descr = descf.read()
self.assertEqual(descr, EXPECTED_DESCRIPTION)
def tearDown(self):
if os.path.exists(INFO_JSON_FILE):
os.remove(INFO_JSON_FILE)
if os.path.exists(DESCRIPTION_FILE):
os.remove(DESCRIPTION_FILE)
if __name__ == '__main__':
unittest.main()

@@ -0,0 +1,98 @@
#!/usr/bin/env python
import sys
import unittest
import json
# Allow direct execution
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.extractor import YoutubeUserIE, YoutubePlaylistIE, YoutubeIE, YoutubeChannelIE, YoutubeShowIE
from youtube_dl.utils import *
from helper import FakeYDL
class TestYoutubeLists(unittest.TestCase):
def assertIsPlaylist(self,info):
"""Make sure the info has '_type' set to 'playlist'"""
self.assertEqual(info['_type'], 'playlist')
def test_youtube_playlist(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re')[0]
self.assertIsPlaylist(result)
self.assertEqual(result['title'], 'ytdl test PL')
ytie_results = [YoutubeIE()._extract_id(url['url']) for url in result['entries']]
self.assertEqual(ytie_results, [ 'bV9L5Ht9LgY', 'FXxLjLQi3Fg', 'tU3Bgo5qJZE'])
def test_issue_673(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
result = ie.extract('PLBB231211A4F62143')[0]
self.assertTrue(len(result['entries']) > 25)
def test_youtube_playlist_long(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')[0]
self.assertIsPlaylist(result)
self.assertTrue(len(result['entries']) >= 799)
def test_youtube_playlist_with_deleted(self):
#651
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')[0]
ytie_results = [YoutubeIE()._extract_id(url['url']) for url in result['entries']]
self.assertFalse('pElCt5oNDuI' in ytie_results)
self.assertFalse('KdPEApIVdWM' in ytie_results)
def test_youtube_playlist_empty(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/playlist?list=PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx')[0]
self.assertIsPlaylist(result)
self.assertEqual(len(result['entries']), 0)
def test_youtube_course(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
# TODO find a > 100 (paginating?) videos course
result = ie.extract('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')[0]
entries = result['entries']
self.assertEqual(YoutubeIE()._extract_id(entries[0]['url']), 'j9WZyLZCBzs')
self.assertEqual(len(entries), 25)
self.assertEqual(YoutubeIE()._extract_id(entries[-1]['url']), 'rYefUsYuEp0')
def test_youtube_channel(self):
dl = FakeYDL()
ie = YoutubeChannelIE(dl)
#test paginated channel
result = ie.extract('https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w')[0]
self.assertTrue(len(result['entries']) > 90)
#test autogenerated channel
result = ie.extract('https://www.youtube.com/channel/HCtnHdj3df7iM/videos')[0]
self.assertTrue(len(result['entries']) >= 18)
def test_youtube_user(self):
dl = FakeYDL()
ie = YoutubeUserIE(dl)
result = ie.extract('https://www.youtube.com/user/TheLinuxFoundation')[0]
self.assertTrue(len(result['entries']) >= 320)
def test_youtube_safe_search(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)
result = ie.extract('PLtPgu7CB4gbY9oDN3drwC3cMbJggS7dKl')[0]
self.assertEqual(len(result['entries']), 2)
def test_youtube_show(self):
dl = FakeYDL()
ie = YoutubeShowIE(dl)
result = ie.extract('http://www.youtube.com/show/airdisasters')
self.assertTrue(len(result) >= 4)
if __name__ == '__main__':
unittest.main()

57
test/test_youtube_sig.py Executable file

@@ -0,0 +1,57 @@
#!/usr/bin/env python
import unittest
import sys
# Allow direct execution
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.extractor.youtube import YoutubeIE
from helper import FakeYDL
sig = YoutubeIE(FakeYDL())._decrypt_signature
class TestYoutubeSig(unittest.TestCase):
def test_43_43(self):
wrong = '5AEEAE0EC39677BC65FD9021CCD115F1F2DBD5A59E4.C0B243A3E2DED6769199AF3461781E75122AE135135'
right = '931EA22157E1871643FA9519676DED253A342B0C.4E95A5DBD2F1F511DCC1209DF56CB77693CE0EAE'
self.assertEqual(sig(wrong), right)
def test_88(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[]}|:;?/>.<"
right = "J:|}][{=+-_)(*&;%$#@>MNBVCXZASDFGH^KLPOIUYTREWQ0987654321mnbvcxzasdfghrklpoiuytej"
self.assertEqual(sig(wrong), right)
def test_87(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$^&*()_-+={[]}|:;?/>.<"
right = "!?;:|}][{=+-_)(*&^$#@/MNBVCXZASqFGHJKLPOIUYTREWQ0987654321mnbvcxzasdfghjklpoiuytr"
self.assertEqual(sig(wrong), right)
def test_86(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[|};?/>.<"
right = "ertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!/#$%^&*()_-+={[|};?@"
self.assertEqual(sig(wrong), right)
def test_85(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[};?/>.<"
right = "{>/?;}[.=+-_)(*&^%$#@!MqBVCXZASDFwHJKLPOIUYTREWQ0987654321mnbvcxzasdfghjklpoiuytr"
self.assertEqual(sig(wrong), right)
def test_84(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!@#$%^&*()_-+={[};?>.<"
right = "<.>?;}[{=+-_)(*&^%$#@!MNBVCXZASDFGHJKLPOIUYTREWe098765432rmnbvcxzasdfghjklpoiuyt1"
self.assertEqual(sig(wrong), right)
def test_83(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKJHGFDSAZXCVBNM!#$%^&*()_+={[};?/>.<"
right = "urty8ioplkjhgfdsazxcvbqm1234567S90QWERTYUIOPLKJHGFDnAZXCVBNM!#$%^&*()_+={[};?/>.<"
self.assertEqual(sig(wrong), right)
def test_82(self):
wrong = "qwertyuioplkjhgfdsazxcvbnm1234567890QWERTYUIOPLKHGFDSAZXCVBNM!@#$%^&*(-+={[};?/>.<"
right = "Q>/?;}[{=+-(*<^%$#@!MNBVCXZASDFGHKLPOIUY8REWT0q&7654321mnbvcxzasdfghjklpoiuytrew9"
self.assertEqual(sig(wrong), right)
if __name__ == '__main__':
unittest.main()

@@ -0,0 +1,95 @@
#!/usr/bin/env python
import sys
import unittest
import json
import io
import hashlib
# Allow direct execution
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.extractor import YoutubeIE
from youtube_dl.utils import *
from helper import FakeYDL
md5 = lambda s: hashlib.md5(s.encode('utf-8')).hexdigest()
class TestYoutubeSubtitles(unittest.TestCase):
def setUp(self):
DL = FakeYDL()
DL.params['allsubtitles'] = False
DL.params['writesubtitles'] = False
DL.params['subtitlesformat'] = 'srt'
DL.params['listsubtitles'] = False
def test_youtube_no_subtitles(self):
DL = FakeYDL()
DL.params['writesubtitles'] = False
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
subtitles = info_dict[0]['subtitles']
self.assertEqual(subtitles, None)
def test_youtube_subtitles(self):
DL = FakeYDL()
DL.params['writesubtitles'] = True
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
sub = info_dict[0]['subtitles'][0]
self.assertEqual(md5(sub[2]), '4cd9278a35ba2305f47354ee13472260')
def test_youtube_subtitles_it(self):
DL = FakeYDL()
DL.params['writesubtitles'] = True
DL.params['subtitleslang'] = 'it'
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
sub = info_dict[0]['subtitles'][0]
self.assertEqual(md5(sub[2]), '164a51f16f260476a05b50fe4c2f161d')
def test_youtube_onlysubtitles(self):
DL = FakeYDL()
DL.params['writesubtitles'] = True
DL.params['onlysubtitles'] = True
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
sub = info_dict[0]['subtitles'][0]
self.assertEqual(md5(sub[2]), '4cd9278a35ba2305f47354ee13472260')
def test_youtube_allsubtitles(self):
DL = FakeYDL()
DL.params['allsubtitles'] = True
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
subtitles = info_dict[0]['subtitles']
self.assertEqual(len(subtitles), 13)
def test_youtube_subtitles_sbv_format(self):
DL = FakeYDL()
DL.params['writesubtitles'] = True
DL.params['subtitlesformat'] = 'sbv'
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
sub = info_dict[0]['subtitles'][0]
self.assertEqual(md5(sub[2]), '13aeaa0c245a8bed9a451cb643e3ad8b')
def test_youtube_subtitles_vtt_format(self):
DL = FakeYDL()
DL.params['writesubtitles'] = True
DL.params['subtitlesformat'] = 'vtt'
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
sub = info_dict[0]['subtitles'][0]
self.assertEqual(md5(sub[2]), '356cdc577fde0c6783b9b822e7206ff7')
def test_youtube_list_subtitles(self):
DL = FakeYDL()
DL.params['listsubtitles'] = True
IE = YoutubeIE(DL)
info_dict = IE.extract('QRS8MkLhQmM')
self.assertEqual(info_dict, None)
def test_youtube_automatic_captions(self):
DL = FakeYDL()
DL.params['writeautomaticsub'] = True
DL.params['subtitleslang'] = 'it'
IE = YoutubeIE(DL)
info_dict = IE.extract('8YoUxe5ncPo')
sub = info_dict[0]['subtitles'][0]
self.assertTrue(sub[2] is not None)
if __name__ == '__main__':
unittest.main()
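
The subtitle tests compare an MD5 digest of the downloaded text instead of embedding the whole subtitle file in the test. A small sketch of that comparison; `digest_matches` is a hypothetical helper added for illustration, while `md5` mirrors the lambda defined at the top of this file:

```python
import hashlib


def md5(text):
    """Hex MD5 digest of a text string, encoded as UTF-8 first."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def digest_matches(text, expected_hex):
    """True when the text hashes to the expected hex digest."""
    return md5(text) == expected_hex.lower()
```

The digest keeps the test file short, at the cost of an unhelpful failure message: a mismatch shows two hashes, not the differing subtitle text.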

4712
youtube-dl

File diff suppressed because it is too large

@@ -1,6 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import youtube_dl
youtube_dl.main()

BIN
youtube-dl.exe Normal file

Binary file not shown.

@@ -0,0 +1,544 @@
import math
import os
import re
import subprocess
import sys
import time
import traceback
if os.name == 'nt':
import ctypes
from .utils import *
class FileDownloader(object):
"""File Downloader class.
File downloader objects are the ones responsible for downloading the
actual video file and writing it to disk.
File downloaders accept a lot of parameters. In order not to saturate
the object constructor with arguments, it receives a dictionary of
options instead.
Available options:
verbose: Print additional info to stdout.
quiet: Do not print messages to stdout.
ratelimit: Download speed limit, in bytes/sec.
retries: Number of times to retry for HTTP error 5xx
buffersize: Size of download buffer in bytes.
noresizebuffer: Do not automatically resize the download buffer.
continuedl: Try to continue downloads if possible.
noprogress: Do not print the progress bar.
logtostderr: Log messages to stderr instead of stdout.
consoletitle: Display progress in console window's titlebar.
nopart: Do not use temporary .part files.
updatetime: Use the Last-modified header to set output file timestamps.
test: Download only first bytes to test the downloader.
min_filesize: Skip files smaller than this size
max_filesize: Skip files larger than this size
"""
params = None
def __init__(self, ydl, params):
"""Create a FileDownloader object with the given options."""
self.ydl = ydl
self._progress_hooks = []
self.params = params
@staticmethod
def format_bytes(bytes):
if bytes is None:
return 'N/A'
if type(bytes) is str:
bytes = float(bytes)
if bytes == 0.0:
exponent = 0
else:
exponent = int(math.log(bytes, 1024.0))
suffix = ['B','KiB','MiB','GiB','TiB','PiB','EiB','ZiB','YiB'][exponent]
converted = float(bytes) / float(1024 ** exponent)
return '%.2f%s' % (converted, suffix)
@staticmethod
def calc_percent(byte_counter, data_len):
if data_len is None:
return '---.-%'
return '%6s' % ('%3.1f%%' % (float(byte_counter) / float(data_len) * 100.0))
@staticmethod
def calc_eta(start, now, total, current):
if total is None:
return '--:--'
dif = now - start
if current == 0 or dif < 0.001: # One millisecond
return '--:--'
rate = float(current) / dif
eta = int((float(total) - float(current)) / rate)
(eta_mins, eta_secs) = divmod(eta, 60)
if eta_mins > 99:
return '--:--'
return '%02d:%02d' % (eta_mins, eta_secs)
@staticmethod
def calc_speed(start, now, bytes):
dif = now - start
if bytes == 0 or dif < 0.001: # One millisecond
return '%10s' % '---b/s'
return '%10s' % ('%s/s' % FileDownloader.format_bytes(float(bytes) / dif))
@staticmethod
def best_block_size(elapsed_time, bytes):
new_min = max(bytes / 2.0, 1.0)
new_max = min(max(bytes * 2.0, 1.0), 4194304) # Do not surpass 4 MB
if elapsed_time < 0.001:
return int(new_max)
rate = bytes / elapsed_time
if rate > new_max:
return int(new_max)
if rate < new_min:
return int(new_min)
return int(rate)
@staticmethod
def parse_bytes(bytestr):
"""Parse a string indicating a byte quantity into an integer."""
matchobj = re.match(r'(?i)^(\d+(?:\.\d+)?)([kMGTPEZY]?)$', bytestr)
if matchobj is None:
return None
number = float(matchobj.group(1))
multiplier = 1024.0 ** 'bkmgtpezy'.index(matchobj.group(2).lower())
return int(round(number * multiplier))
def to_screen(self, *args, **kargs):
self.ydl.to_screen(*args, **kargs)
def to_stderr(self, message):
self.ydl.to_screen(message)
def to_cons_title(self, message):
"""Set console/terminal window title to message."""
if not self.params.get('consoletitle', False):
return
if os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
# c_wchar_p() might not be necessary if `message` is
# already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
elif 'TERM' in os.environ:
self.to_screen('\033]0;%s\007' % message, skip_eol=True)
def trouble(self, *args, **kargs):
self.ydl.trouble(*args, **kargs)
def report_warning(self, *args, **kargs):
self.ydl.report_warning(*args, **kargs)
def report_error(self, *args, **kargs):
self.ydl.report_error(*args, **kargs)
def slow_down(self, start_time, byte_counter):
"""Sleep if the download speed is over the rate limit."""
rate_limit = self.params.get('ratelimit', None)
if rate_limit is None or byte_counter == 0:
return
now = time.time()
elapsed = now - start_time
if elapsed <= 0.0:
return
speed = float(byte_counter) / elapsed
if speed > rate_limit:
time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
def temp_name(self, filename):
"""Returns a temporary filename for the given filename."""
if self.params.get('nopart', False) or filename == u'-' or \
(os.path.exists(encodeFilename(filename)) and not os.path.isfile(encodeFilename(filename))):
return filename
return filename + u'.part'
def undo_temp_name(self, filename):
if filename.endswith(u'.part'):
return filename[:-len(u'.part')]
return filename
def try_rename(self, old_filename, new_filename):
try:
if old_filename == new_filename:
return
os.rename(encodeFilename(old_filename), encodeFilename(new_filename))
except (IOError, OSError) as err:
self.report_error(u'unable to rename file')
def try_utime(self, filename, last_modified_hdr):
"""Try to set the last-modified time of the given file."""
if last_modified_hdr is None:
return
if not os.path.isfile(encodeFilename(filename)):
return
timestr = last_modified_hdr
if timestr is None:
return
filetime = timeconvert(timestr)
if filetime is None:
return filetime
# Ignore obviously invalid dates
if filetime == 0:
return
try:
os.utime(filename, (time.time(), filetime))
except:
pass
return filetime
def report_destination(self, filename):
"""Report destination filename."""
self.to_screen(u'[download] Destination: ' + filename)
def report_progress(self, percent_str, data_len_str, speed_str, eta_str):
"""Report download progress."""
if self.params.get('noprogress', False):
return
clear_line = (u'\x1b[K' if sys.stderr.isatty() and os.name != 'nt' else u'')
if self.params.get('progress_with_newline', False):
self.to_screen(u'[download] %s of %s at %s ETA %s' %
(percent_str, data_len_str, speed_str, eta_str))
else:
self.to_screen(u'\r%s[download] %s of %s at %s ETA %s' %
(clear_line, percent_str, data_len_str, speed_str, eta_str), skip_eol=True)
self.to_cons_title(u'youtube-dl - %s of %s at %s ETA %s' %
(percent_str.strip(), data_len_str.strip(), speed_str.strip(), eta_str.strip()))
def report_resuming_byte(self, resume_len):
"""Report attempt to resume at given byte."""
self.to_screen(u'[download] Resuming download at byte %s' % resume_len)
def report_retry(self, count, retries):
"""Report retry in case of HTTP error 5xx"""
self.to_screen(u'[download] Got server HTTP error. Retrying (attempt %d of %d)...' % (count, retries))
def report_file_already_downloaded(self, file_name):
"""Report file has already been fully downloaded."""
try:
self.to_screen(u'[download] %s has already been downloaded' % file_name)
except (UnicodeEncodeError) as err:
self.to_screen(u'[download] The file has already been downloaded')
def report_unable_to_resume(self):
"""Report it was impossible to resume download."""
self.to_screen(u'[download] Unable to resume')
def report_finish(self):
"""Report download finished."""
if self.params.get('noprogress', False):
self.to_screen(u'[download] Download completed')
else:
self.to_screen(u'')
def _download_with_rtmpdump(self, filename, url, player_url, page_url, play_path, tc_url):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
# Check for rtmpdump first
try:
subprocess.call(['rtmpdump', '-h'], stdout=(open(os.path.devnull, 'w')), stderr=subprocess.STDOUT)
except (OSError, IOError):
self.report_error(u'RTMP download detected but "rtmpdump" could not be run')
return False
verbosity_option = '--verbose' if self.params.get('verbose', False) else '--quiet'
# Download using rtmpdump. rtmpdump returns exit code 2 when
# the connection was interrupted and resuming appears to be
# possible. This is part of rtmpdump's normal usage, AFAIK.
basic_args = ['rtmpdump', verbosity_option, '-r', url, '-o', tmpfilename]
if player_url is not None:
basic_args += ['--swfVfy', player_url]
if page_url is not None:
basic_args += ['--pageUrl', page_url]
if play_path is not None:
basic_args += ['--playpath', play_path]
if tc_url is not None:
basic_args += ['--tcUrl', tc_url]
args = basic_args + [[], ['--resume', '--skip', '1']][self.params.get('continuedl', False)]
if self.params.get('verbose', False):
try:
import pipes
shell_quote = lambda args: ' '.join(map(pipes.quote, args))
except ImportError:
shell_quote = repr
self.to_screen(u'[debug] rtmpdump command line: ' + shell_quote(args))
retval = subprocess.call(args)
while retval == 2 or retval == 1:
prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(u'\r[rtmpdump] %s bytes' % prevsize, skip_eol=True)
time.sleep(5.0) # This seems to be needed
retval = subprocess.call(basic_args + ['-e'] + [[], ['-k', '1']][retval == 1])
cursize = os.path.getsize(encodeFilename(tmpfilename))
if prevsize == cursize and retval == 1:
break
# Some rtmp streams seem to abort after ~ 99.8%. Don't complain for those
if prevsize == cursize and retval == 2 and cursize > 1024:
self.to_screen(u'\r[rtmpdump] Could not download the whole video. This can happen for some advertisements.')
retval = 0
break
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(u'\r[rtmpdump] %s bytes' % fsize)
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
'total_bytes': fsize,
'filename': filename,
'status': 'finished',
})
return True
else:
self.to_stderr(u"\n")
self.report_error(u'rtmpdump exited with code %d' % retval)
return False
def _download_with_mplayer(self, filename, url):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
args = ['mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy', '-dumpstream', '-dumpfile', tmpfilename, url]
# Check for mplayer first
try:
subprocess.call(['mplayer', '-h'], stdout=(open(os.path.devnull, 'w')), stderr=subprocess.STDOUT)
except (OSError, IOError):
self.report_error(u'MMS or RTSP download detected but "%s" could not be run' % args[0] )
return False
# Download using mplayer.
retval = subprocess.call(args)
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(u'\r[%s] %s bytes' % (args[0], fsize))
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
'total_bytes': fsize,
'filename': filename,
'status': 'finished',
})
return True
else:
self.to_stderr(u"\n")
self.report_error(u'mplayer exited with code %d' % retval)
return False
def _do_download(self, filename, info_dict):
url = info_dict['url']
# Check file already present
if self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)) and not self.params.get('nopart', False):
self.report_file_already_downloaded(filename)
self._hook_progress({
'filename': filename,
'status': 'finished',
})
return True
# Attempt to download using rtmpdump
if url.startswith('rtmp'):
return self._download_with_rtmpdump(filename, url,
info_dict.get('player_url', None),
info_dict.get('page_url', None),
info_dict.get('play_path', None),
info_dict.get('tc_url', None))
# Attempt to download using mplayer
if url.startswith('mms') or url.startswith('rtsp'):
return self._download_with_mplayer(filename, url)
tmpfilename = self.temp_name(filename)
stream = None
# Do not include the Accept-Encoding header
headers = {'Youtubedl-no-compression': 'True'}
if 'user_agent' in info_dict:
headers['Youtubedl-user-agent'] = info_dict['user_agent']
basic_request = compat_urllib_request.Request(url, None, headers)
request = compat_urllib_request.Request(url, None, headers)
if self.params.get('test', False):
request.add_header('Range','bytes=0-10240')
# Establish possible resume length
if os.path.isfile(encodeFilename(tmpfilename)):
resume_len = os.path.getsize(encodeFilename(tmpfilename))
else:
resume_len = 0
open_mode = 'wb'
if resume_len != 0:
if self.params.get('continuedl', False):
self.report_resuming_byte(resume_len)
request.add_header('Range','bytes=%d-' % resume_len)
open_mode = 'ab'
else:
resume_len = 0
count = 0
retries = self.params.get('retries', 0)
while count <= retries:
# Establish connection
try:
if count == 0 and 'urlhandle' in info_dict:
data = info_dict['urlhandle']
data = compat_urllib_request.urlopen(request)
break
except (compat_urllib_error.HTTPError, ) as err:
if (err.code < 500 or err.code >= 600) and err.code != 416:
# Unexpected HTTP error
raise
elif err.code == 416:
# Unable to resume (requested range not satisfiable)
try:
# Open the connection again without the range header
data = compat_urllib_request.urlopen(basic_request)
content_length = data.info()['Content-Length']
except (compat_urllib_error.HTTPError, ) as err:
if err.code < 500 or err.code >= 600:
raise
else:
# Examine the reported length
if (content_length is not None and
(resume_len - 100 < int(content_length) < resume_len + 100)):
# The file had already been fully downloaded.
# Explanation of the above condition: issue #175 revealed that
# YouTube sometimes adds or removes a few bytes from the end of the file,
# changing the file size slightly and causing problems for some users. So
# the file is considered completely downloaded if its size differs by
# less than 100 bytes from the one on disk.
self.report_file_already_downloaded(filename)
self.try_rename(tmpfilename, filename)
self._hook_progress({
'filename': filename,
'status': 'finished',
})
return True
else:
# The length does not match, we start the download over
self.report_unable_to_resume()
open_mode = 'wb'
break
# Retry
count += 1
if count <= retries:
self.report_retry(count, retries)
if count > retries:
self.report_error(u'giving up after %s retries' % retries)
return False
data_len = data.info().get('Content-length', None)
if data_len is not None:
data_len = int(data_len) + resume_len
min_data_len = self.params.get("min_filesize", None)
max_data_len = self.params.get("max_filesize", None)
if min_data_len is not None and data_len < min_data_len:
self.to_screen(u'\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
return False
if max_data_len is not None and data_len > max_data_len:
self.to_screen(u'\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
return False
data_len_str = self.format_bytes(data_len)
byte_counter = 0 + resume_len
block_size = self.params.get('buffersize', 1024)
start = time.time()
while True:
# Download and write
before = time.time()
data_block = data.read(block_size)
after = time.time()
if len(data_block) == 0:
break
byte_counter += len(data_block)
# Open file just in time
if stream is None:
try:
(stream, tmpfilename) = sanitize_open(tmpfilename, open_mode)
assert stream is not None
filename = self.undo_temp_name(tmpfilename)
self.report_destination(filename)
except (OSError, IOError) as err:
self.report_error(u'unable to open for writing: %s' % str(err))
return False
try:
stream.write(data_block)
except (IOError, OSError) as err:
self.to_stderr(u"\n")
self.report_error(u'unable to write data: %s' % str(err))
return False
if not self.params.get('noresizebuffer', False):
block_size = self.best_block_size(after - before, len(data_block))
# Progress message
speed_str = self.calc_speed(start, time.time(), byte_counter - resume_len)
if data_len is None:
self.report_progress('Unknown %', data_len_str, speed_str, 'Unknown ETA')
else:
percent_str = self.calc_percent(byte_counter, data_len)
eta_str = self.calc_eta(start, time.time(), data_len - resume_len, byte_counter - resume_len)
self.report_progress(percent_str, data_len_str, speed_str, eta_str)
self._hook_progress({
'downloaded_bytes': byte_counter,
'total_bytes': data_len,
'tmpfilename': tmpfilename,
'filename': filename,
'status': 'downloading',
})
# Apply rate limit
self.slow_down(start, byte_counter - resume_len)
if stream is None:
self.to_stderr(u"\n")
self.report_error(u'Did not get any data blocks')
return False
stream.close()
self.report_finish()
if data_len is not None and byte_counter != data_len:
raise ContentTooShortError(byte_counter, int(data_len))
self.try_rename(tmpfilename, filename)
# Update file modification time
if self.params.get('updatetime', True):
info_dict['filetime'] = self.try_utime(filename, data.info().get('last-modified', None))
self._hook_progress({
'downloaded_bytes': byte_counter,
'total_bytes': byte_counter,
'filename': filename,
'status': 'finished',
})
return True
def _hook_progress(self, status):
for ph in self._progress_hooks:
ph(status)
def add_progress_hook(self, ph):
""" ph gets called on download progress, with a dictionary with the entries
* filename: The final filename
* status: One of "downloading" and "finished"
It can also have some of the following entries:
* downloaded_bytes: Bytes on disk
* total_bytes: Total bytes, None if unknown
* tmpfilename: The filename we're currently writing to
Hooks are guaranteed to be called at least once (with status "finished")
if the download is successful.
"""
self._progress_hooks.append(ph)
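
The hook contract described in the add_progress_hook docstring can be sketched as follows. This is only an illustration under stated assumptions: HookDemo is a hypothetical stand-in that copies the two hook methods above, not the real FileDownloader, and describe is an invented helper name.

```python
# Hypothetical stand-in for the downloader's hook list (not the real FileDownloader).
class HookDemo(object):
    def __init__(self):
        self._progress_hooks = []

    def add_progress_hook(self, ph):
        self._progress_hooks.append(ph)

    def _hook_progress(self, status):
        for ph in self._progress_hooks:
            ph(status)


def describe(status):
    """Turn a status dictionary into a one-line summary (illustrative helper)."""
    if status['status'] == 'finished':
        return u'%s: done' % status['filename']
    # downloaded_bytes and total_bytes are optional; total_bytes may be None
    done = status.get('downloaded_bytes')
    total = status.get('total_bytes')
    if done is None or not total:
        return u'%s: downloading' % status['filename']
    return u'%s: %.1f%%' % (status['filename'], 100.0 * done / total)


seen = []
demo = HookDemo()
demo.add_progress_hook(lambda status: seen.append(describe(status)))
demo._hook_progress({'filename': u'a.flv', 'status': 'downloading',
                     'downloaded_bytes': 512, 'total_bytes': 2048})
demo._hook_progress({'filename': u'a.flv', 'status': 'finished'})
```

A hook should treat every key except filename and status as optional, since the "already downloaded" paths above report only those two.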

4
youtube_dl/InfoExtractors.py Executable file

@@ -0,0 +1,4 @@
# Legacy file for backwards compatibility, use youtube_dl.extractor instead!
from .extractor.common import InfoExtractor, SearchInfoExtractor
from .extractor import gen_extractors, get_info_extractor

233
youtube_dl/PostProcessor.py Normal file

@@ -0,0 +1,233 @@
import os
import subprocess
import sys
import time
from .utils import *
class PostProcessor(object):
"""Post Processor class.
PostProcessor objects can be added to downloaders with their
add_post_processor() method. When the downloader has finished a
successful download, it will take its internal chain of PostProcessors
and start calling the run() method on each one of them, first with
an initial argument and then with the returned value of the previous
PostProcessor.
The chain will be stopped if one of them ever returns None or the end
of the chain is reached.
PostProcessor objects follow a "mutual registration" process similar
to InfoExtractor objects.
"""
_downloader = None
def __init__(self, downloader=None):
self._downloader = downloader
def set_downloader(self, downloader):
"""Sets the downloader for this PP."""
self._downloader = downloader
def run(self, information):
"""Run the PostProcessor.
The "information" argument is a dictionary like the ones
composed by InfoExtractors. The only difference is that this
one has an extra field called "filepath" that points to the
downloaded file.
This method returns a tuple, the first element of which describes
whether the original file should be kept (i.e. not deleted - None for
no preference), and the second of which is the updated information.
In addition, this method may raise a PostProcessingError
exception if post processing fails.
"""
return None, information # by default, keep file and do nothing
class FFmpegPostProcessorError(PostProcessingError):
pass
class AudioConversionError(PostProcessingError):
pass
class FFmpegPostProcessor(PostProcessor):
def __init__(self,downloader=None):
PostProcessor.__init__(self, downloader)
self._exes = self.detect_executables()
@staticmethod
def detect_executables():
def executable(exe):
try:
subprocess.Popen([exe, '-version'], stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
except OSError:
return False
return exe
programs = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']
return dict((program, executable(program)) for program in programs)
def run_ffmpeg(self, path, out_path, opts):
if not self._exes['ffmpeg'] and not self._exes['avconv']:
raise FFmpegPostProcessorError(u'ffmpeg or avconv not found. Please install one.')
cmd = ([self._exes['avconv'] or self._exes['ffmpeg'], '-y', '-i', encodeFilename(path)]
+ opts +
[encodeFilename(self._ffmpeg_filename_argument(out_path))])
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout,stderr = p.communicate()
if p.returncode != 0:
stderr = stderr.decode('utf-8', 'replace')
msg = stderr.strip().split('\n')[-1]
raise FFmpegPostProcessorError(msg)
def _ffmpeg_filename_argument(self, fn):
# ffmpeg broke --, see https://ffmpeg.org/trac/ffmpeg/ticket/2127 for details
if fn.startswith(u'-'):
return u'./' + fn
return fn
class FFmpegExtractAudioPP(FFmpegPostProcessor):
def __init__(self, downloader=None, preferredcodec=None, preferredquality=None, nopostoverwrites=False):
FFmpegPostProcessor.__init__(self, downloader)
if preferredcodec is None:
preferredcodec = 'best'
self._preferredcodec = preferredcodec
self._preferredquality = preferredquality
self._nopostoverwrites = nopostoverwrites
def get_audio_codec(self, path):
if not self._exes['ffprobe'] and not self._exes['avprobe']: return None
try:
cmd = [self._exes['avprobe'] or self._exes['ffprobe'], '-show_streams', encodeFilename(self._ffmpeg_filename_argument(path))]
handle = subprocess.Popen(cmd, stderr=compat_subprocess_get_DEVNULL(), stdout=subprocess.PIPE)
output = handle.communicate()[0]
if handle.wait() != 0:
return None
except (IOError, OSError):
return None
audio_codec = None
for line in output.decode('ascii', 'ignore').split('\n'):
if line.startswith('codec_name='):
audio_codec = line.split('=')[1].strip()
elif line.strip() == 'codec_type=audio' and audio_codec is not None:
return audio_codec
return None
def run_ffmpeg(self, path, out_path, codec, more_opts):
if not self._exes['ffmpeg'] and not self._exes['avconv']:
raise AudioConversionError('ffmpeg or avconv not found. Please install one.')
if codec is None:
acodec_opts = []
else:
acodec_opts = ['-acodec', codec]
opts = ['-vn'] + acodec_opts + more_opts
try:
FFmpegPostProcessor.run_ffmpeg(self, path, out_path, opts)
except FFmpegPostProcessorError as err:
raise AudioConversionError(err.message)
def run(self, information):
path = information['filepath']
filecodec = self.get_audio_codec(path)
if filecodec is None:
raise PostProcessingError(u'WARNING: unable to obtain file audio codec with ffprobe')
more_opts = []
if self._preferredcodec == 'best' or self._preferredcodec == filecodec or (self._preferredcodec == 'm4a' and filecodec == 'aac'):
if filecodec == 'aac' and self._preferredcodec in ['m4a', 'best']:
# Lossless, but in another container
acodec = 'copy'
extension = 'm4a'
more_opts = [self._exes['avconv'] and '-bsf:a' or '-absf', 'aac_adtstoasc']
elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
# Lossless if possible
acodec = 'copy'
extension = filecodec
if filecodec == 'aac':
more_opts = ['-f', 'adts']
if filecodec == 'vorbis':
extension = 'ogg'
else:
# MP3 otherwise.
acodec = 'libmp3lame'
extension = 'mp3'
more_opts = []
if self._preferredquality is not None:
if int(self._preferredquality) < 10:
more_opts += [self._exes['avconv'] and '-q:a' or '-aq', self._preferredquality]
else:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality + 'k']
else:
# We convert the audio (lossy)
acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
extension = self._preferredcodec
more_opts = []
if self._preferredquality is not None:
if int(self._preferredquality) < 10:
more_opts += [self._exes['avconv'] and '-q:a' or '-aq', self._preferredquality]
else:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality + 'k']
if self._preferredcodec == 'aac':
more_opts += ['-f', 'adts']
if self._preferredcodec == 'm4a':
more_opts += [self._exes['avconv'] and '-bsf:a' or '-absf', 'aac_adtstoasc']
if self._preferredcodec == 'vorbis':
extension = 'ogg'
if self._preferredcodec == 'wav':
extension = 'wav'
more_opts += ['-f', 'wav']
prefix, sep, ext = path.rpartition(u'.') # not os.path.splitext, since the latter does not work on unicode in all setups
new_path = prefix + sep + extension
# If we download foo.mp3 and convert it to... foo.mp3, then don't delete foo.mp3, silly.
if new_path == path:
self._nopostoverwrites = True
try:
if self._nopostoverwrites and os.path.exists(encodeFilename(new_path)):
self._downloader.to_screen(u'[youtube] Post-process file %s exists, skipping' % new_path)
else:
self._downloader.to_screen(u'[' + (self._exes['avconv'] and 'avconv' or 'ffmpeg') + '] Destination: ' + new_path)
self.run_ffmpeg(path, new_path, acodec, more_opts)
except:
etype,e,tb = sys.exc_info()
if isinstance(e, AudioConversionError):
msg = u'audio conversion failed: ' + e.message
else:
msg = u'error running ' + (self._exes['avconv'] and 'avconv' or 'ffmpeg')
raise PostProcessingError(msg)
# Try to update the date time for extracted audio file.
if information.get('filetime') is not None:
try:
os.utime(encodeFilename(new_path), (time.time(), information['filetime']))
except:
self._downloader.to_stderr(u'WARNING: Cannot update utime of audio file')
information['filepath'] = new_path
return self._nopostoverwrites,information
class FFmpegVideoConvertor(FFmpegPostProcessor):
def __init__(self, downloader=None,preferedformat=None):
super(FFmpegVideoConvertor, self).__init__(downloader)
self._preferedformat=preferedformat
def run(self, information):
path = information['filepath']
prefix, sep, ext = path.rpartition(u'.')
outpath = prefix + sep + self._preferedformat
if information['ext'] == self._preferedformat:
self._downloader.to_screen(u'[ffmpeg] Not converting video file %s - already is in target format %s' % (path, self._preferedformat))
return True,information
self._downloader.to_screen(u'['+'ffmpeg'+'] Converting video from %s to %s, Destination: ' % (information['ext'], self._preferedformat) +outpath)
self.run_ffmpeg(path, outpath, [])
information['filepath'] = outpath
information['format'] = self._preferedformat
information['ext'] = self._preferedformat
return False,information
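
The run() contract documented above (return a tuple of a keep-the-original preference and the updated information) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the PostProcessor class is a minimal stand-in for the one above, RenameExtPP is a hypothetical processor that only rewrites the path without touching any file, and run_chain shows one possible way a caller might combine the preferences, not necessarily what YoutubeDL does.

```python
class PostProcessor(object):
    """Minimal stand-in mirroring the interface of the class above."""
    def __init__(self, downloader=None):
        self._downloader = downloader

    def run(self, information):
        return None, information  # no preference, information unchanged


class RenameExtPP(PostProcessor):
    """Hypothetical processor: pretends the file was converted to new_ext."""
    def __init__(self, downloader=None, new_ext=u'mkv'):
        PostProcessor.__init__(self, downloader)
        self._new_ext = new_ext

    def run(self, information):
        prefix, sep, ext = information['filepath'].rpartition(u'.')
        updated = dict(information)  # copy so the caller's dict is untouched
        updated['filepath'] = prefix + sep + self._new_ext
        return False, updated  # False: the original file need not be kept


def run_chain(processors, information):
    """Feed each processor the previous one's output (assumed combination rule)."""
    keep_original = None
    for pp in processors:
        keep, information = pp.run(information)
        # Assumption: an explicit True is never overridden by a later False.
        if keep is not None and keep_original is not True:
            keep_original = keep
    return keep_original, information


info = {'filepath': u'clip.flv', 'ext': u'flv'}
keep, result = run_chain([PostProcessor(), RenameExtPP(new_ext=u'mkv')], info)
```

Here keep ends up False and result['filepath'] is u'clip.mkv', while the original info dictionary is left unchanged.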

605
youtube_dl/YoutubeDL.py Normal file

@@ -0,0 +1,605 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import absolute_import
import io
import os
import re
import shutil
import socket
import sys
import time
import traceback
from .utils import *
from .extractor import get_info_extractor, gen_extractors
from .FileDownloader import FileDownloader
class YoutubeDL(object):
"""YoutubeDL class.
YoutubeDL objects are the ones responsible for downloading the
actual video file and writing it to disk if the user has requested
it, among some other tasks. In most cases there should be one per
program. Given a video URL, the downloader doesn't know how to
extract all the needed information (that is the InfoExtractors' task),
so it has to pass the URL to one of them.
For this, YoutubeDL objects have a method that allows
InfoExtractors to be registered in a given order. When it is passed
a URL, the YoutubeDL object hands it to the first InfoExtractor it
finds that reports being able to handle it. The InfoExtractor extracts
all the information about the video or videos the URL refers to, and
YoutubeDL processes the extracted information, possibly using a File
Downloader to download the video.
YoutubeDL objects accept a lot of parameters. In order not to saturate
the object constructor with arguments, it receives a dictionary of
options instead. These options are available through the params
attribute for the InfoExtractors to use. The YoutubeDL also
registers itself as the downloader in charge for the InfoExtractors
that are added to it, so this is a "mutual registration".
Available options:
username: Username for authentication purposes.
password: Password for authentication purposes.
videopassword: Password for accessing a video.
usenetrc: Use netrc for authentication instead.
verbose: Print additional info to stdout.
quiet: Do not print messages to stdout.
forceurl: Force printing final URL.
forcetitle: Force printing title.
forceid: Force printing ID.
forcethumbnail: Force printing thumbnail URL.
forcedescription: Force printing description.
forcefilename: Force printing final filename.
simulate: Do not download the video files.
format: Video format code.
format_limit: Highest quality format to try.
outtmpl: Template for output names.
restrictfilenames: Do not allow "&" and spaces in file names
ignoreerrors: Do not stop on download errors.
nooverwrites: Prevent overwriting files.
playliststart: Playlist item to start at.
playlistend: Playlist item to end at.
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logtostderr: Log messages to stderr instead of stdout.
writedescription: Write the video description to a .description file
writeinfojson: Write the video metadata to a .info.json file
writethumbnail: Write the thumbnail image to a file
writesubtitles: Write the video subtitles to a file
writeautomaticsub: Write the automatic subtitles to a file
allsubtitles: Downloads all the subtitles of the video
listsubtitles: Lists all available subtitles for the video
subtitlesformat: Subtitle format [srt/sbv/vtt] (default=srt)
subtitleslang: Language of the subtitles to download
keepvideo: Keep the video file after post-processing
daterange: A DateRange object, download only if the upload_date is in the range.
skip_download: Skip the actual download of the video file
The following parameters are not used by YoutubeDL itself, they are used by
the FileDownloader:
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
noresizebuffer, retries, continuedl, noprogress, consoletitle
"""
params = None
_ies = []
_pps = []
_download_retcode = None
_num_downloads = None
_screen_file = None
def __init__(self, params):
"""Create a YoutubeDL object with the given options."""
self._ies = []
self._pps = []
self._progress_hooks = []
self._download_retcode = 0
self._num_downloads = 0
self._screen_file = [sys.stdout, sys.stderr][params.get('logtostderr', False)]
self.params = params
self.fd = FileDownloader(self, self.params)
if '%(stitle)s' in self.params['outtmpl']:
self.report_warning(u'%(stitle)s is deprecated. Use the %(title)s and the --restrict-filenames flag (which also secures %(uploader)s et al) instead.')
def add_info_extractor(self, ie):
"""Add an InfoExtractor object to the end of the list."""
self._ies.append(ie)
ie.set_downloader(self)
def add_default_info_extractors(self):
"""
Add the InfoExtractors returned by gen_extractors to the end of the list
"""
for ie in gen_extractors():
self.add_info_extractor(ie)
def add_post_processor(self, pp):
"""Add a PostProcessor object to the end of the chain."""
self._pps.append(pp)
pp.set_downloader(self)
def to_screen(self, message, skip_eol=False):
"""Print message to stdout if not in quiet mode."""
assert type(message) == type(u'')
if not self.params.get('quiet', False):
terminator = [u'\n', u''][skip_eol]
output = message + terminator
if 'b' in getattr(self._screen_file, 'mode', '') or sys.version_info[0] < 3: # Python 2 lies about the mode of sys.stdout/sys.stderr
output = output.encode(preferredencoding(), 'ignore')
self._screen_file.write(output)
self._screen_file.flush()
def to_stderr(self, message):
"""Print message to stderr."""
assert type(message) == type(u'')
output = message + u'\n'
if 'b' in getattr(self._screen_file, 'mode', '') or sys.version_info[0] < 3: # Python 2 lies about the mode of sys.stdout/sys.stderr
output = output.encode(preferredencoding())
sys.stderr.write(output)
def fixed_template(self):
"""Checks if the output template is fixed."""
return (re.search(u'(?u)%\\(.+?\\)s', self.params['outtmpl']) is None)
def trouble(self, message=None, tb=None):
"""Determine action to take when a download problem appears.
Depending on if the downloader has been configured to ignore
download errors or not, this method may throw an exception or
not when errors are found, after printing the message.
tb, if given, is additional traceback information.
"""
if message is not None:
self.to_stderr(message)
if self.params.get('verbose'):
if tb is None:
if sys.exc_info()[0]: # if .trouble has been called from an except block
tb = u''
if hasattr(sys.exc_info()[1], 'exc_info') and sys.exc_info()[1].exc_info[0]:
tb += u''.join(traceback.format_exception(*sys.exc_info()[1].exc_info))
tb += compat_str(traceback.format_exc())
else:
tb_data = traceback.format_list(traceback.extract_stack())
tb = u''.join(tb_data)
self.to_stderr(tb)
if not self.params.get('ignoreerrors', False):
if sys.exc_info()[0] and hasattr(sys.exc_info()[1], 'exc_info') and sys.exc_info()[1].exc_info[0]:
exc_info = sys.exc_info()[1].exc_info
else:
exc_info = sys.exc_info()
raise DownloadError(message, exc_info)
self._download_retcode = 1
def report_warning(self, message):
'''
Print the message to stderr, prefixed with 'WARNING:'.
If stderr is a tty, the 'WARNING:' prefix will be colored.
'''
if sys.stderr.isatty() and os.name != 'nt':
_msg_header=u'\033[0;33mWARNING:\033[0m'
else:
_msg_header=u'WARNING:'
warning_message=u'%s %s' % (_msg_header,message)
self.to_stderr(warning_message)
def report_error(self, message, tb=None):
'''
Do the same as trouble, but prefixes the message with 'ERROR:', colored
in red if stderr is a tty file.
'''
if sys.stderr.isatty() and os.name != 'nt':
_msg_header = u'\033[0;31mERROR:\033[0m'
else:
_msg_header = u'ERROR:'
error_message = u'%s %s' % (_msg_header, message)
self.trouble(error_message, tb)
def slow_down(self, start_time, byte_counter):
"""Sleep if the download speed is over the rate limit."""
rate_limit = self.params.get('ratelimit', None)
if rate_limit is None or byte_counter == 0:
return
now = time.time()
elapsed = now - start_time
if elapsed <= 0.0:
return
speed = float(byte_counter) / elapsed
if speed > rate_limit:
time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
def report_writedescription(self, descfn):
""" Report that the description file is being written """
self.to_screen(u'[info] Writing video description to: ' + descfn)
def report_writesubtitles(self, sub_filename):
""" Report that the subtitles file is being written """
self.to_screen(u'[info] Writing video subtitles to: ' + sub_filename)
def report_writeinfojson(self, infofn):
""" Report that the metadata file has been written """
self.to_screen(u'[info] Writing video description metadata as JSON to: ' + infofn)
def report_file_already_downloaded(self, file_name):
"""Report file has already been fully downloaded."""
try:
self.to_screen(u'[download] %s has already been downloaded' % file_name)
except (UnicodeEncodeError) as err:
self.to_screen(u'[download] The file has already been downloaded')
def increment_downloads(self):
"""Increment the ordinal that assigns a number to each file."""
self._num_downloads += 1
def prepare_filename(self, info_dict):
"""Generate the output filename."""
try:
template_dict = dict(info_dict)
template_dict['epoch'] = int(time.time())
autonumber_size = self.params.get('autonumber_size')
if autonumber_size is None:
autonumber_size = 5
autonumber_templ = u'%0' + str(autonumber_size) + u'd'
template_dict['autonumber'] = autonumber_templ % self._num_downloads
if template_dict.get('playlist_index') is not None:
template_dict['playlist_index'] = u'%05d' % template_dict['playlist_index']
sanitize = lambda k,v: sanitize_filename(
u'NA' if v is None else compat_str(v),
restricted=self.params.get('restrictfilenames'),
is_id=(k==u'id'))
template_dict = dict((k, sanitize(k, v)) for k,v in template_dict.items())
filename = self.params['outtmpl'] % template_dict
return filename
except KeyError as err:
self.report_error(u'Erroneous output template')
return None
except ValueError as err:
self.report_error(u'Insufficient system charset ' + repr(preferredencoding()))
return None
def _match_entry(self, info_dict):
""" Returns None iff the file should be downloaded """
title = info_dict['title']
matchtitle = self.params.get('matchtitle', False)
if matchtitle:
if not re.search(matchtitle, title, re.IGNORECASE):
return u'"' + title + '" title did not match pattern "' + matchtitle + '"'
rejecttitle = self.params.get('rejecttitle', False)
if rejecttitle:
if re.search(rejecttitle, title, re.IGNORECASE):
return u'"' + title + '" title matched reject pattern "' + rejecttitle + '"'
date = info_dict.get('upload_date', None)
if date is not None:
dateRange = self.params.get('daterange', DateRange())
if date not in dateRange:
return u'%s upload date is not in range %s' % (date_from_str(date).isoformat(), dateRange)
return None
def extract_info(self, url, download=True, ie_key=None, extra_info={}):
'''
Returns a list with a dictionary for each video we find.
If 'download', also downloads the videos.
extra_info is a dict containing the extra values to add to each result
'''
if ie_key:
ie = get_info_extractor(ie_key)()
ie.set_downloader(self)
ies = [ie]
else:
ies = self._ies
for ie in ies:
if not ie.suitable(url):
continue
if not ie.working():
self.report_warning(u'The program functionality for this site has been marked as broken, '
u'and will probably not work.')
try:
ie_result = ie.extract(url)
if ie_result is None: # Finished already (backwards compatibility; listformats and friends should be moved here)
break
if isinstance(ie_result, list):
# Backwards compatibility: old IE result format
for result in ie_result:
result.update(extra_info)
ie_result = {
'_type': 'compat_list',
'entries': ie_result,
}
else:
ie_result.update(extra_info)
if 'extractor' not in ie_result:
ie_result['extractor'] = ie.IE_NAME
return self.process_ie_result(ie_result, download=download)
except ExtractorError as de: # An error we somewhat expected
self.report_error(compat_str(de), de.format_traceback())
break
except Exception as e:
if self.params.get('ignoreerrors', False):
self.report_error(compat_str(e), tb=compat_str(traceback.format_exc()))
break
else:
raise
else:
self.report_error(u'no suitable InfoExtractor: %s' % url)
def process_ie_result(self, ie_result, download=True, extra_info={}):
"""
Take the result of the ie (may be modified) and resolve all unresolved
references (URLs, playlist items).
It will also download the videos if 'download'.
Returns the resolved ie_result.
"""
result_type = ie_result.get('_type', 'video') # If not given we suppose it's a video, support the default old system
if result_type == 'video':
ie_result.update(extra_info)
if 'playlist' not in ie_result:
# It isn't part of a playlist
ie_result['playlist'] = None
ie_result['playlist_index'] = None
if download:
self.process_info(ie_result)
return ie_result
elif result_type == 'url':
# We have to add extra_info to the results because it may be
# contained in a playlist
return self.extract_info(ie_result['url'],
download,
ie_key=ie_result.get('ie_key'),
extra_info=extra_info)
elif result_type == 'playlist':
# We process each entry in the playlist
playlist = ie_result.get('title', None) or ie_result.get('id', None)
self.to_screen(u'[download] Downloading playlist: %s' % playlist)
playlist_results = []
n_all_entries = len(ie_result['entries'])
playliststart = self.params.get('playliststart', 1) - 1
playlistend = self.params.get('playlistend', -1)
if playlistend == -1:
entries = ie_result['entries'][playliststart:]
else:
entries = ie_result['entries'][playliststart:playlistend]
n_entries = len(entries)
self.to_screen(u"[%s] playlist '%s': Collected %d video ids (downloading %d of them)" %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
for i,entry in enumerate(entries,1):
self.to_screen(u'[download] Downloading video #%s of %s' %(i, n_entries))
extra = {
'playlist': playlist,
'playlist_index': i + playliststart,
}
if 'extractor' not in entry:
# Set the extractor here. If the entry is a url, the value is replaced
# later by the extractor that resolves it; if it is already a video we
# must make sure the key is present: see issue #877
entry['extractor'] = ie_result['extractor']
entry_result = self.process_ie_result(entry,
download=download,
extra_info=extra)
playlist_results.append(entry_result)
ie_result['entries'] = playlist_results
return ie_result
elif result_type == 'compat_list':
def _fixup(r):
r.setdefault('extractor', ie_result['extractor'])
return r
ie_result['entries'] = [
self.process_ie_result(_fixup(r), download=download)
for r in ie_result['entries']
]
return ie_result
else:
raise Exception('Invalid result type: %s' % result_type)
def process_info(self, info_dict):
"""Process a single resolved IE result."""
assert info_dict.get('_type', 'video') == 'video'
# We increment the download count here to match the previous behaviour.
self.increment_downloads()
info_dict['fulltitle'] = info_dict['title']
if len(info_dict['title']) > 200:
info_dict['title'] = info_dict['title'][:197] + u'...'
# Keep for backwards compatibility
info_dict['stitle'] = info_dict['title']
if not 'format' in info_dict:
info_dict['format'] = info_dict['ext']
reason = self._match_entry(info_dict)
if reason is not None:
self.to_screen(u'[download] ' + reason)
return
max_downloads = self.params.get('max_downloads')
if max_downloads is not None:
if self._num_downloads > int(max_downloads):
raise MaxDownloadsReached()
filename = self.prepare_filename(info_dict)
# Forced printings
if self.params.get('forcetitle', False):
compat_print(info_dict['title'])
if self.params.get('forceid', False):
compat_print(info_dict['id'])
if self.params.get('forceurl', False):
compat_print(info_dict['url'])
if self.params.get('forcethumbnail', False) and 'thumbnail' in info_dict:
compat_print(info_dict['thumbnail'])
if self.params.get('forcedescription', False) and 'description' in info_dict:
compat_print(info_dict['description'])
if self.params.get('forcefilename', False) and filename is not None:
compat_print(filename)
if self.params.get('forceformat', False):
compat_print(info_dict['format'])
# Do nothing else if in simulate mode
if self.params.get('simulate', False):
return
if filename is None:
return
try:
dn = os.path.dirname(encodeFilename(filename))
if dn != '' and not os.path.exists(dn):
os.makedirs(dn)
except (OSError, IOError) as err:
self.report_error(u'unable to create directory ' + compat_str(err))
return
if self.params.get('writedescription', False):
try:
descfn = filename + u'.description'
self.report_writedescription(descfn)
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
descfile.write(info_dict['description'])
except (OSError, IOError):
self.report_error(u'Cannot write description file ' + descfn)
return
if (self.params.get('writesubtitles', False) or self.params.get('writeautomaticsub')) and 'subtitles' in info_dict and info_dict['subtitles']:
# Subtitle download errors are already reported by the relevant IE, so
# processing continues silently when the IE does not support subtitles
subtitle = info_dict['subtitles'][0]
(sub_error, sub_lang, sub) = subtitle
sub_format = self.params.get('subtitlesformat')
if sub_error:
self.report_warning("Some error while getting the subtitles")
else:
try:
sub_filename = filename.rsplit('.', 1)[0] + u'.' + sub_lang + u'.' + sub_format
self.report_writesubtitles(sub_filename)
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8') as subfile:
subfile.write(sub)
except (OSError, IOError):
self.report_error(u'Cannot write subtitles file ' + sub_filename)
return
if self.params.get('allsubtitles', False) and 'subtitles' in info_dict and info_dict['subtitles']:
subtitles = info_dict['subtitles']
sub_format = self.params.get('subtitlesformat')
for subtitle in subtitles:
(sub_error, sub_lang, sub) = subtitle
if sub_error:
self.report_warning("Some error while getting the subtitles")
else:
try:
sub_filename = filename.rsplit('.', 1)[0] + u'.' + sub_lang + u'.' + sub_format
self.report_writesubtitles(sub_filename)
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8') as subfile:
subfile.write(sub)
except (OSError, IOError):
self.report_error(u'Cannot write subtitles file ' + sub_filename)
return
if self.params.get('writeinfojson', False):
infofn = filename + u'.info.json'
self.report_writeinfojson(infofn)
try:
json_info_dict = dict((k, v) for k,v in info_dict.items() if not k in ['urlhandle'])
write_json_file(json_info_dict, encodeFilename(infofn))
except (OSError, IOError):
self.report_error(u'Cannot write metadata to JSON file ' + infofn)
return
if self.params.get('writethumbnail', False):
if 'thumbnail' in info_dict:
thumb_format = info_dict['thumbnail'].rpartition(u'/')[2].rpartition(u'.')[2]
if not thumb_format:
thumb_format = 'jpg'
thumb_filename = filename.rpartition('.')[0] + u'.' + thumb_format
self.to_screen(u'[%s] %s: Downloading thumbnail ...' %
(info_dict['extractor'], info_dict['id']))
uf = compat_urllib_request.urlopen(info_dict['thumbnail'])
with open(thumb_filename, 'wb') as thumbf:
shutil.copyfileobj(uf, thumbf)
self.to_screen(u'[%s] %s: Writing thumbnail to: %s' %
(info_dict['extractor'], info_dict['id'], thumb_filename))
if not self.params.get('skip_download', False):
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(filename)):
success = True
else:
try:
success = self.fd._do_download(filename, info_dict)
except (OSError, IOError) as err:
raise UnavailableVideoError()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_error(u'unable to download video data: %s' % str(err))
return
except (ContentTooShortError, ) as err:
self.report_error(u'content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
return
if success:
try:
self.post_process(filename, info_dict)
except (PostProcessingError) as err:
self.report_error(u'postprocessing: %s' % str(err))
return
def download(self, url_list):
"""Download a given list of URLs."""
if len(url_list) > 1 and self.fixed_template():
raise SameFileError(self.params['outtmpl'])
for url in url_list:
try:
# It also downloads the videos
videos = self.extract_info(url)
except UnavailableVideoError:
self.report_error(u'unable to download video')
except MaxDownloadsReached:
self.to_screen(u'[info] Maximum number of downloaded files reached.')
raise
return self._download_retcode
def post_process(self, filename, ie_info):
"""Run all the postprocessors on the given file."""
info = dict(ie_info)
info['filepath'] = filename
keep_video = None
for pp in self._pps:
try:
keep_video_wish,new_info = pp.run(info)
if keep_video_wish is not None:
if keep_video_wish:
keep_video = keep_video_wish
elif keep_video is None:
# No clear decision yet, let IE decide
keep_video = keep_video_wish
except PostProcessingError as e:
self.to_stderr(u'ERROR: ' + e.msg)
if keep_video is False and not self.params.get('keepvideo', False):
try:
self.to_screen(u'Deleting original file %s (pass -k to keep)' % filename)
os.remove(encodeFilename(filename))
except (IOError, OSError):
self.report_warning(u'Unable to remove downloaded video file')
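
The playlist branch of process_ie_result turns the 1-based `playliststart` option into a slice offset and treats `playlistend == -1` as "until the end". A minimal standalone sketch of that windowing follows; the helper name `select_playlist_entries` is an assumption made for illustration and is not part of the changeset.

```python
def select_playlist_entries(entries, playliststart=1, playlistend=-1):
    """Return (playlist_index, entry) pairs for the requested window.

    Hypothetical helper mirroring the slicing done in process_ie_result.
    """
    start = playliststart - 1  # the option is 1-based, slicing is 0-based
    if playlistend == -1:
        window = entries[start:]
    else:
        window = entries[start:playlistend]
    # playlist_index keeps the position in the full playlist, not in the window
    return [(i + start, entry) for i, entry in enumerate(window, 1)]
```

With five entries, a start of 2 and an end of 4, the sketch keeps the second to fourth entries and numbers them 2, 3 and 4, which matches the `i + playliststart` arithmetic in the method.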

5255
youtube_dl/__init__.py Executable file → Normal file

File diff suppressed because it is too large

18
youtube_dl/__main__.py Executable file

@@ -0,0 +1,18 @@
#!/usr/bin/env python

# Execute with
# $ python youtube_dl/__main__.py (2.6+)
# $ python -m youtube_dl          (2.7+)

import sys

if __package__ is None and not hasattr(sys, "frozen"):
    # direct call of __main__.py
    import os.path
    path = os.path.realpath(os.path.abspath(__file__))
    sys.path.append(os.path.dirname(os.path.dirname(path)))

import youtube_dl

if __name__ == '__main__':
    youtube_dl.main()


@@ -0,0 +1,103 @@
from .archiveorg import ArchiveOrgIE
from .ard import ARDIE
from .arte import ArteTvIE
from .auengine import AUEngineIE
from .bandcamp import BandcampIE
from .bliptv import BlipTVIE, BlipTVUserIE
from .breakcom import BreakIE
from .brightcove import BrightcoveIE
from .collegehumor import CollegeHumorIE
from .comedycentral import ComedyCentralIE
from .cspan import CSpanIE
from .dailymotion import DailymotionIE
from .depositfiles import DepositFilesIE
from .dotsub import DotsubIE
from .dreisat import DreiSatIE
from .ehow import EHowIE
from .eighttracks import EightTracksIE
from .escapist import EscapistIE
from .facebook import FacebookIE
from .flickr import FlickrIE
from .funnyordie import FunnyOrDieIE
from .gamespot import GameSpotIE
from .gametrailers import GametrailersIE
from .generic import GenericIE
from .googleplus import GooglePlusIE
from .googlesearch import GoogleSearchIE
from .hotnewhiphop import HotNewHipHopIE
from .howcast import HowcastIE
from .hypem import HypemIE
from .ina import InaIE
from .infoq import InfoQIE
from .instagram import InstagramIE
from .jukebox import JukeboxIE
from .justintv import JustinTVIE
from .keek import KeekIE
from .liveleak import LiveLeakIE
from .metacafe import MetacafeIE
from .mixcloud import MixcloudIE
from .mtv import MTVIE
from .myspass import MySpassIE
from .myvideo import MyVideoIE
from .nba import NBAIE
from .photobucket import PhotobucketIE
from .pornotube import PornotubeIE
from .rbmaradio import RBMARadioIE
from .redtube import RedTubeIE
from .ringtv import RingTVIE
from .soundcloud import SoundcloudIE, SoundcloudSetIE
from .spiegel import SpiegelIE
from .stanfordoc import StanfordOpenClassroomIE
from .statigram import StatigramIE
from .steam import SteamIE
from .teamcoco import TeamcocoIE
from .ted import TEDIE
from .tf1 import TF1IE
from .traileraddict import TrailerAddictIE
from .tudou import TudouIE
from .tumblr import TumblrIE
from .tutv import TutvIE
from .ustream import UstreamIE
from .vbox7 import Vbox7IE
from .veoh import VeohIE
from .vevo import VevoIE
from .vimeo import VimeoIE
from .vine import VineIE
from .wat import WatIE
from .wimp import WimpIE
from .worldstarhiphop import WorldStarHipHopIE
from .xhamster import XHamsterIE
from .xnxx import XNXXIE
from .xvideos import XVideosIE
from .yahoo import YahooIE, YahooSearchIE
from .youjizz import YouJizzIE
from .youku import YoukuIE
from .youporn import YouPornIE
from .youtube import (
YoutubeIE,
YoutubePlaylistIE,
YoutubeSearchIE,
YoutubeUserIE,
YoutubeChannelIE,
YoutubeShowIE,
YoutubeSubscriptionsIE,
)
from .zdf import ZDFIE
_ALL_CLASSES = [
    klass
    for name, klass in globals().items()
    if name.endswith('IE') and name != 'GenericIE'
]
_ALL_CLASSES.append(GenericIE)


def gen_extractors():
    """ Return a list of an instance of every supported extractor.
    The order does matter; the first extractor matched is the one handling the URL.
    """
    return [klass() for klass in _ALL_CLASSES]


def get_info_extractor(ie_name):
    """Returns the info extractor class with the given ie_name"""
    return globals()[ie_name+'IE']
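
Because the first matching extractor handles the URL, the catch-all class has to come last in the list. The sketch below illustrates that ordering with two stand-in classes; `BaseIE`, `ExampleVideoIE`, `CatchAllIE` and `pick_extractor` are hypothetical names, and the real `suitable` logic in the package may differ.

```python
import re


class BaseIE(object):
    """Stand-in for an extractor base class (hypothetical)."""
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        return re.match(cls._VALID_URL, url) is not None


class ExampleVideoIE(BaseIE):
    _VALID_URL = r'https?://video\.example\.com/watch/\d+'


class CatchAllIE(BaseIE):
    # Matches anything, so it must be tried last
    _VALID_URL = r'.*'


ALL_CLASSES = [ExampleVideoIE]
ALL_CLASSES.append(CatchAllIE)


def pick_extractor(url):
    """Return the first class in ALL_CLASSES that accepts the URL."""
    for klass in ALL_CLASSES:
        if klass.suitable(url):
            return klass
    return None
```

If `CatchAllIE` were placed first, it would shadow every specific extractor, which is why the generic class is appended after the list is built.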


@@ -0,0 +1,67 @@
import json
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
unified_strdate,
)
class ArchiveOrgIE(InfoExtractor):
IE_NAME = 'archive.org'
IE_DESC = 'archive.org videos'
_VALID_URL = r'(?:https?://)?(?:www\.)?archive\.org/details/(?P<id>[^?/]+)(?:[?].*)?$'
_TEST = {
u"url": u"http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect",
u'file': u'XD300-23_68HighlightsAResearchCntAugHumanIntellect.ogv',
u'md5': u'8af1d4cf447933ed3c7f4871162602db',
u'info_dict': {
u"title": u"1968 Demo - FJCC Conference Presentation Reel #1",
u"description": u"Reel 1 of 3: Also known as the \"Mother of All Demos\", Doug Engelbart's presentation at the Fall Joint Computer Conference in San Francisco, December 9, 1968 titled \"A Research Center for Augmenting Human Intellect.\" For this presentation, Doug and his team astonished the audience by not only relating their research, but demonstrating it live. This was the debut of the mouse, interactive computing, hypermedia, computer supported software engineering, video teleconferencing, etc. See also <a href=\"http://dougengelbart.org/firsts/dougs-1968-demo.html\" rel=\"nofollow\">Doug's 1968 Demo page</a> for more background, highlights, links, and the detailed paper published in this conference proceedings. Filmed on 3 reels: Reel 1 | <a href=\"http://www.archive.org/details/XD300-24_68HighlightsAResearchCntAugHumanIntellect\" rel=\"nofollow\">Reel 2</a> | <a href=\"http://www.archive.org/details/XD300-25_68HighlightsAResearchCntAugHumanIntellect\" rel=\"nofollow\">Reel 3</a>",
u"upload_date": u"19681210",
u"uploader": u"SRI International"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
json_url = url + (u'&' if u'?' in url else u'?') + u'output=json'
json_data = self._download_webpage(json_url, video_id)
data = json.loads(json_data)
title = data['metadata']['title'][0]
description = data['metadata']['description'][0]
uploader = data['metadata']['creator'][0]
upload_date = unified_strdate(data['metadata']['date'][0])
formats = [{
'format': fdata['format'],
'url': 'http://' + data['server'] + data['dir'] + fn,
'file_size': int(fdata['size']),
}
for fn,fdata in data['files'].items()
if 'Video' in fdata['format']]
formats.sort(key=lambda fdata: fdata['file_size'])
info = {
'_type': 'video',
'id': video_id,
'title': title,
'formats': formats,
'description': description,
'uploader': uploader,
'upload_date': upload_date,
}
thumbnail = data.get('misc', {}).get('image')
if thumbnail:
info['thumbnail'] = thumbnail
# TODO: Remove when #980 has been merged
info['url'] = formats[-1]['url']
info['ext'] = determine_ext(formats[-1]['url'])
return info
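
The extractor reaches the JSON metadata by appending `output=json` to the details URL and then keeps the largest file whose format mentions "Video". A small sketch of both steps, assuming a `files` mapping shaped like the one the extractor reads; the function names and the sample data in the usage note are illustrative only.

```python
def with_json_output(url):
    """Append output=json, choosing the separator from the URL itself."""
    separator = '&' if '?' in url else '?'
    return url + separator + 'output=json'


def largest_video(files):
    """Return (name, size) of the biggest entry whose format mentions 'Video'."""
    videos = [(name, int(meta['size']))
              for name, meta in files.items()
              if 'Video' in meta['format']]
    return max(videos, key=lambda item: item[1])
```

A URL without a query string gets `?output=json`, while one that already carries parameters gets `&output=json`. `largest_video` raises `ValueError` on an item without video files, so a caller would likely want to handle that case.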


@@ -0,0 +1,54 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class ARDIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:(?:www\.)?ardmediathek\.de|mediathek\.daserste\.de)/(?:.*/)(?P<video_id>[^/\?]+)(?:\?.*)?'
_TITLE = r'<h1(?: class="boxTopHeadline")?>(?P<title>.*)</h1>'
_MEDIA_STREAM = r'mediaCollection\.addMediaStream\((?P<media_type>\d+), (?P<quality>\d+), "(?P<rtmp_url>[^"]*)", "(?P<video_url>[^"]*)", "[^"]*"\)'
_TEST = {
u'url': u'http://www.ardmediathek.de/das-erste/tagesschau-in-100-sek?documentId=14077640',
u'file': u'14077640.mp4',
u'md5': u'6ca8824255460c787376353f9e20bbd8',
u'info_dict': {
u"title": u"11.04.2013 09:23 Uhr - Tagesschau in 100 Sekunden"
},
u'skip': u'Requires rtmpdump'
}
def _real_extract(self, url):
# determine video id from url
m = re.match(self._VALID_URL, url)
numid = re.search(r'documentId=([0-9]+)', url)
if numid:
video_id = numid.group(1)
else:
video_id = m.group('video_id')
# determine title and media streams from webpage
html = self._download_webpage(url, video_id)
title = re.search(self._TITLE, html).group('title')
streams = [mo.groupdict() for mo in re.finditer(self._MEDIA_STREAM, html)]
if not streams:
assert '"fsk"' in html
raise ExtractorError(u'This video is only available after 8:00 pm')
# choose default media type and highest quality for now
stream = max([s for s in streams if int(s["media_type"]) == 0],
key=lambda s: int(s["quality"]))
# there are two possibilities: RTMP stream or HTTP download
info = {'id': video_id, 'title': title, 'ext': 'mp4'}
if stream['rtmp_url']:
self.to_screen(u'RTMP download detected')
assert stream['video_url'].startswith('mp4:')
info["url"] = stream["rtmp_url"]
info["play_path"] = stream['video_url']
else:
assert stream["video_url"].endswith('.mp4')
info["url"] = stream["video_url"]
return [info]
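
Stream selection here is a two-step filter: collect every `mediaCollection.addMediaStream(...)` call with the `_MEDIA_STREAM` pattern, then take the highest quality among entries of media type 0. The sketch below reuses that pattern on invented sample markup; `SAMPLE_HTML` and `best_stream` are assumptions for illustration, and returning `None` for an empty candidate list is a choice of this sketch rather than the extractor's behaviour.

```python
import re

# Same pattern as ARDIE._MEDIA_STREAM
MEDIA_STREAM = (r'mediaCollection\.addMediaStream\((?P<media_type>\d+), '
                r'(?P<quality>\d+), "(?P<rtmp_url>[^"]*)", '
                r'"(?P<video_url>[^"]*)", "[^"]*"\)')

# Invented markup, only meant to exercise the pattern
SAMPLE_HTML = '''
mediaCollection.addMediaStream(0, 1, "", "http://media.example.com/low.mp4", "default");
mediaCollection.addMediaStream(0, 2, "", "http://media.example.com/high.mp4", "default");
mediaCollection.addMediaStream(1, 3, "", "http://media.example.com/other.mp4", "default");
'''


def best_stream(html):
    """Return the media_type 0 stream with the highest quality, or None."""
    streams = [mo.groupdict() for mo in re.finditer(MEDIA_STREAM, html)]
    candidates = [s for s in streams if int(s['media_type']) == 0]
    if not candidates:
        return None
    return max(candidates, key=lambda s: int(s['quality']))
```

Checking for an empty candidate list before calling `max` avoids a `ValueError` when a page only offers other media types.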


@@ -0,0 +1,146 @@
import re
import json
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
ExtractorError,
find_xpath_attr,
unified_strdate,
)
class ArteTvIE(InfoExtractor):
"""
There are two sources of video in arte.tv: videos.arte.tv and
www.arte.tv/guide, the extraction process is different for each one.
The videos expire in 7 days, so we can't add tests.
"""
_EMISSION_URL = r'(?:http://)?www\.arte.tv/guide/(?P<lang>fr|de)/(?:(?:sendungen|emissions)/)?(?P<id>.*?)/(?P<name>.*?)(\?.*)?'
_VIDEOS_URL = r'(?:http://)?videos.arte.tv/(?P<lang>fr|de)/.*-(?P<id>.*?).html'
_LIVE_URL = r'index-[0-9]+\.html$'
IE_NAME = u'arte.tv'
@classmethod
def suitable(cls, url):
return any(re.match(regex, url) for regex in (cls._EMISSION_URL, cls._VIDEOS_URL))
# TODO implement Live Stream
# from ..utils import compat_urllib_parse
# def extractLiveStream(self, url):
# video_lang = url.split('/')[-4]
# info = self.grep_webpage(
# url,
# r'src="(.*?/videothek_js.*?\.js)',
# 0,
# [
# (1, 'url', u'Invalid URL: %s' % url)
# ]
# )
# http_host = url.split('/')[2]
# next_url = 'http://%s%s' % (http_host, compat_urllib_parse.unquote(info.get('url')))
# info = self.grep_webpage(
# next_url,
# r'(s_artestras_scst_geoFRDE_' + video_lang + '.*?)\'.*?' +
# '(http://.*?\.swf).*?' +
# '(rtmp://.*?)\'',
# re.DOTALL,
# [
# (1, 'path', u'could not extract video path: %s' % url),
# (2, 'player', u'could not extract video player: %s' % url),
# (3, 'url', u'could not extract video url: %s' % url)
# ]
# )
# video_url = u'%s/%s' % (info.get('url'), info.get('path'))
def _real_extract(self, url):
mobj = re.match(self._EMISSION_URL, url)
if mobj is not None:
lang = mobj.group('lang')
# This is not a real id, it can be for example AJT for the news
# http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
video_id = mobj.group('id')
return self._extract_emission(url, video_id, lang)
mobj = re.match(self._VIDEOS_URL, url)
if mobj is not None:
id = mobj.group('id')
lang = mobj.group('lang')
return self._extract_video(url, id, lang)
if re.search(self._LIVE_URL, url) is not None:
raise ExtractorError(u'Arte live streams are not yet supported, sorry')
# self.extractLiveStream(url)
# return
def _extract_emission(self, url, video_id, lang):
"""Extract from www.arte.tv/guide"""
webpage = self._download_webpage(url, video_id)
json_url = self._html_search_regex(r'arte_vp_url="(.*?)"', webpage, 'json url')
json_info = self._download_webpage(json_url, video_id, 'Downloading info json')
self.report_extraction(video_id)
info = json.loads(json_info)
player_info = info['videoJsonPlayer']
info_dict = {'id': player_info['VID'],
'title': player_info['VTI'],
'description': player_info['VDE'],
'upload_date': unified_strdate(player_info['VDA'].split(' ')[0]),
'thumbnail': player_info['programImage'],
'ext': 'flv',
}
formats = player_info['VSR'].values()
def _match_lang(f):
# Return true if that format is in the language of the url
if lang == 'fr':
l = 'F'
elif lang == 'de':
l = 'A'
regexes = [r'VO?%s' % l, r'V%s-ST.' % l]
return any(re.match(r, f['versionCode']) for r in regexes)
# Some formats may not be in the same language as the url
formats = filter(_match_lang, formats)
# We order the formats by quality
formats = sorted(formats, key=lambda f: int(f['height']))
# Pick the best quality
format_info = formats[-1]
if format_info['mediaType'] == u'rtmp':
info_dict['url'] = format_info['streamer']
info_dict['play_path'] = 'mp4:' + format_info['url']
else:
info_dict['url'] = format_info['url']
return info_dict
def _extract_video(self, url, video_id, lang):
"""Extract from videos.arte.tv"""
ref_xml_url = url.replace('/videos/', '/do_delegate/videos/')
ref_xml_url = ref_xml_url.replace('.html', ',view,asPlayerXml.xml')
ref_xml = self._download_webpage(ref_xml_url, video_id, note=u'Downloading metadata')
ref_xml_doc = xml.etree.ElementTree.fromstring(ref_xml)
config_node = find_xpath_attr(ref_xml_doc, './/video', 'lang', lang)
config_xml_url = config_node.attrib['ref']
config_xml = self._download_webpage(config_xml_url, video_id, note=u'Downloading configuration')
video_urls = list(re.finditer(r'<url quality="(?P<quality>.*?)">(?P<url>.*?)</url>', config_xml))
def _key(m):
quality = m.group('quality')
if quality == 'hd':
return 2
else:
return 1
# We pick the best quality
video_urls = sorted(video_urls, key=_key)
video_url = list(video_urls)[-1].group('url')
title = self._html_search_regex(r'<name>(.*?)</name>', config_xml, 'title')
thumbnail = self._html_search_regex(r'<firstThumbnailUrl>(.*?)</firstThumbnailUrl>',
config_xml, 'thumbnail')
return {'id': video_id,
'title': title,
'thumbnail': thumbnail,
'url': video_url,
'ext': 'flv',
}
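
For www.arte.tv/guide, formats are first filtered by language through their `versionCode` and then ordered by height. A minimal sketch of that selection follows; the sample version codes and heights in the usage note are invented, and `matches_lang` and `best_format` are hypothetical names.

```python
import re


def matches_lang(version_code, lang):
    """True if version_code belongs to the URL language ('fr' or 'de')."""
    letter = {'fr': 'F', 'de': 'A'}[lang]
    regexes = [r'VO?%s' % letter, r'V%s-ST.' % letter]
    return any(re.match(r, version_code) for r in regexes)


def best_format(formats, lang):
    """Return the tallest format in the requested language."""
    candidates = [f for f in formats if matches_lang(f['versionCode'], lang)]
    return sorted(candidates, key=lambda f: int(f['height']))[-1]
```

Building `candidates` as a list rather than with `filter` keeps the result indexable on Python 3 as well. An unsupported language raises `KeyError` in this sketch instead of leaving the letter undefined.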


@@ -0,0 +1,46 @@
import os.path
import re
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
compat_urllib_parse_urlparse,
)
class AUEngineIE(InfoExtractor):
_TEST = {
u'url': u'http://auengine.com/embed.php?file=lfvlytY6&w=650&h=370',
u'file': u'lfvlytY6.mp4',
u'md5': u'48972bdbcf1a3a2f5533e62425b41d4f',
u'info_dict': {
u"title": u"[Commie]The Legend of the Legendary Heroes - 03 - Replication Eye (Alpha Stigma)[F9410F5A]"
}
}
_VALID_URL = r'(?:http://)?(?:www\.)?auengine\.com/embed.php\?.*?file=([^&]+).*?'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(?P<title>.+?)</title>',
webpage, u'title')
title = title.strip()
links = re.findall(r'[^A-Za-z0-9]?(?:file|url):\s*["\'](http[^\'"&]*)', webpage)
links = [compat_urllib_parse.unquote(l) for l in links]
for link in links:
root, pathext = os.path.splitext(compat_urllib_parse_urlparse(link).path)
if pathext == '.png':
thumbnail = link
elif pathext == '.mp4':
url = link
ext = pathext
if ext == title[-len(ext):]:
title = title[:-len(ext)]
ext = ext[1:]
return [{
'id': video_id,
'url': url,
'ext': ext,
'title': title,
'thumbnail': thumbnail,
}]
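
The link loop classifies each candidate URL by the extension of its path, so query strings do not interfere. The sketch below shows the same idea while collecting results in a dict, so that a page without a thumbnail or video link yields missing keys instead of unbound names; `classify_links` is a hypothetical helper and uses the standard library directly rather than the compat wrappers.

```python
import os.path

try:
    from urllib.parse import unquote, urlparse  # Python 3
except ImportError:
    from urllib import unquote  # Python 2
    from urlparse import urlparse


def classify_links(links):
    """Map page links to 'thumbnail', 'url' and 'ext' by path extension."""
    result = {}
    for link in links:
        link = unquote(link)
        _, pathext = os.path.splitext(urlparse(link).path)
        if pathext == '.png':
            result['thumbnail'] = link
        elif pathext == '.mp4':
            result['url'] = link
            result['ext'] = pathext[1:]  # drop the leading dot
    return result
```

A caller can then test `'url' in result` and raise a descriptive error when no video link was found.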


@@ -0,0 +1,63 @@
import json
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class BandcampIE(InfoExtractor):
_VALID_URL = r'http://.*?\.bandcamp\.com/track/(?P<title>.*)'
_TEST = {
u'url': u'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song',
u'file': u'1812978515.mp3',
u'md5': u'cdeb30cdae1921719a3cbcab696ef53c',
u'info_dict': {
u"title": u"youtube-dl test song \"'/\\\u00e4\u21ad"
},
u'skip': u'There is a limit of 200 free downloads / month for the test song'
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
title = mobj.group('title')
webpage = self._download_webpage(url, title)
# We get the link to the free download page
m_download = re.search(r'freeDownloadPage: "(.*?)"', webpage)
if m_download is None:
raise ExtractorError(u'No free songs found')
download_link = m_download.group(1)
id = re.search(r'var TralbumData = {(.*?)id: (?P<id>\d*?)$',
webpage, re.MULTILINE|re.DOTALL).group('id')
download_webpage = self._download_webpage(download_link, id,
'Downloading free downloads page')
# We get the dictionary of the track from some JavaScript code
info = re.search(r'items: (.*?),$',
download_webpage, re.MULTILINE).group(1)
info = json.loads(info)[0]
# We pick mp3-320 for now, until format selection can be easily implemented.
mp3_info = info[u'downloads'][u'mp3-320']
# If we try to use this url it says the link has expired
initial_url = mp3_info[u'url']
re_url = r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$'
m_url = re.match(re_url, initial_url)
# We build the url we will use to get the final track url
# This url is built by Bandcamp in the script download_bunde_*.js
request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), id, m_url.group('ts'))
final_url_webpage = self._download_webpage(request_url, id, 'Requesting download url')
# If we could correctly generate the .rand field the url would be
# in the "download_url" key
final_url = re.search(r'"retry_url":"(.*?)"', final_url_webpage).group(1)
track_info = {'id':id,
'title' : info[u'title'],
'ext' : 'mp3',
'url' : final_url,
'thumbnail' : info[u'thumb_url'],
'uploader' : info[u'artist']
}
return [track_info]
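
The download link found on the free download page is not used directly. Its server, `fsig` and `ts` parts are copied into a `statdownload` request. A minimal sketch of that rewrite, reusing the pattern and the template from the extractor, follows. The sample URL in the usage note is invented, `build_stat_url` is a hypothetical name, and the sketch takes the id from the URL whereas the extractor uses the id scraped from the track page.

```python
import re

# Same pattern as re_url in BandcampIE._real_extract
RE_URL = (r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track'
          r'\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$')

STAT_TEMPLATE = ('%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s'
                 '&.rand=665028774616&.vrs=1')


def build_stat_url(initial_url):
    """Rewrite an initial download URL into the statdownload request URL."""
    m = re.match(RE_URL, initial_url)
    if m is None:
        raise ValueError('Unexpected download URL: %s' % initial_url)
    return STAT_TEMPLATE % (m.group('server'), m.group('fsig'),
                            m.group('id'), m.group('ts'))
```

Raising on a failed match gives a clearer error than the `AttributeError` that follows from calling `.group` on `None`.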


@@ -0,0 +1,193 @@
import datetime
import json
import os
import re
import socket
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_parse_qs,
compat_str,
compat_urllib_error,
compat_urllib_parse_urlparse,
compat_urllib_request,
ExtractorError,
unescapeHTML,
)
class BlipTVIE(InfoExtractor):
"""Information extractor for blip.tv"""
_VALID_URL = r'^(?:https?://)?(?:\w+\.)?blip\.tv/((.+/)|(play/)|(api\.swf#))(.+)$'
_URL_EXT = r'^.*\.([a-z0-9]+)$'
IE_NAME = u'blip.tv'
_TEST = {
u'url': u'http://blip.tv/cbr/cbr-exclusive-gotham-city-imposters-bats-vs-jokerz-short-3-5796352',
u'file': u'5779306.m4v',
u'md5': u'80baf1ec5c3d2019037c1c707d676b9f',
u'info_dict': {
u"upload_date": u"20111205",
u"description": u"md5:9bc31f227219cde65e47eeec8d2dc596",
u"uploader": u"Comic Book Resources - CBR TV",
u"title": u"CBR EXCLUSIVE: \"Gotham City Imposters\" Bats VS Jokerz Short 3"
}
}
def report_direct_download(self, title):
"""Report that a direct download was detected."""
self.to_screen(u'%s: Direct download detected' % title)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
# See https://github.com/rg3/youtube-dl/issues/857
api_mobj = re.match(r'http://a\.blip\.tv/api\.swf#(?P<video_id>[\d\w]+)', url)
if api_mobj is not None:
url = 'http://blip.tv/play/g_%s' % api_mobj.group('video_id')
urlp = compat_urllib_parse_urlparse(url)
if urlp.path.startswith('/play/'):
request = compat_urllib_request.Request(url)
response = compat_urllib_request.urlopen(request)
redirecturl = response.geturl()
rurlp = compat_urllib_parse_urlparse(redirecturl)
file_id = compat_parse_qs(rurlp.fragment)['file'][0].rpartition('/')[2]
url = 'http://blip.tv/a/a-' + file_id
return self._real_extract(url)
if '?' in url:
cchar = '&'
else:
cchar = '?'
json_url = url + cchar + 'skin=json&version=2&no_wrap=1'
request = compat_urllib_request.Request(json_url)
request.add_header('User-Agent', 'iTunes/10.6.1')
self.report_extraction(mobj.group(1))
info = None
try:
urlh = compat_urllib_request.urlopen(request)
if urlh.headers.get('Content-Type', '').startswith('video/'): # Direct download
basename = url.split('/')[-1]
title,ext = os.path.splitext(basename)
title = title.decode('UTF-8')
ext = ext.replace('.', '')
self.report_direct_download(title)
info = {
'id': title,
'url': url,
'uploader': None,
'upload_date': None,
'title': title,
'ext': ext,
'urlhandle': urlh
}
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'ERROR: unable to download video info webpage: %s' % compat_str(err))
if info is None: # Regular URL
try:
json_code_bytes = urlh.read()
json_code = json_code_bytes.decode('utf-8')
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to read video info webpage: %s' % compat_str(err))
try:
json_data = json.loads(json_code)
if 'Post' in json_data:
data = json_data['Post']
else:
data = json_data
upload_date = datetime.datetime.strptime(data['datestamp'], '%m-%d-%y %H:%M%p').strftime('%Y%m%d')
if 'additionalMedia' in data:
formats = sorted(data['additionalMedia'], key=lambda f: int(f['media_height']))
best_format = formats[-1]
video_url = best_format['url']
else:
video_url = data['media']['url']
umobj = re.match(self._URL_EXT, video_url)
if umobj is None:
raise ValueError('Can not determine filename extension')
ext = umobj.group(1)
info = {
'id': data['item_id'],
'url': video_url,
'uploader': data['display_name'],
'upload_date': upload_date,
'title': data['title'],
'ext': ext,
'format': data['media']['mimeType'],
'thumbnail': data['thumbnailUrl'],
'description': data['description'],
'player_url': data['embedUrl'],
'user_agent': 'iTunes/10.6.1',
}
except (ValueError,KeyError) as err:
raise ExtractorError(u'Unable to parse video information: %s' % repr(err))
return [info]
class BlipTVUserIE(InfoExtractor):
"""Information Extractor for blip.tv users."""
_VALID_URL = r'(?:(?:(?:https?://)?(?:\w+\.)?blip\.tv/)|bliptvuser:)([^/]+)/*$'
_PAGE_SIZE = 12
IE_NAME = u'blip.tv:user'
def _real_extract(self, url):
# Extract username
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
username = mobj.group(1)
page_base = 'http://m.blip.tv/pr/show_get_full_episode_list?users_id=%s&lite=0&esi=1'
page = self._download_webpage(url, username, u'Downloading user page')
mobj = re.search(r'data-users-id="([^"]+)"', page)
page_base = page_base % mobj.group(1)
# Download video ids using BlipTV Ajax calls. Result size per
# query is limited (currently to 12 videos) so we need to query
# page by page until there are no video ids - it means we got
# all of them.
video_ids = []
pagenum = 1
while True:
url = page_base + "&page=" + str(pagenum)
page = self._download_webpage(url, username,
u'Downloading video ids from page %d' % pagenum)
# Extract video identifiers
ids_in_page = []
for mobj in re.finditer(r'href="/([^"]+)"', page):
if mobj.group(1) not in ids_in_page:
ids_in_page.append(unescapeHTML(mobj.group(1)))
video_ids.extend(ids_in_page)
# A little optimization - if current page is not
# "full", ie. does not contain PAGE_SIZE video ids then
# we can assume that this page is the last one - there
# are no more ids on further pages - no need to query
# again.
if len(ids_in_page) < self._PAGE_SIZE:
break
pagenum += 1
urls = [u'http://blip.tv/%s' % video_id for video_id in video_ids]
url_entries = [self.url_result(vurl, 'BlipTV') for vurl in urls]
return [self.playlist_result(url_entries, playlist_title = username)]
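
The user extractor pages through results until it receives a page shorter than `_PAGE_SIZE`. The loop can be sketched on its own with the network call replaced by a callable; `collect_ids` and `fetch_page` are hypothetical names, and the duplicate filter compares the already-normalised ids.

```python
def collect_ids(fetch_page, page_size):
    """Collect ids page by page, stopping at the first short page.

    fetch_page(pagenum) is assumed to return the list of ids on that page.
    """
    ids = []
    pagenum = 1
    while True:
        ids_in_page = []
        for video_id in fetch_page(pagenum):
            if video_id not in ids_in_page:
                ids_in_page.append(video_id)
        ids.extend(ids_in_page)
        # A page that is not full is assumed to be the last one
        if len(ids_in_page) < page_size:
            break
        pagenum += 1
    return ids
```

When the last page happens to be exactly full, the loop makes one extra request and stops on the empty page that follows.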


@@ -0,0 +1,33 @@
import re
from .common import InfoExtractor
class BreakIE(InfoExtractor):
    _VALID_URL = r'(?:http://)?(?:www\.)?break\.com/video/([^/]+)'
    _TEST = {
        u'url': u'http://www.break.com/video/when-girls-act-like-guys-2468056',
        u'file': u'2468056.mp4',
        u'md5': u'a3513fb1547fba4fb6cfac1bffc6c46b',
        u'info_dict': {
            u"title": u"When Girls Act Like D-Bags"
        }
    }

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group(1).split("-")[-1]
        webpage = self._download_webpage(url, video_id)
        video_url = re.search(r"videoPath: '(.+?)',", webpage).group(1)
        key = re.search(r"icon: '(.+?)',", webpage).group(1)
        final_url = str(video_url) + "?" + str(key)
        thumbnail_url = re.search(r"thumbnailURL: '(.+?)'", webpage).group(1)
        title = re.search(r"sVidTitle: '(.+)',", webpage).group(1)
        ext = video_url.split('.')[-1]
        return [{
            'id': video_id,
            'url': final_url,
            'ext': ext,
            'title': title,
            'thumbnail': thumbnail_url,
        }]


@@ -0,0 +1,85 @@
import re
import json
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
find_xpath_attr,
)
class BrightcoveIE(InfoExtractor):
_VALID_URL = r'https?://.*brightcove\.com/(services|viewer).*\?(?P<query>.*)'
_FEDERATED_URL_TEMPLATE = 'http://c.brightcove.com/services/viewer/htmlFederated?%s'
_PLAYLIST_URL_TEMPLATE = 'http://c.brightcove.com/services/json/experience/runtime/?command=get_programming_for_experience&playerKey=%s'
# There is a test for Brightcove in GenericIE; that way we test both the download
# and the detection of videos, and we don't have to find a URL that is always valid
@classmethod
def _build_brighcove_url(cls, object_str):
"""
Build a Brightcove url from a xml string containing
<object class="BrightcoveExperience">{params}</object>
"""
object_doc = xml.etree.ElementTree.fromstring(object_str)
assert u'BrightcoveExperience' in object_doc.attrib['class']
params = {'flashID': object_doc.attrib['id'],
'playerID': find_xpath_attr(object_doc, './param', 'name', 'playerID').attrib['value'],
}
playerKey = find_xpath_attr(object_doc, './param', 'name', 'playerKey')
# Not all pages define this value
if playerKey is not None:
params['playerKey'] = playerKey.attrib['value']
videoPlayer = find_xpath_attr(object_doc, './param', 'name', '@videoPlayer')
if videoPlayer is not None:
params['@videoPlayer'] = videoPlayer.attrib['value']
data = compat_urllib_parse.urlencode(params)
return cls._FEDERATED_URL_TEMPLATE % data
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
query = mobj.group('query')
m_video_id = re.search(r'videoPlayer=(\d+)', query)
if m_video_id is not None:
video_id = m_video_id.group(1)
return self._get_video_info(video_id, query)
else:
player_key = self._search_regex(r'playerKey=(.+?)(&|$)', query, 'playlist_id')
return self._get_playlist_info(player_key)
def _get_video_info(self, video_id, query):
request_url = self._FEDERATED_URL_TEMPLATE % query
webpage = self._download_webpage(request_url, video_id)
self.report_extraction(video_id)
info = self._search_regex(r'var experienceJSON = ({.*?});', webpage, 'json')
info = json.loads(info)['data']
video_info = info['programmedContent']['videoPlayer']['mediaDTO']
return self._extract_video_info(video_info)
def _get_playlist_info(self, player_key):
playlist_info = self._download_webpage(self._PLAYLIST_URL_TEMPLATE % player_key,
player_key, u'Downloading playlist information')
playlist_info = json.loads(playlist_info)['videoList']
videos = [self._extract_video_info(video_info) for video_info in playlist_info['mediaCollectionDTO']['videoDTOs']]
return self.playlist_result(videos, playlist_id=playlist_info['id'],
playlist_title=playlist_info['mediaCollectionDTO']['displayName'])
def _extract_video_info(self, video_info):
renditions = video_info['renditions']
renditions = sorted(renditions, key=lambda r: r['size'])
best_format = renditions[-1]
return {'id': video_info['id'],
'title': video_info['displayName'],
'url': best_format['defaultURL'],
'ext': 'mp4',
'description': video_info.get('shortDescription'),
'thumbnail': video_info.get('videoStillURL') or video_info.get('thumbnailURL'),
'uploader': video_info.get('publisherName'),
}
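
A standalone sketch of the URL-building step performed by `_build_brighcove_url`, written against the Python 3 standard library only. The helper `find_param` and the sample markup are hypothetical stand-ins (the extractor itself relies on `find_xpath_attr` from `..utils`), so treat this as an illustration under those assumptions rather than the extractor's actual code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlencode

FEDERATED_URL_TEMPLATE = 'http://c.brightcove.com/services/viewer/htmlFederated?%s'


def find_param(node, name):
    """Hypothetical helper: return the <param> child with the given name, or None."""
    for param in node.findall('./param'):
        if param.attrib.get('name') == name:
            return param
    return None


def build_federated_url(object_str):
    """Turn an <object class="BrightcoveExperience"> string into a federated URL."""
    doc = ET.fromstring(object_str)
    params = {
        'flashID': doc.attrib['id'],
        'playerID': find_param(doc, 'playerID').attrib['value'],
    }
    # Optional parameters are copied only when the page defines them.
    for optional in ('playerKey', '@videoPlayer'):
        node = find_param(doc, optional)
        if node is not None:
            params[optional] = node.attrib['value']
    return FEDERATED_URL_TEMPLATE % urlencode(params)


# Made-up sample markup, for illustration only.
SAMPLE = (
    '<object class="BrightcoveExperience" id="myExperience">'
    '<param name="playerID" value="123" />'
    '<param name="@videoPlayer" value="456" />'
    '</object>'
)
url = build_federated_url(SAMPLE)
query = parse_qs(url.split('?', 1)[1])
```

Because `playerKey` is absent from the sample, it is simply left out of the query string, which mirrors the "not all pages define this value" case.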

@@ -0,0 +1,74 @@
import re
import socket
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_parse_urlparse,
compat_urllib_request,
ExtractorError,
)
class CollegeHumorIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'^(?:https?://)?(?:www\.)?collegehumor\.com/video/(?P<videoid>[0-9]+)/(?P<shorttitle>.*)$'
def report_manifest(self, video_id):
"""Report information extraction."""
self.to_screen(u'%s: Downloading XML manifest' % video_id)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group('videoid')
info = {
'id': video_id,
'uploader': None,
'upload_date': None,
}
self.report_extraction(video_id)
xmlUrl = 'http://www.collegehumor.com/moogaloop/video/' + video_id
try:
metaXml = compat_urllib_request.urlopen(xmlUrl).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to download video info XML: %s' % compat_str(err))
mdoc = xml.etree.ElementTree.fromstring(metaXml)
try:
videoNode = mdoc.findall('./video')[0]
info['description'] = videoNode.findall('./description')[0].text
info['title'] = videoNode.findall('./caption')[0].text
info['thumbnail'] = videoNode.findall('./thumbnail')[0].text
manifest_url = videoNode.findall('./file')[0].text
except IndexError:
raise ExtractorError(u'Invalid metadata XML file')
manifest_url += '?hdcore=2.10.3'
self.report_manifest(video_id)
try:
manifestXml = compat_urllib_request.urlopen(manifest_url).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to download video manifest XML: %s' % compat_str(err))
adoc = xml.etree.ElementTree.fromstring(manifestXml)
try:
media_node = adoc.findall('./{http://ns.adobe.com/f4m/1.0}media')[0]
node_id = media_node.attrib['url']
video_id = adoc.findall('./{http://ns.adobe.com/f4m/1.0}id')[0].text
except (IndexError, KeyError):
raise ExtractorError(u'Invalid manifest file')
url_pr = compat_urllib_parse_urlparse(manifest_url)
url = url_pr.scheme + '://' + url_pr.netloc + '/z' + video_id[:-2] + '/' + node_id + 'Seg1-Frag1'
info['url'] = url
info['ext'] = 'f4f'
return [info]
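
The manifest-parsing step can be sketched in isolation as follows. This assumes Python 3 and a made-up manifest string; the element names follow the F4M namespace used in the extractor, while `parse_manifest` and `SAMPLE_MANIFEST` are hypothetical names introduced for illustration. It uses `find()` and an explicit `None` check instead of indexing into `findall()`.

```python
import xml.etree.ElementTree as ET

F4M_NS = '{http://ns.adobe.com/f4m/1.0}'

# Made-up manifest, for illustration only.
SAMPLE_MANIFEST = (
    '<manifest xmlns="http://ns.adobe.com/f4m/1.0">'
    '<id>/videos/sample_1</id>'
    '<media url="sample_" bitrate="1000" />'
    '</manifest>'
)


def parse_manifest(manifest_xml):
    """Return (manifest id, media url) from an F4M manifest string."""
    doc = ET.fromstring(manifest_xml)
    # Children inherit the default namespace, so it must be part of the path.
    media = doc.find('./%smedia' % F4M_NS)
    id_node = doc.find('./%sid' % F4M_NS)
    if media is None or id_node is None or 'url' not in media.attrib:
        raise ValueError('Invalid manifest file')
    return id_node.text, media.attrib['url']


manifest_id, media_url = parse_manifest(SAMPLE_MANIFEST)
```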

@@ -0,0 +1,189 @@
import re
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_str,
compat_urllib_parse,
ExtractorError,
unified_strdate,
)
class ComedyCentralIE(InfoExtractor):
IE_DESC = u'The Daily Show / Colbert Report'
# urls can be abbreviations like :thedailyshow or :colbert
# urls for episodes like:
# or urls for clips like: http://www.thedailyshow.com/watch/mon-december-10-2012/any-given-gun-day
# or: http://www.colbertnation.com/the-colbert-report-videos/421667/november-29-2012/moon-shattering-news
# or: http://www.colbertnation.com/the-colbert-report-collections/422008/festival-of-lights/79524
_VALID_URL = r"""^(:(?P<shortname>tds|thedailyshow|cr|colbert|colbertnation|colbertreport)
|(https?://)?(www\.)?
(?P<showname>thedailyshow|colbertnation)\.com/
(full-episodes/(?P<episode>.*)|
(?P<clip>
(the-colbert-report-(videos|collections)/(?P<clipID>[0-9]+)/[^/]*/(?P<cntitle>.*?))
|(watch/(?P<date>[^/]*)/(?P<tdstitle>.*)))))
$"""
_TEST = {
u'url': u'http://www.thedailyshow.com/watch/thu-december-13-2012/kristen-stewart',
u'file': u'422212.mp4',
u'md5': u'4e2f5cb088a83cd8cdb7756132f9739d',
u'info_dict': {
u"upload_date": u"20121214",
u"description": u"Kristen Stewart",
u"uploader": u"thedailyshow",
u"title": u"thedailyshow-kristen-stewart part 1"
}
}
_available_formats = ['3500', '2200', '1700', '1200', '750', '400']
_video_extensions = {
'3500': 'mp4',
'2200': 'mp4',
'1700': 'mp4',
'1200': 'mp4',
'750': 'mp4',
'400': 'mp4',
}
_video_dimensions = {
'3500': '1280x720',
'2200': '960x540',
'1700': '768x432',
'1200': '640x360',
'750': '512x288',
'400': '384x216',
}
@classmethod
def suitable(cls, url):
"""Receives a URL and returns True if suitable for this IE."""
return re.match(cls._VALID_URL, url, re.VERBOSE) is not None
def _print_formats(self, formats):
print('Available formats:')
for x in formats:
print('%s\t:\t%s\t[%s]' %(x, self._video_extensions.get(x, 'mp4'), self._video_dimensions.get(x, '???')))
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url, re.VERBOSE)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
if mobj.group('shortname'):
if mobj.group('shortname') in ('tds', 'thedailyshow'):
url = u'http://www.thedailyshow.com/full-episodes/'
else:
url = u'http://www.colbertnation.com/full-episodes/'
mobj = re.match(self._VALID_URL, url, re.VERBOSE)
assert mobj is not None
if mobj.group('clip'):
if mobj.group('showname') == 'thedailyshow':
epTitle = mobj.group('tdstitle')
else:
epTitle = mobj.group('cntitle')
dlNewest = False
else:
dlNewest = not mobj.group('episode')
if dlNewest:
epTitle = mobj.group('showname')
else:
epTitle = mobj.group('episode')
self.report_extraction(epTitle)
webpage,htmlHandle = self._download_webpage_handle(url, epTitle)
if dlNewest:
url = htmlHandle.geturl()
mobj = re.match(self._VALID_URL, url, re.VERBOSE)
if mobj is None:
raise ExtractorError(u'Invalid redirected URL: ' + url)
if mobj.group('episode') == '':
raise ExtractorError(u'Redirected URL is still not specific: ' + url)
epTitle = mobj.group('episode')
mMovieParams = re.findall('(?:<param name="movie" value="|var url = ")(http://media.mtvnservices.com/([^"]*(?:episode|video).*?:.*?))"', webpage)
if len(mMovieParams) == 0:
# The Colbert Report embeds the information in a data-mgid attribute
# without a URL prefix, so extract the alternate reference
# and then add the URL prefix manually.
altMovieParams = re.findall('data-mgid="([^"]*(?:episode|video).*?:.*?)"', webpage)
if len(altMovieParams) == 0:
raise ExtractorError(u'unable to find Flash URL in webpage ' + url)
else:
mMovieParams = [("http://media.mtvnservices.com/" + altMovieParams[0], altMovieParams[0])]
uri = mMovieParams[0][1]
indexUrl = 'http://shadow.comedycentral.com/feeds/video_player/mrss/?' + compat_urllib_parse.urlencode({'uri': uri})
indexXml = self._download_webpage(indexUrl, epTitle,
u'Downloading show index',
u'unable to download episode index')
results = []
idoc = xml.etree.ElementTree.fromstring(indexXml)
itemEls = idoc.findall('.//item')
for partNum,itemEl in enumerate(itemEls):
mediaId = itemEl.findall('./guid')[0].text
shortMediaId = mediaId.split(':')[-1]
showId = mediaId.split(':')[-2].replace('.com', '')
officialTitle = itemEl.findall('./title')[0].text
officialDate = unified_strdate(itemEl.findall('./pubDate')[0].text)
configUrl = ('http://www.comedycentral.com/global/feeds/entertainment/media/mediaGenEntertainment.jhtml?' +
compat_urllib_parse.urlencode({'uri': mediaId}))
configXml = self._download_webpage(configUrl, epTitle,
u'Downloading configuration for %s' % shortMediaId)
cdoc = xml.etree.ElementTree.fromstring(configXml)
turls = []
for rendition in cdoc.findall('.//rendition'):
finfo = (rendition.attrib['bitrate'], rendition.findall('./src')[0].text)
turls.append(finfo)
if len(turls) == 0:
self._downloader.report_error(u'unable to download ' + mediaId + ': No videos found')
continue
if self._downloader.params.get('listformats', None):
self._print_formats([i[0] for i in turls])
return
# For now, just pick the highest bitrate
format,rtmp_video_url = turls[-1]
# Get the format arg from the arg stream
req_format = self._downloader.params.get('format', None)
# Select format if we can find one
for f,v in turls:
if f == req_format:
format, rtmp_video_url = f, v
break
m = re.match(r'^rtmpe?://.*?/(?P<finalid>gsp.comedystor/.*)$', rtmp_video_url)
if not m:
raise ExtractorError(u'Cannot transform RTMP url')
base = 'http://mtvnmobile.vo.llnwd.net/kip0/_pxn=1+_pxI0=Ripod-h264+_pxL0=undefined+_pxM0=+_pxK=18639+_pxE=mp4/44620/mtvnorigin/'
video_url = base + m.group('finalid')
effTitle = showId + u'-' + epTitle + u' part ' + compat_str(partNum+1)
info = {
'id': shortMediaId,
'url': video_url,
'uploader': showId,
'upload_date': officialDate,
'title': effTitle,
'ext': 'mp4',
'format': format,
'thumbnail': None,
'description': compat_str(officialTitle),
}
results.append(info)
return results
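
The rendition choice in the loop above (default to the last entry, override when the requested format is present) can be sketched on its own. The function name `select_rendition` and the sample URLs are hypothetical; the sketch also assumes, as the extractor does, that the feed lists renditions with the highest bitrate last.

```python
def select_rendition(renditions, requested=None):
    """Pick a (bitrate, url) tuple from a list ordered lowest bitrate first."""
    if not renditions:
        raise ValueError('No videos found')
    # Default: the last entry, assumed to carry the highest bitrate.
    chosen = renditions[-1]
    # Override the default only if the requested bitrate is actually offered.
    for bitrate, url in renditions:
        if bitrate == requested:
            chosen = (bitrate, url)
            break
    return chosen


# Made-up renditions, for illustration only.
SAMPLE = [
    ('400', 'rtmpe://example.invalid/clip_400.mp4'),
    ('1200', 'rtmpe://example.invalid/clip_1200.mp4'),
    ('3500', 'rtmpe://example.invalid/clip_3500.mp4'),
]
```

An unknown requested format falls back silently to the default, which matches the behavior of the loop it is modeled on.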

@@ -0,0 +1,301 @@
import base64
import os
import re
import socket
import sys
import netrc
from ..utils import (
compat_http_client,
compat_urllib_error,
compat_urllib_request,
compat_str,
clean_html,
compiled_regex_type,
ExtractorError,
)
class InfoExtractor(object):
"""Information Extractor class.
Information extractors are the classes that, given a URL, extract
information about the video (or videos) the URL refers to. This
information includes the real video URL, the video title, author and
others. The information is stored in a dictionary which is then
passed to the FileDownloader. The FileDownloader processes this
information possibly downloading the video to the file system, among
other possible outcomes.
The dictionaries must include the following fields:
id: Video identifier.
url: Final video URL.
title: Video title, unescaped.
ext: Video filename extension.
The following fields are optional:
format: The video format, defaults to ext (used for --get-format)
thumbnails: A list of dictionaries (with the entries "resolution" and
"url") for the varying thumbnails
thumbnail: Full URL to a video thumbnail image.
description: One-line video description.
uploader: Full name of the video uploader.
upload_date: Video upload date (YYYYMMDD).
uploader_id: Nickname or id of the video uploader.
location: Physical location of the video.
player_url: SWF Player URL (used for rtmpdump).
subtitles: The subtitle file contents.
view_count: How many users have watched the video on the platform.
urlhandle: [internal] The urlHandle to be used to download the file,
as returned by urllib.request.urlopen
The fields should all be Unicode strings.
Subclasses of this one should re-define the _real_initialize() and
_real_extract() methods and define a _VALID_URL regexp.
Probably, they should also be added to the list of extractors.
_real_extract() must return a *list* of information dictionaries as
described above.
Finally, the _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests.
"""
_ready = False
_downloader = None
_WORKING = True
def __init__(self, downloader=None):
"""Constructor. Receives an optional downloader."""
self._ready = False
self.set_downloader(downloader)
@classmethod
def suitable(cls, url):
"""Receives a URL and returns True if suitable for this IE."""
return re.match(cls._VALID_URL, url) is not None
@classmethod
def working(cls):
"""Getter method for _WORKING."""
return cls._WORKING
def initialize(self):
"""Initializes an instance (authentication, etc)."""
if not self._ready:
self._real_initialize()
self._ready = True
def extract(self, url):
"""Extracts URL information and returns it in list of dicts."""
self.initialize()
return self._real_extract(url)
def set_downloader(self, downloader):
"""Sets the downloader for this IE."""
self._downloader = downloader
def _real_initialize(self):
"""Real initialization process. Redefine in subclasses."""
pass
def _real_extract(self, url):
"""Real extraction process. Redefine in subclasses."""
pass
@property
def IE_NAME(self):
return type(self).__name__[:-2]
def _request_webpage(self, url_or_request, video_id, note=None, errnote=None):
""" Returns the response handle """
if note is None:
self.report_download_webpage(video_id)
elif note is not False:
self.to_screen(u'%s: %s' % (video_id, note))
try:
return compat_urllib_request.urlopen(url_or_request)
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
if errnote is None:
errnote = u'Unable to download webpage'
raise ExtractorError(u'%s: %s' % (errnote, compat_str(err)), sys.exc_info()[2])
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None):
""" Returns a tuple (page content as string, URL handle) """
urlh = self._request_webpage(url_or_request, video_id, note, errnote)
content_type = urlh.headers.get('Content-Type', '')
m = re.match(r'[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+\s*;\s*charset=(.+)', content_type)
if m:
encoding = m.group(1)
else:
encoding = 'utf-8'
webpage_bytes = urlh.read()
if self._downloader.params.get('dump_intermediate_pages', False):
try:
url = url_or_request.get_full_url()
except AttributeError:
url = url_or_request
self.to_screen(u'Dumping request to ' + url)
dump = base64.b64encode(webpage_bytes).decode('ascii')
self._downloader.to_screen(dump)
content = webpage_bytes.decode(encoding, 'replace')
return (content, urlh)
def _download_webpage(self, url_or_request, video_id, note=None, errnote=None):
""" Returns the data of the page as a string """
return self._download_webpage_handle(url_or_request, video_id, note, errnote)[0]
def to_screen(self, msg):
"""Print msg to screen, prefixing it with '[ie_name]'"""
self._downloader.to_screen(u'[%s] %s' % (self.IE_NAME, msg))
def report_extraction(self, id_or_name):
"""Report information extraction."""
self.to_screen(u'%s: Extracting information' % id_or_name)
def report_download_webpage(self, video_id):
"""Report webpage download."""
self.to_screen(u'%s: Downloading webpage' % video_id)
def report_age_confirmation(self):
"""Report attempt to confirm age."""
self.to_screen(u'Confirming age')
def report_login(self):
"""Report attempt to log in."""
self.to_screen(u'Logging in')
#Methods for following #608
def url_result(self, url, ie=None):
"""Returns a url that points to a page that should be processed"""
#TODO: ie should be the class used for getting the info
video_info = {'_type': 'url',
'url': url,
'ie_key': ie}
return video_info
def playlist_result(self, entries, playlist_id=None, playlist_title=None):
"""Returns a playlist"""
video_info = {'_type': 'playlist',
'entries': entries}
if playlist_id:
video_info['id'] = playlist_id
if playlist_title:
video_info['title'] = playlist_title
return video_info
def _search_regex(self, pattern, string, name, default=None, fatal=True, flags=0):
"""
Perform a regex search on the given string, using a single or a list of
patterns returning the first matching group.
In case of failure, return the default value if one is given; otherwise
raise an ExtractorError (fatal) or report a warning, naming the field.
"""
if isinstance(pattern, (str, compat_str, compiled_regex_type)):
mobj = re.search(pattern, string, flags)
else:
for p in pattern:
mobj = re.search(p, string, flags)
if mobj: break
if sys.stderr.isatty() and os.name != 'nt':
_name = u'\033[0;34m%s\033[0m' % name
else:
_name = name
if mobj:
# return the first matching group
return next(g for g in mobj.groups() if g is not None)
elif default is not None:
return default
elif fatal:
raise ExtractorError(u'Unable to extract %s' % _name)
else:
self._downloader.report_warning(u'unable to extract %s; '
u'please report this issue on http://yt-dl.org/bug' % _name)
return None
def _html_search_regex(self, pattern, string, name, default=None, fatal=True, flags=0):
"""
Like _search_regex, but strips HTML tags and unescapes entities.
"""
res = self._search_regex(pattern, string, name, default, fatal, flags)
if res:
return clean_html(res).strip()
else:
return res
def _get_login_info(self):
"""
Get the login info as (username, password)
It will look in the netrc file using the _NETRC_MACHINE value
If there's no info available, return (None, None)
"""
if self._downloader is None:
return (None, None)
username = None
password = None
downloader_params = self._downloader.params
# Attempt to use provided username and password or .netrc data
if downloader_params.get('username', None) is not None:
username = downloader_params['username']
password = downloader_params['password']
elif downloader_params.get('usenetrc', False):
try:
info = netrc.netrc().authenticators(self._NETRC_MACHINE)
if info is not None:
username = info[0]
password = info[2]
else:
raise netrc.NetrcParseError('No authenticators for %s' % self._NETRC_MACHINE)
except (IOError, netrc.NetrcParseError) as err:
self._downloader.report_warning(u'parsing .netrc: %s' % compat_str(err))
return (username, password)
class SearchInfoExtractor(InfoExtractor):
"""
Base class for paged search query extractors.
They accept URLs in the format _SEARCH_KEY(|all|[1-9][0-9]*):{query}
Instances should define _SEARCH_KEY and _MAX_RESULTS.
"""
@classmethod
def _make_valid_url(cls):
return r'%s(?P<prefix>|[1-9][0-9]*|all):(?P<query>[\s\S]+)' % cls._SEARCH_KEY
@classmethod
def suitable(cls, url):
return re.match(cls._make_valid_url(), url) is not None
def _real_extract(self, query):
mobj = re.match(self._make_valid_url(), query)
if mobj is None:
raise ExtractorError(u'Invalid search query "%s"' % query)
prefix = mobj.group('prefix')
query = mobj.group('query')
if prefix == '':
return self._get_n_results(query, 1)
elif prefix == 'all':
return self._get_n_results(query, self._MAX_RESULTS)
else:
n = int(prefix)
if n <= 0:
raise ExtractorError(u'invalid download number %s for query "%s"' % (n, query))
elif n > self._MAX_RESULTS:
self._downloader.report_warning(u'%s returns max %i results (you requested %i)' % (self._SEARCH_KEY, self._MAX_RESULTS, n))
n = self._MAX_RESULTS
return self._get_n_results(query, n)
def _get_n_results(self, query, n):
"""Get a specified number of results for a query"""
raise NotImplementedError("This method must be implemented by subclasses")
@property
def SEARCH_KEY(self):
return self._SEARCH_KEY
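
The prefix handling in `SearchInfoExtractor._real_extract` can be tried out with a small standalone function. `SEARCH_KEY` and `MAX_RESULTS` below are hypothetical values chosen for the sketch (real subclasses define `_SEARCH_KEY` and `_MAX_RESULTS` themselves), and the warning on oversized requests is reduced to a silent clamp.

```python
import re

SEARCH_KEY = 'examplesearch'  # hypothetical search key
MAX_RESULTS = 50              # hypothetical upper bound


def parse_search_query(query):
    """Return (number of results, search terms) for a SEARCH_KEY-style query."""
    pattern = r'%s(?P<prefix>|[1-9][0-9]*|all):(?P<query>[\s\S]+)' % SEARCH_KEY
    mobj = re.match(pattern, query)
    if mobj is None:
        raise ValueError('Invalid search query "%s"' % query)
    prefix = mobj.group('prefix')
    if prefix == '':
        n = 1                      # no prefix: a single result
    elif prefix == 'all':
        n = MAX_RESULTS            # "all": as many as the backend allows
    else:
        n = min(int(prefix), MAX_RESULTS)  # explicit count, clamped
    return n, mobj.group('query')
```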

@@ -0,0 +1,53 @@
import re
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
)
class CSpanIE(InfoExtractor):
_VALID_URL = r'http://www\.c-spanvideo\.org/program/(.*)'
_TEST = {
u'url': u'http://www.c-spanvideo.org/program/HolderonV',
u'file': u'315139.flv',
u'md5': u'74a623266956f69e4df0068ab6c80fe4',
u'info_dict': {
u"title": u"Attorney General Eric Holder on Voting Rights Act Decision"
},
u'skip': u'Requires rtmpdump'
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
prog_name = mobj.group(1)
webpage = self._download_webpage(url, prog_name)
video_id = self._search_regex(r'programid=(.*?)&', webpage, 'video id')
data = compat_urllib_parse.urlencode({'programid': video_id,
'dynamic':'1'})
info_url = 'http://www.c-spanvideo.org/common/services/flashXml.php?' + data
video_info = self._download_webpage(info_url, video_id, u'Downloading video info')
self.report_extraction(video_id)
title = self._html_search_regex(r'<string name="title">(.*?)</string>',
video_info, 'title')
description = self._html_search_regex(r'<meta (?:property="og:|name=")description" content="(.*?)"',
webpage, 'description',
flags=re.MULTILINE|re.DOTALL)
thumbnail = self._html_search_regex(r'<meta property="og:image" content="(.*?)"',
webpage, 'thumbnail')
url = self._search_regex(r'<string name="URL">(.*?)</string>',
video_info, 'video url')
url = url.replace('$(protocol)', 'rtmp').replace('$(port)', '443')
path = self._search_regex(r'<string name="path">(.*?)</string>',
video_info, 'rtmp play path')
return {'id': video_id,
'title': title,
'ext': 'flv',
'url': url,
'play_path': path,
'description': description,
'thumbnail': thumbnail,
}

@@ -0,0 +1,82 @@
import re
import json
from .common import InfoExtractor
from ..utils import (
compat_urllib_request,
ExtractorError,
)
class DailymotionIE(InfoExtractor):
"""Information Extractor for Dailymotion"""
_VALID_URL = r'(?i)(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/video/([^/]+)'
IE_NAME = u'dailymotion'
_TEST = {
u'url': u'http://www.dailymotion.com/video/x33vw9_tutoriel-de-youtubeur-dl-des-video_tech',
u'file': u'x33vw9.mp4',
u'md5': u'392c4b85a60a90dc4792da41ce3144eb',
u'info_dict': {
u"uploader": u"Alex and Van .",
u"title": u"Tutoriel de Youtubeur\"DL DES VIDEO DE YOUTUBE\""
}
}
def _real_extract(self, url):
# Extract id and simplified title from URL
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1).split('_')[0].split('?')[0]
video_extension = 'mp4'
# Retrieve video webpage to extract further information
request = compat_urllib_request.Request(url)
request.add_header('Cookie', 'family_filter=off')
webpage = self._download_webpage(request, video_id)
# Extract URL, uploader and title from webpage
self.report_extraction(video_id)
video_title = self._html_search_regex(r'<meta property="og:title" content="(.*?)" />',
webpage, 'title')
video_uploader = self._search_regex([r'(?im)<span class="owner[^\"]+?">[^<]+?<a [^>]+?>([^<]+?)</a>',
# Looking for official user
r'<(?:span|a) .*?rel="author".*?>([^<]+?)</'],
webpage, 'video uploader')
video_upload_date = None
mobj = re.search(r'<div class="[^"]*uploaded_cont[^"]*" title="[^"]*">([0-9]{2})-([0-9]{2})-([0-9]{4})</div>', webpage)
if mobj is not None:
video_upload_date = mobj.group(3) + mobj.group(2) + mobj.group(1)
embed_url = 'http://www.dailymotion.com/embed/video/%s' % video_id
embed_page = self._download_webpage(embed_url, video_id,
u'Downloading embed page')
info = self._search_regex(r'var info = ({.*?}),', embed_page, 'video info')
info = json.loads(info)
# TODO: support choosing qualities
for key in ['stream_h264_hd1080_url','stream_h264_hd_url',
'stream_h264_hq_url','stream_h264_url',
'stream_h264_ld_url']:
if info.get(key):  # key is present and its value is non-empty
max_quality = key
self.to_screen(u'Using %s' % key)
break
else:
raise ExtractorError(u'Unable to extract video URL')
video_url = info[max_quality]
return [{
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'upload_date': video_upload_date,
'title': video_title,
'ext': video_extension,
'thumbnail': info['thumbnail_url']
}]
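
The quality fallback above relies on Python's `for ... else` construct: the `else` branch runs only when the loop finishes without hitting `break`. A minimal sketch, using the same stream keys but a made-up `info` dictionary and a hypothetical function name `pick_stream`:

```python
QUALITY_KEYS = [
    'stream_h264_hd1080_url',
    'stream_h264_hd_url',
    'stream_h264_hq_url',
    'stream_h264_url',
    'stream_h264_ld_url',
]


def pick_stream(info):
    """Return (key, url) for the best quality whose URL is present and non-empty."""
    for key in QUALITY_KEYS:
        if info.get(key):
            break
    else:
        # Reached only if no key matched, i.e. the loop never hit "break".
        raise ValueError('Unable to extract video URL')
    return key, info[key]


# Made-up metadata, for illustration only.
SAMPLE_INFO = {
    'stream_h264_hd1080_url': None,
    'stream_h264_hq_url': 'http://example.invalid/hq.mp4',
    'stream_h264_ld_url': 'http://example.invalid/ld.mp4',
}
```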

@@ -0,0 +1,60 @@
import re
import os
import socket
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_parse,
compat_urllib_request,
ExtractorError,
)
class DepositFilesIE(InfoExtractor):
"""Information extractor for depositfiles.com"""
_VALID_URL = r'(?:http://)?(?:\w+\.)?depositfiles\.com/(?:../(?#locale))?files/(.+)'
def _real_extract(self, url):
file_id = url.split('/')[-1]
# Rebuild url in english locale
url = 'http://depositfiles.com/en/files/' + file_id
# Retrieve file webpage with 'Free download' button pressed
free_download_indication = { 'gateway_result' : '1' }
request = compat_urllib_request.Request(url, compat_urllib_parse.urlencode(free_download_indication))
try:
self.report_download_webpage(file_id)
webpage = compat_urllib_request.urlopen(request).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to retrieve file webpage: %s' % compat_str(err))
# Search for the real file URL
mobj = re.search(r'<form action="(http://fileshare.+?)"', webpage)
if (mobj is None) or (mobj.group(1) is None):
# Try to figure out reason of the error.
mobj = re.search(r'<strong>(Attention.*?)</strong>', webpage, re.DOTALL)
if (mobj is not None) and (mobj.group(1) is not None):
restriction_message = re.sub(r'\s+', ' ', mobj.group(1)).strip()
raise ExtractorError(u'%s' % restriction_message)
else:
raise ExtractorError(u'Unable to extract download URL from: %s' % url)
file_url = mobj.group(1)
file_extension = os.path.splitext(file_url)[1][1:]
# Search for file title
file_title = self._search_regex(r'<b title="(.*?)">', webpage, u'title')
return [{
'id': file_id.decode('utf-8'),
'url': file_url.decode('utf-8'),
'uploader': None,
'upload_date': None,
'title': file_title,
'ext': file_extension.decode('utf-8'),
}]

@@ -0,0 +1,41 @@
import re
import json
import time
from .common import InfoExtractor
class DotsubIE(InfoExtractor):
_VALID_URL = r'(?:http://)?(?:www\.)?dotsub\.com/view/([^/]+)'
_TEST = {
u'url': u'http://dotsub.com/view/aed3b8b2-1889-4df5-ae63-ad85f5572f27',
u'file': u'aed3b8b2-1889-4df5-ae63-ad85f5572f27.flv',
u'md5': u'0914d4d69605090f623b7ac329fea66e',
u'info_dict': {
u"title": u"Pyramids of Waste (2010), AKA The Lightbulb Conspiracy - Planned obsolescence documentary",
u"uploader": u"4v4l0n42",
u'description': u'Pyramids of Waste (2010) also known as "The lightbulb conspiracy" is a documentary about how our economic system based on consumerism and planned obsolescence is breaking our planet down.\r\n\r\nSolutions to this can be found at:\r\nhttp://robotswillstealyourjob.com\r\nhttp://www.federicopistono.org\r\n\r\nhttp://opensourceecology.org\r\nhttp://thezeitgeistmovement.com',
u'thumbnail': u'http://dotsub.com/media/aed3b8b2-1889-4df5-ae63-ad85f5572f27/p',
u'upload_date': u'20101213',
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
info_url = "https://dotsub.com/api/media/%s/metadata" %(video_id)
webpage = self._download_webpage(info_url, video_id)
info = json.loads(webpage)
date = time.gmtime(info['dateCreated']/1000) # The timestamp is in milliseconds
return [{
'id': video_id,
'url': info['mediaURI'],
'ext': 'flv',
'title': info['title'],
'thumbnail': info['screenshotURI'],
'description': info['description'],
'uploader': info['user'],
'view_count': info['numberOfViews'],
'upload_date': u'%04i%02i%02i' % (date.tm_year, date.tm_mon, date.tm_mday),
}]
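
The date conversion used for `upload_date` can be checked in isolation. This sketch assumes Python 3 and a hypothetical function name `upload_date_from_ms`; the input is a Unix timestamp in milliseconds, as in the `dateCreated` field.

```python
import time


def upload_date_from_ms(timestamp_ms):
    """Convert a millisecond Unix timestamp to a YYYYMMDD string (UTC)."""
    date = time.gmtime(timestamp_ms / 1000)
    return '%04i%02i%02i' % (date.tm_year, date.tm_mon, date.tm_mday)
```

Using `time.gmtime` rather than `time.localtime` keeps the result independent of the machine's time zone.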

@@ -0,0 +1,85 @@
# coding: utf-8
import re
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
determine_ext,
unified_strdate,
)
class DreiSatIE(InfoExtractor):
IE_NAME = '3sat'
_VALID_URL = r'(?:http://)?(?:www\.)?3sat\.de/mediathek/index\.php\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
_TEST = {
u"url": u"http://www.3sat.de/mediathek/index.php?obj=36983",
u'file': u'36983.webm',
u'md5': u'57c97d0469d71cf874f6815aa2b7c944',
u'info_dict': {
u"title": u"Kaffeeland Schweiz",
u"description": u"Über 80 Kaffeeröstereien liefern in der Schweiz das Getränk, in das das Land so vernarrt ist: Mehr als 1000 Tassen trinkt ein Schweizer pro Jahr. SCHWEIZWEIT nimmt die Kaffeekultur unter die...",
u"uploader": u"3sat",
u"upload_date": u"20130622"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
details_url = 'http://www.3sat.de/mediathek/xmlservice/web/beitragsDetails?ak=web&id=%s' % video_id
details_xml = self._download_webpage(details_url, video_id, note=u'Downloading video details')
details_doc = xml.etree.ElementTree.fromstring(details_xml.encode('utf-8'))
thumbnail_els = details_doc.findall('.//teaserimage')
thumbnails = [{
'width': te.attrib['key'].partition('x')[0],
'height': te.attrib['key'].partition('x')[2],
'url': te.text,
} for te in thumbnail_els]
information_el = details_doc.find('.//information')
video_title = information_el.find('./title').text
video_description = information_el.find('./detail').text
details_el = details_doc.find('.//details')
video_uploader = details_el.find('./channel').text
upload_date = unified_strdate(details_el.find('./airtime').text)
format_els = details_doc.findall('.//formitaet')
formats = [{
'format_id': fe.attrib['basetype'],
'width': int(fe.find('./width').text),
'height': int(fe.find('./height').text),
'url': fe.find('./url').text,
'filesize': int(fe.find('./filesize').text),
'video_bitrate': int(fe.find('./videoBitrate').text),
'3sat_qualityname': fe.find('./quality').text,
} for fe in format_els
if not fe.find('./url').text.startswith('http://www.metafilegenerator.de/')]
def _sortkey(format):
qidx = ['low', 'med', 'high', 'veryhigh'].index(format['3sat_qualityname'])
prefer_http = 0 if 'rtmp' in format['url'] else 1
return (qidx, prefer_http, format['video_bitrate'])
formats.sort(key=_sortkey)
info = {
'_type': 'video',
'id': video_id,
'title': video_title,
'formats': formats,
'description': video_description,
'thumbnails': thumbnails,
'thumbnail': thumbnails[-1]['url'],
'uploader': video_uploader,
'upload_date': upload_date,
}
# TODO: Remove when #980 has been merged
info['url'] = formats[-1]['url']
info['ext'] = determine_ext(formats[-1]['url'])
return info
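
Ranking formats with a tuple sort key, as `_sortkey` does, can be sketched as follows. The dictionary keys `quality` and `video_bitrate` and the sample entries are assumptions made for this illustration; the point is that tuples compare element by element, so quality decides first and bitrate only breaks ties.

```python
QUALITIES = ['low', 'med', 'high', 'veryhigh']


def sort_key(fmt):
    """Order by quality name first, then by video bitrate."""
    return (QUALITIES.index(fmt['quality']), fmt['video_bitrate'])


# Made-up formats, for illustration only.
formats = [
    {'quality': 'high', 'video_bitrate': 1500, 'url': 'http://example.invalid/high.webm'},
    {'quality': 'low', 'video_bitrate': 9000, 'url': 'http://example.invalid/low.webm'},
    {'quality': 'veryhigh', 'video_bitrate': 800, 'url': 'http://example.invalid/vh_a.webm'},
    {'quality': 'veryhigh', 'video_bitrate': 2500, 'url': 'http://example.invalid/vh_b.webm'},
]
formats.sort(key=sort_key)
best = formats[-1]  # the list is ascending, so the best format is last
```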

@@ -0,0 +1,51 @@
import re
from ..utils import (
compat_urllib_parse,
determine_ext
)
from .common import InfoExtractor
class EHowIE(InfoExtractor):
IE_NAME = u'eHow'
_VALID_URL = r'(?:https?://)?(?:www\.)?ehow\.com/[^/_?]*_(?P<id>[0-9]+)'
_TEST = {
u'url': u'http://www.ehow.com/video_12245069_hardwood-flooring-basics.html',
u'file': u'12245069.flv',
u'md5': u'9809b4e3f115ae2088440bcb4efbf371',
u'info_dict': {
u"title": u"Hardwood Flooring Basics",
u"description": u"Hardwood flooring may be time consuming, but its ultimately a pretty straightforward concept. Learn about hardwood flooring basics with help from a hardware flooring business owner in this free video...",
u"uploader": u"Erick Nathan"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(r'(?:file|source)=(http[^\'"&]*)',
webpage, u'video URL')
final_url = compat_urllib_parse.unquote(video_url)
thumbnail_url = self._search_regex(r'<meta property="og:image" content="(.+?)" />',
webpage, u'thumbnail URL')
uploader = self._search_regex(r'<meta name="uploader" content="(.+?)" />',
webpage, u'uploader')
title = self._search_regex(r'<meta property="og:title" content="(.+?)" />',
webpage, u'Video title').replace(' | eHow', '')
description = self._search_regex(r'<meta property="og:description" content="(.+?)" />',
webpage, u'video description')
ext = determine_ext(final_url)
return {
'_type': 'video',
'id': video_id,
'url': final_url,
'ext': ext,
'title': title,
'thumbnail': thumbnail_url,
'description': description,
'uploader': uploader,
}
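
The video URL on these pages is percent-encoded inside a query-style parameter, which is why the extractor unquotes it. A small Python 3 sketch of that step, using the same regular expression on a made-up page fragment (`SAMPLE_PAGE` and `extract_video_url` are hypothetical names):

```python
import re
from urllib.parse import unquote

# Made-up page fragment, for illustration only.
SAMPLE_PAGE = 'flashvars="id=1&source=http%3A%2F%2Fexample.invalid%2Fvideo.flv&autoplay=0"'


def extract_video_url(webpage):
    """Find the file/source parameter and decode its percent-escapes."""
    mobj = re.search(r'(?:file|source)=(http[^\'"&]*)', webpage)
    if mobj is None:
        raise ValueError('Unable to extract video URL')
    return unquote(mobj.group(1))


video_url = extract_video_url(SAMPLE_PAGE)
```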

@@ -0,0 +1,122 @@
import itertools
import json
import random
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class EightTracksIE(InfoExtractor):
IE_NAME = '8tracks'
_VALID_URL = r'https?://8tracks\.com/(?P<user>[^/]+)/(?P<id>[^/#]+)(?:#.*)?$'
_TEST = {
u"name": u"EightTracks",
u"url": u"http://8tracks.com/ytdl/youtube-dl-test-tracks-a",
u"playlist": [
{
u"file": u"11885610.m4a",
u"md5": u"96ce57f24389fc8734ce47f4c1abcc55",
u"info_dict": {
u"title": u"youtue-dl project<>\"' - youtube-dl test track 1 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885608.m4a",
u"md5": u"4ab26f05c1f7291ea460a3920be8021f",
u"info_dict": {
u"title": u"youtube-dl project - youtube-dl test track 2 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885679.m4a",
u"md5": u"d30b5b5f74217410f4689605c35d1fd7",
u"info_dict": {
u"title": u"youtube-dl project as well - youtube-dl test track 3 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885680.m4a",
u"md5": u"4eb0a669317cd725f6bbd336a29f923a",
u"info_dict": {
u"title": u"youtube-dl project as well - youtube-dl test track 4 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885682.m4a",
u"md5": u"1893e872e263a2705558d1d319ad19e8",
u"info_dict": {
u"title": u"PH - youtube-dl test track 5 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885683.m4a",
u"md5": u"b673c46f47a216ab1741ae8836af5899",
u"info_dict": {
u"title": u"PH - youtube-dl test track 6 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885684.m4a",
u"md5": u"1d74534e95df54986da7f5abf7d842b7",
u"info_dict": {
u"title": u"phihag - youtube-dl test track 7 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
},
{
u"file": u"11885685.m4a",
u"md5": u"f081f47af8f6ae782ed131d38b9cd1c0",
u"info_dict": {
u"title": u"phihag - youtube-dl test track 8 \"'/\\\u00e4\u21ad",
u"uploader_id": u"ytdl"
}
}
]
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
playlist_id = mobj.group('id')
webpage = self._download_webpage(url, playlist_id)
json_like = self._search_regex(r"PAGE.mix = (.*?);\n", webpage, u'trax information', flags=re.DOTALL)
data = json.loads(json_like)
session = str(random.randint(0, 1000000000))
mix_id = data['id']
track_count = data['tracks_count']
first_url = 'http://8tracks.com/sets/%s/play?player=sm&mix_id=%s&format=jsonh' % (session, mix_id)
next_url = first_url
res = []
for i in itertools.count():
api_json = self._download_webpage(next_url, playlist_id,
note=u'Downloading song information %s/%s' % (str(i+1), track_count),
errnote=u'Failed to download song information')
api_data = json.loads(api_json)
track_data = api_data[u'set']['track']
info = {
'id': track_data['id'],
'url': track_data['track_file_stream_url'],
'title': track_data['performer'] + u' - ' + track_data['name'],
'raw_title': track_data['name'],
'uploader_id': data['user']['login'],
'ext': 'm4a',
}
res.append(info)
if api_data['set']['at_last_track']:
break
next_url = 'http://8tracks.com/sets/%s/next?player=sm&mix_id=%s&format=jsonh&track_id=%s' % (session, mix_id, track_data['id'])
return res
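
The loop above follows a "play, then next" chain until the API reports the last track. A minimal Python 3 sketch of that pattern is shown here, assuming the response shape used above (`set`, `track`, `at_last_track`). The `fetch_json` and `next_url_for` callables and the canned responses are hypothetical stand-ins for the real HTTP calls.

```python
import itertools


def collect_tracks(fetch_json, first_url, next_url_for):
    """Follow a play/next API chain until the server flags the last track.

    fetch_json and next_url_for are hypothetical callables supplied by the
    caller; they stand in for the download and URL templating steps.
    """
    tracks = []
    url = first_url
    for _ in itertools.count():
        api_data = fetch_json(url)
        track = api_data['set']['track']
        tracks.append(track)
        # Stop as soon as the API says there is nothing after this track.
        if api_data['set']['at_last_track']:
            break
        url = next_url_for(track['id'])
    return tracks


# Canned responses standing in for the remote API (illustrative data only).
FAKE_API = {
    'play': {'set': {'track': {'id': 1}, 'at_last_track': False}},
    'next?track_id=1': {'set': {'track': {'id': 2}, 'at_last_track': False}},
    'next?track_id=2': {'set': {'track': {'id': 3}, 'at_last_track': True}},
}

result = collect_tracks(FAKE_API.__getitem__, 'play',
                        lambda track_id: 'next?track_id=%s' % track_id)
print([t['id'] for t in result])
```

Passing the fetcher in as a parameter keeps the loop testable without network access.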

@@ -0,0 +1,78 @@
import json
import re
from .common import InfoExtractor
from ..utils import (
compat_str,
compat_urllib_parse,
ExtractorError,
)
class EscapistIE(InfoExtractor):
_VALID_URL = r'^(https?://)?(www\.)?escapistmagazine\.com/videos/view/(?P<showname>[^/]+)/(?P<episode>[^/?]+)[/?]?.*$'
_TEST = {
u'url': u'http://www.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
u'file': u'6618-Breaking-Down-Baldurs-Gate.mp4',
u'md5': u'c6793dbda81388f4264c1ba18684a74d',
u'info_dict': {
u"description": u"Baldur's Gate: Original, Modded or Enhanced Edition? I'll break down what you can expect from the new Baldur's Gate: Enhanced Edition.",
u"uploader": u"the-escapist-presents",
u"title": u"Breaking Down Baldur's Gate"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
showName = mobj.group('showname')
videoId = mobj.group('episode')
self.report_extraction(videoId)
webpage = self._download_webpage(url, videoId)
videoDesc = self._html_search_regex('<meta name="description" content="([^"]*)"',
webpage, u'description', fatal=False)
imgUrl = self._html_search_regex('<meta property="og:image" content="([^"]*)"',
webpage, u'thumbnail', fatal=False)
playerUrl = self._html_search_regex('<meta property="og:video" content="([^"]*)"',
webpage, u'player url')
title = self._html_search_regex('<meta name="title" content="([^"]*)"',
webpage, u'title').split(' : ')[-1]
configUrl = self._search_regex('config=(.*)$', playerUrl, u'config url')
configUrl = compat_urllib_parse.unquote(configUrl)
configJSON = self._download_webpage(configUrl, videoId,
u'Downloading configuration',
u'unable to download configuration')
# Technically, it's JavaScript, not JSON
configJSON = configJSON.replace("'", '"')
try:
config = json.loads(configJSON)
except (ValueError,) as err:
raise ExtractorError(u'Invalid JSON in configuration file: ' + compat_str(err))
playlist = config['playlist']
videoUrl = playlist[1]['url']
info = {
'id': videoId,
'url': videoUrl,
'uploader': showName,
'upload_date': None,
'title': title,
'ext': 'mp4',
'thumbnail': imgUrl,
'description': videoDesc,
'player_url': playerUrl,
}
return [info]
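
The configuration step above turns a JavaScript-style object literal into JSON by swapping quote characters. A small Python 3 sketch of that idea and its main limitation follows. The helper name and the sample strings are made up for illustration, and the approach is assumed to be safe only when no string value contains an apostrophe or a double quote.

```python
import json


def parse_single_quoted_config(text):
    """Parse an object literal that uses single quotes instead of double quotes.

    Replacing every ' with " is a blunt conversion: it breaks as soon as a
    value itself contains a quote character.
    """
    try:
        return json.loads(text.replace("'", '"'))
    except ValueError as err:
        raise ValueError('Invalid JSON in configuration: %s' % err)


# Illustrative input; the URLs are placeholders, not real endpoints.
sample = ("{'playlist': [{'url': 'http://example.com/intro.mp4'}, "
          "{'url': 'http://example.com/video.mp4'}]}")
config = parse_single_quoted_config(sample)
print(config['playlist'][1]['url'])
```

A value such as `'Baldur's Gate'` would make the converted text invalid, which is why the conversion is wrapped in an error handler.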

@@ -0,0 +1,120 @@
import json
import netrc
import re
import socket
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_parse,
compat_urllib_request,
ExtractorError,
)
class FacebookIE(InfoExtractor):
"""Information Extractor for Facebook"""
_VALID_URL = r'^(?:https?://)?(?:\w+\.)?facebook\.com/(?:video/video|photo)\.php\?(?:.*?)v=(?P<ID>\d+)(?:.*)'
_LOGIN_URL = 'https://login.facebook.com/login.php?m&next=http%3A%2F%2Fm.facebook.com%2Fhome.php&'
_NETRC_MACHINE = 'facebook'
IE_NAME = u'facebook'
_TEST = {
u'url': u'https://www.facebook.com/photo.php?v=120708114770723',
u'file': u'120708114770723.mp4',
u'md5': u'48975a41ccc4b7a581abd68651c1a5a8',
u'info_dict': {
u"duration": 279,
u"title": u"PEOPLE ARE AWESOME 2013"
}
}
def report_login(self):
"""Report attempt to log in."""
self.to_screen(u'Logging in')
def _real_initialize(self):
if self._downloader is None:
return
useremail = None
password = None
downloader_params = self._downloader.params
# Attempt to use provided username and password or .netrc data
if downloader_params.get('username', None) is not None:
useremail = downloader_params['username']
password = downloader_params['password']
elif downloader_params.get('usenetrc', False):
try:
info = netrc.netrc().authenticators(self._NETRC_MACHINE)
if info is not None:
useremail = info[0]
password = info[2]
else:
raise netrc.NetrcParseError('No authenticators for %s' % self._NETRC_MACHINE)
except (IOError, netrc.NetrcParseError) as err:
self._downloader.report_warning(u'parsing .netrc: %s' % compat_str(err))
return
if useremail is None:
return
# Log in
login_form = {
'email': useremail,
'pass': password,
'login': 'Log+In'
}
request = compat_urllib_request.Request(self._LOGIN_URL, compat_urllib_parse.urlencode(login_form))
try:
self.report_login()
login_results = compat_urllib_request.urlopen(request).read()
if re.search(r'<form(.*)name="login"(.*)</form>', login_results) is not None:
self._downloader.report_warning(u'unable to log in: bad username/password, or exceeded login rate limit (~3/min). Check credentials or wait.')
return
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self._downloader.report_warning(u'unable to log in: %s' % compat_str(err))
return
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group('ID')
url = 'https://www.facebook.com/video/video.php?v=%s' % video_id
webpage = self._download_webpage(url, video_id)
BEFORE = '{swf.addParam(param[0], param[1]);});\n'
AFTER = '.forEach(function(variable) {swf.addVariable(variable[0], variable[1]);});'
m = re.search(re.escape(BEFORE) + '(.*?)' + re.escape(AFTER), webpage)
if not m:
raise ExtractorError(u'Cannot parse data')
data = dict(json.loads(m.group(1)))
params_raw = compat_urllib_parse.unquote(data['params'])
params = json.loads(params_raw)
video_data = params['video_data'][0]
video_url = video_data.get('hd_src')
if not video_url:
video_url = video_data['sd_src']
if not video_url:
raise ExtractorError(u'Cannot find video URL')
video_duration = int(video_data['video_duration'])
thumbnail = video_data['thumbnail_src']
video_title = self._html_search_regex('<h2 class="uiHeaderTitle">([^<]+)</h2>',
webpage, u'title')
info = {
'id': video_id,
'title': video_title,
'url': video_url,
'ext': 'mp4',
'duration': video_duration,
'thumbnail': thumbnail,
}
return [info]
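
The extraction above decodes a URL-quoted JSON blob and prefers the HD source over the SD one. A rough Python 3 sketch of that selection is given here, assuming the same key names (`params`, `video_data`, `hd_src`, `sd_src`). The helper name and sample values are hypothetical.

```python
import json
from urllib.parse import quote, unquote


def pick_video_url(data):
    """Return the HD source if present, otherwise the SD source."""
    params = json.loads(unquote(data['params']))
    video_data = params['video_data'][0]
    video_url = video_data.get('hd_src') or video_data.get('sd_src')
    if not video_url:
        raise ValueError('Cannot find video URL')
    return video_url


# Illustrative payload built the same way the page is assumed to embed it.
sample_params = {
    'video_data': [{
        'hd_src': None,
        'sd_src': 'http://example.com/sd.mp4',
        'video_duration': '279',
    }]
}
data = {'params': quote(json.dumps(sample_params))}
print(pick_video_url(data))
```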

@@ -0,0 +1,67 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
unescapeHTML,
)
class FlickrIE(InfoExtractor):
"""Information Extractor for Flickr videos"""
_VALID_URL = r'(?:https?://)?(?:www\.)?flickr\.com/photos/(?P<uploader_id>[\w\-_@]+)/(?P<id>\d+).*'
_TEST = {
u'url': u'http://www.flickr.com/photos/forestwander-nature-pictures/5645318632/in/photostream/',
u'file': u'5645318632.mp4',
u'md5': u'6fdc01adbc89d72fc9c4f15b4a4ba87b',
u'info_dict': {
u"description": u"Waterfalls in the Springtime at Dark Hollow Waterfalls. These are located just off of Skyline Drive in Virginia. They are only about 6/10 of a mile hike but it is a pretty steep hill and a good climb back up.",
u"uploader_id": u"forestwander-nature-pictures",
u"title": u"Dark Hollow Waterfalls"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_uploader_id = mobj.group('uploader_id')
webpage_url = 'http://www.flickr.com/photos/' + video_uploader_id + '/' + video_id
webpage = self._download_webpage(webpage_url, video_id)
secret = self._search_regex(r"photo_secret: '(\w+)'", webpage, u'secret')
first_url = 'https://secure.flickr.com/apps/video/video_mtl_xml.gne?v=x&photo_id=' + video_id + '&secret=' + secret + '&bitrate=700&target=_self'
first_xml = self._download_webpage(first_url, video_id, 'Downloading first data webpage')
node_id = self._html_search_regex(r'<Item id="id">(\d+-\d+)</Item>',
first_xml, u'node_id')
second_url = 'https://secure.flickr.com/video_playlist.gne?node_id=' + node_id + '&tech=flash&mode=playlist&bitrate=700&secret=' + secret + '&rd=video.yahoo.com&noad=1'
second_xml = self._download_webpage(second_url, video_id, 'Downloading second data webpage')
self.report_extraction(video_id)
mobj = re.search(r'<STREAM APP="(.+?)" FULLPATH="(.+?)"', second_xml)
if mobj is None:
raise ExtractorError(u'Unable to extract video url')
video_url = mobj.group(1) + unescapeHTML(mobj.group(2))
video_title = self._html_search_regex(r'<meta property="og:title" content=(?:"([^"]+)"|\'([^\']+)\')',
webpage, u'video title')
video_description = self._html_search_regex(r'<meta property="og:description" content=(?:"([^"]+)"|\'([^\']+)\')',
webpage, u'description', fatal=False)
thumbnail = self._html_search_regex(r'<meta property="og:image" content=(?:"([^"]+)"|\'([^\']+)\')',
webpage, u'thumbnail', fatal=False)
return [{
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': video_title,
'description': video_description,
'thumbnail': thumbnail,
'uploader_id': video_uploader_id,
}]

@@ -0,0 +1,40 @@
import re
from .common import InfoExtractor
class FunnyOrDieIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:www\.)?funnyordie\.com/videos/(?P<id>[0-9a-f]+)/.*$'
_TEST = {
u'url': u'http://www.funnyordie.com/videos/0732f586d7/heart-shaped-box-literal-video-version',
u'file': u'0732f586d7.mp4',
u'md5': u'f647e9e90064b53b6e046e75d0241fbd',
u'info_dict': {
u"description": u"Lyrics changed to match the video. Spoken cameo by Obscurus Lupa (from ThatGuyWithTheGlasses.com). Based on a concept by Dustin McLean (DustFilms.com). Performed, edited, and written by David A. Scott.",
u"title": u"Heart-Shaped Box: Literal Video Version"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(r'<video[^>]*>\s*<source[^>]*>\s*<source src="(?P<url>[^"]+)"',
webpage, u'video URL', flags=re.DOTALL)
title = self._html_search_regex((r"<h1 class='player_page_h1'.*?>(?P<title>.*?)</h1>",
r'<title>(?P<title>[^<]+?)</title>'), webpage, 'title', flags=re.DOTALL)
video_description = self._html_search_regex(r'<meta property="og:description" content="(?P<desc>.*?)"',
webpage, u'description', fatal=False, flags=re.DOTALL)
info = {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': title,
'description': video_description,
}
return [info]

@@ -0,0 +1,55 @@
import re
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
unified_strdate,
compat_urllib_parse,
)
class GameSpotIE(InfoExtractor):
_VALID_URL = r'(?:http://)?(?:www\.)?gamespot\.com/.*-(?P<page_id>\d+)/?'
_TEST = {
u"url": u"http://www.gamespot.com/arma-iii/videos/arma-iii-community-guide-sitrep-i-6410818/",
u"file": u"6410818.mp4",
u"md5": u"b2a30deaa8654fcccd43713a6b6a4825",
u"info_dict": {
u"title": u"Arma III - Community Guide: SITREP I",
u"upload_date": u"20130627",
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
page_id = mobj.group('page_id')
webpage = self._download_webpage(url, page_id)
video_id = self._html_search_regex([r'"og:video" content=".*?\?id=(\d+)"',
r'http://www\.gamespot\.com/videoembed/(\d+)'],
webpage, 'video id')
data = compat_urllib_parse.urlencode({'id': video_id, 'newplayer': '1'})
info_url = 'http://www.gamespot.com/pages/video_player/xml.php?' + data
info_xml = self._download_webpage(info_url, video_id)
doc = xml.etree.ElementTree.fromstring(info_xml)
clip_el = doc.find('./playList/clip')
http_urls = [{'url': node.find('filePath').text,
'rate': int(node.find('rate').text)}
for node in clip_el.find('./httpURI')]
best_quality = sorted(http_urls, key=lambda f: f['rate'])[-1]
video_url = best_quality['url']
title = clip_el.find('./title').text
ext = video_url.rpartition('.')[2]
thumbnail_url = clip_el.find('./screenGrabURI').text
view_count = int(clip_el.find('./views').text)
upload_date = unified_strdate(clip_el.find('./postDate').text)
return [{
'id': video_id,
'url': video_url,
'ext': ext,
'title': title,
'thumbnail': thumbnail_url,
'upload_date': upload_date,
'view_count': view_count,
}]
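
The extractor above picks the stream with the highest `rate` from an XML playlist. The following Python 3 sketch shows the same selection on a made-up document. The `filePath`, `rate`, `httpURI`, and `playList/clip` names come from the code above, while the `root` and `videoFile` element names and the URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical playlist document; only the structure matters here.
SAMPLE_XML = """
<root>
  <playList>
    <clip>
      <title>Sample clip</title>
      <httpURI>
        <videoFile><filePath>http://example.com/low.mp4</filePath><rate>400</rate></videoFile>
        <videoFile><filePath>http://example.com/high.mp4</filePath><rate>2500</rate></videoFile>
        <videoFile><filePath>http://example.com/mid.mp4</filePath><rate>1200</rate></videoFile>
      </httpURI>
    </clip>
  </playList>
</root>
"""


def best_http_url(xml_text):
    """Return the file path of the entry with the highest bitrate."""
    doc = ET.fromstring(xml_text.strip())
    clip_el = doc.find('./playList/clip')
    candidates = [
        {'url': node.find('filePath').text, 'rate': int(node.find('rate').text)}
        for node in clip_el.find('./httpURI')
    ]
    # max() with a key avoids sorting the whole list just to take one item.
    return max(candidates, key=lambda f: f['rate'])['url']


print(best_http_url(SAMPLE_XML))
```

Converting `rate` to `int` before comparing matters: compared as strings, `'400'` would rank above `'2500'`.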

@@ -0,0 +1,63 @@
import re
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
ExtractorError,
)
class GametrailersIE(InfoExtractor):
_VALID_URL = r'http://www\.gametrailers\.com/(?P<type>videos|reviews|full-episodes)/(?P<id>.*?)/(?P<title>.*)'
_TEST = {
u'url': u'http://www.gametrailers.com/videos/zbvr8i/mirror-s-edge-2-e3-2013--debut-trailer',
u'file': u'70e9a5d7-cf25-4a10-9104-6f3e7342ae0d.flv',
u'md5': u'c3edbc995ab4081976e16779bd96a878',
u'info_dict': {
u"title": u"E3 2013: Debut Trailer"
},
u'skip': u'Requires rtmpdump'
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
mgid = self._search_regex([r'data-video="(?P<mgid>mgid:.*?)"',
r'data-contentId=\'(?P<mgid>mgid:.*?)\''],
webpage, u'mgid')
data = compat_urllib_parse.urlencode({'uri': mgid, 'acceptMethods': 'fms'})
info_page = self._download_webpage('http://www.gametrailers.com/feeds/mrss?' + data,
video_id, u'Downloading video info')
doc = xml.etree.ElementTree.fromstring(info_page.encode('utf-8'))
default_thumb = doc.find('./channel/image/url').text
media_namespace = {'media': 'http://search.yahoo.com/mrss/'}
parts = [{
'title': video_doc.find('title').text,
'ext': 'flv',
'id': video_doc.find('guid').text.rpartition(':')[2],
# Videos are actually flv not mp4
'url': self._get_video_url(video_doc.find('media:group/media:content', media_namespace).attrib['url'], video_id),
# The thumbnail may not be defined, it would be ''
'thumbnail': video_doc.find('media:group/media:thumbnail', media_namespace).attrib['url'] or default_thumb,
'description': video_doc.find('description').text,
} for video_doc in doc.findall('./channel/item')]
return parts
def _get_video_url(self, mediagen_url, video_id):
if 'acceptMethods' not in mediagen_url:
mediagen_url += '&acceptMethods=fms'
links_webpage = self._download_webpage(mediagen_url,
video_id, u'Downloading video urls info')
doc = xml.etree.ElementTree.fromstring(links_webpage)
urls = list(doc.iter('src'))
if len(urls) == 0:
raise ExtractorError(u'Unable to extract video url')
# They are sorted from worst to best quality
return urls[-1].text

@@ -0,0 +1,182 @@
# encoding: utf-8
import os
import re
from .common import InfoExtractor
from ..utils import (
compat_urllib_error,
compat_urllib_parse,
compat_urllib_request,
ExtractorError,
)
from .brightcove import BrightcoveIE
class GenericIE(InfoExtractor):
IE_DESC = u'Generic downloader that works on some sites'
_VALID_URL = r'.*'
IE_NAME = u'generic'
_TESTS = [
{
u'url': u'http://www.hodiho.fr/2013/02/regis-plante-sa-jeep.html',
u'file': u'13601338388002.mp4',
u'md5': u'85b90ccc9d73b4acd9138d3af4c27f89',
u'info_dict': {
u"uploader": u"www.hodiho.fr",
u"title": u"R\u00e9gis plante sa Jeep"
}
},
{
u'url': u'http://www.8tv.cat/8aldia/videos/xavier-sala-i-martin-aquesta-tarda-a-8-al-dia/',
u'file': u'2371591881001.mp4',
u'md5': u'9e80619e0a94663f0bdc849b4566af19',
u'note': u'Test Brightcove downloads and detection in GenericIE',
u'info_dict': {
u'title': u'Xavier Sala i Martín: “Un banc que no presta és un banc zombi que no serveix per a res”',
u'uploader': u'8TV',
u'description': u'md5:a950cc4285c43e44d763d036710cd9cd',
}
},
]
def report_download_webpage(self, video_id):
"""Report webpage download."""
if not self._downloader.params.get('test', False):
self._downloader.report_warning(u'Falling back on generic information extractor.')
super(GenericIE, self).report_download_webpage(video_id)
def report_following_redirect(self, new_url):
"""Report that a redirect is being followed."""
self._downloader.to_screen(u'[redirect] Following redirect to %s' % new_url)
def _test_redirect(self, url):
"""Check whether the URL is a redirect (e.g. a URL shortener); if so, return the new URL, otherwise False."""
class HeadRequest(compat_urllib_request.Request):
def get_method(self):
return "HEAD"
class HEADRedirectHandler(compat_urllib_request.HTTPRedirectHandler):
"""
Subclass the HTTPRedirectHandler to make it use our
HeadRequest also on the redirected URL
"""
def redirect_request(self, req, fp, code, msg, headers, newurl):
if code in (301, 302, 303, 307):
newurl = newurl.replace(' ', '%20')
newheaders = dict((k,v) for k,v in req.headers.items()
if k.lower() not in ("content-length", "content-type"))
return HeadRequest(newurl,
headers=newheaders,
origin_req_host=req.get_origin_req_host(),
unverifiable=True)
else:
raise compat_urllib_error.HTTPError(req.get_full_url(), code, msg, headers, fp)
class HTTPMethodFallback(compat_urllib_request.BaseHandler):
"""
Fallback to GET if HEAD is not allowed (405 HTTP error)
"""
def http_error_405(self, req, fp, code, msg, headers):
fp.read()
fp.close()
newheaders = dict((k,v) for k,v in req.headers.items()
if k.lower() not in ("content-length", "content-type"))
return self.parent.open(compat_urllib_request.Request(req.get_full_url(),
headers=newheaders,
origin_req_host=req.get_origin_req_host(),
unverifiable=True))
# Build our opener
opener = compat_urllib_request.OpenerDirector()
for handler in [compat_urllib_request.HTTPHandler, compat_urllib_request.HTTPDefaultErrorHandler,
HTTPMethodFallback, HEADRedirectHandler,
compat_urllib_request.HTTPErrorProcessor, compat_urllib_request.HTTPSHandler]:
opener.add_handler(handler())
response = opener.open(HeadRequest(url))
if response is None:
raise ExtractorError(u'Invalid URL protocol')
new_url = response.geturl()
if url == new_url:
return False
self.report_following_redirect(new_url)
return new_url
def _real_extract(self, url):
new_url = self._test_redirect(url)
if new_url: return [self.url_result(new_url)]
video_id = url.split('/')[-1]
try:
webpage = self._download_webpage(url, video_id)
except ValueError:
# since this is the last-resort InfoExtractor, if
# this error is thrown, it'll be thrown here
raise ExtractorError(u'Invalid URL: %s' % url)
self.report_extraction(video_id)
# Look for Brightcove:
m_brightcove = re.search(r'<object.+?class=".*?BrightcoveExperience.*?".+?</object>', webpage, re.DOTALL)
if m_brightcove is not None:
self.to_screen(u'Brightcove video detected.')
bc_url = BrightcoveIE._build_brighcove_url(m_brightcove.group())
return self.url_result(bc_url, 'Brightcove')
# Start with something easy: JW Player in SWFObject
mobj = re.search(r'flashvars: [\'"](?:.*&)?file=(http[^\'"&]*)', webpage)
if mobj is None:
# Broaden the search a little bit
mobj = re.search(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage)
if mobj is None:
# Broaden the search a little bit: JWPlayer JS loader
mobj = re.search(r'[^A-Za-z0-9]?file["\']?:\s*["\'](http[^\'"&]*)', webpage)
if mobj is None:
# Try to find twitter cards info
mobj = re.search(r'<meta (?:property|name)="twitter:player:stream" (?:content|value)="(.+?)"', webpage)
if mobj is None:
# We look for Open Graph info:
# We have to match any number of spaces between elements; some sites try to align them (e.g. statigr.am)
m_video_type = re.search(r'<meta.*?property="og:video:type".*?content="video/(.*?)"', webpage)
# We only look in og:video if the MIME type is a video, don't try if it's a Flash player:
if m_video_type is not None:
mobj = re.search(r'<meta.*?property="og:video".*?content="(.*?)"', webpage)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
# It's possible that one of the regexes
# matched, but returned an empty group:
if mobj.group(1) is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_url = compat_urllib_parse.unquote(mobj.group(1))
video_id = os.path.basename(video_url)
# here's a fun little line of code for you:
video_extension = os.path.splitext(video_id)[1][1:]
video_id = os.path.splitext(video_id)[0]
# it's tempting to parse this further, but you would
# have to take into account all the variations like
# Video Title - Site Name
# Site Name | Video Title
# Video Title - Tagline | Site Name
# and so on and so forth; it's just not practical
video_title = self._html_search_regex(r'<title>(.*)</title>',
webpage, u'video title', default=u'video', flags=re.DOTALL)
# video uploader is domain name
video_uploader = self._search_regex(r'(?:https?://)?([^/]*)/.*',
url, u'video uploader')
return [{
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'upload_date': None,
'title': video_title,
'ext': video_extension,
}]
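
The generic fallback above derives the video id and extension from the last path component of the media URL. A short Python 3 sketch of that step follows. The helper name and sample URL are hypothetical, and `urllib.parse.unquote` is used where the code above goes through its compatibility wrapper.

```python
import os
from urllib.parse import unquote


def split_media_url(raw_url):
    """Return (decoded URL, id, extension) based on the URL's basename."""
    video_url = unquote(raw_url)
    basename = os.path.basename(video_url)
    video_id, dot_ext = os.path.splitext(basename)
    # splitext keeps the leading dot, so drop it.
    return video_url, video_id, dot_ext[1:]


print(split_media_url('http%3A//example.com/media/13601338388002.mp4'))
```

One limitation of this approach: a query string after the file name (for example `clip.mp4?token=1`) ends up inside the extension, so URLs of that shape would need extra handling.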

@@ -0,0 +1,96 @@
# coding: utf-8
import datetime
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class GooglePlusIE(InfoExtractor):
IE_DESC = u'Google Plus'
_VALID_URL = r'(?:https://)?plus\.google\.com/(?:[^/]+/)*?posts/(\w+)'
IE_NAME = u'plus.google'
_TEST = {
u"url": u"https://plus.google.com/u/0/108897254135232129896/posts/ZButuJc6CtH",
u"file": u"ZButuJc6CtH.flv",
u"info_dict": {
u"upload_date": u"20120613",
u"uploader": u"井上ヨシマサ",
u"title": u"嘆きの天使 降臨"
}
}
def _real_extract(self, url):
# Extract id from URL
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
post_url = mobj.group(0)
video_id = mobj.group(1)
video_extension = 'flv'
# Step 1, Retrieve post webpage to extract further information
webpage = self._download_webpage(post_url, video_id, u'Downloading entry webpage')
self.report_extraction(video_id)
# Extract upload date
upload_date = self._html_search_regex('title="Timestamp">(.*?)</a>',
webpage, u'upload date', fatal=False)
if upload_date:
# Convert timestring to a format suitable for filename
upload_date = datetime.datetime.strptime(upload_date, "%Y-%m-%d")
upload_date = upload_date.strftime('%Y%m%d')
# Extract uploader
uploader = self._html_search_regex(r'rel\="author".*?>(.*?)</a>',
webpage, u'uploader', fatal=False)
# Extract title
# Get the first line for title
video_title = self._html_search_regex(r'<meta name\=\"Description\" content\=\"(.*?)[\n<"]',
webpage, 'title', default=u'NA')
# Step 2, Simulate clicking the image box to launch video
DOMAIN = 'https://plus.google.com'
video_page = self._search_regex(r'<a href="((?:%s)?/photos/.*?)"' % re.escape(DOMAIN),
webpage, u'video page URL')
if not video_page.startswith(DOMAIN):
video_page = DOMAIN + video_page
webpage = self._download_webpage(video_page, video_id, u'Downloading video page')
# Extract video links of all sizes from the video page
pattern = r'\d+,\d+,(\d+),"(http\://redirector\.googlevideo\.com.*?)"'
mobj = re.findall(pattern, webpage)
if len(mobj) == 0:
raise ExtractorError(u'Unable to extract video links')
# Sort by resolution, comparing numerically rather than as strings
links = sorted(mobj, key=lambda link: int(link[0]))
# Choose the last entry of the sort, i.e. the highest resolution
video_url = links[-1]
# Only get the url. The resolution part in the tuple has no use anymore
video_url = video_url[-1]
# Treat escaped \u0026 style hex
try:
video_url = video_url.decode("unicode_escape")
except AttributeError: # Python 3
video_url = bytes(video_url, 'ascii').decode('unicode-escape')
return [{
'id': video_id,
'url': video_url,
'uploader': uploader,
'upload_date': upload_date,
'title': video_title,
'ext': video_extension,
}]
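
The link selection above relies on ordering `(resolution, url)` tuples and then unescaping `\u0026` sequences. The Python 3 sketch below shows both steps on invented page text. The pattern is a simplified version of the one above, and the helper name and sample URLs are assumptions. Note that resolutions captured by a regex are strings, so they are converted to integers before comparison.

```python
import re

# Simplified form of the pattern used above (no escaped colon).
PATTERN = r'\d+,\d+,(\d+),"(http://redirector\.googlevideo\.com.*?)"'


def pick_highest_resolution(page):
    """Return the unescaped URL of the entry with the largest resolution."""
    matches = re.findall(PATTERN, page)
    if not matches:
        raise ValueError('Unable to extract video links')
    # Compare resolutions as integers; as strings, '720' sorts after '1080'.
    _resolution, url = max(matches, key=lambda m: int(m[0]))
    # Turn literal \u0026 sequences into the characters they stand for.
    return bytes(url, 'ascii').decode('unicode-escape')


# Invented page fragment with three sizes of the same video.
sample_page = (
    '1,2,360,"http://redirector.googlevideo.com/a?x=1\\u0026y=2"\n'
    '1,2,1080,"http://redirector.googlevideo.com/b?x=1\\u0026y=2"\n'
    '1,2,720,"http://redirector.googlevideo.com/c?x=1\\u0026y=2"\n'
)
print(pick_highest_resolution(sample_page))
```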

@@ -0,0 +1,39 @@
import itertools
import re
from .common import SearchInfoExtractor
from ..utils import (
compat_urllib_parse,
)
class GoogleSearchIE(SearchInfoExtractor):
IE_DESC = u'Google Video search'
_MORE_PAGES_INDICATOR = r'id="pnnext" class="pn"'
_MAX_RESULTS = 1000
IE_NAME = u'video.google:search'
_SEARCH_KEY = 'gvsearch'
def _get_n_results(self, query, n):
"""Get a specified number of results for a query"""
res = {
'_type': 'playlist',
'id': query,
'entries': []
}
for pagenum in itertools.count(1):
result_url = u'http://www.google.com/search?tbm=vid&q=%s&start=%s&hl=en' % (compat_urllib_parse.quote_plus(query), pagenum*10)
webpage = self._download_webpage(result_url, u'gvsearch:' + query,
note='Downloading result page ' + str(pagenum))
for mobj in re.finditer(r'<h3 class="r"><a href="([^"]+)"', webpage):
e = {
'_type': 'url',
'url': mobj.group(1)
}
res['entries'].append(e)
if (pagenum * 10 > n) or not re.search(self._MORE_PAGES_INDICATOR, webpage):
return res

@@ -0,0 +1,48 @@
import re
import base64
from .common import InfoExtractor
class HotNewHipHopIE(InfoExtractor):
_VALID_URL = r'http://www\.hotnewhiphop\.com/.*\.(?P<id>.*)\.html'
_TEST = {
u'url': u"http://www.hotnewhiphop.com/freddie-gibbs-lay-it-down-song.1435540.html",
u'file': u'1435540.mp3',
u'md5': u'2c2cd2f76ef11a9b3b581e8b232f3d96',
u'info_dict': {
u"title": u"Freddie Gibbs Songs - Lay It Down"
}
}
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
video_id = m.group('id')
webpage_src = self._download_webpage(url, video_id)
video_url_base64 = self._search_regex(r'data-path="(.*?)"',
webpage_src, u'video URL', fatal=False)
if video_url_base64 is None:
video_url = self._search_regex(r'"contentUrl" content="(.*?)"', webpage_src,
u'video URL')
return self.url_result(video_url, ie='Youtube')
video_url = base64.b64decode(video_url_base64).decode('utf-8')
video_title = self._html_search_regex(r"<title>(.*)</title>",
webpage_src, u'title')
# Extract the thumbnail; it is optional, so a missing one is not fatal
thumbnail = self._html_search_regex(r'"og:image" content="(.*)"',
webpage_src, u'thumbnail', fatal=False)
results = [{
'id': video_id,
'url': video_url,
'title': video_title,
'thumbnail': thumbnail,
'ext': 'mp3',
}]
return results

@@ -0,0 +1,46 @@
import re
from .common import InfoExtractor
class HowcastIE(InfoExtractor):
_VALID_URL = r'(?:https?://)?(?:www\.)?howcast\.com/videos/(?P<id>\d+)'
_TEST = {
u'url': u'http://www.howcast.com/videos/390161-How-to-Tie-a-Square-Knot-Properly',
u'file': u'390161.mp4',
u'md5': u'1d7ba54e2c9d7dc6935ef39e00529138',
u'info_dict': {
u"description": u"The square knot, also known as the reef knot, is one of the oldest, most basic knots to tie, and can be used in many different ways. Here's the proper way to tie a square knot.",
u"title": u"How to Tie a Square Knot Properly"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage_url = 'http://www.howcast.com/videos/' + video_id
webpage = self._download_webpage(webpage_url, video_id)
self.report_extraction(video_id)
video_url = self._search_regex(r'\'?file\'?: "(http://mobile-media\.howcast\.com/[0-9]+\.mp4)',
webpage, u'video URL')
video_title = self._html_search_regex(r'<meta content=(?:"([^"]+)"|\'([^\']+)\') property=\'og:title\'',
webpage, u'title')
video_description = self._html_search_regex(r'<meta content=(?:"([^"]+)"|\'([^\']+)\') name=\'description\'',
webpage, u'description', fatal=False)
thumbnail = self._html_search_regex(r'<meta content=\'(.+?)\' property=\'og:image\'',
webpage, u'thumbnail', fatal=False)
return [{
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': video_title,
'description': video_description,
'thumbnail': thumbnail,
}]

@@ -0,0 +1,71 @@
import json
import re
import time
from .common import InfoExtractor
from ..utils import (
compat_str,
compat_urllib_parse,
compat_urllib_request,
ExtractorError,
)
class HypemIE(InfoExtractor):
"""Information Extractor for hypem"""
_VALID_URL = r'(?:http://)?(?:www\.)?hypem\.com/track/([^/]+)/([^/]+)'
_TEST = {
u'url': u'http://hypem.com/track/1v6ga/BODYWORK+-+TAME',
u'file': u'1v6ga.mp3',
u'md5': u'b9cc91b5af8995e9f0c1cee04c575828',
u'info_dict': {
u"title": u"Tame"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
track_id = mobj.group(1)
data = {'ax': 1, 'ts': time.time()}
data_encoded = compat_urllib_parse.urlencode(data)
complete_url = url + "?" + data_encoded
request = compat_urllib_request.Request(complete_url)
response, urlh = self._download_webpage_handle(request, track_id, u'Downloading webpage with the url')
cookie = urlh.headers.get('Set-Cookie', '')
self.report_extraction(track_id)
html_tracks = self._html_search_regex(r'<script type="application/json" id="displayList-data">(.*?)</script>',
response, u'tracks', flags=re.MULTILINE|re.DOTALL).strip()
try:
track_list = json.loads(html_tracks)
track = track_list[u'tracks'][0]
except ValueError:
raise ExtractorError(u'Hypemachine contained invalid JSON.')
key = track[u"key"]
track_id = track[u"id"]
artist = track[u"artist"]
title = track[u"song"]
serve_url = "http://hypem.com/serve/source/%s/%s" % (compat_str(track_id), compat_str(key))
request = compat_urllib_request.Request(serve_url, "", {'Content-Type': 'application/json'})
request.add_header('cookie', cookie)
song_data_json = self._download_webpage(request, track_id, u'Downloading metadata')
try:
song_data = json.loads(song_data_json)
except ValueError:
raise ExtractorError(u'Hypemachine contained invalid JSON.')
final_url = song_data[u"url"]
return [{
'id': track_id,
'url': final_url,
'ext': "mp3",
'title': title,
'artist': artist,
}]

@@ -0,0 +1,39 @@
import re
from .common import InfoExtractor
class InaIE(InfoExtractor):
"""Information Extractor for Ina.fr"""
_VALID_URL = r'(?:http://)?(?:www\.)?ina\.fr/video/(?P<id>I[0-9]+)/.*'
_TEST = {
u'url': u'www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
u'file': u'I12055569.mp4',
u'md5': u'a667021bf2b41f8dc6049479d9bb38a3',
u'info_dict': {
u"title": u"Fran\u00e7ois Hollande \"Je crois que c'est clair\""
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
mrss_url = 'http://player.ina.fr/notices/%s.mrss' % video_id
video_extension = 'mp4'
webpage = self._download_webpage(mrss_url, video_id)
self.report_extraction(video_id)
video_url = self._html_search_regex(r'<media:player url="(?P<mp4url>http://mp4.ina.fr/[^"]+\.mp4)',
webpage, u'video URL')
video_title = self._search_regex(r'<title><!\[CDATA\[(?P<titre>.*?)]]></title>',
webpage, u'title')
return [{
'id': video_id,
'url': video_url,
'ext': video_extension,
'title': video_title,
}]

@@ -0,0 +1,62 @@
import base64
import re
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
ExtractorError,
)
class InfoQIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:www\.)?infoq\.com/[^/]+/[^/]+$'
_TEST = {
u"name": u"InfoQ",
u"url": u"http://www.infoq.com/presentations/A-Few-of-My-Favorite-Python-Things",
u"file": u"12-jan-pythonthings.mp4",
u"info_dict": {
u"description": u"Mike Pirnat presents some tips and tricks, standard libraries and third party packages that make programming in Python a richer experience.",
u"title": u"A Few of My Favorite [Python] Things"
},
u"params": {
u"skip_download": True
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
webpage = self._download_webpage(url, video_id=url)
self.report_extraction(url)
# Extract video URL
mobj = re.search(r"jsclassref ?= ?'([^']*)'", webpage)
if mobj is None:
raise ExtractorError(u'Unable to extract video url')
real_id = compat_urllib_parse.unquote(base64.b64decode(mobj.group(1).encode('ascii')).decode('utf-8'))
video_url = 'rtmpe://video.infoq.com/cfx/st/' + real_id
# Extract title
video_title = self._search_regex(r'contentTitle = "(.*?)";',
webpage, u'title')
# Extract description
video_description = self._html_search_regex(r'<meta name="description" content="(.*)"(?:\s*/)?>',
webpage, u'description', fatal=False)
video_filename = video_url.split('/')[-1]
video_id, extension = video_filename.split('.')
info = {
'id': video_id,
'url': video_url,
'uploader': None,
'upload_date': None,
'title': video_title,
'ext': extension, # Extension is always(?) mp4, but seems to be flv
'thumbnail': None,
'description': video_description,
}
return [info]
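
The URL derivation in InfoQIE can be tried outside the extractor. The following is a minimal Python 3 sketch, not the extractor's code: `decode_jsclassref` is a hypothetical helper name, and the sample value is built locally rather than taken from a real InfoQ page. It uses `rpartition` instead of `split('.')` so that a file name with extra dots does not raise.

```python
import base64
from urllib.parse import quote, unquote

RTMP_BASE = 'rtmpe://video.infoq.com/cfx/st/'  # prefix used by the extractor above


def decode_jsclassref(jsclassref):
    """Hypothetical helper: turn a jsclassref value into (url, id, ext)."""
    # The value is base64 text wrapping a percent-encoded path.
    real_id = unquote(base64.b64decode(jsclassref.encode('ascii')).decode('utf-8'))
    video_url = RTMP_BASE + real_id
    filename = video_url.split('/')[-1]
    # rpartition tolerates extra dots in the file name, unlike split('.')
    video_id, _, ext = filename.rpartition('.')
    return video_url, video_id, ext


# Made-up sample: encode a path the same way, then decode it again.
sample = base64.b64encode(
    quote('presentations/12-jan-pythonthings.mp4', safe='').encode('ascii')
).decode('ascii')
url, video_id, ext = decode_jsclassref(sample)
```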

@@ -0,0 +1,42 @@
import re
from .common import InfoExtractor
class InstagramIE(InfoExtractor):
_VALID_URL = r'(?:http://)?instagram\.com/p/(.*?)/'
_TEST = {
u'url': u'http://instagram.com/p/aye83DjauH/#',
u'file': u'aye83DjauH.mp4',
u'md5': u'0d2da106a9d2631273e192b372806516',
u'info_dict': {
u"uploader_id": u"naomipq",
u"title": u"Video by naomipq"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(
r'<meta property="og:video" content="(.+?)"',
webpage, u'video URL')
thumbnail_url = self._html_search_regex(
r'<meta property="og:image" content="(.+?)" />',
webpage, u'thumbnail URL', fatal=False)
html_title = self._html_search_regex(
r'<title>(.+?)</title>',
webpage, u'title', flags=re.DOTALL)
title = re.sub(u'(?: *\(Videos?\))? \u2022 Instagram$', '', html_title).strip()
uploader_id = self._html_search_regex(r'content="(.*?)\'s video on Instagram',
webpage, u'uploader name', fatal=False)
ext = 'mp4'
return [{
'id': video_id,
'url': video_url,
'ext': ext,
'title': title,
'thumbnail': thumbnail_url,
'uploader_id' : uploader_id
}]
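
The title cleanup in InstagramIE strips the trailing site name from the page title. A small Python 3 sketch of that step follows; `clean_title` is a hypothetical helper name and the sample titles in the usage note are made up for illustration.

```python
import re

# Same pattern as in InstagramIE; \u2022 is the bullet character in the page title.
TITLE_SUFFIX_RE = u'(?: *\\(Videos?\\))? \u2022 Instagram$'


def clean_title(html_title):
    """Hypothetical helper: drop the trailing site name from a <title> value."""
    return re.sub(TITLE_SUFFIX_RE, '', html_title).strip()
```

Calling `clean_title(u'Video by naomipq (Video) \u2022 Instagram')` removes both the optional "(Video)" marker and the site suffix; a title without the suffix is returned unchanged.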

@@ -0,0 +1,56 @@
# coding: utf-8
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
unescapeHTML,
)
class JukeboxIE(InfoExtractor):
_VALID_URL = r'^http://www\.jukebox?\..+?\/.+[,](?P<video_id>[a-z0-9\-]+)\.html'
_IFRAME = r'<iframe .*src="(?P<iframe>[^"]*)".*>'
_VIDEO_URL = r'"config":{"file":"(?P<video_url>http:[^"]+[.](?P<video_ext>[^.?]+)[?]mdtk=[0-9]+)"'
_TITLE = r'<h1 class="inline">(?P<title>[^<]+)</h1>.*<span id="infos_article_artist">(?P<artist>[^<]+)</span>'
_IS_YOUTUBE = r'config":{"file":"(?P<youtube_url>http:[\\][/][\\][/]www[.]youtube[.]com[\\][/]watch[?]v=[^"]+)"'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('video_id')
html = self._download_webpage(url, video_id)
mobj = re.search(self._IFRAME, html)
if mobj is None:
raise ExtractorError(u'Cannot extract iframe url')
iframe_url = unescapeHTML(mobj.group('iframe'))
iframe_html = self._download_webpage(iframe_url, video_id, 'Downloading iframe')
mobj = re.search(r'class="jkb_waiting"', iframe_html)
if mobj is not None:
raise ExtractorError(u'Video is not available (in your country?)!')
self.report_extraction(video_id)
mobj = re.search(self._VIDEO_URL, iframe_html)
if mobj is None:
mobj = re.search(self._IS_YOUTUBE, iframe_html)
if mobj is None:
raise ExtractorError(u'Cannot extract video url')
youtube_url = unescapeHTML(mobj.group('youtube_url')).replace('\/','/')
self.to_screen(u'YouTube video detected')
return self.url_result(youtube_url,ie='Youtube')
video_url = unescapeHTML(mobj.group('video_url')).replace('\/','/')
video_ext = unescapeHTML(mobj.group('video_ext'))
mobj = re.search(self._TITLE, html)
if mobj is None:
raise ExtractorError(u'Cannot extract title')
title = unescapeHTML(mobj.group('title'))
artist = unescapeHTML(mobj.group('artist'))
return [{'id': video_id,
'url': video_url,
'title': artist + '-' + title,
'ext': video_ext
}]

@@ -0,0 +1,155 @@
import json
import os
import re
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
ExtractorError,
formatSeconds,
)
class JustinTVIE(InfoExtractor):
"""Information extractor for justin.tv and twitch.tv"""
# TODO: One broadcast may be split into multiple videos. The key
# 'broadcast_id' is the same for all parts, and 'broadcast_part'
# starts at 1 and increases. Can we treat all parts as one video?
_VALID_URL = r"""(?x)^(?:http://)?(?:www\.)?(?:twitch|justin)\.tv/
(?:
(?P<channelid>[^/]+)|
(?:(?:[^/]+)/b/(?P<videoid>[^/]+))|
(?:(?:[^/]+)/c/(?P<chapterid>[^/]+))
)
/?(?:\#.*)?$
"""
_JUSTIN_PAGE_LIMIT = 100
IE_NAME = u'justin.tv'
_TEST = {
u'url': u'http://www.twitch.tv/thegamedevhub/b/296128360',
u'file': u'296128360.flv',
u'md5': u'ecaa8a790c22a40770901460af191c9a',
u'info_dict': {
u"upload_date": u"20110927",
u"uploader_id": 25114803,
u"uploader": u"thegamedevhub",
u"title": u"Beginner Series - Scripting With Python Pt.1"
}
}
def report_download_page(self, channel, offset):
"""Report attempt to download a single page of videos."""
self.to_screen(u'%s: Downloading video information from %d to %d' %
(channel, offset, offset + self._JUSTIN_PAGE_LIMIT))
# Return count of items, list of *valid* items
def _parse_page(self, url, video_id):
info_json = self._download_webpage(url, video_id,
u'Downloading video info JSON',
u'unable to download video info JSON')
response = json.loads(info_json)
if not isinstance(response, list):
error_text = response.get('error', 'unknown error')
raise ExtractorError(u'Justin.tv API: %s' % error_text)
info = []
for clip in response:
video_url = clip['video_file_url']
if video_url:
video_extension = os.path.splitext(video_url)[1][1:]
video_date = re.sub('-', '', clip['start_time'][:10])
video_uploader_id = clip.get('user_id', clip.get('channel_id'))
video_id = clip['id']
video_title = clip.get('title', video_id)
info.append({
'id': video_id,
'url': video_url,
'title': video_title,
'uploader': clip.get('channel_name', video_uploader_id),
'uploader_id': video_uploader_id,
'upload_date': video_date,
'ext': video_extension,
})
return (len(response), info)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'invalid URL: %s' % url)
api_base = 'http://api.justin.tv'
paged = False
if mobj.group('channelid'):
paged = True
video_id = mobj.group('channelid')
api = api_base + '/channel/archives/%s.json' % video_id
elif mobj.group('chapterid'):
chapter_id = mobj.group('chapterid')
webpage = self._download_webpage(url, chapter_id)
m = re.search(r'PP\.archive_id = "([0-9]+)";', webpage)
if not m:
raise ExtractorError(u'Cannot find archive of a chapter')
archive_id = m.group(1)
api = api_base + '/broadcast/by_chapter/%s.xml' % chapter_id
chapter_info_xml = self._download_webpage(api, chapter_id,
note=u'Downloading chapter information',
errnote=u'Chapter information download failed')
doc = xml.etree.ElementTree.fromstring(chapter_info_xml)
for a in doc.findall('.//archive'):
if archive_id == a.find('./id').text:
break
else:
raise ExtractorError(u'Could not find chapter in chapter information')
video_url = a.find('./video_file_url').text
video_ext = video_url.rpartition('.')[2] or u'flv'
chapter_api_url = u'https://api.twitch.tv/kraken/videos/c' + chapter_id
chapter_info_json = self._download_webpage(chapter_api_url, u'c' + chapter_id,
note='Downloading chapter metadata',
errnote='Download of chapter metadata failed')
chapter_info = json.loads(chapter_info_json)
bracket_start = int(doc.find('.//bracket_start').text)
bracket_end = int(doc.find('.//bracket_end').text)
# TODO determine start (and probably fix up file)
# youtube-dl -v http://www.twitch.tv/firmbelief/c/1757457
#video_url += u'?start=' + TODO:start_timestamp
# bracket_start is 13290, but we want 51670615
self._downloader.report_warning(u'Chapter detected, but we can only download the whole file. '
u'Chapter starts at %s and ends at %s' % (formatSeconds(bracket_start), formatSeconds(bracket_end)))
info = {
'id': u'c' + chapter_id,
'url': video_url,
'ext': video_ext,
'title': chapter_info['title'],
'thumbnail': chapter_info['preview'],
'description': chapter_info['description'],
'uploader': chapter_info['channel']['display_name'],
'uploader_id': chapter_info['channel']['name'],
}
return [info]
else:
video_id = mobj.group('videoid')
api = api_base + '/broadcast/by_archive/%s.json' % video_id
self.report_extraction(video_id)
info = []
offset = 0
limit = self._JUSTIN_PAGE_LIMIT
while True:
if paged:
self.report_download_page(video_id, offset)
page_url = api + ('?offset=%d&limit=%d' % (offset, limit))
page_count, page_info = self._parse_page(page_url, video_id)
info.extend(page_info)
if not paged or page_count != limit:
break
offset += limit
return info
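
The paging loop at the end of `_real_extract` keeps requesting pages until one comes back with fewer raw entries than the limit. A minimal standalone sketch of that pattern follows, assuming a hypothetical `fetch_page(offset, limit)` callable that returns `(raw_count, valid_items)` the way `_parse_page` does; `fake_page` and its data are stand-ins, not the Justin.tv API.

```python
def fetch_all(fetch_page, limit=100):
    """Collect items page by page until a short page signals the end.

    fetch_page(offset, limit) is a hypothetical callable returning
    (raw_count, valid_items), mirroring _parse_page above.
    """
    items = []
    offset = 0
    while True:
        page_count, page_items = fetch_page(offset, limit)
        items.extend(page_items)
        # A page with fewer raw entries than requested is the last one.
        if page_count != limit:
            break
        offset += limit
    return items


# Stand-in data source: 250 clips, every tenth one lacking a video URL.
CLIPS = [{'id': i, 'url': None if i % 10 == 0 else 'http://example.invalid/%d.flv' % i}
         for i in range(250)]


def fake_page(offset, limit):
    page = CLIPS[offset:offset + limit]
    return len(page), [c for c in page if c['url']]


all_items = fetch_all(fake_page)
```

The stop condition compares the raw count, not the number of valid items, so a full page in which some clips lack a URL does not end the loop early.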

@@ -0,0 +1,41 @@
import re
from .common import InfoExtractor
class KeekIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?keek\.com/(?:!|\w+/keeks/)(?P<videoID>\w+)'
IE_NAME = u'keek'
_TEST = {
u'url': u'http://www.keek.com/ytdl/keeks/NODfbab',
u'file': u'NODfbab.mp4',
u'md5': u'9b0636f8c0f7614afa4ea5e4c6e57e83',
u'info_dict': {
u"uploader": u"ytdl",
u"title": u"test chars: \"'/\\\u00e4<>This is a test video for youtube-dl.For more information, contact phihag@phihag.de ."
}
}
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
video_id = m.group('videoID')
video_url = u'http://cdn.keek.com/keek/video/%s' % video_id
thumbnail = u'http://cdn.keek.com/keek/thumbnail/%s/w100/h75' % video_id
webpage = self._download_webpage(url, video_id)
video_title = self._html_search_regex(r'<meta property="og:title" content="(?P<title>.*?)"',
webpage, u'title')
uploader = self._html_search_regex(r'<div class="user-name-and-bio">[\S\s]+?<h2>(?P<uploader>.+?)</h2>',
webpage, u'uploader', fatal=False)
info = {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': video_title,
'thumbnail': thumbnail,
'uploader': uploader
}
return [info]

@@ -0,0 +1,54 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class LiveLeakIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:\w+\.)?liveleak\.com/view\?(?:.*?)i=(?P<video_id>[\w_]+)(?:.*)'
IE_NAME = u'liveleak'
_TEST = {
u'url': u'http://www.liveleak.com/view?i=757_1364311680',
u'file': u'757_1364311680.mp4',
u'md5': u'0813c2430bea7a46bf13acf3406992f4',
u'info_dict': {
u"description": u"extremely bad day for this guy..!",
u"uploader": u"ljfriel2",
u"title": u"Most unlucky car accident"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group('video_id')
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(r'file: "(.*?)",',
webpage, u'video URL')
video_title = self._html_search_regex(r'<meta property="og:title" content="(?P<title>.*?)"',
webpage, u'title').replace('LiveLeak.com -', '').strip()
video_description = self._html_search_regex(r'<meta property="og:description" content="(?P<desc>.*?)"',
webpage, u'description', fatal=False)
video_uploader = self._html_search_regex(r'By:.*?(\w+)</a>',
webpage, u'uploader', fatal=False)
info = {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': video_title,
'description': video_description,
'uploader': video_uploader
}
return [info]

@@ -0,0 +1,123 @@
import re
import socket
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_parse_qs,
compat_urllib_error,
compat_urllib_parse,
compat_urllib_request,
compat_str,
ExtractorError,
)
class MetacafeIE(InfoExtractor):
"""Information Extractor for metacafe.com."""
_VALID_URL = r'(?:http://)?(?:www\.)?metacafe\.com/watch/([^/]+)/([^/]+)/.*'
_DISCLAIMER = 'http://www.metacafe.com/family_filter/'
_FILTER_POST = 'http://www.metacafe.com/f/index.php?inputType=filter&controllerGroup=user'
IE_NAME = u'metacafe'
_TEST = {
u"add_ie": ["Youtube"],
u"url": u"http://metacafe.com/watch/yt-_aUehQsCQtM/the_electric_company_short_i_pbs_kids_go/",
u"file": u"_aUehQsCQtM.flv",
u"info_dict": {
u"upload_date": u"20090102",
u"title": u"The Electric Company | \"Short I\" | PBS KIDS GO!",
u"description": u"md5:2439a8ef6d5a70e380c22f5ad323e5a8",
u"uploader": u"PBS",
u"uploader_id": u"PBS"
}
}
def report_disclaimer(self):
"""Report disclaimer retrieval."""
self.to_screen(u'Retrieving disclaimer')
def _real_initialize(self):
# Retrieve disclaimer
request = compat_urllib_request.Request(self._DISCLAIMER)
try:
self.report_disclaimer()
compat_urllib_request.urlopen(request).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to retrieve disclaimer: %s' % compat_str(err))
# Confirm age
disclaimer_form = {
'filters': '0',
'submit': "Continue - I'm over 18",
}
request = compat_urllib_request.Request(self._FILTER_POST, compat_urllib_parse.urlencode(disclaimer_form))
try:
self.report_age_confirmation()
compat_urllib_request.urlopen(request).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to confirm age: %s' % compat_str(err))
def _real_extract(self, url):
# Extract id and simplified title from URL
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group(1)
# Check if video comes from YouTube
mobj2 = re.match(r'^yt-(.*)$', video_id)
if mobj2 is not None:
return [self.url_result('http://www.youtube.com/watch?v=%s' % mobj2.group(1), 'Youtube')]
# Retrieve video webpage to extract further information
webpage = self._download_webpage('http://www.metacafe.com/watch/%s/' % video_id, video_id)
# Extract URL, uploader and title from webpage
self.report_extraction(video_id)
mobj = re.search(r'(?m)&mediaURL=([^&]+)', webpage)
if mobj is not None:
mediaURL = compat_urllib_parse.unquote(mobj.group(1))
video_extension = mediaURL[-3:]
# Extract gdaKey if available
mobj = re.search(r'(?m)&gdaKey=(.*?)&', webpage)
if mobj is None:
video_url = mediaURL
else:
gdaKey = mobj.group(1)
video_url = '%s?__gda__=%s' % (mediaURL, gdaKey)
else:
mobj = re.search(r' name="flashvars" value="(.*?)"', webpage)
if mobj is None:
raise ExtractorError(u'Unable to extract media URL')
vardict = compat_parse_qs(mobj.group(1))
if 'mediaData' not in vardict:
raise ExtractorError(u'Unable to extract media URL')
mobj = re.search(r'"mediaURL":"(?P<mediaURL>http.*?)",(.*?)"key":"(?P<key>.*?)"', vardict['mediaData'][0])
if mobj is None:
raise ExtractorError(u'Unable to extract media URL')
mediaURL = mobj.group('mediaURL').replace('\\/', '/')
video_extension = mediaURL[-3:]
video_url = '%s?__gda__=%s' % (mediaURL, mobj.group('key'))
mobj = re.search(r'(?im)<title>(.*) - Video</title>', webpage)
if mobj is None:
raise ExtractorError(u'Unable to extract title')
video_title = mobj.group(1)
mobj = re.search(r'submitter=(.*?);', webpage)
if mobj is None:
raise ExtractorError(u'Unable to extract uploader nickname')
video_uploader = mobj.group(1)
return [{
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'upload_date': None,
'title': video_title,
'ext': video_extension,
}]

@@ -0,0 +1,115 @@
import json
import re
import socket
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_request,
ExtractorError,
)
class MixcloudIE(InfoExtractor):
_WORKING = False # New API, but it seems good http://www.mixcloud.com/developers/documentation/
_VALID_URL = r'^(?:https?://)?(?:www\.)?mixcloud\.com/([\w\d-]+)/([\w\d-]+)'
IE_NAME = u'mixcloud'
def report_download_json(self, file_id):
"""Report JSON download."""
self.to_screen(u'Downloading json')
def get_urls(self, jsonData, fmt, bitrate='best'):
"""Get urls from 'audio_formats' section in json"""
try:
bitrate_list = jsonData[fmt]
if bitrate is None or bitrate == 'best' or bitrate not in bitrate_list:
bitrate = max(bitrate_list) # select highest
url_list = jsonData[fmt][bitrate]
except TypeError: # we have no bitrate info.
url_list = jsonData[fmt]
return url_list
def check_urls(self, url_list):
"""Returns 1st active url from list"""
for url in url_list:
try:
compat_urllib_request.urlopen(url)
return url
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error):
url = None
return None
def _print_formats(self, formats):
print('Available formats:')
for fmt in formats.keys():
for b in formats[fmt]:
try:
ext = formats[fmt][b][0]
print('%s\t%s\t[%s]' % (fmt, b, ext.split('.')[-1]))
except TypeError: # we have no bitrate info
ext = formats[fmt][0]
print('%s\t%s\t[%s]' % (fmt, '??', ext.split('.')[-1]))
break
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
# extract uploader & filename from url
uploader = mobj.group(1)
file_id = uploader + "-" + mobj.group(2)
# construct API request
file_url = 'http://www.mixcloud.com/api/1/cloudcast/' + '/'.join(url.split('/')[-3:-1]) + '.json'
# retrieve .json file with links to files
request = compat_urllib_request.Request(file_url)
try:
self.report_download_json(file_url)
jsonData = compat_urllib_request.urlopen(request).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to retrieve file: %s' % compat_str(err))
# parse JSON
json_data = json.loads(jsonData)
player_url = json_data['player_swf_url']
formats = dict(json_data['audio_formats'])
req_format = self._downloader.params.get('format', None)
if self._downloader.params.get('listformats', None):
self._print_formats(formats)
return
if req_format is None or req_format == 'best':
for format_param in formats.keys():
url_list = self.get_urls(formats, format_param)
# check urls
file_url = self.check_urls(url_list)
if file_url is not None:
break # got it!
else:
if req_format not in formats:
raise ExtractorError(u'Format is not available')
url_list = self.get_urls(formats, req_format)
file_url = self.check_urls(url_list)
format_param = req_format
return [{
'id': file_id,
'url': file_url,
'uploader': uploader,
'upload_date': None,
'title': json_data['name'],
'ext': file_url.split('.')[-1],
'format': (u'NA' if format_param is None else format_param),
'thumbnail': json_data['thumbnail_url'],
'description': json_data['description'],
'player_url': player_url,
}]

@@ -0,0 +1,80 @@
import re
import socket
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_request,
ExtractorError,
)
class MTVIE(InfoExtractor):
_VALID_URL = r'^(?P<proto>https?://)?(?:www\.)?mtv\.com/videos/[^/]+/(?P<videoid>[0-9]+)/[^/]+$'
_WORKING = False
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
if not mobj.group('proto'):
url = 'http://' + url
video_id = mobj.group('videoid')
webpage = self._download_webpage(url, video_id)
# Some videos come from Vevo.com
m_vevo = re.search(r'isVevoVideo = true;.*?vevoVideoId = "(.*?)";',
webpage, re.DOTALL)
if m_vevo:
vevo_id = m_vevo.group(1)
self.to_screen(u'Vevo video detected: %s' % vevo_id)
return self.url_result('vevo:%s' % vevo_id, ie='Vevo')
#song_name = self._html_search_regex(r'<meta name="mtv_vt" content="([^"]+)"/>',
# webpage, u'song name', fatal=False)
video_title = self._html_search_regex(r'<meta name="mtv_an" content="([^"]+)"/>',
webpage, u'title')
mtvn_uri = self._html_search_regex(r'<meta name="mtvn_uri" content="([^"]+)"/>',
webpage, u'mtvn_uri', fatal=False)
content_id = self._search_regex(r'MTVN.Player.defaultPlaylistId = ([0-9]+);',
webpage, u'content id', fatal=False)
videogen_url = 'http://www.mtv.com/player/includes/mediaGen.jhtml?uri=' + mtvn_uri + '&id=' + content_id + '&vid=' + video_id + '&ref=www.mtvn.com&viewUri=' + mtvn_uri
self.report_extraction(video_id)
request = compat_urllib_request.Request(videogen_url)
try:
metadataXml = compat_urllib_request.urlopen(request).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to download video metadata: %s' % compat_str(err))
mdoc = xml.etree.ElementTree.fromstring(metadataXml)
renditions = mdoc.findall('.//rendition')
# For now, always pick the highest quality.
rendition = renditions[-1]
try:
_,_,ext = rendition.attrib['type'].partition('/')
format = ext + '-' + rendition.attrib['width'] + 'x' + rendition.attrib['height'] + '_' + rendition.attrib['bitrate']
video_url = rendition.find('./src').text
except KeyError:
raise ExtractorError('Invalid rendition field.')
info = {
'id': video_id,
'url': video_url,
'upload_date': None,
'title': video_title,
'ext': ext,
'format': format,
}
return [info]

@@ -0,0 +1,73 @@
import os.path
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse_urlparse,
ExtractorError,
)
class MySpassIE(InfoExtractor):
_VALID_URL = r'http://www\.myspass\.de/.*'
_TEST = {
u'url': u'http://www.myspass.de/myspass/shows/tvshows/absolute-mehrheit/Absolute-Mehrheit-vom-17022013-Die-Highlights-Teil-2--/11741/',
u'file': u'11741.mp4',
u'md5': u'0b49f4844a068f8b33f4b7c88405862b',
u'info_dict': {
u"description": u"Wer kann in die Fu\u00dfstapfen von Wolfgang Kubicki treten und die Mehrheit der Zuschauer hinter sich versammeln? Wird vielleicht sogar die Absolute Mehrheit geknackt und der Jackpot von 200.000 Euro mit nach Hause genommen?",
u"title": u"Absolute Mehrheit vom 17.02.2013 - Die Highlights, Teil 2"
}
}
def _real_extract(self, url):
META_DATA_URL_TEMPLATE = 'http://www.myspass.de/myspass/includes/apps/video/getvideometadataxml.php?id=%s'
# video id is the last path element of the URL
# usually there is a trailing slash, so also try the second to last
url_path = compat_urllib_parse_urlparse(url).path
url_parent_path, video_id = os.path.split(url_path)
if not video_id:
_, video_id = os.path.split(url_parent_path)
# get metadata
metadata_url = META_DATA_URL_TEMPLATE % video_id
metadata_text = self._download_webpage(metadata_url, video_id)
metadata = xml.etree.ElementTree.fromstring(metadata_text.encode('utf-8'))
# extract values from metadata
url_flv_el = metadata.find('url_flv')
if url_flv_el is None:
raise ExtractorError(u'Unable to extract download url')
video_url = url_flv_el.text
extension = os.path.splitext(video_url)[1][1:]
title_el = metadata.find('title')
if title_el is None:
raise ExtractorError(u'Unable to extract title')
title = title_el.text
format_id_el = metadata.find('format_id')
if format_id_el is None:
format = 'mp4'
else:
format = format_id_el.text
description_el = metadata.find('description')
if description_el is not None:
description = description_el.text
else:
description = None
imagePreview_el = metadata.find('imagePreview')
if imagePreview_el is not None:
thumbnail = imagePreview_el.text
else:
thumbnail = None
info = {
'id': video_id,
'url': video_url,
'title': title,
'ext': extension,
'format': format,
'thumbnail': thumbnail,
'description': description
}
return [info]
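
The video id lookup in MySpassIE takes the last path element and falls back one level when the URL ends with a slash. A Python 3 sketch of that step follows; `last_path_element` is a hypothetical helper name, and `posixpath` is used here instead of `os.path` so the result does not depend on the platform's path separator.

```python
import posixpath
from urllib.parse import urlparse


def last_path_element(url):
    """Hypothetical helper: return the last non-empty path element of a URL."""
    path = urlparse(url).path
    parent, name = posixpath.split(path)
    if not name:
        # Trailing slash: the element we want is one level up.
        _, name = posixpath.split(parent)
    return name
```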

@@ -0,0 +1,172 @@
import binascii
import base64
import hashlib
import re
from .common import InfoExtractor
from ..utils import (
compat_ord,
compat_urllib_parse,
ExtractorError,
)
class MyVideoIE(InfoExtractor):
"""Information Extractor for myvideo.de."""
_VALID_URL = r'(?:http://)?(?:www\.)?myvideo\.de/watch/([0-9]+)/([^?/]+).*'
IE_NAME = u'myvideo'
_TEST = {
u'url': u'http://www.myvideo.de/watch/8229274/bowling_fail_or_win',
u'file': u'8229274.flv',
u'md5': u'2d2753e8130479ba2cb7e0a37002053e',
u'info_dict': {
u"title": u"bowling-fail-or-win"
}
}
# Original Code from: https://github.com/dersphere/plugin.video.myvideo_de.git
# Released into the Public Domain by Tristan Fischer on 2013-05-19
# https://github.com/rg3/youtube-dl/pull/842
def __rc4crypt(self,data, key):
x = 0
box = list(range(256))
for i in list(range(256)):
x = (x + box[i] + compat_ord(key[i % len(key)])) % 256
box[i], box[x] = box[x], box[i]
x = 0
y = 0
out = ''
for char in data:
x = (x + 1) % 256
y = (y + box[x]) % 256
box[x], box[y] = box[y], box[x]
out += chr(compat_ord(char) ^ box[(box[x] + box[y]) % 256])
return out
def __md5(self,s):
return hashlib.md5(s).hexdigest().encode()
def _real_extract(self,url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'invalid URL: %s' % url)
video_id = mobj.group(1)
GK = (
b'WXpnME1EZGhNRGhpTTJNM01XVmhOREU0WldNNVpHTTJOakpt'
b'TW1FMU5tVTBNR05pWkRaa05XRXhNVFJoWVRVd1ptSXhaVEV3'
b'TnpsbA0KTVRkbU1tSTRNdz09'
)
# Get video webpage
webpage_url = 'http://www.myvideo.de/watch/%s' % video_id
webpage = self._download_webpage(webpage_url, video_id)
mobj = re.search('source src=\'(.+?)[.]([^.]+)\'', webpage)
if mobj is not None:
self.report_extraction(video_id)
video_url = mobj.group(1) + '.flv'
video_title = self._html_search_regex('<title>([^<]+)</title>',
webpage, u'title')
video_ext = self._search_regex('[.](.+?)$', video_url, u'extension')
return [{
'id': video_id,
'url': video_url,
'uploader': None,
'upload_date': None,
'title': video_title,
'ext': video_ext,
}]
# try encxml
mobj = re.search('var flashvars={(.+?)}', webpage)
if mobj is None:
raise ExtractorError(u'Unable to extract video')
params = {}
encxml = ''
sec = mobj.group(1)
for (a, b) in re.findall('(.+?):\'(.+?)\',?', sec):
if a != '_encxml':
params[a] = b
else:
encxml = compat_urllib_parse.unquote(b)
if not params.get('domain'):
params['domain'] = 'www.myvideo.de'
xmldata_url = '%s?%s' % (encxml, compat_urllib_parse.urlencode(params))
if 'flash_playertype=MTV' in xmldata_url:
self._downloader.report_warning(u'avoiding MTV player')
xmldata_url = (
'http://www.myvideo.de/dynamic/get_player_video_xml.php'
'?flash_playertype=D&ID=%s&_countlimit=4&autorun=yes'
) % video_id
# get enc data
enc_data = self._download_webpage(xmldata_url, video_id).split('=')[1]
enc_data_b = binascii.unhexlify(enc_data)
sk = self.__md5(
base64.b64decode(base64.b64decode(GK)) +
self.__md5(
str(video_id).encode('utf-8')
)
)
dec_data = self.__rc4crypt(enc_data_b, sk)
# extracting infos
self.report_extraction(video_id)
video_url = None
mobj = re.search('connectionurl=\'(.*?)\'', dec_data)
if mobj:
video_url = compat_urllib_parse.unquote(mobj.group(1))
if 'myvideo2flash' in video_url:
self._downloader.report_warning(u'forcing RTMPT ...')
video_url = video_url.replace('rtmpe://', 'rtmpt://')
if not video_url:
# extract non rtmp videos
mobj = re.search('path=\'(http.*?)\' source=\'(.*?)\'', dec_data)
if mobj is None:
raise ExtractorError(u'unable to extract url')
video_url = compat_urllib_parse.unquote(mobj.group(1)) + compat_urllib_parse.unquote(mobj.group(2))
video_file = self._search_regex('source=\'(.*?)\'', dec_data, u'video file')
video_file = compat_urllib_parse.unquote(video_file)
if not video_file.endswith('f4m'):
ppath, prefix = video_file.split('.')
video_playpath = '%s:%s' % (prefix, ppath)
video_hls_playlist = ''
else:
video_playpath = ''
video_hls_playlist = (
video_file
).replace('.f4m', '.m3u8')
video_swfobj = self._search_regex(r"swfobject\.embedSWF\('(.+?)'", webpage, u'swfobj')
video_swfobj = compat_urllib_parse.unquote(video_swfobj)
video_title = self._html_search_regex("<h1(?: class='globalHd')?>(.*?)</h1>",
webpage, u'title')
return [{
'id': video_id,
'url': video_url,
'tc_url': video_url,
'uploader': None,
'upload_date': None,
'title': video_title,
'ext': u'flv',
'play_path': video_playpath,
'video_file': video_file,
'video_hls_playlist': video_hls_playlist,
'player_url': video_swfobj,
}]
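
The `__rc4crypt` method above is a plain RC4 implementation: a key-scheduling pass over a 256-entry box, then a keystream XORed into the data. A standalone Python 3 sketch operating on `bytes` follows; the function name `rc4` is an assumption for illustration, and the key derivation used by the extractor (nested MD5 over the decoded `GK` constant and the video id) is deliberately left out.

```python
def rc4(data, key):
    """RC4 over bytes; the same function encrypts and decrypts."""
    # Key scheduling: permute the box using the key bytes.
    box = list(range(256))
    x = 0
    for i in range(256):
        x = (x + box[i] + key[i % len(key)]) % 256
        box[i], box[x] = box[x], box[i]
    # Keystream generation, XORed into the data byte by byte.
    x = y = 0
    out = bytearray()
    for byte in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(byte ^ box[(box[x] + box[y]) % 256])
    return bytes(out)
```

Because RC4 is a stream cipher, applying `rc4` twice with the same key returns the original input, which is why the extractor needs only one routine to decrypt the player XML.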

@@ -0,0 +1,49 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class NBAIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:watch\.|www\.)?nba\.com/(?:nba/)?video(/[^?]*?)(?:/index\.html)?(?:\?.*)?$'
_TEST = {
u'url': u'http://www.nba.com/video/games/nets/2012/12/04/0021200253-okc-bkn-recap.nba/index.html',
u'file': u'0021200253-okc-bkn-recap.nba.mp4',
u'md5': u'c0edcfc37607344e2ff8f13c378c88a4',
u'info_dict': {
u"description": u"Kevin Durant scores 32 points and dishes out six assists as the Thunder beat the Nets in Brooklyn.",
u"title": u"Thunder vs. Nets"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group(1)
webpage = self._download_webpage(url, video_id)
video_url = u'http://ht-mobile.cdn.turner.com/nba/big' + video_id + '_nba_1280x720.mp4'
shortened_video_id = video_id.rpartition('/')[2]
title = self._html_search_regex(r'<meta property="og:title" content="(.*?)"',
webpage, 'title', default=shortened_video_id).replace('NBA.com: ', '')
# It isn't there in the HTML it returns to us
# uploader_date = self._html_search_regex(r'<b>Date:</b> (.*?)</div>', webpage, 'upload_date', fatal=False)
description = self._html_search_regex(r'<meta name="description" (?:content|value)="(.*?)" />', webpage, 'description', fatal=False)
info = {
'id': shortened_video_id,
'url': video_url,
'ext': 'mp4',
'title': title,
# 'uploader_date': uploader_date,
'description': description,
}
return [info]

@@ -0,0 +1,76 @@
import datetime
import json
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
)
class PhotobucketIE(InfoExtractor):
"""Information extractor for photobucket.com."""
# TODO: the original _VALID_URL was:
# r'(?:http://)?(?:[a-z0-9]+\.)?photobucket\.com/.*[\?\&]current=(.*\.flv)'
# Check if it's necessary to keep the old extraction process
_VALID_URL = r'(?:http://)?(?:[a-z0-9]+\.)?photobucket\.com/.*(([\?\&]current=)|_)(?P<id>.*)\.(?P<ext>(flv)|(mp4))'
IE_NAME = u'photobucket'
_TEST = {
u'url': u'http://media.photobucket.com/user/rachaneronas/media/TiredofLinkBuildingTryBacklinkMyDomaincom_zpsc0c3b9fa.mp4.html?filters[term]=search&filters[primary]=videos&filters[secondary]=images&sort=1&o=0',
u'file': u'zpsc0c3b9fa.mp4',
u'md5': u'7dabfb92b0a31f6c16cebc0f8e60ff99',
u'info_dict': {
u"upload_date": u"20130504",
u"uploader": u"rachaneronas",
u"title": u"Tired of Link Building? Try BacklinkMyDomain.com!"
}
}
def _real_extract(self, url):
# Extract id from URL
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
video_id = mobj.group('id')
video_extension = mobj.group('ext')
# Retrieve video webpage to extract further information
webpage = self._download_webpage(url, video_id)
# Extract URL, uploader, and title from webpage
self.report_extraction(video_id)
# First, try looking in the JavaScript code:
mobj = re.search(r'Pb\.Data\.Shared\.put\(Pb\.Data\.Shared\.MEDIA, (?P<json>.*?)\);', webpage)
if mobj is not None:
info = json.loads(mobj.group('json'))
return [{
'id': video_id,
'url': info[u'downloadUrl'],
'uploader': info[u'username'],
'upload_date': datetime.date.fromtimestamp(info[u'creationDate']).strftime('%Y%m%d'),
'title': info[u'title'],
'ext': video_extension,
'thumbnail': info[u'thumbUrl'],
}]
# We try looking in other parts of the webpage
video_url = self._search_regex(r'<link rel="video_src" href=".*\?file=([^"]+)" />',
webpage, u'video URL')
mobj = re.search(r'<title>(.*) video by (.*) - Photobucket</title>', webpage)
if mobj is None:
raise ExtractorError(u'Unable to extract title')
video_title = mobj.group(1)
video_uploader = mobj.group(2)
return [{
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'upload_date': None,
'title': video_title,
'ext': video_extension,
}]
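
As a quick check of the new `_VALID_URL`, the sketch below runs the pattern from this hunk against the URL from `_TEST` and returns the captured `id` and `ext` groups. It relies only on the standard `re` module; the helper name `split_photobucket_url` is made up for illustration and is not part of the extractor.

```python
import re

# Pattern copied from PhotobucketIE._VALID_URL in this diff.
VALID_URL = (r'(?:http://)?(?:[a-z0-9]+\.)?photobucket\.com/.*'
             r'(([\?\&]current=)|_)(?P<id>.*)\.(?P<ext>(flv)|(mp4))')

# URL copied from the _TEST entry of the extractor.
TEST_URL = ('http://media.photobucket.com/user/rachaneronas/media/'
            'TiredofLinkBuildingTryBacklinkMyDomaincom_zpsc0c3b9fa.mp4.html'
            '?filters[term]=search&filters[primary]=videos'
            '&filters[secondary]=images&sort=1&o=0')


def split_photobucket_url(url):
    """Hypothetical helper: return (id, ext), or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return mobj.group('id'), mobj.group('ext')


if __name__ == '__main__':
    print(split_photobucket_url(TEST_URL))
```

For the test URL the two groups join to `zpsc0c3b9fa.mp4`, which is the `file` value expected by `_TEST`. The greedy `.*` before the `_` alternative means the id starts after the last underscore in the URL.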

@@ -0,0 +1,50 @@
import re
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
unified_strdate,
)
class PornotubeIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:\w+\.)?pornotube\.com(/c/(?P<channel>[0-9]+))?(/m/(?P<videoid>[0-9]+))(/(?P<title>.+))$'
_TEST = {
u'url': u'http://pornotube.com/c/173/m/1689755/Marilyn-Monroe-Bathing',
u'file': u'1689755.flv',
u'md5': u'374dd6dcedd24234453b295209aa69b6',
u'info_dict': {
u"upload_date": u"20090708",
u"title": u"Marilyn-Monroe-Bathing"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('videoid')
video_title = mobj.group('title')
# Get webpage content
webpage = self._download_webpage(url, video_id)
# Get the video URL
VIDEO_URL_RE = r'url: "(?P<url>http://video[0-9]\.pornotube\.com/.+\.flv)",'
video_url = self._search_regex(VIDEO_URL_RE, webpage, u'video url')
video_url = compat_urllib_parse.unquote(video_url)
# Get the upload date
VIDEO_UPLOADED_RE = r'<div class="video_added_by">Added (?P<date>[0-9\/]+) by'
upload_date = self._html_search_regex(VIDEO_UPLOADED_RE, webpage, u'upload date', fatal=False)
if upload_date:
upload_date = unified_strdate(upload_date)
info = {'id': video_id,
'url': video_url,
'uploader': None,
'upload_date': upload_date,
'title': video_title,
'ext': 'flv',
'format': 'flv'}
return [info]

@@ -0,0 +1,56 @@
import json
import re
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse_urlparse,
ExtractorError,
)
class RBMARadioIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rbmaradio\.com/shows/(?P<videoID>[^/]+)$'
_TEST = {
u'url': u'http://www.rbmaradio.com/shows/ford-lopatin-live-at-primavera-sound-2011',
u'file': u'ford-lopatin-live-at-primavera-sound-2011.mp3',
u'md5': u'6bc6f9bcb18994b4c983bc3bf4384d95',
u'info_dict': {
u"uploader_id": u"ford-lopatin",
u"location": u"Spain",
u"description": u"Joel Ford and Daniel \u2019Oneohtrix Point Never\u2019 Lopatin fly their midified pop extravaganza to Spain. Live at Primavera Sound 2011.",
u"uploader": u"Ford & Lopatin",
u"title": u"Live at Primavera Sound 2011"
}
}
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
video_id = m.group('videoID')
webpage = self._download_webpage(url, video_id)
json_data = self._search_regex(r'window\.gon.*?gon\.show=(.+?);$',
webpage, u'json data', flags=re.MULTILINE)
try:
data = json.loads(json_data)
except ValueError as e:
raise ExtractorError(u'Invalid JSON: ' + str(e))
video_url = data['akamai_url'] + '&cbr=256'
url_parts = compat_urllib_parse_urlparse(video_url)
video_ext = url_parts.path.rpartition('.')[2]
info = {
'id': video_id,
'url': video_url,
'ext': video_ext,
'title': data['title'],
'description': data.get('teaser_text'),
'location': data.get('country_of_origin'),
'uploader': data.get('host', {}).get('name'),
'uploader_id': data.get('host', {}).get('slug'),
'thumbnail': data.get('image', {}).get('large_url_2x'),
'duration': data.get('duration'),
}
return [info]
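
The extraction above can be tried in isolation. The sketch below applies the same regular expression and `rpartition` logic to a made-up page snippet; `SAMPLE_PAGE`, its field values and the function name `parse_show` are assumptions for illustration, and `urllib.parse.urlparse` (Python 3) stands in for `compat_urllib_parse_urlparse`.

```python
import json
import re
from urllib.parse import urlparse

# Invented page content; the real site layout may differ.
SAMPLE_PAGE = '''<script>
window.gon = {};gon.show={"title": "Example Show", "akamai_url": "http://example.invalid/audio/show.mp3?x=1", "host": {"name": "Example Host", "slug": "example-host"}};
</script>'''


def parse_show(page):
    """Hypothetical helper mirroring the steps of RBMARadioIE._real_extract."""
    # Same pattern as in the extractor; with re.MULTILINE, "$" matches
    # at the end of the line that carries the JSON blob.
    m = re.search(r'window\.gon.*?gon\.show=(.+?);$', page, flags=re.MULTILINE)
    if m is None:
        raise ValueError('show JSON not found')
    data = json.loads(m.group(1))
    video_url = data['akamai_url'] + '&cbr=256'
    # Take the extension from the URL path only, ignoring the query string.
    ext = urlparse(video_url).path.rpartition('.')[2]
    return {
        'url': video_url,
        'ext': ext,
        'title': data['title'],
        'uploader': data.get('host', {}).get('name'),
    }


if __name__ == '__main__':
    print(parse_show(SAMPLE_PAGE))
```

Because `.` does not match a newline, the pattern only works when `window.gon` and `gon.show=` sit on the same line of the page, as they do in the sample.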

@@ -0,0 +1,37 @@
import re
from .common import InfoExtractor
class RedTubeIE(InfoExtractor):
_VALID_URL = r'(?:http://)?(?:www\.)?redtube\.com/(?P<id>[0-9]+)'
_TEST = {
u'url': u'http://www.redtube.com/66418',
u'file': u'66418.mp4',
u'md5': u'7b8c22b5e7098a3e1c09709df1126d2d',
u'info_dict': {
u"title": u"Sucked on a toilet"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_extension = 'mp4'
webpage = self._download_webpage(url, video_id)
self.report_extraction(video_id)
video_url = self._html_search_regex(r'<source src="(.+?)" type="video/mp4">',
webpage, u'video URL')
video_title = self._html_search_regex(r'<h1 class="videoTitle slidePanelMovable">(.+?)</h1>',
webpage, u'title')
return [{
'id': video_id,
'url': video_url,
'ext': video_extension,
'title': video_title,
}]

@@ -0,0 +1,37 @@
import re
from .common import InfoExtractor
class RingTVIE(InfoExtractor):
_VALID_URL = r'(?:http://)?(?:www\.)?ringtv\.craveonline\.com/videos/video/([^/]+)'
_TEST = {
u"url": u"http://ringtv.craveonline.com/videos/video/746619-canelo-alvarez-talks-about-mayweather-showdown",
u"file": u"746619.mp4",
u"md5": u"7c46b4057d22de32e0a539f017e64ad3",
u"info_dict": {
u"title": u"Canelo Alvarez talks about Mayweather showdown",
u"description": u"Saul \\\"Canelo\\\" Alvarez spoke to the media about his Sept. 14 showdown with Floyd Mayweather after their kick-off presser in NYC. Canelo is motivated and confident that he will have the speed and gameplan to beat the pound-for-pound king."
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1).split('-')[0]
webpage = self._download_webpage(url, video_id)
title = self._search_regex(r'<title>(.+?)</title>',
webpage, 'video title').replace(' | RingTV', '')
description = self._search_regex(r'<div class="blurb">(.+?)</div>',
webpage, 'Description')
final_url = "http://ringtv.craveonline.springboardplatform.com/storage/ringtv.craveonline.com/conversion/%s.mp4" % video_id
thumbnail_url = "http://ringtv.craveonline.springboardplatform.com/storage/ringtv.craveonline.com/snapshots/%s.jpg" % video_id
ext = final_url.split('.')[-1]
return [{
'id': video_id,
'url': final_url,
'ext': ext,
'title': title,
'thumbnail': thumbnail_url,
'description': description,
}]
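
The media and thumbnail URLs here are not read from the page; they are derived from the numeric prefix of the URL slug. A minimal sketch of that derivation follows, with the made-up names `STORAGE` and `build_ringtv_urls`; it assumes, as the extractor does, that the storage host keeps this layout.

```python
# Base path copied from the two URL templates in the extractor.
STORAGE = ('http://ringtv.craveonline.springboardplatform.com'
           '/storage/ringtv.craveonline.com')


def build_ringtv_urls(slug):
    """Hypothetical helper: derive id, media URL and thumbnail URL from a slug."""
    # The id is everything before the first dash of the slug.
    video_id = slug.split('-')[0]
    return {
        'id': video_id,
        'url': '%s/conversion/%s.mp4' % (STORAGE, video_id),
        'thumbnail': '%s/snapshots/%s.jpg' % (STORAGE, video_id),
    }


if __name__ == '__main__':
    print(build_ringtv_urls('746619-canelo-alvarez-talks-about-mayweather-showdown'))
```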

@@ -0,0 +1,204 @@
import json
import re
from .common import InfoExtractor
from ..utils import (
compat_str,
ExtractorError,
unified_strdate,
)
class SoundcloudIE(InfoExtractor):
"""Information extractor for soundcloud.com
To access the media, the uid of the song and a stream token
must be extracted from the page source and the script must make
a request to media.soundcloud.com/crossdomain.xml. Then
the media can be grabbed by requesting a URL composed
of the stream token and uid.
"""
_VALID_URL = r'^(?:https?://)?(?:www\.)?soundcloud\.com/([\w\d-]+)/([\w\d-]+)(?:[?].*)?$'
IE_NAME = u'soundcloud'
_TEST = {
u'url': u'http://soundcloud.com/ethmusic/lostin-powers-she-so-heavy',
u'file': u'62986583.mp3',
u'md5': u'ebef0a451b909710ed1d7787dddbf0d7',
u'info_dict': {
u"upload_date": u"20121011",
u"description": u"No Downloads untill we record the finished version this weekend, i was too pumped n i had to post it , earl is prolly gonna b hella p.o'd",
u"uploader": u"E.T. ExTerrestrial Music",
u"title": u"Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1"
}
}
def report_resolve(self, video_id):
"""Report that the id is being resolved."""
self.to_screen(u'%s: Resolving id' % video_id)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
# extract uploader (which is in the url)
uploader = mobj.group(1)
# extract simple title (uploader + slug of song title)
slug_title = mobj.group(2)
full_title = '%s/%s' % (uploader, slug_title)
self.report_resolve(full_title)
url = 'http://soundcloud.com/%s/%s' % (uploader, slug_title)
resolv_url = 'http://api.soundcloud.com/resolve.json?url=' + url + '&client_id=b45b1aa10f1ac2941910a7f0d10f8e28'
info_json = self._download_webpage(resolv_url, full_title, u'Downloading info JSON')
info = json.loads(info_json)
video_id = info['id']
self.report_extraction(full_title)
streams_url = 'https://api.sndcdn.com/i1/tracks/' + str(video_id) + '/streams?client_id=b45b1aa10f1ac2941910a7f0d10f8e28'
stream_json = self._download_webpage(streams_url, full_title,
u'Downloading stream definitions',
u'unable to download stream definitions')
streams = json.loads(stream_json)
mediaURL = streams['http_mp3_128_url']
upload_date = unified_strdate(info['created_at'])
return [{
'id': info['id'],
'url': mediaURL,
'uploader': info['user']['username'],
'upload_date': upload_date,
'title': info['title'],
'ext': u'mp3',
'description': info['description'],
}]
class SoundcloudSetIE(InfoExtractor):
"""Information extractor for soundcloud.com sets
To access the media, the uid of the song and a stream token
must be extracted from the page source and the script must make
a request to media.soundcloud.com/crossdomain.xml. Then
the media can be grabbed by requesting a URL composed
of the stream token and uid.
"""
_VALID_URL = r'^(?:https?://)?(?:www\.)?soundcloud\.com/([\w\d-]+)/sets/([\w\d-]+)(?:[?].*)?$'
IE_NAME = u'soundcloud:set'
_TEST = {
u"url":"https://soundcloud.com/the-concept-band/sets/the-royal-concept-ep",
u"playlist": [
{
u"file":"30510138.mp3",
u"md5":"f9136bf103901728f29e419d2c70f55d",
u"info_dict": {
u"upload_date": u"20111213",
u"description": u"The Royal Concept from Stockholm\r\nFilip / Povel / David / Magnus\r\nwww.royalconceptband.com",
u"uploader": u"The Royal Concept",
u"title": u"D-D-Dance"
}
},
{
u"file":"47127625.mp3",
u"md5":"09b6758a018470570f8fd423c9453dd8",
u"info_dict": {
u"upload_date": u"20120521",
u"description": u"The Royal Concept from Stockholm\r\nFilip / Povel / David / Magnus\r\nwww.royalconceptband.com",
u"uploader": u"The Royal Concept",
u"title": u"The Royal Concept - Gimme Twice"
}
},
{
u"file":"47127627.mp3",
u"md5":"154abd4e418cea19c3b901f1e1306d9c",
u"info_dict": {
u"upload_date": u"20120521",
u"uploader": u"The Royal Concept",
u"title": u"Goldrushed"
}
},
{
u"file":"47127629.mp3",
u"md5":"2f5471edc79ad3f33a683153e96a79c1",
u"info_dict": {
u"upload_date": u"20120521",
u"description": u"The Royal Concept from Stockholm\r\nFilip / Povel / David / Magnus\r\nwww.royalconceptband.com",
u"uploader": u"The Royal Concept",
u"title": u"In the End"
}
},
{
u"file":"47127631.mp3",
u"md5":"f9ba87aa940af7213f98949254f1c6e2",
u"info_dict": {
u"upload_date": u"20120521",
u"description": u"The Royal Concept from Stockholm\r\nFilip / David / Povel / Magnus\r\nwww.theroyalconceptband.com",
u"uploader": u"The Royal Concept",
u"title": u"Knocked Up"
}
},
{
u"file":"75206121.mp3",
u"md5":"f9d1fe9406717e302980c30de4af9353",
u"info_dict": {
u"upload_date": u"20130116",
u"description": u"The unreleased track World on Fire premiered on the CW's hit show Arrow (8pm/7pm central). \r\nAs a gift to our fans we would like to offer you a free download of the track! ",
u"uploader": u"The Royal Concept",
u"title": u"World On Fire"
}
}
]
}
def report_resolve(self, video_id):
"""Report that the id is being resolved."""
self.to_screen(u'%s: Resolving id' % video_id)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
# extract uploader (which is in the url)
uploader = mobj.group(1)
# extract simple title (uploader + slug of set title)
slug_title = mobj.group(2)
full_title = '%s/sets/%s' % (uploader, slug_title)
self.report_resolve(full_title)
url = 'http://soundcloud.com/%s/sets/%s' % (uploader, slug_title)
resolv_url = 'http://api.soundcloud.com/resolve.json?url=' + url + '&client_id=b45b1aa10f1ac2941910a7f0d10f8e28'
info_json = self._download_webpage(resolv_url, full_title)
videos = []
info = json.loads(info_json)
if 'errors' in info:
for err in info['errors']:
self._downloader.report_error(u'unable to download video webpage: %s' % compat_str(err['error_message']))
return
self.report_extraction(full_title)
for track in info['tracks']:
video_id = track['id']
streams_url = 'https://api.sndcdn.com/i1/tracks/' + str(video_id) + '/streams?client_id=b45b1aa10f1ac2941910a7f0d10f8e28'
stream_json = self._download_webpage(streams_url, video_id, u'Downloading track info JSON')
self.report_extraction(video_id)
streams = json.loads(stream_json)
mediaURL = streams['http_mp3_128_url']
videos.append({
'id': video_id,
'url': mediaURL,
'uploader': track['user']['username'],
'upload_date': unified_strdate(track['created_at']),
'title': track['title'],
'ext': u'mp3',
'description': track['description'],
})
return videos
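
Both Soundcloud extractors build the resolve request by concatenating the page URL into the query string without escaping it. The sketch below shows one possible variation that percent-encodes the parameters with `urllib.parse.urlencode` (Python 3). It is not what the diff does; the function name `build_resolve_url` and the `CLIENT_ID` placeholder are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_resolve_url(uploader, slug, client_id):
    """Hypothetical helper: build the resolve.json request with escaped parameters."""
    track_url = 'http://soundcloud.com/%s/%s' % (uploader, slug)
    # urlencode percent-encodes ':' and '/' so the inner URL cannot be
    # confused with the outer query string.
    query = urlencode({'url': track_url, 'client_id': client_id})
    return 'http://api.soundcloud.com/resolve.json?' + query


if __name__ == '__main__':
    print(build_resolve_url('ethmusic', 'lostin-powers-she-so-heavy', 'CLIENT_ID'))
```

Whether the API treats the escaped and unescaped forms identically is not verified here; the slugs accepted by `_VALID_URL` contain only word characters and dashes, so the plain concatenation in the diff is unlikely to produce ambiguous queries.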

@@ -0,0 +1,45 @@
import re
import xml.etree.ElementTree
from .common import InfoExtractor
class SpiegelIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?spiegel\.de/video/[^/]*-(?P<videoID>[0-9]+)(?:\.html)?(?:#.*)?$'
_TEST = {
u'url': u'http://www.spiegel.de/video/vulkan-tungurahua-in-ecuador-ist-wieder-aktiv-video-1259285.html',
u'file': u'1259285.mp4',
u'md5': u'2c2754212136f35fb4b19767d242f66e',
u'info_dict': {
u"title": u"Vulkanausbruch in Ecuador: Der \"Feuerschlund\" ist wieder aktiv"
}
}
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
video_id = m.group('videoID')
webpage = self._download_webpage(url, video_id)
video_title = self._html_search_regex(r'<div class="module-title">(.*?)</div>',
webpage, u'title')
xml_url = u'http://video2.spiegel.de/flash/' + video_id + u'.xml'
xml_code = self._download_webpage(xml_url, video_id,
note=u'Downloading XML', errnote=u'Failed to download XML')
idoc = xml.etree.ElementTree.fromstring(xml_code)
last_type = idoc[-1]
filename = last_type.findall('./filename')[0].text
duration = float(last_type.findall('./duration')[0].text)
video_url = 'http://video2.spiegel.de/flash/' + filename
video_ext = filename.rpartition('.')[2]
info = {
'id': video_id,
'url': video_url,
'ext': video_ext,
'title': video_title,
'duration': duration,
}
return [info]
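
The extractor picks the last child of the XML root and reads its `filename` and `duration` children. The sketch below reproduces that step on an invented document; the element names `type1` and `type2`, the file names and the helper `pick_last_variant` are assumptions, since the diff does not show the real XML.

```python
import xml.etree.ElementTree as ET

# Invented sample; only <filename> and <duration> are taken from the extractor.
SAMPLE_XML = '''<video>
  <type1><filename>1259285_small.3gp</filename><duration>75</duration></type1>
  <type2><filename>1259285_hq.mp4</filename><duration>75</duration></type2>
</video>'''


def pick_last_variant(xml_code):
    """Hypothetical helper: return (url, ext, duration) for the last variant."""
    idoc = ET.fromstring(xml_code)
    # Elements support sequence indexing, so [-1] is the last child.
    last_type = idoc[-1]
    filename = last_type.find('./filename').text
    duration = float(last_type.find('./duration').text)
    video_url = 'http://video2.spiegel.de/flash/' + filename
    return video_url, filename.rpartition('.')[2], duration


if __name__ == '__main__':
    print(pick_last_variant(SAMPLE_XML))
```

Taking the last entry presumably assumes the variants are ordered by quality; the diff does not state this, so the ordering should be checked against a real response.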

@@ -0,0 +1,119 @@
import re
import socket
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_request,
ExtractorError,
orderedSet,
unescapeHTML,
)
class StanfordOpenClassroomIE(InfoExtractor):
IE_NAME = u'stanfordoc'
IE_DESC = u'Stanford Open ClassRoom'
_VALID_URL = r'^(?:https?://)?openclassroom\.stanford\.edu(?P<path>/?|(/MainFolder/(?:HomePage|CoursePage|VideoPage)\.php([?]course=(?P<course>[^&]+)(&video=(?P<video>[^&]+))?(&.*)?)?))$'
_TEST = {
u'url': u'http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=PracticalUnix&video=intro-environment&speed=100',
u'file': u'PracticalUnix_intro-environment.mp4',
u'md5': u'544a9468546059d4e80d76265b0443b8',
u'info_dict': {
u"title": u"Intro Environment"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
raise ExtractorError(u'Invalid URL: %s' % url)
if mobj.group('course') and mobj.group('video'): # A specific video
course = mobj.group('course')
video = mobj.group('video')
info = {
'id': course + '_' + video,
'uploader': None,
'upload_date': None,
}
self.report_extraction(info['id'])
baseUrl = 'http://openclassroom.stanford.edu/MainFolder/courses/' + course + '/videos/'
xmlUrl = baseUrl + video + '.xml'
try:
metaXml = compat_urllib_request.urlopen(xmlUrl).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to download video info XML: %s' % compat_str(err))
mdoc = xml.etree.ElementTree.fromstring(metaXml)
try:
info['title'] = mdoc.findall('./title')[0].text
info['url'] = baseUrl + mdoc.findall('./videoFile')[0].text
except IndexError:
raise ExtractorError(u'Invalid metadata XML file')
info['ext'] = info['url'].rpartition('.')[2]
return [info]
elif mobj.group('course'): # A course page
course = mobj.group('course')
info = {
'id': course,
'type': 'playlist',
'uploader': None,
'upload_date': None,
}
coursepage = self._download_webpage(url, info['id'],
note='Downloading course info page',
errnote='Unable to download course info page')
info['title'] = self._html_search_regex('<h1>([^<]+)</h1>', coursepage, 'title', default=info['id'])
info['description'] = self._html_search_regex('<description>([^<]+)</description>',
coursepage, u'description', fatal=False)
links = orderedSet(re.findall(r'<a href="(VideoPage\.php\?[^"]+)">', coursepage))
info['list'] = [
{
'type': 'reference',
'url': 'http://openclassroom.stanford.edu/MainFolder/' + unescapeHTML(vpage),
}
for vpage in links]
results = []
for entry in info['list']:
assert entry['type'] == 'reference'
results += self.extract(entry['url'])
return results
else: # Root page
info = {
'id': 'Stanford OpenClassroom',
'type': 'playlist',
'uploader': None,
'upload_date': None,
}
self.report_download_webpage(info['id'])
rootURL = 'http://openclassroom.stanford.edu/MainFolder/HomePage.php'
try:
rootpage = compat_urllib_request.urlopen(rootURL).read()
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
raise ExtractorError(u'Unable to download course info page: ' + compat_str(err))
info['title'] = info['id']
links = orderedSet(re.findall(r'<a href="(CoursePage\.php\?[^"]+)">', rootpage))
info['list'] = [
{
'type': 'reference',
'url': 'http://openclassroom.stanford.edu/MainFolder/' + unescapeHTML(cpage),
}
for cpage in links]
results = []
for entry in info['list']:
assert entry['type'] == 'reference'
results += self.extract(entry['url'])
return results
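
The course and root branches both collect relative links, drop duplicates while keeping order, and unescape HTML entities before building absolute URLs. The sketch below shows that pipeline with standard-library stand-ins: `dict.fromkeys` in place of `orderedSet` and `html.unescape` in place of `unescapeHTML`. The sample markup and the name `course_video_urls` are made up for illustration.

```python
import html
import re

BASE = 'http://openclassroom.stanford.edu/MainFolder/'

# Invented course page fragment with one duplicated link.
SAMPLE_PAGE = '''
<a href="VideoPage.php?course=X&amp;video=a">A</a>
<a href="VideoPage.php?course=X&amp;video=b">B</a>
<a href="VideoPage.php?course=X&amp;video=a">A again</a>
'''


def course_video_urls(coursepage):
    """Hypothetical helper: absolute video page URLs, de-duplicated in order."""
    links = re.findall(r'<a href="(VideoPage\.php\?[^"]+)">', coursepage)
    # dict.fromkeys keeps the first occurrence of each link (Python 3.7+).
    unique = list(dict.fromkeys(links))
    return [BASE + html.unescape(link) for link in unique]


if __name__ == '__main__':
    for u in course_video_urls(SAMPLE_PAGE):
        print(u)
```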

@@ -0,0 +1,42 @@
import re
from .common import InfoExtractor
class StatigramIE(InfoExtractor):
_VALID_URL = r'(?:http://)?(?:www\.)?statigr\.am/p/([^/]+)'
_TEST = {
u'url': u'http://statigr.am/p/484091715184808010_284179915',
u'file': u'484091715184808010_284179915.mp4',
u'md5': u'deda4ff333abe2e118740321e992605b',
u'info_dict': {
u"uploader_id": u"videoseconds",
u"title": u"Instagram photo by @videoseconds"
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(
r'<meta property="og:video:secure_url" content="(.+?)">',
webpage, u'video URL')
thumbnail_url = self._html_search_regex(
r'<meta property="og:image" content="(.+?)" />',
webpage, u'thumbnail URL', fatal=False)
html_title = self._html_search_regex(
r'<title>(.+?)</title>',
webpage, u'title')
title = re.sub(r'(?: *\(Videos?\))? \| Statigram$', '', html_title)
uploader_id = self._html_search_regex(
r'@([^ ]+)', title, u'uploader name', fatal=False)
ext = 'mp4'
return [{
'id': video_id,
'url': video_url,
'ext': ext,
'title': title,
'thumbnail': thumbnail_url,
'uploader_id': uploader_id,
}]
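
The title cleanup and the uploader lookup can be checked without network access. The sketch below applies the same two patterns to an invented `<title>` value; the sample string and the helper name `clean_title` are assumptions, chosen so that the result matches the `title` and `uploader_id` listed in `_TEST`.

```python
import re


def clean_title(html_title):
    """Hypothetical helper: strip the site suffix and pull out the @handle."""
    # Same substitution as in the extractor: drop an optional "(Video)" or
    # "(Videos)" marker and the trailing " | Statigram".
    title = re.sub(r'(?: *\(Videos?\))? \| Statigram$', '', html_title)
    m = re.search(r'@([^ ]+)', title)
    uploader_id = m.group(1) if m else None
    return title, uploader_id


if __name__ == '__main__':
    print(clean_title('Instagram photo by @videoseconds (Video) | Statigram'))
```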

Some files were not shown because too many files have changed in this diff.