Compare commits

...

352 Commits

Author SHA1 Message Date
Sergey M․
e7ea724cb9 release 2017.01.08 2017-01-08 20:58:43 +07:00
Sergey M․
e60166020b [ChangeLog] Actualize 2017-01-08 20:56:38 +07:00
Sergey M․
364131584b [hitrecord] Improve (closes #11626) 2017-01-08 20:17:18 +07:00
J
553c68bbd9 [hitrecord] Add extractor 2017-01-08 20:17:18 +07:00
Remita Amine
827961b122 [videott] remove extractor 2017-01-07 14:47:36 +01:00
Remita Amine
a5eefc492b [swrmediathek] skip tests correctly 2017-01-06 15:09:10 +01:00
Remita Amine
a9cd1691b2 [swrmediathek] improve extraction 2017-01-06 15:06:08 +01:00
Remita Amine
2365f94412 [sharesix] remove extractor 2017-01-06 13:56:58 +01:00
Remita Amine
32b7c2a57e [aol] remove AolFeaturesIE 2017-01-06 12:10:47 +01:00
Remita Amine
221ce32529 [break] merge BreakIE and ScreenJunkiesIE 2017-01-06 11:25:48 +01:00
Remita Amine
e5dfdc8164 [sendtonews] improve info extraction 2017-01-06 11:23:43 +01:00
Remita Amine
a814da3f62 [skynews] update test 2017-01-06 11:22:35 +01:00
Sergey M․
b2727d0bee [3sat,phoenix] Fix extraction (closes #11619) 2017-01-06 17:13:53 +07:00
Philipp Hagemeister
dbaf601646 [comedycentral/mtv] Add support for HLS videos (fixes #11600)
Currently, the HTTP files of the RTMP urls are not present for the The Daily Show.
Use HLS instead for now.
2017-01-05 22:36:07 +01:00
Sergey M․
a9ee260217 [README.md] Mention --config-location in configuration section (#11615) 2017-01-06 02:45:45 +07:00
Yen Chi Hsuan
1219201143 [ChangeLog] Update after #11581
[ci skip]
2017-01-06 01:07:30 +08:00
Yen Chi Hsuan
ec85ded83c Fix "invalid escape sequences" error on Python 3.6 2017-01-06 00:58:56 +08:00
Yen Chi Hsuan
24d8a75982 [discoverygo] Fix JSON data parsing
HTMLParser, which is used by extract_attributes, already unescapes
attribute values with HTMLParser.unescape. They shouldn't be unescaped
again, to there may be parsing errors.

Ref: #11219, #11522
2017-01-05 18:50:34 +08:00
Sergey M․
7232bb299b release 2017.01.05 2017-01-05 04:10:15 +07:00
Sergey M․
2b12e34076 [ChangeLog] Actualize 2017-01-05 04:07:32 +07:00
Sergey M․
fb47cb5b23 [zdf] Improve (closes #11055, closes #11063) 2017-01-05 04:05:27 +07:00
Paul Hartmann
b6de53ea8a [zdf] Fix extraction 2017-01-05 04:04:53 +07:00
Sergey M․
96d315c2be [pornhub:playlist] Improve extraction (closes #11594) 2017-01-04 05:32:18 +07:00
Sergey M․
1911d77d28 [cctv] Add support for ncpa-classic.com (closes #11591) 2017-01-04 01:30:40 +07:00
Sergey M․
027e231295 [tunein] Add support for embeds (closes #11579) 2017-01-03 01:45:59 +07:00
Sergey M․
7a9e066972 [cctv] Relax some video id regexes 2017-01-03 01:13:02 +07:00
Sergey M․
2021b650dd release 2017.01.02 2017-01-02 23:55:04 +07:00
Sergey M․
b890caaf21 [ChangeLog] Actualize 2017-01-02 23:54:29 +07:00
Sergey M․
3783a5ccba [cctv] Relax _VALID_URL 2017-01-02 23:18:44 +07:00
Sergey M․
327caf661a [cctv] Do not fallback on video id extracted from URL 2017-01-02 23:00:37 +07:00
Sergey M․
ce7ccb1caa [cctv] Improve and merge with cntv (closes #879, closes #6753, closes #8541) 2017-01-02 22:55:24 +07:00
RPing
295eac6165 [cntv] Add extractor 2017-01-02 22:55:19 +07:00
Yen Chi Hsuan
d546d4c8e0 Merge pull request #11578 from rpunkfu/patch-2
Update Readme: Set HOME properly in install command
2017-01-02 13:50:26 +08:00
Oskar Cieslik
eec45445a8 Update Readme: Set home in sudo pip install
Hi, it's not always a default behaviour, but when you use `sudo pip` you most likely may want to use `-H` flag to set HOME value to directory of the target user :)
2017-01-02 00:22:10 +01:00
Sergey M․
7fc06b6a15 [README.md] Update link to available YoutubeDL options 2017-01-01 23:36:52 +07:00
Sergey M․
966815e139 [nrktv:episodes] Add support for episodes (#11571) 2017-01-01 21:26:32 +07:00
Sergey M․
e5e19379be [ISSUE_TEMPLATE_tmpl.md] Clarify example URLs constraints for site support request 2017-01-01 15:11:49 +07:00
Sergey M․
1f766b6e7b [arkena] Add support for video.arkena.com (closes #11568) 2017-01-01 02:46:47 +07:00
Sergey M․
dc48a35404 release 2016.12.31 2016-12-31 23:58:41 +07:00
Sergey M․
1ea0b727c4 [ChangeLog] Actualize 2016-12-31 23:53:30 +07:00
Sergey M․
b6ee45e9fa Improve custom config support (closes #10648) 2016-12-31 23:41:37 +07:00
Fabian Stahl
e66dca5e4a Add option --config-location
A configfile can now be passed to youtube_dl.

undo changes

Raise parser error if file not found, change to user_conf

change metavar hand helptext for --configfile

Fix help for --configfile

Update help for --configfile

Numbering placeholder in configfile error msg

minor fix

Change option --configfile top --config-file

Fix -config-file error
2016-12-31 23:04:16 +07:00
Sergey M․
3f1ce16876 [twitch:vod] Improve _VALID_URL (closes #11537) 2016-12-31 22:40:42 +07:00
Robert Smith
9a0f999585 [twitch] Added support for player.twitch.tv URLs (closes #11535) 2016-12-31 22:32:49 +07:00
David Haberthür
3540fe262f [README.md] Fix spelling and harmonize line length 2016-12-31 22:29:35 +07:00
Sergey M․
e186a9ec03 [videa] Add support for videa embeds 2016-12-31 22:05:32 +07:00
Sergey M․
69677f3ee2 [videa] Improve and simplify (closes #8181, closes #11133) 2016-12-31 22:05:32 +07:00
Bagira
e746021577 [videa] Add extractor 2016-12-31 22:05:32 +07:00
Chris Gavin
490da94edf [devscripts/buildserver] Remove unreachable except block 2016-12-31 19:17:52 +07:00
Sergey M․
424ed37ec4 [vk] Fix postlive videos extraction 2016-12-30 04:31:19 +07:00
Sergey M․
9cdb0a338d [vk] Extract from playerParams (closes #11555) 2016-12-30 04:21:49 +07:00
Sergey M․
6cf261d882 [freevideo] Remove extractor (closes #11515)
Handled by generic extractor
2016-12-30 00:32:23 +07:00
Sergey M․
df086e74e2 [showroomlive] Improve (closes #11458) 2016-12-30 00:12:35 +07:00
Arjan Verwer
963bd5ecfc [showroomlive] Add extractor 2016-12-29 23:17:00 +07:00
Sergey M․
51378d359e [xhamster] Fix duration extraction (closes #11549) 2016-12-28 23:04:46 +07:00
Sergey M․
b63005f5af [rtve:live] Fix extraction (closes #11529) 2016-12-25 04:02:29 +07:00
Yen Chi Hsuan
4606c34e19 [extractor/common] Allow non-lang in subtitles' keys
See 264e77c406
2016-12-25 01:50:50 +08:00
Sergey M․
53a664edf4 [brightcove:legacy] Improve embeds detection (closes #11523) 2016-12-24 22:46:27 +07:00
Sergey M․
264e77c406 [twitch] Add support for rechat messages (closes #11524) 2016-12-24 22:10:54 +07:00
Remita Amine
d1cd7e0ed9 Credit @wader for #11521 2016-12-24 15:00:23 +01:00
Mattias Wadman
846fd69bac [acast] Add test with multiple blings 2016-12-24 14:28:30 +01:00
Mattias Wadman
12da830993 [acast] Fix broken audio URL and timestamp extraction
Before first bling was used now we look for the first bling with
type BlingAudio.

Before publishingDate was a ms unix timestamp now it is iso8601.
2016-12-24 14:28:30 +01:00
Sergey M․
e7ac722d62 [README.md] Add missing protocols to format selection section 2016-12-23 22:01:22 +07:00
hub2git
19f37ce4b1 [README.md] Fix typo 2016-12-23 09:25:39 +07:00
Sergey M․
5e77c0b58e release 2016.12.22 2016-12-22 22:52:54 +07:00
Remita Amine
ab3091feda [ChangeLog] Actualize 2016-12-22 22:51:51 +07:00
Remita Amine
a07588369f [common] improve detection for video only formats and m3u8 manifest(fixes #11507) 2016-12-22 10:02:56 +01:00
Remita Amine
f5a723a78a [theplatform] pass geo verification headers to smil request(closes #10146) 2016-12-21 20:59:03 +01:00
Remita Amine
f120646f04 [viu] pass geo verification headers to auth request 2016-12-21 20:50:10 +01:00
Remita Amine
9c5b5f2115 [rtl2] extract more formats and metadata 2016-12-21 18:46:25 +01:00
Sergey M․
ae806db628 [vbox7] Skip malformed JSON-LD (closes #11501) 2016-12-21 22:39:05 +07:00
Remita Amine
bfa1073e11 [uplynk] force downloading using hls native downloader(closes #11496) 2016-12-20 19:49:45 +01:00
Remita Amine
e029c43bd4 [laola1] add support for another extraction scenario(closes #11460) 2016-12-20 18:22:57 +01:00
Sergey M․
90352a8041 release 2016.12.20 2016-12-20 22:39:39 +07:00
Sergey M․
1f6a79b0af [ChangeLog] Actualize 2016-12-20 22:37:06 +07:00
Sergey M․
3d6761ba92 [vbox7] Fix extraction (closes #11494) 2016-12-20 21:53:51 +07:00
Remita Amine
f59d1146c0 [uktvplay] Add new extractor(closes #11027) 2016-12-20 12:52:46 +01:00
Remita Amine
b1c357975d [piksel] Add new extractor(closes #11246) 2016-12-20 12:35:03 +01:00
Remita Amine
d8c507c9e2 [vimeo] fix extraction for hls formats and add support for dash formats(closes #11490) 2016-12-20 12:35:03 +01:00
Remita Amine
7fe1592073 [common] fix dash codec information for mixed videos and fragment url construction(#11490) 2016-12-20 12:35:03 +01:00
Yen Chi Hsuan
8ab7e6c4cc [kaltura] Improve widget ID extraction (closes #11480) 2016-12-20 18:45:52 +08:00
Sergey M․
c80db5d398 [nrktv:direkte] Add support for live streams (#11488) 2016-12-19 23:47:45 +07:00
Remita Amine
5aaf012a4e [pbs] fix extraction for geo restricted videos(#7095) 2016-12-19 16:27:12 +01:00
Remita Amine
954529c10f [brightcove:new] skip widevine classic videos 2016-12-18 21:39:59 +01:00
Remita Amine
ed7b333fbf [viu] extract supported hls manifest 2016-12-18 18:24:01 +01:00
Remita Amine
723103151e [viu] improve extraction(closes #10607)(closes #11329) 2016-12-18 17:20:53 +01:00
ping
e7b6caef24 [viu] New extractor for viu.com 2016-12-18 17:20:53 +01:00
Sergey M․
ec79b1de1c Revert "Credit @pyx for meipai (#10718)"
This reverts commit d5e623aaa1.
2016-12-18 20:56:21 +07:00
Sergey M․
f73d7d5074 release 2016.12.18 2016-12-18 19:50:33 +07:00
Sergey M․
52a1d48d9f [ChangeLog] Actualize 2016-12-18 19:48:59 +07:00
Sergey M․
d5e623aaa1 Credit @pyx for meipai (#10718) 2016-12-18 19:46:57 +07:00
Remita Amine
199a47abba [ccma] Add new extractor(closes #11359) 2016-12-18 10:49:10 +01:00
Remita Amine
b42a0bf360 [laola1tv] add support embed urls and improve extraction(#11460) 2016-12-17 21:48:45 +01:00
Remita Amine
6e416b210c [nbc] fix extraction for msnbc videos(fixes #11466) 2016-12-17 18:11:13 +01:00
Sergey M․
04bf59ff64 [extractors] Add missing twitch imports 2016-12-17 23:03:50 +07:00
Sergey M․
87a449c1ed [extractor/common] Recognize DASH formats in html5 media entries 2016-12-17 23:03:13 +07:00
Sergey M․
93753aad20 [twitch] Adapt to new videos pages schema (closes #11469) 2016-12-17 20:20:23 +07:00
Sergey M․
2786818c33 [meipai] Fix regular videos extraction and improve (closes #10718) 2016-12-17 19:42:34 +07:00
Philip Xu
9b785768ac [meipai] Add extractor 2016-12-17 19:41:35 +07:00
Sergey M․
47c914f995 [ondemandkorea] Fix extraction (closes #10772) 2016-12-17 18:50:12 +07:00
Sergey M․
732d116aa7 [jwplatform] Improve duration extraction 2016-12-17 18:50:07 +07:00
Sergey M․
a495840d3b [jwplatform] Improve subtitles extraction 2016-12-17 18:50:00 +07:00
Sergey M․
b0c65c677f [utils] Improve urljoin 2016-12-17 18:49:55 +07:00
ping
594601f545 [ondemandkorea] Add extractor 2016-12-17 18:49:45 +07:00
Sergey M․
0ae9560eea [vporn] Use urljoin for thumbnail 2016-12-16 23:57:51 +07:00
Remita Amine
dc1f3a9f20 [vvvvid] do not cache the conn_id 2016-12-16 11:05:46 +01:00
Remita Amine
7b1e80792b [vvvvid] Add new extractor(closes #5915) 2016-12-16 09:05:34 +01:00
Sergey M․
38be3bc568 release 2016.12.15 2016-12-15 21:16:55 +07:00
Sergey M․
d7ef47bffd [ChangeLog] Actualize 2016-12-15 21:15:45 +07:00
Yen Chi Hsuan
5c32a5be95 [openload] Recognize oload.tv URLs (#10408) 2016-12-15 17:51:26 +08:00
Yen Chi Hsuan
30918999f5 [facebook] Recognize .onion URLs (closes #11443) 2016-12-15 01:04:49 +08:00
Sergey M․
069f918302 [vlive] Use live titles for live streams 2016-12-14 21:30:33 +07:00
Sergey M․
89c63cc5f8 [vlive] Add video params extraction fallback and improve (closes #11375) 2016-12-14 21:05:50 +07:00
Corey Nicholson
577748075b [vlive] Update extraction 2016-12-14 21:05:32 +07:00
Remita Amine
67dcbc0add [canvas] extract dash formats 2016-12-13 17:59:22 +01:00
Sergey M․
3a40f859b5 [melonvod] Improve (closes #11419) 2016-12-13 02:27:26 +07:00
Sergey M․
e34c33614d [utils] Add convenience urljoin 2016-12-13 02:23:49 +07:00
ping
abf3494ac7 [melonvod] Add extractor for vod.melon.com 2016-12-13 02:13:40 +07:00
Sergey M․
3c1e9dc4ec release 2016.12.12 2016-12-12 01:44:50 +07:00
Sergey M․
62faf9b55e [ChangeLog] Actualize 2016-12-12 01:41:08 +07:00
Sergey M․
3530e0d3d9 [dplay] Use Safari user-agent for hls (closes #11418) 2016-12-12 00:58:08 +07:00
Sergey M․
fb37eb25d9 [utils] Add common user agents map 2016-12-12 00:49:07 +07:00
Sergey M․
d2d2495e16 [facebook] Detect login required error message 2016-12-11 01:40:30 +07:00
Sergey M․
19b4900b7b [facebook] Improve video selection (closes #11390) 2016-12-11 01:22:01 +07:00
Sergey M․
6ca478d44a [canalplus] Add another video id regex (closes #11399) 2016-12-11 00:45:27 +07:00
Sergey M․
655cb545ab [mixcloud] Relax _VALID_URL (closes #11406) 2016-12-10 23:48:18 +07:00
Remita Amine
f0b69fa91a [ctvnews] relax _VALID_URL regex(closes #11394) 2016-12-10 17:36:32 +01:00
Remita Amine
8821a718cf [common] recognize hls manifests that contain video only formats(#11394) 2016-12-10 17:22:15 +01:00
Remita Amine
0d7d9f9404 [rte] improve extraction(closes #10498)(closes #7746) 2016-12-10 16:34:01 +01:00
Remita Amine
f41db40596 [prosiebensat1] extract dash formats 2016-12-10 13:29:51 +01:00
Remita Amine
68601ef3ac [rts,srgssr] improve extraction for geo restricted videos(fixes #11089)(closes #4989) 2016-12-10 10:47:56 +01:00
Sergey M․
18ece70c4d release 2016.12.09 2016-12-09 02:46:18 +07:00
Sergey M․
9ed3495eae [ChangeLog] Actualize 2016-12-09 02:41:49 +07:00
Yen Chi Hsuan
6c20a0bb99 [openload] Fix extraction (closes #10408) 2016-12-09 02:15:16 +08:00
Sergey M․
f43795e56b [pandoratv] PEP 8 and simplify 2016-12-07 23:50:10 +07:00
Serkora
7441915b1e [pandoratv] Fix extraction (closes #11023) 2016-12-07 23:46:42 +07:00
Remita Amine
283d1c6a8b [telebruxelles] extract all formats and add support for emission urls 2016-12-06 19:01:17 +01:00
Sergey M․
875ddd7409 [bloomberg] Add another video id regex (closes #11371) 2016-12-06 00:41:03 +07:00
Sergey M․
4afa4ff223 [1tv] Fix video id extraction 2016-12-05 23:28:57 +07:00
vordep
3ed81714d8 [fusion] Update ooyala id regex 2016-12-05 22:43:36 +07:00
Yen Chi Hsuan
4bd7d9d4ae [socks] Refine exception model for better error handling
1. ProxyError now inherits from socket.error instead of IOError

The only functions socks.py overrides are connect and connect_ex. In
Python 2.x and Python <= 3.2, socket functions raises socket.error. In
newer Python versions, those functions raises OSError instead. The name
socket.error is preserved as an alias of OSError for backward
compability. To keep socks.py compatible with Python's standard library,
it should raise the same exception as raw sockets.

See PEP 3151 (https://www.python.org/dev/peps/pep-3151/) for more
information about the change in Python 3.3.

2. Raise EOFError instead of IOError when the socket receives less data
than it expects

There's no common convention, but both ftplib and telnetlib raises
EOFError for similar situations. socks.py follows them.

Closes #11355

In #11355, only Python 2 is affected. In Python 3, both socket.error and
IOError are alias of OSError, so AbstractHTTPHandler.do_open correctly
catches the error and thus InfoExtractor._is_valid_url works fine.
2016-12-05 00:43:37 +08:00
Sergey M․
9b5288c92a [1tv] Improve extraction and add support for playlists (closes #11335) 2016-12-04 23:35:21 +07:00
Yen Chi Hsuan
8344296619 [socks] Fix error reporting (#11355) 2016-12-03 21:53:41 +08:00
Remita Amine
a94e7f4a0c [aenetworks] extract more formats(closes #11321) 2016-12-01 12:15:35 +01:00
Yen Chi Hsuan
d17bfe4095 [thisoldhouse] Recognize /tv-episode/ URLs and update _TESTS
Closes #11271
2016-12-01 14:56:52 +08:00
Laneone
98b08f94b1 [README.md] Fix typo
Just a minor spelling mistake in the readme
2016-12-01 01:31:21 +07:00
Sergey M․
73ec479c7d release 2016.12.01 2016-12-01 00:15:12 +07:00
Sergey M․
f150530f4d [ChangeLog] Actualize 2016-12-01 00:13:06 +07:00
Sergey M․
4c4765dba2 [soundcloud] Update client id (closes #11327) 2016-11-30 23:17:30 +07:00
Philipp Hagemeister
f882554815 [comedcycentral] Give /shows/.+/full-episodes URLs to the COmedyCentralFullEpisodesIE 2016-11-30 11:52:19 +01:00
Sergey M․
db75f14d8a [ruutu] Detect DRM videos 2016-11-30 04:19:38 +07:00
Sergey M․
8b0d3ee64e [liveleak] Simplify and PEP 8 2016-11-29 23:42:19 +07:00
Varun
3779d524df [liveleak] Add support for youtube embeds 2016-11-29 23:37:30 +07:00
Mark Lee
6303fc8204 [spike] Fix full episodes extraction 2016-11-29 23:06:01 +07:00
Philipp Hagemeister
cc61fc3934 [comedycentral] Add new extractor for full-episodes
CC seems to have added yet another indirection for full episodes - the mgid is now only in a linked feed.
This may be a little brittle, but it's better than failing outright.
Plus, the current The Daily Show episode now works :)
2016-11-29 10:12:18 +01:00
Sergey M․
c2530d3319 [teamfourstar] Simplify _VALID_URL and relax regexes 2016-11-28 23:22:29 +07:00
felix
8953319916 [screenwavemedia] Remove extractor
Rewrite TeamFourStar and Normalboots extractors in terms of JWPlatform
2016-11-28 23:17:56 +07:00
Yen Chi Hsuan
51b1378eed Ignore and clean .swf files
Some videos on NicoNico are swf
2016-11-27 22:01:07 +08:00
Sergey M․
2b380fc299 release 2016.11.27 2016-11-27 20:05:32 +07:00
Sergey M․
294d4926d7 [ChangeLog] Actualize 2016-11-27 20:04:03 +07:00
Sergey M․
83f1481baa [extractor/generic] Add support for webcaster.pro embeds 2016-11-27 19:56:32 +07:00
Sergey M․
f25e1c8d8c [webcaster] Add support for webcaster.pro 2016-11-27 19:54:59 +07:00
Sergey M․
6901673868 [azubu] Add support for azubu.uol.com.br (closes #11305) 2016-11-27 15:40:28 +07:00
Sergey M․
560c8c6ec0 [viki] Prefer hls 2016-11-26 00:14:09 +07:00
Sergey M․
9338a0eae3 [viki] Fix rtmp formats extraction (closes #11255) 2016-11-26 00:13:46 +07:00
Sergey M․
74394b5e10 [puls4] Relax _VALID_URL (closes #11267) 2016-11-25 23:37:32 +07:00
Sergey M․
1db058466d [vevo] Allow video info to fail in tests 2016-11-24 23:10:58 +07:00
Sergey M․
e94eeb1dd3 [vevo] Simplify artists extraction 2016-11-24 23:09:35 +07:00
Andrew J. Erickson
8b27d83e4e vevo: fixing naming when there are featured artists 2016-11-24 23:07:28 +07:00
Sergey M․
8eb7b5c3f1 [mitele] Modernize and extract more metadata 2016-11-24 22:43:02 +07:00
zurfyx
b68599ed47 [mitele] Relax _VALID_URL 2016-11-24 21:57:53 +07:00
Yen Chi Hsuan
44444f0d3b [cbslocal] Support newyork.cbslocal.com
Closes #11285
2016-11-24 20:32:17 +08:00
Sergey M․
c867adc68c [youtube:playlist] Pass disable_polymer in query (closes #11193, closes #11270) 2016-11-23 23:28:32 +07:00
Sergey M․
3b5daf0736 release 2016.11.22 2016-11-22 22:32:16 +07:00
Sergey M․
c8f56741dd [ChangeLog] Actualize 2016-11-22 22:29:37 +07:00
Andy Savicki
868630fbe5 [hellporno] Add support for hellporno.net and improve ext extraction 2016-11-22 22:16:10 +07:00
Yen Chi Hsuan
1d6ae5628f [amcnetworks] Recognize more BBC America URLs
Closes #11263
2016-11-22 20:40:57 +08:00
Sergey M․
6334794f2a [funnyordie] Copy formats' metadata from hls and sort formats 2016-11-21 23:46:55 +07:00
Andy Savicki
4eece8ba57 [funnyordie] Improve extraction 2016-11-21 22:16:26 +07:00
Yen Chi Hsuan
2574721a81 Clean and ignore more file types
ape is another audio codec seen in kuwo. See
https://en.wikipedia.org/wiki/Monkey's_Audio
2016-11-21 12:50:13 +08:00
Yen Chi Hsuan
dbcc4a6b32 [CONTRIBUTING.md] Fix broken links (#11239) 2016-11-21 12:25:19 +08:00
Yen Chi Hsuan
0bb58a208b Merge pull request #11239 from josephfrazier/patch-1
[CONTRIBUTING.md] Fix broken link
2016-11-21 12:24:11 +08:00
Joseph Frazier
dc6a9e4195 [README.md] Update link from generated CONTRIBUTING.md 2016-11-20 11:32:00 -05:00
Sergey M․
8f8f182d0b [extractor/generic] Improve limelight embeds support 2016-11-20 02:13:21 +07:00
Yen Chi Hsuan
2176e466e0 Merge branch 'DarkstaIkers-master' 2016-11-20 00:07:35 +08:00
Yen Chi Hsuan
303b38fa84 [ChangeLog] Update for #9028 2016-11-20 00:06:44 +08:00
Yen Chi Hsuan
fb27d0ce5e Merge branch 'master' of https://github.com/DarkstaIkers/youtube-dl into DarkstaIkers-master 2016-11-20 00:05:11 +08:00
Sergey M․
0aacd2deb1 [bandcamp] Fix free downloads extraction and extract all formats (closes #11067) 2016-11-19 04:18:21 +07:00
Sergey M․
08ec95a6db [ChangeLog] Actualize 2016-11-19 03:10:20 +07:00
Sergey M․
df46b19cb8 [toutv] Fix login form regex (closes #11223) 2016-11-19 01:56:31 +07:00
Sergey M․
748a462fbe [twitter:card] Relax _VALID_URL (closes #11225) 2016-11-19 01:49:13 +07:00
Sergey M․
c131fc3372 [tvanouvelles] Add extractor (closes #10616) 2016-11-18 01:16:33 +07:00
Sergey M․
b25459b88a release 2016.11.18 2016-11-18 00:25:24 +07:00
Sergey M․
5f75c4a4ad [ChangeLog] Actualize 2016-11-18 00:19:55 +07:00
Sergey M․
689f31fde5 [devscripts/create-github-release] Fill release body from ChangeLog (closes #11094) 2016-11-18 00:17:46 +07:00
Yen Chi Hsuan
582be35847 Update coding style after pycodestyle 2.1.0
In pycodestyle 2.1.0, E305 was introduced, which requires two blank
lines after top level declarations, too.

See https://github.com/PyCQA/pycodestyle/issues/400

See also #10689; thanks @stepshal for first mentioning this issue and
initial patches
2016-11-17 19:45:42 +08:00
Sergey M․
073d5bf583 [youtube:live] Relax _VALID_URL (closes #11164) 2016-11-16 23:15:19 +07:00
Yen Chi Hsuan
315cb86a95 Merge pull request #11210 from FooBarQuaxx/patch-2
Strip only args urls
2016-11-16 23:29:37 +08:00
FooBarQuaxx
b2fc1c4fb9 Add explanatory comment 2016-11-16 18:18:54 +03:00
Yen Chi Hsuan
d76767c90e [ChangeLog] Update after #11122 landed 2016-11-16 20:47:15 +08:00
Yen Chi Hsuan
eceba9f805 Merge pull request #11122 from kasper93/openload
[openload] Fix extraction.
2016-11-16 20:43:19 +08:00
MAA
d755396804 Strip only args urls 2016-11-16 09:00:30 +03:00
Sergey M․
58355a3bf1 [vlive] Add test for #11203 2016-11-15 22:11:47 +07:00
ping
49b69ad91c [vlive] Prefer locale over language for subtitles id 2016-11-15 22:07:17 +07:00
Sergey M․
6b4dfa2819 release 2016.11.14.1 2016-11-14 02:48:15 +07:00
Sergey M․
9f60134a9d [ChangeLog] Actualize 2016-11-14 02:46:12 +07:00
Sergey M․
b3d4bd05f9 release 2016.11.14 2016-11-14 02:39:50 +07:00
Sergey M․
dbffd00ba9 [ChangeLog] Actualize 2016-11-14 02:37:21 +07:00
Sergey M․
50913b8241 [nrk] Improve geo restriction detection 2016-11-13 22:29:36 +07:00
Sergey M․
7e08e2cab0 [nrk] Add X-Forwarded-For HTTP header in info dict 2016-11-13 22:28:29 +07:00
Sergey M․
690355551c [downoader/fragment,f4m,hls] Add internal support for custom HTTP headers 2016-11-13 22:22:10 +07:00
Sergey M․
754e6c8322 [nrk] Workaround geo restriction and improve error messages 2016-11-13 20:54:34 +07:00
Sergey M․
e58609b22c [afreecatv] Add support for vod.afreecatv.com (closes #11174) 2016-11-13 06:02:26 +07:00
Sergey M․
4ea4c0bb22 [extractor/common] Fix Bandwidth substitution in media template (closes #11175) 2016-11-13 05:43:34 +07:00
Kacper Michajłow
577281b0c6 [cda] Fix and improve extraction
Fixes #10929
2016-11-13 01:01:29 +07:00
Sergey M․
3d2729514f [plays] Improve extraction and add support for embed URLs 2016-11-12 23:08:05 +07:00
Sergey M․
f076d7972c [extractor/common] Improve thumbnail extraction from JSON-LD 2016-11-12 23:01:05 +07:00
cpm
8b1aeadc33 [plays] Fix extraction 2016-11-12 22:59:39 +07:00
Kacper Michajłow
95ad9ce573 [openload] Fix extraction.
aadecode code was restored from commit c1decda58c
with some optimizations (2x faster).

Fixes #10408
2016-11-11 15:36:57 +01:00
Kacper Michajłow
189935f159 [jsinterp] Fix function calls without arguments. 2016-11-11 15:36:57 +01:00
Sergey M․
bc40b3a5ba [eagleplatform] Fix extraction (closes #11160) 2016-11-11 03:26:29 +07:00
Yen Chi Hsuan
3eaaa8abac [audioboom] Recognize /posts/ URLs (closes #11149) 2016-11-10 14:52:34 +08:00
Sergey M․
db3367f43e release 2016.11.08.1 2016-11-08 22:30:53 +07:00
Sergey M․
6590925c27 [ChangeLog] Actualize 2016-11-08 22:29:16 +07:00
Sergey M․
4719af097c [extractors] Add forgotten import for espn:article 2016-11-08 22:27:02 +07:00
Sergey M․
9946aa5ccf [franceculture] Fix extraction (closes #11140) 2016-11-08 22:26:33 +07:00
Sergey M․
c58e07a7aa release 2016.11.08 2016-11-08 22:11:21 +07:00
Sergey M․
f700afa24c [ChangeLog] Actualize 2016-11-08 22:09:03 +07:00
Yen Chi Hsuan
5d47b38cf5 [tmz:article] Fix extraction (closes #11052) 2016-11-08 21:53:41 +08:00
Sergey M․
ebc7ab1e23 [espn] Fix extraction (closes #11041) 2016-11-08 00:29:12 +07:00
Sergey M․
97726317ac [README.md] Mention HTTP headers and alternative way to obtain cookies and headers in -g FAQ 2016-11-07 23:53:22 +07:00
DarkZeros
cb882540e8 [mitele] Fix extraction after website redesign (fixes #10824) 2016-11-07 11:13:59 +01:00
Sergey M․
98708e6cbd [ard] Remove age restriction check (closes #11129) 2016-11-06 23:20:15 +07:00
Sergey M․
b52c9ef165 [extractor/generic] Improve support for pornhub embeds (closes #11100) 2016-11-06 21:52:00 +07:00
Sergey M․
e28ed498e6 [extractor/generic] Add support for redtube embds (closes #11099) 2016-11-06 21:42:41 +07:00
Sergey M․
5021ca6c13 [redtube] Add support for embed URLs 2016-11-06 21:39:29 +07:00
Sergey M․
37e7a71c6c [extractor/generic] Add support for drtuber embds (closes #11098) 2016-11-06 21:33:51 +07:00
Sergey M․
f5c4b06f17 [drtuber] Fix title extraction 2016-11-06 21:29:15 +07:00
Sergey M․
519d897049 [drtuber] Add support for embed URLs 2016-11-06 21:28:51 +07:00
Sergey M․
b61cd51869 [yahoo] Add test and improve some content id regex 2016-11-06 21:16:33 +07:00
Sergey M․
f420902a3b [yahoo] Add another content id regex (closes #11088) 2016-11-06 21:14:15 +07:00
Sergey M․
de328af362 [toutv] Relax _VALID_URL (closes #11121) 2016-11-05 03:24:42 +07:00
Sergey M․
b30e4c2754 release 2016.11.04 2016-11-04 22:07:54 +07:00
Sergey M․
09ffe34b00 [ChangeLog] Actualize 2016-11-04 21:59:42 +07:00
Sergey M․
640aff1d0c [anvato] Improve formats extraction 2016-11-04 21:45:24 +07:00
Sergey M․
c897af8aac [cbslocal] Update test 2016-11-04 21:33:08 +07:00
Sergey M․
f3c705f8ec [fox9] Add extractor (closes #11110) 2016-11-04 21:32:30 +07:00
Sergey M․
f93ac1d175 [anvato] Extract more metadata 2016-11-04 21:17:56 +07:00
Sergey M․
c4c9b8440c [extractor/common] Tolerate malformed RESOLUTION attribute in m3u8 manifests (closes #11113) 2016-11-04 05:02:31 +07:00
Sergey M․
32f2627aed [vodlocker] Add another removed file pattern (closes #11106) 2016-11-03 22:22:40 +07:00
Sergey M․
9d64e1dcdc [downloader/ism] Fix typo 2016-11-03 22:15:09 +07:00
Remita Amine
10380e55de [downloader/ism] fix AVC Decoder Configuration Record creation in python 3 2016-11-03 16:08:57 +01:00
Remita Amine
22979993e7 [vice] add coding cookie 2016-11-03 16:07:22 +01:00
Remita Amine
b47ecd0b74 [vzaar] Add new extractor(closes #11093) 2016-11-03 12:50:41 +01:00
Yen Chi Hsuan
3a86b2c51e Ignore and clean .wav files 2016-11-03 18:55:55 +08:00
Remita Amine
b811b4c93b [vice] add support for uplynk preplay videos(#11101) 2016-11-03 10:37:07 +01:00
Remita Amine
f4dfa9a5ed [tubitv] fix extraction(closes #11061) 2016-11-03 09:04:20 +01:00
Remita Amine
3b4b66b50c [shahid] add support for authentication(closes #11091) 2016-11-03 00:44:12 +01:00
Sergey M․
4119a96ce5 [extractor/generic] Skip URLs we came from when delegating ISM extraction 2016-11-02 23:43:41 +07:00
Sergey M․
26aae56690 [extractor/generic] Improve ISM extraction 2016-11-02 23:34:37 +07:00
Remita Amine
4f9cd4d36f [radiocanada] extract subtitle(closes #11096) 2016-11-02 13:55:40 +01:00
Sergey M․
cc99a77ac1 [extractor/generic] Add support for ISM manifests 2016-11-02 03:01:13 +07:00
Sergey M․
8956d6608a release 2016.11.02 2016-11-02 02:39:36 +07:00
Sergey M․
3365ea8929 [extractor/common] Remove unused code 2016-11-02 02:34:23 +07:00
Sergey M․
a18aeee803 [ChangeLog] Actualize 2016-11-02 02:33:17 +07:00
Sergey M․
1616f9b452 [extractor/common] Fix typo 2016-11-02 02:30:25 +07:00
Sergey M․
02dc0a36b7 [utils] Introduce base_url 2016-11-02 02:30:18 +07:00
Remita Amine
639e3b5c99 extract ISM formats in some of the extractors 2016-11-02 01:54:45 +07:00
Remita Amine
b2758123c5 add Basic support for Smooth Streaming protocol(#8118) 2016-11-02 01:54:45 +07:00
Sergey M․
f449c061d0 [nicknight] Improve extraction (closes #10769) 2016-11-02 01:35:53 +07:00
Sergey M․
9c82bba05d [nickde] Improve extraction 2016-11-02 01:29:05 +07:00
NeroBurner
e3577722b0 [nicknight] Add extractor 2016-11-02 01:25:59 +07:00
Sergey M․
b82c33dd67 [extractor/common] Improve mpd base URL extraction (closes #10909, closes #11079) 2016-11-01 01:15:46 +07:00
Sergey M․
e5a088dc4b [utils] Fix --match-filter for int-like strings (closes #11082) 2016-10-31 23:32:08 +07:00
Sergey M․
2c6da7df4a release 2016.10.31 2016-10-31 01:36:53 +07:00
Mel Shafer
7e7a028aa4 [README.md] Fix a typo 2016-10-31 01:12:36 +07:00
Sergey M․
e70a5e6566 release 2016.10.30 2016-10-30 18:24:49 +07:00
Sergey M․
3bf55be466 [ChangeLog] Actualize 2016-10-30 18:19:29 +07:00
Sergey M․
a901fc5fc2 [vessel] Add tests for #11068 2016-10-30 18:17:15 +07:00
dundua
cae6bc0118 [vessel] Improve video id extraction 2016-10-30 18:14:51 +07:00
Yen Chi Hsuan
d9ee2e5cf6 [facebook] Remove SWF params so that 1080P are detected
Closes #11073

In the provided link, SWF params give up to 720P, and VideoConfig
gives 1080P for both best and bestvideo. I guess all Facebook videos
supports HTML5 now, so I remove the old detection for SWF params
2016-10-30 18:20:55 +08:00
Yen Chi Hsuan
e1a0b3b81c [imgur] Recognize /r/ URLs (closes #11071) 2016-10-30 17:02:03 +08:00
Sergey M․
2a048f9878 [beeg] Fix extraction (closes #11069) 2016-10-30 05:27:50 +07:00
Sergey M․
ea331f40e6 Credit @Thor77 for jamendo (#10934) 2016-10-30 05:10:31 +07:00
Yen Chi Hsuan
f02700a1fa [openload] Fix extraction (#10408)
Thanks @TwelveCharzz again for studying openload codes
2016-10-29 18:00:30 +08:00
Sergey M․
f3517569f6 [gvsearch] Modernize and fix page result request (closes #11051) 2016-10-28 23:19:59 +07:00
neutric
c725333d41 [ard] Fix typo 2016-10-27 03:59:00 +07:00
Yen Chi Hsuan
a5a8877f9c [adultswim] Fix extraction (closes #10979) 2016-10-27 02:16:48 +08:00
Remita Amine
43c53a1700 [nobelprize] Add new extractor(closes #9999) 2016-10-26 18:15:23 +01:00
Yen Chi Hsuan
ec8705117a [hornbunny] Fix extraction (#10981) 2016-10-27 00:10:51 +08:00
Remita Amine
3d8d44c7b1 [tvp] improve video id extraction(closes #10585) 2016-10-26 16:47:22 +01:00
Sergey M․
88839f4380 release 2016.10.26 2016-10-26 19:55:09 +07:00
Sergey M․
83e9374464 [ChangeLog] Actualize 2016-10-26 19:53:44 +07:00
Sergey M․
773017c648 [rentv] Move rentv test from generic extractor and add only matching tests 2016-10-26 19:52:43 +07:00
Remita Amine
777d90dc28 [rentv] Add new extractor(closes #10620) 2016-10-26 10:07:35 +01:00
Sergey M․
3791d84acc [ard] Detect unavailable videos (closes #11018) 2016-10-25 21:21:47 +07:00
Sergey M․
9305a0dc60 [vk] Fix extraction (closes #11022) 2016-10-25 21:05:29 +07:00
Sergey M․
94e08950e3 release 2016.10.25 2016-10-25 03:19:36 +07:00
Sergey M․
ee824a8d06 [ChangeLog] Actualize
[ci skip]
2016-10-25 02:51:07 +07:00
Sergey M․
d3b6b3b95b [jamendo] Improve 2016-10-25 02:46:48 +07:00
Thor77
b17422753f [jamendo] Add extractor 2016-10-25 02:43:03 +07:00
Sergey M․
b0b28b8241 [ChangeLog] Actualize 2016-10-25 01:53:41 +07:00
Sergey M․
81cb7a5978 Credit @azuwis for pandatv (#10736) 2016-10-25 01:51:46 +07:00
Sergey M․
d2e96a8ed4 [pandatv] Extract m3u8, document reverse source and PEP 8 2016-10-25 01:51:37 +07:00
Zhong Jianxin
2e7c8cab55 [pandatv] Add new extractor 2016-10-25 01:46:02 +07:00
Sergey M․
d7d4481c6a [movieclips] Fix _VALID_URL 2016-10-24 23:54:42 +07:00
Yen Chi Hsuan
5ace137bf4 [dotsub] Support vimeo embed (closes #10964) 2016-10-24 15:13:33 +08:00
Yen Chi Hsuan
9dde0e04e6 [litv] Fix extraction (#11006) 2016-10-23 23:23:40 +08:00
Sergey M․
f16f8505b1 [vimeo] Delegate ondemand redirects to ondemand extractor (closes #10994) 2016-10-23 18:48:50 +07:00
Sergey M․
9dc13a6780 [vivo] Fix extraction (closes #11003) 2016-10-23 18:07:56 +07:00
Sergey M․
9aa929d337 [twitch:stream] Add support for rebroadcasts (closes #10995) 2016-10-23 17:20:45 +07:00
Sergey M․
425f3fdfcb [pluralsight] Fix subtitles conversion (closes #10990) 2016-10-22 21:15:39 +07:00
Yen Chi Hsuan
e034cbc581 Merge branch 'johnhawkinson-stdin2' 2016-10-22 13:10:27 +08:00
Yen Chi Hsuan
5378f8ce0d [ChangeLog] Update for #10996 2016-10-22 13:08:56 +08:00
Yen Chi Hsuan
b64d04c119 [utils] Clarify for redirecting STDIN in get_exe_version() 2016-10-22 13:04:05 +08:00
John Hawkinson
00ca755231 [get_exe_version] Do version probes with <&-
When doing version probes for ffmpeg, do the
equivalent of calling it as:

    ffmpeg -version <&-

Where <&- is shell syntax for closing stdin before calling the
program. This is roughly equivalent to </dev/null without actually
opening /dev/null.

This prevents ffmpeg -version from hanging when run in the background.
Fixes #955.

The reason is that ffmpeg tries to manipulate stdin to set up terminal
characteristic, and that causes the kernel to suspend the parent
process (youtube-dl).

Note that closing stdin is achieved by calling subprocess.Popen() with
stdin set to subprocess.PIPE and without passing any input to
Popen.communicate(). This is somewhat subtle.
2016-10-22 00:34:08 -04:00
Sergey M․
69c2d42bd7 release 2016.10.21.1 2016-10-21 04:57:28 +07:00
Sergey M․
062e2769a3 [ChangeLog] Actualize 2016-10-21 04:53:26 +07:00
Sergey M․
859447a28d [adobepass] PEP 8 2016-10-21 04:38:14 +07:00
Sergey M․
f8ae2c7f30 [pluralsight] Process all clip URLs (closes #10984) 2016-10-21 04:35:32 +07:00
Sergey M․
9ce0077485 release 2016.10.21 2016-10-21 03:08:42 +07:00
Sergey M․
0ebb86bd18 [ChangeLog] Actualize 2016-10-21 03:07:03 +07:00
Sergey M․
9df6b03caf [pluralsight] Adapt to new API (closes #10972) 2016-10-21 03:00:03 +07:00
Yen Chi Hsuan
8e2915d70b Revert "[postprocessor/embedthumbnail] Allow mkv to embed thumbnails"
This reverts commit 7360db05b4.

This commit was added as an attempt to fix #6046. Unfortunately, the fix
is completely wrong. As reported on #10359, embedded thumbnails are not
displayed in VLC, and Se7en on IRC reports that the embedded thumbnail
misleads mpv as well.

The correct way is using -attachment of ffmpeg, while the current
run_ffmpeg_multiple_files API can't handle it cleanly.
2016-10-20 15:07:19 +08:00
Yen Chi Hsuan
19e447150d [ChangeLog] Update for #10971 2016-10-20 04:19:33 +08:00
Yen Chi Hsuan
ad9fd84004 Merge pull request #10971 from kasper93/openload
[openload] Fix extraction.
2016-10-20 04:18:27 +08:00
Kacper Michajłow
60633ae9a0 [openload] Fix extraction.
Fixes #10408
2016-10-19 22:00:29 +02:00
Remita Amine
a81dc82151 Credit @raleeper for the Comcast support (#10819) 2016-10-19 20:39:52 +01:00
remitamine
9218a6b4f5 Merge pull request #10819 from raleeper/adobepass
[adobepass] Add Comcast
2016-10-19 20:16:24 +01:00
Remita Amine
02af6ec707 [natgeo] extract m3u8 formats(closes #10959) 2016-10-19 19:39:41 +01:00
Sergey M․
05b7996cab [ChangeLog] Fix typos and add unreleased version header 2016-10-20 00:00:53 +07:00
raleeper
46f6052950 [adobepass] Add Comcast with fixed _download_webpage calls 2016-10-19 09:56:26 -07:00
Sergey M․
c8802041dd release 2016.10.19 2016-10-19 23:55:16 +07:00
Sergey M․
c7911009a0 [ChangeLog] Actualize 2016-10-19 23:53:37 +07:00
Sergey M․
2b96b06bf0 [vidzi] Fix extraction (closes #10908, closes #10952) 2016-10-19 23:31:58 +07:00
Sergey M․
06b3fe2926 [utils] Expose PACKED_CODES_RE 2016-10-19 23:28:49 +07:00
Jack Danger Canty
2c6743bf0f [README.md] Change 'guys' to 'people'
Folks at Ubuntu aren't all male so calling them 'guys' is a little odd.
2016-10-19 22:32:17 +07:00
Remita Amine
efb6242916 [urplay] add supprt for urskola.se and fix subtitle extraction(closes #10915) 2016-10-19 15:05:39 +01:00
Remita Amine
0384932e3d [extractor/common] try to extract non smil wowza mpd manifests 2016-10-19 14:57:12 +01:00
Remita Amine
edd6074cea [extractor/common] detect f4m audio only formats 2016-10-19 14:42:48 +01:00
Remita Amine
791d29dbf8 [orf] add subtitles support(closes #10939) 2016-10-19 11:34:46 +01:00
Sergey M․
481cc7335c [youtube] Fix --no-playlist behavior for youtu.be/id URLs (closes #10896) 2016-10-19 03:27:18 +07:00
Sergey M․
853a71b628 [nrk] Improve _VALID_URL 2016-10-19 03:02:14 +07:00
Sergey M․
e2628fb6a0 [nrk] Relax _VALID_URL (closes #10928) 2016-10-19 02:59:44 +07:00
Sergey M․
df4939b1cd [nytimes] Fix typo 2016-10-17 22:16:23 +07:00
Sergey M․
0b94dbb115 [postprocessor/ffmpeg] PEP 8 2016-10-16 19:16:47 +07:00
Sergey M․
8d76bdf12b [extractor/common] Mention podcast in series fields section 2016-10-16 18:37:17 +07:00
Sergey M․
8204bacf1d Credit @johnhawkinson for nytimes podcasts (#10926) 2016-10-16 18:21:42 +07:00
Sergey M․
47da782337 [nytimes] Improve (closes #10926) 2016-10-16 18:21:02 +07:00
John Hawkinson
74324a7ac2 [nytimes] Add support for podcasts 2016-10-16 17:31:55 +07:00
Sergey M․
b0dfcab60a [pluralsight] Relax _VALID_URL (closes #10941) 2016-10-16 17:20:32 +07:00
DarkstaIkers
6cbb20bb09 Update crunchyroll.py 2016-03-29 14:26:24 -03:00
426 changed files with 6903 additions and 2774 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.10.16*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.10.16**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.08*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.08**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.10.16
[debug] youtube-dl version 2017.01.08
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information

View File

@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information

4
.gitignore vendored
View File

@@ -30,6 +30,10 @@ updates_key.pem
*.m4v
*.mp3
*.3gp
*.wav
*.ape
*.mkv
*.swf
*.part
*.swp
test/testdata

View File

@@ -186,3 +186,8 @@ Sebastian Blunt
Matěj Cepl
Xie Yanbo
Philip Xu
John Hawkinson
Rich Leeper
Zhong Jianxin
Thor77
Mattias Wadman

View File

@@ -58,7 +58,7 @@ We are then presented with a very complicated request when the original problem
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
### Is anyone going to need the feature?
@@ -92,9 +92,9 @@ If you want to create a build of youtube-dl yourself, you'll need
### Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with:
@@ -124,7 +124,7 @@ After you have ensured this site is distributing it's content legally, you can f
'id': '42',
'ext': 'mp4',
'title': 'Video title goes here',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
# TODO more properties, either as:
# * A value
# * MD5 checksum; start the string with md5:
@@ -199,7 +199,7 @@ Assume at this point `meta`'s layout is:
}
```
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
```python
description = meta.get('summary') # correct
@@ -245,7 +245,7 @@ Say `meta` from the previous example has a `title` and you are about to extract
title = meta['title']
```
If `title` disappeares from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
If `title` disappears from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
Assume that you have some another source you can extract `title` from, for example `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:

354
ChangeLog
View File

@@ -1,3 +1,357 @@
version 2017.01.08
Core
* Fix "invalid escape sequence" errors under Python 3.6 (#11581)
Extractors
+ [hitrecord] Add support for hitrecord.org (#10867, #11626)
- [videott] Remove extractor
* [swrmediathek] Improve extraction
- [sharesix] Remove extractor
- [aol:features] Remove extractor
* [sendtonews] Improve info extraction
* [3sat,phoenix] Fix extraction (#11619)
* [comedycentral/mtv] Add support for HLS videos (#11600)
* [discoverygo] Fix JSON data parsing (#11219, #11522)
version 2017.01.05
Extractors
+ [zdf] Fix extraction (#11055, #11063)
* [pornhub:playlist] Improve extraction (#11594)
+ [cctv] Add support for ncpa-classic.com (#11591)
+ [tunein] Add support for embeds (#11579)
version 2017.01.02
Extractors
* [cctv] Improve extraction (#879, #6753, #8541)
+ [nrktv:episodes] Add support for episodes (#11571)
+ [arkena] Add support for video.arkena.com (#11568)
version 2016.12.31
Core
+ Introduce --config-location option for custom configuration files (#6745,
#10648)
Extractors
+ [twitch] Add support for player.twitch.tv (#11535, #11537)
+ [videa] Add support for videa.hu (#8181, #11133)
* [vk] Fix postlive videos extraction
* [vk] Extract from playerParams (#11555)
- [freevideo] Remove extractor (#11515)
+ [showroomlive] Add support for showroom-live.com (#11458)
* [xhamster] Fix duration extraction (#11549)
* [rtve:live] Fix extraction (#11529)
* [brightcove:legacy] Improve embeds detection (#11523)
+ [twitch] Add support for rechat messages (#11524)
* [acast] Fix audio and timestamp extraction (#11521)
version 2016.12.22
Core
* [extractor/common] Improve detection of video-only formats in m3u8
manifests (#11507)
Extractors
+ [theplatform] Pass geo verification headers to SMIL request (#10146)
+ [viu] Pass geo verification headers to auth request
* [rtl2] Extract more formats and metadata
* [vbox7] Skip malformed JSON-LD (#11501)
* [uplynk] Force downloading using native HLS downloader (#11496)
+ [laola1] Add support for another extraction scenario (#11460)
version 2016.12.20
Core
* [extractor/common] Improve fragment URL construction for DASH media
* [extractor/common] Fix codec information extraction for mixed audio/video
DASH media (#11490)
Extractors
* [vbox7] Fix extraction (#11494)
+ [uktvplay] Add support for uktvplay.uktv.co.uk (#11027)
+ [piksel] Add support for player.piksel.com (#11246)
+ [vimeo] Add support for DASH formats
* [vimeo] Fix extraction for HLS formats (#11490)
* [kaltura] Fix wrong widget ID in some cases (#11480)
+ [nrktv:direkte] Add support for live streams (#11488)
* [pbs] Fix extraction for geo restricted videos (#7095)
* [brightcove:new] Skip widevine classic videos
+ [viu] Add support for viu.com (#10607, #11329)
version 2016.12.18
Core
+ [extractor/common] Recognize DASH formats in html5 media entries
Extractors
+ [ccma] Add support for ccma.cat (#11359)
* [laola1tv] Improve extraction
+ [laola1tv] Add support embed URLs (#11460)
* [nbc] Fix extraction for MSNBC videos (#11466)
* [twitch] Adapt to new videos pages URL schema (#11469)
+ [meipai] Add support for meipai.com (#10718)
* [jwplatform] Improve subtitles and duration extraction
+ [ondemandkorea] Add support for ondemandkorea.com (#10772)
+ [vvvvid] Add support for vvvvid.it (#5915)
version 2016.12.15
Core
+ [utils] Add convenience urljoin
Extractors
+ [openload] Recognize oload.tv URLs (#10408)
+ [facebook] Recognize .onion URLs (#11443)
* [vlive] Fix extraction (#11375, #11383)
+ [canvas] Extract DASH formats
+ [melonvod] Add support for vod.melon.com (#11419)
version 2016.12.12
Core
+ [utils] Add common user agents map
+ [common] Recognize HLS manifests that contain video only formats (#11394)
Extractors
+ [dplay] Use Safari user agent for HLS (#11418)
+ [facebook] Detect login required error message
* [facebook] Improve video selection (#11390)
+ [canalplus] Add another video id pattern (#11399)
* [mixcloud] Relax URL regular expression (#11406)
* [ctvnews] Relax URL regular expression (#11394)
+ [rte] Capture and output error message (#7746, #10498)
+ [prosiebensat1] Add support for DASH formats
* [srgssr] Improve extraction for geo restricted videos (#11089)
* [rts] Improve extraction for geo restricted videos (#4989)
version 2016.12.09
Core
* [socks] Fix error reporting (#11355)
Extractors
* [openload] Fix extraction (#10408)
* [pandoratv] Fix extraction (#11023)
+ [telebruxelles] Add support for emission URLs
* [telebruxelles] Extract all formats
+ [bloomberg] Add another video id regular expression (#11371)
* [fusion] Update ooyala id regular expression (#11364)
+ [1tv] Add support for playlists (#11335)
* [1tv] Improve extraction (#11335)
+ [aenetworks] Extract more formats (#11321)
+ [thisoldhouse] Recognize /tv-episode/ URLs (#11271)
version 2016.12.01
Extractors
* [soundcloud] Update client id (#11327)
* [ruutu] Detect DRM protected videos
+ [liveleak] Add support for youtube embeds (#10688)
* [spike] Fix full episodes support (#11312)
* [comedycentral] Fix full episodes support
* [normalboots] Rewrite in terms of JWPlatform (#11184)
* [teamfourstar] Rewrite in terms of JWPlatform (#11184)
- [screenwavemedia] Remove extractor (#11184)
version 2016.11.27
Extractors
+ [webcaster] Add support for webcaster.pro
+ [azubu] Add support for azubu.uol.com.br (#11305)
* [viki] Prefer hls formats
* [viki] Fix rtmp formats extraction (#11255)
* [puls4] Relax URL regular expression (#11267)
* [vevo] Improve artist extraction (#10911)
* [mitele] Relax URL regular expression and extract more metadata (#11244)
+ [cbslocal] Recognize New York site (#11285)
+ [youtube:playlist] Pass disable_polymer in URL query (#11193)
version 2016.11.22
Extractors
* [hellporno] Fix video extension extraction (#11247)
+ [hellporno] Add support for hellporno.net (#11247)
+ [amcnetworks] Recognize more BBC America URLs (#11263)
* [funnyordie] Improve extraction (#11208)
* [extractor/generic] Improve limelight embeds support
- [crunchyroll] Remove ScaledBorderAndShadow from ASS subtitles (#8207, #9028)
* [bandcamp] Fix free downloads extraction and extract all formats (#11067)
* [twitter:card] Relax URL regular expression (#11225)
+ [tvanouvelles] Add support for tvanouvelles.ca (#10616)
version 2016.11.18
Extractors
* [youtube:live] Relax URL regular expression (#11164)
* [openload] Fix extraction (#10408, #11122)
* [vlive] Prefer locale over language for subtitles id (#11203)
version 2016.11.14.1
Core
+ [downoader/fragment,f4m,hls] Respect HTTP headers from info dict
* [extractor/common] Fix media templates with Bandwidth substitution pattern in
MPD manifests (#11175)
* [extractor/common] Improve thumbnail extraction from JSON-LD
Extractors
+ [nrk] Workaround geo restriction
+ [nrk] Improve error detection and messages
+ [afreecatv] Add support for vod.afreecatv.com (#11174)
* [cda] Fix and improve extraction (#10929, #10936)
* [plays] Fix extraction (#11165)
* [eagleplatform] Fix extraction (#11160)
+ [audioboom] Recognize /posts/ URLs (#11149)
version 2016.11.08.1
Extractors
* [espn:article] Fix support for espn.com articles
* [franceculture] Fix extraction (#11140)
version 2016.11.08
Extractors
* [tmz:article] Fix extraction (#11052)
* [espn] Fix extraction (#11041)
* [mitele] Fix extraction after website redesign (#10824)
- [ard] Remove age restriction check (#11129)
* [generic] Improve support for pornhub.com embeds (#11100)
+ [generic] Add support for redtube.com embeds (#11099)
+ [generic] Add support for drtuber.com embeds (#11098)
+ [redtube] Add support for embed URLs
+ [drtuber] Add support for embed URLs
+ [yahoo] Improve content id extraction (#11088)
* [toutv] Relax URL regular expression (#11121)
version 2016.11.04
Core
* [extractor/common] Tolerate malformed RESOLUTION attribute in m3u8
manifests (#11113)
* [downloader/ism] Fix AVC Decoder Configuration Record
Extractors
+ [fox9] Add support for fox9.com (#11110)
+ [anvato] Extract more metadata and improve formats extraction
* [vodlocker] Improve removed videos detection (#11106)
+ [vzaar] Add support for vzaar.com (#11093)
+ [vice] Add support for uplynk preplay videos (#11101)
* [tubitv] Fix extraction (#11061)
+ [shahid] Add support for authentication (#11091)
+ [radiocanada] Add subtitles support (#11096)
+ [generic] Add support for ISM manifests
version 2016.11.02
Core
+ Add basic support for Smooth Streaming protocol (#8118, #10969)
* Improve MPD manifest base URL extraction (#10909, #11079)
* Fix --match-filter for int-like strings (#11082)
Extractors
+ [mva] Add support for ISM formats
+ [msn] Add support for ISM formats
+ [onet] Add support for ISM formats
+ [tvp] Add support for ISM formats
+ [nicknight] Add support for nicknight sites (#10769)
version 2016.10.30
Extractors
* [facebook] Improve 1080P video detection (#11073)
* [imgur] Recognize /r/ URLs (#11071)
* [beeg] Fix extraction (#11069)
* [openload] Fix extraction (#10408)
* [gvsearch] Modernize and fix search request (#11051)
* [adultswim] Fix extraction (#10979)
+ [nobelprize] Add support for nobelprize.org (#9999)
* [hornbunny] Fix extraction (#10981)
* [tvp] Improve video id extraction (#10585)
version 2016.10.26
Extractors
+ [rentv] Add support for ren.tv (#10620)
+ [ard] Detect unavailable videos (#11018)
* [vk] Fix extraction (#11022)
version 2016.10.25
Core
* Running youtube-dl in the background is fixed (#10996, #10706, #955)
Extractors
+ [jamendo] Add support for jamendo.com (#10132, #10736)
+ [pandatv] Add support for panda.tv (#10736)
+ [dotsub] Support Vimeo embed (#10964)
* [litv] Fix extraction
+ [vimeo] Delegate ondemand redirects to ondemand extractor (#10994)
* [vivo] Fix extraction (#11003)
+ [twitch:stream] Add support for rebroadcasts (#10995)
* [pluralsight] Fix subtitles conversion (#10990)
version 2016.10.21.1
Extractors
+ [pluralsight] Process all clip URLs (#10984)
version 2016.10.21
Core
- Disable thumbnails embedding in mkv
+ Add support for Comcast multiple-system operator (#10819)
Extractors
* [pluralsight] Adapt to new API (#10972)
* [openload] Fix extraction (#10408, #10971)
+ [natgeo] Extract m3u8 formats (#10959)
version 2016.10.19
Core
+ [utils] Expose PACKED_CODES_RE
+ [extractor/common] Extract non smil wowza mpd manifests
+ [extractor/common] Detect f4m audio-only formats
Extractors
* [vidzi] Fix extraction (#10908, #10952)
* [urplay] Fix subtitles extraction
+ [urplay] Add support for urskola.se (#10915)
+ [orf] Add subtitles support (#10939)
* [youtube] Fix --no-playlist behavior for youtu.be/id URLs (#10896)
* [nrk] Relax URL regular expression (#10928)
+ [nytimes] Add support for podcasts (#10926)
* [pluralsight] Relax URL regular expression (#10941)
version 2016.10.16
Core

View File

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete

View File

@@ -29,7 +29,7 @@ Windows users can [download an .exe file](https://yt-dl.org/latest/youtube-dl.ex
You can also use pip:
sudo pip install --upgrade youtube-dl
sudo -H pip install --upgrade youtube-dl
This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
@@ -44,11 +44,7 @@ Or with [MacPorts](https://www.macports.org/):
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
**youtube-dl** is a command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
your Unix box, on Windows or on Mac OS X. It is released to the public domain,
which means you can modify it, redistribute it or use it however you like.
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...]
@@ -84,6 +80,9 @@ which means you can modify it, redistribute it or use it however you like.
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
--config-location PATH Location of the configuration file; either
the path to the config or its containing
directory.
--flat-playlist Do not extract the videos of a playlist,
only list them.
--mark-watched Mark videos watched (YouTube only)
@@ -187,7 +186,7 @@ which means you can modify it, redistribute it or use it however you like.
of SIZE.
--playlist-reverse Download playlist videos in reverse order
--xattr-set-filesize Set file xattribute ytdl.filesize with
expected filesize (experimental)
expected file size (experimental)
--hls-prefer-native Use the native HLS downloader instead of
ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
@@ -354,7 +353,7 @@ which means you can modify it, redistribute it or use it however you like.
-u, --username USERNAME Login with this account ID
-p, --password PASSWORD Account password. If this option is left
out, youtube-dl will ask interactively.
-2, --twofactor TWOFACTOR Two-factor auth code
-2, --twofactor TWOFACTOR Two-factor authentication code
-n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, smotri, youku)
@@ -447,6 +446,8 @@ Note that options in configuration file are just the same options aka switches u
You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
You can also use `--config-location` if you want to use custom configuration file for a particular youtube-dl run.
### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on a per extractor basis. For that you will need to create a `.netrc` file in your `$HOME` and restrict permissions to read/write by only you:
@@ -638,7 +639,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
@@ -664,7 +665,7 @@ $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger that 50 MB
# Download best video only format but no bigger than 50 MB
$ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
@@ -728,7 +729,7 @@ Add a file exclusion for `youtube-dl.exe` in Windows Defender settings.
YouTube changed their playlist format in March 2014 and later on, so you'll need at least youtube-dl 2014.07.25 to download all YouTube videos.
If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging guys](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.
If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging people](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.
### I'm getting an error when trying to use output template: `error: using output template conflicts with using title, video ID or auto number`
@@ -744,7 +745,7 @@ Most people asking this question are not aware that youtube-dl now defaults to d
### I get HTTP error 402 when trying to download a video. What's this?
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a webbrowser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
### Do I need any other programs?
@@ -756,9 +757,9 @@ Videos or video formats streamed via RTMP protocol can only be downloaded when [
Once the video is fully downloaded, use any video player, such as [mpv](https://mpv.io/), [vlc](http://www.videolan.org/) or [mplayer](http://www.mplayerhq.hu/).
### I extracted a video URL with `-g`, but it does not play on another machine / in my webbrowser.
### I extracted a video URL with `-g`, but it does not play on another machine / in my web browser.
It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl.
It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies and/or HTTP headers. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl. You can also get necessary cookies and HTTP headers from JSON output obtained with `--dump-json`.
It may be beneficial to use IPv6; in some cases, the restrictions are only applied to IPv4. Some services (sometimes only for a subset of videos) do not restrict the video URL by IP address, cookie, or user-agent, but these are the exception rather than the rule.
@@ -930,9 +931,9 @@ If you want to create a build of youtube-dl yourself, you'll need
### Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with:
@@ -962,7 +963,7 @@ After you have ensured this site is distributing it's content legally, you can f
'id': '42',
'ext': 'mp4',
'title': 'Video title goes here',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
# TODO more properties, either as:
# * A value
# * MD5 checksum; start the string with md5:
@@ -1037,7 +1038,7 @@ Assume at this point `meta`'s layout is:
}
```
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
```python
description = meta.get('summary') # correct
@@ -1083,7 +1084,7 @@ Say `meta` from the previous example has a `title` and you are about to extract
title = meta['title']
```
If `title` disappeares from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
If `title` disappears from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
Assume that you have some another source you can extract `title` from, for example `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:
@@ -1149,7 +1150,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
@@ -1252,7 +1253,7 @@ We are then presented with a very complicated request when the original problem
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
### Is anyone going to need the feature?

View File

@@ -25,5 +25,6 @@ def build_completion(opt_parser):
filled_template = template.replace("{{flags}}", " ".join(opts_flag))
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)

View File

@@ -424,8 +424,6 @@ class BuildHTTPRequestHandler(compat_http_server.BaseHTTPRequestHandler):
self.send_header('Content-Length', len(msg))
self.end_headers()
self.wfile.write(msg)
except HTTPError as e:
self.send_response(e.code, str(e))
else:
self.send_response(500, 'Unknown build method "%s"' % action)
else:

View File

@@ -2,11 +2,13 @@
from __future__ import unicode_literals
import base64
import io
import json
import mimetypes
import netrc
import optparse
import os
import re
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@@ -90,16 +92,23 @@ class GitHubReleaser(object):
def main():
parser = optparse.OptionParser(usage='%prog VERSION BUILDPATH')
parser = optparse.OptionParser(usage='%prog CHANGELOG VERSION BUILDPATH')
options, args = parser.parse_args()
if len(args) != 2:
if len(args) != 3:
parser.error('Expected a version and a build directory')
version, build_path = args
changelog_file, version, build_path = args
with io.open(changelog_file, encoding='utf-8') as inf:
changelog = inf.read()
mobj = re.search(r'(?s)version %s\n{2}(.+?)\n{3}' % version, changelog)
body = mobj.group(1) if mobj else ''
releaser = GitHubReleaser()
new_release = releaser.create_release(version, name='youtube-dl %s' % version)
new_release = releaser.create_release(
version, name='youtube-dl %s' % version, body=body)
release_id = new_release['id']
for asset in os.listdir(build_path):

View File

@@ -44,5 +44,6 @@ def build_completion(opt_parser):
with open(FISH_COMPLETION_FILE, 'w') as f:
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)

View File

@@ -23,6 +23,7 @@ def openssl_encode(algo, key, iv):
out, _ = prog.communicate(secret_msg)
return out
iv = key = [0x20, 0x15] + 14 * [0]
r = openssl_encode('aes-128-cbc', key, iv)

View File

@@ -32,5 +32,6 @@ def main():
with open('supportedsites.html', 'w', encoding='utf-8') as sitesf:
sitesf.write(template)
if __name__ == '__main__':
main()

View File

@@ -28,5 +28,6 @@ def main():
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()

View File

@@ -59,6 +59,7 @@ def build_lazy_ie(ie, name):
s += make_valid_template.format(valid_url=ie._make_valid_url())
return s
# find the correct sorting and add the required base classes so that sublcasses
# can be correctly created
classes = _ALL_CLASSES[:-1]

View File

@@ -41,5 +41,6 @@ def main():
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()

View File

@@ -74,5 +74,6 @@ def filter_options(readme):
return ret
if __name__ == '__main__':
main()

View File

@@ -110,7 +110,7 @@ RELEASE_FILES="youtube-dl youtube-dl.exe youtube-dl-$version.tar.gz"
for f in $RELEASE_FILES; do gpg --passphrase-repeat 5 --detach-sig "build/$version/$f"; done
ROOT=$(pwd)
python devscripts/create-github-release.py $version "$ROOT/build/$version"
python devscripts/create-github-release.py ChangeLog $version "$ROOT/build/$version"
ssh ytdl@yt-dl.org "sh html/update_latest.sh $version"

View File

@@ -44,5 +44,6 @@ def build_completion(opt_parser):
with open(ZSH_COMPLETION_FILE, "w") as f:
f.write(template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)

View File

@@ -131,7 +131,8 @@
- **cbsnews**: CBS News
- **cbsnews:livevideo**: CBS News Live Videos
- **CBSSports**
- **CCTV**
- **CCMA**
- **CCTV**: 央视网
- **CDA**
- **CeskaTelevize**
- **channel9**: Channel 9
@@ -158,6 +159,7 @@
- **CollegeRama**
- **ComCarCoff**
- **ComedyCentral**
- **ComedyCentralFullEpisodes**
- **ComedyCentralShortname**
- **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
@@ -225,6 +227,7 @@
- **EroProfile**
- **Escapist**
- **ESPN**
- **ESPNArticle**
- **EsriVideo**
- **Europa**
- **EveryonesMixtape**
@@ -237,7 +240,6 @@
- **fc2**
- **fc2:embed**
- **Fczenit**
- **features.aol.com**
- **fernsehkritik.tv**
- **Firstpost**
- **FiveTV**
@@ -247,6 +249,7 @@
- **FootyRoom**
- **Formula1**
- **FOX**
- **FOX9**
- **Foxgay**
- **foxnews**: Fox News and Fox Business Video
- **foxnews:article**
@@ -259,7 +262,6 @@
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- **FreeVideo**
- **Funimation**
- **FunnyOrDie**
- **Fusion**
@@ -301,6 +303,7 @@
- **history:topic**: History.com Topic
- **hitbox**
- **hitbox:live**
- **HitRecord**
- **HornBunny**
- **HotNewHipHop**
- **HotStar**
@@ -332,6 +335,8 @@
- **ivideon**: Ivideon TV
- **Iwara**
- **Izlesene**
- **Jamendo**
- **JamendoAlbum**
- **JeuxVideo**
- **Jove**
- **jpopsuki.tv**
@@ -359,7 +364,8 @@
- **kuwo:singer**: 酷我音乐 - 歌手
- **kuwo:song**: 酷我音乐
- **la7.it**
- **Laola1Tv**
- **laola1tv**
- **laola1tv:embed**
- **LCI**
- **Lcp**
- **LcpPlay**
@@ -397,6 +403,8 @@
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
- **Meipai**: 美拍
- **MelonVOD**
- **META**
- **metacafe**
- **Metacritic**
@@ -481,11 +489,13 @@
- **nhl.com:videocenter:category**: NHL videocenter category
- **nick.com**
- **nick.de**
- **nicknight**
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **Nintendo**
- **njoy**: N-JOY
- **njoy:embed**
- **NobelPrize**
- **Noco**
- **Normalboots**
- **NosVideo**
@@ -506,6 +516,8 @@
- **NRKPlaylist**
- **NRKSkole**: NRK Skole
- **NRKTV**: NRK TV and NRK Radio
- **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
- **NRKTVEpisodes**
- **ntv.ru**
- **Nuvid**
- **NYTimes**
@@ -516,6 +528,7 @@
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
- **OnDemandKorea**
- **onet.tv**
- **onet.tv:channel**
- **OnionStudios**
@@ -527,6 +540,7 @@
- **orf:iptv**: iptv.ORF.at
- **orf:oe1**: Radio Österreich 1
- **orf:tvthek**: ORF TVthek
- **PandaTV**: 熊猫TV
- **pandora.tv**: 판도라TV
- **parliamentlive.tv**: UK parliament videos
- **Patreon**
@@ -538,6 +552,7 @@
- **PhilharmonieDeParis**: Philharmonie de Paris
- **phoenix.de**
- **Photobucket**
- **Piksel**
- **Pinkbike**
- **Pladform**
- **play.fm**
@@ -586,6 +601,8 @@
- **RDS**: RDS.ca
- **RedTube**
- **RegioTV**
- **RENTV**
- **RENTVArticle**
- **Restudy**
- **Reuters**
- **ReverbNation**
@@ -633,16 +650,14 @@
- **screen.yahoo:search**: Yahoo screen search
- **Screencast**
- **ScreencastOMatic**
- **ScreenJunkies**
- **ScreenwaveMedia**
- **Seeker**
- **SenateISVP**
- **SendtoNews**
- **ServingSys**
- **Sexu**
- **Shahid**
- **Shared**: shared.sx and vivo.sx
- **ShareSix**
- **Shared**: shared.sx
- **ShowRoomLive**
- **Sina**
- **SixPlay**
- **skynewsarabia:article**
@@ -706,7 +721,7 @@
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
- **Teamcoco**
- **TeamFour**
- **TeamFourStar**
- **TechTalks**
- **techtv.mit.edu**
- **ted**
@@ -762,6 +777,8 @@
- **TV2Article**
- **TV3**
- **TV4**: tv4.se and tv4play.se
- **TVANouvelles**
- **TVANouvellesArticle**
- **TVC**
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
@@ -773,10 +790,13 @@
- **Tweakers**
- **twitch:chapter**
- **twitch:clips**
- **twitch:past_broadcasts**
- **twitch:profile**
- **twitch:stream**
- **twitch:video**
- **twitch:videos:all**
- **twitch:videos:highlights**
- **twitch:videos:past-broadcasts**
- **twitch:videos:uploads**
- **twitch:vod**
- **twitter**
- **twitter:amplify**
@@ -784,6 +804,7 @@
- **udemy**
- **udemy:course**
- **UDNEmbed**: 聯合影音
- **UKTVPlay**
- **Unistra**
- **uol.com.br**
- **uplynk**
@@ -812,6 +833,7 @@
- **ViceShow**
- **Vidbit**
- **Viddler**
- **Videa**
- **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoDetective**
@@ -821,7 +843,6 @@
- **videomore:season**
- **videomore:video**
- **VideoPremium**
- **VideoTt**: video.tt - Your True Tube (Currently broken)
- **videoweed**: VideoWeed
- **Vidio**
- **vidme**
@@ -848,6 +869,10 @@
- **Vimple**: Vimple - one-click video hosting
- **Vine**
- **vine:user**
- **Viu**
- **viu:ott**
- **viu:playlist**
- **Vivo**: vivo.sx
- **vk**: VK
- **vk:uservideos**: VK - User's Videos
- **vk:wallpost**
@@ -861,7 +886,9 @@
- **VRT**
- **vube**: Vube.com
- **VuClip**
- **VVVVID**
- **VyboryMos**
- **Vzaar**
- **Walla**
- **washingtonpost**
- **washingtonpost:article**
@@ -869,6 +896,8 @@
- **WatchIndianPorn**: Watch Indian Porn
- **WDR**
- **wdr:mobile**
- **Webcaster**
- **WebcasterFeed**
- **WebOfStories**
- **WebOfStoriesPlaylist**
- **WeiqiTV**: WQTV

View File

@@ -84,5 +84,6 @@ class TestInfoExtractor(unittest.TestCase):
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -605,6 +605,7 @@ class TestYoutubeDL(unittest.TestCase):
'extractor': 'TEST',
'duration': 30,
'filesize': 10 * 1024,
'playlist_id': '42',
}
second = {
'id': '2',
@@ -614,6 +615,7 @@ class TestYoutubeDL(unittest.TestCase):
'duration': 10,
'description': 'foo',
'filesize': 5 * 1024,
'playlist_id': '43',
}
videos = [first, second]
@@ -650,6 +652,10 @@ class TestYoutubeDL(unittest.TestCase):
res = get_videos(f)
self.assertEqual(res, ['1'])
f = match_filter_func('playlist_id = 42')
res = get_videos(f)
self.assertEqual(res, ['1'])
def test_playlist_items_selection(self):
entries = [{
'id': compat_str(i),

View File

@@ -51,5 +51,6 @@ class TestAES(unittest.TestCase):
decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg)
if __name__ == '__main__':
unittest.main()

View File

@@ -60,6 +60,7 @@ def _file_md5(fn):
with open(fn, 'rb') as f:
return hashlib.md5(f.read()).hexdigest()
defs = gettestcases()
@@ -217,6 +218,7 @@ def generator(test_case):
return test_template
# And add them to TestDownload
for n, test_case in enumerate(defs):
test_method = generator(test_case)

View File

@@ -39,5 +39,6 @@ class TestExecution(unittest.TestCase):
_, stderr = p.communicate()
self.assertFalse(stderr)
if __name__ == '__main__':
unittest.main()

View File

@@ -169,5 +169,6 @@ class TestProxy(unittest.TestCase):
# b'xn--fiq228c' is '中文'.encode('idna')
self.assertEqual(response, 'normal: http://xn--fiq228c.tw/')
if __name__ == '__main__':
unittest.main()

View File

@@ -43,5 +43,6 @@ class TestIqiyiSDKInterpreter(unittest.TestCase):
ie._login()
self.assertTrue('unable to log in:' in logger.messages[0])
if __name__ == '__main__':
unittest.main()

View File

@@ -104,6 +104,14 @@ class TestJSInterpreter(unittest.TestCase):
}''')
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_call(self):
jsi = JSInterpreter('''
function x() { return 2; }
function y(a) { return x() + a; }
function z() { return y(3); }
''')
self.assertEqual(jsi.call_function('z'), 5)
if __name__ == '__main__':
unittest.main()

View File

@@ -69,6 +69,8 @@ from youtube_dl.utils import (
uppercase_escape,
lowercase_escape,
url_basename,
base_url,
urljoin,
urlencode_postdata,
urshift,
update_url_query,
@@ -437,6 +439,30 @@ class TestUtil(unittest.TestCase):
url_basename('http://media.w3.org/2010/05/sintel/trailer.mp4'),
'trailer.mp4')
def test_base_url(self):
self.assertEqual(base_url('http://foo.de/'), 'http://foo.de/')
self.assertEqual(base_url('http://foo.de/bar'), 'http://foo.de/')
self.assertEqual(base_url('http://foo.de/bar/'), 'http://foo.de/bar/')
self.assertEqual(base_url('http://foo.de/bar/baz'), 'http://foo.de/bar/')
self.assertEqual(base_url('http://foo.de/bar/baz?x=z/x/c'), 'http://foo.de/bar/')
def test_urljoin(self):
self.assertEqual(urljoin('http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('//foo.de/', '/a/b/c.txt'), '//foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de/', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de/', 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de/', '//foo.de/a/b/c.txt'), '//foo.de/a/b/c.txt')
self.assertEqual(urljoin(None, 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin(None, '//foo.de/a/b/c.txt'), '//foo.de/a/b/c.txt')
self.assertEqual(urljoin('', 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin(['foobar'], 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de/', None), None)
self.assertEqual(urljoin('http://foo.de/', ''), None)
self.assertEqual(urljoin('http://foo.de/', ['foobar']), None)
self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt')
def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None)
self.assertEqual(parse_age_limit(False), None)
@@ -1067,5 +1093,6 @@ The first line
self.assertEqual(get_element_by_class('foo', html), 'nice')
self.assertEqual(get_element_by_class('no-such-class', html), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -66,5 +66,6 @@ class TestVerboseOutput(unittest.TestCase):
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
if __name__ == '__main__':
unittest.main()

View File

@@ -24,6 +24,7 @@ class YoutubeDL(youtube_dl.YoutubeDL):
super(YoutubeDL, self).__init__(*args, **kwargs)
self.to_stderr = self.to_screen
params = get_params({
'writeannotations': True,
'skip_download': True,
@@ -74,5 +75,6 @@ class TestAnnotations(unittest.TestCase):
def tearDown(self):
try_rm(ANNOTATIONS_FILE)
if __name__ == '__main__':
unittest.main()

View File

@@ -66,5 +66,6 @@ class TestYoutubeLists(unittest.TestCase):
for entry in result['entries']:
self.assertTrue(entry.get('title'))
if __name__ == '__main__':
unittest.main()

View File

@@ -114,6 +114,7 @@ def make_tfunc(url, stype, sig_input, expected_sig):
test_func.__name__ = str('test_signature_' + stype + '_' + test_id)
setattr(TestSignature, test_func.__name__, test_func)
for test_spec in _TESTS:
make_tfunc(*test_spec)

View File

@@ -1339,7 +1339,7 @@ class YoutubeDL(object):
format['format_id'] = compat_str(i)
else:
# Sanitize format_id from characters used in format selector expression
format['format_id'] = re.sub('[\s,/+\[\]()]', '_', format['format_id'])
format['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', format['format_id'])
format_id = format['format_id']
if format_id not in formats_dict:
formats_dict[format_id] = []
@@ -1658,7 +1658,7 @@ class YoutubeDL(object):
video_ext, audio_ext = audio.get('ext'), video.get('ext')
if video_ext and audio_ext:
COMPATIBLE_EXTS = (
('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v'),
('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'),
('webm')
)
for exts in COMPATIBLE_EXTS:

View File

@@ -95,8 +95,7 @@ def _real_main(argv=None):
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except IOError:
sys.exit('ERROR: batch file could not be read')
all_urls = batch_urls + args
all_urls = [url.strip() for url in all_urls]
all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls
_enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
@@ -406,7 +405,7 @@ def _real_main(argv=None):
'postprocessor_args': postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
'geo_verification_proxy': opts.geo_verification_proxy,
'config_location': opts.config_location,
}
with YoutubeDL(ydl_opts) as ydl:
@@ -450,4 +449,5 @@ def main(argv=None):
except KeyboardInterrupt:
sys.exit('\nERROR: Interrupted by user')
__all__ = ['main', 'YoutubeDL', 'gen_extractors', 'list_extractors']

View File

@@ -174,6 +174,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
return plaintext
RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36)
SBOX = (0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
@@ -328,4 +329,5 @@ def inc(data):
break
return data
__all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']

View File

@@ -2344,7 +2344,7 @@ try:
from urllib.parse import unquote_plus as compat_urllib_parse_unquote_plus
except ImportError: # Python 2
_asciire = (compat_urllib_parse._asciire if hasattr(compat_urllib_parse, '_asciire')
else re.compile('([\x00-\x7f]+)'))
else re.compile(r'([\x00-\x7f]+)'))
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version
@@ -2491,6 +2491,7 @@ class _TreeBuilder(etree.TreeBuilder):
def doctype(self, name, pubid, system):
pass
if sys.version_info[0] >= 3:
def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
@@ -2787,6 +2788,7 @@ def workaround_optparse_bug9161():
return real_add_option(self, *bargs, **bkwargs)
optparse.OptionGroup.add_option = _compat_add_option
if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3
compat_get_terminal_size = shutil.get_terminal_size
else:

View File

@@ -7,6 +7,7 @@ from .http import HttpFD
from .rtmp import RtmpFD
from .dash import DashSegmentsFD
from .rtsp import RtspFD
from .ism import IsmFD
from .external import (
get_external_downloader,
FFmpegFD,
@@ -24,6 +25,7 @@ PROTOCOL_MAP = {
'rtsp': RtspFD,
'f4m': F4mFD,
'http_dash_segments': DashSegmentsFD,
'ism': IsmFD,
}

View File

@@ -293,6 +293,7 @@ class FFmpegFD(ExternalFD):
class AVconvFD(FFmpegFD):
pass
_BY_NAME = dict(
(klass.get_basename(), klass)
for name, klass in globals().items()

View File

@@ -314,7 +314,8 @@ class F4mFD(FragmentFD):
man_url = info_dict['url']
requested_bitrate = info_dict.get('tbr')
self.to_screen('[%s] Downloading f4m manifest' % self.FD_NAME)
urlh = self.ydl.urlopen(man_url)
urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
man_url = urlh.geturl()
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244
@@ -387,7 +388,10 @@ class F4mFD(FragmentFD):
url_parsed = base_url_parsed._replace(path=base_url_parsed.path + name, query='&'.join(query))
frag_filename = '%s-%s' % (ctx['tmpfilename'], name)
try:
success = ctx['dl'].download(frag_filename, {'url': url_parsed.geturl()})
success = ctx['dl'].download(frag_filename, {
'url': url_parsed.geturl(),
'http_headers': info_dict.get('http_headers'),
})
if not success:
return False
(down, frag_sanitized) = sanitize_open(frag_filename, 'rb')

View File

@@ -9,6 +9,7 @@ from ..utils import (
error_to_compat_str,
encodeFilename,
sanitize_open,
sanitized_Request,
)
@@ -37,6 +38,10 @@ class FragmentFD(FileDownloader):
def report_skip_fragment(self, fragment_name):
self.to_screen('[download] Skipping fragment %s...' % fragment_name)
def _prepare_url(self, info_dict, url):
headers = info_dict.get('http_headers')
return sanitized_Request(url, None, headers) if headers else url
def _prepare_and_start_frag_download(self, ctx):
self._prepare_frag_download(ctx)
self._start_frag_download(ctx)

View File

@@ -59,11 +59,15 @@ class HlsFD(FragmentFD):
def real_download(self, filename, info_dict):
man_url = info_dict['url']
self.to_screen('[%s] Downloading m3u8 manifest' % self.FD_NAME)
manifest = self.ydl.urlopen(man_url).read()
manifest = self.ydl.urlopen(self._prepare_url(info_dict, man_url)).read()
s = manifest.decode('utf-8', 'ignore')
if not self.can_download(s, info_dict):
if info_dict.get('extra_param_to_segment_url'):
self.report_error('pycrypto not found. Please install it.')
return False
self.report_warning(
'hlsnative has detected features it does not support, '
'extraction will be delegated to ffmpeg')
@@ -112,7 +116,10 @@ class HlsFD(FragmentFD):
count = 0
while count <= fragment_retries:
try:
success = ctx['dl'].download(frag_filename, {'url': frag_url})
success = ctx['dl'].download(frag_filename, {
'url': frag_url,
'http_headers': info_dict.get('http_headers'),
})
if not success:
return False
down, frag_sanitized = sanitize_open(frag_filename, 'rb')

View File

@@ -0,0 +1,271 @@
from __future__ import unicode_literals
import os
import time
import struct
import binascii
import io
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import (
sanitize_open,
encodeFilename,
)
u8 = struct.Struct(b'>B')
u88 = struct.Struct(b'>Bx')
u16 = struct.Struct(b'>H')
u1616 = struct.Struct(b'>Hxx')
u32 = struct.Struct(b'>I')
u64 = struct.Struct(b'>Q')
s88 = struct.Struct(b'>bx')
s16 = struct.Struct(b'>h')
s1616 = struct.Struct(b'>hxx')
s32 = struct.Struct(b'>i')
unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000)
TRACK_ENABLED = 0x1
TRACK_IN_MOVIE = 0x2
TRACK_IN_PREVIEW = 0x4
SELF_CONTAINED = 0x1
def box(box_type, payload):
return u32.pack(8 + len(payload)) + box_type + payload
def full_box(box_type, version, flags, payload):
return box(box_type, u8.pack(version) + u32.pack(flags)[1:] + payload)
def write_piff_header(stream, params):
track_id = params['track_id']
fourcc = params['fourcc']
duration = params['duration']
timescale = params.get('timescale', 10000000)
language = params.get('language', 'und')
height = params.get('height', 0)
width = params.get('width', 0)
is_audio = width == 0 and height == 0
creation_time = modification_time = int(time.time())
ftyp_payload = b'isml' # major brand
ftyp_payload += u32.pack(1) # minor version
ftyp_payload += b'piff' + b'iso2' # compatible brands
stream.write(box(b'ftyp', ftyp_payload)) # File Type Box
mvhd_payload = u64.pack(creation_time)
mvhd_payload += u64.pack(modification_time)
mvhd_payload += u32.pack(timescale)
mvhd_payload += u64.pack(duration)
mvhd_payload += s1616.pack(1) # rate
mvhd_payload += s88.pack(1) # volume
mvhd_payload += u16.pack(0) # reserved
mvhd_payload += u32.pack(0) * 2 # reserved
mvhd_payload += unity_matrix
mvhd_payload += u32.pack(0) * 6 # pre defined
mvhd_payload += u32.pack(0xffffffff) # next track id
moov_payload = full_box(b'mvhd', 1, 0, mvhd_payload) # Movie Header Box
tkhd_payload = u64.pack(creation_time)
tkhd_payload += u64.pack(modification_time)
tkhd_payload += u32.pack(track_id) # track id
tkhd_payload += u32.pack(0) # reserved
tkhd_payload += u64.pack(duration)
tkhd_payload += u32.pack(0) * 2 # reserved
tkhd_payload += s16.pack(0) # layer
tkhd_payload += s16.pack(0) # alternate group
tkhd_payload += s88.pack(1 if is_audio else 0) # volume
tkhd_payload += u16.pack(0) # reserved
tkhd_payload += unity_matrix
tkhd_payload += u1616.pack(width)
tkhd_payload += u1616.pack(height)
trak_payload = full_box(b'tkhd', 1, TRACK_ENABLED | TRACK_IN_MOVIE | TRACK_IN_PREVIEW, tkhd_payload) # Track Header Box
mdhd_payload = u64.pack(creation_time)
mdhd_payload += u64.pack(modification_time)
mdhd_payload += u32.pack(timescale)
mdhd_payload += u64.pack(duration)
mdhd_payload += u16.pack(((ord(language[0]) - 0x60) << 10) | ((ord(language[1]) - 0x60) << 5) | (ord(language[2]) - 0x60))
mdhd_payload += u16.pack(0) # pre defined
mdia_payload = full_box(b'mdhd', 1, 0, mdhd_payload) # Media Header Box
hdlr_payload = u32.pack(0) # pre defined
hdlr_payload += b'soun' if is_audio else b'vide' # handler type
hdlr_payload += u32.pack(0) * 3 # reserved
hdlr_payload += (b'Sound' if is_audio else b'Video') + b'Handler\0' # name
mdia_payload += full_box(b'hdlr', 0, 0, hdlr_payload) # Handler Reference Box
if is_audio:
smhd_payload = s88.pack(0) # balance
smhd_payload = u16.pack(0) # reserved
media_header_box = full_box(b'smhd', 0, 0, smhd_payload) # Sound Media Header
else:
vmhd_payload = u16.pack(0) # graphics mode
vmhd_payload += u16.pack(0) * 3 # opcolor
media_header_box = full_box(b'vmhd', 0, 1, vmhd_payload) # Video Media Header
minf_payload = media_header_box
dref_payload = u32.pack(1) # entry count
dref_payload += full_box(b'url ', 0, SELF_CONTAINED, b'') # Data Entry URL Box
dinf_payload = full_box(b'dref', 0, 0, dref_payload) # Data Reference Box
minf_payload += box(b'dinf', dinf_payload) # Data Information Box
stsd_payload = u32.pack(1) # entry count
sample_entry_payload = u8.pack(0) * 6 # reserved
sample_entry_payload += u16.pack(1) # data reference index
if is_audio:
sample_entry_payload += u32.pack(0) * 2 # reserved
sample_entry_payload += u16.pack(params.get('channels', 2))
sample_entry_payload += u16.pack(params.get('bits_per_sample', 16))
sample_entry_payload += u16.pack(0) # pre defined
sample_entry_payload += u16.pack(0) # reserved
sample_entry_payload += u1616.pack(params['sampling_rate'])
if fourcc == 'AACL':
sample_entry_box = box(b'mp4a', sample_entry_payload)
else:
sample_entry_payload = sample_entry_payload
sample_entry_payload += u16.pack(0) # pre defined
sample_entry_payload += u16.pack(0) # reserved
sample_entry_payload += u32.pack(0) * 3 # pre defined
sample_entry_payload += u16.pack(width)
sample_entry_payload += u16.pack(height)
sample_entry_payload += u1616.pack(0x48) # horiz resolution 72 dpi
sample_entry_payload += u1616.pack(0x48) # vert resolution 72 dpi
sample_entry_payload += u32.pack(0) # reserved
sample_entry_payload += u16.pack(1) # frame count
sample_entry_payload += u8.pack(0) * 32 # compressor name
sample_entry_payload += u16.pack(0x18) # depth
sample_entry_payload += s16.pack(-1) # pre defined
codec_private_data = binascii.unhexlify(params['codec_private_data'])
if fourcc in ('H264', 'AVC1'):
sps, pps = codec_private_data.split(u32.pack(1))[1:]
avcc_payload = u8.pack(1) # configuration version
avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication
avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete represenation (1) + reserved (11111) + length size minus one
avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001)
avcc_payload += u16.pack(len(sps))
avcc_payload += sps
avcc_payload += u8.pack(1) # number of pps
avcc_payload += u16.pack(len(pps))
avcc_payload += pps
sample_entry_payload += box(b'avcC', avcc_payload) # AVC Decoder Configuration Record
sample_entry_box = box(b'avc1', sample_entry_payload) # AVC Simple Entry
stsd_payload += sample_entry_box
stbl_payload = full_box(b'stsd', 0, 0, stsd_payload) # Sample Description Box
stts_payload = u32.pack(0) # entry count
stbl_payload += full_box(b'stts', 0, 0, stts_payload) # Decoding Time to Sample Box
stsc_payload = u32.pack(0) # entry count
stbl_payload += full_box(b'stsc', 0, 0, stsc_payload) # Sample To Chunk Box
stco_payload = u32.pack(0) # entry count
stbl_payload += full_box(b'stco', 0, 0, stco_payload) # Chunk Offset Box
minf_payload += box(b'stbl', stbl_payload) # Sample Table Box
mdia_payload += box(b'minf', minf_payload) # Media Information Box
trak_payload += box(b'mdia', mdia_payload) # Media Box
moov_payload += box(b'trak', trak_payload) # Track Box
mehd_payload = u64.pack(duration)
mvex_payload = full_box(b'mehd', 1, 0, mehd_payload) # Movie Extends Header Box
trex_payload = u32.pack(track_id) # track id
trex_payload += u32.pack(1) # default sample description index
trex_payload += u32.pack(0) # default sample duration
trex_payload += u32.pack(0) # default sample size
trex_payload += u32.pack(0) # default sample flags
mvex_payload += full_box(b'trex', 0, 0, trex_payload) # Track Extends Box
moov_payload += box(b'mvex', mvex_payload) # Movie Extends Box
stream.write(box(b'moov', moov_payload)) # Movie Box
def extract_box_data(data, box_sequence):
data_reader = io.BytesIO(data)
while True:
box_size = u32.unpack(data_reader.read(4))[0]
box_type = data_reader.read(4)
if box_type == box_sequence[0]:
box_data = data_reader.read(box_size - 8)
if len(box_sequence) == 1:
return box_data
return extract_box_data(box_data, box_sequence[1:])
data_reader.seek(box_size - 8, 1)
class IsmFD(FragmentFD):
"""
Download segments in a ISM manifest
"""
FD_NAME = 'ism'
def real_download(self, filename, info_dict):
segments = info_dict['fragments'][:1] if self.params.get(
'test', False) else info_dict['fragments']
ctx = {
'filename': filename,
'total_frags': len(segments),
}
self._prepare_and_start_frag_download(ctx)
segments_filenames = []
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
track_written = False
for i, segment in enumerate(segments):
segment_url = segment['url']
segment_name = 'Frag%d' % i
target_filename = '%s-%s' % (ctx['tmpfilename'], segment_name)
count = 0
while count <= fragment_retries:
try:
success = ctx['dl'].download(target_filename, {'url': segment_url})
if not success:
return False
down, target_sanitized = sanitize_open(target_filename, 'rb')
down_data = down.read()
if not track_written:
tfhd_data = extract_box_data(down_data, [b'moof', b'traf', b'tfhd'])
info_dict['_download_params']['track_id'] = u32.unpack(tfhd_data[4:8])[0]
write_piff_header(ctx['dest_stream'], info_dict['_download_params'])
track_written = True
ctx['dest_stream'].write(down_data)
down.close()
segments_filenames.append(target_sanitized)
break
except compat_urllib_error.HTTPError as err:
count += 1
if count <= fragment_retries:
self.report_retry_fragment(err, segment_name, count, fragment_retries)
if count > fragment_retries:
if skip_unavailable_fragments:
self.report_skip_fragment(segment_name)
continue
self.report_error('giving up after %s fragment retries' % fragment_retries)
return False
self._finish_frag_download(ctx)
for segment_file in segments_filenames:
os.remove(encodeFilename(segment_file))
return True

View File

@@ -23,7 +23,7 @@ class AbcNewsVideoIE(AMPIE):
'title': '\'This Week\' Exclusive: Iran\'s Foreign Minister Zarif',
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
@@ -59,7 +59,7 @@ class AbcNewsIE(InfoExtractor):
'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
},

View File

@@ -23,7 +23,7 @@ class ABCOTVSIE(InfoExtractor):
'ext': 'mp4',
'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421123075,
'upload_date': '20150113',
'uploader': 'Jonathan Bloom',

View File

@@ -8,6 +8,7 @@ from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_iso8601,
OnDemandPagedList,
)
@@ -15,18 +16,33 @@ from ..utils import (
class ACastIE(InfoExtractor):
IE_NAME = 'acast'
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
_TEST = {
_TESTS = [{
# test with one bling
'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
'md5': 'ada3de5a1e3a2a381327d749854788bb',
'info_dict': {
'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
'ext': 'mp3',
'title': '"Where Are You?": Taipei 101, Taiwan',
'timestamp': 1196172000000,
'timestamp': 1196172000,
'upload_date': '20071127',
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'duration': 211,
}
}
}, {
# test with multiple blings
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
'md5': '55c0097badd7095f494c99a172f86501',
'info_dict': {
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
'ext': 'mp3',
'title': '2. Raggarmordet - Röster ur det förflutna',
'timestamp': 1477346700,
'upload_date': '20161024',
'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
'duration': 2797,
}
}]
def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups()
@@ -35,11 +51,11 @@ class ACastIE(InfoExtractor):
return {
'id': compat_str(cast_data['id']),
'display_id': display_id,
'url': cast_data['blings'][0]['audio'],
'url': [b['audio'] for b in cast_data['blings'] if b['type'] == 'BlingAudio'][0],
'title': cast_data['name'],
'description': cast_data.get('description'),
'thumbnail': cast_data.get('image'),
'timestamp': int_or_none(cast_data.get('publishingDate')),
'timestamp': parse_iso8601(cast_data.get('publishingDate')),
'duration': int_or_none(cast_data.get('duration')),
}

View File

@@ -26,6 +26,11 @@ MSO_INFO = {
'username_field': 'UserName',
'password_field': 'UserPassword',
},
'Comcast_SSO': {
'name': 'Comcast XFINITY',
'username_field': 'user',
'password_field': 'passwd',
},
'thr030': {
'name': '3 Rivers Communications'
},
@@ -1364,14 +1369,53 @@ class AdobePassIE(InfoExtractor):
'domain_name': 'adobe.com',
'redirect_url': url,
})
provider_login_page_res = post_form(
provider_redirect_page_res, 'Downloading Provider Login Page')
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
mso_info.get('username_field', 'username'): username,
mso_info.get('password_field', 'password'): password,
})
if mso_id != 'Rogers':
post_form(mvpd_confirm_page_res, 'Confirming Login')
if mso_id == 'Comcast_SSO':
# Comcast page flow varies by video site and whether you
# are on Comcast's network.
provider_redirect_page, urlh = provider_redirect_page_res
# Check for Comcast auto login
if 'automatically signing you in' in provider_redirect_page:
oauth_redirect_url = self._html_search_regex(
r'window\.location\s*=\s*[\'"]([^\'"]+)',
provider_redirect_page, 'oauth redirect')
# Just need to process the request. No useful data comes back
self._download_webpage(
oauth_redirect_url, video_id, 'Confirming auto login')
else:
if '<form name="signin"' in provider_redirect_page:
# already have the form, just fill it
provider_login_page_res = provider_redirect_page_res
elif 'http-equiv="refresh"' in provider_redirect_page:
# redirects to the login page
oauth_redirect_url = self._html_search_regex(
r'content="0;\s*url=([^\'"]+)',
provider_redirect_page, 'meta refresh redirect')
provider_login_page_res = self._download_webpage_handle(
oauth_redirect_url,
video_id, 'Downloading Provider Login Page')
else:
provider_login_page_res = post_form(
provider_redirect_page_res, 'Downloading Provider Login Page')
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
mso_info.get('username_field', 'username'): username,
mso_info.get('password_field', 'password'): password,
})
mvpd_confirm_page, urlh = mvpd_confirm_page_res
if '<button class="submit" value="Resume">Resume</button>' in mvpd_confirm_page:
post_form(mvpd_confirm_page_res, 'Confirming Login')
else:
# Normal, non-Comcast flow
provider_login_page_res = post_form(
provider_redirect_page_res, 'Downloading Provider Login Page')
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
mso_info.get('username_field', 'username'): username,
mso_info.get('password_field', 'password'): password,
})
if mso_id != 'Rogers':
post_form(mvpd_confirm_page_res, 'Confirming Login')
session = self._download_webpage(
self._SERVICE_PROVIDER_TEMPLATE % 'session', video_id,

View File

@@ -30,7 +30,7 @@ class AdobeTVIE(AdobeTVBaseIE):
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
'thumbnail': 're:https?://.*\.jpg$',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20110914',
'duration': 60,
'view_count': int,

View File

@@ -96,6 +96,27 @@ class AdultSwimIE(TurnerBaseIE):
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest'],
}, {
'url': 'http://www.adultswim.com/videos/toonami/friday-october-14th-2016/',
'info_dict': {
'id': 'eYiLsKVgQ6qTC6agD67Sig',
'title': 'Toonami - Friday, October 14th, 2016',
'description': 'md5:99892c96ffc85e159a428de85c30acde',
},
'playlist': [{
'md5': '',
'info_dict': {
'id': 'eYiLsKVgQ6qTC6agD67Sig',
'ext': 'mp4',
'title': 'Toonami - Friday, October 14th, 2016',
'description': 'md5:99892c96ffc85e159a428de85c30acde',
},
}],
'params': {
# m3u8 download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest'],
}]
@staticmethod
@@ -163,6 +184,8 @@ class AdultSwimIE(TurnerBaseIE):
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
elif video_info.get('videoPlaybackID'):
segment_ids = [video_info['videoPlaybackID']]
elif video_info.get('id'):
segment_ids = [video_info['id']]
else:
if video_info.get('auth') is True:
raise ExtractorError(

View File

@@ -26,7 +26,7 @@ class AENetworksIE(AENetworksBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)/full-movie)'
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': '8ff93eb073449f151d6b90c0ae1ef0c7',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': {
'id': '22253814',
'ext': 'mp4',
@@ -99,7 +99,7 @@ class AENetworksIE(AENetworksBaseIE):
query = {
'mbr': 'true',
'assetTypes': 'medium_video_s3'
'assetTypes': 'high_video_s3'
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
@@ -155,7 +155,7 @@ class HistoryTopicIE(AENetworksBaseIE):
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 24,
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
@@ -193,7 +193,8 @@ class HistoryTopicIE(AENetworksBaseIE):
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
'switch': 'hls'
'switch': 'hls',
'assetTypes': 'high_video_ak',
})
else:
webpage = self._download_webpage(url, topic_id)
@@ -203,6 +204,7 @@ class HistoryTopicIE(AENetworksBaseIE):
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls'
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))

View File

@@ -11,6 +11,7 @@ from ..compat import (
from ..utils import (
ExtractorError,
int_or_none,
update_url_query,
xpath_element,
xpath_text,
)
@@ -18,12 +19,18 @@ from ..utils import (
class AfreecaTVIE(InfoExtractor):
IE_DESC = 'afreecatv.com'
_VALID_URL = r'''(?x)^
https?://(?:(live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
(?:
/app/(?:index|read_ucc_bbs)\.cgi|
/player/[Pp]layer\.(?:swf|html))
\?.*?\bnTitleNo=(?P<id>\d+)'''
_VALID_URL = r'''(?x)
https?://
(?:
(?:(?:live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
(?:
/app/(?:index|read_ucc_bbs)\.cgi|
/player/[Pp]layer\.(?:swf|html)
)\?.*?\bnTitleNo=|
vod\.afreecatv\.com/PLAYER/STATION/
)
(?P<id>\d+)
'''
_TESTS = [{
'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=',
'md5': 'f72c89fe7ecc14c1b5ce506c4996046e',
@@ -66,6 +73,9 @@ class AfreecaTVIE(InfoExtractor):
}, {
'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
'only_matching': True,
}, {
'url': 'http://vod.afreecatv.com/PLAYER/STATION/15055030',
'only_matching': True,
}]
@staticmethod
@@ -83,7 +93,9 @@ class AfreecaTVIE(InfoExtractor):
info_url = compat_urlparse.urlunparse(parsed_url._replace(
netloc='afbbs.afreecatv.com:8080',
path='/api/video/get_video_info.php'))
video_xml = self._download_xml(info_url, video_id)
video_xml = self._download_xml(
update_url_query(info_url, {'nTitleNo': video_id}), video_id)
if xpath_element(video_xml, './track/video/file') is None:
raise ExtractorError('Specified AfreecaTV video does not exist',

View File

@@ -20,7 +20,7 @@ class AirMozillaIE(InfoExtractor):
'id': '6x4q2w',
'ext': 'mp4',
'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
'thumbnail': 're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
'thumbnail': r're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
'timestamp': 1422487800,
'upload_date': '20150128',

View File

@@ -21,7 +21,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Astérix - Le Domaine des Dieux Teaser VF',
'description': 'md5:4a754271d9c6f16c72629a8a993ee884',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19540403&cfilm=222257.html',
@@ -32,7 +32,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Planes 2 Bande-annonce VF',
'description': 'Regardez la bande annonce du film Planes 2 (Planes 2 Bande-annonce VF). Planes 2, un film de Roberts Gannaway',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19544709&cfilm=181290.html',
@@ -43,7 +43,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Dragons 2 - Bande annonce finale VF',
'description': 'md5:6cdd2d7c2687d4c6aafe80a35e17267a',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/video-19550147/',
@@ -53,7 +53,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Faux Raccord N°123 - Les gaffes de Cliffhanger',
'description': 'md5:bc734b83ffa2d8a12188d9eb48bb6354',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}]

View File

@@ -19,7 +19,7 @@ class AlphaPornoIE(InfoExtractor):
'display_id': 'sensual-striptease-porn-with-samantha-alexandra',
'ext': 'mp4',
'title': 'Sensual striptease porn with Samantha Alexandra',
'thumbnail': 're:https?://.*\.jpg$',
'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1418694611,
'upload_date': '20141216',
'duration': 387,

View File

@@ -10,7 +10,7 @@ from ..utils import (
class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?season-\d+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '',
@@ -41,6 +41,9 @@ class AMCNetworksIE(ThePlatformIE):
}, {
'url': 'http://www.ifc.com/movies/chaos',
'only_matching': True,
}, {
'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -157,22 +157,16 @@ class AnvatoIE(InfoExtractor):
video_data_url, video_id, transform_source=strip_jsonp,
data=json.dumps(payload).encode('utf-8'))
def _extract_anvato_videos(self, webpage, video_id):
anvplayer_data = self._parse_json(self._html_search_regex(
r'<script[^>]+data-anvp=\'([^\']+)\'', webpage,
'Anvato player data'), video_id)
video_id = anvplayer_data['video']
access_key = anvplayer_data['accessKey']
def _get_anvato_videos(self, access_key, video_id):
video_data = self._get_video_json(access_key, video_id)
formats = []
for published_url in video_data['published_urls']:
video_url = published_url['embed_url']
media_format = published_url.get('format')
ext = determine_ext(video_url)
if ext == 'smil':
if ext == 'smil' or media_format == 'smil':
formats.extend(self._extract_smil_formats(video_url, video_id))
continue
@@ -183,7 +177,7 @@ class AnvatoIE(InfoExtractor):
'tbr': tbr if tbr != 0 else None,
}
if ext == 'm3u8':
if ext == 'm3u8' or media_format in ('m3u8', 'm3u8-variant'):
# Not using _extract_m3u8_formats here as individual media
# playlists are also included in published_urls.
if tbr is None:
@@ -194,7 +188,7 @@ class AnvatoIE(InfoExtractor):
'format_id': '-'.join(filter(None, ['hls', compat_str(tbr)])),
'ext': 'mp4',
})
elif ext == 'mp3':
elif ext == 'mp3' or media_format == 'mp3':
a_format['vcodec'] = 'none'
else:
a_format.update({
@@ -218,7 +212,19 @@ class AnvatoIE(InfoExtractor):
'formats': formats,
'title': video_data.get('def_title'),
'description': video_data.get('def_description'),
'tags': video_data.get('def_tags', '').split(','),
'categories': video_data.get('categories'),
'thumbnail': video_data.get('thumbnail'),
'timestamp': int_or_none(video_data.get(
'ts_published') or video_data.get('ts_added')),
'uploader': video_data.get('mcp_id'),
'duration': int_or_none(video_data.get('duration')),
'subtitles': subtitles,
}
def _extract_anvato_videos(self, webpage, video_id):
anvplayer_data = self._parse_json(self._html_search_regex(
r'<script[^>]+data-anvp=\'([^\']+)\'', webpage,
'Anvato player data'), video_id)
return self._get_anvato_videos(
anvplayer_data['accessKey'], anvplayer_data['video'])

View File

@@ -12,7 +12,7 @@ from ..utils import (
class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com'
_VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_TESTS = [{
# video with 5min ID
@@ -33,7 +33,7 @@ class AolIE(InfoExtractor):
}
}, {
# video with vidible ID
'url': 'http://on.aol.com/video/netflix-is-raising-rates-5707d6b8e4b090497b04f706?context=PC:homepage:PL1944:1460189336183',
'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': {
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
@@ -108,30 +108,3 @@ class AolIE(InfoExtractor):
'uploader': video_data.get('videoOwner'),
'formats': formats,
}
class AolFeaturesIE(InfoExtractor):
IE_NAME = 'features.aol.com'
_VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
'md5': '7db483bb0c09c85e241f84a34238cc75',
'info_dict': {
'id': '519507715',
'ext': 'mp4',
'title': 'What To Watch - February 17, 2016',
},
'add_ie': ['FiveMin'],
'params': {
# encrypted m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
return self.url_result(self._search_regex(
r'<script type="text/javascript" src="(https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js[^"]+)"',
webpage, '5min embed url'), 'FiveMin')

View File

@@ -174,11 +174,15 @@ class ARDMediathekIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
if '>Der gewünschte Beitrag ist nicht mehr verfügbar.<' in webpage:
raise ExtractorError('Video %s is no longer available' % video_id, expected=True)
ERRORS = (
('>Leider liegt eine Störung vor.', 'Video %s is unavailable'),
('>Der gewünschte Beitrag ist nicht mehr verfügbar.<',
'Video %s is no longer available'),
)
if 'Diese Sendung ist für Jugendliche unter 12 Jahren nicht geeignet. Der Clip ist deshalb nur von 20 bis 6 Uhr verfügbar.' in webpage:
raise ExtractorError('This program is only suitable for those aged 12 and older. Video %s is therefore only available between 20 pm and 6 am.' % video_id, expected=True)
for pattern, message in ERRORS:
if pattern in webpage:
raise ExtractorError(message % video_id, expected=True)
if re.search(r'[\?&]rss($|[=&])', url):
doc = compat_etree_fromstring(webpage.encode('utf-8'))
@@ -249,7 +253,7 @@ class ARDIE(InfoExtractor):
'duration': 2600,
'title': 'Die Story im Ersten: Mission unter falscher Flagge',
'upload_date': '20140804',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'HTTP Error 404: Not Found',
}

View File

@@ -4,8 +4,10 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
determine_ext,
ExtractorError,
float_or_none,
int_or_none,
mimetype2ext,
@@ -15,7 +17,13 @@ from ..utils import (
class ArkenaIE(InfoExtractor):
_VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
_VALID_URL = r'''(?x)
https?://
(?:
video\.arkena\.com/play2/embed/player\?|
play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
)
'''
_TESTS = [{
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
@@ -37,6 +45,9 @@ class ArkenaIE(InfoExtractor):
}, {
'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
'only_matching': True,
}, {
'url': 'http://video.arkena.com/play2/embed/player?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled',
'only_matching': True,
}]
@staticmethod
@@ -53,6 +64,14 @@ class ArkenaIE(InfoExtractor):
video_id = mobj.group('id')
account_id = mobj.group('account_id')
# Handle http://video.arkena.com/play2/embed/player URL
if not video_id:
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
video_id = qs.get('mediaId', [None])[0]
account_id = qs.get('accountId', [None])[0]
if not video_id or not account_id:
raise ExtractorError('Invalid URL', expected=True)
playlist = self._download_json(
'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
% (video_id, account_id),

View File

@@ -30,7 +30,7 @@ class AtresPlayerIE(InfoExtractor):
'title': 'Especial Solidario de Nochebuena',
'description': 'md5:e2d52ff12214fa937107d21064075bf1',
'duration': 5527.6,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'This video is only available for registered users'
},
@@ -43,7 +43,7 @@ class AtresPlayerIE(InfoExtractor):
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
},
{

View File

@@ -14,7 +14,7 @@ class ATTTechChannelIE(InfoExtractor):
'ext': 'flv',
'title': 'AT&T Archives : The UNIX System: Making Computers Easier to Use',
'description': 'A 1982 film about UNIX is the foundation for software in use around Bell Labs and AT&T.',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140127',
},
'params': {

View File

@@ -6,8 +6,8 @@ from ..utils import float_or_none
class AudioBoomIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/boos/(?P<id>[0-9]+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0',
'md5': '63a8d73a055c6ed0f1e51921a10a5a76',
'info_dict': {
@@ -17,9 +17,12 @@ class AudioBoomIE(InfoExtractor):
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 2245.72,
'uploader': 'Steve Czaban',
'uploader_url': 're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
}
}
}, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -11,7 +11,7 @@ from ..utils import (
class AzubuIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/[^/]+#!/play/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/[^/]+#!/play/(?P<id>\d+)'
_TESTS = [
{
'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
@@ -21,7 +21,7 @@ class AzubuIE(InfoExtractor):
'ext': 'mp4',
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
'thumbnail': 're:^https?://.*\.jpe?g',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1417523507.334,
'upload_date': '20141202',
'duration': 9988.7,
@@ -38,7 +38,7 @@ class AzubuIE(InfoExtractor):
'ext': 'mp4',
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
'thumbnail': 're:^https?://.*\.jpe?g',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1410530893.320,
'upload_date': '20140912',
'duration': 172.385,
@@ -103,12 +103,15 @@ class AzubuIE(InfoExtractor):
class AzubuLiveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/(?P<id>[^/]+)$'
_VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/(?P<id>[^/]+)$'
_TEST = {
_TESTS = [{
'url': 'http://www.azubu.tv/MarsTVMDLen',
'only_matching': True,
}
}, {
'url': 'http://azubu.uol.com.br/adolfz',
'only_matching': True,
}]
def _real_extract(self, url):
user = self._match_id(url)

View File

@@ -1,7 +1,9 @@
from __future__ import unicode_literals
import json
import random
import re
import time
from .common import InfoExtractor
from ..compat import (
@@ -12,6 +14,9 @@ from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
parse_filesize,
unescapeHTML,
update_url_query,
)
@@ -81,35 +86,68 @@ class BandcampIE(InfoExtractor):
r'(?ms)var TralbumData = .*?[{,]\s*id: (?P<id>\d+),?$',
webpage, 'video id')
download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')
# We get the dictionary of the track from some javascript code
all_info = self._parse_json(self._search_regex(
r'(?sm)items: (.*?),$', download_webpage, 'items'), video_id)
info = all_info[0]
# We pick mp3-320 for now, until format selection can be easily implemented.
mp3_info = info['downloads']['mp3-320']
# If we try to use this url it says the link has expired
initial_url = mp3_info['url']
m_url = re.match(
r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$',
initial_url)
# We build the url we will use to get the final track url
# This url is build in Bandcamp in the script download_bunde_*.js
request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), video_id, m_url.group('ts'))
final_url_webpage = self._download_webpage(request_url, video_id, 'Requesting download url')
# If we could correctly generate the .rand field the url would be
# in the "download_url" key
final_url = self._proto_relative_url(self._search_regex(
r'"retry_url":"(.+?)"', final_url_webpage, 'final video URL'), 'http:')
download_webpage = self._download_webpage(
download_link, video_id, 'Downloading free downloads page')
blob = self._parse_json(
self._search_regex(
r'data-blob=(["\'])(?P<blob>{.+?})\1', download_webpage,
'blob', group='blob'),
video_id, transform_source=unescapeHTML)
info = blob['digital_items'][0]
downloads = info['downloads']
track = info['title']
artist = info.get('artist')
title = '%s - %s' % (artist, track) if artist else track
download_formats = {}
for f in blob['download_formats']:
name, ext = f.get('name'), f.get('file_extension')
if all(isinstance(x, compat_str) for x in (name, ext)):
download_formats[name] = ext.strip('.')
formats = []
for format_id, f in downloads.items():
format_url = f.get('url')
if not format_url:
continue
# Stat URL generation algorithm is reverse engineered from
# download_*_bundle_*.js
stat_url = update_url_query(
format_url.replace('/download/', '/statdownload/'), {
'.rand': int(time.time() * 1000 * random.random()),
})
format_id = f.get('encoding_name') or format_id
stat = self._download_json(
stat_url, video_id, 'Downloading %s JSON' % format_id,
transform_source=lambda s: s[s.index('{'):s.rindex('}') + 1],
fatal=False)
if not stat:
continue
retry_url = stat.get('retry_url')
if not isinstance(retry_url, compat_str):
continue
formats.append({
'url': self._proto_relative_url(retry_url, 'http:'),
'ext': download_formats.get(format_id),
'format_id': format_id,
'format_note': f.get('description'),
'filesize': parse_filesize(f.get('size_mb')),
'vcodec': 'none',
})
self._sort_formats(formats)
return {
'id': video_id,
'title': info['title'],
'ext': 'mp3',
'vcodec': 'none',
'url': final_url,
'title': title,
'thumbnail': info.get('thumb_url'),
'uploader': info.get('artist'),
'artist': artist,
'track': track,
'formats': formats,
}

View File

@@ -46,19 +46,19 @@ class BeegIE(InfoExtractor):
self._proto_relative_url(cpl_url), video_id,
'Downloading cpl JS', fatal=False)
if cpl:
beeg_version = self._search_regex(
r'beeg_version\s*=\s*(\d+)', cpl,
'beeg version', default=None) or self._search_regex(
beeg_version = int_or_none(self._search_regex(
r'beeg_version\s*=\s*([^\b]+)', cpl,
'beeg version', default=None)) or self._search_regex(
r'/(\d+)\.js', cpl_url, 'beeg version', default=None)
beeg_salt = self._search_regex(
r'beeg_salt\s*=\s*(["\'])(?P<beeg_salt>.+?)\1', cpl, 'beeg beeg_salt',
r'beeg_salt\s*=\s*(["\'])(?P<beeg_salt>.+?)\1', cpl, 'beeg salt',
default=None, group='beeg_salt')
beeg_version = beeg_version or '1750'
beeg_salt = beeg_salt or 'MIDtGaw96f0N1kMMAM1DE46EC9pmFr'
beeg_version = beeg_version or '2000'
beeg_salt = beeg_salt or 'pmweAkq8lAYKdfWcFCUj0yoVgoPlinamH5UE1CB3H'
video = self._download_json(
'http://api.beeg.com/api/v6/%s/video/%s' % (beeg_version, video_id),
'https://api.beeg.com/api/v6/%s/video/%s' % (beeg_version, video_id),
video_id)
def split(o, e):

View File

@@ -17,7 +17,7 @@ class BetIE(MTVServicesInfoExtractor):
'description': 'President Obama urges persistence in confronting racism and bias.',
'duration': 1534,
'upload_date': '20141208',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
@@ -37,7 +37,7 @@ class BetIE(MTVServicesInfoExtractor):
'description': 'A BET News special.',
'duration': 1696,
'upload_date': '20141125',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}

View File

@@ -19,7 +19,7 @@ class BildIE(InfoExtractor):
'ext': 'mp4',
'title': 'Das können die neuen iPads',
'description': 'md5:a4058c4fa2a804ab59c00d7244bbf62f',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 196,
}
}

View File

@@ -28,7 +28,7 @@ class BiliBiliIE(InfoExtractor):
'duration': 308.315,
'timestamp': 1398012660,
'upload_date': '20140420',
'thumbnail': 're:^https?://.+\.jpg',
'thumbnail': r're:^https?://.+\.jpg',
'uploader': '菊子桑',
'uploader_id': '156160',
},

View File

@@ -19,7 +19,7 @@ class BioBioChileTVIE(InfoExtractor):
'id': 'sobre-camaras-y-camarillas-parlamentarias',
'ext': 'mp4',
'title': 'Sobre Cámaras y camarillas parlamentarias',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Fernando Atria',
},
'skip': 'URL expired and redirected to http://www.biobiochile.cl/portada/bbtv/index.html',
@@ -31,7 +31,7 @@ class BioBioChileTVIE(InfoExtractor):
'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
'ext': 'mp4',
'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Piangella Obrador',
},
'params': {

View File

@@ -45,7 +45,8 @@ class BloombergIE(InfoExtractor):
name = self._match_id(url)
webpage = self._download_webpage(url, name)
video_id = self._search_regex(
r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
(r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
r'videoId\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
webpage, 'id', group='url', default=None)
if not video_id:
bplayer_data = self._parse_json(self._search_regex(

View File

@@ -1,9 +1,9 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_age_limit,
@@ -11,7 +11,7 @@ from ..utils import (
class BreakIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?(?P<site>break|screenjunkies)\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': {
@@ -20,45 +20,124 @@ class BreakIE(InfoExtractor):
'title': 'When Girls Act Like D-Bags',
'age_limit': 13,
}
}, {
'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
'md5': '5c2b686bec3d43de42bde9ec047536b0',
'info_dict': {
'id': '2841915',
'display_id': 'best-quentin-tarantino-movie',
'ext': 'mp4',
'title': 'Best Quentin Tarantino Movie',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3671,
'age_limit': 13,
'tags': list,
},
}, {
'url': 'http://www.screenjunkies.com/video/honest-trailers-the-dark-knight',
'info_dict': {
'id': '2348808',
'display_id': 'honest-trailers-the-dark-knight',
'ext': 'mp4',
'title': 'Honest Trailers - The Dark Knight',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'age_limit': 10,
'tags': list,
},
}, {
# requires subscription but worked around
'url': 'http://www.screenjunkies.com/video/knocking-dead-ep-1-the-show-so-far-3003285',
'info_dict': {
'id': '3003285',
'display_id': 'knocking-dead-ep-1-the-show-so-far',
'ext': 'mp4',
'title': 'State of The Dead Recap: Knocking Dead Pilot',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3307,
'age_limit': 13,
'tags': list,
},
}, {
'url': 'http://www.break.com/video/ugc/baby-flex-2773063',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://www.break.com/embed/%s' % video_id, video_id)
info = json.loads(self._search_regex(
r'var embedVars = ({.*})\s*?</script>',
webpage, 'info json', flags=re.DOTALL))
_DEFAULT_BITRATES = (48, 150, 320, 496, 864, 2240, 3264)
youtube_id = info.get('youtubeId')
def _real_extract(self, url):
site, display_id, video_id = re.match(self._VALID_URL, url).groups()
if not video_id:
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
(r'src=["\']/embed/(\d+)', r'data-video-content-id=["\'](\d+)'),
webpage, 'video id')
webpage = self._download_webpage(
'http://www.%s.com/embed/%s' % (site, video_id),
display_id, 'Downloading video embed page')
embed_vars = self._parse_json(
self._search_regex(
r'(?s)embedVars\s*=\s*({.+?})\s*</script>', webpage, 'embed vars'),
display_id)
youtube_id = embed_vars.get('youtubeId')
if youtube_id:
return self.url_result(youtube_id, 'Youtube')
formats = [{
'url': media['uri'] + '?' + info['AuthToken'],
'tbr': media['bitRate'],
'width': media['width'],
'height': media['height'],
} for media in info['media'] if media.get('mediaPurpose') == 'play']
title = embed_vars['contentName']
if not formats:
formats = []
bitrates = []
for f in embed_vars.get('media', []):
if not f.get('uri') or f.get('mediaPurpose') != 'play':
continue
bitrate = int_or_none(f.get('bitRate'))
if bitrate:
bitrates.append(bitrate)
formats.append({
'url': info['videoUri']
'url': f['uri'],
'format_id': 'http-%d' % bitrate if bitrate else 'http',
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': bitrate,
'format': 'mp4',
})
self._sort_formats(formats)
if not bitrates:
# When subscriptionLevel > 0, i.e. plus subscription is required
# media list will be empty. However, hds and hls uris are still
# available. We can grab them assuming bitrates to be default.
bitrates = self._DEFAULT_BITRATES
duration = int_or_none(info.get('videoLengthInSeconds'))
age_limit = parse_age_limit(info.get('audienceRating'))
auth_token = embed_vars.get('AuthToken')
def construct_manifest_url(base_url, ext):
pieces = [base_url]
pieces.extend([compat_str(b) for b in bitrates])
pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
return ','.join(pieces)
if bitrates and auth_token:
hds_url = embed_vars.get('hdsUri')
if hds_url:
formats.extend(self._extract_f4m_formats(
construct_manifest_url(hds_url, 'f4m'),
display_id, f4m_id='hds', fatal=False))
hls_url = embed_vars.get('hlsUri')
if hls_url:
formats.extend(self._extract_m3u8_formats(
construct_manifest_url(hls_url, 'm3u8'),
display_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
self._sort_formats(formats)
return {
'id': video_id,
'title': info['contentName'],
'thumbnail': info['thumbUri'],
'duration': duration,
'age_limit': age_limit,
'display_id': display_id,
'title': title,
'thumbnail': embed_vars.get('thumbUri'),
'duration': int_or_none(embed_vars.get('videoLengthInSeconds')) or None,
'age_limit': parse_age_limit(embed_vars.get('audienceRating')),
'tags': embed_vars.get('tags', '').split(','),
'formats': formats,
}

View File

@@ -232,13 +232,16 @@ class BrightcoveLegacyIE(InfoExtractor):
"""Return a list of all Brightcove URLs from the webpage """
url_m = re.search(
r'<meta\s+property=[\'"]og:video[\'"]\s+content=[\'"](https?://(?:secure|c)\.brightcove.com/[^\'"]+)[\'"]',
webpage)
r'''(?x)
<meta\s+
(?:property|itemprop)=([\'"])(?:og:video|embedURL)\1[^>]+
content=([\'"])(?P<url>https?://(?:secure|c)\.brightcove.com/(?:(?!\2).)+)\2
''', webpage)
if url_m:
url = unescapeHTML(url_m.group(1))
url = unescapeHTML(url_m.group('url'))
# Some sites don't add it, we can't download with this url, for example:
# http://www.ktvu.com/videos/news/raw-video-caltrain-releases-video-of-man-almost/vCTZdY/
if 'playerKey' in url or 'videoId' in url:
if 'playerKey' in url or 'videoId' in url or 'idVideo' in url:
return [url]
matches = re.findall(
@@ -259,7 +262,7 @@ class BrightcoveLegacyIE(InfoExtractor):
url, smuggled_data = unsmuggle_url(url, {})
# Change the 'videoId' and others field to '@videoPlayer'
url = re.sub(r'(?<=[?&])(videoI(d|D)|bctid)', '%40videoPlayer', url)
url = re.sub(r'(?<=[?&])(videoI(d|D)|idVideo|bctid)', '%40videoPlayer', url)
# Change bckey (used by bcove.me urls) to playerKey
url = re.sub(r'(?<=[?&])bckey', 'playerKey', url)
mobj = re.match(self._VALID_URL, url)
@@ -548,7 +551,7 @@ class BrightcoveNewIE(InfoExtractor):
container = source.get('container')
ext = mimetype2ext(source.get('type'))
src = source.get('src')
if ext == 'ism':
if ext == 'ism' or container == 'WVM':
continue
elif ext == 'm3u8' or container == 'M2TS':
if not src:

View File

@@ -16,7 +16,7 @@ class BYUtvIE(InfoExtractor):
'ext': 'mp4',
'title': 'Season 5 Episode 5',
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1486.486,
},
'params': {

View File

@@ -26,7 +26,7 @@ class CamdemyIE(InfoExtractor):
'id': '5181',
'ext': 'mp4',
'title': 'Ch1-1 Introduction, Signals (02-23-2012)',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'creator': 'ss11spring',
'duration': 1591,
'upload_date': '20130114',
@@ -41,7 +41,7 @@ class CamdemyIE(InfoExtractor):
'id': '13885',
'ext': 'mp4',
'title': 'EverCam + Camdemy QuickStart',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:2a9f989c2b153a2342acee579c6e7db6',
'creator': 'evercam',
'duration': 318,

View File

@@ -105,7 +105,8 @@ class CanalplusIE(InfoExtractor):
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
[r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
r'id=["\']canal_video_player(?P<id>\d+)'],
r'id=["\']canal_video_player(?P<id>\d+)',
r'data-video=["\'](?P<id>\d+)'],
webpage, 'video id', group='id')
info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)

View File

@@ -17,7 +17,7 @@ class CanvasIE(InfoExtractor):
'ext': 'mp4',
'title': 'De afspraak veilt voor de Warmste Week',
'description': 'md5:24cb860c320dc2be7358e0e5aa317ba6',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 49.02,
}
}, {
@@ -29,7 +29,7 @@ class CanvasIE(InfoExtractor):
'ext': 'mp4',
'title': 'Pieter 0167',
'description': 'md5:943cd30f48a5d29ba02c3a104dc4ec4e',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2553.08,
'subtitles': {
'nl': [{
@@ -48,7 +48,7 @@ class CanvasIE(InfoExtractor):
'ext': 'mp4',
'title': 'Herbekijk Sorry voor alles',
'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 3788.06,
},
'params': {
@@ -89,6 +89,9 @@ class CanvasIE(InfoExtractor):
elif format_type == 'HDS':
formats.extend(self._extract_f4m_formats(
format_url, display_id, f4m_id=format_type, fatal=False))
elif format_type == 'MPEG_DASH':
formats.extend(self._extract_mpd_formats(
format_url, display_id, mpd_id=format_type, fatal=False))
else:
formats.append({
'format_id': format_type,

View File

@@ -21,7 +21,7 @@ class CarambaTVIE(InfoExtractor):
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 2678.31,
},
}, {
@@ -69,7 +69,7 @@ class CarambaTVPageIE(InfoExtractor):
'id': '475222',
'ext': 'flv',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
# duration reported by videomore is incorrect
'duration': int,
},

View File

@@ -283,11 +283,6 @@ class CBCWatchVideoIE(CBCWatchBaseIE):
formats = self._extract_m3u8_formats(re.sub(r'/([^/]+)/[^/?]+\.m3u8', r'/\1/\1.m3u8', m3u8_url), video_id, 'mp4', fatal=False)
if len(formats) < 2:
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
# Despite metadata in m3u8 all video+audio formats are
# actually video-only (no audio)
for f in formats:
if f.get('acodec') != 'none' and f.get('vcodec') != 'none':
f['acodec'] = 'none'
self._sort_formats(formats)
info = {

View File

@@ -4,11 +4,14 @@ from __future__ import unicode_literals
from .anvato import AnvatoIE
from .sendtonews import SendtoNewsIE
from ..compat import compat_urlparse
from ..utils import unified_timestamp
from ..utils import (
parse_iso8601,
unified_timestamp,
)
class CBSLocalIE(AnvatoIE):
_VALID_URL = r'https?://[a-z]+\.cbslocal\.com/\d+/\d+/\d+/(?P<id>[0-9a-z-]+)'
_VALID_URL = r'https?://[a-z]+\.cbslocal\.com/(?:\d+/\d+/\d+|video)/(?P<id>[0-9a-z-]+)'
_TESTS = [{
# Anvato backend
@@ -22,6 +25,7 @@ class CBSLocalIE(AnvatoIE):
'thumbnail': 're:^https?://.*',
'timestamp': 1463440500,
'upload_date': '20160516',
'uploader': 'CBS',
'subtitles': {
'en': 'mincount:5',
},
@@ -35,6 +39,7 @@ class CBSLocalIE(AnvatoIE):
'Syndication\\Curb.tv',
'Content\\News'
],
'tags': ['CBS 2 News Evening'],
},
}, {
# SendtoNews embed
@@ -47,6 +52,31 @@ class CBSLocalIE(AnvatoIE):
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://newyork.cbslocal.com/video/3580809-a-very-blue-anniversary/',
'info_dict': {
'id': '3580809',
'ext': 'mp4',
'title': 'A Very Blue Anniversary',
'description': 'CBS2s Cindy Hsu has more.',
'thumbnail': 're:^https?://.*',
'timestamp': 1479962220,
'upload_date': '20161124',
'uploader': 'CBS',
'subtitles': {
'en': 'mincount:5',
},
'categories': [
'Stations\\Spoken Word\\WCBSTV',
'Syndication\\AOL',
'Syndication\\MSN',
'Syndication\\NDN',
'Syndication\\Yahoo',
'Content\\News',
'Content\\News\\Local News',
],
'tags': ['CBS 2 News Weekends', 'Cindy Hsu', 'Blue Man Group'],
},
}]
def _real_extract(self, url):
@@ -62,8 +92,11 @@ class CBSLocalIE(AnvatoIE):
info_dict = self._extract_anvato_videos(webpage, display_id)
time_str = self._html_search_regex(
r'class="entry-date">([^<]+)<', webpage, 'released date', fatal=False)
timestamp = unified_timestamp(time_str)
r'class="entry-date">([^<]+)<', webpage, 'released date', default=None)
if time_str:
timestamp = unified_timestamp(time_str)
else:
timestamp = parse_iso8601(self._html_search_meta('uploadDate', webpage))
info_dict.update({
'display_id': display_id,

View File

@@ -39,7 +39,7 @@ class CBSNewsIE(CBSIE):
'upload_date': '20140404',
'timestamp': 1396650660,
'uploader': 'CBSI-NEW',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 205,
'subtitles': {
'en': [{

View File

@@ -19,7 +19,7 @@ class CCCIE(InfoExtractor):
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
@@ -32,7 +32,7 @@ class CCCIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
event_id = self._search_regex("data-id='(\d+)'", webpage, 'event id')
event_id = self._search_regex(r"data-id='(\d+)'", webpage, 'event id')
event_data = self._download_json('https://media.ccc.de/public/events/%s' % event_id, event_id)
formats = []

View File

@@ -0,0 +1,99 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
clean_html,
)
class CCMAIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ccma\.cat/(?:[^/]+/)*?(?P<type>video|audio)/(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.ccma.cat/tv3/alacarta/lespot-de-la-marato-de-tv3/lespot-de-la-marato-de-tv3/video/5630208/',
'md5': '7296ca43977c8ea4469e719c609b0871',
'info_dict': {
'id': '5630208',
'ext': 'mp4',
'title': 'L\'espot de La Marató de TV3',
'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
'timestamp': 1470918540,
'upload_date': '20160811',
}
}, {
'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
'md5': 'fa3e38f269329a278271276330261425',
'info_dict': {
'id': '943685',
'ext': 'mp3',
'title': 'El Consell de Savis analitza el derbi',
'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
'upload_date': '20171205',
'timestamp': 1512507300,
}
}]
def _real_extract(self, url):
media_type, media_id = re.match(self._VALID_URL, url).groups()
media_data = {}
formats = []
profiles = ['pc'] if media_type == 'audio' else ['mobil', 'pc']
for i, profile in enumerate(profiles):
md = self._download_json('http://dinamics.ccma.cat/pvideo/media.jsp', media_id, query={
'media': media_type,
'idint': media_id,
'profile': profile,
}, fatal=False)
if md:
media_data = md
media_url = media_data.get('media', {}).get('url')
if media_url:
formats.append({
'format_id': profile,
'url': media_url,
'quality': i,
})
self._sort_formats(formats)
informacio = media_data['informacio']
title = informacio['titol']
durada = informacio.get('durada', {})
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
subtitles = {}
subtitols = media_data.get('subtitols', {})
if subtitols:
sub_url = subtitols.get('url')
if sub_url:
subtitles.setdefault(
subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({
'url': sub_url,
})
thumbnails = []
imatges = media_data.get('imatges', {})
if imatges:
thumbnail_url = imatges.get('url')
if thumbnail_url:
thumbnails = [{
'url': thumbnail_url,
'width': int_or_none(imatges.get('amplada')),
'height': int_or_none(imatges.get('alcada')),
}]
return {
'id': media_id,
'title': title,
'description': clean_html(informacio.get('descripcio')),
'duration': duration,
'timestamp': timestamp,
'thumnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
}

View File

@@ -4,50 +4,188 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import float_or_none
from ..compat import compat_str
from ..utils import (
float_or_none,
try_get,
unified_timestamp,
)
class CCTVIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:.+?\.)?
(?:
cctv\.(?:com|cn)|
cntv\.cn
)/
(?:
video/[^/]+/(?P<id>[0-9a-f]{32})|
\d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
)'''
IE_DESC = '央视网'
_VALID_URL = r'https?://(?:(?:[^/]+)\.(?:cntv|cctv)\.(?:com|cn)|(?:www\.)?ncpa-classic\.com)/(?:[^/]+/)*?(?P<id>[^/?#&]+?)(?:/index)?(?:\.s?html|[?#&]|$)'
_TESTS = [{
'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
'md5': '819c7b49fc3927d529fb4cd555621823',
# fo.addVariable("videoCenterId","id")
'url': 'http://sports.cntv.cn/2016/02/12/ARTIaBRxv4rTT1yWf1frW2wi160212.shtml',
'md5': 'd61ec00a493e09da810bf406a078f691',
'info_dict': {
'id': '454368eb19ad44a1925bf1eb96140a61',
'id': '5ecdbeab623f4973b40ff25f18b174e8',
'ext': 'mp4',
'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
}
'title': '[NBA]二少联手砍下46分 雷霆主场击败鹈鹕(快讯)',
'description': 'md5:7e14a5328dc5eb3d1cd6afbbe0574e95',
'duration': 98,
'uploader': 'songjunjie',
'timestamp': 1455279956,
'upload_date': '20160212',
},
}, {
# var guid = "id"
'url': 'http://tv.cctv.com/2016/02/05/VIDEUS7apq3lKrHG9Dncm03B160205.shtml',
'info_dict': {
'id': 'efc5d49e5b3b4ab2b34f3a502b73d3ae',
'ext': 'mp4',
'title': '[赛车]“车王”舒马赫恢复情况成谜(快讯)',
'description': '2月4日蒙特泽莫罗透露了关于“车王”舒马赫恢复情况但情况是否属实遭到了质疑。',
'duration': 37,
'uploader': 'shujun',
'timestamp': 1454677291,
'upload_date': '20160205',
},
'params': {
'skip_download': True,
},
}, {
# changePlayer('id')
'url': 'http://english.cntv.cn/special/four_comprehensives/index.shtml',
'info_dict': {
'id': '4bb9bb4db7a6471ba85fdeda5af0381e',
'ext': 'mp4',
'title': 'NHnews008 ANNUAL POLITICAL SEASON',
'description': 'Four Comprehensives',
'duration': 60,
'uploader': 'zhangyunlei',
'timestamp': 1425385521,
'upload_date': '20150303',
},
'params': {
'skip_download': True,
},
}, {
# loadvideo('id')
'url': 'http://cctv.cntv.cn/lm/tvseries_russian/yilugesanghua/index.shtml',
'info_dict': {
'id': 'b15f009ff45c43968b9af583fc2e04b2',
'ext': 'mp4',
'title': 'Путь,усыпанный космеями Серия 1',
'description': 'Путь, усыпанный космеями',
'duration': 2645,
'uploader': 'renxue',
'timestamp': 1477479241,
'upload_date': '20161026',
},
'params': {
'skip_download': True,
},
}, {
# var initMyAray = 'id'
'url': 'http://www.ncpa-classic.com/2013/05/22/VIDE1369219508996867.shtml',
'info_dict': {
'id': 'a194cfa7f18c426b823d876668325946',
'ext': 'mp4',
'title': '小泽征尔音乐塾 音乐梦想无国界',
'duration': 2173,
'timestamp': 1369248264,
'upload_date': '20130522',
},
'params': {
'skip_download': True,
},
}, {
# var ids = ["id"]
'url': 'http://www.ncpa-classic.com/clt/more/416/index.shtml',
'info_dict': {
'id': 'a8606119a4884588a79d81c02abecc16',
'ext': 'mp3',
'title': '来自维也纳的新年贺礼',
'description': 'md5:f13764ae8dd484e84dd4b39d5bcba2a7',
'duration': 1578,
'uploader': 'djy',
'timestamp': 1482942419,
'upload_date': '20161228',
},
'params': {
'skip_download': True,
},
'expected_warnings': ['Failed to download m3u8 information'],
}, {
'url': 'http://ent.cntv.cn/2016/01/18/ARTIjprSSJH8DryTVr5Bx8Wb160118.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cntv.cn/video/C39296/e0210d949f113ddfb38d31f00a4e5c44',
'only_matching': True,
}, {
'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
'only_matching': True
'only_matching': True,
}]
def _real_extract(self, url):
video_id, display_id = re.match(self._VALID_URL, url).groups()
if not video_id:
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
webpage, 'video_id')
api_data = self._download_json(
'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_id = self._search_regex(
[r'var\s+guid\s*=\s*["\']([\da-fA-F]+)',
r'videoCenterId["\']\s*,\s*["\']([\da-fA-F]+)',
r'changePlayer\s*\(\s*["\']([\da-fA-F]+)',
r'load[Vv]ideo\s*\(\s*["\']([\da-fA-F]+)',
r'var\s+initMyAray\s*=\s*["\']([\da-fA-F]+)',
r'var\s+ids\s*=\s*\[["\']([\da-fA-F]+)'],
webpage, 'video id')
data = self._download_json(
'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do', video_id,
query={
'pid': video_id,
'url': url,
'idl': 32,
'idlr': 32,
'modifyed': 'false',
})
title = data['title']
formats = []
video = data.get('video')
if isinstance(video, dict):
for quality, chapters_key in enumerate(('lowChapters', 'chapters')):
video_url = try_get(
video, lambda x: x[chapters_key][0]['url'], compat_str)
if video_url:
formats.append({
'url': video_url,
'format_id': 'http',
'quality': quality,
'preference': -1,
})
hls_url = try_get(data, lambda x: x['hls_url'], compat_str)
if hls_url:
hls_url = re.sub(r'maxbr=\d+&?', '', hls_url)
formats.extend(self._extract_m3u8_formats(
hls_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
uploader = data.get('editer_name')
description = self._html_search_meta(
'description', webpage, default=None)
timestamp = unified_timestamp(data.get('f_pgmtime'))
duration = float_or_none(try_get(video, lambda x: x['totalLength']))
return {
'id': video_id,
'title': api_data['title'],
'formats': self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
'title': title,
'description': description,
'uploader': uploader,
'timestamp': timestamp,
'duration': duration,
'formats': formats,
}

View File

@@ -5,14 +5,16 @@ import re
from .common import InfoExtractor
from ..utils import (
decode_packed_codes,
ExtractorError,
parse_duration
float_or_none,
int_or_none,
parse_duration,
)
class CDAIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
_BASE_URL = 'http://www.cda.pl/'
_TESTS = [{
'url': 'http://www.cda.pl/video/5749950c',
'md5': '6f844bf51b15f31fae165365707ae970',
@@ -21,6 +23,9 @@ class CDAIE(InfoExtractor):
'ext': 'mp4',
'height': 720,
'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
'description': 'md5:269ccd135d550da90d1662651fcb9772',
'thumbnail': r're:^https?://.*\.jpg$',
'average_rating': float,
'duration': 39
}
}, {
@@ -30,6 +35,11 @@ class CDAIE(InfoExtractor):
'id': '57413289',
'ext': 'mp4',
'title': 'Lądowanie na lotnisku na Maderze',
'description': 'md5:60d76b71186dcce4e0ba6d4bbdb13e1a',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'crash404',
'view_count': int,
'average_rating': float,
'duration': 137
}
}, {
@@ -39,31 +49,55 @@ class CDAIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage('http://ebd.cda.pl/0x0/' + video_id, video_id)
self._set_cookie('cda.pl', 'cda.player', 'html5')
webpage = self._download_webpage(
self._BASE_URL + '/video/' + video_id, video_id)
if 'Ten film jest dostępny dla użytkowników premium' in webpage:
raise ExtractorError('This video is only available for premium users.', expected=True)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
formats = []
uploader = self._search_regex(r'''(?x)
<(span|meta)[^>]+itemprop=(["\'])author\2[^>]*>
(?:<\1[^>]*>[^<]*</\1>|(?!</\1>)(?:.|\n))*?
<(span|meta)[^>]+itemprop=(["\'])name\4[^>]*>(?P<uploader>[^<]+)</\3>
''', webpage, 'uploader', default=None, group='uploader')
view_count = self._search_regex(
r'Odsłony:(?:\s|&nbsp;)*([0-9]+)', webpage,
'view_count', default=None)
average_rating = self._search_regex(
r'<(?:span|meta)[^>]+itemprop=(["\'])ratingValue\1[^>]*>(?P<rating_value>[0-9.]+)',
webpage, 'rating', fatal=False, group='rating_value')
info_dict = {
'id': video_id,
'title': title,
'title': self._og_search_title(webpage),
'description': self._og_search_description(webpage),
'uploader': uploader,
'view_count': int_or_none(view_count),
'average_rating': float_or_none(average_rating),
'thumbnail': self._og_search_thumbnail(webpage),
'formats': formats,
'duration': None,
}
def extract_format(page, version):
unpacked = decode_packed_codes(page)
format_url = self._search_regex(
r"(?:file|url)\s*:\s*(\\?[\"'])(?P<url>http.+?)\1", unpacked,
'%s url' % version, fatal=False, group='url')
if not format_url:
json_str = self._search_regex(
r'player_data=(\\?["\'])(?P<player_data>.+?)\1', page,
'%s player_json' % version, fatal=False, group='player_data')
if not json_str:
return
player_data = self._parse_json(
json_str, '%s player_data' % version, fatal=False)
if not player_data:
return
video = player_data.get('video')
if not video or 'file' not in video:
self.report_warning('Unable to extract %s version information' % version)
return
f = {
'url': format_url,
'url': video['file'],
}
m = re.search(
r'<a[^>]+data-quality="(?P<format_id>[^"]+)"[^>]+href="[^"]+"[^>]+class="[^"]*quality-btn-active[^"]*">(?P<height>[0-9]+)p',
@@ -75,9 +109,7 @@ class CDAIE(InfoExtractor):
})
info_dict['formats'].append(f)
if not info_dict['duration']:
info_dict['duration'] = parse_duration(self._search_regex(
r"duration\s*:\s*(\\?[\"'])(?P<duration>.+?)\1",
unpacked, 'duration', fatal=False, group='duration'))
info_dict['duration'] = parse_duration(video.get('duration'))
extract_format(webpage, 'default')
@@ -85,7 +117,8 @@ class CDAIE(InfoExtractor):
r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)',
webpage):
webpage = self._download_webpage(
href, video_id, 'Downloading %s version information' % resolution, fatal=False)
self._BASE_URL + href, video_id,
'Downloading %s version information' % resolution, fatal=False)
if not webpage:
# Manually report warning because empty page is returned when
# invalid version is requested.

View File

@@ -25,7 +25,7 @@ class CeskaTelevizeIE(InfoExtractor):
'ext': 'mp4',
'title': 'Hyde Park Civilizace',
'description': 'md5:fe93f6eda372d150759d11644ebbfb4a',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3350,
},
'params': {
@@ -39,7 +39,7 @@ class CeskaTelevizeIE(InfoExtractor):
'ext': 'mp4',
'title': 'Hyde Park Civilizace: Bonus 01 - En',
'description': 'English Subtittles',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 81.3,
},
'params': {
@@ -52,7 +52,7 @@ class CeskaTelevizeIE(InfoExtractor):
'info_dict': {
'id': 402,
'ext': 'mp4',
'title': 're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'title': r're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'is_live': True,
},
'params': {
@@ -80,7 +80,7 @@ class CeskaTelevizeIE(InfoExtractor):
'id': '61924494877068022',
'ext': 'mp4',
'title': 'Queer: Bogotart (Queer)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 1558.3,
},
}],

View File

@@ -31,7 +31,7 @@ class Channel9IE(InfoExtractor):
'title': 'Developer Kick-Off Session: Stuff We Love',
'description': 'md5:c08d72240b7c87fcecafe2692f80e35f',
'duration': 4576,
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
'session_code': 'KOS002',
'session_day': 'Day 1',
'session_room': 'Arena 1A',
@@ -47,7 +47,7 @@ class Channel9IE(InfoExtractor):
'title': 'Self-service BI with Power BI - nuclear testing',
'description': 'md5:d1e6ecaafa7fb52a2cacdf9599829f5b',
'duration': 1540,
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
'authors': ['Mike Wilmot'],
},
}, {
@@ -59,7 +59,7 @@ class Channel9IE(InfoExtractor):
'title': 'Ranges for the Standard Library',
'description': 'md5:2e6b4917677af3728c5f6d63784c4c5d',
'duration': 5646,
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
'params': {
'skip_download': True,

View File

@@ -13,7 +13,7 @@ class CharlieRoseIE(InfoExtractor):
'id': '27996',
'ext': 'mp4',
'title': 'Remembering Zaha Hadid',
'thumbnail': 're:^https?://.*\.jpg\?\d+',
'thumbnail': r're:^https?://.*\.jpg\?\d+',
'description': 'We revisit past conversations with Zaha Hadid, in memory of the world renowned Iraqi architect.',
'subtitles': {
'en': [{

View File

@@ -30,7 +30,7 @@ class CliphunterIE(InfoExtractor):
'id': '1012420',
'ext': 'flv',
'title': 'Fun Jynx Maze solo',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
'skip': 'Video gone',
@@ -41,7 +41,7 @@ class CliphunterIE(InfoExtractor):
'id': '2019449',
'ext': 'mp4',
'title': 'ShesNew - My booty girlfriend, Victoria Paradice\'s pussy filled with jizz',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
}]

View File

@@ -18,7 +18,7 @@ class ClipsyndicateIE(InfoExtractor):
'ext': 'mp4',
'title': 'Brick Briscoe',
'duration': 612,
'thumbnail': 're:^https?://.+\.jpg',
'thumbnail': r're:^https?://.+\.jpg',
},
}, {
'url': 'http://chic.clipsyndicate.com/video/play/5844117/shark_attack',

View File

@@ -19,7 +19,7 @@ class ClubicIE(InfoExtractor):
'ext': 'mp4',
'title': 'Clubic Week 2.0 : le FBI se lance dans la photo d\u0092identité',
'description': 're:Gueule de bois chez Nokia. Le constructeur a indiqué cette.*',
'thumbnail': 're:^http://img\.clubic\.com/.*\.jpg$',
'thumbnail': r're:^http://img\.clubic\.com/.*\.jpg$',
}
}, {
'url': 'http://www.clubic.com/video/video-clubic-week-2-0-apple-iphone-6s-et-plus-mais-surtout-le-pencil-469792.html',

View File

@@ -21,7 +21,7 @@ class CollegeRamaIE(InfoExtractor):
'ext': 'mp4',
'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
'description': '',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 7713.088,
'timestamp': 1413309600,
'upload_date': '20141014',

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
class ComedyCentralIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(video-clips|episodes|cc-studios|video-collections|full-episodes|shows)
(video-clips|episodes|cc-studios|video-collections|shows(?=/[^/]+/(?!full-episodes)))
/(?P<title>.*)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
@@ -27,6 +27,41 @@ class ComedyCentralIE(MTVServicesInfoExtractor):
}]
class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(?:full-episodes|shows(?=/[^/]+/full-episodes))
/(?P<id>[^?]+)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.cc.com/full-episodes/pv391a/the-daily-show-with-trevor-noah-november-28--2016---ryan-speedo-green-season-22-ep-22028',
'info_dict': {
'description': 'Donald Trump is accused of exploiting his president-elect status for personal gain, Cuban leader Fidel Castro dies, and Ryan Speedo Green discusses "Sing for Your Life."',
'title': 'November 28, 2016 - Ryan Speedo Green',
},
'playlist_count': 4,
}, {
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'only_matching': True,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
feed = self._parse_json(feed_json, playlist_id)
zones = feed['manifest']['zones']
video_zone = zones['t2_lc_promo1']
feed = self._download_json(video_zone['feed'], playlist_id)
mgid = feed['result']['data']['id']
videos_info = self._get_videos_info(mgid, use_hls=True)
return videos_info
class ToshIE(MTVServicesInfoExtractor):
IE_DESC = 'Tosh.0'
_VALID_URL = r'^https?://tosh\.cc\.com/video-(?:clips|collections)/[^/]+/(?P<videotitle>[^/?#]+)'
@@ -45,7 +80,7 @@ class ToshIE(MTVServicesInfoExtractor):
'ext': 'mp4',
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
'description': 'Tosh asked fans to share their summer plans.',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
# It's really reported to be published on year 2077
'upload_date': '20770610',
'timestamp': 3390510600,

View File

@@ -30,6 +30,7 @@ from ..downloader.f4m import remove_encrypted_media
from ..utils import (
NO_DEFAULT,
age_restricted,
base_url,
bug_reports_message,
clean_html,
compiled_regex_type,
@@ -58,6 +59,7 @@ from ..utils import (
parse_m3u8_attributes,
extract_attributes,
parse_codecs,
urljoin,
)
@@ -187,9 +189,10 @@ class InfoExtractor(object):
uploader_url: Full URL to a personal webpage of the video uploader.
location: Physical location where the video was filmed.
subtitles: The available subtitles as a dictionary in the format
{language: subformats}. "subformats" is a list sorted from
lower to higher preference, each element is a dictionary
with the "ext" entry and one of:
{tag: subformats}. "tag" is usually a language code, and
"subformats" is a list sorted from lower to higher
preference, each element is a dictionary with the "ext"
entry and one of:
* "data": The subtitles file contents
* "url": A URL pointing to the subtitles file
"ext" will be calculated from URL if missing
@@ -235,7 +238,7 @@ class InfoExtractor(object):
chapter_id: Id of the chapter the video belongs to, as a unicode string.
The following fields should only be used when the video is an episode of some
series or programme:
series, programme or podcast:
series: Title of the series or programme the video episode belongs to.
season: Title of the season the video episode belongs to.
@@ -885,7 +888,7 @@ class InfoExtractor(object):
'url': e.get('contentUrl'),
'title': unescapeHTML(e.get('name')),
'description': unescapeHTML(e.get('description')),
'thumbnail': e.get('thumbnailUrl'),
'thumbnail': e.get('thumbnailUrl') or e.get('thumbnailURL'),
'duration': parse_duration(e.get('duration')),
'timestamp': unified_timestamp(e.get('uploadDate')),
'filesize': float_or_none(e.get('contentSize')),
@@ -1100,6 +1103,13 @@ class InfoExtractor(object):
manifest, ['{http://ns.adobe.com/f4m/1.0}bootstrapInfo', '{http://ns.adobe.com/f4m/2.0}bootstrapInfo'],
'bootstrap info', default=None)
vcodec = None
mime_type = xpath_text(
manifest, ['{http://ns.adobe.com/f4m/1.0}mimeType', '{http://ns.adobe.com/f4m/2.0}mimeType'],
'base URL', default=None)
if mime_type and mime_type.startswith('audio/'):
vcodec = 'none'
for i, media_el in enumerate(media_nodes):
tbr = int_or_none(media_el.attrib.get('bitrate'))
width = int_or_none(media_el.attrib.get('width'))
@@ -1140,6 +1150,7 @@ class InfoExtractor(object):
'width': f.get('width') or width,
'height': f.get('height') or height,
'format_id': f.get('format_id') if not tbr else format_id,
'vcodec': vcodec,
})
formats.extend(f4m_formats)
continue
@@ -1156,6 +1167,7 @@ class InfoExtractor(object):
'tbr': tbr,
'width': width,
'height': height,
'vcodec': vcodec,
'preference': preference,
})
return formats
@@ -1214,6 +1226,7 @@ class InfoExtractor(object):
'protocol': entry_protocol,
'preference': preference,
}]
audio_in_video_stream = {}
last_info = {}
last_media = {}
for line in m3u8_doc.splitlines():
@@ -1223,25 +1236,32 @@ class InfoExtractor(object):
media = parse_m3u8_attributes(line)
media_type = media.get('TYPE')
if media_type in ('VIDEO', 'AUDIO'):
group_id = media.get('GROUP-ID')
media_url = media.get('URI')
if media_url:
format_id = []
for v in (media.get('GROUP-ID'), media.get('NAME')):
for v in (group_id, media.get('NAME')):
if v:
format_id.append(v)
formats.append({
f = {
'format_id': '-'.join(format_id),
'url': format_url(media_url),
'language': media.get('LANGUAGE'),
'vcodec': 'none' if media_type == 'AUDIO' else None,
'ext': ext,
'protocol': entry_protocol,
'preference': preference,
})
}
if media_type == 'AUDIO':
f['vcodec'] = 'none'
if group_id and not audio_in_video_stream.get(group_id):
audio_in_video_stream[group_id] = False
formats.append(f)
else:
# When there is no URI in EXT-X-MEDIA let this tag's
# data be used by regular URI lines below
last_media = media
if media_type == 'AUDIO' and group_id:
audio_in_video_stream[group_id] = True
elif line.startswith('#') or not line.strip():
continue
else:
@@ -1270,9 +1290,10 @@ class InfoExtractor(object):
}
resolution = last_info.get('RESOLUTION')
if resolution:
width_str, height_str = resolution.split('x')
f['width'] = int(width_str)
f['height'] = int(height_str)
mobj = re.search(r'(?P<width>\d+)[xX](?P<height>\d+)', resolution)
if mobj:
f['width'] = int(mobj.group('width'))
f['height'] = int(mobj.group('height'))
# Unified Streaming Platform
mobj = re.search(
r'audio.*?(?:%3D|=)(\d+)(?:-video.*?(?:%3D|=)(\d+))?', f['url'])
@@ -1284,6 +1305,9 @@ class InfoExtractor(object):
'abr': abr,
})
f.update(parse_codecs(last_info.get('CODECS')))
if audio_in_video_stream.get(last_info.get('AUDIO')) is False:
# TODO: update acodec for for audio only formats with the same GROUP-ID
f['acodec'] = 'none'
formats.append(f)
last_info = {}
last_media = {}
@@ -1530,7 +1554,7 @@ class InfoExtractor(object):
if res is False:
return []
mpd, urlh = res
mpd_base_url = re.match(r'https?://.+/', urlh.geturl()).group()
mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats(
compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url,
@@ -1613,11 +1637,6 @@ class InfoExtractor(object):
extract_Initialization(segment_template)
return ms_info
def combine_url(base_url, target_url):
if re.match(r'^https?://', target_url):
return target_url
return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
mpd_duration = parse_duration(mpd_doc.get('mediaPresentationDuration'))
formats = []
for period in mpd_doc.findall(_add_ns('Period')):
@@ -1667,12 +1686,11 @@ class InfoExtractor(object):
'tbr': int_or_none(representation_attrib.get('bandwidth'), 1000),
'asr': int_or_none(representation_attrib.get('audioSamplingRate')),
'fps': int_or_none(representation_attrib.get('frameRate')),
'vcodec': 'none' if content_type == 'audio' else representation_attrib.get('codecs'),
'acodec': 'none' if content_type == 'video' else representation_attrib.get('codecs'),
'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None,
'format_note': 'DASH %s' % content_type,
'filesize': filesize,
}
f.update(parse_codecs(representation_attrib.get('codecs')))
representation_ms_info = extract_multisegment_info(representation, adaption_set_ms_info)
if 'segment_urls' not in representation_ms_info and 'media_template' in representation_ms_info:
@@ -1692,7 +1710,7 @@ class InfoExtractor(object):
representation_ms_info['fragments'] = [{
'url': media_template % {
'Number': segment_number,
'Bandwidth': representation_attrib.get('bandwidth'),
'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
},
'duration': segment_duration,
} for segment_number in range(
@@ -1710,7 +1728,7 @@ class InfoExtractor(object):
def add_segment_url():
segment_url = media_template % {
'Time': segment_time,
'Bandwidth': representation_attrib.get('bandwidth'),
'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
'Number': segment_number,
}
representation_ms_info['fragments'].append({
@@ -1756,7 +1774,7 @@ class InfoExtractor(object):
f['fragments'].append({'url': initialization_url})
f['fragments'].extend(representation_ms_info['fragments'])
for fragment in f['fragments']:
fragment['url'] = combine_url(base_url, fragment['url'])
fragment['url'] = urljoin(base_url, fragment['url'])
try:
existing_format = next(
fo for fo in formats
@@ -1771,7 +1789,106 @@ class InfoExtractor(object):
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8'):
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True):
res = self._download_webpage_handle(
ism_url, video_id,
note=note or 'Downloading ISM manifest',
errnote=errnote or 'Failed to download ISM manifest',
fatal=fatal)
if res is False:
return []
ism, urlh = res
return self._parse_ism_formats(
compat_etree_fromstring(ism.encode('utf-8')), urlh.geturl(), ism_id)
def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None):
if ism_doc.get('IsLive') == 'TRUE' or ism_doc.find('Protection') is not None:
return []
duration = int(ism_doc.attrib['Duration'])
timescale = int_or_none(ism_doc.get('TimeScale')) or 10000000
formats = []
for stream in ism_doc.findall('StreamIndex'):
stream_type = stream.get('Type')
if stream_type not in ('video', 'audio'):
continue
url_pattern = stream.attrib['Url']
stream_timescale = int_or_none(stream.get('TimeScale')) or timescale
stream_name = stream.get('Name')
for track in stream.findall('QualityLevel'):
fourcc = track.get('FourCC')
# TODO: add support for WVC1 and WMAP
if fourcc not in ('H264', 'AVC1', 'AACL'):
self.report_warning('%s is not a supported codec' % fourcc)
continue
tbr = int(track.attrib['Bitrate']) // 1000
width = int_or_none(track.get('MaxWidth'))
height = int_or_none(track.get('MaxHeight'))
sampling_rate = int_or_none(track.get('SamplingRate'))
track_url_pattern = re.sub(r'{[Bb]itrate}', track.attrib['Bitrate'], url_pattern)
track_url_pattern = compat_urlparse.urljoin(ism_url, track_url_pattern)
fragments = []
fragment_ctx = {
'time': 0,
}
stream_fragments = stream.findall('c')
for stream_fragment_index, stream_fragment in enumerate(stream_fragments):
fragment_ctx['time'] = int_or_none(stream_fragment.get('t')) or fragment_ctx['time']
fragment_repeat = int_or_none(stream_fragment.get('r')) or 1
fragment_ctx['duration'] = int_or_none(stream_fragment.get('d'))
if not fragment_ctx['duration']:
try:
next_fragment_time = int(stream_fragment[stream_fragment_index + 1].attrib['t'])
except IndexError:
next_fragment_time = duration
fragment_ctx['duration'] = (next_fragment_time - fragment_ctx['time']) / fragment_repeat
for _ in range(fragment_repeat):
fragments.append({
'url': re.sub(r'{start[ _]time}', compat_str(fragment_ctx['time']), track_url_pattern),
'duration': fragment_ctx['duration'] / stream_timescale,
})
fragment_ctx['time'] += fragment_ctx['duration']
format_id = []
if ism_id:
format_id.append(ism_id)
if stream_name:
format_id.append(stream_name)
format_id.append(compat_str(tbr))
formats.append({
'format_id': '-'.join(format_id),
'url': ism_url,
'manifest_url': ism_url,
'ext': 'ismv' if stream_type == 'video' else 'isma',
'width': width,
'height': height,
'tbr': tbr,
'asr': sampling_rate,
'vcodec': 'none' if stream_type == 'audio' else fourcc,
'acodec': 'none' if stream_type == 'video' else fourcc,
'protocol': 'ism',
'fragments': fragments,
'_download_params': {
'duration': duration,
'timescale': stream_timescale,
'width': width or 0,
'height': height or 0,
'fourcc': fourcc,
'codec_private_data': track.get('CodecPrivateData'),
'sampling_rate': sampling_rate,
'channels': int_or_none(track.get('Channels', 2)),
'bits_per_sample': int_or_none(track.get('BitsPerSample', 16)),
'nal_unit_length_field': int_or_none(track.get('NALUnitLengthField', 4)),
},
})
return formats
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None):
def absolute_url(video_url):
return compat_urlparse.urljoin(base_url, video_url)
@@ -1788,11 +1905,16 @@ class InfoExtractor(object):
def _media_formats(src, cur_media_type):
full_url = absolute_url(src)
if determine_ext(full_url) == 'm3u8':
ext = determine_ext(full_url)
if ext == 'm3u8':
is_plain_url = False
formats = self._extract_m3u8_formats(
full_url, video_id, ext='mp4',
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id)
elif ext == 'mpd':
is_plain_url = False
formats = self._extract_mpd_formats(
full_url, video_id, mpd_id=mpd_id)
else:
is_plain_url = True
formats = [{
@@ -1875,11 +1997,11 @@ class InfoExtractor(object):
formats.extend(self._extract_f4m_formats(
http_base_url + '/manifest.f4m',
video_id, f4m_id='hds', fatal=False))
if 'dash' not in skip_protocols:
formats.extend(self._extract_mpd_formats(
http_base_url + '/manifest.mpd',
video_id, mpd_id='dash', fatal=False))
if re.search(r'(?:/smil:|\.smil)', url_base):
if 'dash' not in skip_protocols:
formats.extend(self._extract_mpd_formats(
http_base_url + '/manifest.mpd',
video_id, mpd_id='dash', fatal=False))
if 'smil' not in skip_protocols:
rtmp_formats = self._extract_smil_formats(
http_base_url + '/jwplayer.smil',

View File

@@ -20,7 +20,7 @@ class CoubIE(InfoExtractor):
'id': '5u5n1',
'ext': 'mp4',
'title': 'The Matrix Moonwalk',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 4.6,
'timestamp': 1428527772,
'upload_date': '20150408',

View File

@@ -14,7 +14,7 @@ class CrackleIE(InfoExtractor):
'ext': 'mp4',
'title': 'Everybody Respects A Bloody Nose',
'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). Theyre headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 906,
'series': 'Comedians In Cars Getting Coffee',
'season_number': 8,

View File

@@ -14,7 +14,7 @@ class CriterionIE(InfoExtractor):
'ext': 'mp4',
'title': 'Le Samouraï',
'description': 'md5:a2b4b116326558149bef81f76dcbb93f',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
}
}

View File

@@ -16,7 +16,7 @@ class CrooksAndLiarsIE(InfoExtractor):
'ext': 'mp4',
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',

View File

@@ -142,7 +142,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'ext': 'flv',
'title': 'Culture Japan Episode 1 Rebuilding Japan after the 3.11',
'description': 'md5:2fbc01f90b87e8e9137296f37b461c12',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Danny Choo Network',
'upload_date': '20120213',
},
@@ -158,7 +158,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'ext': 'mp4',
'title': 'Re:ZERO -Starting Life in Another World- Episode 5 The Morning of Our Promise Is Still Distant',
'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'TV TOKYO',
'upload_date': '20160508',
},
@@ -236,7 +236,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
output += 'WrapStyle: %s\n' % sub_root.attrib['wrap_style']
output += 'PlayResX: %s\n' % sub_root.attrib['play_res_x']
output += 'PlayResY: %s\n' % sub_root.attrib['play_res_y']
output += """ScaledBorderAndShadow: yes
output += """ScaledBorderAndShadow: no
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding

View File

@@ -28,7 +28,7 @@ class CtsNewsIE(InfoExtractor):
'ext': 'mp4',
'title': '韓國31歲童顏男 貌如十多歲小孩',
'description': '越有年紀的人越希望看起來年輕一點而南韓卻有一位31歲的男子看起來像是11、12歲的小孩身...',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1378205880,
'upload_date': '20130903',
}
@@ -41,7 +41,7 @@ class CtsNewsIE(InfoExtractor):
'ext': 'mp4',
'title': 'iPhone6熱銷 蘋果財報亮眼',
'description': 'md5:f395d4f485487bb0f992ed2c4b07aa7d',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150128',
'uploader_id': 'TBSCTS',
'uploader': '中華電視公司',

Some files were not shown because too many files have changed in this diff Show More