Compare commits

...

120 Commits

Author SHA1 Message Date
Sergey M․
c1c2fe2045 release 2017.01.16 2017-01-16 23:44:04 +07:00
Sergey M․
ddd53c392e [ChangeLog] Actualize 2017-01-16 23:42:04 +07:00
Sergey M․
79fc8496c6 [xiami] Improve extraction (closes #11699)
* Relax _VALID_URLs
* Improve track metadata extraction
2017-01-16 23:31:50 +07:00
Sergey M․
0ce8c66fb0 [options] Include custom conf in final argv (closes #11741) 2017-01-16 22:07:12 +07:00
Sergey M․
906420cae3 [limelight] Improve and make more robust (closes #11737)
+ Add support for direct http for videos hosted on video.llnw.net
* Check handmade http URLs
2017-01-16 21:54:47 +07:00
Yen Chi Hsuan
16e2c8f771 [brightcove] Recognize another player ID
Closes #11688
2017-01-16 00:06:52 +08:00
Yen Chi Hsuan
dcae7b3fdc [niconico] Allow login via cookies
Some codes are borrowed from #7968, which is by @jlhg

Closes #7968
2017-01-15 22:51:54 +08:00
Yen Chi Hsuan
8e4988f1a2 [niconico] Remove codes for downloading anonymously
Apparently Niconico now blocks playing without an account

Closes #11170
2017-01-15 22:10:57 +08:00
Sergey M․
a7acf868a5 [yourupload] Fix extraction (closes #11601) 2017-01-15 10:34:39 +07:00
Sergey M․
6f0be93747 [YoutubeDL] Improve protocol auto determining (closes #11720) 2017-01-15 06:09:32 +07:00
Sergey M․
af62de104f [beam:live] Improve and simplify (#10702, closes #11596) 2017-01-15 06:07:35 +07:00
sh!zeeg
cd55c6ccd7 [beam:live] Add extractor 2017-01-15 06:06:10 +07:00
Sergey M․
621a2800ca [vevo] Improve geo restriction detection 2017-01-15 04:42:05 +07:00
Sergey M․
b80e2ebc8d [dramafever] Add support for URLs with language code (#11714) 2017-01-14 18:27:22 +07:00
Remita Amine
99d537a5e0 [ooyala] fix typo 2017-01-14 07:12:50 +01:00
Sergey M․
8854f3fe78 [README.md] Clarify newline format in cookies section (closes #11709) 2017-01-14 08:48:26 +07:00
Sergey M․
abe8cb763f [cbc] Improve playlist support (closes #11704) 2017-01-14 08:30:00 +07:00
Sergey M․
5d4c7daa49 release 2017.01.14 2017-01-14 07:31:07 +07:00
Sergey M․
0b94510cd0 [ChangeLog] Actualize 2017-01-14 07:30:32 +07:00
Jakub Wilk
4f66c16f33 [brightcove:legacy] Fix misplaced backslash in a regexp 2017-01-14 06:26:11 +07:00
Sergey M․
e54fc0524e [cmt] Add support for video-clips 2017-01-14 06:23:24 +07:00
Sergey M․
adf063dad1 [mtv,cc,cmt,spike] Improve and refactor
- Eliminate _transform_rtmp_url
* Generalize triforce mgid extraction
+ [cmt] Add support for full-episodes (closes #11623)
2017-01-14 06:18:38 +07:00
Remita Amine
5e8eebb600 [mitele] extract dash formats 2017-01-13 23:06:59 +01:00
Remita Amine
9837cb7507 [ooyala] add support for videos with embedToken(#11684) 2017-01-13 23:06:59 +01:00
Sergey M․
fb6a59205e [mixcloud] Fix extraction (closes #11674) 2017-01-13 23:56:16 +07:00
Vijay Singh
06e9363b7a [openload] Fix extraction (closes #10408)
Just a minor fix for openload
2017-01-13 23:40:19 +07:00
Remita Amine
1f393a3241 [tv4] improve extraction(closes #11698)
- remove check for requires_subscription
- extract more formats
- extract subtitles
2017-01-13 10:21:37 +01:00
Remita Amine
c4251b9aaa [common] add possibility to customize akamai manifest host 2017-01-13 10:21:36 +01:00
Sergey M․
3a407e707a [freesound] Improve and remove unrelated metadata (closes #11608) 2017-01-12 23:03:53 +07:00
Sergey M․
cb655f34fb [utils] Add more date formats 2017-01-12 22:39:45 +07:00
sh!zeeg
ed06da4e7b [freesound] Fix extraction and extended (closes #11602) 2017-01-12 22:35:14 +07:00
Sergey M․
365d136b7c [vimeo] Fix tests 2017-01-11 22:57:08 +07:00
Sergey M․
1fd0fc42bd [vimeo:ondemand] Fix test (closes #11651) 2017-01-11 22:51:03 +07:00
Sergey M․
10cd2003b4 [nick] Add support for beta.nick.com (closes #11655) 2017-01-10 22:32:34 +07:00
Sergey M․
cdd11c0540 [mtv] Use native hls by default 2017-01-10 22:31:20 +07:00
Sergey M․
67fc365b86 [mtv,cc] Use hls by default (closes #11641) 2017-01-10 22:30:47 +07:00
Sergey M․
20faad74b6 [mtv] Fix non-hls extraction
method attribute may not be present
2017-01-10 22:27:23 +07:00
Sergey M․
2032d935d1 [mtv] Add default value for use_hls
These methods are used across codebase with old number of arguments
2017-01-10 22:25:33 +07:00
Sergey M․
31ea2ad89d release 2017.01.10 2017-01-10 21:29:20 +07:00
Sergey M․
2184d44361 [ChangeLog] Actualize 2017-01-10 21:27:17 +07:00
Sergey M․
d1aeacd9bf [youtube] Fix extraction (closes #11663, #11664) 2017-01-10 21:25:29 +07:00
Sergey M․
366b759a60 [inc] Improve (closes #11647) 2017-01-09 23:08:59 +07:00
Déstin Reed
7f0bdc7a31 [inc] Add extractor 2017-01-09 22:57:14 +07:00
Sergey M․
022a5d663b [youtube] Add test for itag 212 (#11575) 2017-01-09 22:30:46 +07:00
Kacper Michajłow
8409b3683c [youtube] Add itag 212
Seen on video with id 1t24XAntNCY
2017-01-09 22:29:03 +07:00
Philipp Hagemeister
bfedb2cc5a small fix to Changelog format 2017-01-09 11:26:01 +01:00
Philipp Hagemeister
8084951b7f [egghead:course] Add support for egghead.io course playlists
Individual egghead videos are already handled by the generic/Wistia extractors.
2017-01-09 11:24:40 +01:00
Sergey M․
e7ea724cb9 release 2017.01.08 2017-01-08 20:58:43 +07:00
Sergey M․
e60166020b [ChangeLog] Actualize 2017-01-08 20:56:38 +07:00
Sergey M․
364131584b [hitrecord] Improve (closes #11626) 2017-01-08 20:17:18 +07:00
J
553c68bbd9 [hitrecord] Add extractor 2017-01-08 20:17:18 +07:00
Remita Amine
827961b122 [videott] remove extractor 2017-01-07 14:47:36 +01:00
Remita Amine
a5eefc492b [swrmediathek] skip tests correctly 2017-01-06 15:09:10 +01:00
Remita Amine
a9cd1691b2 [swrmediathek] improve extraction 2017-01-06 15:06:08 +01:00
Remita Amine
2365f94412 [sharesix] remove extractor 2017-01-06 13:56:58 +01:00
Remita Amine
32b7c2a57e [aol] remove AolFeaturesIE 2017-01-06 12:10:47 +01:00
Remita Amine
221ce32529 [break] merge BreakIE and ScreenJunkiesIE 2017-01-06 11:25:48 +01:00
Remita Amine
e5dfdc8164 [sendtonews] improve info extraction 2017-01-06 11:23:43 +01:00
Remita Amine
a814da3f62 [skynews] update test 2017-01-06 11:22:35 +01:00
Sergey M․
b2727d0bee [3sat,phoenix] Fix extraction (closes #11619) 2017-01-06 17:13:53 +07:00
Philipp Hagemeister
dbaf601646 [comedycentral/mtv] Add support for HLS videos (fixes #11600)
Currently, the HTTP files of the RTMP urls are not present for the The Daily Show.
Use HLS instead for now.
2017-01-05 22:36:07 +01:00
Sergey M․
a9ee260217 [README.md] Mention --config-location in configuration section (#11615) 2017-01-06 02:45:45 +07:00
Yen Chi Hsuan
1219201143 [ChangeLog] Update after #11581
[ci skip]
2017-01-06 01:07:30 +08:00
Yen Chi Hsuan
ec85ded83c Fix "invalid escape sequences" error on Python 3.6 2017-01-06 00:58:56 +08:00
Yen Chi Hsuan
24d8a75982 [discoverygo] Fix JSON data parsing
HTMLParser, which is used by extract_attributes, already unescapes
attribute values with HTMLParser.unescape. They shouldn't be unescaped
again, to there may be parsing errors.

Ref: #11219, #11522
2017-01-05 18:50:34 +08:00
Sergey M․
7232bb299b release 2017.01.05 2017-01-05 04:10:15 +07:00
Sergey M․
2b12e34076 [ChangeLog] Actualize 2017-01-05 04:07:32 +07:00
Sergey M․
fb47cb5b23 [zdf] Improve (closes #11055, closes #11063) 2017-01-05 04:05:27 +07:00
Paul Hartmann
b6de53ea8a [zdf] Fix extraction 2017-01-05 04:04:53 +07:00
Sergey M․
96d315c2be [pornhub:playlist] Improve extraction (closes #11594) 2017-01-04 05:32:18 +07:00
Sergey M․
1911d77d28 [cctv] Add support for ncpa-classic.com (closes #11591) 2017-01-04 01:30:40 +07:00
Sergey M․
027e231295 [tunein] Add support for embeds (closes #11579) 2017-01-03 01:45:59 +07:00
Sergey M․
7a9e066972 [cctv] Relax some video id regexes 2017-01-03 01:13:02 +07:00
Sergey M․
2021b650dd release 2017.01.02 2017-01-02 23:55:04 +07:00
Sergey M․
b890caaf21 [ChangeLog] Actualize 2017-01-02 23:54:29 +07:00
Sergey M․
3783a5ccba [cctv] Relax _VALID_URL 2017-01-02 23:18:44 +07:00
Sergey M․
327caf661a [cctv] Do not fallback on video id extracted from URL 2017-01-02 23:00:37 +07:00
Sergey M․
ce7ccb1caa [cctv] Improve and merge with cntv (closes #879, closes #6753, closes #8541) 2017-01-02 22:55:24 +07:00
RPing
295eac6165 [cntv] Add extractor 2017-01-02 22:55:19 +07:00
Yen Chi Hsuan
d546d4c8e0 Merge pull request #11578 from rpunkfu/patch-2
Update Readme: Set HOME properly in install command
2017-01-02 13:50:26 +08:00
Oskar Cieslik
eec45445a8 Update Readme: Set home in sudo pip install
Hi, it's not always a default behaviour, but when you use `sudo pip` you most likely may want to use `-H` flag to set HOME value to directory of the target user :)
2017-01-02 00:22:10 +01:00
Sergey M․
7fc06b6a15 [README.md] Update link to available YoutubeDL options 2017-01-01 23:36:52 +07:00
Sergey M․
966815e139 [nrktv:episodes] Add support for episodes (#11571) 2017-01-01 21:26:32 +07:00
Sergey M․
e5e19379be [ISSUE_TEMPLATE_tmpl.md] Clarify example URLs constraints for site support request 2017-01-01 15:11:49 +07:00
Sergey M․
1f766b6e7b [arkena] Add support for video.arkena.com (closes #11568) 2017-01-01 02:46:47 +07:00
Sergey M․
dc48a35404 release 2016.12.31 2016-12-31 23:58:41 +07:00
Sergey M․
1ea0b727c4 [ChangeLog] Actualize 2016-12-31 23:53:30 +07:00
Sergey M․
b6ee45e9fa Improve custom config support (closes #10648) 2016-12-31 23:41:37 +07:00
Fabian Stahl
e66dca5e4a Add option --config-location
A configfile can now be passed to youtube_dl.

undo changes

Raise parser error if file not found, change to user_conf

change metavar hand helptext for --configfile

Fix help for --configfile

Update help for --configfile

Numbering placeholder in configfile error msg

minor fix

Change option --configfile top --config-file

Fix -config-file error
2016-12-31 23:04:16 +07:00
Sergey M․
3f1ce16876 [twitch:vod] Improve _VALID_URL (closes #11537) 2016-12-31 22:40:42 +07:00
Robert Smith
9a0f999585 [twitch] Added support for player.twitch.tv URLs (closes #11535) 2016-12-31 22:32:49 +07:00
David Haberthür
3540fe262f [README.md] Fix spelling and harmonize line length 2016-12-31 22:29:35 +07:00
Sergey M․
e186a9ec03 [videa] Add support for videa embeds 2016-12-31 22:05:32 +07:00
Sergey M․
69677f3ee2 [videa] Improve and simplify (closes #8181, closes #11133) 2016-12-31 22:05:32 +07:00
Bagira
e746021577 [videa] Add extractor 2016-12-31 22:05:32 +07:00
Chris Gavin
490da94edf [devscripts/buildserver] Remove unreachable except block 2016-12-31 19:17:52 +07:00
Sergey M․
424ed37ec4 [vk] Fix postlive videos extraction 2016-12-30 04:31:19 +07:00
Sergey M․
9cdb0a338d [vk] Extract from playerParams (closes #11555) 2016-12-30 04:21:49 +07:00
Sergey M․
6cf261d882 [freevideo] Remove extractor (closes #11515)
Handled by generic extractor
2016-12-30 00:32:23 +07:00
Sergey M․
df086e74e2 [showroomlive] Improve (closes #11458) 2016-12-30 00:12:35 +07:00
Arjan Verwer
963bd5ecfc [showroomlive] Add extractor 2016-12-29 23:17:00 +07:00
Sergey M․
51378d359e [xhamster] Fix duration extraction (closes #11549) 2016-12-28 23:04:46 +07:00
Sergey M․
b63005f5af [rtve:live] Fix extraction (closes #11529) 2016-12-25 04:02:29 +07:00
Yen Chi Hsuan
4606c34e19 [extractor/common] Allow non-lang in subtitles' keys
See 264e77c406
2016-12-25 01:50:50 +08:00
Sergey M․
53a664edf4 [brightcove:legacy] Improve embeds detection (closes #11523) 2016-12-24 22:46:27 +07:00
Sergey M․
264e77c406 [twitch] Add support for rechat messages (closes #11524) 2016-12-24 22:10:54 +07:00
Remita Amine
d1cd7e0ed9 Credit @wader for #11521 2016-12-24 15:00:23 +01:00
Mattias Wadman
846fd69bac [acast] Add test with multiple blings 2016-12-24 14:28:30 +01:00
Mattias Wadman
12da830993 [acast] Fix broken audio URL and timestamp extraction
Before first bling was used now we look for the first bling with
type BlingAudio.

Before publishingDate was a ms unix timestamp now it is iso8601.
2016-12-24 14:28:30 +01:00
Sergey M․
e7ac722d62 [README.md] Add missing protocols to format selection section 2016-12-23 22:01:22 +07:00
hub2git
19f37ce4b1 [README.md] Fix typo 2016-12-23 09:25:39 +07:00
Sergey M․
5e77c0b58e release 2016.12.22 2016-12-22 22:52:54 +07:00
Remita Amine
ab3091feda [ChangeLog] Actualize 2016-12-22 22:51:51 +07:00
Remita Amine
a07588369f [common] improve detection for video only formats and m3u8 manifest(fixes #11507) 2016-12-22 10:02:56 +01:00
Remita Amine
f5a723a78a [theplatform] pass geo verification headers to smil request(closes #10146) 2016-12-21 20:59:03 +01:00
Remita Amine
f120646f04 [viu] pass geo verification headers to auth request 2016-12-21 20:50:10 +01:00
Remita Amine
9c5b5f2115 [rtl2] extract more formats and metadata 2016-12-21 18:46:25 +01:00
Sergey M․
ae806db628 [vbox7] Skip malformed JSON-LD (closes #11501) 2016-12-21 22:39:05 +07:00
Remita Amine
bfa1073e11 [uplynk] force downloading using hls native downloader(closes #11496) 2016-12-20 19:49:45 +01:00
Remita Amine
e029c43bd4 [laola1] add support for another extraction scenario(closes #11460) 2016-12-20 18:22:57 +01:00
337 changed files with 2697 additions and 1652 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.12.20*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.12.20**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.16*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.16**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.12.20
[debug] youtube-dl version 2017.01.16
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information

View File

@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information

View File

@@ -190,3 +190,4 @@ John Hawkinson
Rich Leeper
Zhong Jianxin
Thor77
Mattias Wadman

View File

@@ -58,7 +58,7 @@ We are then presented with a very complicated request when the original problem
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
### Is anyone going to need the feature?
@@ -94,7 +94,7 @@ If you want to create a build of youtube-dl yourself, you'll need
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with:
@@ -124,7 +124,7 @@ After you have ensured this site is distributing it's content legally, you can f
'id': '42',
'ext': 'mp4',
'title': 'Video title goes here',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
# TODO more properties, either as:
# * A value
# * MD5 checksum; start the string with md5:
@@ -199,7 +199,7 @@ Assume at this point `meta`'s layout is:
}
```
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
```python
description = meta.get('summary') # correct

119
ChangeLog
View File

@@ -1,3 +1,122 @@
version 2017.01.16
Core
* [options] Apply custom config to final composite configuration (#11741)
* [YoutubeDL] Improve protocol auto determining (#11720)
Extractors
* [xiami] Relax URL regular expressions
* [xiami] Improve track metadata extraction (#11699)
+ [limelight] Check hand-make direct HTTP links
+ [limelight] Add support for direct HTTP links at video.llnw.net (#11737)
+ [brightcove] Recognize another player ID pattern (#11688)
+ [niconico] Support login via cookies (#7968)
* [yourupload] Fix extraction (#11601)
+ [beam:live] Add support for beam.pro live streams (#10702, #11596)
* [vevo] Improve geo restriction detection
+ [dramafever] Add support for URLs with language code (#11714)
* [cbc] Improve playlist support (#11704)
version 2017.01.14
Core
+ [common] Add ability to customize akamai manifest host
+ [utils] Add more date formats
Extractors
- [mtv] Eliminate _transform_rtmp_url
* [mtv] Generalize triforce mgid extraction
+ [cmt] Add support for full episodes and video clips (#11623)
+ [mitele] Extract DASH formats
+ [ooyala] Add support for videos with embedToken (#11684)
* [mixcloud] Fix extraction (#11674)
* [openload] Fix extraction (#10408)
* [tv4] Improve extraction (#11698)
* [freesound] Fix and improve extraction (#11602)
+ [nick] Add support for beta.nick.com (#11655)
* [mtv,cc] Use HLS by default with native HLS downloader (#11641)
* [mtv] Fix non-HLS extraction
version 2017.01.10
Extractors
* [youtube] Fix extraction (#11663, #11664)
+ [inc] Add support for inc.com (#11277, #11647)
+ [youtube] Add itag 212 (#11575)
+ [egghead:course] Add support for egghead.io courses
version 2017.01.08
Core
* Fix "invalid escape sequence" errors under Python 3.6 (#11581)
Extractors
+ [hitrecord] Add support for hitrecord.org (#10867, #11626)
- [videott] Remove extractor
* [swrmediathek] Improve extraction
- [sharesix] Remove extractor
- [aol:features] Remove extractor
* [sendtonews] Improve info extraction
* [3sat,phoenix] Fix extraction (#11619)
* [comedycentral/mtv] Add support for HLS videos (#11600)
* [discoverygo] Fix JSON data parsing (#11219, #11522)
version 2017.01.05
Extractors
+ [zdf] Fix extraction (#11055, #11063)
* [pornhub:playlist] Improve extraction (#11594)
+ [cctv] Add support for ncpa-classic.com (#11591)
+ [tunein] Add support for embeds (#11579)
version 2017.01.02
Extractors
* [cctv] Improve extraction (#879, #6753, #8541)
+ [nrktv:episodes] Add support for episodes (#11571)
+ [arkena] Add support for video.arkena.com (#11568)
version 2016.12.31
Core
+ Introduce --config-location option for custom configuration files (#6745,
#10648)
Extractors
+ [twitch] Add support for player.twitch.tv (#11535, #11537)
+ [videa] Add support for videa.hu (#8181, #11133)
* [vk] Fix postlive videos extraction
* [vk] Extract from playerParams (#11555)
- [freevideo] Remove extractor (#11515)
+ [showroomlive] Add support for showroom-live.com (#11458)
* [xhamster] Fix duration extraction (#11549)
* [rtve:live] Fix extraction (#11529)
* [brightcove:legacy] Improve embeds detection (#11523)
+ [twitch] Add support for rechat messages (#11524)
* [acast] Fix audio and timestamp extraction (#11521)
version 2016.12.22
Core
* [extractor/common] Improve detection of video-only formats in m3u8
manifests (#11507)
Extractors
+ [theplatform] Pass geo verification headers to SMIL request (#10146)
+ [viu] Pass geo verification headers to auth request
* [rtl2] Extract more formats and metadata
* [vbox7] Skip malformed JSON-LD (#11501)
* [uplynk] Force downloading using native HLS downloader (#11496)
+ [laola1] Add support for another extraction scenario (#11460)
version 2016.12.20
Core

View File

@@ -29,7 +29,7 @@ Windows users can [download an .exe file](https://yt-dl.org/latest/youtube-dl.ex
You can also use pip:
sudo pip install --upgrade youtube-dl
sudo -H pip install --upgrade youtube-dl
This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
@@ -44,11 +44,7 @@ Or with [MacPorts](https://www.macports.org/):
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
**youtube-dl** is a command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
your Unix box, on Windows or on Mac OS X. It is released to the public domain,
which means you can modify it, redistribute it or use it however you like.
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...]
@@ -84,6 +80,9 @@ which means you can modify it, redistribute it or use it however you like.
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
--config-location PATH Location of the configuration file; either
the path to the config or its containing
directory.
--flat-playlist Do not extract the videos of a playlist,
only list them.
--mark-watched Mark videos watched (YouTube only)
@@ -187,7 +186,7 @@ which means you can modify it, redistribute it or use it however you like.
of SIZE.
--playlist-reverse Download playlist videos in reverse order
--xattr-set-filesize Set file xattribute ytdl.filesize with
expected filesize (experimental)
expected file size (experimental)
--hls-prefer-native Use the native HLS downloader instead of
ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
@@ -354,7 +353,7 @@ which means you can modify it, redistribute it or use it however you like.
-u, --username USERNAME Login with this account ID
-p, --password PASSWORD Account password. If this option is left
out, youtube-dl will ask interactively.
-2, --twofactor TWOFACTOR Two-factor auth code
-2, --twofactor TWOFACTOR Two-factor authentication code
-n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, smotri, youku)
@@ -447,6 +446,8 @@ Note that options in configuration file are just the same options aka switches u
You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
You can also use `--config-location` if you want to use custom configuration file for a particular youtube-dl run.
### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on a per extractor basis. For that you will need to create a `.netrc` file in your `$HOME` and restrict permissions to read/write by only you:
@@ -638,7 +639,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
@@ -744,7 +745,7 @@ Most people asking this question are not aware that youtube-dl now defaults to d
### I get HTTP error 402 when trying to download a video. What's this?
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a webbrowser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
### Do I need any other programs?
@@ -756,7 +757,7 @@ Videos or video formats streamed via RTMP protocol can only be downloaded when [
Once the video is fully downloaded, use any video player, such as [mpv](https://mpv.io/), [vlc](http://www.videolan.org/) or [mplayer](http://www.mplayerhq.hu/).
### I extracted a video URL with `-g`, but it does not play on another machine / in my webbrowser.
### I extracted a video URL with `-g`, but it does not play on another machine / in my web browser.
It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies and/or HTTP headers. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl. You can also get necessary cookies and HTTP headers from JSON output obtained with `--dump-json`.
@@ -840,7 +841,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
@@ -932,7 +933,7 @@ If you want to create a build of youtube-dl yourself, you'll need
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with:
@@ -962,7 +963,7 @@ After you have ensured this site is distributing it's content legally, you can f
'id': '42',
'ext': 'mp4',
'title': 'Video title goes here',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
# TODO more properties, either as:
# * A value
# * MD5 checksum; start the string with md5:
@@ -1037,7 +1038,7 @@ Assume at this point `meta`'s layout is:
}
```
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
```python
description = meta.get('summary') # correct
@@ -1149,7 +1150,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
@@ -1252,7 +1253,7 @@ We are then presented with a very complicated request when the original problem
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
### Is anyone going to need the feature?

View File

@@ -424,8 +424,6 @@ class BuildHTTPRequestHandler(compat_http_server.BaseHTTPRequestHandler):
self.send_header('Content-Length', len(msg))
self.end_headers()
self.wfile.write(msg)
except HTTPError as e:
self.send_response(e.code, str(e))
else:
self.send_response(500, 'Unknown build method "%s"' % action)
else:

View File

@@ -86,6 +86,7 @@
- **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:playlist**
- **Beam:live**
- **Beatport**
- **Beeg**
- **BehindKink**
@@ -132,7 +133,7 @@
- **cbsnews:livevideo**: CBS News Live Videos
- **CBSSports**
- **CCMA**
- **CCTV**
- **CCTV**: 央视网
- **CDA**
- **CeskaTelevize**
- **channel9**: Channel 9
@@ -214,6 +215,7 @@
- **EaglePlatform**
- **EbaumsWorld**
- **EchoMsk**
- **egghead:course**: egghead.io course
- **eHow**
- **Einthusan**
- **eitb.tv**
@@ -240,7 +242,6 @@
- **fc2**
- **fc2:embed**
- **Fczenit**
- **features.aol.com**
- **fernsehkritik.tv**
- **Firstpost**
- **FiveTV**
@@ -263,7 +264,6 @@
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- **FreeVideo**
- **Funimation**
- **FunnyOrDie**
- **Fusion**
@@ -305,6 +305,7 @@
- **history:topic**: History.com Topic
- **hitbox**
- **hitbox:live**
- **HitRecord**
- **HornBunny**
- **HotNewHipHop**
- **HotStar**
@@ -322,6 +323,7 @@
- **Imgur**
- **ImgurAlbum**
- **Ina**
- **Inc**
- **Indavideo**
- **IndavideoEmbed**
- **InfoQ**
@@ -365,8 +367,8 @@
- **kuwo:singer**: 酷我音乐 - 歌手
- **kuwo:song**: 酷我音乐
- **la7.it**
- **Laola1Tv**
- **Laola1TvEmbed**
- **laola1tv**
- **laola1tv:embed**
- **LCI**
- **Lcp**
- **LcpPlay**
@@ -518,6 +520,7 @@
- **NRKSkole**: NRK Skole
- **NRKTV**: NRK TV and NRK Radio
- **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
- **NRKTVEpisodes**
- **ntv.ru**
- **Nuvid**
- **NYTimes**
@@ -650,7 +653,6 @@
- **screen.yahoo:search**: Yahoo screen search
- **Screencast**
- **ScreencastOMatic**
- **ScreenJunkies**
- **Seeker**
- **SenateISVP**
- **SendtoNews**
@@ -658,7 +660,7 @@
- **Sexu**
- **Shahid**
- **Shared**: shared.sx
- **ShareSix**
- **ShowRoomLive**
- **Sina**
- **SixPlay**
- **skynewsarabia:article**
@@ -834,6 +836,7 @@
- **ViceShow**
- **Vidbit**
- **Viddler**
- **Videa**
- **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoDetective**
@@ -843,7 +846,6 @@
- **videomore:season**
- **videomore:video**
- **VideoPremium**
- **VideoTt**: video.tt - Your True Tube (Currently broken)
- **videoweed**: VideoWeed
- **Vidio**
- **vidme**

View File

@@ -295,6 +295,9 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_strdate('27.02.2016 17:30'), '20160227')
self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
self.assertEqual(unified_strdate('Feb 7, 2016 at 6:35 pm'), '20160207')
self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)

View File

@@ -1339,7 +1339,7 @@ class YoutubeDL(object):
format['format_id'] = compat_str(i)
else:
# Sanitize format_id from characters used in format selector expression
format['format_id'] = re.sub('[\s,/+\[\]()]', '_', format['format_id'])
format['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', format['format_id'])
format_id = format['format_id']
if format_id not in formats_dict:
formats_dict[format_id] = []
@@ -1363,7 +1363,7 @@ class YoutubeDL(object):
format['ext'] = determine_ext(format['url']).lower()
# Automatically determine protocol if missing (useful for format
# selection purposes)
if 'protocol' not in format:
if format.get('protocol') is None:
format['protocol'] = determine_protocol(format)
# Add HTTP headers, so that external programs can use them from the
# json output

View File

@@ -405,7 +405,7 @@ def _real_main(argv=None):
'postprocessor_args': postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
'geo_verification_proxy': opts.geo_verification_proxy,
'config_location': opts.config_location,
}
with YoutubeDL(ydl_opts) as ydl:

View File

@@ -2344,7 +2344,7 @@ try:
from urllib.parse import unquote_plus as compat_urllib_parse_unquote_plus
except ImportError: # Python 2
_asciire = (compat_urllib_parse._asciire if hasattr(compat_urllib_parse, '_asciire')
else re.compile('([\x00-\x7f]+)'))
else re.compile(r'([\x00-\x7f]+)'))
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version

View File

@@ -65,6 +65,9 @@ class HlsFD(FragmentFD):
s = manifest.decode('utf-8', 'ignore')
if not self.can_download(s, info_dict):
if info_dict.get('extra_param_to_segment_url'):
self.report_error('pycrypto not found. Please install it.')
return False
self.report_warning(
'hlsnative has detected features it does not support, '
'extraction will be delegated to ffmpeg')

View File

@@ -23,7 +23,7 @@ class AbcNewsVideoIE(AMPIE):
'title': '\'This Week\' Exclusive: Iran\'s Foreign Minister Zarif',
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
@@ -59,7 +59,7 @@ class AbcNewsIE(InfoExtractor):
'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
},

View File

@@ -23,7 +23,7 @@ class ABCOTVSIE(InfoExtractor):
'ext': 'mp4',
'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421123075,
'upload_date': '20150113',
'uploader': 'Jonathan Bloom',

View File

@@ -8,6 +8,7 @@ from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_iso8601,
OnDemandPagedList,
)
@@ -15,18 +16,33 @@ from ..utils import (
class ACastIE(InfoExtractor):
IE_NAME = 'acast'
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
_TEST = {
_TESTS = [{
# test with one bling
'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
'md5': 'ada3de5a1e3a2a381327d749854788bb',
'info_dict': {
'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
'ext': 'mp3',
'title': '"Where Are You?": Taipei 101, Taiwan',
'timestamp': 1196172000000,
'timestamp': 1196172000,
'upload_date': '20071127',
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'duration': 211,
}
}
}, {
# test with multiple blings
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
'md5': '55c0097badd7095f494c99a172f86501',
'info_dict': {
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
'ext': 'mp3',
'title': '2. Raggarmordet - Röster ur det förflutna',
'timestamp': 1477346700,
'upload_date': '20161024',
'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
'duration': 2797,
}
}]
def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups()
@@ -35,11 +51,11 @@ class ACastIE(InfoExtractor):
return {
'id': compat_str(cast_data['id']),
'display_id': display_id,
'url': cast_data['blings'][0]['audio'],
'url': [b['audio'] for b in cast_data['blings'] if b['type'] == 'BlingAudio'][0],
'title': cast_data['name'],
'description': cast_data.get('description'),
'thumbnail': cast_data.get('image'),
'timestamp': int_or_none(cast_data.get('publishingDate')),
'timestamp': parse_iso8601(cast_data.get('publishingDate')),
'duration': int_or_none(cast_data.get('duration')),
}

View File

@@ -30,7 +30,7 @@ class AdobeTVIE(AdobeTVBaseIE):
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
'thumbnail': 're:https?://.*\.jpg$',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20110914',
'duration': 60,
'view_count': int,

View File

@@ -20,7 +20,7 @@ class AirMozillaIE(InfoExtractor):
'id': '6x4q2w',
'ext': 'mp4',
'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
'thumbnail': 're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
'thumbnail': r're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
'timestamp': 1422487800,
'upload_date': '20150128',

View File

@@ -21,7 +21,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Astérix - Le Domaine des Dieux Teaser VF',
'description': 'md5:4a754271d9c6f16c72629a8a993ee884',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19540403&cfilm=222257.html',
@@ -32,7 +32,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Planes 2 Bande-annonce VF',
'description': 'Regardez la bande annonce du film Planes 2 (Planes 2 Bande-annonce VF). Planes 2, un film de Roberts Gannaway',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19544709&cfilm=181290.html',
@@ -43,7 +43,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Dragons 2 - Bande annonce finale VF',
'description': 'md5:6cdd2d7c2687d4c6aafe80a35e17267a',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/video-19550147/',
@@ -53,7 +53,7 @@ class AllocineIE(InfoExtractor):
'ext': 'mp4',
'title': 'Faux Raccord N°123 - Les gaffes de Cliffhanger',
'description': 'md5:bc734b83ffa2d8a12188d9eb48bb6354',
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
}]

View File

@@ -19,7 +19,7 @@ class AlphaPornoIE(InfoExtractor):
'display_id': 'sensual-striptease-porn-with-samantha-alexandra',
'ext': 'mp4',
'title': 'Sensual striptease porn with Samantha Alexandra',
'thumbnail': 're:https?://.*\.jpg$',
'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1418694611,
'upload_date': '20141216',
'duration': 387,

View File

@@ -12,7 +12,7 @@ from ..utils import (
class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com'
_VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_TESTS = [{
# video with 5min ID
@@ -33,7 +33,7 @@ class AolIE(InfoExtractor):
}
}, {
# video with vidible ID
'url': 'http://on.aol.com/video/netflix-is-raising-rates-5707d6b8e4b090497b04f706?context=PC:homepage:PL1944:1460189336183',
'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': {
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
@@ -108,30 +108,3 @@ class AolIE(InfoExtractor):
'uploader': video_data.get('videoOwner'),
'formats': formats,
}
class AolFeaturesIE(InfoExtractor):
IE_NAME = 'features.aol.com'
_VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
'md5': '7db483bb0c09c85e241f84a34238cc75',
'info_dict': {
'id': '519507715',
'ext': 'mp4',
'title': 'What To Watch - February 17, 2016',
},
'add_ie': ['FiveMin'],
'params': {
# encrypted m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
return self.url_result(self._search_regex(
r'<script type="text/javascript" src="(https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js[^"]+)"',
webpage, '5min embed url'), 'FiveMin')

View File

@@ -253,7 +253,7 @@ class ARDIE(InfoExtractor):
'duration': 2600,
'title': 'Die Story im Ersten: Mission unter falscher Flagge',
'upload_date': '20140804',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'HTTP Error 404: Not Found',
}

View File

@@ -4,8 +4,10 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
determine_ext,
ExtractorError,
float_or_none,
int_or_none,
mimetype2ext,
@@ -15,7 +17,13 @@ from ..utils import (
class ArkenaIE(InfoExtractor):
_VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
_VALID_URL = r'''(?x)
https?://
(?:
video\.arkena\.com/play2/embed/player\?|
play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
)
'''
_TESTS = [{
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
@@ -37,6 +45,9 @@ class ArkenaIE(InfoExtractor):
}, {
'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
'only_matching': True,
}, {
'url': 'http://video.arkena.com/play2/embed/player?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled',
'only_matching': True,
}]
@staticmethod
@@ -53,6 +64,14 @@ class ArkenaIE(InfoExtractor):
video_id = mobj.group('id')
account_id = mobj.group('account_id')
# Handle http://video.arkena.com/play2/embed/player URL
if not video_id:
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
video_id = qs.get('mediaId', [None])[0]
account_id = qs.get('accountId', [None])[0]
if not video_id or not account_id:
raise ExtractorError('Invalid URL', expected=True)
playlist = self._download_json(
'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
% (video_id, account_id),

View File

@@ -30,7 +30,7 @@ class AtresPlayerIE(InfoExtractor):
'title': 'Especial Solidario de Nochebuena',
'description': 'md5:e2d52ff12214fa937107d21064075bf1',
'duration': 5527.6,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'This video is only available for registered users'
},
@@ -43,7 +43,7 @@ class AtresPlayerIE(InfoExtractor):
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
},
{

View File

@@ -14,7 +14,7 @@ class ATTTechChannelIE(InfoExtractor):
'ext': 'flv',
'title': 'AT&T Archives : The UNIX System: Making Computers Easier to Use',
'description': 'A 1982 film about UNIX is the foundation for software in use around Bell Labs and AT&T.',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140127',
},
'params': {

View File

@@ -17,7 +17,7 @@ class AudioBoomIE(InfoExtractor):
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 2245.72,
'uploader': 'Steve Czaban',
'uploader_url': 're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
}
}, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',

View File

@@ -21,7 +21,7 @@ class AzubuIE(InfoExtractor):
'ext': 'mp4',
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
'thumbnail': 're:^https?://.*\.jpe?g',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1417523507.334,
'upload_date': '20141202',
'duration': 9988.7,
@@ -38,7 +38,7 @@ class AzubuIE(InfoExtractor):
'ext': 'mp4',
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
'thumbnail': 're:^https?://.*\.jpe?g',
'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1410530893.320,
'upload_date': '20140912',
'duration': 172.385,

View File

@@ -0,0 +1,73 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
compat_str,
int_or_none,
parse_iso8601,
try_get,
)
class BeamProLiveIE(InfoExtractor):
IE_NAME = 'Beam:live'
_VALID_URL = r'https?://(?:\w+\.)?beam\.pro/(?P<id>[^/?#&]+)'
_RATINGS = {'family': 0, 'teen': 13, '18+': 18}
_TEST = {
'url': 'http://www.beam.pro/niterhayven',
'info_dict': {
'id': '261562',
'ext': 'mp4',
'title': 'Introducing The Witcher 3 // The Grind Starts Now!',
'description': 'md5:0b161ac080f15fe05d18a07adb44a74d',
'thumbnail': r're:https://.*\.jpg$',
'timestamp': 1483477281,
'upload_date': '20170103',
'uploader': 'niterhayven',
'uploader_id': '373396',
'age_limit': 18,
'is_live': True,
'view_count': int,
},
'skip': 'niterhayven is offline',
'params': {
'skip_download': True,
},
}
def _real_extract(self, url):
channel_name = self._match_id(url)
chan = self._download_json(
'https://beam.pro/api/v1/channels/%s' % channel_name, channel_name)
if chan.get('online') is False:
raise ExtractorError(
'{0} is offline'.format(channel_name), expected=True)
channel_id = chan['id']
formats = self._extract_m3u8_formats(
'https://beam.pro/api/v1/channels/%s/manifest.m3u8' % channel_id,
channel_name, ext='mp4', m3u8_id='hls', fatal=False)
self._sort_formats(formats)
user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
return {
'id': compat_str(chan.get('id') or channel_name),
'title': self._live_title(chan.get('name') or channel_name),
'description': clean_html(chan.get('description')),
'thumbnail': try_get(chan, lambda x: x['thumbnail']['url'], compat_str),
'timestamp': parse_iso8601(chan.get('updatedAt')),
'uploader': chan.get('token') or try_get(
chan, lambda x: x['user']['username'], compat_str),
'uploader_id': compat_str(user_id) if user_id else None,
'age_limit': self._RATINGS.get(chan.get('audience')),
'is_live': True,
'view_count': int_or_none(chan.get('viewersTotal')),
'formats': formats,
}

View File

@@ -17,7 +17,7 @@ class BetIE(MTVServicesInfoExtractor):
'description': 'President Obama urges persistence in confronting racism and bias.',
'duration': 1534,
'upload_date': '20141208',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
@@ -37,7 +37,7 @@ class BetIE(MTVServicesInfoExtractor):
'description': 'A BET News special.',
'duration': 1696,
'upload_date': '20141125',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}

View File

@@ -19,7 +19,7 @@ class BildIE(InfoExtractor):
'ext': 'mp4',
'title': 'Das können die neuen iPads',
'description': 'md5:a4058c4fa2a804ab59c00d7244bbf62f',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 196,
}
}

View File

@@ -28,7 +28,7 @@ class BiliBiliIE(InfoExtractor):
'duration': 308.315,
'timestamp': 1398012660,
'upload_date': '20140420',
'thumbnail': 're:^https?://.+\.jpg',
'thumbnail': r're:^https?://.+\.jpg',
'uploader': '菊子桑',
'uploader_id': '156160',
},

View File

@@ -19,7 +19,7 @@ class BioBioChileTVIE(InfoExtractor):
'id': 'sobre-camaras-y-camarillas-parlamentarias',
'ext': 'mp4',
'title': 'Sobre Cámaras y camarillas parlamentarias',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Fernando Atria',
},
'skip': 'URL expired and redirected to http://www.biobiochile.cl/portada/bbtv/index.html',
@@ -31,7 +31,7 @@ class BioBioChileTVIE(InfoExtractor):
'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
'ext': 'mp4',
'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Piangella Obrador',
},
'params': {

View File

@@ -1,9 +1,9 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_age_limit,
@@ -11,7 +11,7 @@ from ..utils import (
class BreakIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?(?P<site>break|screenjunkies)\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': {
@@ -20,45 +20,124 @@ class BreakIE(InfoExtractor):
'title': 'When Girls Act Like D-Bags',
'age_limit': 13,
}
}, {
'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
'md5': '5c2b686bec3d43de42bde9ec047536b0',
'info_dict': {
'id': '2841915',
'display_id': 'best-quentin-tarantino-movie',
'ext': 'mp4',
'title': 'Best Quentin Tarantino Movie',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3671,
'age_limit': 13,
'tags': list,
},
}, {
'url': 'http://www.screenjunkies.com/video/honest-trailers-the-dark-knight',
'info_dict': {
'id': '2348808',
'display_id': 'honest-trailers-the-dark-knight',
'ext': 'mp4',
'title': 'Honest Trailers - The Dark Knight',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'age_limit': 10,
'tags': list,
},
}, {
# requires subscription but worked around
'url': 'http://www.screenjunkies.com/video/knocking-dead-ep-1-the-show-so-far-3003285',
'info_dict': {
'id': '3003285',
'display_id': 'knocking-dead-ep-1-the-show-so-far',
'ext': 'mp4',
'title': 'State of The Dead Recap: Knocking Dead Pilot',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3307,
'age_limit': 13,
'tags': list,
},
}, {
'url': 'http://www.break.com/video/ugc/baby-flex-2773063',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://www.break.com/embed/%s' % video_id, video_id)
info = json.loads(self._search_regex(
r'var embedVars = ({.*})\s*?</script>',
webpage, 'info json', flags=re.DOTALL))
_DEFAULT_BITRATES = (48, 150, 320, 496, 864, 2240, 3264)
youtube_id = info.get('youtubeId')
def _real_extract(self, url):
site, display_id, video_id = re.match(self._VALID_URL, url).groups()
if not video_id:
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
(r'src=["\']/embed/(\d+)', r'data-video-content-id=["\'](\d+)'),
webpage, 'video id')
webpage = self._download_webpage(
'http://www.%s.com/embed/%s' % (site, video_id),
display_id, 'Downloading video embed page')
embed_vars = self._parse_json(
self._search_regex(
r'(?s)embedVars\s*=\s*({.+?})\s*</script>', webpage, 'embed vars'),
display_id)
youtube_id = embed_vars.get('youtubeId')
if youtube_id:
return self.url_result(youtube_id, 'Youtube')
formats = [{
'url': media['uri'] + '?' + info['AuthToken'],
'tbr': media['bitRate'],
'width': media['width'],
'height': media['height'],
} for media in info['media'] if media.get('mediaPurpose') == 'play']
title = embed_vars['contentName']
if not formats:
formats = []
bitrates = []
for f in embed_vars.get('media', []):
if not f.get('uri') or f.get('mediaPurpose') != 'play':
continue
bitrate = int_or_none(f.get('bitRate'))
if bitrate:
bitrates.append(bitrate)
formats.append({
'url': info['videoUri']
'url': f['uri'],
'format_id': 'http-%d' % bitrate if bitrate else 'http',
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': bitrate,
'format': 'mp4',
})
self._sort_formats(formats)
if not bitrates:
# When subscriptionLevel > 0, i.e. plus subscription is required
# media list will be empty. However, hds and hls uris are still
# available. We can grab them assuming bitrates to be default.
bitrates = self._DEFAULT_BITRATES
duration = int_or_none(info.get('videoLengthInSeconds'))
age_limit = parse_age_limit(info.get('audienceRating'))
auth_token = embed_vars.get('AuthToken')
def construct_manifest_url(base_url, ext):
pieces = [base_url]
pieces.extend([compat_str(b) for b in bitrates])
pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
return ','.join(pieces)
if bitrates and auth_token:
hds_url = embed_vars.get('hdsUri')
if hds_url:
formats.extend(self._extract_f4m_formats(
construct_manifest_url(hds_url, 'f4m'),
display_id, f4m_id='hds', fatal=False))
hls_url = embed_vars.get('hlsUri')
if hls_url:
formats.extend(self._extract_m3u8_formats(
construct_manifest_url(hls_url, 'm3u8'),
display_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
self._sort_formats(formats)
return {
'id': video_id,
'title': info['contentName'],
'thumbnail': info['thumbUri'],
'duration': duration,
'age_limit': age_limit,
'display_id': display_id,
'title': title,
'thumbnail': embed_vars.get('thumbUri'),
'duration': int_or_none(embed_vars.get('videoLengthInSeconds')) or None,
'age_limit': parse_age_limit(embed_vars.get('audienceRating')),
'tags': embed_vars.get('tags', '').split(','),
'formats': formats,
}

View File

@@ -179,7 +179,7 @@ class BrightcoveLegacyIE(InfoExtractor):
params = {}
playerID = find_param('playerID')
playerID = find_param('playerID') or find_param('playerId')
if playerID is None:
raise ExtractorError('Cannot find player ID')
params['playerID'] = playerID
@@ -204,7 +204,7 @@ class BrightcoveLegacyIE(InfoExtractor):
# // build Brightcove <object /> XML
# }
m = re.search(
r'''(?x)customBC.\createVideo\(
r'''(?x)customBC\.createVideo\(
.*? # skipping width and height
["\'](?P<playerID>\d+)["\']\s*,\s* # playerID
["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s* # playerKey begins with AQ and is 50 characters
@@ -232,13 +232,16 @@ class BrightcoveLegacyIE(InfoExtractor):
"""Return a list of all Brightcove URLs from the webpage """
url_m = re.search(
r'<meta\s+property=[\'"]og:video[\'"]\s+content=[\'"](https?://(?:secure|c)\.brightcove.com/[^\'"]+)[\'"]',
webpage)
r'''(?x)
<meta\s+
(?:property|itemprop)=([\'"])(?:og:video|embedURL)\1[^>]+
content=([\'"])(?P<url>https?://(?:secure|c)\.brightcove.com/(?:(?!\2).)+)\2
''', webpage)
if url_m:
url = unescapeHTML(url_m.group(1))
url = unescapeHTML(url_m.group('url'))
# Some sites don't add it, we can't download with this url, for example:
# http://www.ktvu.com/videos/news/raw-video-caltrain-releases-video-of-man-almost/vCTZdY/
if 'playerKey' in url or 'videoId' in url:
if 'playerKey' in url or 'videoId' in url or 'idVideo' in url:
return [url]
matches = re.findall(
@@ -259,7 +262,7 @@ class BrightcoveLegacyIE(InfoExtractor):
url, smuggled_data = unsmuggle_url(url, {})
# Change the 'videoId' and others field to '@videoPlayer'
url = re.sub(r'(?<=[?&])(videoI(d|D)|bctid)', '%40videoPlayer', url)
url = re.sub(r'(?<=[?&])(videoI(d|D)|idVideo|bctid)', '%40videoPlayer', url)
# Change bckey (used by bcove.me urls) to playerKey
url = re.sub(r'(?<=[?&])bckey', 'playerKey', url)
mobj = re.match(self._VALID_URL, url)

View File

@@ -16,7 +16,7 @@ class BYUtvIE(InfoExtractor):
'ext': 'mp4',
'title': 'Season 5 Episode 5',
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1486.486,
},
'params': {

View File

@@ -26,7 +26,7 @@ class CamdemyIE(InfoExtractor):
'id': '5181',
'ext': 'mp4',
'title': 'Ch1-1 Introduction, Signals (02-23-2012)',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'creator': 'ss11spring',
'duration': 1591,
'upload_date': '20130114',
@@ -41,7 +41,7 @@ class CamdemyIE(InfoExtractor):
'id': '13885',
'ext': 'mp4',
'title': 'EverCam + Camdemy QuickStart',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:2a9f989c2b153a2342acee579c6e7db6',
'creator': 'evercam',
'duration': 318,

View File

@@ -17,7 +17,7 @@ class CanvasIE(InfoExtractor):
'ext': 'mp4',
'title': 'De afspraak veilt voor de Warmste Week',
'description': 'md5:24cb860c320dc2be7358e0e5aa317ba6',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 49.02,
}
}, {
@@ -29,7 +29,7 @@ class CanvasIE(InfoExtractor):
'ext': 'mp4',
'title': 'Pieter 0167',
'description': 'md5:943cd30f48a5d29ba02c3a104dc4ec4e',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2553.08,
'subtitles': {
'nl': [{
@@ -48,7 +48,7 @@ class CanvasIE(InfoExtractor):
'ext': 'mp4',
'title': 'Herbekijk Sorry voor alles',
'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 3788.06,
},
'params': {

View File

@@ -21,7 +21,7 @@ class CarambaTVIE(InfoExtractor):
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 2678.31,
},
}, {
@@ -69,7 +69,7 @@ class CarambaTVPageIE(InfoExtractor):
'id': '475222',
'ext': 'flv',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
# duration reported by videomore is incorrect
'duration': int,
},

View File

@@ -90,36 +90,49 @@ class CBCIE(InfoExtractor):
},
}],
'skip': 'Geo-restricted to Canada',
}, {
# multiple CBC.APP.Caffeine.initInstance(...)
'url': 'http://www.cbc.ca/news/canada/calgary/dog-indoor-exercise-winter-1.3928238',
'info_dict': {
'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
'id': 'dog-indoor-exercise-winter-1.3928238',
},
'playlist_mincount': 6,
}]
@classmethod
def suitable(cls, url):
return False if CBCPlayerIE.suitable(url) else super(CBCIE, cls).suitable(url)
def _extract_player_init(self, player_init, display_id):
player_info = self._parse_json(player_init, display_id, js_to_json)
media_id = player_info.get('mediaId')
if not media_id:
clip_id = player_info['clipId']
feed = self._download_json(
'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
clip_id, fatal=False)
if feed:
media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
if not media_id:
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
player_init = self._search_regex(
r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage, 'player init',
default=None)
if player_init:
player_info = self._parse_json(player_init, display_id, js_to_json)
media_id = player_info.get('mediaId')
if not media_id:
clip_id = player_info['clipId']
feed = self._download_json(
'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
clip_id, fatal=False)
if feed:
media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
if not media_id:
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
else:
entries = [self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)]
return self.playlist_result(entries)
entries = [
self._extract_player_init(player_init, display_id)
for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
entries.extend([
self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)])
return self.playlist_result(
entries, display_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage))
class CBCPlayerIE(InfoExtractor):

View File

@@ -39,7 +39,7 @@ class CBSNewsIE(CBSIE):
'upload_date': '20140404',
'timestamp': 1396650660,
'uploader': 'CBSI-NEW',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 205,
'subtitles': {
'en': [{

View File

@@ -19,7 +19,7 @@ class CCCIE(InfoExtractor):
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
@@ -32,7 +32,7 @@ class CCCIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
event_id = self._search_regex("data-id='(\d+)'", webpage, 'event id')
event_id = self._search_regex(r"data-id='(\d+)'", webpage, 'event id')
event_data = self._download_json('https://media.ccc.de/public/events/%s' % event_id, event_id)
formats = []

View File

@@ -4,50 +4,188 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import float_or_none
from ..compat import compat_str
from ..utils import (
float_or_none,
try_get,
unified_timestamp,
)
class CCTVIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:.+?\.)?
(?:
cctv\.(?:com|cn)|
cntv\.cn
)/
(?:
video/[^/]+/(?P<id>[0-9a-f]{32})|
\d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
)'''
IE_DESC = '央视网'
_VALID_URL = r'https?://(?:(?:[^/]+)\.(?:cntv|cctv)\.(?:com|cn)|(?:www\.)?ncpa-classic\.com)/(?:[^/]+/)*?(?P<id>[^/?#&]+?)(?:/index)?(?:\.s?html|[?#&]|$)'
_TESTS = [{
'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
'md5': '819c7b49fc3927d529fb4cd555621823',
# fo.addVariable("videoCenterId","id")
'url': 'http://sports.cntv.cn/2016/02/12/ARTIaBRxv4rTT1yWf1frW2wi160212.shtml',
'md5': 'd61ec00a493e09da810bf406a078f691',
'info_dict': {
'id': '454368eb19ad44a1925bf1eb96140a61',
'id': '5ecdbeab623f4973b40ff25f18b174e8',
'ext': 'mp4',
'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
}
'title': '[NBA]二少联手砍下46分 雷霆主场击败鹈鹕(快讯)',
'description': 'md5:7e14a5328dc5eb3d1cd6afbbe0574e95',
'duration': 98,
'uploader': 'songjunjie',
'timestamp': 1455279956,
'upload_date': '20160212',
},
}, {
# var guid = "id"
'url': 'http://tv.cctv.com/2016/02/05/VIDEUS7apq3lKrHG9Dncm03B160205.shtml',
'info_dict': {
'id': 'efc5d49e5b3b4ab2b34f3a502b73d3ae',
'ext': 'mp4',
'title': '[赛车]“车王”舒马赫恢复情况成谜(快讯)',
'description': '2月4日蒙特泽莫罗透露了关于“车王”舒马赫恢复情况但情况是否属实遭到了质疑。',
'duration': 37,
'uploader': 'shujun',
'timestamp': 1454677291,
'upload_date': '20160205',
},
'params': {
'skip_download': True,
},
}, {
# changePlayer('id')
'url': 'http://english.cntv.cn/special/four_comprehensives/index.shtml',
'info_dict': {
'id': '4bb9bb4db7a6471ba85fdeda5af0381e',
'ext': 'mp4',
'title': 'NHnews008 ANNUAL POLITICAL SEASON',
'description': 'Four Comprehensives',
'duration': 60,
'uploader': 'zhangyunlei',
'timestamp': 1425385521,
'upload_date': '20150303',
},
'params': {
'skip_download': True,
},
}, {
# loadvideo('id')
'url': 'http://cctv.cntv.cn/lm/tvseries_russian/yilugesanghua/index.shtml',
'info_dict': {
'id': 'b15f009ff45c43968b9af583fc2e04b2',
'ext': 'mp4',
'title': 'Путь,усыпанный космеями Серия 1',
'description': 'Путь, усыпанный космеями',
'duration': 2645,
'uploader': 'renxue',
'timestamp': 1477479241,
'upload_date': '20161026',
},
'params': {
'skip_download': True,
},
}, {
# var initMyAray = 'id'
'url': 'http://www.ncpa-classic.com/2013/05/22/VIDE1369219508996867.shtml',
'info_dict': {
'id': 'a194cfa7f18c426b823d876668325946',
'ext': 'mp4',
'title': '小泽征尔音乐塾 音乐梦想无国界',
'duration': 2173,
'timestamp': 1369248264,
'upload_date': '20130522',
},
'params': {
'skip_download': True,
},
}, {
# var ids = ["id"]
'url': 'http://www.ncpa-classic.com/clt/more/416/index.shtml',
'info_dict': {
'id': 'a8606119a4884588a79d81c02abecc16',
'ext': 'mp3',
'title': '来自维也纳的新年贺礼',
'description': 'md5:f13764ae8dd484e84dd4b39d5bcba2a7',
'duration': 1578,
'uploader': 'djy',
'timestamp': 1482942419,
'upload_date': '20161228',
},
'params': {
'skip_download': True,
},
'expected_warnings': ['Failed to download m3u8 information'],
}, {
'url': 'http://ent.cntv.cn/2016/01/18/ARTIjprSSJH8DryTVr5Bx8Wb160118.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cntv.cn/video/C39296/e0210d949f113ddfb38d31f00a4e5c44',
'only_matching': True,
}, {
'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
'only_matching': True
'only_matching': True,
}]
def _real_extract(self, url):
video_id, display_id = re.match(self._VALID_URL, url).groups()
if not video_id:
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
webpage, 'video_id')
api_data = self._download_json(
'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_id = self._search_regex(
[r'var\s+guid\s*=\s*["\']([\da-fA-F]+)',
r'videoCenterId["\']\s*,\s*["\']([\da-fA-F]+)',
r'changePlayer\s*\(\s*["\']([\da-fA-F]+)',
r'load[Vv]ideo\s*\(\s*["\']([\da-fA-F]+)',
r'var\s+initMyAray\s*=\s*["\']([\da-fA-F]+)',
r'var\s+ids\s*=\s*\[["\']([\da-fA-F]+)'],
webpage, 'video id')
data = self._download_json(
'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do', video_id,
query={
'pid': video_id,
'url': url,
'idl': 32,
'idlr': 32,
'modifyed': 'false',
})
title = data['title']
formats = []
video = data.get('video')
if isinstance(video, dict):
for quality, chapters_key in enumerate(('lowChapters', 'chapters')):
video_url = try_get(
video, lambda x: x[chapters_key][0]['url'], compat_str)
if video_url:
formats.append({
'url': video_url,
'format_id': 'http',
'quality': quality,
'preference': -1,
})
hls_url = try_get(data, lambda x: x['hls_url'], compat_str)
if hls_url:
hls_url = re.sub(r'maxbr=\d+&?', '', hls_url)
formats.extend(self._extract_m3u8_formats(
hls_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
uploader = data.get('editer_name')
description = self._html_search_meta(
'description', webpage, default=None)
timestamp = unified_timestamp(data.get('f_pgmtime'))
duration = float_or_none(try_get(video, lambda x: x['totalLength']))
return {
'id': video_id,
'title': api_data['title'],
'formats': self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
'title': title,
'description': description,
'uploader': uploader,
'timestamp': timestamp,
'duration': duration,
'formats': formats,
}

View File

@@ -24,7 +24,7 @@ class CDAIE(InfoExtractor):
'height': 720,
'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
'description': 'md5:269ccd135d550da90d1662651fcb9772',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'average_rating': float,
'duration': 39
}
@@ -36,7 +36,7 @@ class CDAIE(InfoExtractor):
'ext': 'mp4',
'title': 'Lądowanie na lotnisku na Maderze',
'description': 'md5:60d76b71186dcce4e0ba6d4bbdb13e1a',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'crash404',
'view_count': int,
'average_rating': float,

View File

@@ -25,7 +25,7 @@ class CeskaTelevizeIE(InfoExtractor):
'ext': 'mp4',
'title': 'Hyde Park Civilizace',
'description': 'md5:fe93f6eda372d150759d11644ebbfb4a',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3350,
},
'params': {
@@ -39,7 +39,7 @@ class CeskaTelevizeIE(InfoExtractor):
'ext': 'mp4',
'title': 'Hyde Park Civilizace: Bonus 01 - En',
'description': 'English Subtittles',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 81.3,
},
'params': {
@@ -52,7 +52,7 @@ class CeskaTelevizeIE(InfoExtractor):
'info_dict': {
'id': 402,
'ext': 'mp4',
'title': 're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'title': r're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'is_live': True,
},
'params': {
@@ -80,7 +80,7 @@ class CeskaTelevizeIE(InfoExtractor):
'id': '61924494877068022',
'ext': 'mp4',
'title': 'Queer: Bogotart (Queer)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 1558.3,
},
}],

View File

@@ -31,7 +31,7 @@ class Channel9IE(InfoExtractor):
'title': 'Developer Kick-Off Session: Stuff We Love',
'description': 'md5:c08d72240b7c87fcecafe2692f80e35f',
'duration': 4576,
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
'session_code': 'KOS002',
'session_day': 'Day 1',
'session_room': 'Arena 1A',
@@ -47,7 +47,7 @@ class Channel9IE(InfoExtractor):
'title': 'Self-service BI with Power BI - nuclear testing',
'description': 'md5:d1e6ecaafa7fb52a2cacdf9599829f5b',
'duration': 1540,
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
'authors': ['Mike Wilmot'],
},
}, {
@@ -59,7 +59,7 @@ class Channel9IE(InfoExtractor):
'title': 'Ranges for the Standard Library',
'description': 'md5:2e6b4917677af3728c5f6d63784c4c5d',
'duration': 5646,
'thumbnail': 're:http://.*\.jpg',
'thumbnail': r're:http://.*\.jpg',
},
'params': {
'skip_download': True,

View File

@@ -13,7 +13,7 @@ class CharlieRoseIE(InfoExtractor):
'id': '27996',
'ext': 'mp4',
'title': 'Remembering Zaha Hadid',
'thumbnail': 're:^https?://.*\.jpg\?\d+',
'thumbnail': r're:^https?://.*\.jpg\?\d+',
'description': 'We revisit past conversations with Zaha Hadid, in memory of the world renowned Iraqi architect.',
'subtitles': {
'en': [{

View File

@@ -30,7 +30,7 @@ class CliphunterIE(InfoExtractor):
'id': '1012420',
'ext': 'flv',
'title': 'Fun Jynx Maze solo',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
'skip': 'Video gone',
@@ -41,7 +41,7 @@ class CliphunterIE(InfoExtractor):
'id': '2019449',
'ext': 'mp4',
'title': 'ShesNew - My booty girlfriend, Victoria Paradice\'s pussy filled with jizz',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
}]

View File

@@ -18,7 +18,7 @@ class ClipsyndicateIE(InfoExtractor):
'ext': 'mp4',
'title': 'Brick Briscoe',
'duration': 612,
'thumbnail': 're:^https?://.+\.jpg',
'thumbnail': r're:^https?://.+\.jpg',
},
}, {
'url': 'http://chic.clipsyndicate.com/video/play/5844117/shark_attack',

View File

@@ -19,7 +19,7 @@ class ClubicIE(InfoExtractor):
'ext': 'mp4',
'title': 'Clubic Week 2.0 : le FBI se lance dans la photo d\u0092identité',
'description': 're:Gueule de bois chez Nokia. Le constructeur a indiqué cette.*',
'thumbnail': 're:^http://img\.clubic\.com/.*\.jpg$',
'thumbnail': r're:^http://img\.clubic\.com/.*\.jpg$',
}
}, {
'url': 'http://www.clubic.com/video/video-clubic-week-2-0-apple-iphone-6s-et-plus-mais-surtout-le-pencil-469792.html',

View File

@@ -1,13 +1,11 @@
from __future__ import unicode_literals
from .mtv import MTVIE
from ..utils import ExtractorError
class CMTIE(MTVIE):
IE_NAME = 'cmt.com'
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows)/(?:[^/]+/)*(?P<videoid>\d+)'
_FEED_URL = 'http://www.cmt.com/sitewide/apps/player/embed/rss/'
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows|full-episodes|video-clips)/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.cmt.com/videos/garth-brooks/989124/the-call-featuring-trisha-yearwood.jhtml#artist=30061',
@@ -33,17 +31,24 @@ class CMTIE(MTVIE):
}, {
'url': 'http://www.cmt.com/shows/party-down-south/party-down-south-ep-407-gone-girl/1738172/playlist/#id=1738172',
'only_matching': True,
}, {
'url': 'http://www.cmt.com/full-episodes/537qb3/nashville-the-wayfaring-stranger-season-5-ep-501',
'only_matching': True,
}, {
'url': 'http://www.cmt.com/video-clips/t9e4ci/nashville-juliette-in-2-minutes',
'only_matching': True,
}]
@classmethod
def _transform_rtmp_url(cls, rtmp_video_url):
if 'error_not_available.swf' in rtmp_video_url:
raise ExtractorError(
'%s said: video is not available' % cls.IE_NAME, expected=True)
return super(CMTIE, cls)._transform_rtmp_url(rtmp_video_url)
def _extract_mgid(self, webpage):
return self._search_regex(
mgid = self._search_regex(
r'MTVN\.VIDEO\.contentUri\s*=\s*([\'"])(?P<mgid>.+?)\1',
webpage, 'mgid', group='mgid')
webpage, 'mgid', group='mgid', default=None)
if not mgid:
mgid = self._extract_triforce_mgid(webpage)
return mgid
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mgid = self._extract_mgid(webpage)
return self.url_result('http://media.mtvnservices.com/embed/%s' % mgid)

View File

@@ -21,7 +21,7 @@ class CollegeRamaIE(InfoExtractor):
'ext': 'mp4',
'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
'description': '',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 7713.088,
'timestamp': 1413309600,
'upload_date': '20141014',

View File

@@ -48,15 +48,7 @@ class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
feed = self._parse_json(feed_json, playlist_id)
zones = feed['manifest']['zones']
video_zone = zones['t2_lc_promo1']
feed = self._download_json(video_zone['feed'], playlist_id)
mgid = feed['result']['data']['id']
mgid = self._extract_triforce_mgid(webpage, data_zone='t2_lc_promo1')
videos_info = self._get_videos_info(mgid)
return videos_info
@@ -79,7 +71,7 @@ class ToshIE(MTVServicesInfoExtractor):
'ext': 'mp4',
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
'description': 'Tosh asked fans to share their summer plans.',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
# It's really reported to be published on year 2077
'upload_date': '20770610',
'timestamp': 3390510600,
@@ -93,12 +85,6 @@ class ToshIE(MTVServicesInfoExtractor):
'only_matching': True,
}]
@classmethod
def _transform_rtmp_url(cls, rtmp_video_url):
new_urls = super(ToshIE, cls)._transform_rtmp_url(rtmp_video_url)
new_urls['rtmp'] = rtmp_video_url.replace('viacomccstrm', 'viacommtvstrm')
return new_urls
class ComedyCentralTVIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)'

View File

@@ -189,9 +189,10 @@ class InfoExtractor(object):
uploader_url: Full URL to a personal webpage of the video uploader.
location: Physical location where the video was filmed.
subtitles: The available subtitles as a dictionary in the format
{language: subformats}. "subformats" is a list sorted from
lower to higher preference, each element is a dictionary
with the "ext" entry and one of:
{tag: subformats}. "tag" is usually a language code, and
"subformats" is a list sorted from lower to higher
preference, each element is a dictionary with the "ext"
entry and one of:
* "data": The subtitles file contents
* "url": A URL pointing to the subtitles file
"ext" will be calculated from URL if missing
@@ -1225,7 +1226,7 @@ class InfoExtractor(object):
'protocol': entry_protocol,
'preference': preference,
}]
audio_groups = set()
audio_in_video_stream = {}
last_info = {}
last_media = {}
for line in m3u8_doc.splitlines():
@@ -1235,10 +1236,11 @@ class InfoExtractor(object):
media = parse_m3u8_attributes(line)
media_type = media.get('TYPE')
if media_type in ('VIDEO', 'AUDIO'):
group_id = media.get('GROUP-ID')
media_url = media.get('URI')
if media_url:
format_id = []
for v in (media.get('GROUP-ID'), media.get('NAME')):
for v in (group_id, media.get('NAME')):
if v:
format_id.append(v)
f = {
@@ -1251,12 +1253,15 @@ class InfoExtractor(object):
}
if media_type == 'AUDIO':
f['vcodec'] = 'none'
audio_groups.add(media['GROUP-ID'])
if group_id and not audio_in_video_stream.get(group_id):
audio_in_video_stream[group_id] = False
formats.append(f)
else:
# When there is no URI in EXT-X-MEDIA let this tag's
# data be used by regular URI lines below
last_media = media
if media_type == 'AUDIO' and group_id:
audio_in_video_stream[group_id] = True
elif line.startswith('#') or not line.strip():
continue
else:
@@ -1300,7 +1305,7 @@ class InfoExtractor(object):
'abr': abr,
})
f.update(parse_codecs(last_info.get('CODECS')))
if last_info.get('AUDIO') in audio_groups:
if audio_in_video_stream.get(last_info.get('AUDIO')) is False:
# TODO: update acodec for for audio only formats with the same GROUP-ID
f['acodec'] = 'none'
formats.append(f)
@@ -1962,10 +1967,13 @@ class InfoExtractor(object):
entries.append(media_info)
return entries
def _extract_akamai_formats(self, manifest_url, video_id):
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
formats = []
hdcore_sign = 'hdcore=3.7.0'
f4m_url = re.sub(r'(https?://.+?)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
f4m_url = re.sub(r'(https?://[^/+])/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
hds_host = hosts.get('hds')
if hds_host:
f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)
if 'hdcore=' not in f4m_url:
f4m_url += ('&' if '?' in f4m_url else '?') + hdcore_sign
f4m_formats = self._extract_f4m_formats(
@@ -1973,7 +1981,10 @@ class InfoExtractor(object):
for entry in f4m_formats:
entry.update({'extra_param_to_segment_url': hdcore_sign})
formats.extend(f4m_formats)
m3u8_url = re.sub(r'(https?://.+?)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
m3u8_url = re.sub(r'(https?://[^/]+)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
hls_host = hosts.get('hls')
if hls_host:
m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))

View File

@@ -20,7 +20,7 @@ class CoubIE(InfoExtractor):
'id': '5u5n1',
'ext': 'mp4',
'title': 'The Matrix Moonwalk',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 4.6,
'timestamp': 1428527772,
'upload_date': '20150408',

View File

@@ -14,7 +14,7 @@ class CrackleIE(InfoExtractor):
'ext': 'mp4',
'title': 'Everybody Respects A Bloody Nose',
'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). Theyre headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 906,
'series': 'Comedians In Cars Getting Coffee',
'season_number': 8,

View File

@@ -14,7 +14,7 @@ class CriterionIE(InfoExtractor):
'ext': 'mp4',
'title': 'Le Samouraï',
'description': 'md5:a2b4b116326558149bef81f76dcbb93f',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
}
}

View File

@@ -16,7 +16,7 @@ class CrooksAndLiarsIE(InfoExtractor):
'ext': 'mp4',
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',

View File

@@ -142,7 +142,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'ext': 'flv',
'title': 'Culture Japan Episode 1 Rebuilding Japan after the 3.11',
'description': 'md5:2fbc01f90b87e8e9137296f37b461c12',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Danny Choo Network',
'upload_date': '20120213',
},
@@ -158,7 +158,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'ext': 'mp4',
'title': 'Re:ZERO -Starting Life in Another World- Episode 5 The Morning of Our Promise Is Still Distant',
'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'TV TOKYO',
'upload_date': '20160508',
},

View File

@@ -28,7 +28,7 @@ class CtsNewsIE(InfoExtractor):
'ext': 'mp4',
'title': '韓國31歲童顏男 貌如十多歲小孩',
'description': '越有年紀的人越希望看起來年輕一點而南韓卻有一位31歲的男子看起來像是11、12歲的小孩身...',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1378205880,
'upload_date': '20130903',
}
@@ -41,7 +41,7 @@ class CtsNewsIE(InfoExtractor):
'ext': 'mp4',
'title': 'iPhone6熱銷 蘋果財報亮眼',
'description': 'md5:f395d4f485487bb0f992ed2c4b07aa7d',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150128',
'uploader_id': 'TBSCTS',
'uploader': '中華電視公司',

View File

@@ -21,7 +21,7 @@ class CultureUnpluggedIE(InfoExtractor):
'ext': 'mp4',
'title': 'The Next, Best West',
'description': 'md5:0423cd00833dea1519cf014e9d0903b1',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'creator': 'Coldstream Creative',
'duration': 2203,
'view_count': int,

View File

@@ -58,7 +58,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'ext': 'mp4',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
'description': 'Several come bundled with the Steam Controller.',
'thumbnail': 're:^https?:.*\.(?:jpg|png)$',
'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'duration': 74,
'timestamp': 1425657362,
'upload_date': '20150306',

View File

@@ -32,7 +32,7 @@ class DaumIE(InfoExtractor):
'title': '마크 헌트 vs 안토니오 실바',
'description': 'Mark Hunt vs Antonio Silva',
'upload_date': '20131217',
'thumbnail': 're:^https?://.*\.(?:jpg|png)',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 2117,
'view_count': int,
'comment_count': int,
@@ -45,7 +45,7 @@ class DaumIE(InfoExtractor):
'title': '1297회, \'아빠 아들로 태어나길 잘 했어\' 민수, 감동의 눈물[아빠 어디가] 20150118',
'description': 'md5:79794514261164ff27e36a21ad229fc5',
'upload_date': '20150604',
'thumbnail': 're:^https?://.*\.(?:jpg|png)',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 154,
'view_count': int,
'comment_count': int,
@@ -61,7 +61,7 @@ class DaumIE(InfoExtractor):
'title': '01-Korean War ( Trouble on the horizon )',
'description': '\nKorean War 01\nTrouble on the horizon\n전쟁의 먹구름',
'upload_date': '20080223',
'thumbnail': 're:^https?://.*\.(?:jpg|png)',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 249,
'view_count': int,
'comment_count': int,
@@ -139,7 +139,7 @@ class DaumClipIE(InfoExtractor):
'title': 'DOTA 2GETHER 시즌2 6회 - 2부',
'description': 'DOTA 2GETHER 시즌2 6회 - 2부',
'upload_date': '20130831',
'thumbnail': 're:^https?://.*\.(?:jpg|png)',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 3868,
'view_count': int,
},

View File

@@ -17,7 +17,7 @@ class DBTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Skulle teste ut fornøyelsespark, men kollegaen var bare opptatt av bikinikroppen',
'description': 'md5:1504a54606c4dde3e4e61fc97aa857e0',
'thumbnail': 're:https?://.*\.jpg',
'thumbnail': r're:https?://.*\.jpg',
'timestamp': 1404039863,
'upload_date': '20140629',
'duration': 69.544,

View File

@@ -17,7 +17,7 @@ class DctpTvIE(InfoExtractor):
'title': 'Videoinstallation für eine Kaufhausfassade',
'description': 'Kurzfilm',
'upload_date': '20110407',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
}

View File

@@ -19,7 +19,7 @@ class DeezerPlaylistIE(InfoExtractor):
'id': '176747451',
'title': 'Best!',
'uploader': 'Anonymous',
'thumbnail': 're:^https?://cdn-images.deezer.com/images/cover/.*\.jpg$',
'thumbnail': r're:^https?://cdn-images.deezer.com/images/cover/.*\.jpg$',
},
'playlist_count': 30,
'skip': 'Only available in .de',

View File

@@ -17,7 +17,7 @@ class DHMIE(InfoExtractor):
'title': 'MARSHALL PLAN AT WORK IN WESTERN GERMANY, THE',
'description': 'md5:1fabd480c153f97b07add61c44407c82',
'duration': 660,
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.dhm.de/filmarchiv/02-mapping-the-wall/peter-g/rolle-1/',
@@ -26,7 +26,7 @@ class DHMIE(InfoExtractor):
'id': 'rolle-1',
'ext': 'flv',
'title': 'ROLLE 1',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
}]

View File

@@ -36,7 +36,7 @@ class DigitekaIE(InfoExtractor):
'id': 's8uk0r',
'ext': 'mp4',
'title': 'Loi sur la fin de vie: le texte prévoit un renforcement des directives anticipées',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 74,
'upload_date': '20150317',
'timestamp': 1426604939,
@@ -50,7 +50,7 @@ class DigitekaIE(InfoExtractor):
'id': 'xvpfp8',
'ext': 'mp4',
'title': 'Two - C\'est La Vie (clip)',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 233,
'upload_date': '20150224',
'timestamp': 1424760500,

View File

@@ -6,7 +6,6 @@ from ..utils import (
extract_attributes,
int_or_none,
parse_age_limit,
unescapeHTML,
ExtractorError,
)
@@ -49,7 +48,7 @@ class DiscoveryGoIE(InfoExtractor):
webpage, 'video container'))
video = self._parse_json(
unescapeHTML(container.get('data-video') or container.get('data-json')),
container.get('data-video') or container.get('data-json'),
display_id)
title = video['name']

View File

@@ -26,8 +26,8 @@ class DouyuTVIE(InfoExtractor):
'display_id': 'iseven',
'ext': 'flv',
'title': 're:^清晨醒脑T-ara根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 're:.*m7show@163\.com.*',
'thumbnail': 're:^https?://.*\.jpg$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅',
'is_live': True,
},
@@ -42,7 +42,7 @@ class DouyuTVIE(InfoExtractor):
'ext': 'flv',
'title': 're:^小漠从零单排记——CSOL2躲猫猫 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:746a2f7a253966a06755a912f0acc0d2',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'douyu小漠',
'is_live': True,
},
@@ -57,8 +57,8 @@ class DouyuTVIE(InfoExtractor):
'display_id': '17732',
'ext': 'flv',
'title': 're:^清晨醒脑T-ara根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 're:.*m7show@163\.com.*',
'thumbnail': 're:^https?://.*\.jpg$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅',
'is_live': True,
},

View File

@@ -66,7 +66,7 @@ class DramaFeverBaseIE(AMPIE):
class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'info_dict': {
@@ -76,7 +76,7 @@ class DramaFeverIE(DramaFeverBaseIE):
'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
'episode': 'Episode 1',
'episode_number': 1,
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1404336058,
'upload_date': '20140702',
'duration': 343,
@@ -94,7 +94,7 @@ class DramaFeverIE(DramaFeverBaseIE):
'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91',
'episode': 'Mnet Asian Music Awards 2015 - Part 3',
'episode_number': 4,
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1450213200,
'upload_date': '20151215',
'duration': 5602,
@@ -103,6 +103,9 @@ class DramaFeverIE(DramaFeverBaseIE):
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://www.dramafever.com/zh-cn/drama/4972/15/Doctor_Romantic/',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -148,7 +151,7 @@ class DramaFeverIE(DramaFeverBaseIE):
class DramaFeverSeriesIE(DramaFeverBaseIE):
IE_NAME = 'dramafever:series'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
'info_dict': {

View File

@@ -20,7 +20,7 @@ class DRBonanzaIE(InfoExtractor):
'ext': 'mp4',
'title': 'Talkshowet - Leonard Cohen',
'description': 'md5:8f34194fb30cd8c8a30ad8b27b70c0ca',
'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
'timestamp': 1295537932,
'upload_date': '20110120',
'duration': 3664,
@@ -36,7 +36,7 @@ class DRBonanzaIE(InfoExtractor):
'ext': 'mp3',
'title': 'EM fodbold 1992 Danmark - Tyskland finale Transmission',
'description': 'md5:501e5a195749480552e214fbbed16c4e',
'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
'timestamp': 1223274900,
'upload_date': '20081006',
'duration': 7369,

View File

@@ -2,10 +2,19 @@ from __future__ import unicode_literals
import re
from .zdf import ZDFIE
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_strdate,
xpath_text,
determine_ext,
qualities,
float_or_none,
ExtractorError,
)
class DreiSatIE(ZDFIE):
class DreiSatIE(InfoExtractor):
IE_NAME = '3sat'
_VALID_URL = r'(?:https?://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php|mediathek\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
_TESTS = [
@@ -31,6 +40,163 @@ class DreiSatIE(ZDFIE):
},
]
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
param_groups = {}
for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)):
group_id = param_group.attrib.get(self._xpath_ns('id', 'http://www.w3.org/XML/1998/namespace'))
params = {}
for param in param_group:
params[param.get('name')] = param.get('value')
param_groups[group_id] = params
formats = []
for video in smil.findall(self._xpath_ns('.//video', namespace)):
src = video.get('src')
if not src:
continue
bitrate = float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
group_id = video.get('paramGroup')
param_group = param_groups[group_id]
for proto in param_group['protocols'].split(','):
formats.append({
'url': '%s://%s' % (proto, param_group['host']),
'app': param_group['app'],
'play_path': src,
'ext': 'flv',
'format_id': '%s-%d' % (proto, bitrate),
'tbr': bitrate,
})
self._sort_formats(formats)
return formats
def extract_from_xml_url(self, video_id, xml_url):
doc = self._download_xml(
xml_url, video_id,
note='Downloading video info',
errnote='Failed to download video info')
status_code = doc.find('./status/statuscode')
if status_code is not None and status_code.text != 'ok':
code = status_code.text
if code == 'notVisibleAnymore':
message = 'Video %s is not available' % video_id
else:
message = '%s returned error: %s' % (self.IE_NAME, code)
raise ExtractorError(message, expected=True)
title = doc.find('.//information/title').text
description = xpath_text(doc, './/information/detail', 'description')
duration = int_or_none(xpath_text(doc, './/details/lengthSec', 'duration'))
uploader = xpath_text(doc, './/details/originChannelTitle', 'uploader')
uploader_id = xpath_text(doc, './/details/originChannelId', 'uploader id')
upload_date = unified_strdate(xpath_text(doc, './/details/airtime', 'upload date'))
def xml_to_thumbnails(fnode):
thumbnails = []
for node in fnode:
thumbnail_url = node.text
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
if 'key' in node.attrib:
m = re.match('^([0-9]+)x([0-9]+)$', node.attrib['key'])
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
return thumbnails
thumbnails = xml_to_thumbnails(doc.findall('.//teaserimages/teaserimage'))
format_nodes = doc.findall('.//formitaeten/formitaet')
quality = qualities(['veryhigh', 'high', 'med', 'low'])
def get_quality(elem):
return quality(xpath_text(elem, 'quality'))
format_nodes.sort(key=get_quality)
format_ids = []
formats = []
for fnode in format_nodes:
video_url = fnode.find('url').text
is_available = 'http://www.metafilegenerator' not in video_url
if not is_available:
continue
format_id = fnode.attrib['basetype']
quality = xpath_text(fnode, './quality', 'quality')
format_m = re.match(r'''(?x)
(?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_
(?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+)
''', format_id)
ext = determine_ext(video_url, None) or format_m.group('container')
if ext not in ('smil', 'f4m', 'm3u8'):
format_id = format_id + '-' + quality
if format_id in format_ids:
continue
if ext == 'meta':
continue
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
video_url, video_id, fatal=False))
elif ext == 'm3u8':
# the certificates are misconfigured (see
# https://github.com/rg3/youtube-dl/issues/8665)
if video_url.startswith('https://'):
continue
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', m3u8_id=format_id, fatal=False))
elif ext == 'f4m':
formats.extend(self._extract_f4m_formats(
video_url, video_id, f4m_id=format_id, fatal=False))
else:
proto = format_m.group('proto').lower()
abr = int_or_none(xpath_text(fnode, './audioBitrate', 'abr'), 1000)
vbr = int_or_none(xpath_text(fnode, './videoBitrate', 'vbr'), 1000)
width = int_or_none(xpath_text(fnode, './width', 'width'))
height = int_or_none(xpath_text(fnode, './height', 'height'))
filesize = int_or_none(xpath_text(fnode, './filesize', 'filesize'))
format_note = ''
if not format_note:
format_note = None
formats.append({
'format_id': format_id,
'url': video_url,
'ext': ext,
'acodec': format_m.group('acodec'),
'vcodec': format_m.group('vcodec'),
'abr': abr,
'vbr': vbr,
'width': width,
'height': height,
'filesize': filesize,
'format_note': format_note,
'protocol': proto,
'_available': is_available,
})
format_ids.append(format_id)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'duration': duration,
'thumbnails': thumbnails,
'uploader': uploader,
'uploader_id': uploader_id,
'upload_date': upload_date,
'formats': formats,
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')

View File

@@ -22,7 +22,7 @@ class DrTuberIE(InfoExtractor):
'like_count': int,
'comment_count': int,
'categories': ['Babe', 'Blonde', 'Erotic', 'Outdoor', 'Softcore', 'Solo'],
'thumbnail': 're:https?://.*\.jpg$',
'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
}
}, {

View File

@@ -21,7 +21,7 @@ class DumpertIE(InfoExtractor):
'ext': 'mp4',
'title': 'Ik heb nieuws voor je',
'description': 'Niet schrikken hoor',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://www.dumpert.nl/embed/6675421/dc440fe7/',

View File

@@ -31,7 +31,7 @@ class EaglePlatformIE(InfoExtractor):
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
@@ -45,7 +45,7 @@ class EaglePlatformIE(InfoExtractor):
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},

View File

@@ -0,0 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class EggheadCourseIE(InfoExtractor):
IE_DESC = 'egghead.io course'
IE_NAME = 'egghead:course'
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[a-zA-Z_0-9-]+)'
_TEST = {
'url': 'https://egghead.io/courses/professor-frisby-introduces-composable-functional-javascript',
'playlist_count': 29,
'info_dict': {
'id': 'professor-frisby-introduces-composable-functional-javascript',
'title': 'Professor Frisby Introduces Composable Functional JavaScript',
'description': 're:(?s)^This course teaches the ubiquitous.*You\'ll start composing functionality before you know it.$',
},
}
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
title = self._html_search_regex(r'<h1 class="title">([^<]+)</h1>', webpage, 'title')
ul = self._search_regex(r'(?s)<ul class="series-lessons-list">(.*?)</ul>', webpage, 'session list')
found = re.findall(r'(?s)<a class="[^"]*"\s*href="([^"]+)">\s*<li class="item', ul)
entries = [self.url_result(m) for m in found]
return {
'_type': 'playlist',
'id': playlist_id,
'title': title,
'description': self._og_search_description(webpage),
'entries': entries,
}

View File

@@ -19,7 +19,7 @@ class EinthusanIE(InfoExtractor):
'id': '2447',
'ext': 'mp4',
'title': 'Ek Villain',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:9d29fc91a7abadd4591fb862fa560d93',
}
},
@@ -30,7 +30,7 @@ class EinthusanIE(InfoExtractor):
'id': '1671',
'ext': 'mp4',
'title': 'Soodhu Kavvuum',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:b40f2bf7320b4f9414f3780817b2af8c',
}
},

View File

@@ -22,7 +22,7 @@ class EroProfileIE(InfoExtractor):
'display_id': 'sexy-babe-softcore',
'ext': 'm4v',
'title': 'sexy babe softcore',
'thumbnail': 're:https?://.*\.jpg',
'thumbnail': r're:https?://.*\.jpg',
'age_limit': 18,
}
}, {
@@ -32,7 +32,7 @@ class EroProfileIE(InfoExtractor):
'id': '1133519',
'ext': 'm4v',
'title': 'Try It On Pee_cut_2.wmv - 4shared.com - file sharing - download movie file',
'thumbnail': 're:https?://.*\.jpg',
'thumbnail': r're:https?://.*\.jpg',
'age_limit': 18,
},
'skip': 'Requires login',

View File

@@ -45,7 +45,7 @@ class EscapistIE(InfoExtractor):
'ext': 'mp4',
'description': "Baldur's Gate: Original, Modded or Enhanced Edition? I'll break down what you can expect from the new Baldur's Gate: Enhanced Edition.",
'title': "Breaking Down Baldur's Gate",
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 264,
'uploader': 'The Escapist',
}
@@ -57,7 +57,7 @@ class EscapistIE(InfoExtractor):
'ext': 'mp4',
'description': 'This week, Zero Punctuation reviews Evolve.',
'title': 'Evolve - One vs Multiplayer',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 304,
'uploader': 'The Escapist',
}

View File

@@ -22,7 +22,7 @@ class EsriVideoIE(InfoExtractor):
'ext': 'mp4',
'title': 'ArcGIS Online - Developing Applications',
'description': 'Jeremy Bartley demonstrates how to develop applications with ArcGIS Online.',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 185,
'upload_date': '20120419',
}

View File

@@ -23,7 +23,7 @@ class EuropaIE(InfoExtractor):
'ext': 'mp4',
'title': 'TRADE - Wikileaks on TTIP',
'description': 'NEW LIVE EC Midday press briefing of 11/08/2015',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150811',
'duration': 34,
'view_count': int,

View File

@@ -17,7 +17,7 @@ class ExpoTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'NYX Butter Lipstick Little Susie',
'description': 'Goes on like butter, but looks better!',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Stephanie S.',
'upload_date': '20150520',
'view_count': int,

View File

@@ -38,10 +38,7 @@ from .amcnetworks import AMCNetworksIE
from .animeondemand import AnimeOnDemandIE
from .anitube import AnitubeIE
from .anysex import AnySexIE
from .aol import (
AolIE,
AolFeaturesIE,
)
from .aol import AolIE
from .allocine import AllocineIE
from .aparat import AparatIE
from .appleconnect import AppleConnectIE
@@ -91,6 +88,7 @@ from .bbc import (
BBCCoUkPlaylistIE,
BBCIE,
)
from .beampro import BeamProLiveIE
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE
@@ -255,6 +253,7 @@ from .dw import (
from .eagleplatform import EaglePlatformIE
from .ebaumsworld import EbaumsWorldIE
from .echomsk import EchoMskIE
from .egghead import EggheadCourseIE
from .ehow import EHowIE
from .eighttracks import EightTracksIE
from .einthusan import EinthusanIE
@@ -320,7 +319,6 @@ from .francetv import (
)
from .freesound import FreesoundIE
from .freespeech import FreespeechIE
from .freevideo import FreeVideoIE
from .funimation import FunimationIE
from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE
@@ -370,6 +368,7 @@ from .hgtv import (
)
from .historicfilms import HistoricFilmsIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hitrecord import HitRecordIE
from .hornbunny import HornBunnyIE
from .hotnewhiphop import HotNewHipHopIE
from .hotstar import HotStarIE
@@ -397,6 +396,7 @@ from .imgur import (
ImgurAlbumIE,
)
from .ina import InaIE
from .inc import IncIE
from .indavideo import (
IndavideoIE,
IndavideoEmbedIE,
@@ -656,6 +656,7 @@ from .nrk import (
NRKSkoleIE,
NRKTVIE,
NRKTVDirekteIE,
NRKTVEpisodesIE,
)
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
@@ -813,7 +814,6 @@ from .sbs import SBSIE
from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
from .screenjunkies import ScreenJunkiesIE
from .seeker import SeekerIE
from .senateisvp import SenateISVPIE
from .sendtonews import SendtoNewsIE
@@ -824,7 +824,7 @@ from .shared import (
SharedIE,
VivoIE,
)
from .sharesix import ShareSixIE
from .showroomlive import ShowRoomLiveIE
from .sina import SinaIE
from .sixplay import SixPlayIE
from .skynewsarabia import (
@@ -1064,6 +1064,7 @@ from .vice import (
from .viceland import VicelandIE
from .vidbit import VidbitIE
from .viddler import ViddlerIE
from .videa import VideaIE
from .videodetective import VideoDetectiveIE
from .videofyme import VideofyMeIE
from .videomega import VideoMegaIE
@@ -1073,7 +1074,6 @@ from .videomore import (
VideomoreSeasonIE,
)
from .videopremium import VideoPremiumIE
from .videott import VideoTtIE
from .vidio import VidioIE
from .vidme import (
VidmeIE,

View File

@@ -133,7 +133,7 @@ class FC2EmbedIE(InfoExtractor):
'id': '201403223kCqB3Ez',
'ext': 'flv',
'title': 'プリズン・ブレイク S1-01 マイケル 【吹替】',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
}

View File

@@ -26,7 +26,7 @@ class FirstTVIE(InfoExtractor):
'id': '40049',
'ext': 'mp4',
'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
'thumbnail': r're:^https?://.*\.(?:jpg|JPG)$',
'upload_date': '20150212',
'duration': 2694,
},
@@ -37,7 +37,7 @@ class FirstTVIE(InfoExtractor):
'id': '364746',
'ext': 'mp4',
'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
'thumbnail': r're:^https?://.*\.(?:jpg|JPG)$',
'upload_date': '20160407',
'duration': 179,
'formats': 'mincount:3',

View File

@@ -25,7 +25,7 @@ class FiveTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Россияне выбрали имя для общенациональной платежной системы',
'description': 'md5:a8aa13e2b7ad36789e9f77a74b6de660',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 180,
},
}, {
@@ -35,7 +35,7 @@ class FiveTVIE(InfoExtractor):
'ext': 'mp4',
'title': '3D принтер',
'description': 'md5:d76c736d29ef7ec5c0cf7d7c65ffcb41',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 180,
},
}, {
@@ -44,7 +44,7 @@ class FiveTVIE(InfoExtractor):
'id': 'glavnoe',
'ext': 'mp4',
'title': 'Итоги недели с 8 по 14 июня 2015 года',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.5-tv.ru/glavnoe/broadcasts/508645/',

View File

@@ -19,7 +19,7 @@ class FKTVIE(InfoExtractor):
'id': '1',
'ext': 'mp4',
'title': 'Folge 1 vom 10. April 2007',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
}

View File

@@ -20,7 +20,7 @@ class FoxgayIE(InfoExtractor):
'title': 'Fuck Turkish-style',
'description': 'md5:6ae2d9486921891efe89231ace13ffdf',
'age_limit': 18,
'thumbnail': 're:https?://.*\.jpg$',
'thumbnail': r're:https?://.*\.jpg$',
},
}

View File

@@ -22,7 +22,7 @@ class FoxNewsIE(AMPIE):
'duration': 265,
'timestamp': 1304411491,
'upload_date': '20110503',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
@@ -36,7 +36,7 @@ class FoxNewsIE(AMPIE):
'duration': 292,
'timestamp': 1417662047,
'upload_date': '20141204',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
@@ -111,7 +111,7 @@ class FoxNewsInsiderIE(InfoExtractor):
'description': 'Is campus censorship getting out of control?',
'timestamp': 1472168725,
'upload_date': '20160825',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download

View File

@@ -17,7 +17,7 @@ class FranceCultureIE(InfoExtractor):
'display_id': 'rendez-vous-au-pays-des-geeks',
'ext': 'mp3',
'title': 'Rendez-vous au pays des geeks',
'thumbnail': 're:^https?://.*\\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140301',
'vcodec': 'none',
}

View File

@@ -168,7 +168,7 @@ class FranceTvInfoIE(FranceTVBaseInfoExtractor):
'id': 'NI_173343',
'ext': 'mp4',
'title': 'Les entreprises familiales : le secret de la réussite',
'thumbnail': 're:^https?://.*\.jpe?g$',
'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1433273139,
'upload_date': '20150602',
},
@@ -184,7 +184,7 @@ class FranceTvInfoIE(FranceTVBaseInfoExtractor):
'ext': 'mp4',
'title': 'Olivier Monthus, réalisateur de "Bretagne, le choix de lArmor"',
'description': 'md5:a3264114c9d29aeca11ced113c37b16c',
'thumbnail': 're:^https?://.*\.jpe?g$',
'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1458300695,
'upload_date': '20160318',
},

View File

@@ -3,10 +3,16 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
float_or_none,
get_element_by_class,
get_element_by_id,
unified_strdate,
)
class FreesoundIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?freesound\.org/people/([^/]+)/sounds/(?P<id>[^/]+)'
_VALID_URL = r'https?://(?:www\.)?freesound\.org/people/[^/]+/sounds/(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.freesound.org/people/miklovan/sounds/194503/',
'md5': '12280ceb42c81f19a515c745eae07650',
@@ -14,26 +20,60 @@ class FreesoundIE(InfoExtractor):
'id': '194503',
'ext': 'mp3',
'title': 'gulls in the city.wav',
'uploader': 'miklovan',
'description': 'the sounds of seagulls in the city',
'duration': 130.233,
'uploader': 'miklovan',
'upload_date': '20130715',
'tags': list,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
music_id = mobj.group('id')
webpage = self._download_webpage(url, music_id)
title = self._html_search_regex(
r'<div id="single_sample_header">.*?<a href="#">(.+?)</a>',
webpage, 'music title', flags=re.DOTALL)
audio_id = self._match_id(url)
webpage = self._download_webpage(url, audio_id)
audio_url = self._og_search_property('audio', webpage, 'song url')
title = self._og_search_property('audio:title', webpage, 'song title')
description = self._html_search_regex(
r'<div id="sound_description">(.*?)</div>', webpage, 'description',
fatal=False, flags=re.DOTALL)
r'(?s)id=["\']sound_description["\'][^>]*>(.+?)</div>',
webpage, 'description', fatal=False)
duration = float_or_none(
get_element_by_class('duration', webpage), scale=1000)
upload_date = unified_strdate(get_element_by_id('sound_date', webpage))
uploader = self._og_search_property(
'audio:artist', webpage, 'uploader', fatal=False)
channels = self._html_search_regex(
r'Channels</dt><dd>(.+?)</dd>', webpage,
'channels info', fatal=False)
tags_str = get_element_by_class('tags', webpage)
tags = re.findall(r'<a[^>]+>([^<]+)', tags_str) if tags_str else None
audio_urls = [audio_url]
LQ_FORMAT = '-lq.mp3'
if LQ_FORMAT in audio_url:
audio_urls.append(audio_url.replace(LQ_FORMAT, '-hq.mp3'))
formats = [{
'url': format_url,
'format_note': channels,
'quality': quality,
} for quality, format_url in enumerate(audio_urls)]
self._sort_formats(formats)
return {
'id': music_id,
'id': audio_id,
'title': title,
'url': self._og_search_property('audio', webpage, 'music url'),
'uploader': self._og_search_property('audio:artist', webpage, 'music uploader'),
'description': description,
'duration': duration,
'uploader': uploader,
'upload_date': upload_date,
'tags': tags,
'formats': formats,
}

View File

@@ -1,38 +0,0 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import ExtractorError
class FreeVideoIE(InfoExtractor):
_VALID_URL = r'^https?://www.freevideo.cz/vase-videa/(?P<id>[^.]+)\.html(?:$|[?#])'
_TEST = {
'url': 'http://www.freevideo.cz/vase-videa/vysukany-zadecek-22033.html',
'info_dict': {
'id': 'vysukany-zadecek-22033',
'ext': 'mp4',
'title': 'vysukany-zadecek-22033',
'age_limit': 18,
},
'skip': 'Blocked outside .cz',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage, handle = self._download_webpage_handle(url, video_id)
if '//www.czechav.com/' in handle.geturl():
raise ExtractorError(
'Access to freevideo is blocked from your location',
expected=True)
video_url = self._search_regex(
r'\s+url: "(http://[a-z0-9-]+.cdn.freevideo.cz/stream/.*?/video.mp4)"',
webpage, 'video URL')
return {
'id': video_id,
'url': video_url,
'title': video_id,
'age_limit': 18,
}

View File

@@ -29,7 +29,7 @@ class FunimationIE(InfoExtractor):
'ext': 'mp4',
'title': 'Air - 1 - Breeze',
'description': 'md5:1769f43cd5fc130ace8fd87232207892',
'thumbnail': 're:https?://.*\.jpg',
'thumbnail': r're:https?://.*\.jpg',
},
'skip': 'Access without user interaction is forbidden by CloudFlare, and video removed',
}, {
@@ -40,7 +40,7 @@ class FunimationIE(InfoExtractor):
'ext': 'mp4',
'title': '.hack//SIGN - 1 - Role Play',
'description': 'md5:b602bdc15eef4c9bbb201bb6e6a4a2dd',
'thumbnail': 're:https?://.*\.jpg',
'thumbnail': r're:https?://.*\.jpg',
},
'skip': 'Access without user interaction is forbidden by CloudFlare',
}, {
@@ -51,7 +51,7 @@ class FunimationIE(InfoExtractor):
'ext': 'mp4',
'title': 'Attack on Titan: Junior High - Broadcast Dub Preview',
'description': 'md5:f8ec49c0aff702a7832cd81b8a44f803',
'thumbnail': 're:https?://.*\.(?:jpg|png)',
'thumbnail': r're:https?://.*\.(?:jpg|png)',
},
'skip': 'Access without user interaction is forbidden by CloudFlare',
}]

View File

@@ -17,7 +17,7 @@ class FunnyOrDieIE(InfoExtractor):
'ext': 'mp4',
'title': 'Heart-Shaped Box: Literal Video Version',
'description': 'md5:ea09a01bc9a1c46d9ab696c01747c338',
'thumbnail': 're:^http:.*\.jpg$',
'thumbnail': r're:^http:.*\.jpg$',
},
}, {
'url': 'http://www.funnyordie.com/embed/e402820827',
@@ -26,7 +26,7 @@ class FunnyOrDieIE(InfoExtractor):
'ext': 'mp4',
'title': 'Please Use This Song (Jon Lajoie)',
'description': 'Please use this to sell something. www.jonlajoie.com',
'thumbnail': 're:^http:.*\.jpg$',
'thumbnail': r're:^http:.*\.jpg$',
},
'params': {
'skip_download': True,

View File

@@ -20,7 +20,7 @@ class GamersydeIE(InfoExtractor):
'ext': 'mp4',
'duration': 372,
'title': 'Bloodborne - Birth of a hero',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
}
}

View File

@@ -63,7 +63,7 @@ class GameSpotIE(OnceIE):
streams, ('progressive_hd', 'progressive_high', 'progressive_low'))
if progressive_url and manifest_url:
qualities_basename = self._search_regex(
'/([^/]+)\.csmil/',
r'/([^/]+)\.csmil/',
manifest_url, 'qualities basename', default=None)
if qualities_basename:
QUALITIES_RE = r'((,\d+)+,?)'

View File

@@ -18,7 +18,7 @@ class GameStarIE(InfoExtractor):
'ext': 'mp4',
'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1406542020,
'upload_date': '20140728',
'duration': 17

View File

@@ -16,7 +16,7 @@ class GazetaIE(InfoExtractor):
'ext': 'mp4',
'title': '«7080 процентов гражданских в Донецке на грани голода»',
'description': 'md5:38617526050bd17b234728e7f9620a71',
'thumbnail': 're:^https?://.*\.jpg',
'thumbnail': r're:^https?://.*\.jpg',
},
'skip': 'video not found',
}, {

View File

@@ -73,9 +73,11 @@ from .kaltura import KalturaIE
from .eagleplatform import EaglePlatformIE
from .facebook import FacebookIE
from .soundcloud import SoundcloudIE
from .tunein import TuneInBaseIE
from .vbox7 import Vbox7IE
from .dbtv import DBTVIE
from .piksel import PikselIE
from .videa import VideaIE
class GenericIE(InfoExtractor):
@@ -237,7 +239,7 @@ class GenericIE(InfoExtractor):
'ext': 'mp4',
'title': 'Tikibad ontruimd wegens brand',
'description': 'md5:05ca046ff47b931f9b04855015e163a4',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 33,
},
'params': {
@@ -298,7 +300,7 @@ class GenericIE(InfoExtractor):
'ext': 'mp4',
'upload_date': '20130224',
'uploader_id': 'TheVerge',
'description': 're:^Chris Ziegler takes a look at the\.*',
'description': r're:^Chris Ziegler takes a look at the\.*',
'uploader': 'The Verge',
'title': 'First Firefox OS phones side-by-side',
},
@@ -344,10 +346,10 @@ class GenericIE(InfoExtractor):
},
'skip': 'There is a limit of 200 free downloads / month for the test song',
},
# embedded brightcove video
# it also tests brightcove videos that need to set the 'Referer' in the
# http requests
{
# embedded brightcove video
# it also tests brightcove videos that need to set the 'Referer'
# in the http requests
'add_ie': ['BrightcoveLegacy'],
'url': 'http://www.bfmtv.com/video/bfmbusiness/cours-bourse/cours-bourse-l-analyse-technique-154522/',
'info_dict': {
@@ -361,6 +363,24 @@ class GenericIE(InfoExtractor):
'skip_download': True,
},
},
{
# embedded with itemprop embedURL and video id spelled as `idVideo`
'add_id': ['BrightcoveLegacy'],
'url': 'http://bfmbusiness.bfmtv.com/mediaplayer/chroniques/olivier-delamarche/',
'info_dict': {
'id': '5255628253001',
'ext': 'mp4',
'title': 'md5:37c519b1128915607601e75a87995fc0',
'description': 'md5:37f7f888b434bb8f8cc8dbd4f7a4cf26',
'uploader': 'BFM BUSINESS',
'uploader_id': '876450612001',
'timestamp': 1482255315,
'upload_date': '20161220',
},
'params': {
'skip_download': True,
},
},
{
# https://github.com/rg3/youtube-dl/issues/2253
'url': 'http://bcove.me/i6nfkrc3',
@@ -402,6 +422,26 @@ class GenericIE(InfoExtractor):
'skip_download': True, # m3u8 download
},
},
{
# Brightcove with alternative playerID key
'url': 'http://www.nature.com/nmeth/journal/v9/n7/fig_tab/nmeth.2062_SV1.html',
'info_dict': {
'id': 'nmeth.2062_SV1',
'title': 'Simultaneous multiview imaging of the Drosophila syncytial blastoderm : Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy : Nature Methods : Nature Research',
},
'playlist': [{
'info_dict': {
'id': '2228375078001',
'ext': 'mp4',
'title': 'nmeth.2062-sv1',
'description': 'nmeth.2062-sv1',
'timestamp': 1363357591,
'upload_date': '20130315',
'uploader': 'Nature Publishing Group',
'uploader_id': '1964492299001',
},
}],
},
# ooyala video
{
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
@@ -519,7 +559,7 @@ class GenericIE(InfoExtractor):
'id': 'f4dafcad-ff21-423d-89b5-146cfd89fa1e',
'ext': 'mp4',
'title': 'Ужастики, русский трейлер (2015)',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 153,
}
},
@@ -739,7 +779,7 @@ class GenericIE(InfoExtractor):
'duration': 48,
'timestamp': 1401537900,
'upload_date': '20140531',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
},
# Wistia embed
@@ -809,6 +849,21 @@ class GenericIE(InfoExtractor):
},
'playlist_mincount': 7,
},
# TuneIn station embed
{
'url': 'http://radiocnrv.com/promouvoir-radio-cnrv/',
'info_dict': {
'id': '204146',
'ext': 'mp3',
'title': 'CNRV',
'location': 'Paris, France',
'is_live': True,
},
'params': {
# Live stream
'skip_download': True,
},
},
# Livestream embed
{
'url': 'http://www.esa.int/Our_Activities/Space_Science/Rosetta/Philae_comet_touch-down_webcast',
@@ -996,7 +1051,7 @@ class GenericIE(InfoExtractor):
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
@@ -1010,7 +1065,7 @@ class GenericIE(InfoExtractor):
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},
@@ -1023,7 +1078,7 @@ class GenericIE(InfoExtractor):
'ext': 'mp4',
'title': 'Тайны перевала Дятлова • 1 серия 2 часть',
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 694,
'age_limit': 0,
},
@@ -1035,7 +1090,7 @@ class GenericIE(InfoExtractor):
'id': '3519514',
'ext': 'mp4',
'title': 'Joe Dirt 2 Beautiful Loser Teaser Trailer',
'thumbnail': 're:^https?://.*\.png$',
'thumbnail': r're:^https?://.*\.png$',
'duration': 45.115,
},
},
@@ -1118,7 +1173,7 @@ class GenericIE(InfoExtractor):
'id': '300346',
'ext': 'mp4',
'title': '中一中男師變性 全校師生力挺',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
@@ -1164,7 +1219,7 @@ class GenericIE(InfoExtractor):
'ext': 'mp4',
'title': 'Sauvons les abeilles ! - Le débat',
'description': 'md5:d9082128b1c5277987825d684939ca26',
'thumbnail': 're:^https?://.*\.jpe?g$',
'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1434970506,
'upload_date': '20150622',
'uploader': 'Public Sénat',
@@ -1178,7 +1233,7 @@ class GenericIE(InfoExtractor):
'id': '2855',
'ext': 'mp4',
'title': 'Dont Understand Bitcoin? This Man Will Mumble An Explanation At You',
'thumbnail': 're:^https?://.*\.jpe?g$',
'thumbnail': r're:^https?://.*\.jpe?g$',
'uploader': 'ClickHole',
'uploader_id': 'clickhole',
}
@@ -1404,6 +1459,15 @@ class GenericIE(InfoExtractor):
},
'playlist_mincount': 3,
},
{
# Videa embeds
'url': 'http://forum.dvdtalk.com/movie-talk/623756-deleted-magic-star-wars-ot-deleted-alt-scenes-docu-style.html',
'info_dict': {
'id': '623756-deleted-magic-star-wars-ot-deleted-alt-scenes-docu-style',
'title': 'Deleted Magic - Star Wars: OT Deleted / Alt. Scenes Docu. Style - DVD Talk Forum',
},
'playlist_mincount': 2,
},
# {
# # TODO: find another test
# # http://schema.org/VideoObject
@@ -1895,7 +1959,14 @@ class GenericIE(InfoExtractor):
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage) or
re.search(r'data-ooyala-video-id\s*=\s*[\'"](?P<ec>.{32})[\'"]', webpage))
if mobj is not None:
return OoyalaIE._build_url_result(smuggle_url(mobj.group('ec'), {'domain': url}))
embed_token = self._search_regex(
r'embedToken[\'"]?\s*:\s*[\'"]([^\'"]+)',
webpage, 'ooyala embed token', default=None)
return OoyalaIE._build_url_result(smuggle_url(
mobj.group('ec'), {
'domain': url,
'embed_token': embed_token,
}))
# Look for multiple Ooyala embeds on SBN network websites
mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
@@ -2060,6 +2131,11 @@ class GenericIE(InfoExtractor):
if soundcloud_urls:
return _playlist_from_matches(soundcloud_urls, getter=unescapeHTML, ie=SoundcloudIE.ie_key())
# Look for tunein player
tunein_urls = TuneInBaseIE._extract_urls(webpage)
if tunein_urls:
return _playlist_from_matches(tunein_urls)
# Look for embedded mtvservices player
mtvservices_url = MTVServicesEmbeddedIE._extract_url(webpage)
if mtvservices_url:
@@ -2340,6 +2416,11 @@ class GenericIE(InfoExtractor):
if dbtv_urls:
return _playlist_from_matches(dbtv_urls, ie=DBTVIE.ie_key())
# Look for Videa embeds
videa_urls = VideaIE._extract_urls(webpage)
if videa_urls:
return _playlist_from_matches(videa_urls, ie=VideaIE.ie_key())
# Looking for http://schema.org/VideoObject
json_ld = self._search_json_ld(
webpage, video_id, default={}, expected_type='VideoObject')

Some files were not shown because too many files have changed in this diff Show More