Compare commits · 2 commits · 2018.03.26 · totalwebca

Author | SHA1 | Date
---|---|---
 | 97bc05116e | 
 | 7608a91ee7 | 
.github/ISSUE_TEMPLATE.md (vendored) · 7 changes
@@ -6,13 +6,12 @@
 
 ---
 
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.03.26.1*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.12.31*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.03.26.1**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.12.31**
 
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
 - [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
-- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
 
 ### What is the purpose of your *issue*?
 - [ ] Bug report (encountered problems with youtube-dl)
@@ -36,7 +35,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2018.03.26.1
+[debug] youtube-dl version 2017.12.31
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
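The `[debug]` block above is what the template asks reporters to paste verbatim; the version line is the part triage checks first. A small sketch for pulling it out of a pasted log (hypothetical helper, not part of youtube-dl):

```python
import re

def log_version(log):
    """Extract the youtube-dl version from a pasted verbose (-v) log, or None."""
    m = re.search(r"^\[debug\] youtube-dl version (\S+)", log, re.MULTILINE)
    return m.group(1) if m else None

# e.g. on the log quoted in the template
version = log_version("[debug] User config: []\n[debug] youtube-dl version 2017.12.31\n")
```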
.github/ISSUE_TEMPLATE_tmpl.md (vendored) · 1 change
@@ -12,7 +12,6 @@
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
 - [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
-- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
 
 ### What is the purpose of your *issue*?
 - [ ] Bug report (encountered problems with youtube-dl)
AUTHORS · 5 changes
@@ -231,8 +231,3 @@ John Dong
 Tatsuyuki Ishi
 Daniel Weber
 Kay Bouché
-Yang Hongbo
-Lei Wang
-Petr Novák
-Leonardo Taccari
-Martin Weinelt
ChangeLog · 290 changes
@@ -1,297 +1,9 @@
-version 2018.03.26.1
+version <unreleased>
 
-Core
-+ [downloader/external] Add elapsed time to progress hook (#10876)
-* [downloader/external,fragment] Fix download finalization when writing file
-  to stdout (#10809, #10876, #15799)
-
 Extractors
-* [vrv] Fix extraction on python2 (#15928)
-* [afreecatv] Update referrer (#15947)
-+ [24video] Add support for 24video.sexy (#15973)
-* [crackle] Bypass geo restriction
-* [crackle] Fix extraction (#15969)
-+ [lenta] Add support for lenta.ru (#15953)
-+ [instagram:user] Add pagination (#15934)
-* [youku] Update ccode (#15939)
-* [libsyn] Adapt to new page structure
-
-
-version 2018.03.20
-
-Core
-* [extractor/common] Improve thumbnail extraction for HTML5 entries
-* Generalize XML manifest processing code and improve XSPF parsing
-+ [extractor/common] Add _download_xml_handle
-+ [extractor/common] Add support for relative URIs in _parse_xspf (#15794)
-
-Extractors
-+ [7plus] Extract series metadata (#15862, #15906)
-* [9now] Bypass geo restriction (#15920)
-* [cbs] Skip unavailable assets (#13490, #13506, #15776)
-+ [canalc2] Add support for HTML5 videos (#15916, #15919)
-+ [ceskatelevize] Add support for iframe embeds (#15918)
-+ [prosiebensat1] Add support for galileo.tv (#15894)
-+ [generic] Add support for xfileshare embeds (#15879)
-* [bilibili] Switch to v2 playurl API
-* [bilibili] Fix and improve extraction (#15048, #15430, #15622, #15863)
-* [heise] Improve extraction (#15496, #15784, #15026)
-* [instagram] Fix user videos extraction (#15858)
-
-
-version 2018.03.14
-
-Extractors
-* [soundcloud] Update client id (#15866)
-+ [tennistv] Add support for tennistv.com
-+ [line] Add support for tv.line.me (#9427)
-* [xnxx] Fix extraction (#15817)
-* [njpwworld] Fix authentication (#15815)
-
-
-version 2018.03.10
-
-Core
-* [downloader/hls] Skip uplynk ad fragments (#15748)
-
-Extractors
-* [pornhub] Don't override session cookies (#15697)
-+ [raywenderlich] Add support for videos.raywenderlich.com (#15251)
-* [funk] Fix extraction and rework extractors (#15792)
-* [nexx] Restore reverse engineered approach
-+ [heise] Add support for kaltura embeds (#14961, #15728)
-+ [tvnow] Extract series metadata (#15774)
-* [ruutu] Continue formats extraction on NOT-USED URLs (#15775)
-* [vrtnu] Use redirect URL for building video JSON URL (#15767, #15769)
-* [vimeo] Modernize login code and improve error messaging
-* [archiveorg] Fix extraction (#15770, #15772)
-+ [hidive] Add support for hidive.com (#15494)
-* [afreecatv] Detect deleted videos
-* [afreecatv] Fix extraction (#15755)
-* [vice] Fix extraction and rework extractors (#11101, #13019, #13622, #13778)
-+ [vidzi] Add support for vidzi.si (#15751)
-* [npo] Fix typo
-
-
-version 2018.03.03
-
-Core
-+ [utils] Add parse_resolution
-Revert respect --prefer-insecure while updating
-
-Extractors
-+ [yapfiles] Add support for yapfiles.ru (#15726, #11085)
-* [spankbang] Fix formats extraction (#15727)
-* [adn] Fix extraction (#15716)
-+ [toggle] Extract DASH and ISM formats (#15721)
-+ [nickelodeon] Add support for nickelodeon.com.tr (#15706)
-* [npo] Validate and filter format URLs (#15709)
-
-
-version 2018.02.26
-
-Extractors
-* [udemy] Use custom User-Agent (#15571)
-
-
-version 2018.02.25
-
-Core
-* [postprocessor/embedthumbnail] Skip embedding when there aren't any
-  thumbnails (#12573)
-* [extractor/common] Improve jwplayer subtitles extraction (#15695)
-
-Extractors
-+ [vidlii] Add support for vidlii.com (#14472, #14512, #14779)
-+ [streamango] Capture and output error messages
-* [streamango] Fix extraction (#14160, #14256)
-+ [telequebec] Add support for emissions (#14649, #14655)
-+ [telequebec:live] Add support for live streams (#15688)
-+ [mailru:music] Add support for mail.ru/music (#15618)
-* [aenetworks] Switch to akamai HLS formats (#15612)
-* [ytsearch] Fix flat title extraction (#11260, #15681)
-
-
-version 2018.02.22
-
-Core
-+ [utils] Fixup some common URL typos in sanitize_url (#15649)
-* Respect --prefer-insecure while updating (#15497)
-
-Extractors
-* [vidio] Fix HLS URL extraction (#15675)
-+ [nexx] Add support for arc.nexx.cloud URLs
-* [nexx] Switch to arc API (#15652)
-* [redtube] Fix duration extraction (#15659)
-+ [sonyliv] Respect referrer (#15648)
-+ [brightcove:new] Use referrer for formats' HTTP headers
-+ [cbc] Add support for olympics.cbc.ca (#15535)
-+ [fusion] Add support for fusion.tv (#15628)
-* [npo] Improve quality metadata extraction
-* [npo] Relax URL regular expression (#14987, #14994)
-+ [npo] Capture and output error message
-+ [pornhub] Add support for channels (#15613)
-* [youtube] Handle shared URLs with generic extractor (#14303)
-
-
-version 2018.02.11
-
-Core
-+ [YoutubeDL] Add support for filesize_approx in format selector (#15550)
-
-Extractors
-+ [francetv] Add support for live streams (#13689)
-+ [francetv] Add support for zouzous.fr and ludo.fr (#10454, #13087, #13103,
-  #15012)
-* [francetv] Separate main extractor and rework others to delegate to it
-* [francetv] Improve manifest URL signing (#15536)
-+ [francetv] Sign m3u8 manifest URLs (#15565)
-+ [veoh] Add support for embed URLs (#15561)
-* [afreecatv] Fix extraction (#15556)
-* [periscope] Use accessVideoPublic endpoint (#15554)
-* [discovery] Fix auth request (#15542)
-+ [6play] Extract subtitles (#15541)
-* [newgrounds] Fix metadata extraction (#15531)
-+ [nbc] Add support for stream.nbcolympics.com (#10295)
-* [dvtv] Fix live streams extraction (#15442)
-
-
-version 2018.02.08
-
-Extractors
-+ [myvi] Extend URL regular expression
-+ [myvi:embed] Add support for myvi.tv embeds (#15521)
-+ [prosiebensat1] Extend URL regular expression (#15520)
-* [pokemon] Relax URL regular expression and extend title extraction (#15518)
-+ [gameinformer] Use geo verification headers
-* [la7] Fix extraction (#15501, #15502)
-* [gameinformer] Fix brightcove id extraction (#15416)
-+ [afreecatv] Pass referrer to video info request (#15507)
-+ [telebruxelles] Add support for live streams
-* [telebruxelles] Relax URL regular expression
-* [telebruxelles] Fix extraction (#15504)
-* [extractor/common] Respect secure schemes in _extract_wowza_formats
-
-
-version 2018.02.04
-
-Core
-* [downloader/http] Randomize HTTP chunk size
-+ [downloader/http] Add ability to pass downloader options via info dict
-* [downloader/http] Fix 302 infinite loops by not reusing requests
-+ Document http_chunk_size
-
-Extractors
-+ [brightcove] Pass embed page URL as referrer (#15486)
-+ [youtube] Enforce using chunked HTTP downloading for DASH formats
-
-
-version 2018.02.03
-
-Core
-+ Introduce --http-chunk-size for chunk-based HTTP downloading
-+ Add support for IronPython
-* [downloader/ism] Fix Python 3.2 support
-
-Extractors
-* [redbulltv] Fix extraction (#15481)
-* [redtube] Fix metadata extraction (#15472)
-* [pladform] Respect platform id and extract HLS formats (#15468)
-- [rtlnl] Remove progressive formats (#15459)
-* [6play] Do no modify asset URLs with a token (#15248)
-* [nationalgeographic] Relax URL regular expression
-* [dplay] Relax URL regular expression (#15458)
-* [cbsinteractive] Fix data extraction (#15451)
-+ [amcnetworks] Add support for sundancetv.com (#9260)
-
-
-version 2018.01.27
-
-Core
-* [extractor/common] Improve _json_ld for articles
-* Switch codebase to use compat_b64decode
-+ [compat] Add compat_b64decode
-
-Extractors
-+ [seznamzpravy] Add support for seznam.cz and seznamzpravy.cz (#14102, #14616)
-* [dplay] Bypass geo restriction
-+ [dplay] Add support for disco-api videos (#15396)
-* [youtube] Extract precise error messages (#15284)
-* [teachertube] Capture and output error message
-* [teachertube] Fix and relax thumbnail extraction (#15403)
-+ [prosiebensat1] Add another clip id regular expression (#15378)
-* [tbs] Update tokenizer url (#15395)
-* [mixcloud] Use compat_b64decode (#15394)
-- [thesixtyone] Remove extractor (#15341)
-
-
-version 2018.01.21
-
-Core
-* [extractor/common] Improve jwplayer DASH formats extraction (#9242, #15187)
-* [utils] Improve scientific notation handling in js_to_json (#14789)
-
-Extractors
-+ [southparkdk] Add support for southparkstudios.nu
-+ [southpark] Add support for collections (#14803)
-* [franceinter] Fix upload date extraction (#14996)
-+ [rtvs] Add support for rtvs.sk (#9242, #15187)
-* [restudy] Fix extraction and extend URL regular expression (#15347)
-* [youtube:live] Improve live detection (#15365)
-+ [springboardplatform] Add support for springboardplatform.com
-* [prosiebensat1] Add another clip id regular expression (#15290)
-- [ringtv] Remove extractor (#15345)
-
-
-version 2018.01.18
-
-Extractors
-* [soundcloud] Update client id (#15306)
-- [kamcord] Remove extractor (#15322)
-+ [spiegel] Add support for nexx videos (#15285)
-* [twitch] Fix authentication and error capture (#14090, #15264)
-* [vk] Detect more errors due to copyright complaints (#15259)
-
-
-version 2018.01.14
-
-Extractors
-* [youtube] Fix live streams extraction (#15202)
-* [wdr] Bypass geo restriction
-* [wdr] Rework extractors (#14598)
-+ [wdr] Add support for wdrmaus.de/elefantenseite (#14598)
-+ [gamestar] Add support for gamepro.de (#3384)
-* [viafree] Skip rtmp formats (#15232)
-+ [pandoratv] Add support for mobile URLs (#12441)
-+ [pandoratv] Add support for new URL format (#15131)
-+ [ximalaya] Add support for ximalaya.com (#14687)
-+ [digg] Add support for digg.com (#15214)
-* [limelight] Tolerate empty pc formats (#15150, #15151, #15207)
-* [ndr:embed:base] Make separate formats extraction non fatal (#15203)
-+ [weibo] Add extractor (#15079)
-+ [ok] Add support for live streams
-* [canalplus] Fix extraction (#15072)
-* [bilibili] Fix extraction (#15188)
-
-
-version 2018.01.07
-
-Core
-* [utils] Fix youtube-dl under PyPy3 on Windows
-* [YoutubeDL] Output python implementation in debug header
-
-Extractors
-+ [jwplatform] Add support for multiple embeds (#15192)
-* [mitele] Fix extraction (#15186)
-+ [motherless] Add support for groups (#15124)
-* [lynda] Relax URL regular expression (#15185)
-* [soundcloud] Fallback to avatar picture for thumbnail (#12878)
 * [youku] Fix list extraction (#15135)
 * [openload] Fix extraction (#15166)
-* [lynda] Skip invalid subtitles (#15159)
-* [twitch] Pass video id to url_result when extracting playlist (#15139)
 * [rtve.es:alacarta] Fix extraction of some new URLs
-* [acast] Fix extraction (#15147)
 
 
 version 2017.12.31
README.md

@@ -46,7 +46,7 @@ Or with [MacPorts](https://www.macports.org/):
 Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
 
 # DESCRIPTION
-**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
+**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
 
     youtube-dl [OPTIONS] URL [URL...]
 
@@ -198,11 +198,6 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
                                      size. By default, the buffer size is
                                      automatically resized from an initial value
                                      of SIZE.
-    --http-chunk-size SIZE           Size of a chunk for chunk-based HTTP
-                                     downloading (e.g. 10485760 or 10M) (default
-                                     is disabled). May be useful for bypassing
-                                     bandwidth throttling imposed by a webserver
-                                     (experimental)
     --playlist-reverse               Download playlist videos in reverse order
     --playlist-random                Download playlist videos in random order
     --xattr-set-filesize             Set file xattribute ytdl.filesize with
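The `--http-chunk-size` entry above describes chunk-based HTTP downloading: the file is fetched as a series of ranged requests of at most SIZE bytes each. A rough sketch of that idea (hypothetical helper, not youtube-dl's own downloader code):

```python
def http_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte ranges covering total_size bytes,
    each at most chunk_size bytes, as --http-chunk-size 10M would request."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1

# A 25 MiB file with 10M chunks needs three ranged requests.
ranges = list(http_ranges(25 * 1024 * 1024, 10 * 1024 * 1024))
```

Each (start, end) pair would become a `Range: bytes=start-end` request header.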
@@ -868,7 +863,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
 In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
 
-Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
+Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
 
 Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
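The newline requirement above can be fixed mechanically before passing the file to `--cookies`; a minimal sketch (hypothetical helper, not part of youtube-dl):

```python
def normalize_newlines(data, eol=b"\n"):
    """Rewrite a cookies file's line endings to the target convention:
    eol=b"\r\n" (CRLF) for Windows, eol=b"\n" (LF) for Unix-like systems."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n").replace(b"\n", eol)

unix_cookies = normalize_newlines(b"# Netscape HTTP Cookie File\r\n", b"\n")
```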
docs/supportedsites.md

@@ -128,14 +128,13 @@
 - **CamdemyFolder**
 - **CamWithHer**
 - **canalc2.tv**
-- **Canalplus**: mycanal.fr and piwiplus.fr
+- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
 - **Canvas**
 - **CanvasEen**: canvas.be and een.be
 - **CarambaTV**
 - **CarambaTVPage**
 - **CartoonNetwork**
 - **cbc.ca**
-- **cbc.ca:olympics**
 - **cbc.ca:player**
 - **cbc.ca:watch**
 - **cbc.ca:watch:video**

@@ -190,7 +189,7 @@
 - **CSpan**: C-SPAN
 - **CtsNews**: 華視新聞
 - **CTVNews**
-- **Culturebox**
+- **culturebox.francetvinfo.fr**
 - **CultureUnplugged**
 - **curiositystream**
 - **curiositystream:collection**

@@ -211,7 +210,6 @@
 - **defense.gouv.fr**
 - **democracynow**
 - **DHM**: Filmarchiv - Deutsches Historisches Museum
-- **Digg**
 - **DigitallySpeaking**
 - **Digiteka**
 - **Discovery**

@@ -292,14 +290,11 @@
 - **FranceTV**
 - **FranceTVEmbed**
 - **francetvinfo.fr**
-- **FranceTVJeunesse**
-- **FranceTVSite**
 - **Freesound**
 - **freespeech.org**
 - **FreshLive**
 - **Funimation**
-- **FunkChannel**
-- **FunkMix**
+- **Funk**
 - **FunnyOrDie**
 - **Fusion**
 - **Fux**

@@ -337,7 +332,6 @@
 - **HentaiStigma**
 - **hetklokhuis**
 - **hgtv.com:show**
-- **HiDive**
 - **HistoricFilms**
 - **history:topic**: History.com Topic
 - **hitbox**

@@ -388,6 +382,7 @@
 - **JWPlatform**
 - **Kakao**
 - **Kaltura**
+- **Kamcord**
 - **KanalPlay**: Kanal 5/9/11 Play
 - **Kankan**
 - **Karaoketv**

@@ -419,7 +414,6 @@
 - **Lecture2Go**
 - **LEGO**
 - **Lemonde**
-- **Lenta**
 - **LePlaylist**
 - **LetvCloud**: 乐视云
 - **Libsyn**

@@ -428,7 +422,6 @@
 - **limelight**
 - **limelight:channel**
 - **limelight:channel_list**
-- **LineTV**
 - **LiTV**
 - **LiveLeak**
 - **LiveLeakEmbed**

@@ -444,8 +437,6 @@
 - **m6**
 - **macgamestore**: MacGameStore trailers
 - **mailru**: Видео@Mail.Ru
-- **mailru:music**: Музыка@Mail.Ru
-- **mailru:music:search**: Музыка@Mail.Ru
 - **MakersChannel**
 - **MakerTV**
 - **mangomolo:live**

@@ -487,7 +478,6 @@
 - **Moniker**: allmyvideos.net and vidspot.net
 - **Morningstar**: morningstar.com
 - **Motherless**
-- **MotherlessGroup**
 - **Motorsport**: motorsport.com
 - **MovieClips**
 - **MovieFap**

@@ -511,7 +501,6 @@
 - **MySpass**
 - **Myvi**
 - **MyVidster**
-- **MyviEmbed**
 - **n-tv.de**
 - **natgeo**
 - **natgeo:episodeguide**

@@ -520,8 +509,7 @@
 - **NBA**
 - **NBC**
 - **NBCNews**
-- **nbcolympics**
-- **nbcolympics:stream**
+- **NBCOlympics**
 - **NBCSports**
 - **NBCSportsVPlayer**
 - **ndr**: NDR.de - Norddeutscher Rundfunk

@@ -678,7 +666,6 @@
 - **RaiPlay**
 - **RaiPlayLive**
 - **RaiPlayPlaylist**
-- **RayWenderlich**
 - **RBMARadio**
 - **RDS**: RDS.ca
 - **RedBullTV**

@@ -694,6 +681,7 @@
 - **revision**
 - **revision3:embed**
 - **RICE**
+- **RingTV**
 - **RMCDecouverte**
 - **RockstarGames**
 - **RoosterTeeth**

@@ -714,7 +702,6 @@
 - **rtve.es:live**: RTVE.es live streams
 - **rtve.es:television**
 - **RTVNH**
-- **RTVS**
 - **Rudo**
 - **RUHD**
 - **RulePorn**

@@ -744,8 +731,6 @@
 - **ServingSys**
 - **Servus**
 - **Sexu**
-- **SeznamZpravy**
-- **SeznamZpravyArticle**
 - **Shahid**
 - **ShahidShow**
 - **Shared**: shared.sx

@@ -787,7 +772,7 @@
 - **Sport5**
 - **SportBoxEmbed**
 - **SportDeutschland**
-- **SpringboardPlatform**
+- **Sportschau**
 - **Sprout**
 - **sr:mediathek**: Saarländischer Rundfunk
 - **SRGSSR**

@@ -827,11 +812,8 @@
 - **Telegraaf**
 - **TeleMB**
 - **TeleQuebec**
-- **TeleQuebecEmission**
-- **TeleQuebecLive**
 - **TeleTask**
 - **Telewebion**
-- **TennisTV**
 - **TF1**
 - **TFO**
 - **TheIntercept**

@@ -839,6 +821,7 @@
 - **ThePlatform**
 - **ThePlatformFeed**
 - **TheScene**
+- **TheSixtyOne**
 - **TheStar**
 - **TheSun**
 - **TheWeatherChannel**

@@ -940,6 +923,7 @@
 - **vice**
 - **vice:article**
 - **vice:show**
+- **Viceland**
 - **Vidbit**
 - **Viddler**
 - **Videa**

@@ -955,7 +939,6 @@
 - **VideoPress**
 - **videoweed**: VideoWeed
 - **Vidio**
-- **VidLii**
 - **vidme**
 - **vidme:user**
 - **vidme:user:likes**

@@ -1018,14 +1001,10 @@
 - **WatchIndianPorn**: Watch Indian Porn
 - **WDR**
 - **wdr:mobile**
-- **WDRElefant**
-- **WDRPage**
 - **Webcaster**
 - **WebcasterFeed**
 - **WebOfStories**
 - **WebOfStoriesPlaylist**
-- **Weibo**
-- **WeiboMobile**
 - **WeiqiTV**: WQTV
 - **wholecloud**: WholeCloud
 - **Wimp**

@@ -1045,8 +1024,6 @@
 - **xiami:artist**: 虾米音乐 - 歌手
 - **xiami:collection**: 虾米音乐 - 精选集
 - **xiami:song**: 虾米音乐
-- **ximalaya**: 喜马拉雅FM
-- **ximalaya:album**: 喜马拉雅FM 专辑
 - **XMinus**
 - **XNXX**
 - **Xstream**

@@ -1060,7 +1037,6 @@
 - **yandexmusic:album**: Яндекс.Музыка - Альбом
 - **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
 - **yandexmusic:track**: Яндекс.Музыка - Трек
-- **YapFiles**
 - **YesJapan**
 - **yinyuetai:video**: 音悦Tai
 - **Ynet**
```diff
@@ -3,4 +3,4 @@ universal = True
 
 [flake8]
 exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
-ignore = E402,E501,E731,E741
+ignore = E402,E501,E731
```
```diff
@@ -694,55 +694,6 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
         self.ie._sort_formats(formats)
         expect_value(self, formats, expected_formats, None)
 
-    def test_parse_xspf(self):
-        _TEST_CASES = [
-            (
-                'foo_xspf',
-                'https://example.org/src/foo_xspf.xspf',
-                [{
-                    'id': 'foo_xspf',
-                    'title': 'Pandemonium',
-                    'description': 'Visit http://bigbrother404.bandcamp.com',
-                    'duration': 202.416,
-                    'formats': [{
-                        'manifest_url': 'https://example.org/src/foo_xspf.xspf',
-                        'url': 'https://example.org/src/cd1/track%201.mp3',
-                    }],
-                }, {
-                    'id': 'foo_xspf',
-                    'title': 'Final Cartridge (Nichico Twelve Remix)',
-                    'description': 'Visit http://bigbrother404.bandcamp.com',
-                    'duration': 255.857,
-                    'formats': [{
-                        'manifest_url': 'https://example.org/src/foo_xspf.xspf',
-                        'url': 'https://example.org/%E3%83%88%E3%83%A9%E3%83%83%E3%82%AF%E3%80%80%EF%BC%92.mp3',
-                    }],
-                }, {
-                    'id': 'foo_xspf',
-                    'title': 'Rebuilding Nightingale',
-                    'description': 'Visit http://bigbrother404.bandcamp.com',
-                    'duration': 287.915,
-                    'formats': [{
-                        'manifest_url': 'https://example.org/src/foo_xspf.xspf',
-                        'url': 'https://example.org/src/track3.mp3',
-                    }, {
-                        'manifest_url': 'https://example.org/src/foo_xspf.xspf',
-                        'url': 'https://example.com/track3.mp3',
-                    }]
-                }]
-            ),
-        ]
-
-        for xspf_file, xspf_url, expected_entries in _TEST_CASES:
-            with io.open('./test/testdata/xspf/%s.xspf' % xspf_file,
-                         mode='r', encoding='utf-8') as f:
-                entries = self.ie._parse_xspf(
-                    compat_etree_fromstring(f.read().encode('utf-8')),
-                    xspf_file, xspf_url=xspf_url, xspf_base_url=xspf_url)
-                expect_value(self, entries, expected_entries, None)
-                for i in range(len(entries)):
-                    expect_dict(self, entries[i], expected_entries[i])
-
 
 if __name__ == '__main__':
     unittest.main()
```
```diff
@@ -92,8 +92,8 @@ class TestDownload(unittest.TestCase):
 def generator(test_case, tname):
 
     def test_template(self):
-        ie = youtube_dl.extractor.get_info_extractor(test_case['name'])()
-        other_ies = [get_info_extractor(ie_key)() for ie_key in test_case.get('add_ie', [])]
+        ie = youtube_dl.extractor.get_info_extractor(test_case['name'])
+        other_ies = [get_info_extractor(ie_key) for ie_key in test_case.get('add_ie', [])]
         is_playlist = any(k.startswith('playlist') for k in test_case)
         test_cases = test_case.get(
             'playlist', [] if is_playlist else [test_case])
```
```diff
@@ -1,125 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-from __future__ import unicode_literals
-
-# Allow direct execution
-import os
-import re
-import sys
-import unittest
-sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
-
-from test.helper import try_rm
-from youtube_dl import YoutubeDL
-from youtube_dl.compat import compat_http_server
-from youtube_dl.downloader.http import HttpFD
-from youtube_dl.utils import encodeFilename
-import ssl
-import threading
-
-TEST_DIR = os.path.dirname(os.path.abspath(__file__))
-
-
-def http_server_port(httpd):
-    if os.name == 'java' and isinstance(httpd.socket, ssl.SSLSocket):
-        # In Jython SSLSocket is not a subclass of socket.socket
-        sock = httpd.socket.sock
-    else:
-        sock = httpd.socket
-    return sock.getsockname()[1]
-
-
-TEST_SIZE = 10 * 1024
-
-
-class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
-    def log_message(self, format, *args):
-        pass
-
-    def send_content_range(self, total=None):
-        range_header = self.headers.get('Range')
-        start = end = None
-        if range_header:
-            mobj = re.search(r'^bytes=(\d+)-(\d+)', range_header)
-            if mobj:
-                start = int(mobj.group(1))
-                end = int(mobj.group(2))
-        valid_range = start is not None and end is not None
-        if valid_range:
-            content_range = 'bytes %d-%d' % (start, end)
-            if total:
-                content_range += '/%d' % total
-            self.send_header('Content-Range', content_range)
-        return (end - start + 1) if valid_range else total
-
-    def serve(self, range=True, content_length=True):
-        self.send_response(200)
-        self.send_header('Content-Type', 'video/mp4')
-        size = TEST_SIZE
-        if range:
-            size = self.send_content_range(TEST_SIZE)
-        if content_length:
-            self.send_header('Content-Length', size)
-        self.end_headers()
-        self.wfile.write(b'#' * size)
-
-    def do_GET(self):
-        if self.path == '/regular':
-            self.serve()
-        elif self.path == '/no-content-length':
-            self.serve(content_length=False)
-        elif self.path == '/no-range':
-            self.serve(range=False)
-        elif self.path == '/no-range-no-content-length':
-            self.serve(range=False, content_length=False)
-        else:
-            assert False
-
-
-class FakeLogger(object):
-    def debug(self, msg):
-        pass
-
-    def warning(self, msg):
-        pass
-
-    def error(self, msg):
-        pass
-
-
-class TestHttpFD(unittest.TestCase):
-    def setUp(self):
-        self.httpd = compat_http_server.HTTPServer(
-            ('127.0.0.1', 0), HTTPTestRequestHandler)
-        self.port = http_server_port(self.httpd)
-        self.server_thread = threading.Thread(target=self.httpd.serve_forever)
-        self.server_thread.daemon = True
-        self.server_thread.start()
-
-    def download(self, params, ep):
-        params['logger'] = FakeLogger()
-        ydl = YoutubeDL(params)
-        downloader = HttpFD(ydl, params)
-        filename = 'testfile.mp4'
-        try_rm(encodeFilename(filename))
-        self.assertTrue(downloader.real_download(filename, {
-            'url': 'http://127.0.0.1:%d/%s' % (self.port, ep),
-        }))
-        self.assertEqual(os.path.getsize(encodeFilename(filename)), TEST_SIZE)
-        try_rm(encodeFilename(filename))
-
-    def download_all(self, params):
-        for ep in ('regular', 'no-content-length', 'no-range', 'no-range-no-content-length'):
-            self.download(params, ep)
-
-    def test_regular(self):
-        self.download_all({})
-
-    def test_chunked(self):
-        self.download_all({
-            'http_chunk_size': 1000,
-        })
-
-
-if __name__ == '__main__':
-    unittest.main()
```
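The deleted test above exercises chunk-based HTTP downloading (`http_chunk_size`) against a local server that honors `Range` headers. As a rough illustration of the idea, splitting a download into inclusive byte ranges could look like this (`byte_ranges` is a hypothetical helper, not youtube-dl's actual code):

```python
# Hypothetical helper: split a download of `total` bytes into
# (start, end) inclusive byte ranges of at most `chunk_size` bytes,
# mirroring how a chunk-based downloader would issue
# "Range: bytes=start-end" requests.
def byte_ranges(total, chunk_size):
    ranges = []
    start = 0
    while start < total:
        end = min(start + chunk_size, total) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

# Three chunks cover the 10 KiB test payload served above.
print(byte_ranges(10 * 1024, 4096))
```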
```diff
@@ -47,7 +47,7 @@ class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
             self.end_headers()
             return
 
-        new_url = 'http://127.0.0.1:%d/中文.html' % http_server_port(self.server)
+        new_url = 'http://localhost:%d/中文.html' % http_server_port(self.server)
         self.send_response(302)
         self.send_header(b'Location', new_url.encode('utf-8'))
         self.end_headers()
@@ -74,7 +74,7 @@ class FakeLogger(object):
 class TestHTTP(unittest.TestCase):
     def setUp(self):
         self.httpd = compat_http_server.HTTPServer(
-            ('127.0.0.1', 0), HTTPTestRequestHandler)
+            ('localhost', 0), HTTPTestRequestHandler)
         self.port = http_server_port(self.httpd)
         self.server_thread = threading.Thread(target=self.httpd.serve_forever)
         self.server_thread.daemon = True
@@ -86,15 +86,15 @@ class TestHTTP(unittest.TestCase):
             return
 
         ydl = YoutubeDL({'logger': FakeLogger()})
-        r = ydl.extract_info('http://127.0.0.1:%d/302' % self.port)
-        self.assertEqual(r['entries'][0]['url'], 'http://127.0.0.1:%d/vid.mp4' % self.port)
+        r = ydl.extract_info('http://localhost:%d/302' % self.port)
+        self.assertEqual(r['entries'][0]['url'], 'http://localhost:%d/vid.mp4' % self.port)
 
 
 class TestHTTPS(unittest.TestCase):
     def setUp(self):
         certfn = os.path.join(TEST_DIR, 'testcert.pem')
         self.httpd = compat_http_server.HTTPServer(
-            ('127.0.0.1', 0), HTTPTestRequestHandler)
+            ('localhost', 0), HTTPTestRequestHandler)
         self.httpd.socket = ssl.wrap_socket(
             self.httpd.socket, certfile=certfn, server_side=True)
         self.port = http_server_port(self.httpd)
@@ -107,11 +107,11 @@ class TestHTTPS(unittest.TestCase):
         ydl = YoutubeDL({'logger': FakeLogger()})
         self.assertRaises(
             Exception,
-            ydl.extract_info, 'https://127.0.0.1:%d/video.html' % self.port)
+            ydl.extract_info, 'https://localhost:%d/video.html' % self.port)
 
         ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True})
-        r = ydl.extract_info('https://127.0.0.1:%d/video.html' % self.port)
-        self.assertEqual(r['entries'][0]['url'], 'https://127.0.0.1:%d/vid.mp4' % self.port)
+        r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
+        self.assertEqual(r['entries'][0]['url'], 'https://localhost:%d/vid.mp4' % self.port)
 
 
 def _build_proxy_handler(name):
@@ -132,23 +132,23 @@ def _build_proxy_handler(name):
 class TestProxy(unittest.TestCase):
     def setUp(self):
         self.proxy = compat_http_server.HTTPServer(
-            ('127.0.0.1', 0), _build_proxy_handler('normal'))
+            ('localhost', 0), _build_proxy_handler('normal'))
         self.port = http_server_port(self.proxy)
         self.proxy_thread = threading.Thread(target=self.proxy.serve_forever)
         self.proxy_thread.daemon = True
         self.proxy_thread.start()
 
         self.geo_proxy = compat_http_server.HTTPServer(
-            ('127.0.0.1', 0), _build_proxy_handler('geo'))
+            ('localhost', 0), _build_proxy_handler('geo'))
         self.geo_port = http_server_port(self.geo_proxy)
         self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever)
         self.geo_proxy_thread.daemon = True
         self.geo_proxy_thread.start()
 
     def test_proxy(self):
-        geo_proxy = '127.0.0.1:{0}'.format(self.geo_port)
+        geo_proxy = 'localhost:{0}'.format(self.geo_port)
         ydl = YoutubeDL({
-            'proxy': '127.0.0.1:{0}'.format(self.port),
+            'proxy': 'localhost:{0}'.format(self.port),
             'geo_verification_proxy': geo_proxy,
         })
         url = 'http://foo.com/bar'
@@ -162,7 +162,7 @@ class TestProxy(unittest.TestCase):
 
     def test_proxy_with_idn(self):
         ydl = YoutubeDL({
-            'proxy': '127.0.0.1:{0}'.format(self.port),
+            'proxy': 'localhost:{0}'.format(self.port),
         })
         url = 'http://中文.tw/'
         response = ydl.urlopen(url).read().decode('utf-8')
```
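This hunk only swaps the hard-coded `127.0.0.1` back to `localhost` in the test servers. Either way, the servers bind to port 0 so the OS picks a free ephemeral port, which `http_server_port()` then reads back. A minimal stdlib sketch of that pattern (plain `socket`, not youtube-dl code):

```python
import socket

# Bind to port 0 so the OS assigns a free ephemeral port, then recover
# the chosen port with getsockname() -- the same trick the test
# helper http_server_port() uses for its HTTPServer instances.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('127.0.0.1', 0))
port = s.getsockname()[1]
s.close()
print(port > 0)
```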
```diff
@@ -53,12 +53,10 @@ from youtube_dl.utils import (
     parse_filesize,
     parse_count,
     parse_iso8601,
-    parse_resolution,
    pkcs1pad,
     read_batch_urls,
     sanitize_filename,
     sanitize_path,
-    sanitize_url,
     expand_path,
     prepend_extension,
     replace_extension,
@@ -221,12 +219,6 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(sanitize_path('./abc'), 'abc')
         self.assertEqual(sanitize_path('./../abc'), '..\\abc')
 
-    def test_sanitize_url(self):
-        self.assertEqual(sanitize_url('//foo.bar'), 'http://foo.bar')
-        self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar')
-        self.assertEqual(sanitize_url('rmtps://foo.bar'), 'rtmps://foo.bar')
-        self.assertEqual(sanitize_url('https://foo.bar'), 'https://foo.bar')
-
     def test_expand_path(self):
         def env(var):
             return '%{0}%'.format(var) if sys.platform == 'win32' else '${0}'.format(var)
@@ -352,7 +344,6 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
         self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)
         self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140)
-        self.assertEqual(unified_timestamp('2018-03-14T08:32:43.1493874+00:00'), 1521016363)
 
     def test_determine_ext(self):
         self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@@ -823,9 +814,6 @@ class TestUtil(unittest.TestCase):
         inp = '''{"duration": "00:01:07"}'''
         self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''')
 
-        inp = '''{segments: [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}'''
-        self.assertEqual(js_to_json(inp), '''{"segments": [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}''')
-
     def test_js_to_json_edgecases(self):
         on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
         self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
@@ -897,13 +885,6 @@ class TestUtil(unittest.TestCase):
         on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}')
         self.assertEqual(json.loads(on), {'42': 42})
 
-        on = js_to_json('{42:4.2e1}')
-        self.assertEqual(json.loads(on), {'42': 42.0})
-
-    def test_js_to_json_malformed(self):
-        self.assertEqual(js_to_json('42a1'), '42"a1"')
-        self.assertEqual(js_to_json('42a-1'), '42"a"-1')
-
     def test_extract_attributes(self):
         self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
         self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
@@ -984,16 +965,6 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(parse_count('1.1kk '), 1100000)
         self.assertEqual(parse_count('1.1kk views'), 1100000)
 
-    def test_parse_resolution(self):
-        self.assertEqual(parse_resolution(None), {})
-        self.assertEqual(parse_resolution(''), {})
-        self.assertEqual(parse_resolution('1920x1080'), {'width': 1920, 'height': 1080})
-        self.assertEqual(parse_resolution('1920×1080'), {'width': 1920, 'height': 1080})
-        self.assertEqual(parse_resolution('1920 x 1080'), {'width': 1920, 'height': 1080})
-        self.assertEqual(parse_resolution('720p'), {'height': 720})
-        self.assertEqual(parse_resolution('4k'), {'height': 2160})
-        self.assertEqual(parse_resolution('8K'), {'height': 4320})
-
     def test_version_tuple(self):
         self.assertEqual(version_tuple('1'), (1,))
         self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
```
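The removed `test_parse_resolution` cases above document what `parse_resolution()` is expected to return. A rough sketch of the two most common forms (WxH and NNNp) — not youtube-dl's actual implementation, and deliberately omitting the `4k`/`8K` shorthand cases:

```python
import re

# Sketch of a parse_resolution()-style helper: extract width/height
# from strings such as '1920x1080', '1920 x 1080', or '720p'.
# Returns an empty dict when nothing matches (including None input).
def parse_resolution_sketch(s):
    if not s:
        return {}
    m = re.search(r'(?P<w>\d+)\s*[xX×]\s*(?P<h>\d+)', s)
    if m:
        return {'width': int(m.group('w')), 'height': int(m.group('h'))}
    m = re.search(r'(?P<h>\d+)[pPiI]\b', s)
    if m:
        return {'height': int(m.group('h'))}
    return {}
```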
test/testdata/xspf/foo_xspf.xspf (34 changed lines, vendored)
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
|
||||||
<playlist version="1" xmlns="http://xspf.org/ns/0/">
|
|
||||||
<date>2018-03-09T18:01:43Z</date>
|
|
||||||
<trackList>
|
|
||||||
<track>
|
|
||||||
<location>cd1/track%201.mp3</location>
|
|
||||||
<title>Pandemonium</title>
|
|
||||||
<creator>Foilverb</creator>
|
|
||||||
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
|
|
||||||
<album>Pandemonium EP</album>
|
|
||||||
<trackNum>1</trackNum>
|
|
||||||
<duration>202416</duration>
|
|
||||||
</track>
|
|
||||||
<track>
|
|
||||||
<location>../%E3%83%88%E3%83%A9%E3%83%83%E3%82%AF%E3%80%80%EF%BC%92.mp3</location>
|
|
||||||
<title>Final Cartridge (Nichico Twelve Remix)</title>
|
|
||||||
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
|
|
||||||
<creator>Foilverb</creator>
|
|
||||||
<album>Pandemonium EP</album>
|
|
||||||
<trackNum>2</trackNum>
|
|
||||||
<duration>255857</duration>
|
|
||||||
</track>
|
|
||||||
<track>
|
|
||||||
<location>track3.mp3</location>
|
|
||||||
<location>https://example.com/track3.mp3</location>
|
|
||||||
<title>Rebuilding Nightingale</title>
|
|
||||||
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
|
|
||||||
<creator>Foilverb</creator>
|
|
||||||
<album>Pandemonium EP</album>
|
|
||||||
<trackNum>3</trackNum>
|
|
||||||
<duration>287915</duration>
|
|
||||||
</track>
|
|
||||||
</trackList>
|
|
||||||
</playlist>
|
|
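Note how the fixture's `<duration>` values are in milliseconds while the matching test expectations (202.416, 255.857, 287.915) are seconds, i.e. `duration_ms / 1000`. A minimal stdlib sketch of reading namespaced XSPF durations (simplified single-track document, not the extractor's parser):

```python
import xml.etree.ElementTree as ET

# XSPF elements live in the http://xspf.org/ns/0/ namespace, so lookups
# need the Clark-notation prefix; durations are milliseconds.
XSPF_NS = '{http://xspf.org/ns/0/}'
doc = ET.fromstring(
    '<playlist version="1" xmlns="http://xspf.org/ns/0/">'
    '<trackList><track><duration>202416</duration></track></trackList>'
    '</playlist>')
durations = [
    int(d.text) / 1000.0  # milliseconds -> seconds
    for d in doc.iter(XSPF_NS + 'duration')]
print(durations)
```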
```diff
@@ -298,8 +298,7 @@ class YoutubeDL(object):
                        the downloader (see youtube_dl/downloader/common.py):
                        nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
                        noresizebuffer, retries, continuedl, noprogress, consoletitle,
-                       xattr_set_filesize, external_downloader_args, hls_use_mpegts,
-                       http_chunk_size.
+                       xattr_set_filesize, external_downloader_args, hls_use_mpegts.
 
     The following options are used by the post processors:
     prefer_ffmpeg:     If True, use ffmpeg instead of avconv if both are available,
@@ -1033,7 +1032,7 @@ class YoutubeDL(object):
         '!=': operator.ne,
     }
     operator_rex = re.compile(r'''(?x)\s*
-        (?P<key>width|height|tbr|abr|vbr|asr|filesize|filesize_approx|fps)
+        (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
         \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
         (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
         $
```
```diff
@@ -191,11 +191,6 @@ def _real_main(argv=None):
         if numeric_buffersize is None:
             parser.error('invalid buffer size specified')
         opts.buffersize = numeric_buffersize
-    if opts.http_chunk_size is not None:
-        numeric_chunksize = FileDownloader.parse_bytes(opts.http_chunk_size)
-        if not numeric_chunksize:
-            parser.error('invalid http chunk size specified')
-        opts.http_chunk_size = numeric_chunksize
     if opts.playliststart <= 0:
         raise ValueError('Playlist start must be positive')
     if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart:
@@ -351,7 +346,6 @@ def _real_main(argv=None):
         'keep_fragments': opts.keep_fragments,
         'buffersize': opts.buffersize,
         'noresizebuffer': opts.noresizebuffer,
-        'http_chunk_size': opts.http_chunk_size,
         'continuedl': opts.continue_dl,
         'noprogress': opts.noprogress,
         'progress_with_newline': opts.progress_with_newline,
```
```diff
@@ -1,8 +1,8 @@
 from __future__ import unicode_literals
 
+import base64
 from math import ceil
 
-from .compat import compat_b64decode
 from .utils import bytes_to_intlist, intlist_to_bytes
 
 BLOCK_SIZE_BYTES = 16
@@ -180,7 +180,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
     """
     NONCE_LENGTH_BYTES = 8
 
-    data = bytes_to_intlist(compat_b64decode(data))
+    data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
     password = bytes_to_intlist(password.encode('utf-8'))
 
     key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))
```
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import base64
|
|
||||||
import binascii
|
import binascii
|
||||||
import collections
|
import collections
|
||||||
import ctypes
|
import ctypes
|
||||||
@ -2897,24 +2896,9 @@ except TypeError:
|
|||||||
if isinstance(spec, compat_str):
|
if isinstance(spec, compat_str):
|
||||||
spec = spec.encode('ascii')
|
spec = spec.encode('ascii')
|
||||||
return struct.unpack(spec, *args)
|
return struct.unpack(spec, *args)
|
||||||
|
|
||||||
class compat_Struct(struct.Struct):
|
|
||||||
def __init__(self, fmt):
|
|
||||||
if isinstance(fmt, compat_str):
|
|
||||||
fmt = fmt.encode('ascii')
|
|
||||||
super(compat_Struct, self).__init__(fmt)
|
|
||||||
else:
|
else:
|
||||||
compat_struct_pack = struct.pack
|
compat_struct_pack = struct.pack
|
||||||
compat_struct_unpack = struct.unpack
|
compat_struct_unpack = struct.unpack
|
||||||
if platform.python_implementation() == 'IronPython' and sys.version_info < (2, 7, 8):
|
|
||||||
class compat_Struct(struct.Struct):
|
|
||||||
def unpack(self, string):
|
|
||||||
if not isinstance(string, buffer): # noqa: F821
|
|
||||||
string = buffer(string) # noqa: F821
|
|
||||||
return super(compat_Struct, self).unpack(string)
|
|
||||||
else:
|
|
||||||
compat_Struct = struct.Struct
|
|
||||||
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
from future_builtins import zip as compat_zip
|
from future_builtins import zip as compat_zip
|
||||||
@ -2924,16 +2908,6 @@ except ImportError: # not 2.6+ or is 3.x
|
|||||||
except ImportError:
|
except ImportError:
|
||||||
compat_zip = zip
|
compat_zip = zip
|
||||||
|
|
||||||
|
|
||||||
if sys.version_info < (3, 3):
|
|
||||||
def compat_b64decode(s, *args, **kwargs):
|
|
||||||
if isinstance(s, compat_str):
|
|
||||||
s = s.encode('ascii')
|
|
||||||
return base64.b64decode(s, *args, **kwargs)
|
|
||||||
else:
|
|
||||||
compat_b64decode = base64.b64decode
|
|
||||||
|
|
||||||
|
|
||||||
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
|
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
|
||||||
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function
|
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function
|
||||||
# names, see the original PyPy issue [1] and the youtube-dl one [2].
|
# names, see the original PyPy issue [1] and the youtube-dl one [2].
|
||||||
@ -2956,8 +2930,6 @@ __all__ = [
|
|||||||
'compat_HTMLParseError',
|
'compat_HTMLParseError',
|
||||||
'compat_HTMLParser',
|
'compat_HTMLParser',
|
||||||
'compat_HTTPError',
|
'compat_HTTPError',
|
||||||
'compat_Struct',
|
|
||||||
'compat_b64decode',
|
|
||||||
'compat_basestring',
|
'compat_basestring',
|
||||||
'compat_chr',
|
'compat_chr',
|
||||||
'compat_cookiejar',
|
'compat_cookiejar',
|
||||||
|
@ -49,9 +49,6 @@ class FileDownloader(object):
|
|||||||
external_downloader_args: A list of additional command-line arguments for the
|
external_downloader_args: A list of additional command-line arguments for the
|
||||||
external downloader.
|
external downloader.
|
||||||
hls_use_mpegts: Use the mpegts container for HLS videos.
|
hls_use_mpegts: Use the mpegts container for HLS videos.
|
||||||
http_chunk_size: Size of a chunk for chunk-based HTTP downloading. May be
|
|
||||||
useful for bypassing bandwidth throttling imposed by
|
|
||||||
a webserver (experimental)
|
|
||||||
|
|
||||||
Subclasses of this one must re-define the real_download method.
|
Subclasses of this one must re-define the real_download method.
|
||||||
"""
|
"""
|
||||||
@ -249,13 +246,12 @@ class FileDownloader(object):
|
|||||||
if self.params.get('noprogress', False):
|
if self.params.get('noprogress', False):
|
||||||
self.to_screen('[download] Download completed')
|
self.to_screen('[download] Download completed')
|
||||||
else:
|
else:
|
||||||
msg_template = '100%%'
|
|
||||||
if s.get('total_bytes') is not None:
|
|
||||||
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
|
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
|
||||||
msg_template += ' of %(_total_bytes_str)s'
|
|
||||||
if s.get('elapsed') is not None:
|
if s.get('elapsed') is not None:
|
||||||
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
|
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
|
||||||
msg_template += ' in %(_elapsed_str)s'
|
msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
|
||||||
|
else:
|
||||||
|
msg_template = '100%% of %(_total_bytes_str)s'
|
||||||
self._report_progress_status(
|
self._report_progress_status(
|
||||||
msg_template % s, is_last_line=True)
|
msg_template % s, is_last_line=True)
|
||||||
|
|
||||||
|
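The two sides of the second hunk build the final progress line differently: the newer (removed) side appends template fragments only for fields that are actually known, so a download without a total size still formats cleanly. An isolated sketch of that approach, with hypothetical key names standing in for `_total_bytes_str`/`_elapsed_str`:

```python
# Build a "download finished" line by appending fragments only for the
# fields present in the status dict -- a %-style template applied to a
# dict, as in the hunk above ('100%%' escapes a literal percent sign).
def finished_line(s):
    msg_template = '100%%'
    if s.get('total_bytes_str') is not None:
        msg_template += ' of %(total_bytes_str)s'
    if s.get('elapsed_str') is not None:
        msg_template += ' in %(elapsed_str)s'
    return msg_template % s

print(finished_line({'total_bytes_str': '10.00KiB', 'elapsed_str': '00:03'}))
# 100% of 10.00KiB in 00:03
print(finished_line({}))  # 100%
```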
```diff
@@ -1,10 +1,9 @@
 from __future__ import unicode_literals
 
 import os.path
-import re
 import subprocess
 import sys
-import time
+import re
 
 from .common import FileDownloader
 from ..compat import (
@@ -31,7 +30,6 @@ class ExternalFD(FileDownloader):
         tmpfilename = self.temp_name(filename)
 
         try:
-            started = time.time()
             retval = self._call_downloader(tmpfilename, info_dict)
         except KeyboardInterrupt:
             if not info_dict.get('is_live'):
@@ -43,20 +41,15 @@ class ExternalFD(FileDownloader):
             self.to_screen('[%s] Interrupted by user' % self.get_basename())
 
         if retval == 0:
-            status = {
-                'filename': filename,
-                'status': 'finished',
-                'elapsed': time.time() - started,
-            }
-            if filename != '-':
             fsize = os.path.getsize(encodeFilename(tmpfilename))
             self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
             self.try_rename(tmpfilename, filename)
-            status.update({
+            self._hook_progress({
                 'downloaded_bytes': fsize,
                 'total_bytes': fsize,
+                'filename': filename,
+                'status': 'finished',
             })
-            self._hook_progress(status)
             return True
         else:
             self.to_stderr('\n')
```
@@ -1,12 +1,12 @@
 from __future__ import division, unicode_literals
 
+import base64
 import io
 import itertools
 import time
 
 from .fragment import FragmentFD
 from ..compat import (
-    compat_b64decode,
     compat_etree_fromstring,
     compat_urlparse,
     compat_urllib_error,
@@ -312,7 +312,7 @@ class F4mFD(FragmentFD):
             boot_info = self._get_bootstrap_from_url(bootstrap_url)
         else:
             bootstrap_url = None
-            bootstrap = compat_b64decode(node.text)
+            bootstrap = base64.b64decode(node.text.encode('ascii'))
             boot_info = read_bootstrap_info(bootstrap)
         return boot_info, bootstrap_url
 
@@ -349,7 +349,7 @@ class F4mFD(FragmentFD):
         live = boot_info['live']
         metadata_node = media.find(_add_ns('metadata'))
         if metadata_node is not None:
-            metadata = compat_b64decode(metadata_node.text)
+            metadata = base64.b64decode(metadata_node.text.encode('ascii'))
         else:
             metadata = None
 
@@ -241,16 +241,12 @@ class FragmentFD(FileDownloader):
         if os.path.isfile(ytdl_filename):
             os.remove(ytdl_filename)
         elapsed = time.time() - ctx['started']
-
-        if ctx['tmpfilename'] == '-':
-            downloaded_bytes = ctx['complete_frags_downloaded_bytes']
-        else:
-            self.try_rename(ctx['tmpfilename'], ctx['filename'])
-            downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
+        self.try_rename(ctx['tmpfilename'], ctx['filename'])
+        fsize = os.path.getsize(encodeFilename(ctx['filename']))
 
         self._hook_progress({
-            'downloaded_bytes': downloaded_bytes,
-            'total_bytes': downloaded_bytes,
+            'downloaded_bytes': fsize,
+            'total_bytes': fsize,
             'filename': ctx['filename'],
             'status': 'finished',
             'elapsed': elapsed,
@@ -75,9 +75,8 @@ class HlsFD(FragmentFD):
             fd.add_progress_hook(ph)
             return fd.real_download(filename, info_dict)
 
-        def is_ad_fragment(s):
-            return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s or
-                    s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
+        def anvato_ad(s):
+            return s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
 
         media_frags = 0
         ad_frags = 0
@@ -87,7 +86,7 @@ class HlsFD(FragmentFD):
             if not line:
                 continue
             if line.startswith('#'):
-                if is_ad_fragment(line):
+                if anvato_ad(line):
                     ad_frags += 1
                     ad_frag_next = True
                 continue
@@ -196,7 +195,7 @@ class HlsFD(FragmentFD):
                         'start': sub_range_start,
                         'end': sub_range_start + int(splitted_byte_range[0]),
                     }
-                elif is_ad_fragment(line):
+                elif anvato_ad(line):
                     ad_frag_next = True
 
         self._finish_frag_download(ctx)
@@ -4,18 +4,13 @@ import errno
 import os
 import socket
 import time
-import random
 import re
 
 from .common import FileDownloader
-from ..compat import (
-    compat_str,
-    compat_urllib_error,
-)
+from ..compat import compat_urllib_error
 from ..utils import (
     ContentTooShortError,
     encodeFilename,
-    int_or_none,
     sanitize_open,
     sanitized_Request,
     write_xattr,
@@ -43,26 +38,21 @@ class HttpFD(FileDownloader):
         add_headers = info_dict.get('http_headers')
         if add_headers:
             headers.update(add_headers)
+        basic_request = sanitized_Request(url, None, headers)
+        request = sanitized_Request(url, None, headers)
 
         is_test = self.params.get('test', False)
-        chunk_size = self._TEST_FILE_SIZE if is_test else (
-            info_dict.get('downloader_options', {}).get('http_chunk_size') or
-            self.params.get('http_chunk_size') or 0)
+
+        if is_test:
+            request.add_header('Range', 'bytes=0-%s' % str(self._TEST_FILE_SIZE - 1))
 
         ctx.open_mode = 'wb'
         ctx.resume_len = 0
-        ctx.data_len = None
-        ctx.block_size = self.params.get('buffersize', 1024)
-        ctx.start_time = time.time()
-        ctx.chunk_size = None
 
         if self.params.get('continuedl', True):
             # Establish possible resume length
             if os.path.isfile(encodeFilename(ctx.tmpfilename)):
-                ctx.resume_len = os.path.getsize(
-                    encodeFilename(ctx.tmpfilename))
-
-        ctx.is_resume = ctx.resume_len > 0
+                ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename))
 
         count = 0
         retries = self.params.get('retries', 0)
@@ -74,36 +64,11 @@ class HttpFD(FileDownloader):
             def __init__(self, source_error):
                 self.source_error = source_error
 
-        class NextFragment(Exception):
-            pass
-
-        def set_range(req, start, end):
-            range_header = 'bytes=%d-' % start
-            if end:
-                range_header += compat_str(end)
-            req.add_header('Range', range_header)
-
         def establish_connection():
-            ctx.chunk_size = (random.randint(int(chunk_size * 0.95), chunk_size)
-                              if not is_test and chunk_size else chunk_size)
-            if ctx.resume_len > 0:
-                range_start = ctx.resume_len
-                if ctx.is_resume:
-                    self.report_resuming_byte(ctx.resume_len)
+            if ctx.resume_len != 0:
+                self.report_resuming_byte(ctx.resume_len)
+                request.add_header('Range', 'bytes=%d-' % ctx.resume_len)
                 ctx.open_mode = 'ab'
-            elif ctx.chunk_size > 0:
-                range_start = 0
-            else:
-                range_start = None
-            ctx.is_resume = False
-            range_end = range_start + ctx.chunk_size - 1 if ctx.chunk_size else None
-            if range_end and ctx.data_len is not None and range_end >= ctx.data_len:
-                range_end = ctx.data_len - 1
-            has_range = range_start is not None
-            ctx.has_range = has_range
-            request = sanitized_Request(url, None, headers)
-            if has_range:
-                set_range(request, range_start, range_end)
             # Establish connection
             try:
                 ctx.data = self.ydl.urlopen(request)
@@ -112,24 +77,12 @@ class HttpFD(FileDownloader):
             # that don't support resuming and serve a whole file with no Content-Range
             # set in response despite of requested Range (see
             # https://github.com/rg3/youtube-dl/issues/6057#issuecomment-126129799)
-            if has_range:
+            if ctx.resume_len > 0:
                 content_range = ctx.data.headers.get('Content-Range')
                 if content_range:
-                    content_range_m = re.search(r'bytes (\d+)-(\d+)?(?:/(\d+))?', content_range)
+                    content_range_m = re.search(r'bytes (\d+)-', content_range)
                     # Content-Range is present and matches requested Range, resume is possible
-                    if content_range_m:
-                        if range_start == int(content_range_m.group(1)):
-                            content_range_end = int_or_none(content_range_m.group(2))
-                            content_len = int_or_none(content_range_m.group(3))
-                            accept_content_len = (
-                                # Non-chunked download
-                                not ctx.chunk_size or
-                                # Chunked download and requested piece or
-                                # its part is promised to be served
-                                content_range_end == range_end or
-                                content_len < range_end)
-                            if accept_content_len:
-                                ctx.data_len = content_len
-                                return
+                    if content_range_m and ctx.resume_len == int(content_range_m.group(1)):
+                        return
                 # Content-Range is either not present or invalid. Assuming remote webserver is
                 # trying to send the whole file, resume is not possible, so wiping the local file
@@ -137,15 +90,16 @@ class HttpFD(FileDownloader):
                 self.report_unable_to_resume()
                 ctx.resume_len = 0
                 ctx.open_mode = 'wb'
-                ctx.data_len = int_or_none(ctx.data.info().get('Content-length', None))
                 return
             except (compat_urllib_error.HTTPError, ) as err:
-                if err.code == 416:
+                if (err.code < 500 or err.code >= 600) and err.code != 416:
+                    # Unexpected HTTP error
+                    raise
+                elif err.code == 416:
                     # Unable to resume (requested range not satisfiable)
                     try:
                         # Open the connection again without the range header
-                        ctx.data = self.ydl.urlopen(
-                            sanitized_Request(url, None, headers))
+                        ctx.data = self.ydl.urlopen(basic_request)
                         content_length = ctx.data.info()['Content-Length']
                     except (compat_urllib_error.HTTPError, ) as err:
                         if err.code < 500 or err.code >= 600:
@@ -176,9 +130,6 @@ class HttpFD(FileDownloader):
                     ctx.resume_len = 0
                     ctx.open_mode = 'wb'
                     return
-                elif err.code < 500 or err.code >= 600:
-                    # Unexpected HTTP error
-                    raise
                 raise RetryDownload(err)
             except socket.error as err:
                 if err.errno != errno.ECONNRESET:
@@ -209,7 +160,7 @@ class HttpFD(FileDownloader):
                 return False
 
         byte_counter = 0 + ctx.resume_len
-        block_size = ctx.block_size
+        block_size = self.params.get('buffersize', 1024)
         start = time.time()
 
         # measure time over whole while-loop, so slow_down() and best_block_size() work together properly
@@ -282,30 +233,25 @@ class HttpFD(FileDownloader):
 
             # Progress message
             speed = self.calc_speed(start, now, byte_counter - ctx.resume_len)
-            if ctx.data_len is None:
+            if data_len is None:
                 eta = None
             else:
-                eta = self.calc_eta(start, time.time(), ctx.data_len - ctx.resume_len, byte_counter - ctx.resume_len)
+                eta = self.calc_eta(start, time.time(), data_len - ctx.resume_len, byte_counter - ctx.resume_len)
 
             self._hook_progress({
                 'status': 'downloading',
                 'downloaded_bytes': byte_counter,
-                'total_bytes': ctx.data_len,
+                'total_bytes': data_len,
                 'tmpfilename': ctx.tmpfilename,
                 'filename': ctx.filename,
                 'eta': eta,
                 'speed': speed,
-                'elapsed': now - ctx.start_time,
+                'elapsed': now - start,
             })
 
             if is_test and byte_counter == data_len:
                 break
 
-        if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:
-            ctx.resume_len = byte_counter
-            # ctx.block_size = block_size
-            raise NextFragment()
-
         if ctx.stream is None:
             self.to_stderr('\n')
             self.report_error('Did not get any data blocks')
@@ -330,7 +276,7 @@ class HttpFD(FileDownloader):
             'total_bytes': byte_counter,
             'filename': ctx.filename,
             'status': 'finished',
-            'elapsed': time.time() - ctx.start_time,
+            'elapsed': time.time() - start,
         })
 
         return True
@@ -344,8 +290,6 @@ class HttpFD(FileDownloader):
                 if count <= retries:
                     self.report_retry(e.source_error, count, retries)
                 continue
-            except NextFragment:
-                continue
             except SucceedDownload:
                 return True
 
@@ -1,27 +1,25 @@
 from __future__ import unicode_literals
 
 import time
+import struct
 import binascii
 import io
 
 from .fragment import FragmentFD
-from ..compat import (
-    compat_Struct,
-    compat_urllib_error,
-)
+from ..compat import compat_urllib_error
 
 
-u8 = compat_Struct('>B')
-u88 = compat_Struct('>Bx')
-u16 = compat_Struct('>H')
-u1616 = compat_Struct('>Hxx')
-u32 = compat_Struct('>I')
-u64 = compat_Struct('>Q')
+u8 = struct.Struct(b'>B')
+u88 = struct.Struct(b'>Bx')
+u16 = struct.Struct(b'>H')
+u1616 = struct.Struct(b'>Hxx')
+u32 = struct.Struct(b'>I')
+u64 = struct.Struct(b'>Q')
 
-s88 = compat_Struct('>bx')
-s16 = compat_Struct('>h')
-s1616 = compat_Struct('>hxx')
-s32 = compat_Struct('>i')
+s88 = struct.Struct(b'>bx')
+s16 = struct.Struct(b'>h')
+s1616 = struct.Struct(b'>hxx')
+s32 = struct.Struct(b'>i')
 
 unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000)
 
@@ -141,7 +139,7 @@ def write_piff_header(stream, params):
     sample_entry_payload += u16.pack(0x18)  # depth
     sample_entry_payload += s16.pack(-1)  # pre defined
 
-    codec_private_data = binascii.unhexlify(params['codec_private_data'].encode('utf-8'))
+    codec_private_data = binascii.unhexlify(params['codec_private_data'])
     if fourcc in ('H264', 'AVC1'):
         sps, pps = codec_private_data.split(u32.pack(1))[1:]
         avcc_payload = u8.pack(1)  # configuration version
@@ -66,7 +66,7 @@ class AbcNewsIE(InfoExtractor):
     _TESTS = [{
         'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
         'info_dict': {
-            'id': '10505354',
+            'id': '10498713',
             'ext': 'flv',
             'display_id': 'dramatic-video-rare-death-job-america',
             'title': 'Occupational Hazards',
@@ -79,7 +79,7 @@ class AbcNewsIE(InfoExtractor):
     }, {
         'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
         'info_dict': {
-            'id': '38897857',
+            'id': '39125818',
             'ext': 'mp4',
             'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
             'title': 'Justin Timberlake Drops Hints For Secret Single',
@@ -1,15 +1,13 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import base64
 import json
 import os
 
 from .common import InfoExtractor
 from ..aes import aes_cbc_decrypt
-from ..compat import (
-    compat_b64decode,
-    compat_ord,
-)
+from ..compat import compat_ord
 from ..utils import (
     bytes_to_intlist,
     ExtractorError,
@@ -50,9 +48,9 @@ class ADNIE(InfoExtractor):
 
         # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
         dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
-            bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
-            bytes_to_intlist(b'\xc8\x6e\x06\xbc\xbe\xc6\x49\xf5\x88\x0d\xc8\x47\xc4\x27\x0c\x60'),
-            bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
+            bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
+            bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
+            bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
         ))
         subtitles_json = self._parse_json(
             dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(),
@@ -107,18 +105,15 @@ class ADNIE(InfoExtractor):
 
         options = player_config.get('options') or {}
         metas = options.get('metas') or {}
+        title = metas.get('title') or video_info['title']
         links = player_config.get('links') or {}
-        sub_path = player_config.get('subtitles')
         error = None
         if not links:
-            links_url = player_config.get('linksurl') or options['videoUrl']
+            links_url = player_config['linksurl']
             links_data = self._download_json(urljoin(
                 self._BASE_URL, links_url), video_id)
             links = links_data.get('links') or {}
-            metas = metas or links_data.get('meta') or {}
-            sub_path = sub_path or links_data.get('subtitles')
            error = links_data.get('error')
-            title = metas.get('title') or video_info['title']
 
         formats = []
         for format_id, qualities in links.items():
@@ -149,7 +144,7 @@ class ADNIE(InfoExtractor):
             'description': strip_or_none(metas.get('summary') or video_info.get('resume')),
             'thumbnail': video_info.get('image'),
             'formats': formats,
-            'subtitles': self.extract_subtitles(sub_path, video_id),
+            'subtitles': self.extract_subtitles(player_config.get('subtitles'), video_id),
             'episode': metas.get('subtitle') or video_info.get('videoTitle'),
             'series': video_info.get('playlistTitle'),
         }
@@ -122,8 +122,7 @@ class AENetworksIE(AENetworksBaseIE):
 
         query = {
             'mbr': 'true',
-            'assetTypes': 'high_video_ak',
-            'switch': 'hls_high_ak',
+            'assetTypes': 'high_video_s3'
         }
         video_id = self._html_search_meta('aetn:VideoID', webpage)
         media_url = self._search_regex(
@@ -175,27 +175,10 @@ class AfreecaTVIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
 
-        webpage = self._download_webpage(url, video_id)
-
-        if re.search(r'alert\(["\']This video has been deleted', webpage):
-            raise ExtractorError(
-                'Video %s has been deleted' % video_id, expected=True)
-
-        station_id = self._search_regex(
-            r'nStationNo\s*=\s*(\d+)', webpage, 'station')
-        bbs_id = self._search_regex(
-            r'nBbsNo\s*=\s*(\d+)', webpage, 'bbs')
-        video_id = self._search_regex(
-            r'nTitleNo\s*=\s*(\d+)', webpage, 'title', default=video_id)
-        print(video_id, station_id, bbs_id)
         video_xml = self._download_xml(
             'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
-            video_id, headers={
-                'Referer': url,
-            }, query={
+            video_id, query={
                 'nTitleNo': video_id,
-                'nStationNo': station_id,
-                'nBbsNo': bbs_id,
                 'partialView': 'SKIP_ADULT',
             })
 
@@ -204,10 +187,10 @@ class AfreecaTVIE(InfoExtractor):
             raise ExtractorError(
                 '%s said: %s' % (self.IE_NAME, flag), expected=True)
 
-        video_element = video_xml.findall(compat_xpath('./track/video'))[-1]
+        video_element = video_xml.findall(compat_xpath('./track/video'))[1]
         if video_element is None or video_element.text is None:
-            raise ExtractorError(
-                'Video %s video does not exist' % video_id, expected=True)
+            raise ExtractorError('Specified AfreecaTV video does not exist',
+                                 expected=True)
 
         video_url = video_element.text.strip()
 
@@ -11,7 +11,7 @@ from ..utils import (
 
 
 class AMCNetworksIE(ThePlatformIE):
-    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
         'md5': '',
@@ -51,9 +51,6 @@ class AMCNetworksIE(ThePlatformIE):
     }, {
         'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
         'only_matching': True,
-    }, {
-        'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
-        'only_matching': True,
     }]
 
     def _real_extract(self, url):
@@ -41,7 +41,7 @@ class ArchiveOrgIE(InfoExtractor):
         webpage = self._download_webpage(
             'http://archive.org/embed/' + video_id, video_id)
         jwplayer_playlist = self._parse_json(self._search_regex(
-            r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)",
+            r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\);",
             webpage, 'jwplayer playlist'), video_id)
         info = self._parse_jwplayer_data(
             {'playlist': jwplayer_playlist}, video_id, base_url=url)
@@ -24,30 +24,57 @@ class ARDMediathekIE(InfoExtractor):
     _VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
 
     _TESTS = [{
-        # available till 26.07.2022
-        'url': 'http://www.ardmediathek.de/tv/S%C3%9CDLICHT/Was-ist-die-Kunst-der-Zukunft-liebe-Ann/BR-Fernsehen/Video?bcastId=34633636&documentId=44726822',
+        'url': 'http://www.ardmediathek.de/tv/Dokumentation-und-Reportage/Ich-liebe-das-Leben-trotzdem/rbb-Fernsehen/Video?documentId=29582122&bcastId=3822114',
         'info_dict': {
-            'id': '44726822',
+            'id': '29582122',
             'ext': 'mp4',
-            'title': 'Was ist die Kunst der Zukunft, liebe Anna McCarthy?',
-            'description': 'md5:4ada28b3e3b5df01647310e41f3a62f5',
-            'duration': 1740,
+            'title': 'Ich liebe das Leben trotzdem',
+            'description': 'md5:45e4c225c72b27993314b31a84a5261c',
+            'duration': 4557,
         },
         'params': {
             # m3u8 download
             'skip_download': True,
-        }
+        },
+        'skip': 'HTTP Error 404: Not Found',
+    }, {
+        'url': 'http://www.ardmediathek.de/tv/Tatort/Tatort-Scheinwelten-H%C3%B6rfassung-Video/Das-Erste/Video?documentId=29522730&bcastId=602916',
+        'md5': 'f4d98b10759ac06c0072bbcd1f0b9e3e',
+        'info_dict': {
+            'id': '29522730',
+            'ext': 'mp4',
+            'title': 'Tatort: Scheinwelten - Hörfassung (Video tgl. ab 20 Uhr)',
+            'description': 'md5:196392e79876d0ac94c94e8cdb2875f1',
+            'duration': 5252,
+        },
+        'skip': 'HTTP Error 404: Not Found',
     }, {
         # audio
         'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
-        'only_matching': True,
+        'md5': '219d94d8980b4f538c7fcb0865eb7f2c',
+        'info_dict': {
+            'id': '28488308',
+            'ext': 'mp3',
+            'title': 'Tod eines Fußballers',
+            'description': 'md5:f6e39f3461f0e1f54bfa48c8875c86ef',
+            'duration': 3240,
+        },
+        'skip': 'HTTP Error 404: Not Found',
     }, {
         'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
         'only_matching': True,
     }, {
         # audio
         'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
-        'only_matching': True,
+        'md5': '4e8f00631aac0395fee17368ac0e9867',
+        'info_dict': {
+            'id': '30796318',
+            'ext': 'mp3',
+            'title': 'Vor dem Fest',
+            'description': 'md5:c0c1c8048514deaed2a73b3a60eecacb',
+            'duration': 3287,
+        },
+        'skip': 'Video is no longer available',
     }]
 
     def _extract_media_info(self, media_info_url, webpage, video_id):
@@ -225,23 +252,20 @@ class ARDIE(InfoExtractor):
 
 class ARDIE(InfoExtractor):
     _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
-    _TESTS = [{
-        # available till 14.02.2019
-        'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html',
-        'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
+    _TEST = {
+        'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
+        'md5': 'd216c3a86493f9322545e045ddc3eb35',
         'info_dict': {
-            'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video',
-            'id': '102',
+            'display_id': 'die-story-im-ersten-mission-unter-falscher-flagge',
+            'id': '100',
             'ext': 'mp4',
-            'duration': 4435.0,
-            'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?',
-            'upload_date': '20180214',
+            'duration': 2600,
+            'title': 'Die Story im Ersten: Mission unter falscher Flagge',
|
'upload_date': '20140804',
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
},
|
},
|
||||||
}, {
|
'skip': 'HTTP Error 404: Not Found',
|
||||||
'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
|
}
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
mobj = re.match(self._VALID_URL, url)
|
||||||
|
@@ -1,13 +1,11 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import base64
 import re

 from .common import InfoExtractor
-from ..compat import (
-    compat_b64decode,
-    compat_urllib_parse_unquote,
-)
+from ..compat import compat_urllib_parse_unquote


 class BigflixIE(InfoExtractor):
@@ -41,8 +39,8 @@ class BigflixIE(InfoExtractor):
             webpage, 'title')

         def decode_url(quoted_b64_url):
-            return compat_b64decode(compat_urllib_parse_unquote(
-                quoted_b64_url)).decode('utf-8')
+            return base64.b64decode(compat_urllib_parse_unquote(
+                quoted_b64_url).encode('ascii')).decode('utf-8')

         formats = []
         for height, encoded_url in re.findall(
@@ -27,14 +27,14 @@ class BiliBiliIE(InfoExtractor):

     _TESTS = [{
         'url': 'http://www.bilibili.tv/video/av1074402/',
-        'md5': '5f7d29e1a2872f3df0cf76b1f87d3788',
+        'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
         'info_dict': {
             'id': '1074402',
-            'ext': 'flv',
+            'ext': 'mp4',
             'title': '【金坷垃】金泡沫',
             'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
-            'duration': 308.067,
-            'timestamp': 1398012678,
+            'duration': 308.315,
+            'timestamp': 1398012660,
             'upload_date': '20140420',
             'thumbnail': r're:^https?://.+\.jpg',
             'uploader': '菊子桑',
@@ -59,38 +59,17 @@ class BiliBiliIE(InfoExtractor):
         'url': 'http://www.bilibili.com/video/av8903802/',
         'info_dict': {
             'id': '8903802',
+            'ext': 'mp4',
             'title': '阿滴英文|英文歌分享#6 "Closer',
             'description': '滴妹今天唱Closer給你聽! 有史以来,被推最多次也是最久的歌曲,其实歌词跟我原本想像差蛮多的,不过还是好听! 微博@阿滴英文',
-        },
-        'playlist': [{
-            'info_dict': {
-                'id': '8903802_part1',
-                'ext': 'flv',
-                'title': '阿滴英文|英文歌分享#6 "Closer',
-                'description': 'md5:3b1b9e25b78da4ef87e9b548b88ee76a',
             'uploader': '阿滴英文',
             'uploader_id': '65880958',
-            'timestamp': 1488382634,
+            'timestamp': 1488382620,
             'upload_date': '20170301',
         },
         'params': {
             'skip_download': True,  # Test metadata only
         },
-    }, {
-        'info_dict': {
-            'id': '8903802_part2',
-            'ext': 'flv',
-            'title': '阿滴英文|英文歌分享#6 "Closer',
-            'description': 'md5:3b1b9e25b78da4ef87e9b548b88ee76a',
-            'uploader': '阿滴英文',
-            'uploader_id': '65880958',
-            'timestamp': 1488382634,
-            'upload_date': '20170301',
-        },
-        'params': {
-            'skip_download': True,  # Test metadata only
-        },
-    }]
     }]

     _APP_KEY = '84956560bc028eb7'
@@ -113,13 +92,9 @@ class BiliBiliIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)

         if 'anime/' not in url:
-            cid = self._search_regex(
-                r'cid(?:["\']:|=)(\d+)', webpage, 'cid',
-                default=None
-            ) or compat_parse_qs(self._search_regex(
-                [r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
-                 r'EmbedPlayer\([^)]+,\s*\\"([^"]+)\\"\)',
-                 r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
+            cid = compat_parse_qs(self._search_regex(
+                [r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
+                 r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
                 webpage, 'player parameters'))['cid'][0]
         else:
             if 'no_bangumi_tip' not in smuggled_data:
@@ -127,7 +102,6 @@ class BiliBiliIE(InfoExtractor):
                     video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
             headers = {
                 'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
-                'Referer': url
             }
             headers.update(self.geo_verification_headers())

@@ -139,31 +113,19 @@ class BiliBiliIE(InfoExtractor):
                 self._report_error(js)
             cid = js['result']['cid']

-        headers = {
-            'Referer': url
-        }
-        headers.update(self.geo_verification_headers())
-
-        entries = []
-
-        RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
-        for num, rendition in enumerate(RENDITIONS, start=1):
-            payload = 'appkey=%s&cid=%s&otype=json&%s' % (self._APP_KEY, cid, rendition)
+        payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
         sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest()

         video_info = self._download_json(
-            'http://interface.bilibili.com/v2/playurl?%s&sign=%s' % (payload, sign),
+            'http://interface.bilibili.com/playurl?%s&sign=%s' % (payload, sign),
             video_id, note='Downloading video info page',
-            headers=headers, fatal=num == len(RENDITIONS))
-
-        if not video_info:
-            continue
+            headers=self.geo_verification_headers())

         if 'durl' not in video_info:
-            if num < len(RENDITIONS):
-                continue
             self._report_error(video_info)

+        entries = []
+
         for idx, durl in enumerate(video_info['durl']):
             formats = [{
                 'url': durl['url'],
@@ -188,17 +150,11 @@ class BiliBiliIE(InfoExtractor):
                 'duration': float_or_none(durl.get('length'), 1000),
                 'formats': formats,
             })
-            break

-        title = self._html_search_regex(
-            ('<h1[^>]+\btitle=(["\'])(?P<title>(?:(?!\1).)+)\1',
-             '(?s)<h1[^>]*>(?P<title>.+?)</h1>'), webpage, 'title',
-            group='title')
+        title = self._html_search_regex('<h1[^>]*>([^<]+)</h1>', webpage, 'title')
         description = self._html_search_meta('description', webpage)
         timestamp = unified_timestamp(self._html_search_regex(
-            r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time',
-            default=None) or self._html_search_meta(
-            'uploadDate', webpage, 'timestamp', default=None))
+            r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None))
         thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)

         # TODO 'view_count' requires deobfuscating Javascript
@@ -212,16 +168,13 @@ class BiliBiliIE(InfoExtractor):
         }

         uploader_mobj = re.search(
-            r'<a[^>]+href="(?:https?:)?//space\.bilibili\.com/(?P<id>\d+)"[^>]*>(?P<name>[^<]+)',
+            r'<a[^>]+href="(?:https?:)?//space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
             webpage)
         if uploader_mobj:
             info.update({
                 'uploader': uploader_mobj.group('name'),
                 'uploader_id': uploader_mobj.group('id'),
             })
-        if not info.get('uploader'):
-            info['uploader'] = self._html_search_meta(
-                'author', webpage, 'uploader', default=None)

         for entry in entries:
             entry.update(info)
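The BiliBili hunk above signs its playurl request by hashing the query payload plus an app secret with MD5. A minimal standalone sketch of that signing step follows; the `APP_KEY` value appears in the extractor as `_APP_KEY`, but `SECRET` here is a placeholder, not the real `_BILIBILI_KEY`:

```python
import hashlib

APP_KEY = '84956560bc028eb7'  # _APP_KEY from the extractor
SECRET = 'example-secret'     # placeholder; the real _BILIBILI_KEY is not shown here


def sign_payload(cid, app_key=APP_KEY, secret=SECRET):
    # Build the query string the same way the extractor does, append the
    # secret, and take the hex MD5 digest as the 'sign' parameter.
    payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (app_key, cid)
    sign = hashlib.md5((payload + secret).encode('utf-8')).hexdigest()
    return '%s&sign=%s' % (payload, sign)
```

The signed string would then be appended to `http://interface.bilibili.com/playurl?`, as in the hunk above.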
@@ -564,7 +564,7 @@ class BrightcoveNewIE(AdobePassIE):

         return entries

-    def _parse_brightcove_metadata(self, json_data, video_id, headers={}):
+    def _parse_brightcove_metadata(self, json_data, video_id):
         title = json_data['name'].strip()

         formats = []
@@ -638,9 +638,6 @@ class BrightcoveNewIE(AdobePassIE):

         self._sort_formats(formats)

-        for f in formats:
-            f.setdefault('http_headers', {}).update(headers)
-
         subtitles = {}
         for text_track in json_data.get('text_tracks', []):
             if text_track.get('src'):
@@ -693,17 +690,10 @@ class BrightcoveNewIE(AdobePassIE):
             webpage, 'policy key', group='pk')

         api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id)
-        headers = {
-            'Accept': 'application/json;pk=%s' % policy_key,
-        }
-        referrer = smuggled_data.get('referrer')
-        if referrer:
-            headers.update({
-                'Referer': referrer,
-                'Origin': re.search(r'https?://[^/]+', referrer).group(0),
-            })
         try:
-            json_data = self._download_json(api_url, video_id, headers=headers)
+            json_data = self._download_json(api_url, video_id, headers={
+                'Accept': 'application/json;pk=%s' % policy_key
+            })
         except ExtractorError as e:
             if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
                 json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
@@ -727,5 +717,4 @@ class BrightcoveNewIE(AdobePassIE):
                 'tveToken': tve_token,
             })

-        return self._parse_brightcove_metadata(
-            json_data, video_id, headers=headers)
+        return self._parse_brightcove_metadata(json_data, video_id)
@@ -31,10 +31,6 @@ class Canalc2IE(InfoExtractor):
         webpage = self._download_webpage(
             'http://www.canalc2.tv/video/%s' % video_id, video_id)

-        title = self._html_search_regex(
-            r'(?s)class="[^"]*col_description[^"]*">.*?<h3>(.+?)</h3>',
-            webpage, 'title')
-
         formats = []
         for _, video_url in re.findall(r'file\s*=\s*(["\'])(.+?)\1', webpage):
             if video_url.startswith('rtmp://'):
@@ -53,21 +49,17 @@ class Canalc2IE(InfoExtractor):
                 'url': video_url,
                 'format_id': 'http',
             })
+        self._sort_formats(formats)

-        if formats:
-            info = {
-                'formats': formats,
-            }
-        else:
-            info = self._parse_html5_media_entries(url, webpage, url)[0]
+        title = self._html_search_regex(
+            r'(?s)class="[^"]*col_description[^"]*">.*?<h3>(.*?)</h3>', webpage, 'title')
+        duration = parse_duration(self._search_regex(
+            r'id=["\']video_duree["\'][^>]*>([^<]+)',
+            webpage, 'duration', fatal=False))

-        self._sort_formats(info['formats'])
-
-        info.update({
+        return {
             'id': video_id,
             'title': title,
-            'duration': parse_duration(self._search_regex(
-                r'id=["\']video_duree["\'][^>]*>([^<]+)',
-                webpage, 'duration', fatal=False)),
-        })
-        return info
+            'duration': duration,
+            'formats': formats,
+        }
@@ -4,36 +4,59 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_urllib_parse_urlparse
 from ..utils import (
+    dict_get,
     # ExtractorError,
     # HEADRequest,
     int_or_none,
     qualities,
+    remove_end,
     unified_strdate,
 )


 class CanalplusIE(InfoExtractor):
-    IE_DESC = 'mycanal.fr and piwiplus.fr'
-    _VALID_URL = r'https?://(?:www\.)?(?P<site>mycanal|piwiplus)\.fr/(?:[^/]+/)*(?P<display_id>[^?/]+)(?:\.html\?.*\bvid=|/p/)(?P<id>\d+)'
+    IE_DESC = 'canalplus.fr, piwiplus.fr and d8.tv'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:
+                            (?:
+                                (?:(?:www|m)\.)?canalplus\.fr|
+                                (?:www\.)?piwiplus\.fr|
+                                (?:www\.)?d8\.tv|
+                                (?:www\.)?c8\.fr|
+                                (?:www\.)?d17\.tv|
+                                (?:(?:football|www)\.)?cstar\.fr|
+                                (?:www\.)?itele\.fr
+                            )/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
+                            player\.canalplus\.fr/#/(?P<id>\d+)
+                        )
+
+                '''
     _VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s?format=json'
     _SITE_ID_MAP = {
-        'mycanal': 'cplus',
+        'canalplus': 'cplus',
         'piwiplus': 'teletoon',
+        'd8': 'd8',
+        'c8': 'd8',
+        'd17': 'd17',
+        'cstar': 'd17',
+        'itele': 'itele',
     }

     # Only works for direct mp4 URLs
     _GEO_COUNTRIES = ['FR']

     _TESTS = [{
-        'url': 'https://www.mycanal.fr/d17-emissions/lolywood/p/1397061',
+        'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1192814',
         'info_dict': {
-            'id': '1397061',
-            'display_id': 'lolywood',
+            'id': '1405510',
+            'display_id': 'pid1830-c-zapping',
             'ext': 'mp4',
-            'title': 'Euro 2016 : Je préfère te prévenir - Lolywood - Episode 34',
-            'description': 'md5:7d97039d455cb29cdba0d652a0efaa5e',
-            'upload_date': '20160602',
+            'title': 'Zapping - 02/07/2016',
+            'description': 'Le meilleur de toutes les chaînes, tous les jours',
+            'upload_date': '20160702',
         },
     }, {
         # geo restricted, bypassed
@@ -47,12 +70,64 @@ class CanalplusIE(InfoExtractor):
             'upload_date': '20140724',
         },
         'expected_warnings': ['HTTP Error 403: Forbidden'],
+    }, {
+        # geo restricted, bypassed
+        'url': 'http://www.c8.fr/c8-divertissement/ms-touche-pas-a-mon-poste/pid6318-videos-integrales.html?vid=1443684',
+        'md5': 'bb6f9f343296ab7ebd88c97b660ecf8d',
+        'info_dict': {
+            'id': '1443684',
+            'display_id': 'pid6318-videos-integrales',
+            'ext': 'mp4',
+            'title': 'Guess my iep ! - TPMP - 07/04/2017',
+            'description': 'md5:6f005933f6e06760a9236d9b3b5f17fa',
+            'upload_date': '20170407',
+        },
+        'expected_warnings': ['HTTP Error 403: Forbidden'],
+    }, {
+        'url': 'http://www.itele.fr/chroniques/invite-michael-darmon/rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
+        'info_dict': {
+            'id': '1420176',
+            'display_id': 'rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
+            'ext': 'mp4',
+            'title': 'L\'invité de Michaël Darmon du 14/10/2016 - ',
+            'description': 'Chaque matin du lundi au vendredi, Michaël Darmon reçoit un invité politique à 8h25.',
+            'upload_date': '20161014',
+        },
+    }, {
+        'url': 'http://football.cstar.fr/cstar-minisite-foot/pid7566-feminines-videos.html?vid=1416769',
+        'info_dict': {
+            'id': '1416769',
+            'display_id': 'pid7566-feminines-videos',
+            'ext': 'mp4',
+            'title': 'France - Albanie : les temps forts de la soirée - 20/09/2016',
+            'description': 'md5:c3f30f2aaac294c1c969b3294de6904e',
+            'upload_date': '20160921',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        'url': 'http://m.canalplus.fr/?vid=1398231',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.d17.tv/emissions/pid8303-lolywood.html?vid=1397061',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
-        site, display_id, video_id = re.match(self._VALID_URL, url).groups()
+        mobj = re.match(self._VALID_URL, url)

-        site_id = self._SITE_ID_MAP[site]
+        site_id = self._SITE_ID_MAP[compat_urllib_parse_urlparse(url).netloc.rsplit('.', 2)[-2]]
+
+        # Beware, some subclasses do not define an id group
+        display_id = remove_end(dict_get(mobj.groupdict(), ('display_id', 'id', 'vid')), '.html')
+
+        webpage = self._download_webpage(url, display_id)
+        video_id = self._search_regex(
+            [r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
+             r'id=["\']canal_video_player(?P<id>\d+)',
+             r'data-video=["\'](?P<id>\d+)'],
+            webpage, 'video id', default=mobj.group('vid'), group='id')

         info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)
         video_data = self._download_json(info_url, video_id, 'Downloading video JSON')
@@ -86,7 +161,7 @@ class CanalplusIE(InfoExtractor):
                     format_url + '?hdcore=2.11.3', video_id, f4m_id=format_id, fatal=False))
             else:
                 formats.append({
-                    # the secret extracted from ya function in http://player.canalplus.fr/common/js/canalPlayer.js
+                    # the secret extracted ya function in http://player.canalplus.fr/common/js/canalPlayer.js
                     'url': format_url + '?secret=pqzerjlsmdkjfoiuerhsdlfknaes',
                     'format_id': format_id,
                     'preference': preference(format_id),
@@ -246,7 +246,7 @@ class VrtNUIE(GigyaBaseIE):
     def _real_extract(self, url):
         display_id = self._match_id(url)

-        webpage, urlh = self._download_webpage_handle(url, display_id)
+        webpage = self._download_webpage(url, display_id)

         title = self._html_search_regex(
             r'(?ms)<h1 class="content__heading">(.+?)</h1>',
@@ -276,7 +276,7 @@ class VrtNUIE(GigyaBaseIE):
             webpage, 'release_date', default=None))

         # If there's a ? or a # in the URL, remove them and everything after
-        clean_url = urlh.geturl().split('?')[0].split('#')[0].strip('/')
+        clean_url = url.split('?')[0].split('#')[0].strip('/')
         securevideo_url = clean_url + '.mssecurevideo.json'

         try:
@@ -1,7 +1,6 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
 import re

 from .common import InfoExtractor
@@ -14,7 +13,6 @@ from ..utils import (
     xpath_element,
     xpath_with_ns,
     find_xpath_attr,
-    parse_duration,
     parse_iso8601,
     parse_age_limit,
     int_or_none,
@@ -361,63 +359,3 @@ class CBCWatchIE(CBCWatchBaseIE):
         video_id = self._match_id(url)
         rss = self._call_api('web/browse/' + video_id, video_id)
         return self._parse_rss_feed(rss)
-
-
-class CBCOlympicsIE(InfoExtractor):
-    IE_NAME = 'cbc.ca:olympics'
-    _VALID_URL = r'https?://olympics\.cbc\.ca/video/[^/]+/(?P<id>[^/?#]+)'
-    _TESTS = [{
-        'url': 'https://olympics.cbc.ca/video/whats-on-tv/olympic-morning-featuring-the-opening-ceremony/',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        video_id = self._hidden_inputs(webpage)['videoId']
-        video_doc = self._download_xml(
-            'https://olympics.cbc.ca/videodata/%s.xml' % video_id, video_id)
-        title = xpath_text(video_doc, 'title', fatal=True)
-        is_live = xpath_text(video_doc, 'kind') == 'Live'
-        if is_live:
-            title = self._live_title(title)
-
-        formats = []
-        for video_source in video_doc.findall('videoSources/videoSource'):
-            uri = xpath_text(video_source, 'uri')
-            if not uri:
-                continue
-            tokenize = self._download_json(
-                'https://olympics.cbc.ca/api/api-akamai/tokenize',
-                video_id, data=json.dumps({
-                    'VideoSource': uri,
-                }).encode(), headers={
-                    'Content-Type': 'application/json',
-                    'Referer': url,
-                    # d3.VideoPlayer._init in https://olympics.cbc.ca/components/script/base.js
-                    'Cookie': '_dvp=TK:C0ObxjerU',  # AKAMAI CDN cookie
-                }, fatal=False)
-            if not tokenize:
-                continue
-            content_url = tokenize['ContentUrl']
-            video_source_format = video_source.get('format')
-            if video_source_format == 'IIS':
-                formats.extend(self._extract_ism_formats(
-                    content_url, video_id, ism_id=video_source_format, fatal=False))
-            else:
-                formats.extend(self._extract_m3u8_formats(
-                    content_url, video_id, 'mp4',
-                    'm3u8' if is_live else 'm3u8_native',
-                    m3u8_id=video_source_format, fatal=False))
-        self._sort_formats(formats)
-
-        return {
-            'id': video_id,
-            'display_id': display_id,
-            'title': title,
-            'description': xpath_text(video_doc, 'description'),
-            'thumbnail': xpath_text(video_doc, 'thumbnailUrl'),
-            'duration': parse_duration(xpath_text(video_doc, 'duration')),
-            'formats': formats,
-            'is_live': is_live,
-        }
@@ -2,7 +2,6 @@ from __future__ import unicode_literals

 from .theplatform import ThePlatformFeedIE
 from ..utils import (
-    ExtractorError,
     int_or_none,
     find_xpath_attr,
     xpath_element,
@@ -62,7 +61,6 @@ class CBSIE(CBSBaseIE):
         asset_types = []
         subtitles = {}
         formats = []
-        last_e = None
         for item in items_data.findall('.//item'):
             asset_type = xpath_text(item, 'assetType')
             if not asset_type or asset_type in asset_types:
@@ -76,17 +74,11 @@ class CBSIE(CBSBaseIE):
                 query['formats'] = 'MPEG4,M3U'
             elif asset_type in ('RTMP', 'WIFI', '3G'):
                 query['formats'] = 'MPEG4,FLV'
-            try:
-                tp_formats, tp_subtitles = self._extract_theplatform_smil(
-                    update_url_query(tp_release_url, query), content_id,
-                    'Downloading %s SMIL data' % asset_type)
-            except ExtractorError as e:
-                last_e = e
-                continue
+            tp_formats, tp_subtitles = self._extract_theplatform_smil(
+                update_url_query(tp_release_url, query), content_id,
+                'Downloading %s SMIL data' % asset_type)
             formats.extend(tp_formats)
             subtitles = self._merge_subtitles(subtitles, tp_subtitles)
-        if last_e and not formats:
-            raise last_e
         self._sort_formats(formats)

         info = self._extract_theplatform_metadata(tp_path, content_id)
@@ -75,10 +75,10 @@ class CBSInteractiveIE(CBSIE):
         webpage = self._download_webpage(url, display_id)

         data_json = self._html_search_regex(
-            r"data(?:-(?:cnet|zdnet))?-video(?:-(?:uvp(?:js)?|player))?-options='([^']+)'",
+            r"data-(?:cnet|zdnet)-video(?:-uvp(?:js)?)?-options='([^']+)'",
             webpage, 'data json')
         data = self._parse_json(data_json, display_id)
-        vdata = data.get('video') or (data.get('videos') or data.get('playlist'))[0]
+        vdata = data.get('video') or data['videos'][0]

         video_id = vdata['mpxRefId']

@@ -13,7 +13,6 @@ from ..utils import (
     float_or_none,
     sanitized_Request,
     unescapeHTML,
-    update_url_query,
     urlencode_postdata,
     USER_AGENTS,
 )
@@ -266,10 +265,6 @@ class CeskaTelevizePoradyIE(InfoExtractor):
             # m3u8 download
             'skip_download': True,
         },
-    }, {
-        # iframe embed
-        'url': 'http://www.ceskatelevize.cz/porady/10614999031-neviditelni/21251212048/',
-        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -277,11 +272,8 @@ class CeskaTelevizePoradyIE(InfoExtractor):

         webpage = self._download_webpage(url, video_id)

-        data_url = update_url_query(unescapeHTML(self._search_regex(
-            (r'<span[^>]*\bdata-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
-             r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?ceskatelevize\.cz/ivysilani/embed/iFramePlayer\.php.*?)\1'),
-            webpage, 'iframe player url', group='url')), query={
-            'autoStart': 'true',
-        })
+        data_url = unescapeHTML(self._search_regex(
+            r'<span[^>]*\bdata-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
+            webpage, 'iframe player url', group='url'))

         return self.url_result(data_url, ie=CeskaTelevizeIE.ie_key())
@@ -1,11 +1,11 @@
 from __future__ import unicode_literals
 
 import re
+import base64
 import json
 
 from .common import InfoExtractor
 from .youtube import YoutubeIE
-from ..compat import compat_b64decode
 from ..utils import (
     clean_html,
     ExtractorError
@@ -58,7 +58,7 @@ class ChilloutzoneIE(InfoExtractor):
 
         base64_video_info = self._html_search_regex(
             r'var cozVidData = "(.+?)";', webpage, 'video data')
-        decoded_video_info = compat_b64decode(base64_video_info).decode('utf-8')
+        decoded_video_info = base64.b64decode(base64_video_info.encode('utf-8')).decode('utf-8')
         video_info_dict = json.loads(decoded_video_info)
 
         # get video information from dict
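The hunks above and below swap a `compat_b64decode` helper for direct `base64.b64decode` calls on manually encoded input. A minimal, hypothetical sketch of what such a Python 2/3 shim does (an illustration, not the youtube-dl implementation):

```python
import base64


def compat_b64decode(s, *args, **kwargs):
    # Accept text or bytes: extractor code passes both, while
    # base64.b64decode is strict about its input type; normalizing
    # here keeps every call site uniform.
    if isinstance(s, str):
        s = s.encode('ascii')
    return base64.b64decode(s, *args, **kwargs)


print(compat_b64decode('eW91dHViZS1kbA==').decode('utf-8'))  # youtube-dl
```

With the helper removed, each call site has to repeat the `.encode(...)` dance shown in the diffs.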
@@ -1,10 +1,10 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import base64
 import re
 
 from .common import InfoExtractor
-from ..compat import compat_b64decode
 from ..utils import parse_duration
 
 
@@ -44,7 +44,8 @@ class ChirbitIE(InfoExtractor):
 
         # Reverse engineered from https://chirb.it/js/chirbit.player.js (look
         # for soundURL)
-        audio_url = compat_b64decode(data_fd[::-1]).decode('utf-8')
+        audio_url = base64.b64decode(
+            data_fd[::-1].encode('ascii')).decode('utf-8')
 
         title = self._search_regex(
             r'class=["\']chirbit-title["\'][^>]*>([^<]+)', webpage, 'title')
@@ -174,8 +174,6 @@ class InfoExtractor(object):
                                  width : height ratio as float.
                     * no_resume  The server does not support resuming the
                                  (HTTP or RTMP) download. Boolean.
-                    * downloader_options  A dictionary of downloader options as
-                                 described in FileDownloader
 
     url:            Final video URL.
     ext:            Video filename extension.
@@ -644,31 +642,19 @@ class InfoExtractor(object):
         content, _ = res
         return content
 
-    def _download_xml_handle(
-            self, url_or_request, video_id, note='Downloading XML',
-            errnote='Unable to download XML', transform_source=None,
-            fatal=True, encoding=None, data=None, headers={}, query={}):
-        """Return a tuple (xml as an xml.etree.ElementTree.Element, URL handle)"""
-        res = self._download_webpage_handle(
-            url_or_request, video_id, note, errnote, fatal=fatal,
-            encoding=encoding, data=data, headers=headers, query=query)
-        if res is False:
-            return res
-        xml_string, urlh = res
-        return self._parse_xml(
-            xml_string, video_id, transform_source=transform_source,
-            fatal=fatal), urlh
-
     def _download_xml(self, url_or_request, video_id,
                       note='Downloading XML', errnote='Unable to download XML',
                       transform_source=None, fatal=True, encoding=None,
                       data=None, headers={}, query={}):
         """Return the xml as an xml.etree.ElementTree.Element"""
-        res = self._download_xml_handle(
-            url_or_request, video_id, note=note, errnote=errnote,
-            transform_source=transform_source, fatal=fatal, encoding=encoding,
-            data=data, headers=headers, query=query)
-        return res if res is False else res[0]
+        xml_string = self._download_webpage(
+            url_or_request, video_id, note, errnote, fatal=fatal,
+            encoding=encoding, data=data, headers=headers, query=query)
+        if xml_string is False:
+            return xml_string
+        return self._parse_xml(
+            xml_string, video_id, transform_source=transform_source,
+            fatal=fatal)
 
     def _parse_xml(self, xml_string, video_id, transform_source=None, fatal=True):
         if transform_source:
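The `_download_xml_handle` helper removed in the hunk above pairs the parsed document with its URL handle, so callers can recover the final post-redirect URL. A standalone sketch of that pattern, with a fake fetcher standing in for the real HTTP layer (all names here are illustrative, not youtube-dl's API):

```python
import io
import xml.etree.ElementTree as etree


def _fake_urlopen(url):
    # Offline stand-in for a real HTTP fetch; pretends a redirect happened.
    handle = io.BytesIO(b'<MPD><Period/></MPD>')
    handle.geturl = lambda: url + '?redirected=1'
    return handle


def download_xml_handle(url, urlopen=_fake_urlopen):
    # Return (parsed XML document, URL handle), like the removed helper.
    urlh = urlopen(url)
    doc = etree.fromstring(urlh.read().decode('utf-8'))
    return doc, urlh


def download_xml(url):
    # The plain variant simply discards the handle.
    doc, _ = download_xml_handle(url)
    return doc


doc, urlh = download_xml_handle('http://example.com/manifest.mpd')
print(doc.tag)        # MPD
print(urlh.geturl())  # http://example.com/manifest.mpd?redirected=1
```

The handle matters for manifest parsing because relative segment URLs must be resolved against the URL the server actually served, not the one originally requested.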
@@ -1041,7 +1027,7 @@ class InfoExtractor(object):
             part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
             if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
                 info['series'] = unescapeHTML(part_of_series.get('name'))
-        elif item_type in ('Article', 'NewsArticle'):
+        elif item_type == 'Article':
             info.update({
                 'timestamp': parse_iso8601(e.get('datePublished')),
                 'title': unescapeHTML(e.get('headline')),
@@ -1706,24 +1692,22 @@ class InfoExtractor(object):
             })
         return subtitles
 
-    def _extract_xspf_playlist(self, xspf_url, playlist_id, fatal=True):
+    def _extract_xspf_playlist(self, playlist_url, playlist_id, fatal=True):
         xspf = self._download_xml(
-            xspf_url, playlist_id, 'Downloading xpsf playlist',
+            playlist_url, playlist_id, 'Downloading xpsf playlist',
             'Unable to download xspf manifest', fatal=fatal)
         if xspf is False:
             return []
-        return self._parse_xspf(
-            xspf, playlist_id, xspf_url=xspf_url,
-            xspf_base_url=base_url(xspf_url))
+        return self._parse_xspf(xspf, playlist_id)
 
-    def _parse_xspf(self, xspf_doc, playlist_id, xspf_url=None, xspf_base_url=None):
+    def _parse_xspf(self, playlist, playlist_id):
         NS_MAP = {
             'xspf': 'http://xspf.org/ns/0/',
             's1': 'http://static.streamone.nl/player/ns/0',
         }
 
         entries = []
-        for track in xspf_doc.findall(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP)):
+        for track in playlist.findall(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP)):
             title = xpath_text(
                 track, xpath_with_ns('./xspf:title', NS_MAP), 'title', default=playlist_id)
             description = xpath_text(
@@ -1733,18 +1717,12 @@ class InfoExtractor(object):
             duration = float_or_none(
                 xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'), 1000)
 
-            formats = []
-            for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP)):
-                format_url = urljoin(xspf_base_url, location.text)
-                if not format_url:
-                    continue
-                formats.append({
-                    'url': format_url,
-                    'manifest_url': xspf_url,
+            formats = [{
+                'url': location.text,
                 'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)),
                 'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))),
                 'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))),
-                })
+            } for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP))]
             self._sort_formats(formats)
 
             entries.append({
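The lines removed above resolve each XSPF `<location>` against a base URL instead of taking the raw text. The Python 3 spelling of that resolution, `urllib.parse.urljoin`, leaves absolute entries untouched while resolving relative and scheme-relative ones (illustrative values):

```python
from urllib.parse import urljoin

# Resolve playlist entries the way the removed format loop does.
base = 'http://example.com/playlists/playlist.xspf'
resolved = [urljoin(base, loc) for loc in (
    'media/video.mp4',                  # relative
    '//cdn.example.com/v.mp4',          # scheme-relative
    'http://other.example.com/v.mp4',   # absolute
)]
print(resolved)
```

Without this step, a playlist whose locations are relative paths yields unusable format URLs.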
@@ -1758,18 +1736,18 @@ class InfoExtractor(object):
         return entries
 
     def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}):
-        res = self._download_xml_handle(
+        res = self._download_webpage_handle(
             mpd_url, video_id,
             note=note or 'Downloading MPD manifest',
             errnote=errnote or 'Failed to download MPD manifest',
             fatal=fatal)
         if res is False:
             return []
-        mpd_doc, urlh = res
+        mpd, urlh = res
         mpd_base_url = base_url(urlh.geturl())
 
         return self._parse_mpd_formats(
-            mpd_doc, mpd_id=mpd_id, mpd_base_url=mpd_base_url,
+            compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url,
             formats_dict=formats_dict, mpd_url=mpd_url)
 
     def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None):
@@ -2043,16 +2021,17 @@ class InfoExtractor(object):
         return formats
 
     def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True):
-        res = self._download_xml_handle(
+        res = self._download_webpage_handle(
             ism_url, video_id,
             note=note or 'Downloading ISM manifest',
             errnote=errnote or 'Failed to download ISM manifest',
             fatal=fatal)
         if res is False:
             return []
-        ism_doc, urlh = res
+        ism, urlh = res
 
-        return self._parse_ism_formats(ism_doc, urlh.geturl(), ism_id)
+        return self._parse_ism_formats(
+            compat_etree_fromstring(ism.encode('utf-8')), urlh.geturl(), ism_id)
 
     def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None):
         """
@@ -2150,8 +2129,8 @@ class InfoExtractor(object):
         return formats
 
     def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
-        def absolute_url(item_url):
-            return urljoin(base_url, item_url)
+        def absolute_url(video_url):
+            return compat_urlparse.urljoin(base_url, video_url)
 
         def parse_content_type(content_type):
             if not content_type:
@@ -2208,7 +2187,7 @@ class InfoExtractor(object):
             if src:
                 _, formats = _media_formats(src, media_type)
                 media_info['formats'].extend(formats)
-            media_info['thumbnail'] = absolute_url(media_attributes.get('poster'))
+            media_info['thumbnail'] = media_attributes.get('poster')
             if media_content:
                 for source_tag in re.findall(r'<source[^>]+>', media_content):
                     source_attributes = extract_attributes(source_tag)
@@ -2269,10 +2248,9 @@ class InfoExtractor(object):
     def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]):
         query = compat_urlparse.urlparse(url).query
         url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url)
-        mobj = re.search(
-            r'(?:(?:http|rtmp|rtsp)(?P<s>s)?:)?(?P<url>//[^?]+)', url)
-        url_base = mobj.group('url')
-        http_base_url = '%s%s:%s' % ('http', mobj.group('s') or '', url_base)
+        url_base = self._search_regex(
+            r'(?:(?:https?|rtmp|rtsp):)?(//[^?]+)', url, 'format url')
+        http_base_url = '%s:%s' % ('http', url_base)
         formats = []
 
         def manifest_url(manifest):
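The left side of the hunk above captures an optional literal `s` as a named group so the rebuilt base URL preserves secure schemes (`https`, `rtmps`), where the right side always downgrades to plain `http`. A small demo of that regex (example URLs are illustrative):

```python
import re


def http_base(url):
    # Keep the trailing 's' of the scheme, if any, when rebuilding
    # an http(s) base URL from a wowza-style stream URL.
    mobj = re.search(r'(?:(?:http|rtmp|rtsp)(?P<s>s)?:)?(?P<url>//[^?]+)', url)
    return 'http%s:%s' % (mobj.group('s') or '', mobj.group('url'))


print(http_base('https://cdn.example.com/live/stream?x=1'))  # https://cdn.example.com/live/stream
print(http_base('rtmp://cdn.example.com/live/stream'))       # http://cdn.example.com/live/stream
```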
@@ -2372,10 +2350,7 @@ class InfoExtractor(object):
         for track in tracks:
             if not isinstance(track, dict):
                 continue
-            track_kind = track.get('kind')
-            if not track_kind or not isinstance(track_kind, compat_str):
-                continue
-            if track_kind.lower() not in ('captions', 'subtitles'):
+            if track.get('kind') != 'captions':
                 continue
             track_url = urljoin(base_url, track.get('file'))
             if not track_url:
@@ -2429,7 +2404,7 @@ class InfoExtractor(object):
                 formats.extend(self._extract_m3u8_formats(
                     source_url, video_id, 'mp4', entry_protocol='m3u8_native',
                     m3u8_id=m3u8_id, fatal=False))
-            elif source_type == 'dash' or ext == 'mpd':
+            elif ext == 'mpd':
                 formats.extend(self._extract_mpd_formats(
                     source_url, video_id, mpd_id=mpd_id, fatal=False))
             elif ext == 'smil':
@@ -1,45 +1,31 @@
 # coding: utf-8
 from __future__ import unicode_literals, division
 
-import re
-
 from .common import InfoExtractor
-from ..compat import (
-    compat_str,
-    compat_HTTPError,
-)
-from ..utils import (
-    determine_ext,
-    float_or_none,
-    int_or_none,
-    parse_age_limit,
-    parse_duration,
-    ExtractorError
-)
+from ..utils import int_or_none
 
 
 class CrackleIE(InfoExtractor):
+    _GEO_COUNTRIES = ['US']
     _VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
     _TEST = {
-        # geo restricted to CA
-        'url': 'https://www.crackle.com/andromeda/2502343',
+        'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934',
         'info_dict': {
-            'id': '2502343',
+            'id': '2498934',
             'ext': 'mp4',
-            'title': 'Under The Night',
-            'description': 'md5:d2b8ca816579ae8a7bf28bfff8cefc8a',
-            'duration': 2583,
-            'view_count': int,
-            'average_rating': 0,
-            'age_limit': 14,
-            'genre': 'Action, Sci-Fi',
-            'creator': 'Allan Kroeker',
-            'artist': 'Keith Hamilton Cobb, Kevin Sorbo, Lisa Ryder, Lexa Doig, Robert Hewitt Wolfe',
-            'release_year': 2000,
-            'series': 'Andromeda',
-            'episode': 'Under The Night',
-            'season_number': 1,
-            'episode_number': 1,
+            'title': 'Everybody Respects A Bloody Nose',
+            'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). They’re headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
+            'thumbnail': r're:^https?://.*\.jpg',
+            'duration': 906,
+            'series': 'Comedians In Cars Getting Coffee',
+            'season_number': 8,
+            'episode_number': 4,
+            'subtitles': {
+                'en-US': [
+                    {'ext': 'vtt'},
+                    {'ext': 'tt'},
+                ]
+            },
         },
         'params': {
             # m3u8 download
@@ -47,118 +33,109 @@ class CrackleIE(InfoExtractor):
         }
     }
 
+    _THUMBNAIL_RES = [
+        (120, 90),
+        (208, 156),
+        (220, 124),
+        (220, 220),
+        (240, 180),
+        (250, 141),
+        (315, 236),
+        (320, 180),
+        (360, 203),
+        (400, 300),
+        (421, 316),
+        (460, 330),
+        (460, 460),
+        (462, 260),
+        (480, 270),
+        (587, 330),
+        (640, 480),
+        (700, 330),
+        (700, 394),
+        (854, 480),
+        (1024, 1024),
+        (1920, 1080),
+    ]
+
+    # extracted from http://legacyweb-us.crackle.com/flash/ReferrerRedirect.ashx
+    _MEDIA_FILE_SLOTS = {
+        'c544.flv': {
+            'width': 544,
+            'height': 306,
+        },
+        '360p.mp4': {
+            'width': 640,
+            'height': 360,
+        },
+        '480p.mp4': {
+            'width': 852,
+            'height': 478,
+        },
+        '480p_1mbps.mp4': {
+            'width': 852,
+            'height': 478,
+        },
+    }
+
     def _real_extract(self, url):
         video_id = self._match_id(url)
 
-        country_code = self._downloader.params.get('geo_bypass_country', None)
-        countries = [country_code] if country_code else (
-            'US', 'AU', 'CA', 'AS', 'FM', 'GU', 'MP', 'PR', 'PW', 'MH', 'VI')
-
-        last_e = None
-
-        for country in countries:
-            try:
-                media = self._download_json(
-                    'https://web-api-us.crackle.com/Service.svc/details/media/%s/%s'
-                    % (video_id, country), video_id,
-                    'Downloading media JSON as %s' % country,
-                    'Unable to download media JSON', query={
-                        'disableProtocols': 'true',
-                        'format': 'json'
-                    })
-            except ExtractorError as e:
-                # 401 means geo restriction, trying next country
-                if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
-                    last_e = e
-                    continue
-                raise
-
-            media_urls = media.get('MediaURLs')
-            if not media_urls or not isinstance(media_urls, list):
-                continue
-
-            title = media['Title']
-
-            formats = []
-            for e in media['MediaURLs']:
-                if e.get('UseDRM') is True:
-                    continue
-                format_url = e.get('Path')
-                if not format_url or not isinstance(format_url, compat_str):
-                    continue
-                ext = determine_ext(format_url)
-                if ext == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        format_url, video_id, 'mp4', entry_protocol='m3u8_native',
-                        m3u8_id='hls', fatal=False))
-                elif ext == 'mpd':
-                    formats.extend(self._extract_mpd_formats(
-                        format_url, video_id, mpd_id='dash', fatal=False))
-            self._sort_formats(formats)
-
-            description = media.get('Description')
-            duration = int_or_none(media.get(
-                'DurationInSeconds')) or parse_duration(media.get('Duration'))
-            view_count = int_or_none(media.get('CountViews'))
-            average_rating = float_or_none(media.get('UserRating'))
-            age_limit = parse_age_limit(media.get('Rating'))
-            genre = media.get('Genre')
-            release_year = int_or_none(media.get('ReleaseYear'))
-            creator = media.get('Directors')
-            artist = media.get('Cast')
-
-            if media.get('MediaTypeDisplayValue') == 'Full Episode':
-                series = media.get('ShowName')
-                episode = title
-                season_number = int_or_none(media.get('Season'))
-                episode_number = int_or_none(media.get('Episode'))
-            else:
-                series = episode = season_number = episode_number = None
-
-            subtitles = {}
-            cc_files = media.get('ClosedCaptionFiles')
-            if isinstance(cc_files, list):
-                for cc_file in cc_files:
-                    if not isinstance(cc_file, dict):
-                        continue
-                    cc_url = cc_file.get('Path')
-                    if not cc_url or not isinstance(cc_url, compat_str):
-                        continue
-                    lang = cc_file.get('Locale') or 'en'
-                    subtitles.setdefault(lang, []).append({'url': cc_url})
-
-            thumbnails = []
-            images = media.get('Images')
-            if isinstance(images, list):
-                for image_key, image_url in images.items():
-                    mobj = re.search(r'Img_(\d+)[xX](\d+)', image_key)
-                    if not mobj:
-                        continue
-                    thumbnails.append({
-                        'url': image_url,
-                        'width': int(mobj.group(1)),
-                        'height': int(mobj.group(2)),
-                    })
-
-            return {
-                'id': video_id,
-                'title': title,
-                'description': description,
-                'duration': duration,
-                'view_count': view_count,
-                'average_rating': average_rating,
-                'age_limit': age_limit,
-                'genre': genre,
-                'creator': creator,
-                'artist': artist,
-                'release_year': release_year,
-                'series': series,
-                'episode': episode,
-                'season_number': season_number,
-                'episode_number': episode_number,
-                'thumbnails': thumbnails,
-                'subtitles': subtitles,
-                'formats': formats,
-            }
-
-        raise last_e
+        config_doc = self._download_xml(
+            'http://legacyweb-us.crackle.com/flash/QueryReferrer.ashx?site=16',
+            video_id, 'Downloading config')
+
+        item = self._download_xml(
+            'http://legacyweb-us.crackle.com/app/revamp/vidwallcache.aspx?flags=-1&fm=%s' % video_id,
+            video_id, headers=self.geo_verification_headers()).find('i')
+        title = item.attrib['t']
+
+        subtitles = {}
+        formats = self._extract_m3u8_formats(
+            'http://content.uplynk.com/ext/%s/%s.m3u8' % (config_doc.attrib['strUplynkOwnerId'], video_id),
+            video_id, 'mp4', m3u8_id='hls', fatal=None)
+        thumbnails = []
+        path = item.attrib.get('p')
+        if path:
+            for width, height in self._THUMBNAIL_RES:
+                res = '%dx%d' % (width, height)
+                thumbnails.append({
+                    'id': res,
+                    'url': 'http://images-us-am.crackle.com/%stnl_%s.jpg' % (path, res),
+                    'width': width,
+                    'height': height,
+                    'resolution': res,
+                })
+            http_base_url = 'http://ahttp.crackle.com/' + path
+            for mfs_path, mfs_info in self._MEDIA_FILE_SLOTS.items():
+                formats.append({
+                    'url': http_base_url + mfs_path,
+                    'format_id': 'http-' + mfs_path.split('.')[0],
+                    'width': mfs_info['width'],
+                    'height': mfs_info['height'],
+                })
+            for cc in item.findall('cc'):
+                locale = cc.attrib.get('l')
+                v = cc.attrib.get('v')
+                if locale and v:
+                    if locale not in subtitles:
+                        subtitles[locale] = []
+                    for url_ext, ext in (('vtt', 'vtt'), ('xml', 'tt')):
+                        subtitles.setdefault(locale, []).append({
+                            'url': '%s/%s%s_%s.%s' % (config_doc.attrib['strSubtitleServer'], path, locale, v, url_ext),
+                            'ext': ext,
+                        })
+        self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': item.attrib.get('d'),
+            'duration': int(item.attrib.get('r'), 16) / 1000 if item.attrib.get('r') else None,
+            'series': item.attrib.get('sn'),
+            'season_number': int_or_none(item.attrib.get('se')),
+            'episode_number': int_or_none(item.attrib.get('ep')),
+            'thumbnails': thumbnails,
+            'subtitles': subtitles,
+            'formats': formats,
+        }
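The left-hand side of the Crackle hunk above retries one metadata request per candidate country until a request is not geo-blocked (HTTP 401), remembering the last failure so it can be re-raised if every country fails. A minimal standalone sketch of that fallback loop (`fetch_media`, `GeoRestricted`, and the country behaviour are hypothetical stand-ins for the real HTTP calls):

```python
class GeoRestricted(Exception):
    """Stand-in for a 401-style geo-restriction error."""


def fetch_media(video_id, country):
    # Hypothetical backend: pretend only 'CA' serves this video.
    if country != 'CA':
        raise GeoRestricted(country)
    return {'Title': 'Under The Night', 'country': country}


def fetch_with_fallback(video_id, countries=('US', 'AU', 'CA')):
    last_exc = None
    for country in countries:
        try:
            return fetch_media(video_id, country)
        except GeoRestricted as exc:
            last_exc = exc  # geo-blocked here: try the next country
    raise last_exc  # every country failed; surface the last error


print(fetch_with_fallback('2502343')['country'])  # CA
```

The `raise last_e` at the bottom of the removed extractor code plays exactly this role: it only runs when the loop exhausts all countries without returning.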
@@ -3,13 +3,13 @@ from __future__ import unicode_literals
 
 import re
 import json
+import base64
 import zlib
 
 from hashlib import sha1
 from math import pow, sqrt, floor
 from .common import InfoExtractor
 from ..compat import (
-    compat_b64decode,
     compat_etree_fromstring,
     compat_urllib_parse_urlencode,
     compat_urllib_request,
@@ -272,8 +272,8 @@ class CrunchyrollIE(CrunchyrollBaseIE):
     }
 
     def _decrypt_subtitles(self, data, iv, id):
-        data = bytes_to_intlist(compat_b64decode(data))
-        iv = bytes_to_intlist(compat_b64decode(iv))
+        data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
+        iv = bytes_to_intlist(base64.b64decode(iv.encode('utf-8')))
         id = int(id)
 
         def obfuscate_key_aux(count, modulo, start):
@@ -10,7 +10,6 @@ from ..aes import (
     aes_cbc_decrypt,
     aes_cbc_encrypt,
 )
-from ..compat import compat_b64decode
 from ..utils import (
     bytes_to_intlist,
     bytes_to_long,
@@ -94,7 +93,7 @@ class DaisukiMottoIE(InfoExtractor):
 
         rtn = self._parse_json(
             intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(
-                compat_b64decode(encrypted_rtn)),
+                base64.b64decode(encrypted_rtn)),
                 aes_key, iv)).decode('utf-8').rstrip('\0'),
             video_id)
 
@@ -1,56 +0,0 @@
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import js_to_json
-
-
-class DiggIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?digg\.com/video/(?P<id>[^/?#&]+)'
-    _TESTS = [{
-        # JWPlatform via provider
-        'url': 'http://digg.com/video/sci-fi-short-jonah-daniel-kaluuya-get-out',
-        'info_dict': {
-            'id': 'LcqvmS0b',
-            'ext': 'mp4',
-            'title': "'Get Out' Star Daniel Kaluuya Goes On 'Moby Dick'-Like Journey In Sci-Fi Short 'Jonah'",
-            'description': 'md5:541bb847648b6ee3d6514bc84b82efda',
-            'upload_date': '20180109',
-            'timestamp': 1515530551,
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        # Youtube via provider
-        'url': 'http://digg.com/video/dog-boat-seal-play',
-        'only_matching': True,
-    }, {
-        # vimeo as regular embed
-        'url': 'http://digg.com/video/dream-girl-short-film',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, display_id)
-
-        info = self._parse_json(
-            self._search_regex(
-                r'(?s)video_info\s*=\s*({.+?});\n', webpage, 'video info',
-                default='{}'), display_id, transform_source=js_to_json,
-            fatal=False)
-
-        video_id = info.get('video_id')
-
-        if video_id:
-            provider = info.get('provider_name')
-            if provider == 'youtube':
-                return self.url_result(
-                    video_id, ie='Youtube', video_id=video_id)
-            elif provider == 'jwplayer':
-                return self.url_result(
-                    'jwplatform:%s' % video_id, ie='JWPlatform',
-                    video_id=video_id)
-
-        return self.url_result(url, 'Generic')
@@ -5,16 +5,15 @@ import re
 import string
 
 from .discoverygo import DiscoveryGoBaseIE
-from ..compat import compat_str
 from ..utils import (
     ExtractorError,
-    try_get,
+    update_url_query,
 )
 from ..compat import compat_HTTPError
 
 
 class DiscoveryIE(DiscoveryGoBaseIE):
-    _VALID_URL = r'''(?x)https?://(?:www\.)?(?P<site>
+    _VALID_URL = r'''(?x)https?://(?:www\.)?(?:
         discovery|
         investigationdiscovery|
         discoverylife|
@@ -45,7 +44,7 @@ class DiscoveryIE(DiscoveryGoBaseIE):
     _GEO_BYPASS = False
 
     def _real_extract(self, url):
-        site, path, display_id = re.match(self._VALID_URL, url).groups()
+        path, display_id = re.match(self._VALID_URL, url).groups()
         webpage = self._download_webpage(url, display_id)
 
         react_data = self._parse_json(self._search_regex(
@ -56,13 +55,14 @@ class DiscoveryIE(DiscoveryGoBaseIE):
|
|||||||
video_id = video['id']
|
video_id = video['id']
|
||||||
|
|
||||||
access_token = self._download_json(
|
access_token = self._download_json(
|
||||||
'https://www.%s.com/anonymous' % site, display_id, query={
|
'https://www.discovery.com/anonymous', display_id, query={
|
||||||
'authRel': 'authorization',
|
'authLink': update_url_query(
|
||||||
'client_id': try_get(
|
'https://login.discovery.com/v1/oauth2/authorize', {
|
||||||
react_data, lambda x: x['application']['apiClientId'],
|
'client_id': react_data['application']['apiClientId'],
|
||||||
compat_str) or '3020a40c2356a645b4b4',
|
'redirect_uri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html',
|
||||||
'nonce': ''.join([random.choice(string.ascii_letters) for _ in range(32)]),
|
'response_type': 'anonymous',
|
||||||
'redirectUri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html?https://www.%s.com' % site,
|
'state': 'nonce,' + ''.join([random.choice(string.ascii_letters) for _ in range(32)]),
|
||||||
|
})
|
||||||
})['access_token']
|
})['access_token']
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
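Both sides of the last hunk request an anonymous access token with a random 32-letter nonce; the `+` side additionally packs an OAuth-style authorize link into the `authLink` query parameter. A standalone sketch of how that link is assembled — the endpoint, redirect URI and fallback client id are copied from the diff itself, not re-verified against the live API:

```python
import random
import string
from urllib.parse import urlencode

def build_anonymous_auth_link(client_id):
    # 32 random ASCII letters, matching the extractor's nonce/state value
    nonce = ''.join(random.choice(string.ascii_letters) for _ in range(32))
    return 'https://login.discovery.com/v1/oauth2/authorize?' + urlencode({
        'client_id': client_id,
        'redirect_uri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html',
        'response_type': 'anonymous',
        'state': 'nonce,' + nonce,
    })

# '3020a40c2356a645b4b4' is the hard-coded fallback client id from the diff
link = build_anonymous_auth_link('3020a40c2356a645b4b4')
```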
@@ -12,28 +12,25 @@ from ..compat import (
     compat_urlparse,
 )
 from ..utils import (
-    determine_ext,
     ExtractorError,
-    float_or_none,
     int_or_none,
     remove_end,
     try_get,
     unified_strdate,
-    unified_timestamp,
     update_url_query,
     USER_AGENTS,
 )
 
 
 class DPlayIE(InfoExtractor):
-    _VALID_URL = r'https?://(?P<domain>www\.(?P<host>dplay\.(?P<country>dk|se|no)))/(?:video(?:er|s)/)?(?P<id>[^/]+/[^/?#]+)'
+    _VALID_URL = r'https?://(?P<domain>www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
 
     _TESTS = [{
         # non geo restricted, via secure api, unsigned download hls URL
         'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
         'info_dict': {
             'id': '3172',
-            'display_id': 'nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet',
+            'display_id': 'season-1-svensken-lar-sig-njuta-av-livet',
             'ext': 'mp4',
             'title': 'Svensken lär sig njuta av livet',
             'description': 'md5:d3819c9bccffd0fe458ca42451dd50d8',
@@ -51,7 +48,7 @@ class DPlayIE(InfoExtractor):
         'url': 'http://www.dplay.dk/mig-og-min-mor/season-6-episode-12/',
         'info_dict': {
             'id': '70816',
-            'display_id': 'mig-og-min-mor/season-6-episode-12',
+            'display_id': 'season-6-episode-12',
             'ext': 'mp4',
             'title': 'Episode 12',
             'description': 'md5:9c86e51a93f8a4401fc9641ef9894c90',
@@ -68,33 +65,6 @@ class DPlayIE(InfoExtractor):
         # geo restricted, via direct unsigned hls URL
         'url': 'http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/',
         'only_matching': True,
-    }, {
-        # disco-api
-        'url': 'https://www.dplay.no/videoer/i-kongens-klr/sesong-1-episode-7',
-        'info_dict': {
-            'id': '40206',
-            'display_id': 'i-kongens-klr/sesong-1-episode-7',
-            'ext': 'mp4',
-            'title': 'Episode 7',
-            'description': 'md5:e3e1411b2b9aebeea36a6ec5d50c60cf',
-            'duration': 2611.16,
-            'timestamp': 1516726800,
-            'upload_date': '20180123',
-            'series': 'I kongens klær',
-            'season_number': 1,
-            'episode_number': 7,
-        },
-        'params': {
-            'format': 'bestvideo',
-            'skip_download': True,
-        },
-    }, {
-
-        'url': 'https://www.dplay.dk/videoer/singleliv/season-5-episode-3',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.dplay.se/videos/sofias-anglar/sofias-anglar-1001',
-        'only_matching': True,
     }]
 
     def _real_extract(self, url):
@@ -102,81 +72,10 @@ class DPlayIE(InfoExtractor):
         display_id = mobj.group('id')
         domain = mobj.group('domain')
 
-        self._initialize_geo_bypass([mobj.group('country').upper()])
-
         webpage = self._download_webpage(url, display_id)
 
         video_id = self._search_regex(
-            r'data-video-id=["\'](\d+)', webpage, 'video id', default=None)
-
-        if not video_id:
-            host = mobj.group('host')
-            disco_base = 'https://disco-api.%s' % host
-            self._download_json(
-                '%s/token' % disco_base, display_id, 'Downloading token',
-                query={
-                    'realm': host.replace('.', ''),
-                })
-            video = self._download_json(
-                '%s/content/videos/%s' % (disco_base, display_id), display_id,
-                headers={
-                    'Referer': url,
-                    'x-disco-client': 'WEB:UNKNOWN:dplay-client:0.0.1',
-                }, query={
-                    'include': 'show'
-                })
-            video_id = video['data']['id']
-            info = video['data']['attributes']
-            title = info['name']
-            formats = []
-            for format_id, format_dict in self._download_json(
-                    '%s/playback/videoPlaybackInfo/%s' % (disco_base, video_id),
-                    display_id)['data']['attributes']['streaming'].items():
-                if not isinstance(format_dict, dict):
-                    continue
-                format_url = format_dict.get('url')
-                if not format_url:
-                    continue
-                ext = determine_ext(format_url)
-                if format_id == 'dash' or ext == 'mpd':
-                    formats.extend(self._extract_mpd_formats(
-                        format_url, display_id, mpd_id='dash', fatal=False))
-                elif format_id == 'hls' or ext == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        format_url, display_id, 'mp4',
-                        entry_protocol='m3u8_native', m3u8_id='hls',
-                        fatal=False))
-                else:
-                    formats.append({
-                        'url': format_url,
-                        'format_id': format_id,
-                    })
-            self._sort_formats(formats)
-
-            series = None
-            try:
-                included = video.get('included')
-                if isinstance(included, list):
-                    show = next(e for e in included if e.get('type') == 'show')
-                    series = try_get(
-                        show, lambda x: x['attributes']['name'], compat_str)
-            except StopIteration:
-                pass
-
-            return {
-                'id': video_id,
-                'display_id': display_id,
-                'title': title,
-                'description': info.get('description'),
-                'duration': float_or_none(
-                    info.get('videoDuration'), scale=1000),
-                'timestamp': unified_timestamp(info.get('publishStart')),
-                'series': series,
-                'season_number': int_or_none(info.get('seasonNumber')),
-                'episode_number': int_or_none(info.get('episodeNumber')),
-                'age_limit': int_or_none(info.get('minimum_age')),
-                'formats': formats,
-            }
-
+            r'data-video-id=["\'](\d+)', webpage, 'video id')
+
         info = self._download_json(
             'http://%s/api/v2/ajax/videos?video_id=%s' % (domain, video_id),
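The removed disco-api branch dispatches each entry of the playback response's `streaming` dict to a DASH, HLS or direct-URL handler, keyed on `format_id` or the URL extension. That dispatch logic in isolation — handler labels, the extension helper and the sample URLs are illustrative stand-ins, not the extractor's own API:

```python
import posixpath

def classify_streams(streaming):
    # Mirrors the format loop of the removed disco-api branch: choose the
    # manifest handler by format_id or URL extension, else keep the URL as-is.
    plans = []
    for format_id, format_dict in streaming.items():
        if not isinstance(format_dict, dict):
            continue
        format_url = format_dict.get('url')
        if not format_url:
            continue
        # crude stand-in for youtube-dl's determine_ext()
        ext = posixpath.splitext(format_url.split('?')[0])[1][1:]
        if format_id == 'dash' or ext == 'mpd':
            plans.append(('extract_mpd', format_url))
        elif format_id == 'hls' or ext == 'm3u8':
            plans.append(('extract_m3u8', format_url))
        else:
            plans.append(('direct', format_url))
    return sorted(plans)

plans = classify_streams({
    'dash': {'url': 'https://cdn.example/manifest.mpd'},
    'hls': {'url': 'https://cdn.example/master.m3u8'},
    'mss': {'url': 'https://cdn.example/stream.ism'},
    'broken': None,
})
```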
@@ -1,10 +1,10 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import base64
 import re
 
 from .common import InfoExtractor
-from ..compat import compat_b64decode
 from ..utils import (
     qualities,
     sanitized_Request,
@@ -42,7 +42,7 @@ class DumpertIE(InfoExtractor):
             r'data-files="([^"]+)"', webpage, 'data files')
 
         files = self._parse_json(
-            compat_b64decode(files_base64).decode('utf-8'),
+            base64.b64decode(files_base64.encode('utf-8')).decode('utf-8'),
             video_id)
 
         quality = qualities(['flv', 'mobile', 'tablet', '720p'])
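Both sides of the `files` hunk decode the same base64 attribute: `compat_b64decode` is youtube-dl's Python 2/3 shim, while the `+` side encodes the `str` to bytes by hand. A minimal sketch of the shim, showing the two spellings agree (the sample payload is made up):

```python
import base64

def compat_b64decode(s):
    # sketch of the shim: accept str or bytes before delegating to b64decode
    if isinstance(s, str):
        s = s.encode('utf-8')
    return base64.b64decode(s)

files_base64 = base64.b64encode(b'{"flv": "http://media.example/video.flv"}').decode('ascii')
new_style = compat_b64decode(files_base64)
old_style = base64.b64decode(files_base64.encode('utf-8'))
```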
@@ -32,7 +32,7 @@ class DVTVIE(InfoExtractor):
     }, {
         'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/',
         'info_dict': {
-            'title': r're:^DVTV 16\. 12\. 2014: útok Talibanu, boj o kliniku, uprchlíci',
+            'title': 'DVTV 16. 12. 2014: útok Talibanu, boj o kliniku, uprchlíci',
             'id': '973eb3bc854e11e498be002590604f2e',
         },
         'playlist': [{
@@ -91,24 +91,10 @@ class DVTVIE(InfoExtractor):
     }, {
         'url': 'http://video.aktualne.cz/v-cechach-poprve-zazni-zelenkova-zrestaurovana-mse/r~45b4b00483ec11e4883b002590604f2e/',
         'only_matching': True,
-    }, {
-        'url': 'https://video.aktualne.cz/dvtv/babis-a-zeman-nesou-vinu-za-to-ze-nemame-jasno-v-tom-kdo-bud/r~026afb54fad711e79704ac1f6b220ee8/',
-        'md5': '87defe16681b1429c91f7a74809823c6',
-        'info_dict': {
-            'id': 'f5ae72f6fad611e794dbac1f6b220ee8',
-            'ext': 'mp4',
-            'title': 'Babiš a Zeman nesou vinu za to, že nemáme jasno v tom, kdo bude vládnout, říká Pekarová Adamová',
-        },
-        'params': {
-            'skip_download': True,
-        },
     }]
 
-    def _parse_video_metadata(self, js, video_id, live_js=None):
+    def _parse_video_metadata(self, js, video_id):
         data = self._parse_json(js, video_id, transform_source=js_to_json)
-        if live_js:
-            data.update(self._parse_json(
-                live_js, video_id, transform_source=js_to_json))
 
         title = unescapeHTML(data['title'])
 
@@ -156,18 +142,13 @@ class DVTVIE(InfoExtractor):
 
         webpage = self._download_webpage(url, video_id)
 
-        # live content
-        live_item = self._search_regex(
-            r'(?s)embedData[0-9a-f]{32}\.asset\.liveStarter\s*=\s*(\{.+?\});',
-            webpage, 'video', default=None)
-
         # single video
         item = self._search_regex(
             r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});',
-            webpage, 'video', default=None)
+            webpage, 'video', default=None, fatal=False)
 
         if item:
-            return self._parse_video_metadata(item, video_id, live_item)
+            return self._parse_video_metadata(item, video_id)
 
         # playlist
         items = re.findall(
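The removed live branch overlays the `liveStarter` JSON on top of the asset JSON before parsing, so live fields take precedence on key collisions. The merge itself is an ordinary dict update; the payloads here are hypothetical:

```python
# hypothetical asset / liveStarter payloads; only the merge semantics are real
item = {'title': 'DVTV', 'tracks': {}}
live_item = {'title': 'DVTV live', 'isLive': True}

data = dict(item)
data.update(live_item)  # live metadata wins on key collisions
```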
@@ -1,13 +1,13 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import base64
 import json
 
 from .common import InfoExtractor
 from ..compat import (
-    compat_b64decode,
-    compat_str,
     compat_urlparse,
+    compat_str,
 )
 from ..utils import (
     extract_attributes,
@@ -36,9 +36,9 @@ class EinthusanIE(InfoExtractor):
 
     # reversed from jsoncrypto.prototype.decrypt() in einthusan-PGMovieWatcher.js
     def _decrypt(self, encrypted_data, video_id):
-        return self._parse_json(compat_b64decode((
+        return self._parse_json(base64.b64decode((
             encrypted_data[:10] + encrypted_data[-1] + encrypted_data[12:-1]
-        )).decode('utf-8'), video_id)
+        ).encode('ascii')).decode('utf-8'), video_id)
 
     def _real_extract(self, url):
        video_id = self._match_id(url)
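`_decrypt` undoes a light obfuscation before base64-decoding: it keeps the first ten characters, moves the final character into position 10, and drops the two characters at positions 10 and 11. A round-trip sketch — the `'XX'` filler characters are an assumption about what the site inserts, used only to build a sample input:

```python
import base64
import json

def decrypt(encrypted_data):
    # same character shuffle as _decrypt, without the extractor plumbing
    b64 = encrypted_data[:10] + encrypted_data[-1] + encrypted_data[12:-1]
    return json.loads(base64.b64decode(b64.encode('ascii')).decode('utf-8'))

plain = base64.b64encode(json.dumps({'title': 'x'}).encode('utf-8')).decode('ascii')
# build a sample "encrypted" string by inverting the shuffle
scrambled = plain[:10] + 'XX' + plain[11:] + plain[10]
result = decrypt(scrambled)
```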
@@ -162,7 +162,6 @@ from .cbc import (
     CBCPlayerIE,
     CBCWatchVideoIE,
     CBCWatchIE,
-    CBCOlympicsIE,
 )
 from .cbs import CBSIE
 from .cbslocal import CBSLocalIE
@@ -260,7 +259,6 @@ from .deezer import DeezerPlaylistIE
 from .democracynow import DemocracynowIE
 from .dfb import DFBIE
 from .dhm import DHMIE
-from .digg import DiggIE
 from .dotsub import DotsubIE
 from .douyutv import (
     DouyuShowIE,
@@ -374,10 +372,8 @@ from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
 from .francetv import (
     FranceTVIE,
-    FranceTVSiteIE,
     FranceTVEmbedIE,
     FranceTVInfoIE,
-    FranceTVJeunesseIE,
     GenerationWhatIE,
     CultureboxIE,
 )
@@ -385,10 +381,7 @@ from .freesound import FreesoundIE
 from .freespeech import FreespeechIE
 from .freshlive import FreshLiveIE
 from .funimation import FunimationIE
-from .funk import (
-    FunkMixIE,
-    FunkChannelIE,
-)
+from .funk import FunkIE
 from .funnyordie import FunnyOrDieIE
 from .fusion import FusionIE
 from .fxnetworks import FXNetworksIE
@@ -432,7 +425,6 @@ from .hellporno import HellPornoIE
 from .helsinki import HelsinkiIE
 from .hentaistigma import HentaiStigmaIE
 from .hgtv import HGTVComShowIE
-from .hidive import HiDiveIE
 from .historicfilms import HistoricFilmsIE
 from .hitbox import HitboxIE, HitboxLiveIE
 from .hitrecord import HitRecordIE
@@ -497,6 +489,7 @@ from .jwplatform import JWPlatformIE
 from .jpopsukitv import JpopsukiIE
 from .kakao import KakaoIE
 from .kaltura import KalturaIE
+from .kamcord import KamcordIE
 from .kanalplay import KanalPlayIE
 from .kankan import KankanIE
 from .karaoketv import KaraoketvIE
@@ -532,14 +525,13 @@ from .lcp import (
 )
 from .learnr import LearnrIE
 from .lecture2go import Lecture2GoIE
+from .lego import LEGOIE
+from .lemonde import LemondeIE
 from .leeco import (
     LeIE,
     LePlaylistIE,
     LetvCloudIE,
 )
-from .lego import LEGOIE
-from .lemonde import LemondeIE
-from .lenta import LentaIE
 from .libraryofcongress import LibraryOfCongressIE
 from .libsyn import LibsynIE
 from .lifenews import (
@@ -551,7 +543,6 @@ from .limelight import (
     LimelightChannelIE,
     LimelightChannelListIE,
 )
-from .line import LineTVIE
 from .litv import LiTVIE
 from .liveleak import (
     LiveLeakIE,
@@ -572,11 +563,7 @@ from .lynda import (
 )
 from .m6 import M6IE
 from .macgamestore import MacGameStoreIE
-from .mailru import (
-    MailRuIE,
-    MailRuMusicIE,
-    MailRuMusicSearchIE,
-)
+from .mailru import MailRuIE
 from .makerschannel import MakersChannelIE
 from .makertv import MakerTVIE
 from .mangomolo import (
@@ -643,10 +630,7 @@ from .musicplayon import MusicPlayOnIE
 from .mwave import MwaveIE, MwaveMeetGreetIE
 from .myspace import MySpaceIE, MySpaceAlbumIE
 from .myspass import MySpassIE
-from .myvi import (
-    MyviIE,
-    MyviEmbedIE,
-)
+from .myvi import MyviIE
 from .myvidster import MyVidsterIE
 from .nationalgeographic import (
     NationalGeographicVideoIE,
@@ -660,7 +644,6 @@ from .nbc import (
     NBCIE,
     NBCNewsIE,
     NBCOlympicsIE,
-    NBCOlympicsStreamIE,
     NBCSportsIE,
     NBCSportsVPlayerIE,
 )
@@ -877,7 +860,6 @@ from .rai import (
     RaiPlayPlaylistIE,
     RaiIE,
 )
-from .raywenderlich import RayWenderlichIE
 from .rbmaradio import RBMARadioIE
 from .rds import RDSIE
 from .redbulltv import RedBullTVIE
@@ -899,6 +881,7 @@ from .revision3 import (
     Revision3IE,
 )
 from .rice import RICEIE
+from .ringtv import RingTVIE
 from .rmcdecouverte import RMCDecouverteIE
 from .ro220 import Ro220IE
 from .rockstargames import RockstarGamesIE
@@ -918,7 +901,6 @@ from .rtp import RTPIE
 from .rts import RTSIE
 from .rtve import RTVEALaCartaIE, RTVELiveIE, RTVEInfantilIE, RTVELiveIE, RTVETelevisionIE
 from .rtvnh import RTVNHIE
-from .rtvs import RTVSIE
 from .rudo import RudoIE
 from .ruhd import RUHDIE
 from .ruleporn import RulePornIE
@@ -951,10 +933,6 @@ from .servingsys import ServingSysIE
 from .servus import ServusIE
 from .sevenplus import SevenPlusIE
 from .sexu import SexuIE
-from .seznamzpravy import (
-    SeznamZpravyIE,
-    SeznamZpravyArticleIE,
-)
 from .shahid import (
     ShahidIE,
     ShahidShowIE,
@@ -1012,7 +990,7 @@ from .stitcher import StitcherIE
 from .sport5 import Sport5IE
 from .sportbox import SportBoxEmbedIE
 from .sportdeutschland import SportDeutschlandIE
-from .springboardplatform import SpringboardPlatformIE
+from .sportschau import SportschauIE
 from .sprout import SproutIE
 from .srgssr import (
     SRGSSRIE,
@@ -1056,14 +1034,9 @@ from .telebruxelles import TeleBruxellesIE
 from .telecinco import TelecincoIE
 from .telegraaf import TelegraafIE
 from .telemb import TeleMBIE
-from .telequebec import (
-    TeleQuebecIE,
-    TeleQuebecEmissionIE,
-    TeleQuebecLiveIE,
-)
+from .telequebec import TeleQuebecIE
 from .teletask import TeleTaskIE
 from .telewebion import TelewebionIE
-from .tennistv import TennisTVIE
 from .testurl import TestURLIE
 from .tf1 import TF1IE
 from .tfo import TFOIE
@@ -1073,6 +1046,7 @@ from .theplatform import (
     ThePlatformFeedIE,
 )
 from .thescene import TheSceneIE
+from .thesixtyone import TheSixtyOneIE
 from .thestar import TheStarIE
 from .thesun import TheSunIE
 from .theweatherchannel import TheWeatherChannelIE
@@ -1094,6 +1068,7 @@ from .tnaflix import (
 from .toggle import ToggleIE
 from .tonline import TOnlineIE
 from .toongoggles import ToonGogglesIE
+from .totalwebcasting import TotalWebCastingIE
 from .toutv import TouTvIE
 from .toypics import ToypicsUserIE, ToypicsIE
 from .traileraddict import TrailerAddictIE
@@ -1218,6 +1193,7 @@ from .vice import (
     ViceArticleIE,
     ViceShowIE,
 )
+from .viceland import VicelandIE
 from .vidbit import VidbitIE
 from .viddler import ViddlerIE
 from .videa import VideaIE
@@ -1232,7 +1208,6 @@ from .videomore import (
 from .videopremium import VideoPremiumIE
 from .videopress import VideoPressIE
 from .vidio import VidioIE
-from .vidlii import VidLiiIE
 from .vidme import (
     VidmeIE,
     VidmeUserIE,
@@ -1314,8 +1289,6 @@ from .watchbox import WatchBoxIE
 from .watchindianporn import WatchIndianPornIE
 from .wdr import (
     WDRIE,
-    WDRPageIE,
-    WDRElefantIE,
     WDRMobileIE,
 )
 from .webcaster import (
@@ -1326,10 +1299,6 @@ from .webofstories import (
     WebOfStoriesIE,
     WebOfStoriesPlaylistIE,
 )
-from .weibo import (
-    WeiboIE,
-    WeiboMobileIE
-)
 from .weiqitv import WeiqiTVIE
 from .wimp import WimpIE
 from .wistia import WistiaIE
@@ -1355,10 +1324,6 @@ from .xiami import (
     XiamiArtistIE,
     XiamiCollectionIE
 )
-from .ximalaya import (
-    XimalayaIE,
-    XimalayaAlbumIE
-)
 from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xstream import XstreamIE
@@ -1376,7 +1341,6 @@ from .yandexmusic import (
     YandexMusicPlaylistIE,
 )
 from .yandexdisk import YandexDiskIE
-from .yapfiles import YapFilesIE
 from .yesjapan import YesJapanIE
 from .yinyuetai import YinYueTaiIE
 from .ynet import YnetIE
@@ -33,7 +33,7 @@ class FranceInterIE(InfoExtractor):
         description = self._og_search_description(webpage)
 
         upload_date_str = self._search_regex(
-            r'class=["\']\s*cover-emission-period\s*["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<',
+            r'class=["\']cover-emission-period["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<',
             webpage, 'upload date', fatal=False)
         if upload_date_str:
             upload_date_list = upload_date_str.split()
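The only change in this hunk is regex tolerance: the `-` side allows optional whitespace around the class name inside the attribute, while the `+` side requires an exact `class="cover-emission-period"`. A quick check against a made-up markup snippet:

```python
import re

LOOSE = r'class=["\']\s*cover-emission-period\s*["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<'
STRICT = r'class=["\']cover-emission-period["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<'

# hypothetical markup with stray spaces inside the class attribute
html = '<span class=" cover-emission-period ">le 7 mars 2018</span>'
loose_match = re.search(LOOSE, html)
strict_match = re.search(STRICT, html)
```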
@@ -5,89 +5,19 @@ from __future__ import unicode_literals
 import re
 
 from .common import InfoExtractor
-from ..compat import (
-    compat_str,
-    compat_urlparse,
-)
+from ..compat import compat_urlparse
 from ..utils import (
     clean_html,
-    determine_ext,
     ExtractorError,
     int_or_none,
     parse_duration,
-    try_get,
+    determine_ext,
 )
 from .dailymotion import DailymotionIE
 
 
 class FranceTVBaseInfoExtractor(InfoExtractor):
-    def _make_url_result(self, video_or_full_id, catalog=None):
-        full_id = 'francetv:%s' % video_or_full_id
-        if '@' not in video_or_full_id and catalog:
-            full_id += '@%s' % catalog
-        return self.url_result(
-            full_id, ie=FranceTVIE.ie_key(),
-            video_id=video_or_full_id.split('@')[0])
-
-
-class FranceTVIE(InfoExtractor):
-    _VALID_URL = r'''(?x)
-                    (?:
-                        https?://
-                            sivideo\.webservices\.francetelevisions\.fr/tools/getInfosOeuvre/v2/\?
-                            .*?\bidDiffusion=[^&]+|
-                        (?:
-                            https?://videos\.francetv\.fr/video/|
-                            francetv:
-                        )
-                        (?P<id>[^@]+)(?:@(?P<catalog>.+))?
-                    )
-                    '''
-
-    _TESTS = [{
-        # without catalog
-        'url': 'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/?idDiffusion=162311093&callback=_jsonp_loader_callback_request_0',
-        'md5': 'c2248a8de38c4e65ea8fae7b5df2d84f',
-        'info_dict': {
-            'id': '162311093',
-            'ext': 'mp4',
-            'title': '13h15, le dimanche... - Les mystères de Jésus',
-            'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42',
-            'timestamp': 1502623500,
-            'upload_date': '20170813',
-        },
-    }, {
-        # with catalog
-        'url': 'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/?idDiffusion=NI_1004933&catalogue=Zouzous&callback=_jsonp_loader_callback_request_4',
-        'only_matching': True,
-    }, {
-        'url': 'http://videos.francetv.fr/video/NI_657393@Regions',
-        'only_matching': True,
-    }, {
-        'url': 'francetv:162311093',
-        'only_matching': True,
-    }, {
-        'url': 'francetv:NI_1004933@Zouzous',
-        'only_matching': True,
-    }, {
-        'url': 'francetv:NI_983319@Info-web',
-        'only_matching': True,
-    }, {
-        'url': 'francetv:NI_983319',
-        'only_matching': True,
-    }, {
-        'url': 'francetv:NI_657393@Regions',
-        'only_matching': True,
-    }, {
-        # france-3 live
-        'url': 'francetv:SIM_France3',
-        'only_matching': True,
-    }]
-
     def _extract_video(self, video_id, catalogue=None):
-        # Videos are identified by idDiffusion so catalogue part is optional.
-        # However when provided, some extra formats may be returned so we pass
-        # it if available.
         info = self._download_json(
             'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/',
             video_id, 'Downloading video JSON', query={
@@ -97,8 +27,7 @@ class FranceTVIE(InfoExtractor):
 
         if info.get('status') == 'NOK':
             raise ExtractorError(
-                '%s returned error: %s' % (self.IE_NAME, info['message']),
-                expected=True)
+                '%s returned error: %s' % (self.IE_NAME, info['message']), expected=True)
         allowed_countries = info['videos'][0].get('geoblocage')
         if allowed_countries:
             georestricted = True
@@ -113,21 +42,6 @@ class FranceTVIE(InfoExtractor):
         else:
             georestricted = False
 
-        def sign(manifest_url, manifest_id):
-            for host in ('hdfauthftv-a.akamaihd.net', 'hdfauth.francetv.fr'):
-                signed_url = self._download_webpage(
-                    'https://%s/esi/TA' % host, video_id,
-                    'Downloading signed %s manifest URL' % manifest_id,
-                    fatal=False, query={
-                        'url': manifest_url,
-                    })
-                if (signed_url and isinstance(signed_url, compat_str) and
-                        re.search(r'^(?:https?:)?//', signed_url)):
-                    return signed_url
-            return manifest_url
-
-        is_live = None
-
         formats = []
         for video in info['videos']:
             if video['statut'] != 'ONLINE':
@@ -135,10 +49,6 @@ class FranceTVIE(InfoExtractor):
             video_url = video['url']
             if not video_url:
                 continue
-            if is_live is None:
-                is_live = (try_get(
-                    video, lambda x: x['plages_ouverture'][0]['direct'],
-                    bool) is True) or '/live.francetv.fr/' in video_url
             format_id = video['format']
             ext = determine_ext(video_url)
             if ext == 'f4m':
@@ -146,14 +56,17 @@ class FranceTVIE(InfoExtractor):
                     # See https://github.com/rg3/youtube-dl/issues/3963
                    # m3u8 urls work fine
                     continue
+                f4m_url = self._download_webpage(
+                    'http://hdfauth.francetv.fr/esi/TA?url=%s' % video_url,
+                    video_id, 'Downloading f4m manifest token', fatal=False)
|
||||||
|
if f4m_url:
|
||||||
formats.extend(self._extract_f4m_formats(
|
formats.extend(self._extract_f4m_formats(
|
||||||
sign(video_url, format_id) + '&hdcore=3.7.0&plugin=aasp-3.7.0.39.44',
|
f4m_url + '&hdcore=3.7.0&plugin=aasp-3.7.0.39.44',
|
||||||
video_id, f4m_id=format_id, fatal=False))
|
video_id, f4m_id=format_id, fatal=False))
|
||||||
elif ext == 'm3u8':
|
elif ext == 'm3u8':
|
||||||
formats.extend(self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
sign(video_url, format_id), video_id, 'mp4',
|
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||||
entry_protocol='m3u8_native', m3u8_id=format_id,
|
m3u8_id=format_id, fatal=False))
|
||||||
fatal=False))
|
|
||||||
elif video_url.startswith('rtmp'):
|
elif video_url.startswith('rtmp'):
|
||||||
formats.append({
|
formats.append({
|
||||||
'url': video_url,
|
'url': video_url,
|
||||||
@ -184,48 +97,33 @@ class FranceTVIE(InfoExtractor):
|
|||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'title': self._live_title(title) if is_live else title,
|
'title': title,
|
||||||
'description': clean_html(info['synopsis']),
|
'description': clean_html(info['synopsis']),
|
||||||
'thumbnail': compat_urlparse.urljoin('http://pluzz.francetv.fr', info['image']),
|
'thumbnail': compat_urlparse.urljoin('http://pluzz.francetv.fr', info['image']),
|
||||||
'duration': int_or_none(info.get('real_duration')) or parse_duration(info['duree']),
|
'duration': int_or_none(info.get('real_duration')) or parse_duration(info['duree']),
|
||||||
'timestamp': int_or_none(info['diffusion']['timestamp']),
|
'timestamp': int_or_none(info['diffusion']['timestamp']),
|
||||||
'is_live': is_live,
|
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
}
|
}
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
mobj = re.match(self._VALID_URL, url)
|
|
||||||
video_id = mobj.group('id')
|
|
||||||
catalog = mobj.group('catalog')
|
|
||||||
|
|
||||||
if not video_id:
|
class FranceTVIE(FranceTVBaseInfoExtractor):
|
||||||
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
|
|
||||||
video_id = qs.get('idDiffusion', [None])[0]
|
|
||||||
catalog = qs.get('catalogue', [None])[0]
|
|
||||||
if not video_id:
|
|
||||||
raise ExtractorError('Invalid URL', expected=True)
|
|
||||||
|
|
||||||
return self._extract_video(video_id, catalog)
|
|
||||||
|
|
||||||
|
|
||||||
class FranceTVSiteIE(FranceTVBaseInfoExtractor):
|
|
||||||
_VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html'
|
_VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.france.tv/france-2/13h15-le-dimanche/140921-les-mysteres-de-jesus.html',
|
'url': 'https://www.france.tv/france-2/13h15-le-dimanche/140921-les-mysteres-de-jesus.html',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '162311093',
|
'id': '157550144',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': '13h15, le dimanche... - Les mystères de Jésus',
|
'title': '13h15, le dimanche... - Les mystères de Jésus',
|
||||||
'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42',
|
'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42',
|
||||||
'timestamp': 1502623500,
|
'timestamp': 1494156300,
|
||||||
'upload_date': '20170813',
|
'upload_date': '20170507',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
|
# m3u8 downloads
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'add_ie': [FranceTVIE.ie_key()],
|
|
||||||
}, {
|
}, {
|
||||||
# france3
|
# france3
|
||||||
'url': 'https://www.france.tv/france-3/des-chiffres-et-des-lettres/139063-emission-du-mardi-9-mai-2017.html',
|
'url': 'https://www.france.tv/france-3/des-chiffres-et-des-lettres/139063-emission-du-mardi-9-mai-2017.html',
|
||||||
@ -258,10 +156,6 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.france.tv/142749-rouge-sang.html',
|
'url': 'https://www.france.tv/142749-rouge-sang.html',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
|
||||||
# france-3 live
|
|
||||||
'url': 'https://www.france.tv/france-3/direct.html',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
@ -278,14 +172,13 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
|
|||||||
video_id, catalogue = self._html_search_regex(
|
video_id, catalogue = self._html_search_regex(
|
||||||
r'(?:href=|player\.setVideo\(\s*)"http://videos?\.francetv\.fr/video/([^@]+@[^"]+)"',
|
r'(?:href=|player\.setVideo\(\s*)"http://videos?\.francetv\.fr/video/([^@]+@[^"]+)"',
|
||||||
webpage, 'video ID').split('@')
|
webpage, 'video ID').split('@')
|
||||||
|
return self._extract_video(video_id, catalogue)
|
||||||
return self._make_url_result(video_id, catalogue)
|
|
||||||
|
|
||||||
|
|
||||||
class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
|
class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
|
||||||
_VALID_URL = r'https?://embed\.francetv\.fr/*\?.*?\bue=(?P<id>[^&]+)'
|
_VALID_URL = r'https?://embed\.francetv\.fr/*\?.*?\bue=(?P<id>[^&]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TEST = {
|
||||||
'url': 'http://embed.francetv.fr/?ue=7fd581a2ccf59d2fc5719c5c13cf6961',
|
'url': 'http://embed.francetv.fr/?ue=7fd581a2ccf59d2fc5719c5c13cf6961',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'NI_983319',
|
'id': 'NI_983319',
|
||||||
@ -295,11 +188,7 @@ class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
|
|||||||
'timestamp': 1493981780,
|
'timestamp': 1493981780,
|
||||||
'duration': 16,
|
'duration': 16,
|
||||||
},
|
},
|
||||||
'params': {
|
}
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
'add_ie': [FranceTVIE.ie_key()],
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
@ -308,12 +197,12 @@ class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
|
|||||||
'http://api-embed.webservices.francetelevisions.fr/key/%s' % video_id,
|
'http://api-embed.webservices.francetelevisions.fr/key/%s' % video_id,
|
||||||
video_id)
|
video_id)
|
||||||
|
|
||||||
return self._make_url_result(video['video_id'], video.get('catalog'))
|
return self._extract_video(video['video_id'], video.get('catalog'))
|
||||||
|
|
||||||
|
|
||||||
class FranceTVInfoIE(FranceTVBaseInfoExtractor):
|
class FranceTVInfoIE(FranceTVBaseInfoExtractor):
|
||||||
IE_NAME = 'francetvinfo.fr'
|
IE_NAME = 'francetvinfo.fr'
|
||||||
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&.]+)'
|
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<title>[^/?#&.]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html',
|
'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html',
|
||||||
@ -328,18 +217,51 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
|
# m3u8 downloads
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'add_ie': [FranceTVIE.ie_key()],
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.francetvinfo.fr/elections/europeennes/direct-europeennes-regardez-le-debat-entre-les-candidats-a-la-presidence-de-la-commission_600639.html',
|
'url': 'http://www.francetvinfo.fr/elections/europeennes/direct-europeennes-regardez-le-debat-entre-les-candidats-a-la-presidence-de-la-commission_600639.html',
|
||||||
'only_matching': True,
|
'info_dict': {
|
||||||
|
'id': 'EV_20019',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Débat des candidats à la Commission européenne',
|
||||||
|
'description': 'Débat des candidats à la Commission européenne',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': 'HLS (reqires ffmpeg)'
|
||||||
|
},
|
||||||
|
'skip': 'Ce direct est terminé et sera disponible en rattrapage dans quelques minutes.',
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.francetvinfo.fr/economie/entreprises/les-entreprises-familiales-le-secret-de-la-reussite_933271.html',
|
'url': 'http://www.francetvinfo.fr/economie/entreprises/les-entreprises-familiales-le-secret-de-la-reussite_933271.html',
|
||||||
'only_matching': True,
|
'md5': 'f485bda6e185e7d15dbc69b72bae993e',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'NI_173343',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Les entreprises familiales : le secret de la réussite',
|
||||||
|
'thumbnail': r're:^https?://.*\.jpe?g$',
|
||||||
|
'timestamp': 1433273139,
|
||||||
|
'upload_date': '20150602',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
# m3u8 downloads
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://france3-regions.francetvinfo.fr/bretagne/cotes-d-armor/thalassa-echappee-breizh-ce-venredi-dans-les-cotes-d-armor-954961.html',
|
'url': 'http://france3-regions.francetvinfo.fr/bretagne/cotes-d-armor/thalassa-echappee-breizh-ce-venredi-dans-les-cotes-d-armor-954961.html',
|
||||||
'only_matching': True,
|
'md5': 'f485bda6e185e7d15dbc69b72bae993e',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'NI_657393',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Olivier Monthus, réalisateur de "Bretagne, le choix de l’Armor"',
|
||||||
|
'description': 'md5:a3264114c9d29aeca11ced113c37b16c',
|
||||||
|
'thumbnail': r're:^https?://.*\.jpe?g$',
|
||||||
|
'timestamp': 1458300695,
|
||||||
|
'upload_date': '20160318',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
# Dailymotion embed
|
# Dailymotion embed
|
||||||
'url': 'http://www.francetvinfo.fr/politique/notre-dame-des-landes/video-sur-france-inter-cecile-duflot-denonce-le-regard-meprisant-de-patrick-cohen_1520091.html',
|
'url': 'http://www.francetvinfo.fr/politique/notre-dame-des-landes/video-sur-france-inter-cecile-duflot-denonce-le-regard-meprisant-de-patrick-cohen_1520091.html',
|
||||||
@ -361,9 +283,9 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
|
|||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
mobj = re.match(self._VALID_URL, url)
|
||||||
|
page_title = mobj.group('title')
|
||||||
webpage = self._download_webpage(url, display_id)
|
webpage = self._download_webpage(url, page_title)
|
||||||
|
|
||||||
dailymotion_urls = DailymotionIE._extract_urls(webpage)
|
dailymotion_urls = DailymotionIE._extract_urls(webpage)
|
||||||
if dailymotion_urls:
|
if dailymotion_urls:
|
||||||
@ -375,13 +297,12 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
|
|||||||
(r'id-video=([^@]+@[^"]+)',
|
(r'id-video=([^@]+@[^"]+)',
|
||||||
r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"'),
|
r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"'),
|
||||||
webpage, 'video id').split('@')
|
webpage, 'video id').split('@')
|
||||||
|
return self._extract_video(video_id, catalogue)
|
||||||
return self._make_url_result(video_id, catalogue)
|
|
||||||
|
|
||||||
|
|
||||||
class GenerationWhatIE(InfoExtractor):
|
class GenerationWhatIE(InfoExtractor):
|
||||||
IE_NAME = 'france2.fr:generation-what'
|
IE_NAME = 'france2.fr:generation-what'
|
||||||
_VALID_URL = r'https?://generation-what\.francetv\.fr/[^/]+/video/(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https?://generation-what\.francetv\.fr/[^/]+/video/(?P<id>[^/?#]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://generation-what.francetv.fr/portrait/video/present-arms',
|
'url': 'http://generation-what.francetv.fr/portrait/video/present-arms',
|
||||||
@ -393,10 +314,6 @@ class GenerationWhatIE(InfoExtractor):
|
|||||||
'uploader_id': 'UCHH9p1eetWCgt4kXBYCb3_w',
|
'uploader_id': 'UCHH9p1eetWCgt4kXBYCb3_w',
|
||||||
'upload_date': '20160411',
|
'upload_date': '20160411',
|
||||||
},
|
},
|
||||||
'params': {
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
'add_ie': ['Youtube'],
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://generation-what.francetv.fr/europe/video/present-arms',
|
'url': 'http://generation-what.francetv.fr/europe/video/present-arms',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -404,87 +321,42 @@ class GenerationWhatIE(InfoExtractor):
|
|||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
|
|
||||||
webpage = self._download_webpage(url, display_id)
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
youtube_id = self._search_regex(
|
youtube_id = self._search_regex(
|
||||||
r"window\.videoURL\s*=\s*'([0-9A-Za-z_-]{11})';",
|
r"window\.videoURL\s*=\s*'([0-9A-Za-z_-]{11})';",
|
||||||
webpage, 'youtube id')
|
webpage, 'youtube id')
|
||||||
|
return self.url_result(youtube_id, 'Youtube', youtube_id)
|
||||||
return self.url_result(youtube_id, ie='Youtube', video_id=youtube_id)
|
|
||||||
|
|
||||||
|
|
||||||
class CultureboxIE(FranceTVBaseInfoExtractor):
|
class CultureboxIE(FranceTVBaseInfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:m\.)?culturebox\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
IE_NAME = 'culturebox.francetvinfo.fr'
|
||||||
|
_VALID_URL = r'https?://(?:m\.)?culturebox\.francetvinfo\.fr/(?P<name>.*?)(\?|$)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TEST = {
|
||||||
'url': 'https://culturebox.francetvinfo.fr/opera-classique/musique-classique/c-est-baroque/concerts/cantates-bwv-4-106-et-131-de-bach-par-raphael-pichon-57-268689',
|
'url': 'http://culturebox.francetvinfo.fr/live/musique/musique-classique/le-livre-vermeil-de-montserrat-a-la-cathedrale-delne-214511',
|
||||||
|
'md5': '9b88dc156781c4dbebd4c3e066e0b1d6',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'EV_134885',
|
'id': 'EV_50111',
|
||||||
'ext': 'mp4',
|
'ext': 'flv',
|
||||||
'title': 'Cantates BWV 4, 106 et 131 de Bach par Raphaël Pichon 5/7',
|
'title': "Le Livre Vermeil de Montserrat à la Cathédrale d'Elne",
|
||||||
'description': 'md5:19c44af004b88219f4daa50fa9a351d4',
|
'description': 'md5:f8a4ad202e8fe533e2c493cc12e739d9',
|
||||||
'upload_date': '20180206',
|
'upload_date': '20150320',
|
||||||
'timestamp': 1517945220,
|
'timestamp': 1426892400,
|
||||||
'duration': 5981,
|
'duration': 2760.9,
|
||||||
},
|
},
|
||||||
'params': {
|
}
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
'add_ie': [FranceTVIE.ie_key()],
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
mobj = re.match(self._VALID_URL, url)
|
||||||
|
name = mobj.group('name')
|
||||||
|
|
||||||
webpage = self._download_webpage(url, display_id)
|
webpage = self._download_webpage(url, name)
|
||||||
|
|
||||||
if ">Ce live n'est plus disponible en replay<" in webpage:
|
if ">Ce live n'est plus disponible en replay<" in webpage:
|
||||||
raise ExtractorError(
|
raise ExtractorError('Video %s is not available' % name, expected=True)
|
||||||
'Video %s is not available' % display_id, expected=True)
|
|
||||||
|
|
||||||
video_id, catalogue = self._search_regex(
|
video_id, catalogue = self._search_regex(
|
||||||
r'["\'>]https?://videos\.francetv\.fr/video/([^@]+@.+?)["\'<]',
|
r'["\'>]https?://videos\.francetv\.fr/video/([^@]+@.+?)["\'<]',
|
||||||
webpage, 'video id').split('@')
|
webpage, 'video id').split('@')
|
||||||
|
|
||||||
return self._make_url_result(video_id, catalogue)
|
return self._extract_video(video_id, catalogue)
|
||||||
|
|
||||||
|
|
||||||
class FranceTVJeunesseIE(FranceTVBaseInfoExtractor):
|
|
||||||
_VALID_URL = r'(?P<url>https?://(?:www\.)?(?:zouzous|ludo)\.fr/heros/(?P<id>[^/?#&]+))'
|
|
||||||
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://www.zouzous.fr/heros/simon',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'simon',
|
|
||||||
},
|
|
||||||
'playlist_count': 9,
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.ludo.fr/heros/ninjago',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'ninjago',
|
|
||||||
},
|
|
||||||
'playlist_count': 10,
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.zouzous.fr/heros/simon?abc',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
mobj = re.match(self._VALID_URL, url)
|
|
||||||
playlist_id = mobj.group('id')
|
|
||||||
|
|
||||||
playlist = self._download_json(
|
|
||||||
'%s/%s' % (mobj.group('url'), 'playlist'), playlist_id)
|
|
||||||
|
|
||||||
if not playlist.get('count'):
|
|
||||||
raise ExtractorError(
|
|
||||||
'%s is not available' % playlist_id, expected=True)
|
|
||||||
|
|
||||||
entries = []
|
|
||||||
for item in playlist['items']:
|
|
||||||
identity = item.get('identity')
|
|
||||||
if identity and isinstance(identity, compat_str):
|
|
||||||
entries.append(self._make_url_result(identity))
|
|
||||||
|
|
||||||
return self.playlist_result(entries, playlist_id)
|
|
||||||
|
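As context for the FranceTV hunks above: the removed `FranceTVIE._real_extract` falls back to reading `idDiffusion` and `catalogue` from the query string of a webservice URL. A minimal standalone sketch of that fallback using only the standard library (the helper name and the `ValueError` are ours for illustration; youtube-dl raises `ExtractorError` instead):

```python
from urllib.parse import parse_qs, urlparse

def parse_webservice_url(url):
    """Mirror the removed fallback: pull idDiffusion and catalogue
    out of a getInfosOeuvre-style URL's query string."""
    qs = parse_qs(urlparse(url).query)
    video_id = qs.get('idDiffusion', [None])[0]
    catalog = qs.get('catalogue', [None])[0]
    if not video_id:
        raise ValueError('Invalid URL')
    return video_id, catalog

video_id, catalog = parse_webservice_url(
    'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/'
    '?idDiffusion=NI_1004933&catalogue=Zouzous')
```

`parse_qs` returns lists, hence the `[None])[0]` dance to get a single optional value.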
--- a/youtube_dl/extractor/funk.py
+++ b/youtube_dl/extractor/funk.py
@@ -1,102 +1,43 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 from .nexx import NexxIE
-from ..utils import int_or_none
+from ..utils import extract_attributes
 
 
-class FunkBaseIE(InfoExtractor):
-    def _make_url_result(self, video):
-        return {
-            '_type': 'url_transparent',
-            'url': 'nexx:741:%s' % video['sourceId'],
-            'ie_key': NexxIE.ie_key(),
-            'id': video['sourceId'],
-            'title': video.get('title'),
-            'description': video.get('description'),
-            'duration': int_or_none(video.get('duration')),
-            'season_number': int_or_none(video.get('seasonNr')),
-            'episode_number': int_or_none(video.get('episodeNr')),
-        }
-
-
-class FunkMixIE(FunkBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?funk\.net/mix/(?P<id>[^/]+)/(?P<alias>[^/?#&]+)'
+class FunkIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?funk\.net/(?:mix|channel)/(?:[^/]+/)*(?P<id>[^?/#]+)'
     _TESTS = [{
-        'url': 'https://www.funk.net/mix/59d65d935f8b160001828b5b/die-realste-kifferdoku-aller-zeiten',
-        'md5': '8edf617c2f2b7c9847dfda313f199009',
+        'url': 'https://www.funk.net/mix/59d65d935f8b160001828b5b/0/59d517e741dca10001252574/',
+        'md5': '4d40974481fa3475f8bccfd20c5361f8',
         'info_dict': {
-            'id': '123748',
+            'id': '716599',
             'ext': 'mp4',
-            'title': '"Die realste Kifferdoku aller Zeiten"',
-            'description': 'md5:c97160f5bafa8d47ec8e2e461012aa9d',
-            'timestamp': 1490274721,
-            'upload_date': '20170323',
-        },
-    }]
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        mix_id = mobj.group('id')
-        alias = mobj.group('alias')
-
-        lists = self._download_json(
-            'https://www.funk.net/api/v3.1/curation/curatedLists/',
-            mix_id, headers={
-                'authorization': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbGllbnROYW1lIjoiY3VyYXRpb24tdG9vbC12Mi4wIiwic2NvcGUiOiJzdGF0aWMtY29udGVudC1hcGksY3VyYXRpb24tc2VydmljZSxzZWFyY2gtYXBpIn0.SGCC1IXHLtZYoo8PvRKlU2gXH1su8YSu47sB3S4iXBI',
-                'Referer': url,
-            }, query={
-                'size': 100,
-            })['result']['lists']
-
-        metas = next(
-            l for l in lists
-            if mix_id in (l.get('entityId'), l.get('alias')))['videoMetas']
-        video = next(
-            meta['videoDataDelegate']
-            for meta in metas if meta.get('alias') == alias)
-
-        return self._make_url_result(video)
-
-
-class FunkChannelIE(FunkBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?funk\.net/channel/(?P<id>[^/]+)/(?P<alias>[^/?#&]+)'
-    _TESTS = [{
-        'url': 'https://www.funk.net/channel/ba/die-lustigsten-instrumente-aus-dem-internet-teil-2',
-        'info_dict': {
-            'id': '1155821',
-            'ext': 'mp4',
-            'title': 'Die LUSTIGSTEN INSTRUMENTE aus dem Internet - Teil 2',
-            'description': 'md5:a691d0413ef4835588c5b03ded670c1f',
-            'timestamp': 1514507395,
-            'upload_date': '20171229',
+            'title': 'Neue Rechte Welle',
+            'description': 'md5:a30a53f740ffb6bfd535314c2cc5fb69',
+            'timestamp': 1501337639,
+            'upload_date': '20170729',
         },
         'params': {
+            'format': 'bestvideo',
             'skip_download': True,
         },
     }, {
-        'url': 'https://www.funk.net/channel/59d5149841dca100012511e3/mein-erster-job-lovemilla-folge-1/lovemilla/',
+        'url': 'https://www.funk.net/channel/59d5149841dca100012511e3/0/59d52049999264000182e79d/',
         'only_matching': True,
     }]
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        channel_id = mobj.group('id')
-        alias = mobj.group('alias')
-
-        results = self._download_json(
-            'https://www.funk.net/api/v3.0/content/videos/filter', channel_id,
-            headers={
-                'authorization': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbGllbnROYW1lIjoiY3VyYXRpb24tdG9vbCIsInNjb3BlIjoic3RhdGljLWNvbnRlbnQtYXBpLGN1cmF0aW9uLWFwaSxzZWFyY2gtYXBpIn0.q4Y2xZG8PFHai24-4Pjx2gym9RmJejtmK6lMXP5wAgc',
-                'Referer': url,
-            }, query={
-                'channelId': channel_id,
-                'size': 100,
-            })['result']
-
-        video = next(r for r in results if r.get('alias') == alias)
-
-        return self._make_url_result(video)
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        domain_id = NexxIE._extract_domain_id(webpage) or '741'
+        nexx_id = extract_attributes(self._search_regex(
+            r'(<div[^>]id=["\']mediaplayer-funk[^>]+>)',
+            webpage, 'media player'))['data-id']
+
+        return self.url_result(
+            'nexx:%s:%s' % (domain_id, nexx_id), ie=NexxIE.ie_key(),
+            video_id=nexx_id)
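Both removed Funk extractors pick one entry out of the API response by its `alias`, using `next()` over a generator expression. A small self-contained sketch of the same selection (the sample data is invented, shaped loosely like the channel API's `result` list):

```python
def find_by_alias(results, alias):
    # First result whose 'alias' matches; raises StopIteration if none
    # do, mirroring the extractor's bare next() on a generator.
    return next(r for r in results if r.get('alias') == alias)

# Invented sample data for illustration only.
results = [
    {'alias': 'other-clip', 'sourceId': '100'},
    {'alias': 'die-realste-kifferdoku-aller-zeiten', 'sourceId': '123748'},
]
video = find_by_alias(results, 'die-realste-kifferdoku-aller-zeiten')
```

The uncaught `StopIteration` is how the real extractor surfaces a missing alias as a hard failure.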
--- a/youtube_dl/extractor/fusion.py
+++ b/youtube_dl/extractor/fusion.py
@@ -5,9 +5,9 @@ from .ooyala import OoyalaIE
 
 
 class FusionIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?fusion\.(?:net|tv)/video/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?fusion\.net/video/(?P<id>\d+)'
     _TESTS = [{
-        'url': 'http://fusion.tv/video/201781/u-s-and-panamanian-forces-work-together-to-stop-a-vessel-smuggling-drugs/',
+        'url': 'http://fusion.net/video/201781/u-s-and-panamanian-forces-work-together-to-stop-a-vessel-smuggling-drugs/',
         'info_dict': {
             'id': 'ZpcWNoMTE6x6uVIIWYpHh0qQDjxBuq5P',
             'ext': 'mp4',
@@ -20,7 +20,7 @@ class FusionIE(InfoExtractor):
         },
         'add_ie': ['Ooyala'],
     }, {
-        'url': 'http://fusion.tv/video/201781',
+        'url': 'http://fusion.net/video/201781',
         'only_matching': True,
     }]
 
--- a/youtube_dl/extractor/gameinformer.py
+++ b/youtube_dl/extractor/gameinformer.py
@@ -23,11 +23,6 @@ class GameInformerIE(InfoExtractor):
 
     def _real_extract(self, url):
         display_id = self._match_id(url)
-        webpage = self._download_webpage(
-            url, display_id, headers=self.geo_verification_headers())
-        brightcove_id = self._search_regex(
-            [r'<[^>]+\bid=["\']bc_(\d+)', r"getVideo\('[^']+video_id=(\d+)"],
-            webpage, 'brightcove id')
-        return self.url_result(
-            self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew',
-            brightcove_id)
+        webpage = self._download_webpage(url, display_id)
+        brightcove_id = self._search_regex(r"getVideo\('[^']+video_id=(\d+)", webpage, 'brightcove id')
+        return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
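The removed GameInformer code passes a list of two regexes to `_search_regex`, which tries them in order and uses the first capture that matches. A standalone sketch of that first-match-wins lookup (the sample HTML is invented, not taken from gameinformer.com):

```python
import re

# The two patterns from the removed side of the hunk, tried in order.
PATTERNS = [r'<[^>]+\bid=["\']bc_(\d+)', r"getVideo\('[^']+video_id=(\d+)"]

def first_match(patterns, html):
    # Return the first capture group of the first pattern that matches,
    # or None, loosely mimicking _search_regex with a pattern list.
    for pattern in patterns:
        m = re.search(pattern, html)
        if m:
            return m.group(1)
    return None

html = '<video id="bc_2694167"></video>'  # invented sample markup
brightcove_id = first_match(PATTERNS, html)
```

Ordering matters here: the `id="bc_..."` form is preferred, and the older `getVideo(` form is kept as a fallback.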
--- a/youtube_dl/extractor/gamestar.py
+++ b/youtube_dl/extractor/gamestar.py
@@ -1,8 +1,6 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 from ..utils import (
     int_or_none,
@@ -11,52 +9,44 @@ from ..utils import (
 
 
 class GameStarIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?game(?P<site>pro|star)\.de/videos/.*,(?P<id>[0-9]+)\.html'
-    _TESTS = [{
+    _VALID_URL = r'https?://(?:www\.)?gamestar\.de/videos/.*,(?P<id>[0-9]+)\.html'
+    _TEST = {
         'url': 'http://www.gamestar.de/videos/trailer,3/hobbit-3-die-schlacht-der-fuenf-heere,76110.html',
-        'md5': 'ee782f1f8050448c95c5cacd63bc851c',
+        'md5': '96974ecbb7fd8d0d20fca5a00810cea7',
         'info_dict': {
             'id': '76110',
             'ext': 'mp4',
             'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
             'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'timestamp': 1406542380,
+            'timestamp': 1406542020,
             'upload_date': '20140728',
-            'duration': 17,
-        }
-    }, {
-        'url': 'http://www.gamepro.de/videos/top-10-indie-spiele-fuer-nintendo-switch-video-tolle-nindies-games-zum-download,95316.html',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.gamestar.de/videos/top-10-indie-spiele-fuer-nintendo-switch-video-tolle-nindies-games-zum-download,95316.html',
-        'only_matching': True,
-    }]
+            'duration': 17
+        }
+    }
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        site = mobj.group('site')
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
 
         webpage = self._download_webpage(url, video_id)
 
+        url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id
+
         # TODO: there are multiple ld+json objects in the webpage,
         # while _search_json_ld finds only the first one
         json_ld = self._parse_json(self._search_regex(
             r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>[^<]+VideoObject[^<]+)</script>',
             webpage, 'JSON-LD', group='json_ld'), video_id)
         info_dict = self._json_ld(json_ld, video_id)
-        info_dict['title'] = remove_end(
-            info_dict['title'], ' - Game%s' % site.title())
+        info_dict['title'] = remove_end(info_dict['title'], ' - GameStar')
 
-        view_count = int_or_none(json_ld.get('interactionCount'))
+        view_count = json_ld.get('interactionCount')
         comment_count = int_or_none(self._html_search_regex(
-            r'<span>Kommentare</span>\s*<span[^>]+class=["\']count[^>]+>\s*\(\s*([0-9]+)',
-            webpage, 'comment count', fatal=False))
+            r'([0-9]+) Kommentare</span>', webpage, 'comment_count',
+            fatal=False))
 
         info_dict.update({
             'id': video_id,
-            'url': 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id,
+            'url': url,
             'ext': 'mp4',
             'view_count': view_count,
             'comment_count': comment_count
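The removed GameStar pattern generalizes the extractor to gamepro.de by capturing the site name, which the code then turns into the ` - GamePro`/` - GameStar` title suffix that `remove_end` strips. A quick standalone check of how those groups come out (illustrative only, using the test URL from the hunk):

```python
import re

# _VALID_URL from the removed (left) side of the hunk.
VALID_URL = r'https?://(?:www\.)?game(?P<site>pro|star)\.de/videos/.*,(?P<id>[0-9]+)\.html'

m = re.match(
    VALID_URL,
    'http://www.gamestar.de/videos/trailer,3/'
    'hobbit-3-die-schlacht-der-fuenf-heere,76110.html')
site, video_id = m.group('site'), m.group('id')

# What remove_end() strips from the JSON-LD title in the removed code.
suffix = ' - Game%s' % site.title()
```

Note the greedy `.*,` means the numeric id is taken from the last comma-separated segment before `.html`.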
@@ -101,10 +101,6 @@ from .vzaar import VzaarIE
 from .channel9 import Channel9IE
 from .vshare import VShareIE
 from .mediasite import MediasiteIE
-from .springboardplatform import SpringboardPlatformIE
-from .yapfiles import YapFilesIE
-from .vice import ViceIE
-from .xfileshare import XFileShareIE
 
 
 class GenericIE(InfoExtractor):
@@ -1270,6 +1266,24 @@ class GenericIE(InfoExtractor):
             },
             'add_ie': ['Kaltura'],
         },
+        # EaglePlatform embed (generic URL)
+        {
+            'url': 'http://lenta.ru/news/2015/03/06/navalny/',
+            # Not checking MD5 as sometimes the direct HTTP link results in 404 and HLS is used
+            'info_dict': {
+                'id': '227304',
+                'ext': 'mp4',
+                'title': 'Навальный вышел на свободу',
+                'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
+                'thumbnail': r're:^https?://.*\.jpg$',
+                'duration': 87,
+                'view_count': int,
+                'age_limit': 0,
+            },
+            'params': {
+                'skip_download': True,
+            },
+        },
         # referrer protected EaglePlatform embed
         {
             'url': 'https://tvrain.ru/lite/teleshow/kak_vse_nachinalos/namin-418921/',
@@ -1924,49 +1938,6 @@ class GenericIE(InfoExtractor):
                 'timestamp': 1474354800,
                 'upload_date': '20160920',
             }
-        },
-        {
-            'url': 'http://www.kidzworld.com/article/30935-trolls-the-beat-goes-on-interview-skylar-astin-and-amanda-leighton',
-            'info_dict': {
-                'id': '1731611',
-                'ext': 'mp4',
-                'title': 'Official Trailer | TROLLS: THE BEAT GOES ON!',
-                'description': 'md5:eb5f23826a027ba95277d105f248b825',
-                'timestamp': 1516100691,
-                'upload_date': '20180116',
-            },
-            'params': {
-                'skip_download': True,
-            },
-            'add_ie': [SpringboardPlatformIE.ie_key()],
-        },
-        {
-            'url': 'https://www.youtube.com/shared?ci=1nEzmT-M4fU',
-            'info_dict': {
-                'id': 'uPDB5I9wfp8',
-                'ext': 'webm',
-                'title': 'Pocoyo: 90 minutos de episódios completos Português para crianças - PARTE 3',
-                'description': 'md5:d9e4d9346a2dfff4c7dc4c8cec0f546d',
-                'upload_date': '20160219',
-                'uploader': 'Pocoyo - Português (BR)',
-                'uploader_id': 'PocoyoBrazil',
-            },
-            'add_ie': [YoutubeIE.ie_key()],
-            'params': {
-                'skip_download': True,
-            },
-        },
-        {
-            'url': 'https://www.yapfiles.ru/show/1872528/690b05d3054d2dbe1e69523aa21bb3b1.mp4.html',
-            'info_dict': {
-                'id': 'vMDE4NzI1Mjgt690b',
-                'ext': 'mp4',
-                'title': 'Котята',
-            },
-            'add_ie': [YapFilesIE.ie_key()],
-            'params': {
-                'skip_download': True,
-            },
         }
         # {
         #     # TODO: find another test
@@ -2214,11 +2185,7 @@ class GenericIE(InfoExtractor):
             self._sort_formats(smil['formats'])
             return smil
         elif doc.tag == '{http://xspf.org/ns/0/}playlist':
-            return self.playlist_result(
-                self._parse_xspf(
-                    doc, video_id, xspf_url=url,
-                    xspf_base_url=compat_str(full_response.geturl())),
-                video_id)
+            return self.playlist_result(self._parse_xspf(doc, video_id), video_id)
         elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
             info_dict['formats'] = self._parse_mpd_formats(
                 doc,
@@ -2297,10 +2264,7 @@ class GenericIE(InfoExtractor):
         # Look for Brightcove New Studio embeds
         bc_urls = BrightcoveNewIE._extract_urls(self, webpage)
         if bc_urls:
-            return self.playlist_from_matches(
-                bc_urls, video_id, video_title,
-                getter=lambda x: smuggle_url(x, {'referrer': url}),
-                ie='BrightcoveNew')
+            return self.playlist_from_matches(bc_urls, video_id, video_title, ie='BrightcoveNew')
 
         # Look for Nexx embeds
         nexx_urls = NexxIE._extract_urls(webpage)
@@ -2744,9 +2708,9 @@ class GenericIE(InfoExtractor):
             return self.url_result(viewlift_url)
 
         # Look for JWPlatform embeds
-        jwplatform_urls = JWPlatformIE._extract_urls(webpage)
-        if jwplatform_urls:
-            return self.playlist_from_matches(jwplatform_urls, video_id, video_title, ie=JWPlatformIE.ie_key())
+        jwplatform_url = JWPlatformIE._extract_url(webpage)
+        if jwplatform_url:
+            return self.url_result(jwplatform_url, 'JWPlatform')
 
         # Look for Digiteka embeds
         digiteka_url = DigitekaIE._extract_url(webpage)
@@ -2942,27 +2906,6 @@ class GenericIE(InfoExtractor):
                        for mediasite_url in mediasite_urls]
            return self.playlist_result(entries, video_id, video_title)
 
-        springboardplatform_urls = SpringboardPlatformIE._extract_urls(webpage)
-        if springboardplatform_urls:
-            return self.playlist_from_matches(
-                springboardplatform_urls, video_id, video_title,
-                ie=SpringboardPlatformIE.ie_key())
-
-        yapfiles_urls = YapFilesIE._extract_urls(webpage)
-        if yapfiles_urls:
-            return self.playlist_from_matches(
-                yapfiles_urls, video_id, video_title, ie=YapFilesIE.ie_key())
-
-        vice_urls = ViceIE._extract_urls(webpage)
-        if vice_urls:
-            return self.playlist_from_matches(
-                vice_urls, video_id, video_title, ie=ViceIE.ie_key())
-
-        xfileshare_urls = XFileShareIE._extract_urls(webpage)
-        if xfileshare_urls:
-            return self.playlist_from_matches(
-                xfileshare_urls, video_id, video_title, ie=XFileShareIE.ie_key())
-
         def merge_dicts(dict1, dict2):
             merged = {}
             for k, v in dict1.items():
@@ -2,14 +2,11 @@
 from __future__ import unicode_literals
 
 from .common import InfoExtractor
-from .kaltura import KalturaIE
 from .youtube import YoutubeIE
 from ..utils import (
     determine_ext,
     int_or_none,
-    NO_DEFAULT,
     parse_iso8601,
-    smuggle_url,
     xpath_text,
 )
 
@@ -17,19 +14,18 @@ from ..utils import (
 class HeiseIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?heise\.de/(?:[^/]+/)+[^/]+-(?P<id>[0-9]+)\.html'
     _TESTS = [{
-        # kaltura embed
         'url': 'http://www.heise.de/video/artikel/Podcast-c-t-uplink-3-3-Owncloud-Tastaturen-Peilsender-Smartphone-2404147.html',
+        'md5': 'ffed432483e922e88545ad9f2f15d30e',
         'info_dict': {
-            'id': '1_kkrq94sm',
+            'id': '2404147',
             'ext': 'mp4',
             'title': "Podcast: c't uplink 3.3 – Owncloud / Tastaturen / Peilsender Smartphone",
-            'timestamp': 1512734959,
-            'upload_date': '20171208',
+            'format_id': 'mp4_720p',
+            'timestamp': 1411812600,
+            'upload_date': '20140927',
             'description': 'md5:c934cbfb326c669c2bcabcbe3d3fcd20',
-        },
-        'params': {
-            'skip_download': True,
-        },
+            'thumbnail': r're:^https?://.*/gallery/$',
+        }
     }, {
         # YouTube embed
         'url': 'http://www.heise.de/newsticker/meldung/Netflix-In-20-Jahren-vom-Videoverleih-zum-TV-Revolutionaer-3814130.html',
@@ -46,32 +42,6 @@ class HeiseIE(InfoExtractor):
         'params': {
             'skip_download': True,
         },
-    }, {
-        'url': 'https://www.heise.de/video/artikel/nachgehakt-Wie-sichert-das-c-t-Tool-Restric-tor-Windows-10-ab-3700244.html',
-        'info_dict': {
-            'id': '1_ntrmio2s',
-            'ext': 'mp4',
-            'title': "nachgehakt: Wie sichert das c't-Tool Restric'tor Windows 10 ab?",
-            'description': 'md5:47e8ffb6c46d85c92c310a512d6db271',
-            'timestamp': 1512470717,
-            'upload_date': '20171205',
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        'url': 'https://www.heise.de/ct/artikel/c-t-uplink-20-8-Staubsaugerroboter-Xiaomi-Vacuum-2-AR-Brille-Meta-2-und-Android-rooten-3959893.html',
-        'info_dict': {
-            'id': '1_59mk80sf',
-            'ext': 'mp4',
-            'title': "c't uplink 20.8: Staubsaugerroboter Xiaomi Vacuum 2, AR-Brille Meta 2 und Android rooten",
-            'description': 'md5:f50fe044d3371ec73a8f79fcebd74afc',
-            'timestamp': 1517567237,
-            'upload_date': '20180202',
-        },
-        'params': {
-            'skip_download': True,
-        },
     }, {
         'url': 'http://www.heise.de/ct/artikel/c-t-uplink-3-3-Owncloud-Tastaturen-Peilsender-Smartphone-2403911.html',
         'only_matching': True,
@@ -87,45 +57,19 @@ class HeiseIE(InfoExtractor):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
 
-        def extract_title(default=NO_DEFAULT):
-            title = self._html_search_meta(
-                ('fulltitle', 'title'), webpage, default=None)
+        title = self._html_search_meta('fulltitle', webpage, default=None)
         if not title or title == "c't":
             title = self._search_regex(
                 r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"',
-                webpage, 'title', default=None)
-            if not title:
-                title = self._html_search_regex(
-                    r'<h1[^>]+\bclass=["\']article_page_title[^>]+>(.+?)<',
-                    webpage, 'title', default=default)
-            return title
-
-        title = extract_title(default=None)
-        description = self._og_search_description(
-            webpage, default=None) or self._html_search_meta(
-            'description', webpage)
-
-        kaltura_url = KalturaIE._extract_url(webpage)
-        if kaltura_url:
-            return {
-                '_type': 'url_transparent',
-                'url': smuggle_url(kaltura_url, {'source_url': url}),
-                'ie_key': KalturaIE.ie_key(),
-                'title': title,
-                'description': description,
-            }
+                webpage, 'title')
 
         yt_urls = YoutubeIE._extract_urls(webpage)
         if yt_urls:
-            return self.playlist_from_matches(
-                yt_urls, video_id, title, ie=YoutubeIE.ie_key())
-
-        title = extract_title()
+            return self.playlist_from_matches(yt_urls, video_id, title, ie=YoutubeIE.ie_key())
 
         container_id = self._search_regex(
             r'<div class="videoplayerjw"[^>]+data-container="([0-9]+)"',
             webpage, 'container ID')
 
         sequenz_id = self._search_regex(
             r'<div class="videoplayerjw"[^>]+data-sequenz="([0-9]+)"',
             webpage, 'sequenz ID')
@@ -151,6 +95,10 @@ class HeiseIE(InfoExtractor):
             })
         self._sort_formats(formats)
 
+        description = self._og_search_description(
+            webpage, default=None) or self._html_search_meta(
+            'description', webpage)
+
         return {
             'id': video_id,
             'title': title,
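The heise.py hunk above shows the 2018.03.26 side factoring its title lookup into an `extract_title` helper that tries a meta tag, then the JW player's `data-title` attribute, then an `<h1>` heading. A minimal sketch of that fallback-chain idea (the regexes below are simplified stand-ins, not the extractor's exact patterns):

```python
# Try a list of patterns in order and return the first match, stripped.
# This mirrors the cascading lookup in extract_title: each later pattern
# is only consulted if the earlier, more reliable sources came up empty.
import re

def extract_title(webpage, patterns):
    for pattern in patterns:
        m = re.search(pattern, webpage)
        if m:
            return m.group(1).strip()
    return None

page = '<div class="videoplayerjw" data-title="uplink 20.8"></div>'
print(extract_title(page, [
    r'<meta name="fulltitle" content="([^"]+)"',
    r'data-title="([^"]+)"',
    r'<h1[^>]*>(.+?)</h1>',
]))  # uplink 20.8
```

The real helper also treats the placeholder value `"c't"` as a miss, which is why the chain cannot simply stop at the first non-empty string.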
@@ -1,96 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import compat_str
-from ..utils import (
-    ExtractorError,
-    int_or_none,
-    urlencode_postdata,
-)
-
-
-class HiDiveIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?hidive\.com/stream/(?P<title>[^/]+)/(?P<key>[^/?#&]+)'
-    # Using X-Forwarded-For results in 403 HTTP error for HLS fragments,
-    # so disabling geo bypass completely
-    _GEO_BYPASS = False
-
-    _TESTS = [{
-        'url': 'https://www.hidive.com/stream/the-comic-artist-and-his-assistants/s01e001',
-        'info_dict': {
-            'id': 'the-comic-artist-and-his-assistants/s01e001',
-            'ext': 'mp4',
-            'title': 'the-comic-artist-and-his-assistants/s01e001',
-            'series': 'the-comic-artist-and-his-assistants',
-            'season_number': 1,
-            'episode_number': 1,
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }]
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        title, key = mobj.group('title', 'key')
-        video_id = '%s/%s' % (title, key)
-
-        settings = self._download_json(
-            'https://www.hidive.com/play/settings', video_id,
-            data=urlencode_postdata({
-                'Title': title,
-                'Key': key,
-            }))
-
-        restriction = settings.get('restrictionReason')
-        if restriction == 'RegionRestricted':
-            self.raise_geo_restricted()
-
-        if restriction and restriction != 'None':
-            raise ExtractorError(
-                '%s said: %s' % (self.IE_NAME, restriction), expected=True)
-
-        formats = []
-        subtitles = {}
-        for rendition_id, rendition in settings['renditions'].items():
-            bitrates = rendition.get('bitrates')
-            if not isinstance(bitrates, dict):
-                continue
-            m3u8_url = bitrates.get('hls')
-            if not isinstance(m3u8_url, compat_str):
-                continue
-            formats.extend(self._extract_m3u8_formats(
-                m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
-                m3u8_id='%s-hls' % rendition_id, fatal=False))
-            cc_files = rendition.get('ccFiles')
-            if not isinstance(cc_files, list):
-                continue
-            for cc_file in cc_files:
-                if not isinstance(cc_file, list) or len(cc_file) < 3:
-                    continue
-                cc_lang = cc_file[0]
-                cc_url = cc_file[2]
-                if not isinstance(cc_lang, compat_str) or not isinstance(
-                        cc_url, compat_str):
-                    continue
-                subtitles.setdefault(cc_lang, []).append({
-                    'url': cc_url,
-                })
-
-        season_number = int_or_none(self._search_regex(
-            r's(\d+)', key, 'season number', default=None))
-        episode_number = int_or_none(self._search_regex(
-            r'e(\d+)', key, 'episode number', default=None))
-
-        return {
-            'id': video_id,
-            'title': video_id,
-            'subtitles': subtitles,
-            'formats': formats,
-            'series': title,
-            'season_number': season_number,
-            'episode_number': episode_number,
-        }
@@ -1,7 +1,8 @@
 from __future__ import unicode_literals
 
+import base64
+
 from .common import InfoExtractor
-from ..compat import compat_b64decode
 from ..utils import (
     ExtractorError,
     HEADRequest,
@@ -47,7 +48,7 @@ class HotNewHipHopIE(InfoExtractor):
         if 'mediaKey' not in mkd:
             raise ExtractorError('Did not get a media key')
 
-        redirect_url = compat_b64decode(video_url_base64).decode('utf-8')
+        redirect_url = base64.b64decode(video_url_base64).decode('utf-8')
         redirect_req = HEADRequest(redirect_url)
         req = self._request_webpage(
             redirect_req, video_id,
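Several hunks in this diff (hotnewhiphop, infoq, leeco) trade `compat_b64decode` on the 2018.03.26 side for a bare `base64.b64decode` plus a manual `.encode()` on the older branch. A sketch of what such a wrapper buys: it accepts either text or bytes, so call sites don't each need their own encoding step (a simplified stand-in, not youtube-dl's exact `compat.py` definition):

```python
# Sketch of a base64 wrapper that normalises str input to bytes before
# decoding, so callers can pass either type; the older branch in this
# diff instead inlines base64.b64decode and encodes by hand per call.
import base64

def compat_b64decode(s):
    if isinstance(s, str):
        s = s.encode('ascii')
    return base64.b64decode(s)

print(compat_b64decode('aHR0cDovL2V4YW1wbGUuY29t').decode('utf-8'))
# http://example.com
```

Centralising the str/bytes handling in one helper is what lets the newer call sites drop their `.encode('ascii')`/`.encode('utf-8')` arguments.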
@@ -2,8 +2,9 @@
 
 from __future__ import unicode_literals
 
+import base64
+
 from ..compat import (
-    compat_b64decode,
     compat_urllib_parse_unquote,
     compat_urlparse,
 )
@@ -60,7 +61,7 @@ class InfoQIE(BokeCCBaseIE):
         encoded_id = self._search_regex(
             r"jsclassref\s*=\s*'([^']*)'", webpage, 'encoded id', default=None)
 
-        real_id = compat_urllib_parse_unquote(compat_b64decode(encoded_id).decode('utf-8'))
+        real_id = compat_urllib_parse_unquote(base64.b64decode(encoded_id.encode('ascii')).decode('utf-8'))
         playpath = 'mp4:' + real_id
 
         return [{
@@ -1,7 +1,6 @@
 from __future__ import unicode_literals
 
 import itertools
-import json
 import re
 
 from .common import InfoExtractor
@@ -239,34 +238,36 @@ class InstagramUserIE(InfoExtractor):
     }
 
     def _entries(self, uploader_id):
-        def get_count(suffix):
+        query = {
+            '__a': 1,
+        }
+
+        def get_count(kind):
             return int_or_none(try_get(
-                node, lambda x: x['edge_media_' + suffix]['count']))
+                node, lambda x: x['%ss' % kind]['count']))
 
-        cursor = ''
         for page_num in itertools.count(1):
-            media = self._download_json(
-                'https://www.instagram.com/graphql/query/', uploader_id,
-                'Downloading JSON page %d' % page_num, query={
-                    'query_hash': '472f257a40c653c64c666ce877d59d2b',
-                    'variables': json.dumps({
-                        'id': uploader_id,
-                        'first': 100,
-                        'after': cursor,
-                    })
-                })['data']['user']['edge_owner_to_timeline_media']
-
-            edges = media.get('edges')
-            if not edges or not isinstance(edges, list):
+            page = self._download_json(
+                'https://instagram.com/%s/' % uploader_id, uploader_id,
+                note='Downloading page %d' % page_num,
+                fatal=False, query=query)
+            if not page:
                 break
 
-            for edge in edges:
-                node = edge.get('node')
-                if not node or not isinstance(node, dict):
-                    continue
+            nodes = try_get(page, lambda x: x['user']['media']['nodes'], list)
+            if not nodes:
+                break
+
+            max_id = None
+
+            for node in nodes:
+                node_id = node.get('id')
+                if node_id:
+                    max_id = node_id
 
                 if node.get('__typename') != 'GraphVideo' and node.get('is_video') is not True:
                     continue
-                video_id = node.get('shortcode')
+                video_id = node.get('code')
                 if not video_id:
                     continue
 
@@ -275,14 +276,14 @@ class InstagramUserIE(InfoExtractor):
                     ie=InstagramIE.ie_key(), video_id=video_id)
 
                 description = try_get(
-                    node, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
+                    node, [lambda x: x['caption'], lambda x: x['text']['id']],
                     compat_str)
                 thumbnail = node.get('thumbnail_src') or node.get('display_src')
-                timestamp = int_or_none(node.get('taken_at_timestamp'))
+                timestamp = int_or_none(node.get('date'))
 
-                comment_count = get_count('to_comment')
-                like_count = get_count('preview_like')
-                view_count = int_or_none(node.get('video_view_count'))
+                comment_count = get_count('comment')
+                like_count = get_count('like')
+                view_count = int_or_none(node.get('video_views'))
 
                 info.update({
                     'description': description,
@@ -295,23 +296,12 @@ class InstagramUserIE(InfoExtractor):
 
                 yield info
 
-            page_info = media.get('page_info')
-            if not page_info or not isinstance(page_info, dict):
+            if not max_id:
                 break
 
-            has_next_page = page_info.get('has_next_page')
-            if not has_next_page:
-                break
-
-            cursor = page_info.get('end_cursor')
-            if not cursor or not isinstance(cursor, compat_str):
-                break
+            query['max_id'] = max_id
 
     def _real_extract(self, url):
-        username = self._match_id(url)
-        uploader_id = self._download_json(
-            'https://instagram.com/%s/' % username, username, query={
-                '__a': 1,
-            })['graphql']['user']['id']
+        uploader_id = self._match_id(url)
         return self.playlist_result(
-            self._entries(uploader_id), username, username)
+            self._entries(uploader_id), uploader_id, uploader_id)
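The Instagram hunk above swaps the older `max_id`-based paging for GraphQL cursor pagination: each page returns `edges` plus a `page_info` dict carrying `has_next_page` and `end_cursor`, and the loop feeds the cursor back into the next request. A schematic of that loop, with `fetch_page` standing in for the real `_download_json` call and fake in-memory pages standing in for the API:

```python
# Cursor-style pagination as in the 2018.03.26 _entries: keep requesting
# pages with the last end_cursor until has_next_page goes False or a
# page comes back empty.
import itertools

def iter_entries(fetch_page):
    cursor = ''
    for page_num in itertools.count(1):
        media = fetch_page(cursor, page_num)
        edges = media.get('edges')
        if not edges:
            break
        for edge in edges:
            yield edge['node']
        page_info = media.get('page_info') or {}
        if not page_info.get('has_next_page'):
            break
        cursor = page_info['end_cursor']

# Two fake pages to show the loop advancing and then terminating.
pages = {
    '': {'edges': [{'node': 1}, {'node': 2}],
         'page_info': {'has_next_page': True, 'end_cursor': 'abc'}},
    'abc': {'edges': [{'node': 3}],
            'page_info': {'has_next_page': False}},
}
print(list(iter_entries(lambda cursor, n: pages[cursor])))  # [1, 2, 3]
```

The older branch's `max_id` scheme in the same hunk is the same idea with the last seen node id acting as the cursor.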
@@ -23,14 +23,11 @@ class JWPlatformIE(InfoExtractor):
 
     @staticmethod
     def _extract_url(webpage):
-        urls = JWPlatformIE._extract_urls(webpage)
-        return urls[0] if urls else None
-
-    @staticmethod
-    def _extract_urls(webpage):
-        return re.findall(
-            r'<(?:script|iframe)[^>]+?src=["\']((?:https?:)?//content\.jwplatform\.com/players/[a-zA-Z0-9]{8})',
+        mobj = re.search(
+            r'<(?:script|iframe)[^>]+?src=["\'](?P<url>(?:https?:)?//content.jwplatform.com/players/[a-zA-Z0-9]{8})',
             webpage)
+        if mobj:
+            return mobj.group('url')
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
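The JWPlatform hunk above replaces a single `re.search`-based `_extract_url` with a `re.findall`-based `_extract_urls`, keeping `_extract_url` as a thin wrapper that returns the first hit — the pattern the generic extractor's `playlist_from_matches` call earlier in this diff depends on. A minimal standalone sketch of that wrapper shape (plain functions here rather than the extractor's static methods):

```python
# findall returns every captured embed URL; the singular helper just
# delegates and takes the first match, or None when there is no embed.
import re

_EMBED_RE = r'<(?:script|iframe)[^>]+?src=["\']((?:https?:)?//content\.jwplatform\.com/players/[a-zA-Z0-9]{8})'

def extract_urls(webpage):
    return re.findall(_EMBED_RE, webpage)

def extract_url(webpage):
    urls = extract_urls(webpage)
    return urls[0] if urls else None

page = '<iframe src="//content.jwplatform.com/players/abcd1234"></iframe>'
print(extract_url(page))  # //content.jwplatform.com/players/abcd1234
```

With one capture group, `re.findall` returns a plain list of the captured strings, which is why the plural variant needs no `mobj.group('url')` step.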
71 youtube_dl/extractor/kamcord.py Normal file
@@ -0,0 +1,71 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+    int_or_none,
+    qualities,
+)
+
+
+class KamcordIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?kamcord\.com/v/(?P<id>[^/?#&]+)'
+    _TEST = {
+        'url': 'https://www.kamcord.com/v/hNYRduDgWb4',
+        'md5': 'c3180e8a9cfac2e86e1b88cb8751b54c',
+        'info_dict': {
+            'id': 'hNYRduDgWb4',
+            'ext': 'mp4',
+            'title': 'Drinking Madness',
+            'uploader': 'jacksfilms',
+            'uploader_id': '3044562',
+            'view_count': int,
+            'like_count': int,
+            'comment_count': int,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        video = self._parse_json(
+            self._search_regex(
+                r'window\.__props\s*=\s*({.+?});?(?:\n|\s*</script)',
+                webpage, 'video'),
+            video_id)['video']
+
+        title = video['title']
+
+        formats = self._extract_m3u8_formats(
+            video['play']['hls'], video_id, 'mp4', entry_protocol='m3u8_native')
+        self._sort_formats(formats)
+
+        uploader = video.get('user', {}).get('username')
+        uploader_id = video.get('user', {}).get('id')
+
+        view_count = int_or_none(video.get('viewCount'))
+        like_count = int_or_none(video.get('heartCount'))
+        comment_count = int_or_none(video.get('messageCount'))
+
+        preference_key = qualities(('small', 'medium', 'large'))
+
+        thumbnails = [{
+            'url': thumbnail_url,
+            'id': thumbnail_id,
+            'preference': preference_key(thumbnail_id),
+        } for thumbnail_id, thumbnail_url in (video.get('thumbnail') or {}).items()
+            if isinstance(thumbnail_id, compat_str) and isinstance(thumbnail_url, compat_str)]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'uploader': uploader,
+            'uploader_id': uploader_id,
+            'view_count': view_count,
+            'like_count': like_count,
+            'comment_count': comment_count,
+            'thumbnails': thumbnails,
+            'formats': formats,
+        }
@@ -49,9 +49,7 @@ class LA7IE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)
 
         player_data = self._parse_json(
-            self._search_regex(
-                [r'(?s)videoParams\s*=\s*({.+?});', r'videoLa7\(({[^;]+})\);'],
-                webpage, 'player data'),
+            self._search_regex(r'videoLa7\(({[^;]+})\);', webpage, 'player data'),
             video_id, transform_source=js_to_json)
 
         return {
@@ -1,6 +1,7 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import base64
 import datetime
 import hashlib
 import re
@@ -8,7 +9,6 @@ import time
 
 from .common import InfoExtractor
 from ..compat import (
-    compat_b64decode,
     compat_ord,
     compat_str,
     compat_urllib_parse_urlencode,
@@ -329,7 +329,7 @@ class LetvCloudIE(InfoExtractor):
             raise ExtractorError('Letv cloud returned an unknwon error')
 
         def b64decode(s):
-            return compat_b64decode(s).decode('utf-8')
+            return base64.b64decode(s.encode('utf-8')).decode('utf-8')
 
         formats = []
         for media in play_json['data']['video_info']['media'].values():
@@ -1,53 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-
-
-class LentaIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?lenta\.ru/[^/]+/\d+/\d+/\d+/(?P<id>[^/?#&]+)'
-    _TESTS = [{
-        'url': 'https://lenta.ru/news/2018/03/22/savshenko_go/',
-        'info_dict': {
-            'id': '964400',
-            'ext': 'mp4',
-            'title': 'Надежду Савченко задержали',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 61,
-            'view_count': int,
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        # EaglePlatform iframe embed
-        'url': 'http://lenta.ru/news/2015/03/06/navalny/',
-        'info_dict': {
-            'id': '227304',
-            'ext': 'mp4',
-            'title': 'Навальный вышел на свободу',
-            'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 87,
-            'view_count': int,
-            'age_limit': 0,
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }]
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, display_id)
-
-        video_id = self._search_regex(
-            r'vid\s*:\s*["\']?(\d+)', webpage, 'eagleplatform id',
-            default=None)
-        if video_id:
-            return self.url_result(
-                'eagleplatform:lentaru.media.eagleplatform.com:%s' % video_id,
-                ie='EaglePlatform', video_id=video_id)
-
-        return self.url_result(url, ie='Generic')
@@ -1,28 +1,24 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
 import re

 from .common import InfoExtractor
-from ..utils import (
-    parse_duration,
-    unified_strdate,
-)
+from ..utils import unified_strdate


 class LibsynIE(InfoExtractor):
     _VALID_URL = r'(?P<mainurl>https?://html5-player\.libsyn\.com/embed/episode/id/(?P<id>[0-9]+))'

     _TESTS = [{
-        'url': 'http://html5-player.libsyn.com/embed/episode/id/6385796/',
-        'md5': '2a55e75496c790cdeb058e7e6c087746',
+        'url': 'http://html5-player.libsyn.com/embed/episode/id/3377616/',
+        'md5': '443360ee1b58007bc3dcf09b41d093bb',
         'info_dict': {
-            'id': '6385796',
+            'id': '3377616',
             'ext': 'mp3',
-            'title': "Champion Minded - Developing a Growth Mindset",
-            'description': 'In this episode, Allistair talks about the importance of developing a growth mindset, not only in sports, but in life too.',
-            'upload_date': '20180320',
+            'title': "The Daily Show Podcast without Jon Stewart - Episode 12: Bassem Youssef: Egypt's Jon Stewart",
+            'description': 'md5:601cb790edd05908957dae8aaa866465',
+            'upload_date': '20150220',
             'thumbnail': 're:^https?://.*',
         },
     }, {
@@ -43,45 +39,31 @@ class LibsynIE(InfoExtractor):
         url = m.group('mainurl')
         webpage = self._download_webpage(url, video_id)

+        formats = [{
+            'url': media_url,
+        } for media_url in set(re.findall(r'var\s+mediaURL(?:Libsyn)?\s*=\s*"([^"]+)"', webpage))]
+
         podcast_title = self._search_regex(
-            r'<h3>([^<]+)</h3>', webpage, 'podcast title', default=None)
-        if podcast_title:
-            podcast_title = podcast_title.strip()
+            r'<h2>([^<]+)</h2>', webpage, 'podcast title', default=None)
         episode_title = self._search_regex(
-            r'(?:<div class="episode-title">|<h4>)([^<]+)</', webpage, 'episode title')
-        if episode_title:
-            episode_title = episode_title.strip()
+            r'(?:<div class="episode-title">|<h3>)([^<]+)</', webpage, 'episode title')

         title = '%s - %s' % (podcast_title, episode_title) if podcast_title else episode_title

         description = self._html_search_regex(
-            r'<p\s+id="info_text_body">(.+?)</p>', webpage,
+            r'<div id="info_text_body">(.+?)</div>', webpage,
             'description', default=None)
-        if description:
-            # Strip non-breaking and normal spaces
-            description = description.replace('\u00A0', ' ').strip()
+        thumbnail = self._search_regex(
+            r'<img[^>]+class="info-show-icon"[^>]+src="([^"]+)"',
+            webpage, 'thumbnail', fatal=False)
         release_date = unified_strdate(self._search_regex(
             r'<div class="release_date">Released: ([^<]+)<', webpage, 'release date', fatal=False))

-        data_json = self._search_regex(r'var\s+playlistItem\s*=\s*(\{.*?\});\n', webpage, 'JSON data block')
-        data = json.loads(data_json)
-
-        formats = [{
-            'url': data['media_url'],
-            'format_id': 'main',
-        }, {
-            'url': data['media_url_libsyn'],
-            'format_id': 'libsyn',
-        }]
-        thumbnail = data.get('thumbnail_url')
-        duration = parse_duration(data.get('duration'))
-
         return {
             'id': video_id,
             'title': title,
             'description': description,
             'thumbnail': thumbnail,
             'upload_date': release_date,
-            'duration': duration,
             'formats': formats,
         }
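The `-` side of this hunk pulls the player's `playlistItem` JavaScript object out of the page and hands it to `json.loads`; a standalone sketch of that extraction step, run against a made-up `webpage` fragment (the real embed page embeds a much larger object):

```python
import json
import re

# Hypothetical page fragment standing in for the downloaded webpage.
webpage = 'var playlistItem = {"media_url": "http://example.com/a.mp3"};\n'

# Same regex as the extractor: grab the JS object literal, then parse it.
data_json = re.search(r'var\s+playlistItem\s*=\s*(\{.*?\});\n', webpage).group(1)
data = json.loads(data_json)
print(data['media_url'])  # http://example.com/a.mp3
```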
@@ -10,7 +10,6 @@ from ..utils import (
     float_or_none,
     int_or_none,
     smuggle_url,
-    try_get,
     unsmuggle_url,
     ExtractorError,
 )
@@ -221,12 +220,6 @@ class LimelightBaseIE(InfoExtractor):
             'subtitles': subtitles,
         }

-    def _extract_info_helper(self, pc, mobile, i, metadata):
-        return self._extract_info(
-            try_get(pc, lambda x: x['playlistItems'][i]['streams'], list) or [],
-            try_get(mobile, lambda x: x['mediaList'][i]['mobileUrls'], list) or [],
-            metadata)
-

 class LimelightMediaIE(LimelightBaseIE):
     IE_NAME = 'limelight'
@@ -289,7 +282,10 @@ class LimelightMediaIE(LimelightBaseIE):
             'getMobilePlaylistByMediaId', 'properties',
             smuggled_data.get('source_url'))

-        return self._extract_info_helper(pc, mobile, 0, metadata)
+        return self._extract_info(
+            pc['playlistItems'][0].get('streams', []),
+            mobile['mediaList'][0].get('mobileUrls', []) if mobile else [],
+            metadata)


 class LimelightChannelIE(LimelightBaseIE):
@@ -330,7 +326,10 @@ class LimelightChannelIE(LimelightBaseIE):
             'media', smuggled_data.get('source_url'))

         entries = [
-            self._extract_info_helper(pc, mobile, i, medias['media_list'][i])
+            self._extract_info(
+                pc['playlistItems'][i].get('streams', []),
+                mobile['mediaList'][i].get('mobileUrls', []) if mobile else [],
+                medias['media_list'][i])
             for i in range(len(medias['media_list']))]

         return self.playlist_result(entries, channel_id, pc['title'])
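The removed `_extract_info_helper` leans on youtube-dl's `try_get` utility so that missing `playlistItems`/`mediaList` entries degrade to an empty list instead of raising; a simplified standalone sketch of that helper (not the full utils version):

```python
def try_get(src, getter, expected_type=None):
    # Swallow lookup errors and, optionally, require the result type
    # (simplified from youtube-dl's utils.try_get).
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(v, expected_type):
        return v
    return None

# Made-up playlist payload for illustration.
pc = {'playlistItems': [{'streams': [{'url': 'http://example.com'}]}]}
streams = try_get(pc, lambda x: x['playlistItems'][0]['streams'], list) or []
```

A lookup into an empty dict simply yields `None`, so `... or []` gives the helper its safe default.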
@@ -1,90 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import js_to_json
-
-
-class LineTVIE(InfoExtractor):
-    _VALID_URL = r'https?://tv\.line\.me/v/(?P<id>\d+)_[^/]+-(?P<segment>ep\d+-\d+)'
-
-    _TESTS = [{
-        'url': 'https://tv.line.me/v/793123_goodbye-mrblack-ep1-1/list/69246',
-        'info_dict': {
-            'id': '793123_ep1-1',
-            'ext': 'mp4',
-            'title': 'Goodbye Mr.Black | EP.1-1',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 998.509,
-            'view_count': int,
-        },
-    }, {
-        'url': 'https://tv.line.me/v/2587507_%E6%B4%BE%E9%81%A3%E5%A5%B3%E9%86%ABx-ep1-02/list/185245',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        series_id, segment = re.match(self._VALID_URL, url).groups()
-        video_id = '%s_%s' % (series_id, segment)
-
-        webpage = self._download_webpage(url, video_id)
-
-        player_params = self._parse_json(self._search_regex(
-            r'naver\.WebPlayer\(({[^}]+})\)', webpage, 'player parameters'),
-            video_id, transform_source=js_to_json)
-
-        video_info = self._download_json(
-            'https://global-nvapis.line.me/linetv/rmcnmv/vod_play_videoInfo.json',
-            video_id, query={
-                'videoId': player_params['videoId'],
-                'key': player_params['key'],
-            })
-
-        stream = video_info['streams'][0]
-        extra_query = '?__gda__=' + stream['key']['value']
-        formats = self._extract_m3u8_formats(
-            stream['source'] + extra_query, video_id, ext='mp4',
-            entry_protocol='m3u8_native', m3u8_id='hls')
-
-        for a_format in formats:
-            a_format['url'] += extra_query
-
-        duration = None
-        for video in video_info.get('videos', {}).get('list', []):
-            encoding_option = video.get('encodingOption', {})
-            abr = video['bitrate']['audio']
-            vbr = video['bitrate']['video']
-            tbr = abr + vbr
-            formats.append({
-                'url': video['source'],
-                'format_id': 'http-%d' % int(tbr),
-                'height': encoding_option.get('height'),
-                'width': encoding_option.get('width'),
-                'abr': abr,
-                'vbr': vbr,
-                'filesize': video.get('size'),
-            })
-            if video.get('duration') and duration is None:
-                duration = video['duration']
-
-        self._sort_formats(formats)
-
-        if not formats[0].get('width'):
-            formats[0]['vcodec'] = 'none'
-
-        title = self._og_search_title(webpage)
-
-        # like_count requires an additional API request https://tv.line.me/api/likeit/getCount
-
-        return {
-            'id': video_id,
-            'title': title,
-            'formats': formats,
-            'extra_param_to_segment_url': extra_query[1:],
-            'duration': duration,
-            'thumbnails': [{'url': thumbnail['source']}
-                           for thumbnail in video_info.get('thumbnails', {}).get('list', [])],
-            'view_count': video_info.get('meta', {}).get('count'),
-        }
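The deleted `LineTVIE` appends the stream's `__gda__` auth token both to the m3u8 playlist URL and to every extracted format URL; the propagation step in isolation (token and URLs are made up):

```python
# Hypothetical token and format list standing in for the API response.
extra_query = '?__gda__=' + 'sample-token'
formats = [
    {'url': 'https://example.com/v1.m3u8'},
    {'url': 'https://example.com/v2.m3u8'},
]

# Every format URL needs the same auth token appended.
for a_format in formats:
    a_format['url'] += extra_query
```

The extractor also passes the bare token (without the leading `?`) as `extra_param_to_segment_url` so the downloader can tack it onto each HLS segment request.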
@@ -1,17 +1,12 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import itertools
-import json
 import re

 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_unquote
 from ..utils import (
     int_or_none,
-    parse_duration,
     remove_end,
-    try_get,
 )

@@ -162,153 +157,3 @@ class MailRuIE(InfoExtractor):
             'view_count': view_count,
             'formats': formats,
         }
-
-
-class MailRuMusicSearchBaseIE(InfoExtractor):
-    def _search(self, query, url, audio_id, limit=100, offset=0):
-        search = self._download_json(
-            'https://my.mail.ru/cgi-bin/my/ajax', audio_id,
-            'Downloading songs JSON page %d' % (offset // limit + 1),
-            headers={
-                'Referer': url,
-                'X-Requested-With': 'XMLHttpRequest',
-            }, query={
-                'xemail': '',
-                'ajax_call': '1',
-                'func_name': 'music.search',
-                'mna': '',
-                'mnb': '',
-                'arg_query': query,
-                'arg_extended': '1',
-                'arg_search_params': json.dumps({
-                    'music': {
-                        'limit': limit,
-                        'offset': offset,
-                    },
-                }),
-                'arg_limit': limit,
-                'arg_offset': offset,
-            })
-        return next(e for e in search if isinstance(e, dict))
-
-    @staticmethod
-    def _extract_track(t, fatal=True):
-        audio_url = t['URL'] if fatal else t.get('URL')
-        if not audio_url:
-            return
-
-        audio_id = t['File'] if fatal else t.get('File')
-        if not audio_id:
-            return
-
-        thumbnail = t.get('AlbumCoverURL') or t.get('FiledAlbumCover')
-        uploader = t.get('OwnerName') or t.get('OwnerName_Text_HTML')
-        uploader_id = t.get('UploaderID')
-        duration = int_or_none(t.get('DurationInSeconds')) or parse_duration(
-            t.get('Duration') or t.get('DurationStr'))
-        view_count = int_or_none(t.get('PlayCount') or t.get('PlayCount_hr'))
-
-        track = t.get('Name') or t.get('Name_Text_HTML')
-        artist = t.get('Author') or t.get('Author_Text_HTML')
-
-        if track:
-            title = '%s - %s' % (artist, track) if artist else track
-        else:
-            title = audio_id
-
-        return {
-            'extractor_key': MailRuMusicIE.ie_key(),
-            'id': audio_id,
-            'title': title,
-            'thumbnail': thumbnail,
-            'uploader': uploader,
-            'uploader_id': uploader_id,
-            'duration': duration,
-            'view_count': view_count,
-            'vcodec': 'none',
-            'abr': int_or_none(t.get('BitRate')),
-            'track': track,
-            'artist': artist,
-            'album': t.get('Album'),
-            'url': audio_url,
-        }
-
-
-class MailRuMusicIE(MailRuMusicSearchBaseIE):
-    IE_NAME = 'mailru:music'
-    IE_DESC = 'Музыка@Mail.Ru'
-    _VALID_URL = r'https?://my\.mail\.ru/music/songs/[^/?#&]+-(?P<id>[\da-f]+)'
-    _TESTS = [{
-        'url': 'https://my.mail.ru/music/songs/%D0%BC8%D0%BB8%D1%82%D1%85-l-a-h-luciferian-aesthetics-of-herrschaft-single-2017-4e31f7125d0dfaef505d947642366893',
-        'md5': '0f8c22ef8c5d665b13ac709e63025610',
-        'info_dict': {
-            'id': '4e31f7125d0dfaef505d947642366893',
-            'ext': 'mp3',
-            'title': 'L.A.H. (Luciferian Aesthetics of Herrschaft) single, 2017 - М8Л8ТХ',
-            'uploader': 'Игорь Мудрый',
-            'uploader_id': '1459196328',
-            'duration': 280,
-            'view_count': int,
-            'vcodec': 'none',
-            'abr': 320,
-            'track': 'L.A.H. (Luciferian Aesthetics of Herrschaft) single, 2017',
-            'artist': 'М8Л8ТХ',
-        },
-    }]
-
-    def _real_extract(self, url):
-        audio_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, audio_id)
-
-        title = self._og_search_title(webpage)
-        music_data = self._search(title, url, audio_id)['MusicData']
-        t = next(t for t in music_data if t.get('File') == audio_id)
-
-        info = self._extract_track(t)
-        info['title'] = title
-        return info
-
-
-class MailRuMusicSearchIE(MailRuMusicSearchBaseIE):
-    IE_NAME = 'mailru:music:search'
-    IE_DESC = 'Музыка@Mail.Ru'
-    _VALID_URL = r'https?://my\.mail\.ru/music/search/(?P<id>[^/?#&]+)'
-    _TESTS = [{
-        'url': 'https://my.mail.ru/music/search/black%20shadow',
-        'info_dict': {
-            'id': 'black shadow',
-        },
-        'playlist_mincount': 532,
-    }]
-
-    def _real_extract(self, url):
-        query = compat_urllib_parse_unquote(self._match_id(url))
-
-        entries = []
-
-        LIMIT = 100
-        offset = 0
-
-        for _ in itertools.count(1):
-            search = self._search(query, url, query, LIMIT, offset)
-
-            music_data = search.get('MusicData')
-            if not music_data or not isinstance(music_data, list):
-                break
-
-            for t in music_data:
-                track = self._extract_track(t, fatal=False)
-                if track:
-                    entries.append(track)
-
-            total = try_get(
-                search, lambda x: x['Results']['music']['Total'], int)
-
-            if total is not None:
-                if offset > total:
-                    break
-
-            offset += LIMIT
-
-        return self.playlist_result(entries, query)
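The removed `MailRuMusicSearchIE._real_extract` pages through results with `itertools.count` and a fixed `LIMIT`; the loop shape, sketched against a stubbed fetch function (the real extractor additionally consults `Results.music.Total` to stop early):

```python
import itertools

def collect(fetch_page, limit=100):
    # Offset-based pagination: keep requesting until a page comes back empty.
    entries = []
    for offset in itertools.count(0, limit):
        page = fetch_page(limit, offset)
        if not page:
            break
        entries.extend(page)
    return entries

# Stub backend with 250 "tracks": pages of 100, 100, 50, then empty.
data = list(range(250))
print(len(collect(lambda limit, offset: data[offset:offset + limit])))  # 250
```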
@@ -1,12 +1,13 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import base64
+
 from .common import InfoExtractor
-from ..compat import (
-    compat_b64decode,
-    compat_urllib_parse_unquote,
-)
-from ..utils import int_or_none
+from ..compat import compat_urllib_parse_unquote
+from ..utils import (
+    int_or_none,
+)


 class MangomoloBaseIE(InfoExtractor):
@@ -50,4 +51,4 @@ class MangomoloLiveIE(MangomoloBaseIE):
     _IS_LIVE = True

     def _get_real_id(self, page_id):
-        return compat_b64decode(compat_urllib_parse_unquote(page_id)).decode()
+        return base64.b64decode(compat_urllib_parse_unquote(page_id).encode()).decode()
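Both sides of `_get_real_id` perform the same two-layer decode (percent-decoding, then base64); the change is only py2/py3 compat plumbing. The decode itself, using the Python 3 standard library directly:

```python
import base64
from urllib.parse import unquote

def get_real_id(page_id):
    # page_id is percent-encoded base64 text; undo both layers.
    return base64.b64decode(unquote(page_id).encode()).decode()

# 'MTIzNDU%3D' percent-decodes to 'MTIzNDU=', which is base64 for '12345'.
print(get_real_id('MTIzNDU%3D'))  # 12345
```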
@@ -1,12 +1,12 @@
 from __future__ import unicode_literals

+import base64
 import functools
 import itertools
 import re

 from .common import InfoExtractor
 from ..compat import (
-    compat_b64decode,
     compat_chr,
     compat_ord,
     compat_str,
@@ -79,7 +79,7 @@ class MixcloudIE(InfoExtractor):

         if encrypted_play_info is not None:
             # Decode
-            encrypted_play_info = compat_b64decode(encrypted_play_info)
+            encrypted_play_info = base64.b64decode(encrypted_play_info)
         else:
             # New path
             full_info_json = self._parse_json(self._html_search_regex(
@@ -109,7 +109,7 @@ class MixcloudIE(InfoExtractor):
             kpa_target = encrypted_play_info
         else:
             kps = ['https://', 'http://']
-            kpa_target = compat_b64decode(info_json['streamInfo']['url'])
+            kpa_target = base64.b64decode(info_json['streamInfo']['url'])
         for kp in kps:
             partial_key = self._decrypt_xor_cipher(kpa_target, kp)
             for quote in ["'", '"']:
@@ -165,7 +165,7 @@ class MixcloudIE(InfoExtractor):
             format_url = stream_info.get(url_key)
             if not format_url:
                 continue
-            decrypted = self._decrypt_xor_cipher(key, compat_b64decode(format_url))
+            decrypted = self._decrypt_xor_cipher(key, base64.b64decode(format_url))
             if not decrypted:
                 continue
             if url_key == 'hlsUrl':
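Every changed line here swaps `compat_b64decode` for a direct `base64.b64decode` ahead of the XOR step; a standalone sketch of the repeating-key XOR decrypt the decoded bytes feed into (key and ciphertext below are fabricated — the real key is recovered from the page):

```python
import itertools

def decrypt_xor_cipher(key, ciphertext):
    # Repeating-key XOR over raw bytes, returning text
    # (the in-tree version uses compat_chr/compat_ord for py2 support).
    return ''.join(
        chr(c ^ k) for c, k in zip(ciphertext, itertools.cycle(key)))

key = b'SAMPLEKEY'
# XOR is its own inverse, so encrypting with the same key round-trips.
ciphertext = bytes(c ^ k for c, k in zip(b'https://example.com/a.m4a', itertools.cycle(key)))
print(decrypt_xor_cipher(key, ciphertext))  # https://example.com/a.m4a
```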
@@ -3,18 +3,13 @@ from __future__ import unicode_literals

 import re

-from .common import InfoExtractor
 from .vimple import SprutoBaseIE


 class MyviIE(SprutoBaseIE):
     _VALID_URL = r'''(?x)
-                    (?:
                         https?://
-                            (?:www\.)?
-                            myvi\.
-                            (?:
-                                (?:ru/player|tv)/
+                            myvi\.(?:ru/player|tv)/
                             (?:
                                 (?:
                                     embed/html|
@@ -22,10 +17,6 @@ class MyviIE(SprutoBaseIE):
                                     api/Video/Get
                                 )/|
                                 content/preloader\.swf\?.*\bid=
-                            )|
-                            ru/watch/
-                        )|
-                    myvi:
                             )
                     (?P<id>[\da-zA-Z_-]+)
                 '''
@@ -51,12 +42,6 @@ class MyviIE(SprutoBaseIE):
     }, {
         'url': 'http://myvi.ru/player/flash/ocp2qZrHI-eZnHKQBK4cZV60hslH8LALnk0uBfKsB-Q4WnY26SeGoYPi8HWHxu0O30',
         'only_matching': True,
-    }, {
-        'url': 'https://www.myvi.ru/watch/YwbqszQynUaHPn_s82sx0Q2',
-        'only_matching': True,
-    }, {
-        'url': 'myvi:YwbqszQynUaHPn_s82sx0Q2',
-        'only_matching': True,
     }]

     @classmethod
@@ -73,39 +58,3 @@ class MyviIE(SprutoBaseIE):
             'http://myvi.ru/player/api/Video/Get/%s?sig' % video_id, video_id)['sprutoData']

         return self._extract_spruto(spruto, video_id)
-
-
-class MyviEmbedIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?myvi\.tv/(?:[^?]+\?.*?\bv=|embed/)(?P<id>[\da-z]+)'
-    _TESTS = [{
-        'url': 'https://www.myvi.tv/embed/ccdqic3wgkqwpb36x9sxg43t4r',
-        'info_dict': {
-            'id': 'b3ea0663-3234-469d-873e-7fecf36b31d1',
-            'ext': 'mp4',
-            'title': 'Твоя (original song).mp4',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 277,
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        'url': 'https://www.myvi.tv/idmi6o?v=ccdqic3wgkqwpb36x9sxg43t4r#watch',
-        'only_matching': True,
-    }]
-
-    @classmethod
-    def suitable(cls, url):
-        return False if MyviIE.suitable(url) else super(MyviEmbedIE, cls).suitable(url)
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(
-            'https://www.myvi.tv/embed/%s' % video_id, video_id)
-
-        myvi_id = self._search_regex(
-            r'CreatePlayer\s*\(\s*["\'].*?\bv=([\da-zA-Z_]+)',
-            webpage, 'video id')
-
-        return self.url_result('myvi:%s' % myvi_id, ie=MyviIE.ie_key())
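The `_VALID_URL` edits above rearrange a `(?x)` verbose-mode pattern; a trimmed standalone version showing how the alternation and the `(?P<id>...)` group line up (the sample URL and the reduced alternatives are illustrative, not the full pattern):

```python
import re

# Verbose mode: whitespace and newlines inside the pattern are ignored,
# so the nesting can mirror the URL structure.
MYVI_URL = re.compile(r'''(?x)
    https?://
        myvi\.(?:ru/player|tv)/
            (?:embed/html|flash|api/Video/Get)/
    (?P<id>[\da-zA-Z_-]+)
''')

m = MYVI_URL.match('http://myvi.ru/player/embed/html/abc-DEF_123')
print(m.group('id'))  # abc-DEF_123
```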
@@ -68,7 +68,7 @@ class NationalGeographicVideoIE(InfoExtractor):

 class NationalGeographicIE(ThePlatformIE, AdobePassIE):
     IE_NAME = 'natgeo'
-    _VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:(?:wild/)?[^/]+/)?(?:videos|episodes)/(?P<id>[^/?]+)'
+    _VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:wild/)?[^/]+/(?:videos|episodes)/(?P<id>[^/?]+)'

     _TESTS = [
         {
@@ -102,10 +102,6 @@ class NationalGeographicIE(ThePlatformIE, AdobePassIE):
         {
             'url': 'http://channel.nationalgeographic.com/the-story-of-god-with-morgan-freeman/episodes/the-power-of-miracles/',
             'only_matching': True,
-        },
-        {
-            'url': 'http://channel.nationalgeographic.com/videos/treasures-rediscovered/',
-            'only_matching': True,
         }
     ]

@@ -1,7 +1,6 @@
 from __future__ import unicode_literals

 import re
-import base64

 from .common import InfoExtractor
 from .theplatform import ThePlatformIE
@@ -359,7 +358,6 @@ class NBCNewsIE(ThePlatformIE):


 class NBCOlympicsIE(InfoExtractor):
-    IE_NAME = 'nbcolympics'
     _VALID_URL = r'https?://www\.nbcolympics\.com/video/(?P<id>[a-z-]+)'

     _TEST = {
@@ -397,54 +395,3 @@ class NBCOlympicsIE(InfoExtractor):
             'ie_key': ThePlatformIE.ie_key(),
             'display_id': display_id,
         }
-
-
-class NBCOlympicsStreamIE(AdobePassIE):
-    IE_NAME = 'nbcolympics:stream'
-    _VALID_URL = r'https?://stream\.nbcolympics\.com/(?P<id>[0-9a-z-]+)'
-    _TEST = {
-        'url': 'http://stream.nbcolympics.com/2018-winter-olympics-nbcsn-evening-feb-8',
-        'info_dict': {
-            'id': '203493',
-            'ext': 'mp4',
-            'title': 're:Curling, Alpine, Luge [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        },
-    }
-    _DATA_URL_TEMPLATE = 'http://stream.nbcolympics.com/data/%s_%s.json'
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        pid = self._search_regex(r'pid\s*=\s*(\d+);', webpage, 'pid')
-        resource = self._search_regex(
-            r"resource\s*=\s*'(.+)';", webpage,
-            'resource').replace("' + pid + '", pid)
-        event_config = self._download_json(
-            self._DATA_URL_TEMPLATE % ('event_config', pid),
-            pid)['eventConfig']
-        title = self._live_title(event_config['eventTitle'])
-        source_url = self._download_json(
-            self._DATA_URL_TEMPLATE % ('live_sources', pid),
-            pid)['videoSources'][0]['sourceUrl']
-        media_token = self._extract_mvpd_auth(
-            url, pid, event_config.get('requestorId', 'NBCOlympics'), resource)
-        formats = self._extract_m3u8_formats(self._download_webpage(
-            'http://sp.auth.adobe.com/tvs/v1/sign', pid, query={
-                'cdn': 'akamai',
-                'mediaToken': base64.b64encode(media_token.encode()),
-                'resource': base64.b64encode(resource.encode()),
-                'url': source_url,
-            }), pid, 'mp4')
-        self._sort_formats(formats)
-
-        return {
-            'id': pid,
-            'display_id': display_id,
-            'title': title,
-            'formats': formats,
-            'is_live': True,
-        }
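The removed `NBCOlympicsStreamIE` base64-encodes its Adobe Pass token and resource string into the signing request's query; the encoding step in isolation, with placeholder values (the endpoint and parameter names follow the removed code):

```python
import base64
from urllib.parse import urlencode

def build_sign_query(media_token, resource, source_url):
    # urlencode percent-encodes bytes values directly, so the base64
    # blobs can be passed without an extra .decode().
    return urlencode({
        'cdn': 'akamai',
        'mediaToken': base64.b64encode(media_token.encode()),
        'resource': base64.b64encode(resource.encode()),
        'url': source_url,
    })

query = build_sign_query('token', '<rss/>', 'http://example.com/master.m3u8')
```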
@@ -190,12 +190,10 @@ class NDREmbedBaseIE(InfoExtractor):
                 ext = determine_ext(src, None)
                 if ext == 'f4m':
                     formats.extend(self._extract_f4m_formats(
-                        src + '?hdcore=3.7.0&plugin=aasp-3.7.0.39.44', video_id,
-                        f4m_id='hds', fatal=False))
+                        src + '?hdcore=3.7.0&plugin=aasp-3.7.0.39.44', video_id, f4m_id='hds'))
                 elif ext == 'm3u8':
                     formats.extend(self._extract_m3u8_formats(
-                        src, video_id, 'mp4', m3u8_id='hls',
-                        entry_protocol='m3u8_native', fatal=False))
+                        src, video_id, 'mp4', m3u8_id='hls', entry_protocol='m3u8_native'))
                 else:
                     quality = f.get('quality')
                     ff = {
@@ -87,21 +87,19 @@ class NewgroundsIE(InfoExtractor):
         self._check_formats(formats, media_id)
         self._sort_formats(formats)

-        uploader = self._html_search_regex(
-            (r'(?s)<h4[^>]*>(.+?)</h4>.*?<em>\s*Author\s*</em>',
-             r'(?:Author|Writer)\s*<a[^>]+>([^<]+)'), webpage, 'uploader',
+        uploader = self._search_regex(
+            r'(?:Author|Writer)\s*<a[^>]+>([^<]+)', webpage, 'uploader',
             fatal=False)

-        timestamp = unified_timestamp(self._html_search_regex(
-            (r'<dt>\s*Uploaded\s*</dt>\s*<dd>([^<]+</dd>\s*<dd>[^<]+)',
-             r'<dt>\s*Uploaded\s*</dt>\s*<dd>([^<]+)'), webpage, 'timestamp',
+        timestamp = unified_timestamp(self._search_regex(
+            r'<dt>Uploaded</dt>\s*<dd>([^<]+)', webpage, 'timestamp',
             default=None))
         duration = parse_duration(self._search_regex(
-            r'(?s)<dd>\s*Song\s*</dd>\s*<dd>.+?</dd>\s*<dd>([^<]+)', webpage,
-            'duration', default=None))
+            r'<dd>Song\s*</dd><dd>.+?</dd><dd>([^<]+)', webpage, 'duration',
+            default=None))

         filesize_approx = parse_filesize(self._html_search_regex(
-            r'(?s)<dd>\s*Song\s*</dd>\s*<dd>(.+?)</dd>', webpage, 'filesize',
+            r'<dd>Song\s*</dd><dd>(.+?)</dd>', webpage, 'filesize',
             default=None))
         if len(formats) == 1:
             formats[0]['filesize_approx'] = filesize_approx
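The `-` side's patterns add `(?s)` (DOTALL) plus `\s*` so that `.` and the separators can span line breaks in the markup; a quick illustration against a fabricated HTML snippet:

```python
import re

# Made-up page fragment with newlines between the <dd> cells.
html = '<dd>\n  Song\n</dd>\n<dd>skipped</dd>\n<dd>4.5 MB</dd>'

# The strict pattern fails: without (?s)/\s*, nothing crosses the newlines.
assert re.search(r'<dd>Song\s*</dd><dd>(.+?)</dd>', html) is None

# The whitespace-tolerant DOTALL pattern matches across them.
m = re.search(r'(?s)<dd>\s*Song\s*</dd>\s*<dd>(.+?)</dd>', html)
print(m.group(1))  # skipped
```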
@@ -21,8 +21,7 @@ class NexxIE(InfoExtractor):
     _VALID_URL = r'''(?x)
                         (?:
                             https?://api\.nexx(?:\.cloud|cdn\.com)/v3/(?P<domain_id>\d+)/videos/byid/|
-                            nexx:(?:(?P<domain_id_s>\d+):)?|
-                            https?://arc\.nexx\.cloud/api/video/
+                            nexx:(?P<domain_id_s>\d+):
                         )
                         (?P<id>\d+)
                     '''
@@ -62,33 +61,12 @@ class NexxIE(InfoExtractor):
         'params': {
             'skip_download': True,
         },
-    }, {
-        # does not work via arc
-        'url': 'nexx:741:1269984',
-        'md5': 'c714b5b238b2958dc8d5642addba6886',
-        'info_dict': {
-            'id': '1269984',
-            'ext': 'mp4',
-            'title': '1 TAG ohne KLO... wortwörtlich! 😑',
-            'alt_title': '1 TAG ohne KLO... wortwörtlich! 😑',
-            'description': 'md5:4604539793c49eda9443ab5c5b1d612f',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 607,
-            'timestamp': 1518614955,
-            'upload_date': '20180214',
-        },
     }, {
         'url': 'https://api.nexxcdn.com/v3/748/videos/byid/128907',
         'only_matching': True,
     }, {
         'url': 'nexx:748:128907',
         'only_matching': True,
-    }, {
-        'url': 'nexx:128907',
-        'only_matching': True,
-    }, {
-        'url': 'https://arc.nexx.cloud/api/video/128907.json',
-        'only_matching': True,
     }]

     @staticmethod
@@ -146,18 +124,6 @@ class NexxIE(InfoExtractor):
         domain_id = mobj.group('domain_id') or mobj.group('domain_id_s')
         video_id = mobj.group('id')

-        video = None
-
-        response = self._download_json(
-            'https://arc.nexx.cloud/api/video/%s.json' % video_id,
-            video_id, fatal=False)
-        if response and isinstance(response, dict):
-            result = response.get('result')
-            if result and isinstance(result, dict):
-                video = result
-
-        # not all videos work via arc, e.g. nexx:741:1269984
-        if not video:
         # Reverse engineered from JS code (see getDeviceID function)
         device_id = '%d:%d:%d%d' % (
             random.randint(1, 4), int(time.time()),
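The removed arc-API probe only trusts a dict-typed `result` inside a dict response, falling back to the device-ID path for anything else; the guard in isolation:

```python
def pick_video(response):
    # Accept only a non-empty dict 'result' from a dict response;
    # None, error payloads, and wrong-typed results all yield None,
    # which the caller treats as "try the other API".
    video = None
    if response and isinstance(response, dict):
        result = response.get('result')
        if result and isinstance(result, dict):
            video = result
    return video
```

For example, `pick_video({'result': {'id': '128907'}})` returns the inner dict, while `pick_video(None)` and `pick_video({'result': []})` both return `None`.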
@@ -198,7 +198,7 @@ class NickNightIE(NickDeIE):

 class NickRuIE(MTVServicesInfoExtractor):
     IE_NAME = 'nickelodeonru'
-    _VALID_URL = r'https?://(?:www\.)nickelodeon\.(?:ru|fr|es|pt|ro|hu|com\.tr)/[^/]+/(?:[^/]+/)*(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:www\.)nickelodeon\.(?:ru|fr|es|pt|ro|hu)/[^/]+/(?:[^/]+/)*(?P<id>[^/?#&]+)'
     _TESTS = [{
         'url': 'http://www.nickelodeon.ru/shows/henrydanger/videos/episodes/3-sezon-15-seriya-licenziya-na-polyot/pmomfb#playlist/7airc6',
         'only_matching': True,
@@ -220,9 +220,6 @@ class NickRuIE(MTVServicesInfoExtractor):
     }, {
         'url': 'http://www.nickelodeon.hu/musorok/spongyabob-kockanadrag/videok/episodes/buborekfujas-az-elszakadt-nadrag/q57iob#playlist/k6te4y',
         'only_matching': True,
-    }, {
-        'url': 'http://www.nickelodeon.com.tr/programlar/sunger-bob/videolar/kayip-yatak/mgqbjy',
-        'only_matching': True,
     }]

     def _real_extract(self, url):
|
@@ -13,7 +13,7 @@ class NineGagIE(InfoExtractor):
     _TESTS = [{
         'url': 'http://9gag.com/tv/p/Kk2X5/people-are-awesome-2013-is-absolutely-awesome',
         'info_dict': {
-            'id': 'kXzwOKyGlSA',
+            'id': 'Kk2X5',
             'ext': 'mp4',
             'description': 'This 3-minute video will make you smile and then make you feel untalented and insignificant. Anyway, you should share this awesomeness. (Thanks, Dino!)',
             'title': '\"People Are Awesome 2013\" Is Absolutely Awesome',
@@ -4,17 +4,15 @@ from __future__ import unicode_literals
 from .common import InfoExtractor
 from ..compat import compat_str
 from ..utils import (
-    ExtractorError,
     int_or_none,
     float_or_none,
-    smuggle_url,
+    ExtractorError,
 )


 class NineNowIE(InfoExtractor):
     IE_NAME = '9now.com.au'
     _VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/]+/){2}(?P<id>[^/?#]+)'
-    _GEO_COUNTRIES = ['AU']
     _TESTS = [{
         # clip
         'url': 'https://www.9now.com.au/afl-footy-show/2016/clip-ciql02091000g0hp5oktrnytc',
@@ -77,9 +75,7 @@ class NineNowIE(InfoExtractor):

         return {
             '_type': 'url_transparent',
-            'url': smuggle_url(
-                self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
-                {'geo_countries': self._GEO_COUNTRIES}),
+            'url': self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
             'id': video_id,
             'title': title,
             'description': common_data.get('description'),
@@ -43,8 +43,7 @@ class NJPWWorldIE(InfoExtractor):
         webpage, urlh = self._download_webpage_handle(
             'https://njpwworld.com/auth/login', None,
             note='Logging in', errnote='Unable to login',
-            data=urlencode_postdata({'login_id': username, 'pw': password}),
-            headers={'Referer': 'https://njpwworld.com/auth'})
+            data=urlencode_postdata({'login_id': username, 'pw': password}))
         # /auth/login will return 302 for successful logins
         if urlh.geturl() == 'https://njpwworld.com/auth/login':
             self.report_warning('unable to login')
@@ -11,7 +11,6 @@ from ..utils import (
     determine_ext,
     ExtractorError,
     fix_xml_ampersands,
-    int_or_none,
     orderedSet,
     parse_duration,
     qualities,
@@ -39,7 +38,7 @@ class NPOIE(NPOBaseIE):
                             npo\.nl/(?!(?:live|radio)/)(?:[^/]+/){2}|
                             ntr\.nl/(?:[^/]+/){2,}|
                             omroepwnl\.nl/video/fragment/[^/]+__|
-                            (?:zapp|npo3)\.nl/(?:[^/]+/){2,}
+                            (?:zapp|npo3)\.nl/(?:[^/]+/){2}
                         )
                     )
                     (?P<id>[^/?#]+)
@@ -157,9 +156,6 @@ class NPOIE(NPOBaseIE):
     }, {
         'url': 'http://www.npo.nl/radio-gaga/13-06-2017/BNN_101383373',
         'only_matching': True,
-    }, {
-        'url': 'https://www.zapp.nl/1803-skelterlab/instructie-video-s/740-instructievideo-s/POMS_AT_11736927',
-        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -174,10 +170,6 @@ class NPOIE(NPOBaseIE):
             transform_source=strip_jsonp,
         )

-        error = metadata.get('error')
-        if error:
-            raise ExtractorError(error, expected=True)
-
         # For some videos actual video id (prid) is different (e.g. for
         # http://www.omroepwnl.nl/video/fragment/vandaag-de-dag-verkiezingen__POMS_WNL_853698
         # video id is POMS_WNL_853698 but prid is POW_00996502)
@@ -195,15 +187,7 @@ class NPOIE(NPOBaseIE):
         formats = []
         urls = set()

-        def is_legal_url(format_url):
-            return format_url and format_url not in urls and re.match(
-                r'^(?:https?:)?//', format_url)
-
-        QUALITY_LABELS = ('Laag', 'Normaal', 'Hoog')
-        QUALITY_FORMATS = ('adaptive', 'wmv_sb', 'h264_sb', 'wmv_bb', 'h264_bb', 'wvc1_std', 'h264_std')
-
-        quality_from_label = qualities(QUALITY_LABELS)
-        quality_from_format_id = qualities(QUALITY_FORMATS)
+        quality = qualities(['adaptive', 'wmv_sb', 'h264_sb', 'wmv_bb', 'h264_bb', 'wvc1_std', 'h264_std'])
         items = self._download_json(
             'http://ida.omroep.nl/app.php/%s' % video_id, video_id,
             'Downloading formats JSON', query={
@@ -212,34 +196,18 @@ class NPOIE(NPOBaseIE):
         })['items'][0]
         for num, item in enumerate(items):
             item_url = item.get('url')
-            if not is_legal_url(item_url):
+            if not item_url or item_url in urls:
                 continue
             urls.add(item_url)
             format_id = self._search_regex(
                 r'video/ida/([^/]+)', item_url, 'format id',
                 default=None)

-            item_label = item.get('label')
-
             def add_format_url(format_url):
-                width = int_or_none(self._search_regex(
-                    r'(\d+)[xX]\d+', format_url, 'width', default=None))
-                height = int_or_none(self._search_regex(
-                    r'\d+[xX](\d+)', format_url, 'height', default=None))
-                if item_label in QUALITY_LABELS:
-                    quality = quality_from_label(item_label)
-                    f_id = item_label
-                elif item_label in QUALITY_FORMATS:
-                    quality = quality_from_format_id(format_id)
-                    f_id = format_id
-                else:
-                    quality, f_id = [None] * 2
                 formats.append({
                     'url': format_url,
-                    'format_id': f_id,
-                    'width': width,
-                    'height': height,
-                    'quality': quality,
+                    'format_id': format_id,
+                    'quality': quality(format_id),
                 })

         # Example: http://www.npo.nl/de-nieuwe-mens-deel-1/21-07-2010/WO_VPRO_043706
@@ -251,7 +219,7 @@ class NPOIE(NPOBaseIE):
                     stream_info = self._download_json(
                         item_url + '&type=json', video_id,
                         'Downloading %s stream JSON'
-                        % item_label or item.get('format') or format_id or num)
+                        % item.get('label') or item.get('format') or format_id or num)
                 except ExtractorError as ee:
                     if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404:
                         error = (self._parse_json(
@@ -283,7 +251,7 @@ class NPOIE(NPOBaseIE):
         if not is_live:
             for num, stream in enumerate(metadata.get('streams', [])):
                 stream_url = stream.get('url')
-                if not is_legal_url(stream_url):
+                if not stream_url or stream_url in urls:
                     continue
                 urls.add(stream_url)
                 # smooth streaming is not supported
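Both sides of the NPO hunks above rank formats with youtube-dl's `qualities()` helper. A minimal sketch of the idea behind it: given a preference-ordered list, it returns a function mapping a format label to its rank (higher is better), with -1 for unknown labels.

```python
# Minimal sketch of the qualities() pattern used above: the returned
# closure turns a format id into its position in the preference list,
# so formats can be sorted by a single integer key.
def qualities(quality_ids):
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q

quality = qualities(['adaptive', 'wmv_sb', 'h264_sb', 'wmv_bb',
                     'h264_bb', 'wvc1_std', 'h264_std'])
print(quality('h264_std'))  # 6, the most preferred
print(quality('unknown'))   # -1
```

This single-integer ranking is what lets each format dict carry a `'quality'` field that `_sort_formats()` can compare directly.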
@@ -19,11 +19,11 @@ from ..utils import (


 class OdnoklassnikiIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:(?:www|m|mobile)\.)?(?:odnoklassniki|ok)\.ru/(?:video(?:embed)?|web-api/video/moviePlayer|live)/(?P<id>[\d-]+)'
+    _VALID_URL = r'https?://(?:(?:www|m|mobile)\.)?(?:odnoklassniki|ok)\.ru/(?:video(?:embed)?|web-api/video/moviePlayer)/(?P<id>[\d-]+)'
     _TESTS = [{
         # metadata in JSON
         'url': 'http://ok.ru/video/20079905452',
-        'md5': '0b62089b479e06681abaaca9d204f152',
+        'md5': '6ba728d85d60aa2e6dd37c9e70fdc6bc',
         'info_dict': {
             'id': '20079905452',
             'ext': 'mp4',
@@ -35,6 +35,7 @@ class OdnoklassnikiIE(InfoExtractor):
             'like_count': int,
             'age_limit': 0,
         },
+        'skip': 'Video has been blocked',
     }, {
         # metadataUrl
         'url': 'http://ok.ru/video/63567059965189-0?fromTime=5',
@@ -98,9 +99,6 @@ class OdnoklassnikiIE(InfoExtractor):
     }, {
         'url': 'http://mobile.ok.ru/video/20079905452',
         'only_matching': True,
-    }, {
-        'url': 'https://www.ok.ru/live/484531969818',
-        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -186,10 +184,6 @@ class OdnoklassnikiIE(InfoExtractor):
             })
             return info

-        assert title
-        if provider == 'LIVE_TV_APP':
-            info['title'] = self._live_title(title)
-
         quality = qualities(('4', '0', '1', '2', '3', '5'))

         formats = [{
@@ -216,20 +210,6 @@ class OdnoklassnikiIE(InfoExtractor):
         if fmt_type:
             fmt['quality'] = quality(fmt_type)

-        # Live formats
-        m3u8_url = metadata.get('hlsMasterPlaylistUrl')
-        if m3u8_url:
-            formats.extend(self._extract_m3u8_formats(
-                m3u8_url, video_id, 'mp4', entry_protocol='m3u8',
-                m3u8_id='hls', fatal=False))
-        rtmp_url = metadata.get('rtmpUrl')
-        if rtmp_url:
-            formats.append({
-                'url': rtmp_url,
-                'format_id': 'rtmp',
-                'ext': 'flv',
-            })
-
         self._sort_formats(formats)

         info['formats'] = formats
@@ -1,13 +1,9 @@
 from __future__ import unicode_literals

 import re
+import base64

 from .common import InfoExtractor
-from ..compat import (
-    compat_b64decode,
-    compat_str,
-    compat_urllib_parse_urlencode,
-)
+from ..compat import compat_str
 from ..utils import (
     determine_ext,
     ExtractorError,
@@ -16,6 +12,7 @@ from ..utils import (
     try_get,
     unsmuggle_url,
 )
+from ..compat import compat_urllib_parse_urlencode


 class OoyalaBaseIE(InfoExtractor):
@@ -47,7 +44,7 @@ class OoyalaBaseIE(InfoExtractor):
             url_data = try_get(stream, lambda x: x['url']['data'], compat_str)
             if not url_data:
                 continue
-            s_url = compat_b64decode(url_data).decode('utf-8')
+            s_url = base64.b64decode(url_data.encode('ascii')).decode('utf-8')
             if not s_url or s_url in urls:
                 continue
             urls.append(s_url)
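The Ooyala change above swaps the `compat_b64decode` wrapper for a direct `base64` call. A minimal sketch of what that line does: the base64 payload is encoded to ASCII bytes first so `base64.b64decode()` behaves the same on Python 2 and 3, and the decoded bytes are then turned back into a text URL.

```python
import base64

# Sketch of the stream-URL decoding above: str -> ASCII bytes ->
# base64-decode -> UTF-8 text.
def decode_stream_url(url_data):
    return base64.b64decode(url_data.encode('ascii')).decode('utf-8')

encoded = base64.b64encode(b'https://example.com/video.m3u8').decode('ascii')
print(decode_stream_url(encoded))  # https://example.com/video.m3u8
```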
@@ -1,8 +1,6 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..compat import (
     compat_str,
@@ -20,14 +18,7 @@ from ..utils import (
 class PandoraTVIE(InfoExtractor):
     IE_NAME = 'pandora.tv'
     IE_DESC = '판도라TV'
-    _VALID_URL = r'''(?x)
-        https?://
-            (?:
-                (?:www\.)?pandora\.tv/view/(?P<user_id>[^/]+)/(?P<id>\d+)|  # new format
-                (?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?|  # old format
-                m\.pandora\.tv/?\?  # mobile
-            )
-    '''
+    _VALID_URL = r'https?://(?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?'
     _TESTS = [{
         'url': 'http://jp.channel.pandora.tv/channel/video.ptv?c1=&prgid=53294230&ch_userid=mikakim&ref=main&lot=cate_01_2',
         'info_dict': {
@@ -62,20 +53,9 @@ class PandoraTVIE(InfoExtractor):
             # Test metadata only
             'skip_download': True,
         },
-    }, {
-        'url': 'http://www.pandora.tv/view/mikakim/53294230#36797454_new',
-        'only_matching': True,
-    }, {
-        'url': 'http://m.pandora.tv/?c=view&ch_userid=mikakim&prgid=54600346',
-        'only_matching': True,
     }]

     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        user_id = mobj.group('user_id')
-        video_id = mobj.group('id')
-
-        if not user_id or not video_id:
-            qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
-            video_id = qs.get('prgid', [None])[0]
-            user_id = qs.get('ch_userid', [None])[0]
+        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+        video_id = qs.get('prgid', [None])[0]
+        user_id = qs.get('ch_userid', [None])[0]
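Both sides of the PandoraTV hunk extract the IDs from the URL's query string. A standalone sketch of that pattern: `parse_qs()` maps each parameter to a *list* of values, so each one is unwrapped with `.get(key, [None])[0]`.

```python
try:  # same idea as youtube-dl's compat_urlparse shim
    from urllib.parse import urlparse, parse_qs
except ImportError:  # Python 2
    from urlparse import urlparse, parse_qs

# Sketch of the query-string extraction above: pull prgid/ch_userid out
# of the URL, defaulting to None when a parameter is absent.
def extract_ids(url):
    qs = parse_qs(urlparse(url).query)
    return qs.get('ch_userid', [None])[0], qs.get('prgid', [None])[0]

print(extract_ids(
    'http://jp.channel.pandora.tv/channel/video.ptv?c1=&prgid=53294230&ch_userid=mikakim'))
# ('mikakim', '53294230')
```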
@@ -56,16 +56,18 @@ class PeriscopeIE(PeriscopeBaseIE):
     def _real_extract(self, url):
         token = self._match_id(url)

-        stream = self._call_api(
-            'accessVideoPublic', {'broadcast_id': token}, token)
+        broadcast_data = self._call_api(
+            'getBroadcastPublic', {'broadcast_id': token}, token)
+        broadcast = broadcast_data['broadcast']
+        status = broadcast['status']

-        broadcast = stream['broadcast']
-        title = broadcast['status']
+        user = broadcast_data.get('user', {})

-        uploader = broadcast.get('user_display_name') or broadcast.get('username')
-        uploader_id = (broadcast.get('user_id') or broadcast.get('username'))
+        uploader = broadcast.get('user_display_name') or user.get('display_name')
+        uploader_id = (broadcast.get('username') or user.get('username') or
+                       broadcast.get('user_id') or user.get('id'))

-        title = '%s - %s' % (uploader, title) if uploader else title
+        title = '%s - %s' % (uploader, status) if uploader else status
         state = broadcast.get('state').lower()
         if state == 'running':
             title = self._live_title(title)
@@ -75,6 +77,9 @@ class PeriscopeIE(PeriscopeBaseIE):
             'url': broadcast[image],
         } for image in ('image_url', 'image_url_small') if broadcast.get(image)]

+        stream = self._call_api(
+            'getAccessPublic', {'broadcast_id': token}, token)
+
         video_urls = set()
         formats = []
         for format_id in ('replay', 'rtmp', 'hls', 'https_hls', 'lhls', 'lhlsweb'):
@@ -4,9 +4,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..compat import compat_urlparse
 from ..utils import (
-    determine_ext,
     ExtractorError,
     int_or_none,
     xpath_text,
@@ -28,15 +26,17 @@ class PladformIE(InfoExtractor):
         (?P<id>\d+)
     '''
     _TESTS = [{
-        'url': 'https://out.pladform.ru/player?pl=64471&videoid=3777899&vk_puid15=0&vk_puid34=0',
-        'md5': '53362fac3a27352da20fa2803cc5cd6f',
+        # http://muz-tv.ru/kinozal/view/7400/
+        'url': 'http://out.pladform.ru/player?pl=24822&videoid=100183293',
+        'md5': '61f37b575dd27f1bb2e1854777fe31f4',
         'info_dict': {
-            'id': '3777899',
+            'id': '100183293',
             'ext': 'mp4',
-            'title': 'СТУДИЯ СОЮЗ • Шоу Студия Союз, 24 выпуск (01.02.2018) Нурлан Сабуров и Слава Комиссаренко',
-            'description': 'md5:05140e8bf1b7e2d46e7ba140be57fd95',
+            'title': 'Тайны перевала Дятлова • 1 серия 2 часть',
+            'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 3190,
+            'duration': 694,
+            'age_limit': 0,
         },
     }, {
         'url': 'http://static.pladform.ru/player.swf?pl=21469&videoid=100183293&vkcid=0',
@@ -56,48 +56,22 @@ class PladformIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)

-        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
-        pl = qs.get('pl', ['1'])[0]
-
         video = self._download_xml(
-            'http://out.pladform.ru/getVideo', video_id, query={
-                'pl': pl,
-                'videoid': video_id,
-            })
-
-        def fail(text):
-            raise ExtractorError(
-                '%s returned error: %s' % (self.IE_NAME, text),
-                expected=True)
+            'http://out.pladform.ru/getVideo?pl=1&videoid=%s' % video_id,
+            video_id)

         if video.tag == 'error':
-            fail(video.text)
+            raise ExtractorError(
+                '%s returned error: %s' % (self.IE_NAME, video.text),
+                expected=True)

         quality = qualities(('ld', 'sd', 'hd'))

-        formats = []
-        for src in video.findall('./src'):
-            if src is None:
-                continue
-            format_url = src.text
-            if not format_url:
-                continue
-            if src.get('type') == 'hls' or determine_ext(format_url) == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(
-                    format_url, video_id, 'mp4', entry_protocol='m3u8_native',
-                    m3u8_id='hls', fatal=False))
-            else:
-                formats.append({
-                    'url': src.text,
-                    'format_id': src.get('quality'),
-                    'quality': quality(src.get('quality')),
-                })
-
-        if not formats:
-            error = xpath_text(video, './cap', 'error', default=None)
-            if error:
-                fail(error)
+        formats = [{
+            'url': src.text,
+            'format_id': src.get('quality'),
+            'quality': quality(src.get('quality')),
+        } for src in video.findall('./src')]

         self._sort_formats(formats)

         webpage = self._download_webpage(
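The branch side of the Pladform hunk builds its format list with one comprehension over the `<src>` elements of the downloaded XML. A standalone sketch of that shape, using `xml.etree.ElementTree` and a sample document that is illustrative only, not Pladform's actual response:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response: one <src> element per available format.
SAMPLE = ('<video>'
          '<src quality="ld">http://cdn.example/v_ld.mp4</src>'
          '<src quality="hd">http://cdn.example/v_hd.mp4</src>'
          '</video>')

def parse_formats(xml_text):
    video = ET.fromstring(xml_text)
    order = ('ld', 'sd', 'hd')  # stand-in for qualities(('ld', 'sd', 'hd'))
    return [{
        'url': src.text,
        'format_id': src.get('quality'),
        'quality': order.index(src.get('quality')),
    } for src in video.findall('./src')]

print([f['format_id'] for f in parse_formats(SAMPLE)])  # ['ld', 'hd']
```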
@@ -11,34 +11,19 @@ from ..utils import (


 class PokemonIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/(?:[^/]+/)+(?P<display_id>[^/?#&]+))'
+    _VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/[^/]+/\d+_\d+-(?P<display_id>[^/?#]+))'
     _TESTS = [{
-        'url': 'https://www.pokemon.com/us/pokemon-episodes/20_30-the-ol-raise-and-switch/',
-        'md5': '2fe8eaec69768b25ef898cda9c43062e',
+        'url': 'http://www.pokemon.com/us/pokemon-episodes/19_01-from-a-to-z/?play=true',
+        'md5': '9fb209ae3a569aac25de0f5afc4ee08f',
         'info_dict': {
-            'id': 'afe22e30f01c41f49d4f1d9eab5cd9a4',
+            'id': 'd0436c00c3ce4071ac6cee8130ac54a1',
             'ext': 'mp4',
-            'title': 'The Ol’ Raise and Switch!',
-            'description': 'md5:7db77f7107f98ba88401d3adc80ff7af',
-            'timestamp': 1511824728,
-            'upload_date': '20171127',
-        },
-        'add_id': ['LimelightMedia'],
-    }, {
-        # no data-video-title
-        'url': 'https://www.pokemon.com/us/pokemon-episodes/pokemon-movies/pokemon-the-rise-of-darkrai-2008',
-        'info_dict': {
-            'id': '99f3bae270bf4e5097274817239ce9c8',
-            'ext': 'mp4',
-            'title': 'Pokémon: The Rise of Darkrai',
-            'description': 'md5:ea8fbbf942e1e497d54b19025dd57d9d',
-            'timestamp': 1417778347,
-            'upload_date': '20141205',
-        },
-        'add_id': ['LimelightMedia'],
-        'params': {
-            'skip_download': True,
+            'title': 'From A to Z!',
+            'description': 'Bonnie makes a new friend, Ash runs into an old friend, and a terrifying premonition begins to unfold!',
+            'timestamp': 1460478136,
+            'upload_date': '20160412',
         },
+        'add_id': ['LimelightMedia']
     }, {
         'url': 'http://www.pokemon.com/uk/pokemon-episodes/?play=2e8b5c761f1d4a9286165d7748c1ece2',
         'only_matching': True,
@@ -57,9 +42,7 @@ class PokemonIE(InfoExtractor):
             r'(<[^>]+data-video-id="%s"[^>]*>)' % (video_id if video_id else '[a-z0-9]{32}'),
             webpage, 'video data element'))
         video_id = video_data['data-video-id']
-        title = video_data.get('data-video-title') or self._html_search_meta(
-            'pkm-title', webpage, ' title', default=None) or self._search_regex(
-            r'<h1[^>]+\bclass=["\']us-title[^>]+>([^<]+)', webpage, 'title')
+        title = video_data['data-video-title']
         return {
             '_type': 'url_transparent',
             'id': video_id,
@@ -115,13 +115,12 @@ class PornHubIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)

-        self._set_cookie('pornhub.com', 'age_verified', '1')
-
         def dl_webpage(platform):
-            self._set_cookie('pornhub.com', 'platform', platform)
             return self._download_webpage(
                 'http://www.pornhub.com/view_video.php?viewkey=%s' % video_id,
-                video_id)
+                video_id, headers={
+                    'Cookie': 'age_verified=1; platform=%s' % platform,
+                })

         webpage = dl_webpage('pc')

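The hunk above replaces `_set_cookie()` calls with cookies passed directly in a `Cookie` request header. A stdlib-only sketch of that approach (youtube-dl itself routes the headers through `_download_webpage`, so this is an illustration of the header shape, not the extractor's actual request path):

```python
try:
    from urllib.request import Request
except ImportError:  # Python 2
    from urllib2 import Request

# Sketch of the change above: instead of registering cookies in a
# cookiejar, the request simply carries them as a literal Cookie header.
def build_request(video_id, platform):
    return Request(
        'http://www.pornhub.com/view_video.php?viewkey=%s' % video_id,
        headers={'Cookie': 'age_verified=1; platform=%s' % platform})

req = build_request('abc123', 'pc')
print(req.get_header('Cookie'))  # age_verified=1; platform=pc
```

The trade-off is that header cookies are one-shot: nothing the server sets back is remembered, which is exactly why the original code used the cookiejar.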
@@ -276,7 +275,7 @@ class PornHubPlaylistIE(PornHubPlaylistBaseIE):


 class PornHubUserVideosIE(PornHubPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?pornhub\.com/(?:user|channel)s/(?P<id>[^/]+)/videos'
+    _VALID_URL = r'https?://(?:www\.)?pornhub\.com/users/(?P<id>[^/]+)/videos'
     _TESTS = [{
         'url': 'http://www.pornhub.com/users/zoe_ph/videos/public',
         'info_dict': {
@@ -286,25 +285,6 @@ class PornHubUserVideosIE(PornHubPlaylistBaseIE):
     }, {
         'url': 'http://www.pornhub.com/users/rushandlia/videos',
         'only_matching': True,
-    }, {
-        # default sorting as Top Rated Videos
-        'url': 'https://www.pornhub.com/channels/povd/videos',
-        'info_dict': {
-            'id': 'povd',
-        },
-        'playlist_mincount': 293,
-    }, {
-        # Top Rated Videos
-        'url': 'https://www.pornhub.com/channels/povd/videos?o=ra',
-        'only_matching': True,
-    }, {
-        # Most Recent Videos
-        'url': 'https://www.pornhub.com/channels/povd/videos?o=da',
-        'only_matching': True,
-    }, {
-        # Most Viewed Videos
-        'url': 'https://www.pornhub.com/channels/povd/videos?o=vi',
-        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -129,11 +129,10 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
                     https?://
                         (?:www\.)?
                         (?:
-                            (?:beta\.)?
                             (?:
                                 prosieben(?:maxx)?|sixx|sat1(?:gold)?|kabeleins(?:doku)?|the-voice-of-germany|7tv|advopedia
                             )\.(?:de|at|ch)|
-                            ran\.de|fem\.com|advopedia\.de|galileo\.tv/video
+                            ran\.de|fem\.com|advopedia\.de
                         )
                         /(?P<id>.+)
                     '''
@@ -326,11 +325,6 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
             'url': 'http://www.sat1gold.de/tv/edel-starck/video/11-staffel-1-episode-1-partner-wider-willen-ganze-folge',
             'only_matching': True,
         },
-        {
-            # geo restricted to Germany
-            'url': 'https://www.galileo.tv/video/diese-emojis-werden-oft-missverstanden',
-            'only_matching': True,
-        },
         {
             'url': 'http://www.sat1gold.de/tv/edel-starck/playlist/die-gesamte-1-staffel',
             'only_matching': True,
@@ -348,10 +342,8 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
         r'"clip_id"\s*:\s+"(\d+)"',
         r'clipid: "(\d+)"',
         r'clip[iI]d=(\d+)',
-        r'clip[iI][dD]\s*=\s*["\'](\d+)',
+        r'clip[iI]d\s*=\s*["\'](\d+)',
         r"'itemImageUrl'\s*:\s*'/dynamic/thumbnails/full/\d+/(\d+)",
-        r'proMamsId"\s*:\s*"(\d+)',
-        r'proMamsId"\s*:\s*"(\d+)',
     ]
     _TITLE_REGEXES = [
         r'<h2 class="subtitle" itemprop="name">\s*(.+?)</h2>',
youtube_dl/extractor/raywenderlich.py

@@ -1,102 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from .vimeo import VimeoIE
-from ..utils import (
-    extract_attributes,
-    ExtractorError,
-    smuggle_url,
-    unsmuggle_url,
-    urljoin,
-)
-
-
-class RayWenderlichIE(InfoExtractor):
-    _VALID_URL = r'https?://videos\.raywenderlich\.com/courses/(?P<course_id>[^/]+)/lessons/(?P<id>\d+)'
-
-    _TESTS = [{
-        'url': 'https://videos.raywenderlich.com/courses/105-testing-in-ios/lessons/1',
-        'info_dict': {
-            'id': '248377018',
-            'ext': 'mp4',
-            'title': 'Testing In iOS Episode 1: Introduction',
-            'duration': 133,
-            'uploader': 'Ray Wenderlich',
-            'uploader_id': 'user3304672',
-        },
-        'params': {
-            'noplaylist': True,
-            'skip_download': True,
-        },
-        'add_ie': [VimeoIE.ie_key()],
-        'expected_warnings': ['HTTP Error 403: Forbidden'],
-    }, {
-        'url': 'https://videos.raywenderlich.com/courses/105-testing-in-ios/lessons/1',
-        'info_dict': {
-            'title': 'Testing in iOS',
-            'id': '105-testing-in-ios',
-        },
-        'params': {
-            'noplaylist': False,
-        },
-        'playlist_count': 29,
-    }]
-
-    def _real_extract(self, url):
-        url, smuggled_data = unsmuggle_url(url, {})
-
-        mobj = re.match(self._VALID_URL, url)
-        course_id, lesson_id = mobj.group('course_id', 'id')
-        video_id = '%s/%s' % (course_id, lesson_id)
-
-        webpage = self._download_webpage(url, video_id)
-
-        no_playlist = self._downloader.params.get('noplaylist')
-        if no_playlist or smuggled_data.get('force_video', False):
-            if no_playlist:
-                self.to_screen(
-                    'Downloading just video %s because of --no-playlist'
-                    % video_id)
-            if '>Subscribe to unlock' in webpage:
-                raise ExtractorError(
-                    'This content is only available for subscribers',
-                    expected=True)
-            vimeo_id = self._search_regex(
-                r'data-vimeo-id=["\'](\d+)', webpage, 'video id')
-            return self.url_result(
-                VimeoIE._smuggle_referrer(
-                    'https://player.vimeo.com/video/%s' % vimeo_id, url),
-                ie=VimeoIE.ie_key(), video_id=vimeo_id)
-
-        self.to_screen(
-            'Downloading playlist %s - add --no-playlist to just download video'
-            % course_id)
-
-        lesson_ids = set((lesson_id, ))
-        for lesson in re.findall(
-                r'(<a[^>]+\bclass=["\']lesson-link[^>]+>)', webpage):
-            attrs = extract_attributes(lesson)
-            if not attrs:
-                continue
-            lesson_url = attrs.get('href')
-            if not lesson_url:
-                continue
-            lesson_id = self._search_regex(
-                r'/lessons/(\d+)', lesson_url, 'lesson id', default=None)
-            if not lesson_id:
-                continue
-            lesson_ids.add(lesson_id)
-
-        entries = []
-        for lesson_id in sorted(lesson_ids):
-            entries.append(self.url_result(
-                smuggle_url(urljoin(url, lesson_id), {'force_video': True}),
-                ie=RayWenderlichIE.ie_key()))
-
-        title = self._search_regex(
-            r'class=["\']course-title[^>]+>([^<]+)', webpage, 'course title',
-            default=None)
-
-        return self.playlist_result(entries, course_id, title)
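The deleted extractor above passes `{'force_video': True}` between playlist and single-video extraction via `smuggle_url`/`unsmuggle_url`. As a rough illustration of that pattern (a simplified sketch, not the actual `youtube_dl.utils` implementation — the real one also handles URLs that already contain fragments):

```python
# Illustrative sketch of the smuggle/unsmuggle pattern the deleted
# RayWenderlichIE relied on: extra data is packed into the URL fragment
# on the way in and recovered on the way out.
import json

try:
    from urllib.parse import quote, unquote  # Python 3
except ImportError:
    from urllib import quote, unquote  # Python 2

def smuggle_url(url, data):
    # append the JSON-encoded data as a recognizable URL fragment
    return url + '#__youtubedl_smuggle=' + quote(json.dumps(data))

def unsmuggle_url(smug_url, default=None):
    # recover (plain_url, data); return the default when nothing was smuggled
    if '#__youtubedl_smuggle' not in smug_url:
        return smug_url, default
    url, _, sdata = smug_url.rpartition('#')
    jsond = sdata.partition('=')[2]
    return url, json.loads(unquote(jsond))
```

In the extractor this is what lets a playlist entry force the single-video code path on the second pass.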
youtube_dl/extractor/redbulltv.py

@@ -5,93 +5,135 @@ from .common import InfoExtractor
 from ..compat import compat_HTTPError
 from ..utils import (
     float_or_none,
+    int_or_none,
+    try_get,
+    # unified_timestamp,
     ExtractorError,
 )
 
 
 class RedBullTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?redbull\.tv/video/(?P<id>AP-\w+)'
+    _VALID_URL = r'https?://(?:www\.)?redbull\.tv/(?:video|film|live)/(?:AP-\w+/segment/)?(?P<id>AP-\w+)'
     _TESTS = [{
         # film
-        'url': 'https://www.redbull.tv/video/AP-1Q6XCDTAN1W11',
+        'url': 'https://www.redbull.tv/video/AP-1Q756YYX51W11/abc-of-wrc',
         'md5': 'fb0445b98aa4394e504b413d98031d1f',
         'info_dict': {
-            'id': 'AP-1Q6XCDTAN1W11',
+            'id': 'AP-1Q756YYX51W11',
             'ext': 'mp4',
-            'title': 'ABC of... WRC - ABC of... S1E6',
+            'title': 'ABC of...WRC',
             'description': 'md5:5c7ed8f4015c8492ecf64b6ab31e7d31',
             'duration': 1582.04,
+            # 'timestamp': 1488405786,
+            # 'upload_date': '20170301',
         },
     }, {
         # episode
-        'url': 'https://www.redbull.tv/video/AP-1PMHKJFCW1W11',
+        'url': 'https://www.redbull.tv/video/AP-1PMT5JCWH1W11/grime?playlist=shows:shows-playall:web',
         'info_dict': {
-            'id': 'AP-1PMHKJFCW1W11',
+            'id': 'AP-1PMT5JCWH1W11',
             'ext': 'mp4',
-            'title': 'Grime - Hashtags S2E4',
-            'description': 'md5:b5f522b89b72e1e23216e5018810bb25',
+            'title': 'Grime - Hashtags S2 E4',
+            'description': 'md5:334b741c8c1ce65be057eab6773c1cf5',
             'duration': 904.6,
+            # 'timestamp': 1487290093,
+            # 'upload_date': '20170217',
+            'series': 'Hashtags',
+            'season_number': 2,
+            'episode_number': 4,
         },
         'params': {
             'skip_download': True,
         },
+    }, {
+        # segment
+        'url': 'https://www.redbull.tv/live/AP-1R5DX49XS1W11/segment/AP-1QSAQJ6V52111/semi-finals',
+        'info_dict': {
+            'id': 'AP-1QSAQJ6V52111',
+            'ext': 'mp4',
+            'title': 'Semi Finals - Vans Park Series Pro Tour',
+            'description': 'md5:306a2783cdafa9e65e39aa62f514fd97',
+            'duration': 11791.991,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        'url': 'https://www.redbull.tv/film/AP-1MSKKF5T92111/in-motion',
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
 
         session = self._download_json(
-            'https://api.redbull.tv/v3/session', video_id,
+            'https://api-v2.redbull.tv/session', video_id,
             note='Downloading access token', query={
+                'build': '4.370.0',
                 'category': 'personal_computer',
+                'os_version': '1.0',
                 'os_family': 'http',
             })
         if session.get('code') == 'error':
             raise ExtractorError('%s said: %s' % (
                 self.IE_NAME, session['message']))
-        token = session['token']
+        auth = '%s %s' % (session.get('token_type', 'Bearer'), session['access_token'])
 
         try:
-            video = self._download_json(
-                'https://api.redbull.tv/v3/products/' + video_id,
+            info = self._download_json(
+                'https://api-v2.redbull.tv/content/%s' % video_id,
                 video_id, note='Downloading video information',
-                headers={'Authorization': token}
+                headers={'Authorization': auth}
             )
         except ExtractorError as e:
             if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
                 error_message = self._parse_json(
-                    e.cause.read().decode(), video_id)['error']
+                    e.cause.read().decode(), video_id)['message']
                 raise ExtractorError('%s said: %s' % (
                     self.IE_NAME, error_message), expected=True)
             raise
 
-        title = video['title'].strip()
+        video = info['video_product']
+
+        title = info['title'].strip()
 
         formats = self._extract_m3u8_formats(
-            'https://dms.redbull.tv/v3/%s/%s/playlist.m3u8' % (video_id, token),
-            video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls')
+            video['url'], video_id, 'mp4', entry_protocol='m3u8_native',
+            m3u8_id='hls')
         self._sort_formats(formats)
 
         subtitles = {}
-        for resource in video.get('resources', []):
-            if resource.startswith('closed_caption_'):
-                splitted_resource = resource.split('_')
-                if splitted_resource[2]:
-                    subtitles.setdefault('en', []).append({
-                        'url': 'https://resources.redbull.tv/%s/%s' % (video_id, resource),
-                        'ext': splitted_resource[2],
-                    })
+        for _, captions in (try_get(
+                video, lambda x: x['attachments']['captions'],
+                dict) or {}).items():
+            if not captions or not isinstance(captions, list):
+                continue
+            for caption in captions:
+                caption_url = caption.get('url')
+                if not caption_url:
+                    continue
+                ext = caption.get('format')
+                if ext == 'xml':
+                    ext = 'ttml'
+                subtitles.setdefault(caption.get('lang') or 'en', []).append({
+                    'url': caption_url,
+                    'ext': ext,
+                })
 
-        subheading = video.get('subheading')
+        subheading = info.get('subheading')
         if subheading:
             title += ' - %s' % subheading
 
         return {
             'id': video_id,
             'title': title,
-            'description': video.get('long_description') or video.get(
+            'description': info.get('long_description') or info.get(
                 'short_description'),
             'duration': float_or_none(video.get('duration'), scale=1000),
+            # 'timestamp': unified_timestamp(info.get('published')),
+            'series': info.get('show_title'),
+            'season_number': int_or_none(info.get('season_number')),
+            'episode_number': int_or_none(info.get('episode_number')),
             'formats': formats,
             'subtitles': subtitles,
         }
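The core of the RedBullTV change above is how the session response is turned into an `Authorization` header. As a minimal sketch of the two styles visible in the diff (helper names are mine, for illustration only):

```python
# Illustrative sketch: the 2018.03.26 side sends the v3 session token
# verbatim, while the totalwebca (api-v2) side formats the header as
# '<token_type> <access_token>', defaulting the type to 'Bearer'.

def v3_auth_header(session):
    # new v3 API: the raw token is the header value
    return {'Authorization': session['token']}

def v2_auth_header(session):
    # reverted api-v2 style: token type + access token
    auth = '%s %s' % (session.get('token_type', 'Bearer'), session['access_token'])
    return {'Authorization': auth}

print(v2_auth_header({'access_token': 'abc123'}))
```

Both sides then pass the resulting dict as `headers=` to `_download_json` when fetching the video metadata.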
youtube_dl/extractor/reddit.py

@@ -15,7 +15,7 @@ class RedditIE(InfoExtractor):
     _TEST = {
         # from https://www.reddit.com/r/videos/comments/6rrwyj/that_small_heart_attack/
         'url': 'https://v.redd.it/zv89llsvexdz',
-        'md5': '0a070c53eba7ec4534d95a5a1259e253',
+        'md5': '655d06ace653ea3b87bccfb1b27ec99d',
         'info_dict': {
             'id': 'zv89llsvexdz',
             'ext': 'mp4',
youtube_dl/extractor/redtube.py

@@ -16,12 +16,12 @@ class RedTubeIE(InfoExtractor):
     _VALID_URL = r'https?://(?:(?:www\.)?redtube\.com/|embed\.redtube\.com/\?.*?\bid=)(?P<id>[0-9]+)'
     _TESTS = [{
         'url': 'http://www.redtube.com/66418',
-        'md5': 'fc08071233725f26b8f014dba9590005',
+        'md5': '7b8c22b5e7098a3e1c09709df1126d2d',
         'info_dict': {
             'id': '66418',
             'ext': 'mp4',
             'title': 'Sucked on a toilet',
-            'upload_date': '20110811',
+            'upload_date': '20120831',
             'duration': 596,
             'view_count': int,
             'age_limit': 18,
@@ -46,10 +46,9 @@ class RedTubeIE(InfoExtractor):
             raise ExtractorError('Video %s has been removed' % video_id, expected=True)
 
         title = self._html_search_regex(
-            (r'<h(\d)[^>]+class="(?:video_title_text|videoTitle)[^"]*">(?P<title>(?:(?!\1).)+)</h\1>',
-             r'(?:videoTitle|title)\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',),
-            webpage, 'title', group='title',
-            default=None) or self._og_search_title(webpage)
+            (r'<h1 class="videoTitle[^"]*">(?P<title>.+?)</h1>',
+             r'videoTitle\s*:\s*(["\'])(?P<title>)\1'),
+            webpage, 'title', group='title')
 
         formats = []
         sources = self._parse_json(
@@ -88,14 +87,12 @@ class RedTubeIE(InfoExtractor):
 
         thumbnail = self._og_search_thumbnail(webpage)
         upload_date = unified_strdate(self._search_regex(
-            r'<span[^>]+>ADDED ([^<]+)<',
+            r'<span[^>]+class="added-time"[^>]*>ADDED ([^<]+)<',
             webpage, 'upload date', fatal=False))
-        duration = int_or_none(self._og_search_property(
-            'video:duration', webpage, default=None) or self._search_regex(
+        duration = int_or_none(self._search_regex(
             r'videoDuration\s*:\s*(\d+)', webpage, 'duration', default=None))
         view_count = str_to_int(self._search_regex(
-            (r'<div[^>]*>Views</div>\s*<div[^>]*>\s*([\d,.]+)',
-             r'<span[^>]*>VIEWS</span>\s*</td>\s*<td>\s*([\d,.]+)'),
+            r'<span[^>]*>VIEWS</span></td>\s*<td>([\d,.]+)',
             webpage, 'view count', fatal=False))
 
         # No self-labeling, but they describe themselves as
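On both sides of the RedTube hunk above, the view-count regex captures numbers that may contain thousands separators (`[\d,.]+`), which `str_to_int` from `youtube_dl.utils` normalizes before parsing. A rough stand-in, for illustration only:

```python
import re

def str_to_int(int_str):
    # rough stand-in for youtube_dl.utils.str_to_int: strip the
    # comma/dot/plus characters used as separators, then parse
    if int_str is None:
        return None
    return int(re.sub(r'[,\.\+]', '', int_str))
```

This is why the capture groups can match `1,234,567` or `1.234.567` and still feed a plain `int` into the info dict.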
Some files were not shown because too many files have changed in this diff.