Compare commits

..

2 Commits

Author SHA1 Message Date
97bc05116e Merge branch 'master' into totalwebcasting 2018-01-07 15:03:28 +01:00
7608a91ee7 [totalwebcasting] Add new extractor 2017-01-11 18:51:25 -05:00
265 changed files with 4169 additions and 10009 deletions

View File

@ -6,13 +6,12 @@
--- ---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.06.14*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. ### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.12.31*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.06.14** - [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.12.31**
### Before submitting an *issue* make sure you have: ### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones - [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*? ### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl) - [ ] Bug report (encountered problems with youtube-dl)
@ -36,7 +35,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2018.06.14 [debug] youtube-dl version 2017.12.31
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}

View File

@ -12,7 +12,6 @@
### Before submitting an *issue* make sure you have: ### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones - [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*? ### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl) - [ ] Bug report (encountered problems with youtube-dl)

1
.gitignore vendored
View File

@ -47,4 +47,3 @@ youtube-dl.zsh
*.iml *.iml
tmp/ tmp/
venv/

View File

@ -231,11 +231,3 @@ John Dong
Tatsuyuki Ishi Tatsuyuki Ishi
Daniel Weber Daniel Weber
Kay Bouché Kay Bouché
Yang Hongbo
Lei Wang
Petr Novák
Leonardo Taccari
Martin Weinelt
Surya Oktafendri
TingPing
Alexandre Macabies

557
ChangeLog
View File

@ -1,564 +1,9 @@
version 2018.06.14 version <unreleased>
Core
* [downloader/http] Fix retry on error when streaming to stdout (#16699)
Extractors Extractors
+ [discoverynetworks] Add support for disco-api videos (#16724)
+ [dailymotion] Add support for password protected videos (#9789)
+ [abc:iview] Add support for livestreams (#12354)
* [abc:iview] Fix extraction (#16704)
+ [crackle] Add support for sonycrackle.com (#16698)
+ [tvnet] Add support for tvnet.gov.vn (#15462)
* [nrk] Update API hosts and try all previously known ones (#16690)
* [wimp] Fix Youtube embeds extraction
version 2018.06.11
Extractors
* [npo] Extend URL regular expression and add support for npostart.nl (#16682)
+ [inc] Add support for another embed schema (#16666)
* [tv4] Fix format extraction (#16650)
+ [nexx] Add support for free cdn (#16538)
+ [pbs] Add another cove id pattern (#15373)
+ [rbmaradio] Add support for 192k format (#16631)
version 2018.06.04
Extractors
+ [camtube] Add support for camtube.co
+ [twitter:card] Extract guest token (#16609)
+ [chaturbate] Use geo verification headers
+ [bbc] Add support for bbcthree (#16612)
* [youtube] Move metadata extraction after video availability check
+ [youtube] Extract track and artist
+ [safari] Add support for new URL schema (#16614)
* [adn] Fix extraction
version 2018.06.02
Core
* [utils] Improve determine_ext
Extractors
+ [facebook] Add support for tahoe player videos (#15441, #16554)
* [cbc] Improve extraction (#16583, #16593)
* [openload] Improve ext extraction (#16595)
+ [twitter:card] Add support for another endpoint (#16586)
+ [openload] Add support for oload.win and oload.download (#16592)
* [audimedia] Fix extraction (#15309)
+ [francetv] Add support for sport.francetvinfo.fr (#15645)
* [mlb] Improve extraction (#16587)
- [nhl] Remove old extractors
* [rbmaradio] Check formats availability (#16585)
version 2018.05.30
Core
* [downloader/rtmp] Generalize download messages and report time elapsed
on finish
* [downloader/rtmp] Gracefully handle live streams interrupted by user
Extractors
* [teamcoco] Fix extraction for full episodes (#16573)
* [spiegel] Fix info extraction (#16538)
+ [apa] Add support for apa.at (#15041, #15672)
+ [bellmedia] Add support for bnnbloomberg.ca (#16560)
+ [9c9media] Extract MPD formats and subtitles
* [cammodels] Use geo verification headers
+ [ufctv] Add support for authentication (#16542)
+ [cammodels] Add support for cammodels.com (#14499)
* [utils] Fix style id extraction for namespaced id attribute in dfxp2srt
(#16551)
* [soundcloud] Detect format extension (#16549)
* [cbc] Fix playlist title extraction (#16502)
+ [tumblr] Detect and report sensitive media (#13829)
+ [tumblr] Add support for authentication (#15133)
version 2018.05.26
Core
* [utils] Improve parse_age_limit
Extractors
* [audiomack] Stringify video id (#15310)
* [izlesene] Fix extraction (#16233, #16271, #16407)
+ [indavideo] Add support for generic embeds (#11989)
* [indavideo] Fix extraction (#11221)
* [indavideo] Sign download URLs (#16174)
+ [peertube] Add support for PeerTube based sites (#16301, #16329)
* [imgur] Fix extraction (#16537)
+ [hidive] Add support for authentication (#16534)
+ [nbc] Add support for stream.nbcsports.com (#13911)
+ [viewlift] Add support for hoichoi.tv (#16536)
* [go90] Extract age limit and detect DRM protection(#10127)
* [viewlift] fix extraction for snagfilms.com (#15766)
* [globo] Improve extraction (#4189)
* Add support for authentication
* Simplify URL signing
* Extract DASH and MSS formats
* [leeco] Fix extraction (#16464)
* [teamcoco] Add fallback for format extraction (#16484)
* [teamcoco] Improve URL regular expression (#16484)
* [imdb] Improve extraction (#4085, #14557)
version 2018.05.18
Extractors
* [vimeo:likes] Relax URL regular expression and fix single page likes
extraction (#16475)
* [pluralsight] Fix clip id extraction (#16460)
+ [mychannels] Add support for mychannels.com (#15334)
- [moniker] Remove extractor (#15336)
* [pbs] Fix embed data extraction (#16474)
+ [mtv] Add support for paramountnetwork.com and bellator.com (#15418)
* [youtube] Fix hd720 format position
* [dailymotion] Remove fragment part from m3u8 URLs (#8915)
* [3sat] Improve extraction (#15350)
* Extract all formats
* Extract more format metadata
* Improve format sorting
* Use hls native downloader
* Detect and bypass geo-restriction
+ [dtube] Add support for d.tube (#15201)
* [options] Fix typo (#16450)
* [youtube] Improve format filesize extraction (#16453)
* [youtube] Make uploader extraction non fatal (#16444)
* [youtube] Fix extraction for embed restricted live streams (#16433)
* [nbc] Improve info extraction (#16440)
* [twitch:clips] Fix extraction (#16429)
* [redditr] Relax URL regular expression (#16426, #16427)
* [mixcloud] Bypass throttling for HTTP formats (#12579, #16424)
+ [nick] Add support for nickjr.de (#13230)
* [teamcoco] Fix extraction (#16374)
version 2018.05.09
Core
* [YoutubeDL] Ensure ext exists for automatic captions
* Introduce --geo-bypass-ip-block
Extractors
+ [udemy] Extract asset captions
+ [udemy] Extract stream URLs (#16372)
+ [businessinsider] Add support for businessinsider.com (#16387, #16388, #16389)
+ [cloudflarestream] Add support for cloudflarestream.com (#16375)
* [watchbox] Fix extraction (#16356)
* [discovery] Extract Affiliate/Anonymous Auth Token from cookies (#14954)
+ [itv:btcc] Add support for itv.com/btcc (#16139)
* [tunein] Use live title for live streams (#16347)
* [itv] Improve extraction (#16253)
version 2018.05.01
Core
* [downloader/fragment] Restart download if .ytdl file is corrupt (#16312)
+ [extractor/common] Extract interaction statistic
+ [utils] Add merge_dicts
+ [extractor/common] Add _download_json_handle
Extractors
* [kaltura] Improve iframe embeds detection (#16337)
+ [udemy] Extract outputs renditions (#16289, #16291, #16320, #16321, #16334,
#16335)
+ [zattoo] Add support for zattoo.com and mobiltv.quickline.com (#14668, #14676)
* [yandexmusic] Convert release_year to int
* [udemy] Override _download_webpage_handle instead of _download_webpage
* [xiami] Override _download_webpage_handle instead of _download_webpage
* [yandexmusic] Override _download_webpage_handle instead of _download_webpage
* [youtube] Correctly disable polymer on all requests (#16323, #16326)
* [generic] Prefer enclosures over links in RSS feeds (#16189)
+ [redditr] Add support for old.reddit.com URLs (#16274)
* [nrktv] Update API host (#16324)
+ [imdb] Extract all formats (#16249)
+ [vimeo] Extract JSON-LD (#16295)
* [funk:channel] Improve extraction (#16285)
version 2018.04.25
Core
* [utils] Fix match_str for boolean meta fields
+ [Makefile] Add support for pandoc 2 and disable smart extension (#16251)
* [YoutubeDL] Fix typo in media extension compatibility checker (#16215)
Extractors
+ [openload] Recognize IPv6 stream URLs (#16136, #16137, #16205, #16246,
#16250)
+ [twitch] Extract is_live according to status (#16259)
* [pornflip] Relax URL regular expression (#16258)
- [etonline] Remove extractor (#16256)
* [breakcom] Fix extraction (#16254)
+ [youtube] Add ability to authenticate with cookies
* [youtube:feed] Implement lazy playlist extraction (#10184)
+ [svt] Add support for TV channel live streams (#15279, #15809)
* [ccma] Fix video extraction (#15931)
* [rentv] Fix extraction (#15227)
+ [nick] Add support for nickjr.nl (#16230)
* [extremetube] Fix metadata extraction
+ [keezmovies] Add support for generic embeds (#16134, #16154)
* [nexx] Extract new azure URLs (#16223)
* [cbssports] Fix extraction (#16217)
* [kaltura] Improve embeds detection (#16201)
* [instagram:user] Fix extraction (#16119)
* [cbs] Skip DRM asset types (#16104)
version 2018.04.16
Extractors
* [smotri:broadcast] Fix extraction (#16180)
+ [picarto] Add support for picarto.tv (#6205, #12514, #15276, #15551)
* [vine:user] Fix extraction (#15514, #16190)
* [pornhub] Relax URL regular expression (#16165)
* [cbc:watch] Re-acquire device token when expired (#16160)
+ [fxnetworks] Add support for https theplatform URLs (#16125, #16157)
+ [instagram:user] Add request signing (#16119)
+ [twitch] Add support for mobile URLs (#16146)
version 2018.04.09
Core
* [YoutubeDL] Do not save/restore console title while simulate (#16103)
* [extractor/common] Relax JSON-LD context check (#16006)
Extractors
+ [generic] Add support for tube8 embeds
+ [generic] Add support for share-videos.se embeds (#16089, #16115)
* [odnoklassniki] Extend URL regular expression (#16081)
* [steam] Bypass mature content check (#16113)
+ [acast] Extract more metadata
* [acast] Fix extraction (#16118)
* [instagram:user] Fix extraction (#16119)
* [drtuber] Fix title extraction (#16107, #16108)
* [liveleak] Extend URL regular expression (#16117)
+ [openload] Add support for oload.xyz
* [openload] Relax stream URL regular expression
* [openload] Fix extraction (#16099)
+ [svtplay:series] Add support for season URLs
+ [svtplay:series] Add support for series (#11130, #16059)
version 2018.04.03
Extractors
+ [tvnow] Add support for shows (#15837)
* [dramafever] Fix authentication (#16067)
* [afreecatv] Use partial view only when necessary (#14450)
+ [afreecatv] Add support for authentication (#14450)
+ [nationalgeographic] Add support for new URL schema (#16001, #16054)
* [xvideos] Fix thumbnail extraction (#15978, #15979)
* [medialaan] Fix vod id (#16038)
+ [openload] Add support for oload.site (#16039)
* [naver] Fix extraction (#16029)
* [dramafever] Partially switch to API v5 (#16026)
* [abc:iview] Unescape title and series meta fields (#15994)
* [videa] Extend URL regular expression (#16003)
version 2018.03.26.1
Core
+ [downloader/external] Add elapsed time to progress hook (#10876)
* [downloader/external,fragment] Fix download finalization when writing file
to stdout (#10809, #10876, #15799)
Extractors
* [vrv] Fix extraction on python2 (#15928)
* [afreecatv] Update referrer (#15947)
+ [24video] Add support for 24video.sexy (#15973)
* [crackle] Bypass geo restriction
* [crackle] Fix extraction (#15969)
+ [lenta] Add support for lenta.ru (#15953)
+ [instagram:user] Add pagination (#15934)
* [youku] Update ccode (#15939)
* [libsyn] Adapt to new page structure
version 2018.03.20
Core
* [extractor/common] Improve thumbnail extraction for HTML5 entries
* Generalize XML manifest processing code and improve XSPF parsing
+ [extractor/common] Add _download_xml_handle
+ [extractor/common] Add support for relative URIs in _parse_xspf (#15794)
Extractors
+ [7plus] Extract series metadata (#15862, #15906)
* [9now] Bypass geo restriction (#15920)
* [cbs] Skip unavailable assets (#13490, #13506, #15776)
+ [canalc2] Add support for HTML5 videos (#15916, #15919)
+ [ceskatelevize] Add support for iframe embeds (#15918)
+ [prosiebensat1] Add support for galileo.tv (#15894)
+ [generic] Add support for xfileshare embeds (#15879)
* [bilibili] Switch to v2 playurl API
* [bilibili] Fix and improve extraction (#15048, #15430, #15622, #15863)
* [heise] Improve extraction (#15496, #15784, #15026)
* [instagram] Fix user videos extraction (#15858)
version 2018.03.14
Extractors
* [soundcloud] Update client id (#15866)
+ [tennistv] Add support for tennistv.com
+ [line] Add support for tv.line.me (#9427)
* [xnxx] Fix extraction (#15817)
* [njpwworld] Fix authentication (#15815)
version 2018.03.10
Core
* [downloader/hls] Skip uplynk ad fragments (#15748)
Extractors
* [pornhub] Don't override session cookies (#15697)
+ [raywenderlich] Add support for videos.raywenderlich.com (#15251)
* [funk] Fix extraction and rework extractors (#15792)
* [nexx] Restore reverse engineered approach
+ [heise] Add support for kaltura embeds (#14961, #15728)
+ [tvnow] Extract series metadata (#15774)
* [ruutu] Continue formats extraction on NOT-USED URLs (#15775)
* [vrtnu] Use redirect URL for building video JSON URL (#15767, #15769)
* [vimeo] Modernize login code and improve error messaging
* [archiveorg] Fix extraction (#15770, #15772)
+ [hidive] Add support for hidive.com (#15494)
* [afreecatv] Detect deleted videos
* [afreecatv] Fix extraction (#15755)
* [vice] Fix extraction and rework extractors (#11101, #13019, #13622, #13778)
+ [vidzi] Add support for vidzi.si (#15751)
* [npo] Fix typo
version 2018.03.03
Core
+ [utils] Add parse_resolution
Revert respect --prefer-insecure while updating
Extractors
+ [yapfiles] Add support for yapfiles.ru (#15726, #11085)
* [spankbang] Fix formats extraction (#15727)
* [adn] Fix extraction (#15716)
+ [toggle] Extract DASH and ISM formats (#15721)
+ [nickelodeon] Add support for nickelodeon.com.tr (#15706)
* [npo] Validate and filter format URLs (#15709)
version 2018.02.26
Extractors
* [udemy] Use custom User-Agent (#15571)
version 2018.02.25
Core
* [postprocessor/embedthumbnail] Skip embedding when there aren't any
thumbnails (#12573)
* [extractor/common] Improve jwplayer subtitles extraction (#15695)
Extractors
+ [vidlii] Add support for vidlii.com (#14472, #14512, #14779)
+ [streamango] Capture and output error messages
* [streamango] Fix extraction (#14160, #14256)
+ [telequebec] Add support for emissions (#14649, #14655)
+ [telequebec:live] Add support for live streams (#15688)
+ [mailru:music] Add support for mail.ru/music (#15618)
* [aenetworks] Switch to akamai HLS formats (#15612)
* [ytsearch] Fix flat title extraction (#11260, #15681)
version 2018.02.22
Core
+ [utils] Fixup some common URL typos in sanitize_url (#15649)
* Respect --prefer-insecure while updating (#15497)
Extractors
* [vidio] Fix HLS URL extraction (#15675)
+ [nexx] Add support for arc.nexx.cloud URLs
* [nexx] Switch to arc API (#15652)
* [redtube] Fix duration extraction (#15659)
+ [sonyliv] Respect referrer (#15648)
+ [brightcove:new] Use referrer for formats' HTTP headers
+ [cbc] Add support for olympics.cbc.ca (#15535)
+ [fusion] Add support for fusion.tv (#15628)
* [npo] Improve quality metadata extraction
* [npo] Relax URL regular expression (#14987, #14994)
+ [npo] Capture and output error message
+ [pornhub] Add support for channels (#15613)
* [youtube] Handle shared URLs with generic extractor (#14303)
version 2018.02.11
Core
+ [YoutubeDL] Add support for filesize_approx in format selector (#15550)
Extractors
+ [francetv] Add support for live streams (#13689)
+ [francetv] Add support for zouzous.fr and ludo.fr (#10454, #13087, #13103,
#15012)
* [francetv] Separate main extractor and rework others to delegate to it
* [francetv] Improve manifest URL signing (#15536)
+ [francetv] Sign m3u8 manifest URLs (#15565)
+ [veoh] Add support for embed URLs (#15561)
* [afreecatv] Fix extraction (#15556)
* [periscope] Use accessVideoPublic endpoint (#15554)
* [discovery] Fix auth request (#15542)
+ [6play] Extract subtitles (#15541)
* [newgrounds] Fix metadata extraction (#15531)
+ [nbc] Add support for stream.nbcolympics.com (#10295)
* [dvtv] Fix live streams extraction (#15442)
version 2018.02.08
Extractors
+ [myvi] Extend URL regular expression
+ [myvi:embed] Add support for myvi.tv embeds (#15521)
+ [prosiebensat1] Extend URL regular expression (#15520)
* [pokemon] Relax URL regular expression and extend title extraction (#15518)
+ [gameinformer] Use geo verification headers
* [la7] Fix extraction (#15501, #15502)
* [gameinformer] Fix brightcove id extraction (#15416)
+ [afreecatv] Pass referrer to video info request (#15507)
+ [telebruxelles] Add support for live streams
* [telebruxelles] Relax URL regular expression
* [telebruxelles] Fix extraction (#15504)
* [extractor/common] Respect secure schemes in _extract_wowza_formats
version 2018.02.04
Core
* [downloader/http] Randomize HTTP chunk size
+ [downloader/http] Add ability to pass downloader options via info dict
* [downloader/http] Fix 302 infinite loops by not reusing requests
+ Document http_chunk_size
Extractors
+ [brightcove] Pass embed page URL as referrer (#15486)
+ [youtube] Enforce using chunked HTTP downloading for DASH formats
version 2018.02.03
Core
+ Introduce --http-chunk-size for chunk-based HTTP downloading
+ Add support for IronPython
* [downloader/ism] Fix Python 3.2 support
Extractors
* [redbulltv] Fix extraction (#15481)
* [redtube] Fix metadata extraction (#15472)
* [pladform] Respect platform id and extract HLS formats (#15468)
- [rtlnl] Remove progressive formats (#15459)
* [6play] Do no modify asset URLs with a token (#15248)
* [nationalgeographic] Relax URL regular expression
* [dplay] Relax URL regular expression (#15458)
* [cbsinteractive] Fix data extraction (#15451)
+ [amcnetworks] Add support for sundancetv.com (#9260)
version 2018.01.27
Core
* [extractor/common] Improve _json_ld for articles
* Switch codebase to use compat_b64decode
+ [compat] Add compat_b64decode
Extractors
+ [seznamzpravy] Add support for seznam.cz and seznamzpravy.cz (#14102, #14616)
* [dplay] Bypass geo restriction
+ [dplay] Add support for disco-api videos (#15396)
* [youtube] Extract precise error messages (#15284)
* [teachertube] Capture and output error message
* [teachertube] Fix and relax thumbnail extraction (#15403)
+ [prosiebensat1] Add another clip id regular expression (#15378)
* [tbs] Update tokenizer url (#15395)
* [mixcloud] Use compat_b64decode (#15394)
- [thesixtyone] Remove extractor (#15341)
version 2018.01.21
Core
* [extractor/common] Improve jwplayer DASH formats extraction (#9242, #15187)
* [utils] Improve scientific notation handling in js_to_json (#14789)
Extractors
+ [southparkdk] Add support for southparkstudios.nu
+ [southpark] Add support for collections (#14803)
* [franceinter] Fix upload date extraction (#14996)
+ [rtvs] Add support for rtvs.sk (#9242, #15187)
* [restudy] Fix extraction and extend URL regular expression (#15347)
* [youtube:live] Improve live detection (#15365)
+ [springboardplatform] Add support for springboardplatform.com
* [prosiebensat1] Add another clip id regular expression (#15290)
- [ringtv] Remove extractor (#15345)
version 2018.01.18
Extractors
* [soundcloud] Update client id (#15306)
- [kamcord] Remove extractor (#15322)
+ [spiegel] Add support for nexx videos (#15285)
* [twitch] Fix authentication and error capture (#14090, #15264)
* [vk] Detect more errors due to copyright complaints (#15259)
version 2018.01.14
Extractors
* [youtube] Fix live streams extraction (#15202)
* [wdr] Bypass geo restriction
* [wdr] Rework extractors (#14598)
+ [wdr] Add support for wdrmaus.de/elefantenseite (#14598)
+ [gamestar] Add support for gamepro.de (#3384)
* [viafree] Skip rtmp formats (#15232)
+ [pandoratv] Add support for mobile URLs (#12441)
+ [pandoratv] Add support for new URL format (#15131)
+ [ximalaya] Add support for ximalaya.com (#14687)
+ [digg] Add support for digg.com (#15214)
* [limelight] Tolerate empty pc formats (#15150, #15151, #15207)
* [ndr:embed:base] Make separate formats extraction non fatal (#15203)
+ [weibo] Add extractor (#15079)
+ [ok] Add support for live streams
* [canalplus] Fix extraction (#15072)
* [bilibili] Fix extraction (#15188)
version 2018.01.07
Core
* [utils] Fix youtube-dl under PyPy3 on Windows
* [YoutubeDL] Output python implementation in debug header
Extractors
+ [jwplatform] Add support for multiple embeds (#15192)
* [mitele] Fix extraction (#15186)
+ [motherless] Add support for groups (#15124)
* [lynda] Relax URL regular expression (#15185)
* [soundcloud] Fallback to avatar picture for thumbnail (#12878)
* [youku] Fix list extraction (#15135) * [youku] Fix list extraction (#15135)
* [openload] Fix extraction (#15166) * [openload] Fix extraction (#15166)
* [lynda] Skip invalid subtitles (#15159)
* [twitch] Pass video id to url_result when extracting playlist (#15139)
* [rtve.es:alacarta] Fix extraction of some new URLs * [rtve.es:alacarta] Fix extraction of some new URLs
* [acast] Fix extraction (#15147)
version 2017.12.31 version 2017.12.31

View File

@ -14,9 +14,6 @@ PYTHON ?= /usr/bin/env python
# set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local # set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi) SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi)
# set markdown input format to "markdown-smart" for pandoc version 2 and to "markdown" for pandoc prior to version 2
MARKDOWN = $(shell if [ `pandoc -v | head -n1 | cut -d" " -f2 | head -c1` = "2" ]; then echo markdown-smart; else echo markdown; fi)
install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
install -d $(DESTDIR)$(BINDIR) install -d $(DESTDIR)$(BINDIR)
install -m 755 youtube-dl $(DESTDIR)$(BINDIR) install -m 755 youtube-dl $(DESTDIR)$(BINDIR)
@ -85,11 +82,11 @@ supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md $(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
README.txt: README.md README.txt: README.md
pandoc -f $(MARKDOWN) -t plain README.md -o README.txt pandoc -f markdown -t plain README.md -o README.txt
youtube-dl.1: README.md youtube-dl.1: README.md
$(PYTHON) devscripts/prepare_manpage.py youtube-dl.1.temp.md $(PYTHON) devscripts/prepare_manpage.py youtube-dl.1.temp.md
pandoc -s -f $(MARKDOWN) -t man youtube-dl.1.temp.md -o youtube-dl.1 pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md rm -f youtube-dl.1.temp.md
youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in

View File

@ -46,7 +46,7 @@ Or with [MacPorts](https://www.macports.org/):
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html). Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION # DESCRIPTION
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like. **youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...] youtube-dl [OPTIONS] URL [URL...]
@ -93,8 +93,8 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
## Network Options: ## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. --proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
To enable SOCKS proxy, specify a proper To enable experimental SOCKS proxy, specify
scheme. For example a proper scheme. For example
socks5://127.0.0.1:1080/. Pass in an empty socks5://127.0.0.1:1080/. Pass in an empty
string (--proxy "") for direct connection string (--proxy "") for direct connection
--socket-timeout SECONDS Time to wait before giving up, in seconds --socket-timeout SECONDS Time to wait before giving up, in seconds
@ -106,18 +106,16 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--geo-verification-proxy URL Use this proxy to verify the IP address for --geo-verification-proxy URL Use this proxy to verify the IP address for
some geo-restricted sites. The default some geo-restricted sites. The default
proxy specified by --proxy (or none, if the proxy specified by --proxy (or none, if the
option is not present) is used for the options is not present) is used for the
actual downloading. actual downloading.
--geo-bypass Bypass geographic restriction via faking --geo-bypass Bypass geographic restriction via faking
X-Forwarded-For HTTP header X-Forwarded-For HTTP header (experimental)
--no-geo-bypass Do not bypass geographic restriction via --no-geo-bypass Do not bypass geographic restriction via
faking X-Forwarded-For HTTP header faking X-Forwarded-For HTTP header
(experimental)
--geo-bypass-country CODE Force bypass geographic restriction with --geo-bypass-country CODE Force bypass geographic restriction with
explicitly provided two-letter ISO 3166-2 explicitly provided two-letter ISO 3166-2
country code country code (experimental)
--geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction with
explicitly provided IP block in CIDR
notation
## Video Selection: ## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1) --playlist-start NUMBER Playlist video to start at (default is 1)
@ -200,15 +198,10 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
size. By default, the buffer size is size. By default, the buffer size is
automatically resized from an initial value automatically resized from an initial value
of SIZE. of SIZE.
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP
downloading (e.g. 10485760 or 10M) (default
is disabled). May be useful for bypassing
bandwidth throttling imposed by a webserver
(experimental)
--playlist-reverse Download playlist videos in reverse order --playlist-reverse Download playlist videos in reverse order
--playlist-random Download playlist videos in random order --playlist-random Download playlist videos in random order
--xattr-set-filesize Set file xattribute ytdl.filesize with --xattr-set-filesize Set file xattribute ytdl.filesize with
expected file size expected file size (experimental)
--hls-prefer-native Use the native HLS downloader instead of --hls-prefer-native Use the native HLS downloader instead of
ffmpeg ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS --hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
@ -225,9 +218,7 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
## Filesystem Options: ## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for -a, --batch-file FILE File containing URLs to download ('-' for
stdin), one URL per line. Lines starting stdin)
with '#', ';' or ']' are considered as
comments and ignored.
--id Use only video ID in file name --id Use only video ID in file name
-o, --output TEMPLATE Output filename template, see the "OUTPUT -o, --output TEMPLATE Output filename template, see the "OUTPUT
TEMPLATE" for all the info TEMPLATE" for all the info
@ -872,7 +863,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox). In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format. Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare). Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).

View File

@ -1,22 +1,27 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
from __future__ import unicode_literals from __future__ import unicode_literals
import hashlib
import urllib.request
import json import json
versions_info = json.load(open('update/versions.json')) versions_info = json.load(open('update/versions.json'))
version = versions_info['latest'] version = versions_info['latest']
version_dict = versions_info['versions'][version] URL = versions_info['versions'][version]['bin'][0]
data = urllib.request.urlopen(URL).read()
# Read template page # Read template page
with open('download.html.in', 'r', encoding='utf-8') as tmplf: with open('download.html.in', 'r', encoding='utf-8') as tmplf:
template = tmplf.read() template = tmplf.read()
sha256sum = hashlib.sha256(data).hexdigest()
template = template.replace('@PROGRAM_VERSION@', version) template = template.replace('@PROGRAM_VERSION@', version)
template = template.replace('@PROGRAM_URL@', version_dict['bin'][0]) template = template.replace('@PROGRAM_URL@', URL)
template = template.replace('@PROGRAM_SHA256SUM@', version_dict['bin'][1]) template = template.replace('@PROGRAM_SHA256SUM@', sha256sum)
template = template.replace('@EXE_URL@', version_dict['exe'][0]) template = template.replace('@EXE_URL@', versions_info['versions'][version]['exe'][0])
template = template.replace('@EXE_SHA256SUM@', version_dict['exe'][1]) template = template.replace('@EXE_SHA256SUM@', versions_info['versions'][version]['exe'][1])
template = template.replace('@TAR_URL@', version_dict['tar'][0]) template = template.replace('@TAR_URL@', versions_info['versions'][version]['tar'][0])
template = template.replace('@TAR_SHA256SUM@', version_dict['tar'][1]) template = template.replace('@TAR_SHA256SUM@', versions_info['versions'][version]['tar'][1])
with open('download.html', 'w', encoding='utf-8') as dlf: with open('download.html', 'w', encoding='utf-8') as dlf:
dlf.write(template) dlf.write(template)

View File

@ -13,7 +13,7 @@ year = str(datetime.datetime.now().year)
for fn in glob.glob('*.html*'): for fn in glob.glob('*.html*'):
with io.open(fn, encoding='utf-8') as f: with io.open(fn, encoding='utf-8') as f:
content = f.read() content = f.read()
newc = re.sub(r'(?P<copyright>Copyright © 2011-)(?P<year>[0-9]{4})', 'Copyright © 2011-' + year, content) newc = re.sub(r'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', 'Copyright © 2006-' + year, content)
if content != newc: if content != newc:
tmpFn = fn + '.part' tmpFn = fn + '.part'
with io.open(tmpFn, 'wt', encoding='utf-8') as outf: with io.open(tmpFn, 'wt', encoding='utf-8') as outf:

View File

@ -15,6 +15,7 @@
- **8tracks** - **8tracks**
- **91porn** - **91porn**
- **9c9media** - **9c9media**
- **9c9media:stack**
- **9gag** - **9gag**
- **9now.com.au** - **9now.com.au**
- **abc.net.au** - **abc.net.au**
@ -47,7 +48,6 @@
- **anitube.se** - **anitube.se**
- **Anvato** - **Anvato**
- **AnySex** - **AnySex**
- **APA**
- **Aparat** - **Aparat**
- **AppleConnect** - **AppleConnect**
- **AppleDaily**: 臺灣蘋果日報 - **AppleDaily**: 臺灣蘋果日報
@ -100,7 +100,6 @@
- **Beatport** - **Beatport**
- **Beeg** - **Beeg**
- **BehindKink** - **BehindKink**
- **Bellator**
- **BellMedia** - **BellMedia**
- **Bet** - **Bet**
- **Bigflix** - **Bigflix**
@ -123,23 +122,19 @@
- **BRMediathek**: Bayerischer Rundfunk Mediathek - **BRMediathek**: Bayerischer Rundfunk Mediathek
- **bt:article**: Bergens Tidende Articles - **bt:article**: Bergens Tidende Articles
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **BusinessInsider**
- **BuzzFeed** - **BuzzFeed**
- **BYUtv** - **BYUtv**
- **Camdemy** - **Camdemy**
- **CamdemyFolder** - **CamdemyFolder**
- **CamModels**
- **CamTube**
- **CamWithHer** - **CamWithHer**
- **canalc2.tv** - **canalc2.tv**
- **Canalplus**: mycanal.fr and piwiplus.fr - **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
- **Canvas** - **Canvas**
- **CanvasEen**: canvas.be and een.be - **CanvasEen**: canvas.be and een.be
- **CarambaTV** - **CarambaTV**
- **CarambaTVPage** - **CarambaTVPage**
- **CartoonNetwork** - **CartoonNetwork**
- **cbc.ca** - **cbc.ca**
- **cbc.ca:olympics**
- **cbc.ca:player** - **cbc.ca:player**
- **cbc.ca:watch** - **cbc.ca:watch**
- **cbc.ca:watch:video** - **cbc.ca:watch:video**
@ -167,7 +162,6 @@
- **ClipRs** - **ClipRs**
- **Clipsyndicate** - **Clipsyndicate**
- **CloserToTruth** - **CloserToTruth**
- **CloudflareStream**
- **cloudtime**: CloudTime - **cloudtime**: CloudTime
- **Cloudy** - **Cloudy**
- **Clubic** - **Clubic**
@ -195,7 +189,7 @@
- **CSpan**: C-SPAN - **CSpan**: C-SPAN
- **CtsNews**: 華視新聞 - **CtsNews**: 華視新聞
- **CTVNews** - **CTVNews**
- **Culturebox** - **culturebox.francetvinfo.fr**
- **CultureUnplugged** - **CultureUnplugged**
- **curiositystream** - **curiositystream**
- **curiositystream:collection** - **curiositystream:collection**
@ -216,7 +210,6 @@
- **defense.gouv.fr** - **defense.gouv.fr**
- **democracynow** - **democracynow**
- **DHM**: Filmarchiv - Deutsches Historisches Museum - **DHM**: Filmarchiv - Deutsches Historisches Museum
- **Digg**
- **DigitallySpeaking** - **DigitallySpeaking**
- **Digiteka** - **Digiteka**
- **Discovery** - **Discovery**
@ -237,7 +230,6 @@
- **DrTuber** - **DrTuber**
- **drtv** - **drtv**
- **drtv:live** - **drtv:live**
- **DTube**
- **Dumpert** - **Dumpert**
- **dvtv**: http://video.aktualne.cz/ - **dvtv**: http://video.aktualne.cz/
- **dw** - **dw**
@ -263,6 +255,7 @@
- **ESPN** - **ESPN**
- **ESPNArticle** - **ESPNArticle**
- **EsriVideo** - **EsriVideo**
- **ETOnline**
- **Europa** - **Europa**
- **EveryonesMixtape** - **EveryonesMixtape**
- **ExpoTV** - **ExpoTV**
@ -297,14 +290,11 @@
- **FranceTV** - **FranceTV**
- **FranceTVEmbed** - **FranceTVEmbed**
- **francetvinfo.fr** - **francetvinfo.fr**
- **FranceTVJeunesse**
- **FranceTVSite**
- **Freesound** - **Freesound**
- **freespeech.org** - **freespeech.org**
- **FreshLive** - **FreshLive**
- **Funimation** - **Funimation**
- **FunkChannel** - **Funk**
- **FunkMix**
- **FunnyOrDie** - **FunnyOrDie**
- **Fusion** - **Fusion**
- **Fux** - **Fux**
@ -342,7 +332,6 @@
- **HentaiStigma** - **HentaiStigma**
- **hetklokhuis** - **hetklokhuis**
- **hgtv.com:show** - **hgtv.com:show**
- **HiDive**
- **HistoricFilms** - **HistoricFilms**
- **history:topic**: History.com Topic - **history:topic**: History.com Topic
- **hitbox** - **hitbox**
@ -367,6 +356,7 @@
- **ImgurAlbum** - **ImgurAlbum**
- **Ina** - **Ina**
- **Inc** - **Inc**
- **Indavideo**
- **IndavideoEmbed** - **IndavideoEmbed**
- **InfoQ** - **InfoQ**
- **Instagram** - **Instagram**
@ -378,7 +368,6 @@
- **Ir90Tv** - **Ir90Tv**
- **ITTF** - **ITTF**
- **ITV** - **ITV**
- **ITVBTCC**
- **ivi**: ivi.ru - **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations - **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV - **ivideon**: Ivideon TV
@ -393,6 +382,7 @@
- **JWPlatform** - **JWPlatform**
- **Kakao** - **Kakao**
- **Kaltura** - **Kaltura**
- **Kamcord**
- **KanalPlay**: Kanal 5/9/11 Play - **KanalPlay**: Kanal 5/9/11 Play
- **Kankan** - **Kankan**
- **Karaoketv** - **Karaoketv**
@ -424,7 +414,6 @@
- **Lecture2Go** - **Lecture2Go**
- **LEGO** - **LEGO**
- **Lemonde** - **Lemonde**
- **Lenta**
- **LePlaylist** - **LePlaylist**
- **LetvCloud**: 乐视云 - **LetvCloud**: 乐视云
- **Libsyn** - **Libsyn**
@ -433,7 +422,6 @@
- **limelight** - **limelight**
- **limelight:channel** - **limelight:channel**
- **limelight:channel_list** - **limelight:channel_list**
- **LineTV**
- **LiTV** - **LiTV**
- **LiveLeak** - **LiveLeak**
- **LiveLeakEmbed** - **LiveLeakEmbed**
@ -449,8 +437,7 @@
- **m6** - **m6**
- **macgamestore**: MacGameStore trailers - **macgamestore**: MacGameStore trailers
- **mailru**: Видео@Mail.Ru - **mailru**: Видео@Mail.Ru
- **mailru:music**: Музыка@Mail.Ru - **MakersChannel**
- **mailru:music:search**: Музыка@Mail.Ru
- **MakerTV** - **MakerTV**
- **mangomolo:live** - **mangomolo:live**
- **mangomolo:video** - **mangomolo:video**
@ -488,9 +475,9 @@
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex** - **Mofosex**
- **Mojvideo** - **Mojvideo**
- **Moniker**: allmyvideos.net and vidspot.net
- **Morningstar**: morningstar.com - **Morningstar**: morningstar.com
- **Motherless** - **Motherless**
- **MotherlessGroup**
- **Motorsport**: motorsport.com - **Motorsport**: motorsport.com
- **MovieClips** - **MovieClips**
- **MovieFap** - **MovieFap**
@ -509,13 +496,11 @@
- **mva:course**: Microsoft Virtual Academy courses - **mva:course**: Microsoft Virtual Academy courses
- **Mwave** - **Mwave**
- **MwaveMeetGreet** - **MwaveMeetGreet**
- **MyChannels**
- **MySpace** - **MySpace**
- **MySpace:album** - **MySpace:album**
- **MySpass** - **MySpass**
- **Myvi** - **Myvi**
- **MyVidster** - **MyVidster**
- **MyviEmbed**
- **n-tv.de** - **n-tv.de**
- **natgeo** - **natgeo**
- **natgeo:episodeguide** - **natgeo:episodeguide**
@ -524,10 +509,8 @@
- **NBA** - **NBA**
- **NBC** - **NBC**
- **NBCNews** - **NBCNews**
- **nbcolympics** - **NBCOlympics**
- **nbcolympics:stream**
- **NBCSports** - **NBCSports**
- **NBCSportsStream**
- **NBCSportsVPlayer** - **NBCSportsVPlayer**
- **ndr**: NDR.de - Norddeutscher Rundfunk - **ndr**: NDR.de - Norddeutscher Rundfunk
- **ndr:embed** - **ndr:embed**
@ -554,6 +537,9 @@
- **nfl.com** - **nfl.com**
- **NhkVod** - **NhkVod**
- **nhl.com** - **nhl.com**
- **nhl.com:news**: NHL news
- **nhl.com:videocenter**
- **nhl.com:videocenter:category**: NHL videocenter category
- **nick.com** - **nick.com**
- **nick.de** - **nick.de**
- **nickelodeon:br** - **nickelodeon:br**
@ -618,13 +604,11 @@
- **PacktPubCourse** - **PacktPubCourse**
- **PandaTV**: 熊猫TV - **PandaTV**: 熊猫TV
- **pandora.tv**: 판도라TV - **pandora.tv**: 판도라TV
- **ParamountNetwork**
- **parliamentlive.tv**: UK parliament videos - **parliamentlive.tv**: UK parliament videos
- **Patreon** - **Patreon**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC) - **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **pcmag** - **pcmag**
- **PearVideo** - **PearVideo**
- **PeerTube**
- **People** - **People**
- **PerformGroup** - **PerformGroup**
- **periscope**: Periscope - **periscope**: Periscope
@ -632,8 +616,6 @@
- **PhilharmonieDeParis**: Philharmonie de Paris - **PhilharmonieDeParis**: Philharmonie de Paris
- **phoenix.de** - **phoenix.de**
- **Photobucket** - **Photobucket**
- **Picarto**
- **PicartoVod**
- **Piksel** - **Piksel**
- **Pinkbike** - **Pinkbike**
- **Pladform** - **Pladform**
@ -672,8 +654,6 @@
- **qqmusic:playlist**: QQ音乐 - 歌单 - **qqmusic:playlist**: QQ音乐 - 歌单
- **qqmusic:singer**: QQ音乐 - 歌手 - **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜 - **qqmusic:toplist**: QQ音乐 - 排行榜
- **Quickline**
- **QuicklineLive**
- **R7** - **R7**
- **R7Article** - **R7Article**
- **radio.de** - **radio.de**
@ -686,7 +666,6 @@
- **RaiPlay** - **RaiPlay**
- **RaiPlayLive** - **RaiPlayLive**
- **RaiPlayPlaylist** - **RaiPlayPlaylist**
- **RayWenderlich**
- **RBMARadio** - **RBMARadio**
- **RDS**: RDS.ca - **RDS**: RDS.ca
- **RedBullTV** - **RedBullTV**
@ -702,6 +681,7 @@
- **revision** - **revision**
- **revision3:embed** - **revision3:embed**
- **RICE** - **RICE**
- **RingTV**
- **RMCDecouverte** - **RMCDecouverte**
- **RockstarGames** - **RockstarGames**
- **RoosterTeeth** - **RoosterTeeth**
@ -722,7 +702,6 @@
- **rtve.es:live**: RTVE.es live streams - **rtve.es:live**: RTVE.es live streams
- **rtve.es:television** - **rtve.es:television**
- **RTVNH** - **RTVNH**
- **RTVS**
- **Rudo** - **Rudo**
- **RUHD** - **RUHD**
- **RulePorn** - **RulePorn**
@ -752,8 +731,6 @@
- **ServingSys** - **ServingSys**
- **Servus** - **Servus**
- **Sexu** - **Sexu**
- **SeznamZpravy**
- **SeznamZpravyArticle**
- **Shahid** - **Shahid**
- **ShahidShow** - **ShahidShow**
- **Shared**: shared.sx - **Shared**: shared.sx
@ -791,11 +768,11 @@
- **Spiegel** - **Spiegel**
- **Spiegel:Article**: Articles on spiegel.de - **Spiegel:Article**: Articles on spiegel.de
- **Spiegeltv** - **Spiegeltv**
- **sport.francetvinfo.fr** - **Spike**
- **Sport5** - **Sport5**
- **SportBoxEmbed** - **SportBoxEmbed**
- **SportDeutschland** - **SportDeutschland**
- **SpringboardPlatform** - **Sportschau**
- **Sprout** - **Sprout**
- **sr:mediathek**: Saarländischer Rundfunk - **sr:mediathek**: Saarländischer Rundfunk
- **SRGSSR** - **SRGSSR**
@ -812,7 +789,6 @@
- **SunPorno** - **SunPorno**
- **SVT** - **SVT**
- **SVTPlay**: SVT Play and Öppet arkiv - **SVTPlay**: SVT Play and Öppet arkiv
- **SVTSeries**
- **SWRMediathek** - **SWRMediathek**
- **Syfy** - **Syfy**
- **SztvHu** - **SztvHu**
@ -836,11 +812,8 @@
- **Telegraaf** - **Telegraaf**
- **TeleMB** - **TeleMB**
- **TeleQuebec** - **TeleQuebec**
- **TeleQuebecEmission**
- **TeleQuebecLive**
- **TeleTask** - **TeleTask**
- **Telewebion** - **Telewebion**
- **TennisTV**
- **TF1** - **TF1**
- **TFO** - **TFO**
- **TheIntercept** - **TheIntercept**
@ -848,6 +821,7 @@
- **ThePlatform** - **ThePlatform**
- **ThePlatformFeed** - **ThePlatformFeed**
- **TheScene** - **TheScene**
- **TheSixtyOne**
- **TheStar** - **TheStar**
- **TheSun** - **TheSun**
- **TheWeatherChannel** - **TheWeatherChannel**
@ -893,11 +867,9 @@
- **tvigle**: Интернет-телевидение Tvigle.ru - **tvigle**: Интернет-телевидение Tvigle.ru
- **tvland.com** - **tvland.com**
- **TVN24** - **TVN24**
- **TVNet**
- **TVNoe** - **TVNoe**
- **TVNow** - **TVNow**
- **TVNowList** - **TVNowList**
- **TVNowShow**
- **tvp**: Telewizja Polska - **tvp**: Telewizja Polska
- **tvp:embed**: Telewizja Polska - **tvp:embed**: Telewizja Polska
- **tvp:series** - **tvp:series**
@ -951,6 +923,7 @@
- **vice** - **vice**
- **vice:article** - **vice:article**
- **vice:show** - **vice:show**
- **Viceland**
- **Vidbit** - **Vidbit**
- **Viddler** - **Viddler**
- **Videa** - **Videa**
@ -966,7 +939,6 @@
- **VideoPress** - **VideoPress**
- **videoweed**: VideoWeed - **videoweed**: VideoWeed
- **Vidio** - **Vidio**
- **VidLii**
- **vidme** - **vidme**
- **vidme:user** - **vidme:user**
- **vidme:user:likes** - **vidme:user:likes**
@ -1029,14 +1001,10 @@
- **WatchIndianPorn**: Watch Indian Porn - **WatchIndianPorn**: Watch Indian Porn
- **WDR** - **WDR**
- **wdr:mobile** - **wdr:mobile**
- **WDRElefant**
- **WDRPage**
- **Webcaster** - **Webcaster**
- **WebcasterFeed** - **WebcasterFeed**
- **WebOfStories** - **WebOfStories**
- **WebOfStoriesPlaylist** - **WebOfStoriesPlaylist**
- **Weibo**
- **WeiboMobile**
- **WeiqiTV**: WQTV - **WeiqiTV**: WQTV
- **wholecloud**: WholeCloud - **wholecloud**: WholeCloud
- **Wimp** - **Wimp**
@ -1056,8 +1024,6 @@
- **xiami:artist**: 虾米音乐 - 歌手 - **xiami:artist**: 虾米音乐 - 歌手
- **xiami:collection**: 虾米音乐 - 精选集 - **xiami:collection**: 虾米音乐 - 精选集
- **xiami:song**: 虾米音乐 - **xiami:song**: 虾米音乐
- **ximalaya**: 喜马拉雅FM
- **ximalaya:album**: 喜马拉雅FM 专辑
- **XMinus** - **XMinus**
- **XNXX** - **XNXX**
- **Xstream** - **Xstream**
@ -1071,7 +1037,6 @@
- **yandexmusic:album**: Яндекс.Музыка - Альбом - **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист - **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
- **yandexmusic:track**: Яндекс.Музыка - Трек - **yandexmusic:track**: Яндекс.Музыка - Трек
- **YapFiles**
- **YesJapan** - **YesJapan**
- **yinyuetai:video**: 音悦Tai - **yinyuetai:video**: 音悦Tai
- **Ynet** - **Ynet**
@ -1100,8 +1065,6 @@
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication) - **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **Zapiks** - **Zapiks**
- **Zaq1** - **Zaq1**
- **Zattoo**
- **ZattooLive**
- **ZDF** - **ZDF**
- **ZDFChannel** - **ZDFChannel**
- **zingmp3**: mp3.zing.vn - **zingmp3**: mp3.zing.vn

View File

@ -2,5 +2,5 @@
universal = True universal = True
[flake8] [flake8]
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
ignore = E402,E501,E731,E741 ignore = E402,E501,E731

View File

@ -694,55 +694,6 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
self.ie._sort_formats(formats) self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None) expect_value(self, formats, expected_formats, None)
def test_parse_xspf(self):
_TEST_CASES = [
(
'foo_xspf',
'https://example.org/src/foo_xspf.xspf',
[{
'id': 'foo_xspf',
'title': 'Pandemonium',
'description': 'Visit http://bigbrother404.bandcamp.com',
'duration': 202.416,
'formats': [{
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.org/src/cd1/track%201.mp3',
}],
}, {
'id': 'foo_xspf',
'title': 'Final Cartridge (Nichico Twelve Remix)',
'description': 'Visit http://bigbrother404.bandcamp.com',
'duration': 255.857,
'formats': [{
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.org/%E3%83%88%E3%83%A9%E3%83%83%E3%82%AF%E3%80%80%EF%BC%92.mp3',
}],
}, {
'id': 'foo_xspf',
'title': 'Rebuilding Nightingale',
'description': 'Visit http://bigbrother404.bandcamp.com',
'duration': 287.915,
'formats': [{
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.org/src/track3.mp3',
}, {
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.com/track3.mp3',
}]
}]
),
]
for xspf_file, xspf_url, expected_entries in _TEST_CASES:
with io.open('./test/testdata/xspf/%s.xspf' % xspf_file,
mode='r', encoding='utf-8') as f:
entries = self.ie._parse_xspf(
compat_etree_fromstring(f.read().encode('utf-8')),
xspf_file, xspf_url=xspf_url, xspf_base_url=xspf_url)
expect_value(self, entries, expected_entries, None)
for i in range(len(entries)):
expect_dict(self, entries[i], expected_entries[i])
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

View File

@ -92,8 +92,8 @@ class TestDownload(unittest.TestCase):
def generator(test_case, tname): def generator(test_case, tname):
def test_template(self): def test_template(self):
ie = youtube_dl.extractor.get_info_extractor(test_case['name'])() ie = youtube_dl.extractor.get_info_extractor(test_case['name'])
other_ies = [get_info_extractor(ie_key)() for ie_key in test_case.get('add_ie', [])] other_ies = [get_info_extractor(ie_key) for ie_key in test_case.get('add_ie', [])]
is_playlist = any(k.startswith('playlist') for k in test_case) is_playlist = any(k.startswith('playlist') for k in test_case)
test_cases = test_case.get( test_cases = test_case.get(
'playlist', [] if is_playlist else [test_case]) 'playlist', [] if is_playlist else [test_case])

View File

@ -1,125 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import re
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import try_rm
from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_http_server
from youtube_dl.downloader.http import HttpFD
from youtube_dl.utils import encodeFilename
import ssl
import threading
TEST_DIR = os.path.dirname(os.path.abspath(__file__))
def http_server_port(httpd):
if os.name == 'java' and isinstance(httpd.socket, ssl.SSLSocket):
# In Jython SSLSocket is not a subclass of socket.socket
sock = httpd.socket.sock
else:
sock = httpd.socket
return sock.getsockname()[1]
TEST_SIZE = 10 * 1024
class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
def log_message(self, format, *args):
pass
def send_content_range(self, total=None):
range_header = self.headers.get('Range')
start = end = None
if range_header:
mobj = re.search(r'^bytes=(\d+)-(\d+)', range_header)
if mobj:
start = int(mobj.group(1))
end = int(mobj.group(2))
valid_range = start is not None and end is not None
if valid_range:
content_range = 'bytes %d-%d' % (start, end)
if total:
content_range += '/%d' % total
self.send_header('Content-Range', content_range)
return (end - start + 1) if valid_range else total
def serve(self, range=True, content_length=True):
self.send_response(200)
self.send_header('Content-Type', 'video/mp4')
size = TEST_SIZE
if range:
size = self.send_content_range(TEST_SIZE)
if content_length:
self.send_header('Content-Length', size)
self.end_headers()
self.wfile.write(b'#' * size)
def do_GET(self):
if self.path == '/regular':
self.serve()
elif self.path == '/no-content-length':
self.serve(content_length=False)
elif self.path == '/no-range':
self.serve(range=False)
elif self.path == '/no-range-no-content-length':
self.serve(range=False, content_length=False)
else:
assert False
class FakeLogger(object):
def debug(self, msg):
pass
def warning(self, msg):
pass
def error(self, msg):
pass
class TestHttpFD(unittest.TestCase):
def setUp(self):
self.httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler)
self.port = http_server_port(self.httpd)
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True
self.server_thread.start()
def download(self, params, ep):
params['logger'] = FakeLogger()
ydl = YoutubeDL(params)
downloader = HttpFD(ydl, params)
filename = 'testfile.mp4'
try_rm(encodeFilename(filename))
self.assertTrue(downloader.real_download(filename, {
'url': 'http://127.0.0.1:%d/%s' % (self.port, ep),
}))
self.assertEqual(os.path.getsize(encodeFilename(filename)), TEST_SIZE)
try_rm(encodeFilename(filename))
def download_all(self, params):
for ep in ('regular', 'no-content-length', 'no-range', 'no-range-no-content-length'):
self.download(params, ep)
def test_regular(self):
self.download_all({})
def test_chunked(self):
self.download_all({
'http_chunk_size': 1000,
})
if __name__ == '__main__':
unittest.main()

View File

@ -47,7 +47,7 @@ class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
self.end_headers() self.end_headers()
return return
new_url = 'http://127.0.0.1:%d/中文.html' % http_server_port(self.server) new_url = 'http://localhost:%d/中文.html' % http_server_port(self.server)
self.send_response(302) self.send_response(302)
self.send_header(b'Location', new_url.encode('utf-8')) self.send_header(b'Location', new_url.encode('utf-8'))
self.end_headers() self.end_headers()
@ -74,7 +74,7 @@ class FakeLogger(object):
class TestHTTP(unittest.TestCase): class TestHTTP(unittest.TestCase):
def setUp(self): def setUp(self):
self.httpd = compat_http_server.HTTPServer( self.httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler) ('localhost', 0), HTTPTestRequestHandler)
self.port = http_server_port(self.httpd) self.port = http_server_port(self.httpd)
self.server_thread = threading.Thread(target=self.httpd.serve_forever) self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True self.server_thread.daemon = True
@ -86,15 +86,15 @@ class TestHTTP(unittest.TestCase):
return return
ydl = YoutubeDL({'logger': FakeLogger()}) ydl = YoutubeDL({'logger': FakeLogger()})
r = ydl.extract_info('http://127.0.0.1:%d/302' % self.port) r = ydl.extract_info('http://localhost:%d/302' % self.port)
self.assertEqual(r['entries'][0]['url'], 'http://127.0.0.1:%d/vid.mp4' % self.port) self.assertEqual(r['entries'][0]['url'], 'http://localhost:%d/vid.mp4' % self.port)
class TestHTTPS(unittest.TestCase): class TestHTTPS(unittest.TestCase):
def setUp(self): def setUp(self):
certfn = os.path.join(TEST_DIR, 'testcert.pem') certfn = os.path.join(TEST_DIR, 'testcert.pem')
self.httpd = compat_http_server.HTTPServer( self.httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler) ('localhost', 0), HTTPTestRequestHandler)
self.httpd.socket = ssl.wrap_socket( self.httpd.socket = ssl.wrap_socket(
self.httpd.socket, certfile=certfn, server_side=True) self.httpd.socket, certfile=certfn, server_side=True)
self.port = http_server_port(self.httpd) self.port = http_server_port(self.httpd)
@ -107,11 +107,11 @@ class TestHTTPS(unittest.TestCase):
ydl = YoutubeDL({'logger': FakeLogger()}) ydl = YoutubeDL({'logger': FakeLogger()})
self.assertRaises( self.assertRaises(
Exception, Exception,
ydl.extract_info, 'https://127.0.0.1:%d/video.html' % self.port) ydl.extract_info, 'https://localhost:%d/video.html' % self.port)
ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True}) ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True})
r = ydl.extract_info('https://127.0.0.1:%d/video.html' % self.port) r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
self.assertEqual(r['entries'][0]['url'], 'https://127.0.0.1:%d/vid.mp4' % self.port) self.assertEqual(r['entries'][0]['url'], 'https://localhost:%d/vid.mp4' % self.port)
def _build_proxy_handler(name): def _build_proxy_handler(name):
@ -132,23 +132,23 @@ def _build_proxy_handler(name):
class TestProxy(unittest.TestCase): class TestProxy(unittest.TestCase):
def setUp(self): def setUp(self):
self.proxy = compat_http_server.HTTPServer( self.proxy = compat_http_server.HTTPServer(
('127.0.0.1', 0), _build_proxy_handler('normal')) ('localhost', 0), _build_proxy_handler('normal'))
self.port = http_server_port(self.proxy) self.port = http_server_port(self.proxy)
self.proxy_thread = threading.Thread(target=self.proxy.serve_forever) self.proxy_thread = threading.Thread(target=self.proxy.serve_forever)
self.proxy_thread.daemon = True self.proxy_thread.daemon = True
self.proxy_thread.start() self.proxy_thread.start()
self.geo_proxy = compat_http_server.HTTPServer( self.geo_proxy = compat_http_server.HTTPServer(
('127.0.0.1', 0), _build_proxy_handler('geo')) ('localhost', 0), _build_proxy_handler('geo'))
self.geo_port = http_server_port(self.geo_proxy) self.geo_port = http_server_port(self.geo_proxy)
self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever) self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever)
self.geo_proxy_thread.daemon = True self.geo_proxy_thread.daemon = True
self.geo_proxy_thread.start() self.geo_proxy_thread.start()
def test_proxy(self): def test_proxy(self):
geo_proxy = '127.0.0.1:{0}'.format(self.geo_port) geo_proxy = 'localhost:{0}'.format(self.geo_port)
ydl = YoutubeDL({ ydl = YoutubeDL({
'proxy': '127.0.0.1:{0}'.format(self.port), 'proxy': 'localhost:{0}'.format(self.port),
'geo_verification_proxy': geo_proxy, 'geo_verification_proxy': geo_proxy,
}) })
url = 'http://foo.com/bar' url = 'http://foo.com/bar'
@ -162,7 +162,7 @@ class TestProxy(unittest.TestCase):
def test_proxy_with_idn(self): def test_proxy_with_idn(self):
ydl = YoutubeDL({ ydl = YoutubeDL({
'proxy': '127.0.0.1:{0}'.format(self.port), 'proxy': 'localhost:{0}'.format(self.port),
}) })
url = 'http://中文.tw/' url = 'http://中文.tw/'
response = ydl.urlopen(url).read().decode('utf-8') response = ydl.urlopen(url).read().decode('utf-8')

View File

@ -232,7 +232,7 @@ class TestNPOSubtitles(BaseTestSubtitles):
class TestMTVSubtitles(BaseTestSubtitles): class TestMTVSubtitles(BaseTestSubtitles):
url = 'http://www.cc.com/video-clips/p63lk0/adam-devine-s-house-party-chasing-white-swans' url = 'http://www.cc.com/video-clips/kllhuv/stand-up-greg-fitzsimmons--uncensored---too-good-of-a-mother'
IE = ComedyCentralIE IE = ComedyCentralIE
def getInfoDict(self): def getInfoDict(self):
@ -243,7 +243,7 @@ class TestMTVSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles() subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en'])) self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), '78206b8d8a0cfa9da64dc026eea48961') self.assertEqual(md5(subtitles['en']), 'b9f6ca22a6acf597ec76f61749765e65')
class TestNRKSubtitles(BaseTestSubtitles): class TestNRKSubtitles(BaseTestSubtitles):

View File

@ -42,7 +42,6 @@ from youtube_dl.utils import (
is_html, is_html,
js_to_json, js_to_json,
limit_length, limit_length,
merge_dicts,
mimetype2ext, mimetype2ext,
month_by_name, month_by_name,
multipart_encode, multipart_encode,
@ -54,12 +53,10 @@ from youtube_dl.utils import (
parse_filesize, parse_filesize,
parse_count, parse_count,
parse_iso8601, parse_iso8601,
parse_resolution,
pkcs1pad, pkcs1pad,
read_batch_urls, read_batch_urls,
sanitize_filename, sanitize_filename,
sanitize_path, sanitize_path,
sanitize_url,
expand_path, expand_path,
prepend_extension, prepend_extension,
replace_extension, replace_extension,
@ -222,12 +219,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_path('./abc'), 'abc') self.assertEqual(sanitize_path('./abc'), 'abc')
self.assertEqual(sanitize_path('./../abc'), '..\\abc') self.assertEqual(sanitize_path('./../abc'), '..\\abc')
def test_sanitize_url(self):
self.assertEqual(sanitize_url('//foo.bar'), 'http://foo.bar')
self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar')
self.assertEqual(sanitize_url('rmtps://foo.bar'), 'rtmps://foo.bar')
self.assertEqual(sanitize_url('https://foo.bar'), 'https://foo.bar')
def test_expand_path(self): def test_expand_path(self):
def env(var): def env(var):
return '%{0}%'.format(var) if sys.platform == 'win32' else '${0}'.format(var) return '%{0}%'.format(var) if sys.platform == 'win32' else '${0}'.format(var)
@ -353,7 +344,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361) self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540) self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)
self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140) self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140)
self.assertEqual(unified_timestamp('2018-03-14T08:32:43.1493874+00:00'), 1521016363)
def test_determine_ext(self): def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4') self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@ -361,7 +351,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(determine_ext('http://example.com/foo/bar.nonext/?download', None), None) self.assertEqual(determine_ext('http://example.com/foo/bar.nonext/?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar/mp4?download', None), None) self.assertEqual(determine_ext('http://example.com/foo/bar/mp4?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar.m3u8//?download'), 'm3u8') self.assertEqual(determine_ext('http://example.com/foo/bar.m3u8//?download'), 'm3u8')
self.assertEqual(determine_ext('foobar', None), None)
def test_find_xpath_attr(self): def test_find_xpath_attr(self):
testxml = '''<root> testxml = '''<root>
@ -520,8 +509,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_age_limit('PG-13'), 13) self.assertEqual(parse_age_limit('PG-13'), 13)
self.assertEqual(parse_age_limit('TV-14'), 14) self.assertEqual(parse_age_limit('TV-14'), 14)
self.assertEqual(parse_age_limit('TV-MA'), 17) self.assertEqual(parse_age_limit('TV-MA'), 17)
self.assertEqual(parse_age_limit('TV14'), 14)
self.assertEqual(parse_age_limit('TV_G'), 0)
def test_parse_duration(self): def test_parse_duration(self):
self.assertEqual(parse_duration(None), None) self.assertEqual(parse_duration(None), None)
@ -673,17 +660,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(dict_get(d, ('b', 'c', key, )), None) self.assertEqual(dict_get(d, ('b', 'c', key, )), None)
self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value) self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value)
def test_merge_dicts(self):
self.assertEqual(merge_dicts({'a': 1}, {'b': 2}), {'a': 1, 'b': 2})
self.assertEqual(merge_dicts({'a': 1}, {'a': 2}), {'a': 1})
self.assertEqual(merge_dicts({'a': 1}, {'a': None}), {'a': 1})
self.assertEqual(merge_dicts({'a': 1}, {'a': ''}), {'a': 1})
self.assertEqual(merge_dicts({'a': 1}, {}), {'a': 1})
self.assertEqual(merge_dicts({'a': None}, {'a': 1}), {'a': 1})
self.assertEqual(merge_dicts({'a': ''}, {'a': 1}), {'a': ''})
self.assertEqual(merge_dicts({'a': ''}, {'a': 'abc'}), {'a': 'abc'})
self.assertEqual(merge_dicts({'a': None}, {'a': ''}, {'a': 'abc'}), {'a': 'abc'})
def test_encode_compat_str(self): def test_encode_compat_str(self):
self.assertEqual(encode_compat_str(b'\xd1\x82\xd0\xb5\xd1\x81\xd1\x82', 'utf-8'), 'тест') self.assertEqual(encode_compat_str(b'\xd1\x82\xd0\xb5\xd1\x81\xd1\x82', 'utf-8'), 'тест')
self.assertEqual(encode_compat_str('тест', 'utf-8'), 'тест') self.assertEqual(encode_compat_str('тест', 'utf-8'), 'тест')
@ -838,9 +814,6 @@ class TestUtil(unittest.TestCase):
inp = '''{"duration": "00:01:07"}''' inp = '''{"duration": "00:01:07"}'''
self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''') self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''')
inp = '''{segments: [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}'''
self.assertEqual(js_to_json(inp), '''{"segments": [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}''')
def test_js_to_json_edgecases(self): def test_js_to_json_edgecases(self):
on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}") on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"}) self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
@ -912,13 +885,6 @@ class TestUtil(unittest.TestCase):
on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}') on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}')
self.assertEqual(json.loads(on), {'42': 42}) self.assertEqual(json.loads(on), {'42': 42})
on = js_to_json('{42:4.2e1}')
self.assertEqual(json.loads(on), {'42': 42.0})
def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1')
def test_extract_attributes(self): def test_extract_attributes(self):
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'}) self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'}) self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
@ -999,16 +965,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_count('1.1kk '), 1100000) self.assertEqual(parse_count('1.1kk '), 1100000)
self.assertEqual(parse_count('1.1kk views'), 1100000) self.assertEqual(parse_count('1.1kk views'), 1100000)
def test_parse_resolution(self):
self.assertEqual(parse_resolution(None), {})
self.assertEqual(parse_resolution(''), {})
self.assertEqual(parse_resolution('1920x1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920×1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920 x 1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('720p'), {'height': 720})
self.assertEqual(parse_resolution('4k'), {'height': 2160})
self.assertEqual(parse_resolution('8K'), {'height': 4320})
def test_version_tuple(self): def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,)) self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344)) self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
@ -1087,18 +1043,6 @@ ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
self.assertFalse(match_str( self.assertFalse(match_str(
'like_count > 100 & dislike_count <? 50 & description', 'like_count > 100 & dislike_count <? 50 & description',
{'like_count': 190, 'dislike_count': 10})) {'like_count': 190, 'dislike_count': 10}))
self.assertTrue(match_str('is_live', {'is_live': True}))
self.assertFalse(match_str('is_live', {'is_live': False}))
self.assertFalse(match_str('is_live', {'is_live': None}))
self.assertFalse(match_str('is_live', {}))
self.assertFalse(match_str('!is_live', {'is_live': True}))
self.assertTrue(match_str('!is_live', {'is_live': False}))
self.assertTrue(match_str('!is_live', {'is_live': None}))
self.assertTrue(match_str('!is_live', {}))
self.assertTrue(match_str('title', {'title': 'abc'}))
self.assertTrue(match_str('title', {'title': ''}))
self.assertFalse(match_str('!title', {'title': 'abc'}))
self.assertFalse(match_str('!title', {'title': ''}))
def test_parse_dfxp_time_expr(self): def test_parse_dfxp_time_expr(self):
self.assertEqual(parse_dfxp_time_expr(None), None) self.assertEqual(parse_dfxp_time_expr(None), None)

View File

@ -61,7 +61,7 @@ class TestYoutubeLists(unittest.TestCase):
dl = FakeYDL() dl = FakeYDL()
dl.params['extract_flat'] = True dl.params['extract_flat'] = True
ie = YoutubePlaylistIE(dl) ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/playlist?list=PL-KKIb8rvtMSrAO9YFbeM6UQrAqoFTUWv') result = ie.extract('https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re')
self.assertIsPlaylist(result) self.assertIsPlaylist(result)
for entry in result['entries']: for entry in result['entries']:
self.assertTrue(entry.get('title')) self.assertTrue(entry.get('title'))

View File

@ -1,34 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
<date>2018-03-09T18:01:43Z</date>
<trackList>
<track>
<location>cd1/track%201.mp3</location>
<title>Pandemonium</title>
<creator>Foilverb</creator>
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
<album>Pandemonium EP</album>
<trackNum>1</trackNum>
<duration>202416</duration>
</track>
<track>
<location>../%E3%83%88%E3%83%A9%E3%83%83%E3%82%AF%E3%80%80%EF%BC%92.mp3</location>
<title>Final Cartridge (Nichico Twelve Remix)</title>
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
<creator>Foilverb</creator>
<album>Pandemonium EP</album>
<trackNum>2</trackNum>
<duration>255857</duration>
</track>
<track>
<location>track3.mp3</location>
<location>https://example.com/track3.mp3</location>
<title>Rebuilding Nightingale</title>
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
<creator>Foilverb</creator>
<album>Pandemonium EP</album>
<trackNum>3</trackNum>
<duration>287915</duration>
</track>
</trackList>
</playlist>

View File

@ -211,7 +211,7 @@ class YoutubeDL(object):
At the moment, this is only supported by YouTube. At the moment, this is only supported by YouTube.
proxy: URL of the proxy server to use proxy: URL of the proxy server to use
geo_verification_proxy: URL of the proxy to use for IP address verification geo_verification_proxy: URL of the proxy to use for IP address verification
on geo-restricted sites. on geo-restricted sites. (Experimental)
socket_timeout: Time to wait for unresponsive hosts, in seconds socket_timeout: Time to wait for unresponsive hosts, in seconds
bidi_workaround: Work around buggy terminals without bidirectional text bidi_workaround: Work around buggy terminals without bidirectional text
support, using fridibi support, using fridibi
@ -259,7 +259,7 @@ class YoutubeDL(object):
- "warn": only emit a warning - "warn": only emit a warning
- "detect_or_warn": check whether we can do anything - "detect_or_warn": check whether we can do anything
about it, warn otherwise (default) about it, warn otherwise (default)
source_address: Client-side IP address to bind to. source_address: (Experimental) Client-side IP address to bind to.
call_home: Boolean, true iff we are allowed to contact the call_home: Boolean, true iff we are allowed to contact the
youtube-dl servers for debugging. youtube-dl servers for debugging.
sleep_interval: Number of seconds to sleep before each download when sleep_interval: Number of seconds to sleep before each download when
@ -281,14 +281,11 @@ class YoutubeDL(object):
match_filter_func in utils.py is one example for this. match_filter_func in utils.py is one example for this.
no_color: Do not emit color codes in output. no_color: Do not emit color codes in output.
geo_bypass: Bypass geographic restriction via faking X-Forwarded-For geo_bypass: Bypass geographic restriction via faking X-Forwarded-For
HTTP header HTTP header (experimental)
geo_bypass_country: geo_bypass_country:
Two-letter ISO 3166-2 country code that will be used for Two-letter ISO 3166-2 country code that will be used for
explicit geographic restriction bypassing via faking explicit geographic restriction bypassing via faking
X-Forwarded-For HTTP header X-Forwarded-For HTTP header (experimental)
geo_bypass_ip_block:
IP range in CIDR notation that will be used similarly to
geo_bypass_country
The following options determine which downloader is picked: The following options determine which downloader is picked:
external_downloader: Executable of the external downloader to call. external_downloader: Executable of the external downloader to call.
@ -301,8 +298,7 @@ class YoutubeDL(object):
the downloader (see youtube_dl/downloader/common.py): the downloader (see youtube_dl/downloader/common.py):
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test, nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
noresizebuffer, retries, continuedl, noprogress, consoletitle, noresizebuffer, retries, continuedl, noprogress, consoletitle,
xattr_set_filesize, external_downloader_args, hls_use_mpegts, xattr_set_filesize, external_downloader_args, hls_use_mpegts.
http_chunk_size.
The following options are used by the post processors: The following options are used by the post processors:
prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available, prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
@ -535,8 +531,6 @@ class YoutubeDL(object):
def save_console_title(self): def save_console_title(self):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if self.params.get('simulate', False):
return
if compat_os_name != 'nt' and 'TERM' in os.environ: if compat_os_name != 'nt' and 'TERM' in os.environ:
# Save the title on stack # Save the title on stack
self._write_string('\033[22;0t', self._screen_file) self._write_string('\033[22;0t', self._screen_file)
@ -544,8 +538,6 @@ class YoutubeDL(object):
def restore_console_title(self): def restore_console_title(self):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if self.params.get('simulate', False):
return
if compat_os_name != 'nt' and 'TERM' in os.environ: if compat_os_name != 'nt' and 'TERM' in os.environ:
# Restore the title from stack # Restore the title from stack
self._write_string('\033[23;0t', self._screen_file) self._write_string('\033[23;0t', self._screen_file)
@ -1040,7 +1032,7 @@ class YoutubeDL(object):
'!=': operator.ne, '!=': operator.ne,
} }
operator_rex = re.compile(r'''(?x)\s* operator_rex = re.compile(r'''(?x)\s*
(?P<key>width|height|tbr|abr|vbr|asr|filesize|filesize_approx|fps) (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s* \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
(?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?) (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
$ $
@ -1482,28 +1474,23 @@ class YoutubeDL(object):
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field): if info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field]) info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
for cc_kind in ('subtitles', 'automatic_captions'): subtitles = info_dict.get('subtitles')
cc = info_dict.get(cc_kind) if subtitles:
if cc: for _, subtitle in subtitles.items():
for _, subtitle in cc.items():
for subtitle_format in subtitle: for subtitle_format in subtitle:
if subtitle_format.get('url'): if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url']) subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if subtitle_format.get('ext') is None: if subtitle_format.get('ext') is None:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower() subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
automatic_captions = info_dict.get('automatic_captions')
subtitles = info_dict.get('subtitles')
if self.params.get('listsubtitles', False): if self.params.get('listsubtitles', False):
if 'automatic_captions' in info_dict: if 'automatic_captions' in info_dict:
self.list_subtitles( self.list_subtitles(info_dict['id'], info_dict.get('automatic_captions'), 'automatic captions')
info_dict['id'], automatic_captions, 'automatic captions')
self.list_subtitles(info_dict['id'], subtitles, 'subtitles') self.list_subtitles(info_dict['id'], subtitles, 'subtitles')
return return
info_dict['requested_subtitles'] = self.process_subtitles( info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], subtitles, automatic_captions) info_dict['id'], subtitles,
info_dict.get('automatic_captions'))
# We now pick which formats have to be downloaded # We now pick which formats have to be downloaded
if info_dict.get('formats') is None: if info_dict.get('formats') is None:
@ -1861,7 +1848,7 @@ class YoutubeDL(object):
def compatible_formats(formats): def compatible_formats(formats):
video, audio = formats video, audio = formats
# Check extension # Check extension
video_ext, audio_ext = video.get('ext'), audio.get('ext') video_ext, audio_ext = audio.get('ext'), video.get('ext')
if video_ext and audio_ext: if video_ext and audio_ext:
COMPATIBLE_EXTS = ( COMPATIBLE_EXTS = (
('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'), ('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'),

View File

@ -191,11 +191,6 @@ def _real_main(argv=None):
if numeric_buffersize is None: if numeric_buffersize is None:
parser.error('invalid buffer size specified') parser.error('invalid buffer size specified')
opts.buffersize = numeric_buffersize opts.buffersize = numeric_buffersize
if opts.http_chunk_size is not None:
numeric_chunksize = FileDownloader.parse_bytes(opts.http_chunk_size)
if not numeric_chunksize:
parser.error('invalid http chunk size specified')
opts.http_chunk_size = numeric_chunksize
if opts.playliststart <= 0: if opts.playliststart <= 0:
raise ValueError('Playlist start must be positive') raise ValueError('Playlist start must be positive')
if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart: if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart:
@ -351,7 +346,6 @@ def _real_main(argv=None):
'keep_fragments': opts.keep_fragments, 'keep_fragments': opts.keep_fragments,
'buffersize': opts.buffersize, 'buffersize': opts.buffersize,
'noresizebuffer': opts.noresizebuffer, 'noresizebuffer': opts.noresizebuffer,
'http_chunk_size': opts.http_chunk_size,
'continuedl': opts.continue_dl, 'continuedl': opts.continue_dl,
'noprogress': opts.noprogress, 'noprogress': opts.noprogress,
'progress_with_newline': opts.progress_with_newline, 'progress_with_newline': opts.progress_with_newline,
@ -430,7 +424,6 @@ def _real_main(argv=None):
'config_location': opts.config_location, 'config_location': opts.config_location,
'geo_bypass': opts.geo_bypass, 'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country, 'geo_bypass_country': opts.geo_bypass_country,
'geo_bypass_ip_block': opts.geo_bypass_ip_block,
# just for deprecation check # just for deprecation check
'autonumber': opts.autonumber if opts.autonumber is True else None, 'autonumber': opts.autonumber if opts.autonumber is True else None,
'usetitle': opts.usetitle if opts.usetitle is True else None, 'usetitle': opts.usetitle if opts.usetitle is True else None,

View File

@ -1,8 +1,8 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
from math import ceil from math import ceil
from .compat import compat_b64decode
from .utils import bytes_to_intlist, intlist_to_bytes from .utils import bytes_to_intlist, intlist_to_bytes
BLOCK_SIZE_BYTES = 16 BLOCK_SIZE_BYTES = 16
@ -180,7 +180,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
""" """
NONCE_LENGTH_BYTES = 8 NONCE_LENGTH_BYTES = 8
data = bytes_to_intlist(compat_b64decode(data)) data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
password = bytes_to_intlist(password.encode('utf-8')) password = bytes_to_intlist(password.encode('utf-8'))
key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password)) key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))

View File

@ -1,7 +1,6 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import binascii import binascii
import collections import collections
import ctypes import ctypes
@ -2897,24 +2896,9 @@ except TypeError:
if isinstance(spec, compat_str): if isinstance(spec, compat_str):
spec = spec.encode('ascii') spec = spec.encode('ascii')
return struct.unpack(spec, *args) return struct.unpack(spec, *args)
class compat_Struct(struct.Struct):
def __init__(self, fmt):
if isinstance(fmt, compat_str):
fmt = fmt.encode('ascii')
super(compat_Struct, self).__init__(fmt)
else: else:
compat_struct_pack = struct.pack compat_struct_pack = struct.pack
compat_struct_unpack = struct.unpack compat_struct_unpack = struct.unpack
if platform.python_implementation() == 'IronPython' and sys.version_info < (2, 7, 8):
class compat_Struct(struct.Struct):
def unpack(self, string):
if not isinstance(string, buffer): # noqa: F821
string = buffer(string) # noqa: F821
return super(compat_Struct, self).unpack(string)
else:
compat_Struct = struct.Struct
try: try:
from future_builtins import zip as compat_zip from future_builtins import zip as compat_zip
@ -2924,16 +2908,6 @@ except ImportError: # not 2.6+ or is 3.x
except ImportError: except ImportError:
compat_zip = zip compat_zip = zip
if sys.version_info < (3, 3):
def compat_b64decode(s, *args, **kwargs):
if isinstance(s, compat_str):
s = s.encode('ascii')
return base64.b64decode(s, *args, **kwargs)
else:
compat_b64decode = base64.b64decode
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0): if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function # PyPy2 prior to version 5.4.0 expects byte strings as Windows function
# names, see the original PyPy issue [1] and the youtube-dl one [2]. # names, see the original PyPy issue [1] and the youtube-dl one [2].
@ -2956,8 +2930,6 @@ __all__ = [
'compat_HTMLParseError', 'compat_HTMLParseError',
'compat_HTMLParser', 'compat_HTMLParser',
'compat_HTTPError', 'compat_HTTPError',
'compat_Struct',
'compat_b64decode',
'compat_basestring', 'compat_basestring',
'compat_chr', 'compat_chr',
'compat_cookiejar', 'compat_cookiejar',

View File

@ -45,12 +45,10 @@ class FileDownloader(object):
min_filesize: Skip files smaller than this size min_filesize: Skip files smaller than this size
max_filesize: Skip files larger than this size max_filesize: Skip files larger than this size
xattr_set_filesize: Set ytdl.filesize user xattribute with expected size. xattr_set_filesize: Set ytdl.filesize user xattribute with expected size.
(experimental)
external_downloader_args: A list of additional command-line arguments for the external_downloader_args: A list of additional command-line arguments for the
external downloader. external downloader.
hls_use_mpegts: Use the mpegts container for HLS videos. hls_use_mpegts: Use the mpegts container for HLS videos.
http_chunk_size: Size of a chunk for chunk-based HTTP downloading. May be
useful for bypassing bandwidth throttling imposed by
a webserver (experimental)
Subclasses of this one must re-define the real_download method. Subclasses of this one must re-define the real_download method.
""" """
@ -248,13 +246,12 @@ class FileDownloader(object):
if self.params.get('noprogress', False): if self.params.get('noprogress', False):
self.to_screen('[download] Download completed') self.to_screen('[download] Download completed')
else: else:
msg_template = '100%%'
if s.get('total_bytes') is not None:
s['_total_bytes_str'] = format_bytes(s['total_bytes']) s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template += ' of %(_total_bytes_str)s'
if s.get('elapsed') is not None: if s.get('elapsed') is not None:
s['_elapsed_str'] = self.format_seconds(s['elapsed']) s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template += ' in %(_elapsed_str)s' msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
else:
msg_template = '100%% of %(_total_bytes_str)s'
self._report_progress_status( self._report_progress_status(
msg_template % s, is_last_line=True) msg_template % s, is_last_line=True)

View File

@ -1,10 +1,9 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import os.path import os.path
import re
import subprocess import subprocess
import sys import sys
import time import re
from .common import FileDownloader from .common import FileDownloader
from ..compat import ( from ..compat import (
@ -31,7 +30,6 @@ class ExternalFD(FileDownloader):
tmpfilename = self.temp_name(filename) tmpfilename = self.temp_name(filename)
try: try:
started = time.time()
retval = self._call_downloader(tmpfilename, info_dict) retval = self._call_downloader(tmpfilename, info_dict)
except KeyboardInterrupt: except KeyboardInterrupt:
if not info_dict.get('is_live'): if not info_dict.get('is_live'):
@ -43,20 +41,15 @@ class ExternalFD(FileDownloader):
self.to_screen('[%s] Interrupted by user' % self.get_basename()) self.to_screen('[%s] Interrupted by user' % self.get_basename())
if retval == 0: if retval == 0:
status = {
'filename': filename,
'status': 'finished',
'elapsed': time.time() - started,
}
if filename != '-':
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize)) self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
self.try_rename(tmpfilename, filename) self.try_rename(tmpfilename, filename)
status.update({ self._hook_progress({
'downloaded_bytes': fsize, 'downloaded_bytes': fsize,
'total_bytes': fsize, 'total_bytes': fsize,
'filename': filename,
'status': 'finished',
}) })
self._hook_progress(status)
return True return True
else: else:
self.to_stderr('\n') self.to_stderr('\n')

View File

@ -1,12 +1,12 @@
from __future__ import division, unicode_literals from __future__ import division, unicode_literals
import base64
import io import io
import itertools import itertools
import time import time
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import ( from ..compat import (
compat_b64decode,
compat_etree_fromstring, compat_etree_fromstring,
compat_urlparse, compat_urlparse,
compat_urllib_error, compat_urllib_error,
@ -312,7 +312,7 @@ class F4mFD(FragmentFD):
boot_info = self._get_bootstrap_from_url(bootstrap_url) boot_info = self._get_bootstrap_from_url(bootstrap_url)
else: else:
bootstrap_url = None bootstrap_url = None
bootstrap = compat_b64decode(node.text) bootstrap = base64.b64decode(node.text.encode('ascii'))
boot_info = read_bootstrap_info(bootstrap) boot_info = read_bootstrap_info(bootstrap)
return boot_info, bootstrap_url return boot_info, bootstrap_url
@ -349,7 +349,7 @@ class F4mFD(FragmentFD):
live = boot_info['live'] live = boot_info['live']
metadata_node = media.find(_add_ns('metadata')) metadata_node = media.find(_add_ns('metadata'))
if metadata_node is not None: if metadata_node is not None:
metadata = compat_b64decode(metadata_node.text) metadata = base64.b64decode(metadata_node.text.encode('ascii'))
else: else:
metadata = None metadata = None

View File

@ -74,13 +74,8 @@ class FragmentFD(FileDownloader):
return not ctx['live'] and not ctx['tmpfilename'] == '-' return not ctx['live'] and not ctx['tmpfilename'] == '-'
def _read_ytdl_file(self, ctx): def _read_ytdl_file(self, ctx):
assert 'ytdl_corrupt' not in ctx
stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'r') stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'r')
try:
ctx['fragment_index'] = json.loads(stream.read())['downloader']['current_fragment']['index'] ctx['fragment_index'] = json.loads(stream.read())['downloader']['current_fragment']['index']
except Exception:
ctx['ytdl_corrupt'] = True
finally:
stream.close() stream.close()
def _write_ytdl_file(self, ctx): def _write_ytdl_file(self, ctx):
@ -163,17 +158,11 @@ class FragmentFD(FileDownloader):
if self.__do_ytdl_file(ctx): if self.__do_ytdl_file(ctx):
if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))): if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))):
self._read_ytdl_file(ctx) self._read_ytdl_file(ctx)
is_corrupt = ctx.get('ytdl_corrupt') is True if ctx['fragment_index'] > 0 and resume_len == 0:
is_inconsistent = ctx['fragment_index'] > 0 and resume_len == 0
if is_corrupt or is_inconsistent:
message = (
'.ytdl file is corrupt' if is_corrupt else
'Inconsistent state of incomplete fragment download')
self.report_warning( self.report_warning(
'%s. Restarting from the beginning...' % message) 'Inconsistent state of incomplete fragment download. '
'Restarting from the beginning...')
ctx['fragment_index'] = resume_len = 0 ctx['fragment_index'] = resume_len = 0
if 'ytdl_corrupt' in ctx:
del ctx['ytdl_corrupt']
self._write_ytdl_file(ctx) self._write_ytdl_file(ctx)
else: else:
self._write_ytdl_file(ctx) self._write_ytdl_file(ctx)
@ -252,16 +241,12 @@ class FragmentFD(FileDownloader):
if os.path.isfile(ytdl_filename): if os.path.isfile(ytdl_filename):
os.remove(ytdl_filename) os.remove(ytdl_filename)
elapsed = time.time() - ctx['started'] elapsed = time.time() - ctx['started']
if ctx['tmpfilename'] == '-':
downloaded_bytes = ctx['complete_frags_downloaded_bytes']
else:
self.try_rename(ctx['tmpfilename'], ctx['filename']) self.try_rename(ctx['tmpfilename'], ctx['filename'])
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename'])) fsize = os.path.getsize(encodeFilename(ctx['filename']))
self._hook_progress({ self._hook_progress({
'downloaded_bytes': downloaded_bytes, 'downloaded_bytes': fsize,
'total_bytes': downloaded_bytes, 'total_bytes': fsize,
'filename': ctx['filename'], 'filename': ctx['filename'],
'status': 'finished', 'status': 'finished',
'elapsed': elapsed, 'elapsed': elapsed,

View File

@ -75,9 +75,8 @@ class HlsFD(FragmentFD):
fd.add_progress_hook(ph) fd.add_progress_hook(ph)
return fd.real_download(filename, info_dict) return fd.real_download(filename, info_dict)
def is_ad_fragment(s): def anvato_ad(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s or return s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
media_frags = 0 media_frags = 0
ad_frags = 0 ad_frags = 0
@ -87,7 +86,7 @@ class HlsFD(FragmentFD):
if not line: if not line:
continue continue
if line.startswith('#'): if line.startswith('#'):
if is_ad_fragment(line): if anvato_ad(line):
ad_frags += 1 ad_frags += 1
ad_frag_next = True ad_frag_next = True
continue continue
@ -196,7 +195,7 @@ class HlsFD(FragmentFD):
'start': sub_range_start, 'start': sub_range_start,
'end': sub_range_start + int(splitted_byte_range[0]), 'end': sub_range_start + int(splitted_byte_range[0]),
} }
elif is_ad_fragment(line): elif anvato_ad(line):
ad_frag_next = True ad_frag_next = True
self._finish_frag_download(ctx) self._finish_frag_download(ctx)

View File

@ -4,18 +4,13 @@ import errno
import os import os
import socket import socket
import time import time
import random
import re import re
from .common import FileDownloader from .common import FileDownloader
from ..compat import ( from ..compat import compat_urllib_error
compat_str,
compat_urllib_error,
)
from ..utils import ( from ..utils import (
ContentTooShortError, ContentTooShortError,
encodeFilename, encodeFilename,
int_or_none,
sanitize_open, sanitize_open,
sanitized_Request, sanitized_Request,
write_xattr, write_xattr,
@ -43,26 +38,21 @@ class HttpFD(FileDownloader):
add_headers = info_dict.get('http_headers') add_headers = info_dict.get('http_headers')
if add_headers: if add_headers:
headers.update(add_headers) headers.update(add_headers)
basic_request = sanitized_Request(url, None, headers)
request = sanitized_Request(url, None, headers)
is_test = self.params.get('test', False) is_test = self.params.get('test', False)
chunk_size = self._TEST_FILE_SIZE if is_test else (
info_dict.get('downloader_options', {}).get('http_chunk_size') or if is_test:
self.params.get('http_chunk_size') or 0) request.add_header('Range', 'bytes=0-%s' % str(self._TEST_FILE_SIZE - 1))
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
ctx.resume_len = 0 ctx.resume_len = 0
ctx.data_len = None
ctx.block_size = self.params.get('buffersize', 1024)
ctx.start_time = time.time()
ctx.chunk_size = None
if self.params.get('continuedl', True): if self.params.get('continuedl', True):
# Establish possible resume length # Establish possible resume length
if os.path.isfile(encodeFilename(ctx.tmpfilename)): if os.path.isfile(encodeFilename(ctx.tmpfilename)):
ctx.resume_len = os.path.getsize( ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename))
encodeFilename(ctx.tmpfilename))
ctx.is_resume = ctx.resume_len > 0
count = 0 count = 0
retries = self.params.get('retries', 0) retries = self.params.get('retries', 0)
@ -74,36 +64,11 @@ class HttpFD(FileDownloader):
def __init__(self, source_error): def __init__(self, source_error):
self.source_error = source_error self.source_error = source_error
class NextFragment(Exception):
pass
def set_range(req, start, end):
range_header = 'bytes=%d-' % start
if end:
range_header += compat_str(end)
req.add_header('Range', range_header)
def establish_connection(): def establish_connection():
ctx.chunk_size = (random.randint(int(chunk_size * 0.95), chunk_size) if ctx.resume_len != 0:
if not is_test and chunk_size else chunk_size)
if ctx.resume_len > 0:
range_start = ctx.resume_len
if ctx.is_resume:
self.report_resuming_byte(ctx.resume_len) self.report_resuming_byte(ctx.resume_len)
request.add_header('Range', 'bytes=%d-' % ctx.resume_len)
ctx.open_mode = 'ab' ctx.open_mode = 'ab'
elif ctx.chunk_size > 0:
range_start = 0
else:
range_start = None
ctx.is_resume = False
range_end = range_start + ctx.chunk_size - 1 if ctx.chunk_size else None
if range_end and ctx.data_len is not None and range_end >= ctx.data_len:
range_end = ctx.data_len - 1
has_range = range_start is not None
ctx.has_range = has_range
request = sanitized_Request(url, None, headers)
if has_range:
set_range(request, range_start, range_end)
# Establish connection # Establish connection
try: try:
ctx.data = self.ydl.urlopen(request) ctx.data = self.ydl.urlopen(request)
@ -112,24 +77,12 @@ class HttpFD(FileDownloader):
# that don't support resuming and serve a whole file with no Content-Range # that don't support resuming and serve a whole file with no Content-Range
# set in response despite of requested Range (see # set in response despite of requested Range (see
# https://github.com/rg3/youtube-dl/issues/6057#issuecomment-126129799) # https://github.com/rg3/youtube-dl/issues/6057#issuecomment-126129799)
if has_range: if ctx.resume_len > 0:
content_range = ctx.data.headers.get('Content-Range') content_range = ctx.data.headers.get('Content-Range')
if content_range: if content_range:
content_range_m = re.search(r'bytes (\d+)-(\d+)?(?:/(\d+))?', content_range) content_range_m = re.search(r'bytes (\d+)-', content_range)
# Content-Range is present and matches requested Range, resume is possible # Content-Range is present and matches requested Range, resume is possible
if content_range_m: if content_range_m and ctx.resume_len == int(content_range_m.group(1)):
if range_start == int(content_range_m.group(1)):
content_range_end = int_or_none(content_range_m.group(2))
content_len = int_or_none(content_range_m.group(3))
accept_content_len = (
# Non-chunked download
not ctx.chunk_size or
# Chunked download and requested piece or
# its part is promised to be served
content_range_end == range_end or
content_len < range_end)
if accept_content_len:
ctx.data_len = content_len
return return
# Content-Range is either not present or invalid. Assuming remote webserver is # Content-Range is either not present or invalid. Assuming remote webserver is
# trying to send the whole file, resume is not possible, so wiping the local file # trying to send the whole file, resume is not possible, so wiping the local file
@ -137,15 +90,16 @@ class HttpFD(FileDownloader):
self.report_unable_to_resume() self.report_unable_to_resume()
ctx.resume_len = 0 ctx.resume_len = 0
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
ctx.data_len = int_or_none(ctx.data.info().get('Content-length', None))
return return
except (compat_urllib_error.HTTPError, ) as err: except (compat_urllib_error.HTTPError, ) as err:
if err.code == 416: if (err.code < 500 or err.code >= 600) and err.code != 416:
# Unexpected HTTP error
raise
elif err.code == 416:
# Unable to resume (requested range not satisfiable) # Unable to resume (requested range not satisfiable)
try: try:
# Open the connection again without the range header # Open the connection again without the range header
ctx.data = self.ydl.urlopen( ctx.data = self.ydl.urlopen(basic_request)
sanitized_Request(url, None, headers))
content_length = ctx.data.info()['Content-Length'] content_length = ctx.data.info()['Content-Length']
except (compat_urllib_error.HTTPError, ) as err: except (compat_urllib_error.HTTPError, ) as err:
if err.code < 500 or err.code >= 600: if err.code < 500 or err.code >= 600:
@ -176,9 +130,6 @@ class HttpFD(FileDownloader):
ctx.resume_len = 0 ctx.resume_len = 0
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
return return
elif err.code < 500 or err.code >= 600:
# Unexpected HTTP error
raise
raise RetryDownload(err) raise RetryDownload(err)
except socket.error as err: except socket.error as err:
if err.errno != errno.ECONNRESET: if err.errno != errno.ECONNRESET:
@ -209,7 +160,7 @@ class HttpFD(FileDownloader):
return False return False
byte_counter = 0 + ctx.resume_len byte_counter = 0 + ctx.resume_len
block_size = ctx.block_size block_size = self.params.get('buffersize', 1024)
start = time.time() start = time.time()
# measure time over whole while-loop, so slow_down() and best_block_size() work together properly # measure time over whole while-loop, so slow_down() and best_block_size() work together properly
@ -217,11 +168,10 @@ class HttpFD(FileDownloader):
before = start # start measuring before = start # start measuring
def retry(e): def retry(e):
to_stdout = ctx.tmpfilename == '-' if ctx.tmpfilename != '-':
if not to_stdout:
ctx.stream.close() ctx.stream.close()
ctx.stream = None ctx.stream = None
ctx.resume_len = byte_counter if to_stdout else os.path.getsize(encodeFilename(ctx.tmpfilename)) ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename))
raise RetryDownload(e) raise RetryDownload(e)
while True: while True:
@ -283,30 +233,25 @@ class HttpFD(FileDownloader):
# Progress message # Progress message
speed = self.calc_speed(start, now, byte_counter - ctx.resume_len) speed = self.calc_speed(start, now, byte_counter - ctx.resume_len)
if ctx.data_len is None: if data_len is None:
eta = None eta = None
else: else:
eta = self.calc_eta(start, time.time(), ctx.data_len - ctx.resume_len, byte_counter - ctx.resume_len) eta = self.calc_eta(start, time.time(), data_len - ctx.resume_len, byte_counter - ctx.resume_len)
self._hook_progress({ self._hook_progress({
'status': 'downloading', 'status': 'downloading',
'downloaded_bytes': byte_counter, 'downloaded_bytes': byte_counter,
'total_bytes': ctx.data_len, 'total_bytes': data_len,
'tmpfilename': ctx.tmpfilename, 'tmpfilename': ctx.tmpfilename,
'filename': ctx.filename, 'filename': ctx.filename,
'eta': eta, 'eta': eta,
'speed': speed, 'speed': speed,
'elapsed': now - ctx.start_time, 'elapsed': now - start,
}) })
if is_test and byte_counter == data_len: if is_test and byte_counter == data_len:
break break
if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:
ctx.resume_len = byte_counter
# ctx.block_size = block_size
raise NextFragment()
if ctx.stream is None: if ctx.stream is None:
self.to_stderr('\n') self.to_stderr('\n')
self.report_error('Did not get any data blocks') self.report_error('Did not get any data blocks')
@ -331,7 +276,7 @@ class HttpFD(FileDownloader):
'total_bytes': byte_counter, 'total_bytes': byte_counter,
'filename': ctx.filename, 'filename': ctx.filename,
'status': 'finished', 'status': 'finished',
'elapsed': time.time() - ctx.start_time, 'elapsed': time.time() - start,
}) })
return True return True
@ -345,8 +290,6 @@ class HttpFD(FileDownloader):
if count <= retries: if count <= retries:
self.report_retry(e.source_error, count, retries) self.report_retry(e.source_error, count, retries)
continue continue
except NextFragment:
continue
except SucceedDownload: except SucceedDownload:
return True return True

View File

@ -1,27 +1,25 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import time import time
import struct
import binascii import binascii
import io import io
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import ( from ..compat import compat_urllib_error
compat_Struct,
compat_urllib_error,
)
u8 = compat_Struct('>B') u8 = struct.Struct(b'>B')
u88 = compat_Struct('>Bx') u88 = struct.Struct(b'>Bx')
u16 = compat_Struct('>H') u16 = struct.Struct(b'>H')
u1616 = compat_Struct('>Hxx') u1616 = struct.Struct(b'>Hxx')
u32 = compat_Struct('>I') u32 = struct.Struct(b'>I')
u64 = compat_Struct('>Q') u64 = struct.Struct(b'>Q')
s88 = compat_Struct('>bx') s88 = struct.Struct(b'>bx')
s16 = compat_Struct('>h') s16 = struct.Struct(b'>h')
s1616 = compat_Struct('>hxx') s1616 = struct.Struct(b'>hxx')
s32 = compat_Struct('>i') s32 = struct.Struct(b'>i')
unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000) unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000)
@ -141,7 +139,7 @@ def write_piff_header(stream, params):
sample_entry_payload += u16.pack(0x18) # depth sample_entry_payload += u16.pack(0x18) # depth
sample_entry_payload += s16.pack(-1) # pre defined sample_entry_payload += s16.pack(-1) # pre defined
codec_private_data = binascii.unhexlify(params['codec_private_data'].encode('utf-8')) codec_private_data = binascii.unhexlify(params['codec_private_data'])
if fourcc in ('H264', 'AVC1'): if fourcc in ('H264', 'AVC1'):
sps, pps = codec_private_data.split(u32.pack(1))[1:] sps, pps = codec_private_data.split(u32.pack(1))[1:]
avcc_payload = u8.pack(1) # configuration version avcc_payload = u8.pack(1) # configuration version

View File

@ -24,12 +24,10 @@ class RtmpFD(FileDownloader):
def real_download(self, filename, info_dict): def real_download(self, filename, info_dict):
def run_rtmpdump(args): def run_rtmpdump(args):
start = time.time() start = time.time()
proc = subprocess.Popen(args, stderr=subprocess.PIPE)
cursor_in_new_line = True
def dl():
resume_percent = None resume_percent = None
resume_downloaded_data_len = None resume_downloaded_data_len = None
proc = subprocess.Popen(args, stderr=subprocess.PIPE)
cursor_in_new_line = True
proc_stderr_closed = False proc_stderr_closed = False
while not proc_stderr_closed: while not proc_stderr_closed:
# read line from stderr # read line from stderr
@ -90,12 +88,7 @@ class RtmpFD(FileDownloader):
self.to_screen('') self.to_screen('')
cursor_in_new_line = True cursor_in_new_line = True
self.to_screen('[rtmpdump] ' + line) self.to_screen('[rtmpdump] ' + line)
try:
dl()
finally:
proc.wait() proc.wait()
if not cursor_in_new_line: if not cursor_in_new_line:
self.to_screen('') self.to_screen('')
return proc.returncode return proc.returncode
@ -170,15 +163,7 @@ class RtmpFD(FileDownloader):
RD_INCOMPLETE = 2 RD_INCOMPLETE = 2
RD_NO_CONNECT = 3 RD_NO_CONNECT = 3
started = time.time()
try:
retval = run_rtmpdump(args) retval = run_rtmpdump(args)
except KeyboardInterrupt:
if not info_dict.get('is_live'):
raise
retval = RD_SUCCESS
self.to_screen('\n[rtmpdump] Interrupted by user')
if retval == RD_NO_CONNECT: if retval == RD_NO_CONNECT:
self.report_error('[rtmpdump] Could not connect to RTMP server.') self.report_error('[rtmpdump] Could not connect to RTMP server.')
@ -186,7 +171,7 @@ class RtmpFD(FileDownloader):
while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live: while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live:
prevsize = os.path.getsize(encodeFilename(tmpfilename)) prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] Downloaded %s bytes' % prevsize) self.to_screen('[rtmpdump] %s bytes' % prevsize)
time.sleep(5.0) # This seems to be needed time.sleep(5.0) # This seems to be needed
args = basic_args + ['--resume'] args = basic_args + ['--resume']
if retval == RD_FAILED: if retval == RD_FAILED:
@ -203,14 +188,13 @@ class RtmpFD(FileDownloader):
break break
if retval == RD_SUCCESS or (test and retval == RD_INCOMPLETE): if retval == RD_SUCCESS or (test and retval == RD_INCOMPLETE):
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] Downloaded %s bytes' % fsize) self.to_screen('[rtmpdump] %s bytes' % fsize)
self.try_rename(tmpfilename, filename) self.try_rename(tmpfilename, filename)
self._hook_progress({ self._hook_progress({
'downloaded_bytes': fsize, 'downloaded_bytes': fsize,
'total_bytes': fsize, 'total_bytes': fsize,
'filename': filename, 'filename': filename,
'status': 'finished', 'status': 'finished',
'elapsed': time.time() - started,
}) })
return True return True
else: else:

View File

@ -13,7 +13,6 @@ from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
try_get, try_get,
unescapeHTML,
update_url_query, update_url_query,
) )
@ -105,22 +104,21 @@ class ABCIE(InfoExtractor):
class ABCIViewIE(InfoExtractor): class ABCIViewIE(InfoExtractor):
IE_NAME = 'abc.net.au:iview' IE_NAME = 'abc.net.au:iview'
_VALID_URL = r'https?://iview\.abc\.net\.au/(?:[^/]+/)*video/(?P<id>[^/?#]+)' _VALID_URL = r'https?://iview\.abc\.net\.au/programs/[^/]+/(?P<id>[^/?#]+)'
_GEO_COUNTRIES = ['AU'] _GEO_COUNTRIES = ['AU']
# ABC iview programs are normally available for 14 days only. # ABC iview programs are normally available for 14 days only.
_TESTS = [{ _TESTS = [{
'url': 'https://iview.abc.net.au/show/ben-and-hollys-little-kingdom/series/0/video/ZX9371A050S00', 'url': 'http://iview.abc.net.au/programs/call-the-midwife/ZW0898A003S00',
'md5': 'cde42d728b3b7c2b32b1b94b4a548afc', 'md5': 'cde42d728b3b7c2b32b1b94b4a548afc',
'info_dict': { 'info_dict': {
'id': 'ZX9371A050S00', 'id': 'ZW0898A003S00',
'ext': 'mp4', 'ext': 'mp4',
'title': "Gaston's Birthday", 'title': 'Series 5 Ep 3',
'series': "Ben And Holly's Little Kingdom", 'description': 'md5:e0ef7d4f92055b86c4f33611f180ed79',
'description': 'md5:f9de914d02f226968f598ac76f105bcf', 'upload_date': '20171228',
'upload_date': '20180604', 'uploader_id': 'abc1',
'uploader_id': 'abc4kids', 'timestamp': 1514499187,
'timestamp': 1528140219,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -129,16 +127,17 @@ class ABCIViewIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
video_params = self._download_json( webpage = self._download_webpage(url, video_id)
'https://iview.abc.net.au/api/programs/' + video_id, video_id) video_params = self._parse_json(self._search_regex(
title = unescapeHTML(video_params.get('title') or video_params['seriesTitle']) r'videoParams\s*=\s*({.+?});', webpage, 'video params'), video_id)
stream = next(s for s in video_params['playlist'] if s.get('type') in ('program', 'livestream')) title = video_params.get('title') or video_params['seriesTitle']
stream = next(s for s in video_params['playlist'] if s.get('type') == 'program')
house_number = video_params.get('episodeHouseNumber') or video_id house_number = video_params.get('episodeHouseNumber')
path = '/auth/hls/sign?ts={0}&hn={1}&d=android-tablet'.format( path = '/auth/hls/sign?ts={0}&hn={1}&d=android-mobile'.format(
int(time.time()), house_number) int(time.time()), house_number)
sig = hmac.new( sig = hmac.new(
b'android.content.res.Resources', 'android.content.res.Resources'.encode('utf-8'),
path.encode('utf-8'), hashlib.sha256).hexdigest() path.encode('utf-8'), hashlib.sha256).hexdigest()
token = self._download_webpage( token = self._download_webpage(
'http://iview.abc.net.au{0}&sig={1}'.format(path, sig), video_id) 'http://iview.abc.net.au{0}&sig={1}'.format(path, sig), video_id)
@ -168,26 +167,18 @@ class ABCIViewIE(InfoExtractor):
'ext': 'vtt', 'ext': 'vtt',
}] }]
is_live = video_params.get('livestream') == '1'
if is_live:
title = self._live_title(title)
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': video_params.get('description'), 'description': self._html_search_meta(['og:description', 'twitter:description'], webpage),
'thumbnail': video_params.get('thumbnail'), 'thumbnail': self._html_search_meta(['og:image', 'twitter:image:src'], webpage),
'duration': int_or_none(video_params.get('eventDuration')), 'duration': int_or_none(video_params.get('eventDuration')),
'timestamp': parse_iso8601(video_params.get('pubDate'), ' '), 'timestamp': parse_iso8601(video_params.get('pubDate'), ' '),
'series': unescapeHTML(video_params.get('seriesTitle')), 'series': video_params.get('seriesTitle'),
'series_id': video_params.get('seriesHouseNumber') or video_id[:7], 'series_id': video_params.get('seriesHouseNumber') or video_id[:7],
'season_number': int_or_none(self._search_regex( 'episode_number': int_or_none(self._html_search_meta('episodeNumber', webpage, default=None)),
r'\bSeries\s+(\d+)\b', title, 'season number', default=None)), 'episode': self._html_search_meta('episode_title', webpage, default=None),
'episode_number': int_or_none(self._search_regex(
r'\bEp\s+(\d+)\b', title, 'episode number', default=None)),
'episode_id': house_number,
'uploader_id': video_params.get('channel'), 'uploader_id': video_params.get('channel'),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
'is_live': is_live,
} }

View File

@ -66,7 +66,7 @@ class AbcNewsIE(InfoExtractor):
_TESTS = [{ _TESTS = [{
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY', 'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
'info_dict': { 'info_dict': {
'id': '10505354', 'id': '10498713',
'ext': 'flv', 'ext': 'flv',
'display_id': 'dramatic-video-rare-death-job-america', 'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards', 'title': 'Occupational Hazards',
@ -79,7 +79,7 @@ class AbcNewsIE(InfoExtractor):
}, { }, {
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818', 'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
'info_dict': { 'info_dict': {
'id': '38897857', 'id': '39125818',
'ext': 'mp4', 'ext': 'mp4',
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016', 'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
'title': 'Justin Timberlake Drops Hints For Secret Single', 'title': 'Justin Timberlake Drops Hints For Secret Single',

View File

@ -7,9 +7,7 @@ import functools
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
float_or_none,
int_or_none, int_or_none,
try_get,
unified_timestamp, unified_timestamp,
OnDemandPagedList, OnDemandPagedList,
) )
@ -26,58 +24,40 @@ class ACastIE(InfoExtractor):
'id': '57de3baa-4bb0-487e-9418-2692c1277a34', 'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
'ext': 'mp3', 'ext': 'mp3',
'title': '"Where Are You?": Taipei 101, Taiwan', 'title': '"Where Are You?": Taipei 101, Taiwan',
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'timestamp': 1196172000, 'timestamp': 1196172000,
'upload_date': '20071127', 'upload_date': '20071127',
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'duration': 211, 'duration': 211,
'creator': 'Concierge',
'series': 'Condé Nast Traveler Podcast',
'episode': '"Where Are You?": Taipei 101, Taiwan',
} }
}, { }, {
# test with multiple blings # test with multiple blings
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna', 'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
'md5': 'a02393c74f3bdb1801c3ec2695577ce0', 'md5': 'e87d5b8516cd04c0d81b6ee1caca28d0',
'info_dict': { 'info_dict': {
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9', 'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
'ext': 'mp3', 'ext': 'mp3',
'title': '2. Raggarmordet - Röster ur det förflutna', 'title': '2. Raggarmordet - Röster ur det förflutna',
'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
'timestamp': 1477346700, 'timestamp': 1477346700,
'upload_date': '20161024', 'upload_date': '20161024',
'duration': 2766.602563, 'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
'creator': 'Anton Berg & Martin Johnson', 'duration': 2766,
'series': 'Spår',
'episode': '2. Raggarmordet - Röster ur det förflutna',
} }
}] }]
def _real_extract(self, url): def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups() channel, display_id = re.match(self._VALID_URL, url).groups()
s = self._download_json(
'https://play-api.acast.com/stitch/%s/%s' % (channel, display_id),
display_id)['result']
media_url = s['url']
cast_data = self._download_json( cast_data = self._download_json(
'https://play-api.acast.com/splash/%s/%s' % (channel, display_id), 'https://play-api.acast.com/splash/%s/%s' % (channel, display_id), display_id)
display_id)['result'] e = cast_data['result']['episode']
e = cast_data['episode']
title = e['name']
return { return {
'id': compat_str(e['id']), 'id': compat_str(e['id']),
'display_id': display_id, 'display_id': display_id,
'url': media_url, 'url': e['mediaUrl'],
'title': title, 'title': e['name'],
'description': e.get('description') or e.get('summary'), 'description': e.get('description'),
'thumbnail': e.get('image'), 'thumbnail': e.get('image'),
'timestamp': unified_timestamp(e.get('publishingDate')), 'timestamp': unified_timestamp(e.get('publishingDate')),
'duration': float_or_none(s.get('duration') or e.get('duration')), 'duration': int_or_none(e.get('duration')),
'filesize': int_or_none(e.get('contentLength')),
'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str),
'series': try_get(cast_data, lambda x: x['show']['name'], compat_str),
'season_number': int_or_none(e.get('seasonNumber')),
'episode': title,
'episode_number': int_or_none(e.get('episodeNumber')),
} }

View File

@ -2,25 +2,17 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import base64 import base64
import binascii
import json import json
import os import os
import random
from .common import InfoExtractor from .common import InfoExtractor
from ..aes import aes_cbc_decrypt from ..aes import aes_cbc_decrypt
from ..compat import ( from ..compat import compat_ord
compat_b64decode,
compat_ord,
)
from ..utils import ( from ..utils import (
bytes_to_intlist, bytes_to_intlist,
bytes_to_long,
ExtractorError, ExtractorError,
float_or_none, float_or_none,
intlist_to_bytes, intlist_to_bytes,
long_to_bytes,
pkcs1pad,
srt_subtitles_timecode, srt_subtitles_timecode,
strip_or_none, strip_or_none,
urljoin, urljoin,
@ -41,7 +33,6 @@ class ADNIE(InfoExtractor):
} }
} }
_BASE_URL = 'http://animedigitalnetwork.fr' _BASE_URL = 'http://animedigitalnetwork.fr'
_RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537)
def _get_subtitles(self, sub_path, video_id): def _get_subtitles(self, sub_path, video_id):
if not sub_path: if not sub_path:
@ -49,15 +40,17 @@ class ADNIE(InfoExtractor):
enc_subtitles = self._download_webpage( enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, sub_path), urljoin(self._BASE_URL, sub_path),
video_id, fatal=False) video_id, fatal=False, headers={
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0',
})
if not enc_subtitles: if not enc_subtitles:
return None return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt( dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])), bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
bytes_to_intlist(binascii.unhexlify(self._K + '9032ad7083106400')), bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24])) bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
)) ))
subtitles_json = self._parse_json( subtitles_json = self._parse_json(
dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(), dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(),
@ -112,31 +105,15 @@ class ADNIE(InfoExtractor):
options = player_config.get('options') or {} options = player_config.get('options') or {}
metas = options.get('metas') or {} metas = options.get('metas') or {}
title = metas.get('title') or video_info['title']
links = player_config.get('links') or {} links = player_config.get('links') or {}
sub_path = player_config.get('subtitles')
error = None error = None
if not links: if not links:
links_url = player_config.get('linksurl') or options['videoUrl'] links_url = player_config['linksurl']
token = options['token'] links_data = self._download_json(urljoin(
self._K = ''.join([random.choice('0123456789abcdef') for _ in range(16)]) self._BASE_URL, links_url), video_id)
message = bytes_to_intlist(json.dumps({
'k': self._K,
'e': 60,
't': token,
}))
padded_message = intlist_to_bytes(pkcs1pad(message, 128))
n, e = self._RSA_KEY
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
authorization = base64.b64encode(encrypted_message).decode()
links_data = self._download_json(
urljoin(self._BASE_URL, links_url), video_id, headers={
'Authorization': 'Bearer ' + authorization,
})
links = links_data.get('links') or {} links = links_data.get('links') or {}
metas = metas or links_data.get('meta') or {}
sub_path = (sub_path or links_data.get('subtitles')) + '&token=' + token
error = links_data.get('error') error = links_data.get('error')
title = metas.get('title') or video_info['title']
formats = [] formats = []
for format_id, qualities in links.items(): for format_id, qualities in links.items():
@ -167,7 +144,7 @@ class ADNIE(InfoExtractor):
'description': strip_or_none(metas.get('summary') or video_info.get('resume')), 'description': strip_or_none(metas.get('summary') or video_info.get('resume')),
'thumbnail': video_info.get('image'), 'thumbnail': video_info.get('image'),
'formats': formats, 'formats': formats,
'subtitles': self.extract_subtitles(sub_path, video_id), 'subtitles': self.extract_subtitles(player_config.get('subtitles'), video_id),
'episode': metas.get('subtitle') or video_info.get('videoTitle'), 'episode': metas.get('subtitle') or video_info.get('videoTitle'),
'series': video_info.get('playlistTitle'), 'series': video_info.get('playlistTitle'),
} }

View File

@ -122,8 +122,7 @@ class AENetworksIE(AENetworksBaseIE):
query = { query = {
'mbr': 'true', 'mbr': 'true',
'assetTypes': 'high_video_ak', 'assetTypes': 'high_video_s3'
'switch': 'hls_high_ak',
} }
video_id = self._html_search_meta('aetn:VideoID', webpage) video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex( media_url = self._search_regex(

View File

@ -9,7 +9,6 @@ from ..utils import (
determine_ext, determine_ext,
ExtractorError, ExtractorError,
int_or_none, int_or_none,
urlencode_postdata,
xpath_text, xpath_text,
) )
@ -29,7 +28,6 @@ class AfreecaTVIE(InfoExtractor):
) )
(?P<id>\d+) (?P<id>\d+)
''' '''
_NETRC_MACHINE = 'afreecatv'
_TESTS = [{ _TESTS = [{
'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=', 'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=',
'md5': 'f72c89fe7ecc14c1b5ce506c4996046e', 'md5': 'f72c89fe7ecc14c1b5ce506c4996046e',
@ -141,22 +139,22 @@ class AfreecaTVIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
}, { }, {
# PARTIAL_ADULT # adult video
'url': 'http://vod.afreecatv.com/PLAYER/STATION/32028439', 'url': 'http://vod.afreecatv.com/PLAYER/STATION/26542731',
'info_dict': { 'info_dict': {
'id': '20180327_27901457_202289533_1', 'id': '20171001_F1AE1711_196617479_1',
'ext': 'mp4', 'ext': 'mp4',
'title': '[생]빨개요♥ (part 1)', 'title': '[생]서아 초심 찾기 방송 (part 1)',
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$', 'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': '[SA]서아', 'uploader': 'BJ서아',
'uploader_id': 'bjdyrksu', 'uploader_id': 'bjdyrksu',
'upload_date': '20180327', 'upload_date': '20171001',
'duration': 3601, 'duration': 3600,
'age_limit': 18,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['adult content'],
}, { }, {
'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652', 'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
'only_matching': True, 'only_matching': True,
@ -174,107 +172,25 @@ class AfreecaTVIE(InfoExtractor):
video_key['part'] = int(m.group('part')) video_key['part'] = int(m.group('part'))
return video_key return video_key
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'szWork': 'login',
'szType': 'json',
'szUid': username,
'szPassword': password,
'isSaveId': 'false',
'szScriptVar': 'oLoginRet',
'szAction': '',
}
response = self._download_json(
'https://login.afreecatv.com/app/LoginAction.php', None,
'Logging in', data=urlencode_postdata(login_form))
_ERRORS = {
-4: 'Your account has been suspended due to a violation of our terms and policies.',
-5: 'https://member.afreecatv.com/app/user_delete_progress.php',
-6: 'https://login.afreecatv.com/membership/changeMember.php',
-8: "Hello! AfreecaTV here.\nThe username you have entered belongs to \n an account that requires a legal guardian's consent. \nIf you wish to use our services without restriction, \nplease make sure to go through the necessary verification process.",
-9: 'https://member.afreecatv.com/app/pop_login_block.php',
-11: 'https://login.afreecatv.com/afreeca/second_login.php',
-12: 'https://member.afreecatv.com/app/user_security.php',
0: 'The username does not exist or you have entered the wrong password.',
-1: 'The username does not exist or you have entered the wrong password.',
-3: 'You have entered your username/password incorrectly.',
-7: 'You cannot use your Global AfreecaTV account to access Korean AfreecaTV.',
-10: 'Sorry for the inconvenience. \nYour account has been blocked due to an unauthorized access. \nPlease contact our Help Center for assistance.',
-32008: 'You have failed to log in. Please contact our Help Center.',
}
result = int_or_none(response.get('RESULT'))
if result != 1:
error = _ERRORS.get(result, 'You have failed to log in.')
raise ExtractorError(
'Unable to login: %s said: %s' % (self.IE_NAME, error),
expected=True)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if re.search(r'alert\(["\']This video has been deleted', webpage):
raise ExtractorError(
'Video %s has been deleted' % video_id, expected=True)
station_id = self._search_regex(
r'nStationNo\s*=\s*(\d+)', webpage, 'station')
bbs_id = self._search_regex(
r'nBbsNo\s*=\s*(\d+)', webpage, 'bbs')
video_id = self._search_regex(
r'nTitleNo\s*=\s*(\d+)', webpage, 'title', default=video_id)
partial_view = False
for _ in range(2):
query = {
'nTitleNo': video_id,
'nStationNo': station_id,
'nBbsNo': bbs_id,
}
if partial_view:
query['partialView'] = 'SKIP_ADULT'
video_xml = self._download_xml( video_xml = self._download_xml(
'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php', 'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
video_id, 'Downloading video info XML%s' video_id, query={
% (' (skipping adult)' if partial_view else ''), 'nTitleNo': video_id,
video_id, headers={ 'partialView': 'SKIP_ADULT',
'Referer': url, })
}, query=query)
flag = xpath_text(video_xml, './track/flag', 'flag', default=None) flag = xpath_text(video_xml, './track/flag', 'flag', default=None)
if flag and flag == 'SUCCEED': if flag and flag != 'SUCCEED':
break
if flag == 'PARTIAL_ADULT':
self._downloader.report_warning(
'In accordance with local laws and regulations, underage users are restricted from watching adult content. '
'Only content suitable for all ages will be downloaded. '
'Provide account credentials if you wish to download restricted content.')
partial_view = True
continue
elif flag == 'ADULT':
error = 'Only users older than 19 are able to watch this video. Provide account credentials to download this content.'
else:
error = flag
raise ExtractorError( raise ExtractorError(
'%s said: %s' % (self.IE_NAME, error), expected=True) '%s said: %s' % (self.IE_NAME, flag), expected=True)
else:
raise ExtractorError('Unable to download video info')
video_element = video_xml.findall(compat_xpath('./track/video'))[-1] video_element = video_xml.findall(compat_xpath('./track/video'))[1]
if video_element is None or video_element.text is None: if video_element is None or video_element.text is None:
raise ExtractorError( raise ExtractorError('Specified AfreecaTV video does not exist',
'Video %s video does not exist' % video_id, expected=True) expected=True)
video_url = video_element.text.strip() video_url = video_element.text.strip()

View File

@ -11,7 +11,7 @@ from ..utils import (
class AMCNetworksIE(ThePlatformIE): class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1', 'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '', 'md5': '',
@ -51,9 +51,6 @@ class AMCNetworksIE(ThePlatformIE):
}, { }, {
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3', 'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

0
youtube_dl/extractor/americastestkitchen.py Normal file → Executable file
View File

View File

@ -52,7 +52,7 @@ class AnimeOnDemandIE(InfoExtractor):
}] }]
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return

View File

@ -277,9 +277,7 @@ class AnvatoIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass({ self._initialize_geo_bypass(smuggled_data.get('geo_countries'))
'countries': smuggled_data.get('geo_countries'),
})
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
access_key, video_id = mobj.group('access_key_or_mcp', 'id') access_key, video_id = mobj.group('access_key_or_mcp', 'id')

View File

@ -1,94 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
determine_ext,
js_to_json,
)
class APAIE(InfoExtractor):
_VALID_URL = r'https?://[^/]+\.apa\.at/embed/(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
_TESTS = [{
'url': 'http://uvp.apa.at/embed/293f6d17-692a-44e3-9fd5-7b178f3a1029',
'md5': '2b12292faeb0a7d930c778c7a5b4759b',
'info_dict': {
'id': 'jjv85FdZ',
'ext': 'mp4',
'title': '"Blau ist mysteriös": Die Blue Man Group im Interview',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 254,
'timestamp': 1519211149,
'upload_date': '20180221',
},
}, {
'url': 'https://uvp-apapublisher.sf.apa.at/embed/2f94e9e6-d945-4db2-9548-f9a41ebf7b78',
'only_matching': True,
}, {
'url': 'http://uvp-rma.sf.apa.at/embed/70404cca-2f47-4855-bbb8-20b1fae58f76',
'only_matching': True,
}, {
'url': 'http://uvp-kleinezeitung.sf.apa.at/embed/f1c44979-dba2-4ebf-b021-e4cf2cac3c81',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return [
mobj.group('url')
for mobj in re.finditer(
r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//[^/]+\.apa\.at/embed/[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}.*?)\1',
webpage)]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
jwplatform_id = self._search_regex(
r'media[iI]d\s*:\s*["\'](?P<id>[a-zA-Z0-9]{8})', webpage,
'jwplatform id', default=None)
if jwplatform_id:
return self.url_result(
'jwplatform:' + jwplatform_id, ie='JWPlatform',
video_id=video_id)
sources = self._parse_json(
self._search_regex(
r'sources\s*=\s*(\[.+?\])\s*;', webpage, 'sources'),
video_id, transform_source=js_to_json)
formats = []
for source in sources:
if not isinstance(source, dict):
continue
source_url = source.get('file')
if not source_url or not isinstance(source_url, compat_str):
continue
ext = determine_ext(source_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
source_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
else:
formats.append({
'url': source_url,
})
self._sort_formats(formats)
thumbnail = self._search_regex(
r'image\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
'thumbnail', fatal=False, group='url')
return {
'id': video_id,
'title': video_id,
'thumbnail': thumbnail,
'formats': formats,
}

View File

@ -41,7 +41,7 @@ class ArchiveOrgIE(InfoExtractor):
webpage = self._download_webpage( webpage = self._download_webpage(
'http://archive.org/embed/' + video_id, video_id) 'http://archive.org/embed/' + video_id, video_id)
jwplayer_playlist = self._parse_json(self._search_regex( jwplayer_playlist = self._parse_json(self._search_regex(
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)", r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\);",
webpage, 'jwplayer playlist'), video_id) webpage, 'jwplayer playlist'), video_id)
info = self._parse_jwplayer_data( info = self._parse_jwplayer_data(
{'playlist': jwplayer_playlist}, video_id, base_url=url) {'playlist': jwplayer_playlist}, video_id, base_url=url)

View File

@ -24,30 +24,57 @@ class ARDMediathekIE(InfoExtractor):
_VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?' _VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
_TESTS = [{ _TESTS = [{
# available till 26.07.2022 'url': 'http://www.ardmediathek.de/tv/Dokumentation-und-Reportage/Ich-liebe-das-Leben-trotzdem/rbb-Fernsehen/Video?documentId=29582122&bcastId=3822114',
'url': 'http://www.ardmediathek.de/tv/S%C3%9CDLICHT/Was-ist-die-Kunst-der-Zukunft-liebe-Ann/BR-Fernsehen/Video?bcastId=34633636&documentId=44726822',
'info_dict': { 'info_dict': {
'id': '44726822', 'id': '29582122',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Was ist die Kunst der Zukunft, liebe Anna McCarthy?', 'title': 'Ich liebe das Leben trotzdem',
'description': 'md5:4ada28b3e3b5df01647310e41f3a62f5', 'description': 'md5:45e4c225c72b27993314b31a84a5261c',
'duration': 1740, 'duration': 4557,
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} },
'skip': 'HTTP Error 404: Not Found',
}, {
'url': 'http://www.ardmediathek.de/tv/Tatort/Tatort-Scheinwelten-H%C3%B6rfassung-Video/Das-Erste/Video?documentId=29522730&bcastId=602916',
'md5': 'f4d98b10759ac06c0072bbcd1f0b9e3e',
'info_dict': {
'id': '29522730',
'ext': 'mp4',
'title': 'Tatort: Scheinwelten - Hörfassung (Video tgl. ab 20 Uhr)',
'description': 'md5:196392e79876d0ac94c94e8cdb2875f1',
'duration': 5252,
},
'skip': 'HTTP Error 404: Not Found',
}, { }, {
# audio # audio
'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086', 'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
'only_matching': True, 'md5': '219d94d8980b4f538c7fcb0865eb7f2c',
'info_dict': {
'id': '28488308',
'ext': 'mp3',
'title': 'Tod eines Fußballers',
'description': 'md5:f6e39f3461f0e1f54bfa48c8875c86ef',
'duration': 3240,
},
'skip': 'HTTP Error 404: Not Found',
}, { }, {
'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht', 'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
'only_matching': True, 'only_matching': True,
}, { }, {
# audio # audio
'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158', 'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
'only_matching': True, 'md5': '4e8f00631aac0395fee17368ac0e9867',
'info_dict': {
'id': '30796318',
'ext': 'mp3',
'title': 'Vor dem Fest',
'description': 'md5:c0c1c8048514deaed2a73b3a60eecacb',
'duration': 3287,
},
'skip': 'Video is no longer available',
}] }]
def _extract_media_info(self, media_info_url, webpage, video_id): def _extract_media_info(self, media_info_url, webpage, video_id):
@ -225,23 +252,20 @@ class ARDMediathekIE(InfoExtractor):
class ARDIE(InfoExtractor): class ARDIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html' _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
_TESTS = [{ _TEST = {
# available till 14.02.2019 'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html', 'md5': 'd216c3a86493f9322545e045ddc3eb35',
'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
'info_dict': { 'info_dict': {
'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video', 'display_id': 'die-story-im-ersten-mission-unter-falscher-flagge',
'id': '102', 'id': '100',
'ext': 'mp4', 'ext': 'mp4',
'duration': 4435.0, 'duration': 2600,
'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?', 'title': 'Die Story im Ersten: Mission unter falscher Flagge',
'upload_date': '20180214', 'upload_date': '20140804',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, { 'skip': 'HTTP Error 404: Not Found',
'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html', }
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)

View File

@ -74,7 +74,7 @@ class AtresPlayerIE(InfoExtractor):
self._login() self._login()
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return

View File

@ -5,12 +5,13 @@ from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
sanitized_Request,
) )
class AudiMediaIE(InfoExtractor): class AudiMediaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?:video/)?(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?P<id>[^/?#]+)'
_TESTS = [{ _TEST = {
'url': 'https://www.audi-mediacenter.com/en/audimediatv/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-1467', 'url': 'https://www.audi-mediacenter.com/en/audimediatv/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-1467',
'md5': '79a8b71c46d49042609795ab59779b66', 'md5': '79a8b71c46d49042609795ab59779b66',
'info_dict': { 'info_dict': {
@ -23,46 +24,41 @@ class AudiMediaIE(InfoExtractor):
'duration': 74022, 'duration': 74022,
'view_count': int, 'view_count': int,
} }
}, { }
'url': 'https://www.audi-mediacenter.com/en/audimediatv/video/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-2991', # extracted from https://audimedia.tv/assets/embed/embedded-player.js (dataSourceAuthToken)
'only_matching': True, _AUTH_TOKEN = 'e25b42847dba18c6c8816d5d8ce94c326e06823ebf0859ed164b3ba169be97f2'
}]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
raw_payload = self._search_regex([ raw_payload = self._search_regex([
r'class="amtv-embed"[^>]+id="([0-9a-z-]+)"', r'class="amtv-embed"[^>]+id="([^"]+)"',
r'id="([0-9a-z-]+)"[^>]+class="amtv-embed"', r'class=\\"amtv-embed\\"[^>]+id=\\"([^"]+)\\"',
r'class=\\"amtv-embed\\"[^>]+id=\\"([0-9a-z-]+)\\"',
r'id=\\"([0-9a-z-]+)\\"[^>]+class=\\"amtv-embed\\"',
r'id=(?:\\)?"(amtve-[a-z]-\d+-[a-z]{2})',
], webpage, 'raw payload') ], webpage, 'raw payload')
_, stage_mode, video_id, _ = raw_payload.split('-') _, stage_mode, video_id, lang = raw_payload.split('-')
# TODO: handle s and e stage_mode (live streams and ended live streams) # TODO: handle s and e stage_mode (live streams and ended live streams)
if stage_mode not in ('s', 'e'): if stage_mode not in ('s', 'e'):
video_data = self._download_json( request = sanitized_Request(
'https://www.audimedia.tv/api/video/v1/videos/' + video_id, 'https://audimedia.tv/api/video/v1/videos/%s?embed[]=video_versions&embed[]=thumbnail_image&where[content_language_iso]=%s' % (video_id, lang),
video_id, query={ headers={'X-Auth-Token': self._AUTH_TOKEN})
'embed[]': ['video_versions', 'thumbnail_image'], json_data = self._download_json(request, video_id)['results']
})['results']
formats = [] formats = []
stream_url_hls = video_data.get('stream_url_hls') stream_url_hls = json_data.get('stream_url_hls')
if stream_url_hls: if stream_url_hls:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
stream_url_hls, video_id, 'mp4', stream_url_hls, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)) entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
stream_url_hds = video_data.get('stream_url_hds') stream_url_hds = json_data.get('stream_url_hds')
if stream_url_hds: if stream_url_hds:
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
stream_url_hds + '?hdcore=3.4.0', stream_url_hds + '?hdcore=3.4.0',
video_id, f4m_id='hds', fatal=False)) video_id, f4m_id='hds', fatal=False))
for video_version in video_data.get('video_versions', []): for video_version in json_data.get('video_versions'):
video_version_url = video_version.get('download_url') or video_version.get('stream_url') video_version_url = video_version.get('download_url') or video_version.get('stream_url')
if not video_version_url: if not video_version_url:
continue continue
@ -83,11 +79,11 @@ class AudiMediaIE(InfoExtractor):
return { return {
'id': video_id, 'id': video_id,
'title': video_data['title'], 'title': json_data['title'],
'description': video_data.get('subtitle'), 'description': json_data.get('subtitle'),
'thumbnail': video_data.get('thumbnail_image', {}).get('file'), 'thumbnail': json_data.get('thumbnail_image', {}).get('file'),
'timestamp': parse_iso8601(video_data.get('publication_date')), 'timestamp': parse_iso8601(json_data.get('publication_date')),
'duration': int_or_none(video_data.get('duration')), 'duration': int_or_none(json_data.get('duration')),
'view_count': int_or_none(video_data.get('view_count')), 'view_count': int_or_none(json_data.get('view_count')),
'formats': formats, 'formats': formats,
} }

View File

@ -65,7 +65,7 @@ class AudiomackIE(InfoExtractor):
return {'_type': 'url', 'url': api_response['url'], 'ie_key': 'Soundcloud'} return {'_type': 'url', 'url': api_response['url'], 'ie_key': 'Soundcloud'}
return { return {
'id': compat_str(api_response.get('id', album_url_tag)), 'id': api_response.get('id', album_url_tag),
'uploader': api_response.get('artist'), 'uploader': api_response.get('artist'),
'title': api_response.get('title'), 'title': api_response.get('title'),
'url': api_response['url'], 'url': api_response['url'],

View File

@ -44,7 +44,7 @@ class BambuserIE(InfoExtractor):
} }
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return

View File

@ -12,7 +12,6 @@ from ..utils import (
float_or_none, float_or_none,
get_element_by_class, get_element_by_class,
int_or_none, int_or_none,
js_to_json,
parse_duration, parse_duration,
parse_iso8601, parse_iso8601,
try_get, try_get,
@ -773,17 +772,6 @@ class BBCIE(BBCCoUkIE):
# single video article embedded with data-media-vpid # single video article embedded with data-media-vpid
'url': 'http://www.bbc.co.uk/sport/rowing/35908187', 'url': 'http://www.bbc.co.uk/sport/rowing/35908187',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.bbc.co.uk/bbcthree/clip/73d0bbd0-abc3-4cea-b3c0-cdae21905eb1',
'info_dict': {
'id': 'p06556y7',
'ext': 'mp4',
'title': 'Transfers: Cristiano Ronaldo to Man Utd, Arsenal to spend?',
'description': 'md5:4b7dfd063d5a789a1512e99662be3ddd',
},
'params': {
'skip_download': True,
}
}] }]
@classmethod @classmethod
@ -1006,36 +994,6 @@ class BBCIE(BBCCoUkIE):
'subtitles': subtitles, 'subtitles': subtitles,
} }
bbc3_config = self._parse_json(
self._search_regex(
r'(?s)bbcthreeConfig\s*=\s*({.+?})\s*;\s*<', webpage,
'bbcthree config', default='{}'),
playlist_id, transform_source=js_to_json, fatal=False)
if bbc3_config:
bbc3_playlist = try_get(
bbc3_config, lambda x: x['payload']['content']['bbcMedia']['playlist'],
dict)
if bbc3_playlist:
playlist_title = bbc3_playlist.get('title') or playlist_title
thumbnail = bbc3_playlist.get('holdingImageURL')
entries = []
for bbc3_item in bbc3_playlist['items']:
programme_id = bbc3_item.get('versionID')
if not programme_id:
continue
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
entries.append({
'id': programme_id,
'title': playlist_title,
'thumbnail': thumbnail,
'timestamp': timestamp,
'formats': formats,
'subtitles': subtitles,
})
return self.playlist_result(
entries, playlist_id, playlist_title, playlist_description)
def extract_all(pattern): def extract_all(pattern):
return list(filter(None, map( return list(filter(None, map(
lambda s: self._parse_json(s, playlist_id, fatal=False), lambda s: self._parse_json(s, playlist_id, fatal=False),

View File

@ -12,7 +12,7 @@ class BellMediaIE(InfoExtractor):
(?: (?:
ctv| ctv|
tsn| tsn|
bnn(?:bloomberg)?| bnn|
thecomedynetwork| thecomedynetwork|
discovery| discovery|
discoveryvelocity| discoveryvelocity|
@ -27,16 +27,17 @@ class BellMediaIE(InfoExtractor):
much\.com much\.com
)/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})''' )/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
_TESTS = [{ _TESTS = [{
'url': 'https://www.bnnbloomberg.ca/video/david-cockfield-s-top-picks~1403070', 'url': 'http://www.ctv.ca/video/player?vid=706966',
'md5': '36d3ef559cfe8af8efe15922cd3ce950', 'md5': 'ff2ebbeae0aa2dcc32a830c3fd69b7b0',
'info_dict': { 'info_dict': {
'id': '1403070', 'id': '706966',
'ext': 'flv', 'ext': 'mp4',
'title': 'David Cockfield\'s Top Picks', 'title': 'Larry Day and Richard Jutras on the TIFF red carpet of \'Stonewall\'',
'description': 'md5:810f7f8c6a83ad5b48677c3f8e5bb2c3', 'description': 'etalk catches up with Larry Day and Richard Jutras on the TIFF red carpet of "Stonewall”.',
'upload_date': '20180525', 'upload_date': '20150919',
'timestamp': 1527288600, 'timestamp': 1442624700,
}, },
'expected_warnings': ['HTTP Error 404'],
}, { }, {
'url': 'http://www.thecomedynetwork.ca/video/player?vid=923582', 'url': 'http://www.thecomedynetwork.ca/video/player?vid=923582',
'only_matching': True, 'only_matching': True,
@ -69,7 +70,6 @@ class BellMediaIE(InfoExtractor):
'investigationdiscovery': 'invdisc', 'investigationdiscovery': 'invdisc',
'animalplanet': 'aniplan', 'animalplanet': 'aniplan',
'etalk': 'ctv', 'etalk': 'ctv',
'bnnbloomberg': 'bnn',
} }
def _real_extract(self, url): def _real_extract(self, url):

View File

@ -1,13 +1,11 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import compat_urllib_parse_unquote
compat_b64decode,
compat_urllib_parse_unquote,
)
class BigflixIE(InfoExtractor): class BigflixIE(InfoExtractor):
@ -41,8 +39,8 @@ class BigflixIE(InfoExtractor):
webpage, 'title') webpage, 'title')
def decode_url(quoted_b64_url): def decode_url(quoted_b64_url):
return compat_b64decode(compat_urllib_parse_unquote( return base64.b64decode(compat_urllib_parse_unquote(
quoted_b64_url)).decode('utf-8') quoted_b64_url).encode('ascii')).decode('utf-8')
formats = [] formats = []
for height, encoded_url in re.findall( for height, encoded_url in re.findall(

View File

@ -27,14 +27,14 @@ class BiliBiliIE(InfoExtractor):
_TESTS = [{ _TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/', 'url': 'http://www.bilibili.tv/video/av1074402/',
'md5': '5f7d29e1a2872f3df0cf76b1f87d3788', 'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
'info_dict': { 'info_dict': {
'id': '1074402', 'id': '1074402',
'ext': 'flv', 'ext': 'mp4',
'title': '【金坷垃】金泡沫', 'title': '【金坷垃】金泡沫',
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923', 'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
'duration': 308.067, 'duration': 308.315,
'timestamp': 1398012678, 'timestamp': 1398012660,
'upload_date': '20140420', 'upload_date': '20140420',
'thumbnail': r're:^https?://.+\.jpg', 'thumbnail': r're:^https?://.+\.jpg',
'uploader': '菊子桑', 'uploader': '菊子桑',
@ -59,38 +59,17 @@ class BiliBiliIE(InfoExtractor):
'url': 'http://www.bilibili.com/video/av8903802/', 'url': 'http://www.bilibili.com/video/av8903802/',
'info_dict': { 'info_dict': {
'id': '8903802', 'id': '8903802',
'ext': 'mp4',
'title': '阿滴英文|英文歌分享#6 "Closer', 'title': '阿滴英文|英文歌分享#6 "Closer',
'description': '滴妹今天唱Closer給你聽! 有史以来,被推最多次也是最久的歌曲,其实歌词跟我原本想像差蛮多的,不过还是好听! 微博@阿滴英文', 'description': '滴妹今天唱Closer給你聽! 有史以来,被推最多次也是最久的歌曲,其实歌词跟我原本想像差蛮多的,不过还是好听! 微博@阿滴英文',
},
'playlist': [{
'info_dict': {
'id': '8903802_part1',
'ext': 'flv',
'title': '阿滴英文|英文歌分享#6 "Closer',
'description': 'md5:3b1b9e25b78da4ef87e9b548b88ee76a',
'uploader': '阿滴英文', 'uploader': '阿滴英文',
'uploader_id': '65880958', 'uploader_id': '65880958',
'timestamp': 1488382634, 'timestamp': 1488382620,
'upload_date': '20170301', 'upload_date': '20170301',
}, },
'params': { 'params': {
'skip_download': True, # Test metadata only 'skip_download': True, # Test metadata only
}, },
}, {
'info_dict': {
'id': '8903802_part2',
'ext': 'flv',
'title': '阿滴英文|英文歌分享#6 "Closer',
'description': 'md5:3b1b9e25b78da4ef87e9b548b88ee76a',
'uploader': '阿滴英文',
'uploader_id': '65880958',
'timestamp': 1488382634,
'upload_date': '20170301',
},
'params': {
'skip_download': True, # Test metadata only
},
}]
}] }]
_APP_KEY = '84956560bc028eb7' _APP_KEY = '84956560bc028eb7'
@ -113,12 +92,8 @@ class BiliBiliIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
if 'anime/' not in url: if 'anime/' not in url:
cid = self._search_regex( cid = compat_parse_qs(self._search_regex(
r'cid(?:["\']:|=)(\d+)', webpage, 'cid',
default=None
) or compat_parse_qs(self._search_regex(
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)', [r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
r'EmbedPlayer\([^)]+,\s*\\"([^"]+)\\"\)',
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'], r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
webpage, 'player parameters'))['cid'][0] webpage, 'player parameters'))['cid'][0]
else: else:
@ -127,7 +102,6 @@ class BiliBiliIE(InfoExtractor):
video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id))) video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
headers = { headers = {
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8', 'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Referer': url
} }
headers.update(self.geo_verification_headers()) headers.update(self.geo_verification_headers())
@ -139,31 +113,19 @@ class BiliBiliIE(InfoExtractor):
self._report_error(js) self._report_error(js)
cid = js['result']['cid'] cid = js['result']['cid']
headers = { payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
'Referer': url
}
headers.update(self.geo_verification_headers())
entries = []
RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
for num, rendition in enumerate(RENDITIONS, start=1):
payload = 'appkey=%s&cid=%s&otype=json&%s' % (self._APP_KEY, cid, rendition)
sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest() sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest()
video_info = self._download_json( video_info = self._download_json(
'http://interface.bilibili.com/v2/playurl?%s&sign=%s' % (payload, sign), 'http://interface.bilibili.com/playurl?%s&sign=%s' % (payload, sign),
video_id, note='Downloading video info page', video_id, note='Downloading video info page',
headers=headers, fatal=num == len(RENDITIONS)) headers=self.geo_verification_headers())
if not video_info:
continue
if 'durl' not in video_info: if 'durl' not in video_info:
if num < len(RENDITIONS):
continue
self._report_error(video_info) self._report_error(video_info)
entries = []
for idx, durl in enumerate(video_info['durl']): for idx, durl in enumerate(video_info['durl']):
formats = [{ formats = [{
'url': durl['url'], 'url': durl['url'],
@ -188,17 +150,11 @@ class BiliBiliIE(InfoExtractor):
'duration': float_or_none(durl.get('length'), 1000), 'duration': float_or_none(durl.get('length'), 1000),
'formats': formats, 'formats': formats,
}) })
break
title = self._html_search_regex( title = self._html_search_regex('<h1[^>]*>([^<]+)</h1>', webpage, 'title')
('<h1[^>]+\btitle=(["\'])(?P<title>(?:(?!\1).)+)\1',
'(?s)<h1[^>]*>(?P<title>.+?)</h1>'), webpage, 'title',
group='title')
description = self._html_search_meta('description', webpage) description = self._html_search_meta('description', webpage)
timestamp = unified_timestamp(self._html_search_regex( timestamp = unified_timestamp(self._html_search_regex(
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None))
default=None) or self._html_search_meta(
'uploadDate', webpage, 'timestamp', default=None))
thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage) thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)
# TODO 'view_count' requires deobfuscating Javascript # TODO 'view_count' requires deobfuscating Javascript
@ -212,16 +168,13 @@ class BiliBiliIE(InfoExtractor):
} }
uploader_mobj = re.search( uploader_mobj = re.search(
r'<a[^>]+href="(?:https?:)?//space\.bilibili\.com/(?P<id>\d+)"[^>]*>(?P<name>[^<]+)', r'<a[^>]+href="(?:https?:)?//space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
webpage) webpage)
if uploader_mobj: if uploader_mobj:
info.update({ info.update({
'uploader': uploader_mobj.group('name'), 'uploader': uploader_mobj.group('name'),
'uploader_id': uploader_mobj.group('id'), 'uploader_id': uploader_mobj.group('id'),
}) })
if not info.get('uploader'):
info['uploader'] = self._html_search_meta(
'author', webpage, 'uploader', default=None)
for entry in entries: for entry in entries:
entry.update(info) entry.update(info)

View File

@ -3,13 +3,15 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from .youtube import YoutubeIE
from ..compat import compat_str from ..compat import compat_str
from ..utils import int_or_none from ..utils import (
int_or_none,
parse_age_limit,
)
class BreakIE(InfoExtractor): class BreakIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)' _VALID_URL = r'https?://(?:www\.)?(?P<site>break|screenjunkies)\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056', 'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': { 'info_dict': {
@ -17,73 +19,125 @@ class BreakIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'When Girls Act Like D-Bags', 'title': 'When Girls Act Like D-Bags',
'age_limit': 13, 'age_limit': 13,
}
}, {
'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
'md5': '5c2b686bec3d43de42bde9ec047536b0',
'info_dict': {
'id': '2841915',
'display_id': 'best-quentin-tarantino-movie',
'ext': 'mp4',
'title': 'Best Quentin Tarantino Movie',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3671,
'age_limit': 13,
'tags': list,
}, },
}, { }, {
# youtube embed 'url': 'http://www.screenjunkies.com/video/honest-trailers-the-dark-knight',
'url': 'http://www.break.com/video/someone-forgot-boat-brakes-work',
'info_dict': { 'info_dict': {
'id': 'RrrDLdeL2HQ', 'id': '2348808',
'display_id': 'honest-trailers-the-dark-knight',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Whale Watching Boat Crashing Into San Diego Dock', 'title': 'Honest Trailers - The Dark Knight',
'description': 'md5:afc1b2772f0a8468be51dd80eb021069', 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'upload_date': '20160331', 'age_limit': 10,
'uploader': 'Steve Holden', 'tags': list,
'uploader_id': 'sdholden07', },
}, {
# requires subscription but worked around
'url': 'http://www.screenjunkies.com/video/knocking-dead-ep-1-the-show-so-far-3003285',
'info_dict': {
'id': '3003285',
'display_id': 'knocking-dead-ep-1-the-show-so-far',
'ext': 'mp4',
'title': 'State of The Dead Recap: Knocking Dead Pilot',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3307,
'age_limit': 13,
'tags': list,
}, },
'params': {
'skip_download': True,
}
}, { }, {
'url': 'http://www.break.com/video/ugc/baby-flex-2773063', 'url': 'http://www.break.com/video/ugc/baby-flex-2773063',
'only_matching': True, 'only_matching': True,
}] }]
_DEFAULT_BITRATES = (48, 150, 320, 496, 864, 2240, 3264)
def _real_extract(self, url): def _real_extract(self, url):
display_id, video_id = re.match(self._VALID_URL, url).groups() site, display_id, video_id = re.match(self._VALID_URL, url).groups()
if not video_id:
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
(r'src=["\']/embed/(\d+)', r'data-video-content-id=["\'](\d+)'),
webpage, 'video id')
youtube_url = YoutubeIE._extract_url(webpage) webpage = self._download_webpage(
if youtube_url: 'http://www.%s.com/embed/%s' % (site, video_id),
return self.url_result(youtube_url, ie=YoutubeIE.ie_key()) display_id, 'Downloading video embed page')
embed_vars = self._parse_json(
content = self._parse_json(
self._search_regex( self._search_regex(
r'(?s)content["\']\s*:\s*(\[.+?\])\s*[,\n]', webpage, r'(?s)embedVars\s*=\s*({.+?})\s*</script>', webpage, 'embed vars'),
'content'),
display_id) display_id)
youtube_id = embed_vars.get('youtubeId')
if youtube_id:
return self.url_result(youtube_id, 'Youtube')
title = embed_vars['contentName']
formats = [] formats = []
for video in content: bitrates = []
video_url = video.get('url') for f in embed_vars.get('media', []):
if not video_url or not isinstance(video_url, compat_str): if not f.get('uri') or f.get('mediaPurpose') != 'play':
continue continue
bitrate = int_or_none(self._search_regex( bitrate = int_or_none(f.get('bitRate'))
r'(\d+)_kbps', video_url, 'tbr', default=None)) if bitrate:
bitrates.append(bitrate)
formats.append({ formats.append({
'url': video_url, 'url': f['uri'],
'format_id': 'http-%d' % bitrate if bitrate else 'http', 'format_id': 'http-%d' % bitrate if bitrate else 'http',
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': bitrate, 'tbr': bitrate,
'format': 'mp4',
}) })
if not bitrates:
# When subscriptionLevel > 0, i.e. plus subscription is required
# media list will be empty. However, hds and hls uris are still
# available. We can grab them assuming bitrates to be default.
bitrates = self._DEFAULT_BITRATES
auth_token = embed_vars.get('AuthToken')
def construct_manifest_url(base_url, ext):
pieces = [base_url]
pieces.extend([compat_str(b) for b in bitrates])
pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
return ','.join(pieces)
if bitrates and auth_token:
hds_url = embed_vars.get('hdsUri')
if hds_url:
formats.extend(self._extract_f4m_formats(
construct_manifest_url(hds_url, 'f4m'),
display_id, f4m_id='hds', fatal=False))
hls_url = embed_vars.get('hlsUri')
if hls_url:
formats.extend(self._extract_m3u8_formats(
construct_manifest_url(hls_url, 'm3u8'),
display_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
title = self._search_regex(
(r'title["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
r'<h1[^>]*>(?P<value>[^<]+)'), webpage, 'title', group='value')
def get(key, name):
return int_or_none(self._search_regex(
r'%s["\']\s*:\s*["\'](\d+)' % key, webpage, name,
default=None))
age_limit = get('ratings', 'age limit')
video_id = video_id or get('pid', 'video id') or display_id
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'thumbnail': self._og_search_thumbnail(webpage), 'thumbnail': embed_vars.get('thumbUri'),
'age_limit': age_limit, 'duration': int_or_none(embed_vars.get('videoLengthInSeconds')) or None,
'age_limit': parse_age_limit(embed_vars.get('audienceRating')),
'tags': embed_vars.get('tags', '').split(','),
'formats': formats, 'formats': formats,
} }

View File

@ -564,7 +564,7 @@ class BrightcoveNewIE(AdobePassIE):
return entries return entries
def _parse_brightcove_metadata(self, json_data, video_id, headers={}): def _parse_brightcove_metadata(self, json_data, video_id):
title = json_data['name'].strip() title = json_data['name'].strip()
formats = [] formats = []
@ -638,9 +638,6 @@ class BrightcoveNewIE(AdobePassIE):
self._sort_formats(formats) self._sort_formats(formats)
for f in formats:
f.setdefault('http_headers', {}).update(headers)
subtitles = {} subtitles = {}
for text_track in json_data.get('text_tracks', []): for text_track in json_data.get('text_tracks', []):
if text_track.get('src'): if text_track.get('src'):
@ -669,10 +666,7 @@ class BrightcoveNewIE(AdobePassIE):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass({ self._initialize_geo_bypass(smuggled_data.get('geo_countries'))
'countries': smuggled_data.get('geo_countries'),
'ip_blocks': smuggled_data.get('geo_ip_blocks'),
})
account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups() account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups()
@ -696,17 +690,10 @@ class BrightcoveNewIE(AdobePassIE):
webpage, 'policy key', group='pk') webpage, 'policy key', group='pk')
api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id) api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id)
headers = {
'Accept': 'application/json;pk=%s' % policy_key,
}
referrer = smuggled_data.get('referrer')
if referrer:
headers.update({
'Referer': referrer,
'Origin': re.search(r'https?://[^/]+', referrer).group(0),
})
try: try:
json_data = self._download_json(api_url, video_id, headers=headers) json_data = self._download_json(api_url, video_id, headers={
'Accept': 'application/json;pk=%s' % policy_key
})
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403: if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
json_data = self._parse_json(e.cause.read().decode(), video_id)[0] json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
@ -730,5 +717,4 @@ class BrightcoveNewIE(AdobePassIE):
'tveToken': tve_token, 'tveToken': tve_token,
}) })
return self._parse_brightcove_metadata( return self._parse_brightcove_metadata(json_data, video_id)
json_data, video_id, headers=headers)

View File

@ -1,42 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from .jwplatform import JWPlatformIE
class BusinessInsiderIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?businessinsider\.(?:com|nl)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://uk.businessinsider.com/how-much-radiation-youre-exposed-to-in-everyday-life-2016-6',
'md5': 'ca237a53a8eb20b6dc5bd60564d4ab3e',
'info_dict': {
'id': 'hZRllCfw',
'ext': 'mp4',
'title': "Here's how much radiation you're exposed to in everyday life",
'description': 'md5:9a0d6e2c279948aadaa5e84d6d9b99bd',
'upload_date': '20170709',
'timestamp': 1499606400,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.businessinsider.nl/5-scientifically-proven-things-make-you-less-attractive-2017-7/',
'only_matching': True,
}, {
'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
jwplatform_id = self._search_regex(
(r'data-media-id=["\']([a-zA-Z0-9]{8})',
r'id=["\']jwplayer_([a-zA-Z0-9]{8})',
r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})'),
webpage, 'jwplatform id')
return self.url_result(
'jwplatform:%s' % jwplatform_id, ie=JWPlatformIE.ie_key(),
video_id=video_id)

View File

@ -1,96 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
)
class CamModelsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cammodels\.com/cam/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.cammodels.com/cam/AutumnKnight/',
'only_matching': True,
}]
def _real_extract(self, url):
user_id = self._match_id(url)
webpage = self._download_webpage(
url, user_id, headers=self.geo_verification_headers())
manifest_root = self._html_search_regex(
r'manifestUrlRoot=([^&\']+)', webpage, 'manifest', default=None)
if not manifest_root:
ERRORS = (
("I'm offline, but let's stay connected", 'This user is currently offline'),
('in a private show', 'This user is in a private show'),
('is currently performing LIVE', 'This model is currently performing live'),
)
for pattern, message in ERRORS:
if pattern in webpage:
error = message
expected = True
break
else:
error = 'Unable to find manifest URL root'
expected = False
raise ExtractorError(error, expected=expected)
manifest = self._download_json(
'%s%s.json' % (manifest_root, user_id), user_id)
formats = []
for format_id, format_dict in manifest['formats'].items():
if not isinstance(format_dict, dict):
continue
encodings = format_dict.get('encodings')
if not isinstance(encodings, list):
continue
vcodec = format_dict.get('videoCodec')
acodec = format_dict.get('audioCodec')
for media in encodings:
if not isinstance(media, dict):
continue
media_url = media.get('location')
if not media_url or not isinstance(media_url, compat_str):
continue
format_id_list = [format_id]
height = int_or_none(media.get('videoHeight'))
if height is not None:
format_id_list.append('%dp' % height)
f = {
'url': media_url,
'format_id': '-'.join(format_id_list),
'width': int_or_none(media.get('videoWidth')),
'height': height,
'vbr': int_or_none(media.get('videoKbps')),
'abr': int_or_none(media.get('audioKbps')),
'fps': int_or_none(media.get('fps')),
'vcodec': vcodec,
'acodec': acodec,
}
if 'rtmp' in format_id:
f['ext'] = 'flv'
elif 'hls' in format_id:
f.update({
'ext': 'mp4',
# hls skips fragments, preferring rtmp
'preference': -1,
})
else:
continue
formats.append(f)
self._sort_formats(formats)
return {
'id': user_id,
'title': self._live_title(user_id),
'is_live': True,
'formats': formats,
}

View File

@ -1,69 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_timestamp,
)
class CamTubeIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|api)\.)?camtube\.co/recordings?/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://camtube.co/recording/minafay-030618-1136-chaturbate-female',
'info_dict': {
'id': '42ad3956-dd5b-445a-8313-803ea6079fac',
'display_id': 'minafay-030618-1136-chaturbate-female',
'ext': 'mp4',
'title': 'minafay-030618-1136-chaturbate-female',
'duration': 1274,
'timestamp': 1528018608,
'upload_date': '20180603',
},
'params': {
'skip_download': True,
},
}]
_API_BASE = 'https://api.camtube.co'
def _real_extract(self, url):
display_id = self._match_id(url)
token = self._download_json(
'%s/rpc/session/new' % self._API_BASE, display_id,
'Downloading session token')['token']
self._set_cookie('api.camtube.co', 'session', token)
video = self._download_json(
'%s/recordings/%s' % (self._API_BASE, display_id), display_id,
headers={'Referer': url})
video_id = video['uuid']
timestamp = unified_timestamp(video.get('createdAt'))
duration = int_or_none(video.get('duration'))
view_count = int_or_none(video.get('viewCount'))
like_count = int_or_none(video.get('likeCount'))
creator = video.get('stageName')
formats = [{
'url': '%s/recordings/%s/manifest.m3u8'
% (self._API_BASE, video_id),
'format_id': 'hls',
'ext': 'mp4',
'protocol': 'm3u8_native',
}]
return {
'id': video_id,
'display_id': display_id,
'title': display_id,
'timestamp': timestamp,
'duration': duration,
'view_count': view_count,
'like_count': like_count,
'creator': creator,
'formats': formats,
}

View File

@ -31,10 +31,6 @@ class Canalc2IE(InfoExtractor):
webpage = self._download_webpage( webpage = self._download_webpage(
'http://www.canalc2.tv/video/%s' % video_id, video_id) 'http://www.canalc2.tv/video/%s' % video_id, video_id)
title = self._html_search_regex(
r'(?s)class="[^"]*col_description[^"]*">.*?<h3>(.+?)</h3>',
webpage, 'title')
formats = [] formats = []
for _, video_url in re.findall(r'file\s*=\s*(["\'])(.+?)\1', webpage): for _, video_url in re.findall(r'file\s*=\s*(["\'])(.+?)\1', webpage):
if video_url.startswith('rtmp://'): if video_url.startswith('rtmp://'):
@ -53,21 +49,17 @@ class Canalc2IE(InfoExtractor):
'url': video_url, 'url': video_url,
'format_id': 'http', 'format_id': 'http',
}) })
self._sort_formats(formats)
if formats: title = self._html_search_regex(
info = { r'(?s)class="[^"]*col_description[^"]*">.*?<h3>(.*?)</h3>', webpage, 'title')
'formats': formats, duration = parse_duration(self._search_regex(
} r'id=["\']video_duree["\'][^>]*>([^<]+)',
else: webpage, 'duration', fatal=False))
info = self._parse_html5_media_entries(url, webpage, url)[0]
self._sort_formats(info['formats']) return {
info.update({
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'duration': parse_duration(self._search_regex( 'duration': duration,
r'id=["\']video_duree["\'][^>]*>([^<]+)', 'formats': formats,
webpage, 'duration', fatal=False)), }
})
return info

View File

@ -4,36 +4,59 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
from ..utils import ( from ..utils import (
dict_get,
# ExtractorError, # ExtractorError,
# HEADRequest, # HEADRequest,
int_or_none, int_or_none,
qualities, qualities,
remove_end,
unified_strdate, unified_strdate,
) )
class CanalplusIE(InfoExtractor): class CanalplusIE(InfoExtractor):
IE_DESC = 'mycanal.fr and piwiplus.fr' IE_DESC = 'canalplus.fr, piwiplus.fr and d8.tv'
_VALID_URL = r'https?://(?:www\.)?(?P<site>mycanal|piwiplus)\.fr/(?:[^/]+/)*(?P<display_id>[^?/]+)(?:\.html\?.*\bvid=|/p/)(?P<id>\d+)' _VALID_URL = r'''(?x)
https?://
(?:
(?:
(?:(?:www|m)\.)?canalplus\.fr|
(?:www\.)?piwiplus\.fr|
(?:www\.)?d8\.tv|
(?:www\.)?c8\.fr|
(?:www\.)?d17\.tv|
(?:(?:football|www)\.)?cstar\.fr|
(?:www\.)?itele\.fr
)/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
player\.canalplus\.fr/#/(?P<id>\d+)
)
'''
_VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s?format=json' _VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s?format=json'
_SITE_ID_MAP = { _SITE_ID_MAP = {
'mycanal': 'cplus', 'canalplus': 'cplus',
'piwiplus': 'teletoon', 'piwiplus': 'teletoon',
'd8': 'd8',
'c8': 'd8',
'd17': 'd17',
'cstar': 'd17',
'itele': 'itele',
} }
# Only works for direct mp4 URLs # Only works for direct mp4 URLs
_GEO_COUNTRIES = ['FR'] _GEO_COUNTRIES = ['FR']
_TESTS = [{ _TESTS = [{
'url': 'https://www.mycanal.fr/d17-emissions/lolywood/p/1397061', 'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1192814',
'info_dict': { 'info_dict': {
'id': '1397061', 'id': '1405510',
'display_id': 'lolywood', 'display_id': 'pid1830-c-zapping',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Euro 2016 : Je préfère te prévenir - Lolywood - Episode 34', 'title': 'Zapping - 02/07/2016',
'description': 'md5:7d97039d455cb29cdba0d652a0efaa5e', 'description': 'Le meilleur de toutes les chaînes, tous les jours',
'upload_date': '20160602', 'upload_date': '20160702',
}, },
}, { }, {
# geo restricted, bypassed # geo restricted, bypassed
@ -47,12 +70,64 @@ class CanalplusIE(InfoExtractor):
'upload_date': '20140724', 'upload_date': '20140724',
}, },
'expected_warnings': ['HTTP Error 403: Forbidden'], 'expected_warnings': ['HTTP Error 403: Forbidden'],
}, {
# geo restricted, bypassed
'url': 'http://www.c8.fr/c8-divertissement/ms-touche-pas-a-mon-poste/pid6318-videos-integrales.html?vid=1443684',
'md5': 'bb6f9f343296ab7ebd88c97b660ecf8d',
'info_dict': {
'id': '1443684',
'display_id': 'pid6318-videos-integrales',
'ext': 'mp4',
'title': 'Guess my iep ! - TPMP - 07/04/2017',
'description': 'md5:6f005933f6e06760a9236d9b3b5f17fa',
'upload_date': '20170407',
},
'expected_warnings': ['HTTP Error 403: Forbidden'],
}, {
'url': 'http://www.itele.fr/chroniques/invite-michael-darmon/rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
'info_dict': {
'id': '1420176',
'display_id': 'rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
'ext': 'mp4',
'title': 'L\'invité de Michaël Darmon du 14/10/2016 - ',
'description': 'Chaque matin du lundi au vendredi, Michaël Darmon reçoit un invité politique à 8h25.',
'upload_date': '20161014',
},
}, {
'url': 'http://football.cstar.fr/cstar-minisite-foot/pid7566-feminines-videos.html?vid=1416769',
'info_dict': {
'id': '1416769',
'display_id': 'pid7566-feminines-videos',
'ext': 'mp4',
'title': 'France - Albanie : les temps forts de la soirée - 20/09/2016',
'description': 'md5:c3f30f2aaac294c1c969b3294de6904e',
'upload_date': '20160921',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://m.canalplus.fr/?vid=1398231',
'only_matching': True,
}, {
'url': 'http://www.d17.tv/emissions/pid8303-lolywood.html?vid=1397061',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
site, display_id, video_id = re.match(self._VALID_URL, url).groups() mobj = re.match(self._VALID_URL, url)
site_id = self._SITE_ID_MAP[site] site_id = self._SITE_ID_MAP[compat_urllib_parse_urlparse(url).netloc.rsplit('.', 2)[-2]]
# Beware, some subclasses do not define an id group
display_id = remove_end(dict_get(mobj.groupdict(), ('display_id', 'id', 'vid')), '.html')
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
[r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
r'id=["\']canal_video_player(?P<id>\d+)',
r'data-video=["\'](?P<id>\d+)'],
webpage, 'video id', default=mobj.group('vid'), group='id')
info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id) info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)
video_data = self._download_json(info_url, video_id, 'Downloading video JSON') video_data = self._download_json(info_url, video_id, 'Downloading video JSON')
@ -86,7 +161,7 @@ class CanalplusIE(InfoExtractor):
format_url + '?hdcore=2.11.3', video_id, f4m_id=format_id, fatal=False)) format_url + '?hdcore=2.11.3', video_id, f4m_id=format_id, fatal=False))
else: else:
formats.append({ formats.append({
# the secret extracted from ya function in http://player.canalplus.fr/common/js/canalPlayer.js # the secret extracted ya function in http://player.canalplus.fr/common/js/canalPlayer.js
'url': format_url + '?secret=pqzerjlsmdkjfoiuerhsdlfknaes', 'url': format_url + '?secret=pqzerjlsmdkjfoiuerhsdlfknaes',
'format_id': format_id, 'format_id': format_id,
'preference': preference(format_id), 'preference': preference(format_id),

View File

@ -246,7 +246,7 @@ class VrtNUIE(GigyaBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage, urlh = self._download_webpage_handle(url, display_id) webpage = self._download_webpage(url, display_id)
title = self._html_search_regex( title = self._html_search_regex(
r'(?ms)<h1 class="content__heading">(.+?)</h1>', r'(?ms)<h1 class="content__heading">(.+?)</h1>',
@ -276,7 +276,7 @@ class VrtNUIE(GigyaBaseIE):
webpage, 'release_date', default=None)) webpage, 'release_date', default=None))
# If there's a ? or a # in the URL, remove them and everything after # If there's a ? or a # in the URL, remove them and everything after
clean_url = urlh.geturl().split('?')[0].split('#')[0].strip('/') clean_url = url.split('?')[0].split('#')[0].strip('/')
securevideo_url = clean_url + '.mssecurevideo.json' securevideo_url = clean_url + '.mssecurevideo.json'
try: try:

View File

@ -1,14 +1,10 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import compat_str
compat_str,
compat_HTTPError,
)
from ..utils import ( from ..utils import (
js_to_json, js_to_json,
smuggle_url, smuggle_url,
@ -17,11 +13,8 @@ from ..utils import (
xpath_element, xpath_element,
xpath_with_ns, xpath_with_ns,
find_xpath_attr, find_xpath_attr,
orderedSet,
parse_duration,
parse_iso8601, parse_iso8601,
parse_age_limit, parse_age_limit,
strip_or_none,
int_or_none, int_or_none,
ExtractorError, ExtractorError,
) )
@ -131,23 +124,15 @@ class CBCIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
title = self._og_search_title(webpage, default=None) or self._html_search_meta(
'twitter:title', webpage, 'title', default=None) or self._html_search_regex(
r'<title>([^<]+)</title>', webpage, 'title', fatal=False)
entries = [ entries = [
self._extract_player_init(player_init, display_id) self._extract_player_init(player_init, display_id)
for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)] for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
media_ids = []
for media_id_re in (
r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"',
r'<div[^>]+\bid=["\']player-(\d+)',
r'guid["\']\s*:\s*["\'](\d+)'):
media_ids.extend(re.findall(media_id_re, webpage))
entries.extend([ entries.extend([
self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
for media_id in orderedSet(media_ids)]) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)])
return self.playlist_result( return self.playlist_result(
entries, display_id, strip_or_none(title), entries, display_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage)) self._og_search_description(webpage))
@ -219,41 +204,23 @@ class CBCWatchBaseIE(InfoExtractor):
def _call_api(self, path, video_id): def _call_api(self, path, video_id):
url = path if path.startswith('http') else self._API_BASE_URL + path url = path if path.startswith('http') else self._API_BASE_URL + path
for _ in range(2):
try:
result = self._download_xml(url, video_id, headers={ result = self._download_xml(url, video_id, headers={
'X-Clearleap-DeviceId': self._device_id, 'X-Clearleap-DeviceId': self._device_id,
'X-Clearleap-DeviceToken': self._device_token, 'X-Clearleap-DeviceToken': self._device_token,
}) })
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
# Device token has expired, re-acquiring device token
self._register_device()
continue
raise
error_message = xpath_text(result, 'userMessage') or xpath_text(result, 'systemMessage') error_message = xpath_text(result, 'userMessage') or xpath_text(result, 'systemMessage')
if error_message: if error_message:
raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message)) raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message))
return result return result
def _real_initialize(self): def _real_initialize(self):
if self._valid_device_token(): if not self._device_id or not self._device_token:
return
device = self._downloader.cache.load('cbcwatch', 'device') or {} device = self._downloader.cache.load('cbcwatch', 'device') or {}
self._device_id, self._device_token = device.get('id'), device.get('token') self._device_id, self._device_token = device.get('id'), device.get('token')
if self._valid_device_token(): if not self._device_id or not self._device_token:
return
self._register_device()
def _valid_device_token(self):
return self._device_id and self._device_token
def _register_device(self):
self._device_id = self._device_token = None
result = self._download_xml( result = self._download_xml(
self._API_BASE_URL + 'device/register', self._API_BASE_URL + 'device/register',
None, 'Acquiring device token', None, data=b'<device><type>web</type></device>')
data=b'<device><type>web</type></device>')
self._device_id = xpath_text(result, 'deviceId', fatal=True) self._device_id = xpath_text(result, 'deviceId', fatal=True)
self._device_token = xpath_text(result, 'deviceToken', fatal=True) self._device_token = xpath_text(result, 'deviceToken', fatal=True)
self._downloader.cache.store( self._downloader.cache.store(
@ -392,63 +359,3 @@ class CBCWatchIE(CBCWatchBaseIE):
video_id = self._match_id(url) video_id = self._match_id(url)
rss = self._call_api('web/browse/' + video_id, video_id) rss = self._call_api('web/browse/' + video_id, video_id)
return self._parse_rss_feed(rss) return self._parse_rss_feed(rss)
class CBCOlympicsIE(InfoExtractor):
IE_NAME = 'cbc.ca:olympics'
_VALID_URL = r'https?://olympics\.cbc\.ca/video/[^/]+/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://olympics.cbc.ca/video/whats-on-tv/olympic-morning-featuring-the-opening-ceremony/',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._hidden_inputs(webpage)['videoId']
video_doc = self._download_xml(
'https://olympics.cbc.ca/videodata/%s.xml' % video_id, video_id)
title = xpath_text(video_doc, 'title', fatal=True)
is_live = xpath_text(video_doc, 'kind') == 'Live'
if is_live:
title = self._live_title(title)
formats = []
for video_source in video_doc.findall('videoSources/videoSource'):
uri = xpath_text(video_source, 'uri')
if not uri:
continue
tokenize = self._download_json(
'https://olympics.cbc.ca/api/api-akamai/tokenize',
video_id, data=json.dumps({
'VideoSource': uri,
}).encode(), headers={
'Content-Type': 'application/json',
'Referer': url,
# d3.VideoPlayer._init in https://olympics.cbc.ca/components/script/base.js
'Cookie': '_dvp=TK:C0ObxjerU', # AKAMAI CDN cookie
}, fatal=False)
if not tokenize:
continue
content_url = tokenize['ContentUrl']
video_source_format = video_source.get('format')
if video_source_format == 'IIS':
formats.extend(self._extract_ism_formats(
content_url, video_id, ism_id=video_source_format, fatal=False))
else:
formats.extend(self._extract_m3u8_formats(
content_url, video_id, 'mp4',
'm3u8' if is_live else 'm3u8_native',
m3u8_id=video_source_format, fatal=False))
self._sort_formats(formats)
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': xpath_text(video_doc, 'description'),
'thumbnail': xpath_text(video_doc, 'thumbnailUrl'),
'duration': parse_duration(xpath_text(video_doc, 'duration')),
'formats': formats,
'is_live': is_live,
}

View File

@ -2,7 +2,6 @@ from __future__ import unicode_literals
from .theplatform import ThePlatformFeedIE from .theplatform import ThePlatformFeedIE
from ..utils import ( from ..utils import (
ExtractorError,
int_or_none, int_or_none,
find_xpath_attr, find_xpath_attr,
xpath_element, xpath_element,
@ -62,10 +61,9 @@ class CBSIE(CBSBaseIE):
asset_types = [] asset_types = []
subtitles = {} subtitles = {}
formats = [] formats = []
last_e = None
for item in items_data.findall('.//item'): for item in items_data.findall('.//item'):
asset_type = xpath_text(item, 'assetType') asset_type = xpath_text(item, 'assetType')
if not asset_type or asset_type in asset_types or asset_type in ('HLS_FPS', 'DASH_CENC'): if not asset_type or asset_type in asset_types:
continue continue
asset_types.append(asset_type) asset_types.append(asset_type)
query = { query = {
@ -76,17 +74,11 @@ class CBSIE(CBSBaseIE):
query['formats'] = 'MPEG4,M3U' query['formats'] = 'MPEG4,M3U'
elif asset_type in ('RTMP', 'WIFI', '3G'): elif asset_type in ('RTMP', 'WIFI', '3G'):
query['formats'] = 'MPEG4,FLV' query['formats'] = 'MPEG4,FLV'
try:
tp_formats, tp_subtitles = self._extract_theplatform_smil( tp_formats, tp_subtitles = self._extract_theplatform_smil(
update_url_query(tp_release_url, query), content_id, update_url_query(tp_release_url, query), content_id,
'Downloading %s SMIL data' % asset_type) 'Downloading %s SMIL data' % asset_type)
except ExtractorError as e:
last_e = e
continue
formats.extend(tp_formats) formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles) subtitles = self._merge_subtitles(subtitles, tp_subtitles)
if last_e and not formats:
raise last_e
self._sort_formats(formats) self._sort_formats(formats)
info = self._extract_theplatform_metadata(tp_path, content_id) info = self._extract_theplatform_metadata(tp_path, content_id)

View File

@ -75,10 +75,10 @@ class CBSInteractiveIE(CBSIE):
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
data_json = self._html_search_regex( data_json = self._html_search_regex(
r"data(?:-(?:cnet|zdnet))?-video(?:-(?:uvp(?:js)?|player))?-options='([^']+)'", r"data-(?:cnet|zdnet)-video(?:-uvp(?:js)?)?-options='([^']+)'",
webpage, 'data json') webpage, 'data json')
data = self._parse_json(data_json, display_id) data = self._parse_json(data_json, display_id)
vdata = data.get('video') or (data.get('videos') or data.get('playlist'))[0] vdata = data.get('video') or data['videos'][0]
video_id = vdata['mpxRefId'] video_id = vdata['mpxRefId']

View File

@ -4,35 +4,28 @@ from .cbs import CBSBaseIE
class CBSSportsIE(CBSBaseIE): class CBSSportsIE(CBSBaseIE):
_VALID_URL = r'https?://(?:www\.)?cbssports\.com/[^/]+/(?:video|news)/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?cbssports\.com/video/player/[^/]+/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.cbssports.com/nba/video/donovan-mitchell-flashes-star-potential-in-game-2-victory-over-thunder/', 'url': 'http://www.cbssports.com/video/player/videos/708337219968/0/ben-simmons-the-next-lebron?-not-so-fast',
'info_dict': { 'info_dict': {
'id': '1214315075735', 'id': '708337219968',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Donovan Mitchell flashes star potential in Game 2 victory over Thunder', 'title': 'Ben Simmons the next LeBron? Not so fast',
'description': 'md5:df6f48622612c2d6bd2e295ddef58def', 'description': 'md5:854294f627921baba1f4b9a990d87197',
'timestamp': 1524111457, 'timestamp': 1466293740,
'upload_date': '20180419', 'upload_date': '20160618',
'uploader': 'CBSI-NEW', 'uploader': 'CBSI-NEW',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} }
}, {
'url': 'https://www.cbssports.com/nba/news/nba-playoffs-2018-watch-76ers-vs-heat-game-3-series-schedule-tv-channel-online-stream/',
'only_matching': True,
}] }]
def _extract_video_info(self, filter_query, video_id): def _extract_video_info(self, filter_query, video_id):
return self._extract_feed_info('dJ5BDC', 'VxxJg8Ymh8sE', filter_query, video_id) return self._extract_feed_info('dJ5BDC', 'VxxJg8Ymh8sE', filter_query, video_id)
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
[r'(?:=|%26)pcid%3D(\d+)', r'embedVideo(?:Container)?_(\d+)'],
webpage, 'video id')
return self._extract_video_info('byId=%s' % video_id, video_id) return self._extract_video_info('byId=%s' % video_id, video_id)

View File

@ -4,13 +4,11 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
clean_html,
int_or_none, int_or_none,
parse_duration, parse_duration,
parse_iso8601, parse_iso8601,
parse_resolution, clean_html,
) )
@ -42,42 +40,34 @@ class CCMAIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
media_type, media_id = re.match(self._VALID_URL, url).groups() media_type, media_id = re.match(self._VALID_URL, url).groups()
media_data = {}
media = self._download_json( formats = []
'http://dinamics.ccma.cat/pvideo/media.jsp', media_id, query={ profiles = ['pc'] if media_type == 'audio' else ['mobil', 'pc']
for i, profile in enumerate(profiles):
md = self._download_json('http://dinamics.ccma.cat/pvideo/media.jsp', media_id, query={
'media': media_type, 'media': media_type,
'idint': media_id, 'idint': media_id,
}) 'profile': profile,
}, fatal=False)
formats = [] if md:
media_url = media['media']['url'] media_data = md
if isinstance(media_url, list): media_url = media_data.get('media', {}).get('url')
for format_ in media_url: if media_url:
format_url = format_.get('file')
if not format_url or not isinstance(format_url, compat_str):
continue
label = format_.get('label')
f = parse_resolution(label)
f.update({
'url': format_url,
'format_id': label,
})
formats.append(f)
else:
formats.append({ formats.append({
'format_id': profile,
'url': media_url, 'url': media_url,
'vcodec': 'none' if media_type == 'audio' else None, 'quality': i,
}) })
self._sort_formats(formats) self._sort_formats(formats)
informacio = media['informacio'] informacio = media_data['informacio']
title = informacio['titol'] title = informacio['titol']
durada = informacio.get('durada', {}) durada = informacio.get('durada', {})
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text')) duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc')) timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
subtitles = {} subtitles = {}
subtitols = media.get('subtitols', {}) subtitols = media_data.get('subtitols', {})
if subtitols: if subtitols:
sub_url = subtitols.get('url') sub_url = subtitols.get('url')
if sub_url: if sub_url:
@ -87,7 +77,7 @@ class CCMAIE(InfoExtractor):
}) })
thumbnails = [] thumbnails = []
imatges = media.get('imatges', {}) imatges = media_data.get('imatges', {})
if imatges: if imatges:
thumbnail_url = imatges.get('url') thumbnail_url = imatges.get('url')
if thumbnail_url: if thumbnail_url:

0
youtube_dl/extractor/cda.py Normal file → Executable file
View File

View File

@ -13,7 +13,6 @@ from ..utils import (
float_or_none, float_or_none,
sanitized_Request, sanitized_Request,
unescapeHTML, unescapeHTML,
update_url_query,
urlencode_postdata, urlencode_postdata,
USER_AGENTS, USER_AGENTS,
) )
@ -266,10 +265,6 @@ class CeskaTelevizePoradyIE(InfoExtractor):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
}, {
# iframe embed
'url': 'http://www.ceskatelevize.cz/porady/10614999031-neviditelni/21251212048/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -277,11 +272,8 @@ class CeskaTelevizePoradyIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
data_url = update_url_query(unescapeHTML(self._search_regex( data_url = unescapeHTML(self._search_regex(
(r'<span[^>]*\bdata-url=(["\'])(?P<url>(?:(?!\1).)+)\1', r'<span[^>]*\bdata-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?ceskatelevize\.cz/ivysilani/embed/iFramePlayer\.php.*?)\1'), webpage, 'iframe player url', group='url'))
webpage, 'iframe player url', group='url')), query={
'autoStart': 'true',
})
return self.url_result(data_url, ie=CeskaTelevizeIE.ie_key()) return self.url_result(data_url, ie=CeskaTelevizeIE.ie_key())

View File

@ -31,8 +31,7 @@ class ChaturbateIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage( webpage = self._download_webpage(url, video_id)
url, video_id, headers=self.geo_verification_headers())
m3u8_urls = [] m3u8_urls = []

View File

@ -1,11 +1,11 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import re import re
import base64
import json import json
from .common import InfoExtractor from .common import InfoExtractor
from .youtube import YoutubeIE from .youtube import YoutubeIE
from ..compat import compat_b64decode
from ..utils import ( from ..utils import (
clean_html, clean_html,
ExtractorError ExtractorError
@ -58,7 +58,7 @@ class ChilloutzoneIE(InfoExtractor):
base64_video_info = self._html_search_regex( base64_video_info = self._html_search_regex(
r'var cozVidData = "(.+?)";', webpage, 'video data') r'var cozVidData = "(.+?)";', webpage, 'video data')
decoded_video_info = compat_b64decode(base64_video_info).decode('utf-8') decoded_video_info = base64.b64decode(base64_video_info.encode('utf-8')).decode('utf-8')
video_info_dict = json.loads(decoded_video_info) video_info_dict = json.loads(decoded_video_info)
# get video information from dict # get video information from dict

View File

@ -1,10 +1,10 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_b64decode
from ..utils import parse_duration from ..utils import parse_duration
@ -44,7 +44,8 @@ class ChirbitIE(InfoExtractor):
# Reverse engineered from https://chirb.it/js/chirbit.player.js (look # Reverse engineered from https://chirb.it/js/chirbit.player.js (look
# for soundURL) # for soundURL)
audio_url = compat_b64decode(data_fd[::-1]).decode('utf-8') audio_url = base64.b64decode(
data_fd[::-1].encode('ascii')).decode('utf-8')
title = self._search_regex( title = self._search_regex(
r'class=["\']chirbit-title["\'][^>]*>([^<]+)', webpage, 'title') r'class=["\']chirbit-title["\'][^>]*>([^<]+)', webpage, 'title')

View File

@ -1,60 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class CloudflareStreamIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
(?:watch\.)?cloudflarestream\.com/|
embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=
)
(?P<id>[\da-f]+)
'''
_TESTS = [{
'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717',
'info_dict': {
'id': '31c9291ab41fac05471db4e73aa11717',
'ext': 'mp4',
'title': '31c9291ab41fac05471db4e73aa11717',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://watch.cloudflarestream.com/9df17203414fd1db3e3ed74abbe936c1',
'only_matching': True,
}, {
'url': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return [
mobj.group('url')
for mobj in re.finditer(
r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=[\da-f]+?.*?)\1',
webpage)]
def _real_extract(self, url):
video_id = self._match_id(url)
formats = self._extract_m3u8_formats(
'https://cloudflarestream.com/%s/manifest/video.m3u8' % video_id,
video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls',
fatal=False)
formats.extend(self._extract_mpd_formats(
'https://cloudflarestream.com/%s/manifest/video.mpd' % video_id,
video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
return {
'id': video_id,
'title': video_id,
'formats': formats,
}

View File

@ -174,8 +174,6 @@ class InfoExtractor(object):
width : height ratio as float. width : height ratio as float.
* no_resume The server does not support resuming the * no_resume The server does not support resuming the
(HTTP or RTMP) download. Boolean. (HTTP or RTMP) download. Boolean.
* downloader_options A dictionary of downloader options as
described in FileDownloader
url: Final video URL. url: Final video URL.
ext: Video filename extension. ext: Video filename extension.
@ -339,17 +337,15 @@ class InfoExtractor(object):
_GEO_BYPASS attribute may be set to False in order to disable _GEO_BYPASS attribute may be set to False in order to disable
geo restriction bypass mechanisms for a particular extractor. geo restriction bypass mechanisms for a particular extractor.
Though it won't disable explicit geo restriction bypass based on Though it won't disable explicit geo restriction bypass based on
country code provided with geo_bypass_country. country code provided with geo_bypass_country. (experimental)
_GEO_COUNTRIES attribute may contain a list of presumably geo unrestricted _GEO_COUNTRIES attribute may contain a list of presumably geo unrestricted
countries for this extractor. One of these countries will be used by countries for this extractor. One of these countries will be used by
geo restriction bypass mechanism right away in order to bypass geo restriction bypass mechanism right away in order to bypass
geo restriction, of course, if the mechanism is not disabled. geo restriction, of course, if the mechanism is not disabled. (experimental)
_GEO_IP_BLOCKS attribute may contain a list of presumably geo unrestricted NB: both these geo attributes are experimental and may change in future
IP blocks in CIDR notation for this extractor. One of these IP blocks or be completely removed.
will be used by geo restriction bypass mechanism similarly
to _GEO_COUNTRIES.
Finally, the _WORKING attribute should be set to False for broken IEs Finally, the _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests. in order to warn the users and skip the tests.
@ -360,7 +356,6 @@ class InfoExtractor(object):
_x_forwarded_for_ip = None _x_forwarded_for_ip = None
_GEO_BYPASS = True _GEO_BYPASS = True
_GEO_COUNTRIES = None _GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None
_WORKING = True _WORKING = True
def __init__(self, downloader=None): def __init__(self, downloader=None):
@ -395,15 +390,12 @@ class InfoExtractor(object):
def initialize(self): def initialize(self):
"""Initializes an instance (authentication, etc).""" """Initializes an instance (authentication, etc)."""
self._initialize_geo_bypass({ self._initialize_geo_bypass(self._GEO_COUNTRIES)
'countries': self._GEO_COUNTRIES,
'ip_blocks': self._GEO_IP_BLOCKS,
})
if not self._ready: if not self._ready:
self._real_initialize() self._real_initialize()
self._ready = True self._ready = True
def _initialize_geo_bypass(self, geo_bypass_context): def _initialize_geo_bypass(self, countries):
""" """
Initialize geo restriction bypass mechanism. Initialize geo restriction bypass mechanism.
@ -414,82 +406,28 @@ class InfoExtractor(object):
HTTP requests. HTTP requests.
This method will be used for initial geo bypass mechanism initialization This method will be used for initial geo bypass mechanism initialization
during the instance initialization with _GEO_COUNTRIES and during the instance initialization with _GEO_COUNTRIES.
_GEO_IP_BLOCKS.
You may also manually call it from extractor's code if geo bypass You may also manually call it from extractor's code if geo countries
information is not available beforehand (e.g. obtained during information is not available beforehand (e.g. obtained during
extraction) or due to some other reason. In this case you should pass extraction) or due to some another reason.
this information in geo bypass context passed as first argument. It may
contain following fields:
countries: List of geo unrestricted countries (similar
to _GEO_COUNTRIES)
ip_blocks: List of geo unrestricted IP blocks in CIDR notation
(similar to _GEO_IP_BLOCKS)
""" """
if not self._x_forwarded_for_ip: if not self._x_forwarded_for_ip:
country_code = self._downloader.params.get('geo_bypass_country', None)
# Geo bypass mechanism is explicitly disabled by user # If there is no explicit country for geo bypass specified and
if not self._downloader.params.get('geo_bypass', True): # the extractor is known to be geo restricted let's fake IP
return # as X-Forwarded-For right away.
if (not country_code and
if not geo_bypass_context: self._GEO_BYPASS and
geo_bypass_context = {} self._downloader.params.get('geo_bypass', True) and
countries):
# Backward compatibility: previously _initialize_geo_bypass country_code = random.choice(countries)
# expected a list of countries, some 3rd party code may still use if country_code:
# it this way self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
if isinstance(geo_bypass_context, (list, tuple)):
geo_bypass_context = {
'countries': geo_bypass_context,
}
# The whole point of geo bypass mechanism is to fake IP
# as X-Forwarded-For HTTP header based on some IP block or
# country code.
# Path 1: bypassing based on IP block in CIDR notation
# Explicit IP block specified by user, use it right away
# regardless of whether extractor is geo bypassable or not
ip_block = self._downloader.params.get('geo_bypass_ip_block', None)
# Otherwise use random IP block from geo bypass context but only
# if extractor is known as geo bypassable
if not ip_block:
ip_blocks = geo_bypass_context.get('ip_blocks')
if self._GEO_BYPASS and ip_blocks:
ip_block = random.choice(ip_blocks)
if ip_block:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(ip_block)
if self._downloader.params.get('verbose', False):
self._downloader.to_screen(
'[debug] Using fake IP %s as X-Forwarded-For.'
% self._x_forwarded_for_ip)
return
# Path 2: bypassing based on country code
# Explicit country code specified by user, use it right away
# regardless of whether extractor is geo bypassable or not
country = self._downloader.params.get('geo_bypass_country', None)
# Otherwise use random country code from geo bypass context but
# only if extractor is known as geo bypassable
if not country:
countries = geo_bypass_context.get('countries')
if self._GEO_BYPASS and countries:
country = random.choice(countries)
if country:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country)
if self._downloader.params.get('verbose', False): if self._downloader.params.get('verbose', False):
self._downloader.to_screen( self._downloader.to_screen(
'[debug] Using fake IP %s (%s) as X-Forwarded-For.' '[debug] Using fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country.upper())) % (self._x_forwarded_for_ip, country_code.upper()))
def extract(self, url): def extract(self, url):
"""Extracts URL information and returns it in list of dicts.""" """Extracts URL information and returns it in list of dicts."""
@ -704,31 +642,19 @@ class InfoExtractor(object):
content, _ = res content, _ = res
return content return content
def _download_xml_handle(
self, url_or_request, video_id, note='Downloading XML',
errnote='Unable to download XML', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={}):
"""Return a tuple (xml as an xml.etree.ElementTree.Element, URL handle)"""
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding, data=data, headers=headers, query=query)
if res is False:
return res
xml_string, urlh = res
return self._parse_xml(
xml_string, video_id, transform_source=transform_source,
fatal=fatal), urlh
def _download_xml(self, url_or_request, video_id, def _download_xml(self, url_or_request, video_id,
note='Downloading XML', errnote='Unable to download XML', note='Downloading XML', errnote='Unable to download XML',
transform_source=None, fatal=True, encoding=None, transform_source=None, fatal=True, encoding=None,
data=None, headers={}, query={}): data=None, headers={}, query={}):
"""Return the xml as an xml.etree.ElementTree.Element""" """Return the xml as an xml.etree.ElementTree.Element"""
res = self._download_xml_handle( xml_string = self._download_webpage(
url_or_request, video_id, note=note, errnote=errnote, url_or_request, video_id, note, errnote, fatal=fatal,
transform_source=transform_source, fatal=fatal, encoding=encoding, encoding=encoding, data=data, headers=headers, query=query)
data=data, headers=headers, query=query) if xml_string is False:
return res if res is False else res[0] return xml_string
return self._parse_xml(
xml_string, video_id, transform_source=transform_source,
fatal=fatal)
def _parse_xml(self, xml_string, video_id, transform_source=None, fatal=True): def _parse_xml(self, xml_string, video_id, transform_source=None, fatal=True):
if transform_source: if transform_source:
@ -742,30 +668,18 @@ class InfoExtractor(object):
else: else:
self.report_warning(errmsg + str(ve)) self.report_warning(errmsg + str(ve))
def _download_json_handle( def _download_json(self, url_or_request, video_id,
self, url_or_request, video_id, note='Downloading JSON metadata', note='Downloading JSON metadata',
errnote='Unable to download JSON metadata', transform_source=None, errnote='Unable to download JSON metadata',
transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={}): fatal=True, encoding=None, data=None, headers={}, query={}):
"""Return a tuple (JSON object, URL handle)""" json_string = self._download_webpage(
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal=fatal, url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding, data=data, headers=headers, query=query) encoding=encoding, data=data, headers=headers, query=query)
if res is False: if (not fatal) and json_string is False:
return res return None
json_string, urlh = res
return self._parse_json( return self._parse_json(
json_string, video_id, transform_source=transform_source, json_string, video_id, transform_source=transform_source, fatal=fatal)
fatal=fatal), urlh
def _download_json(
self, url_or_request, video_id, note='Downloading JSON metadata',
errnote='Unable to download JSON metadata', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={}):
res = self._download_json_handle(
url_or_request, video_id, note=note, errnote=errnote,
transform_source=transform_source, fatal=fatal, encoding=encoding,
data=data, headers=headers, query=query)
return res if res is False else res[0]
def _parse_json(self, json_string, video_id, transform_source=None, fatal=True): def _parse_json(self, json_string, video_id, transform_source=None, fatal=True):
if transform_source: if transform_source:
@ -1080,40 +994,6 @@ class InfoExtractor(object):
if isinstance(json_ld, dict): if isinstance(json_ld, dict):
json_ld = [json_ld] json_ld = [json_ld]
INTERACTION_TYPE_MAP = {
'CommentAction': 'comment',
'AgreeAction': 'like',
'DisagreeAction': 'dislike',
'LikeAction': 'like',
'DislikeAction': 'dislike',
'ListenAction': 'view',
'WatchAction': 'view',
'ViewAction': 'view',
}
def extract_interaction_statistic(e):
interaction_statistic = e.get('interactionStatistic')
if not isinstance(interaction_statistic, list):
return
for is_e in interaction_statistic:
if not isinstance(is_e, dict):
continue
if is_e.get('@type') != 'InteractionCounter':
continue
interaction_type = is_e.get('interactionType')
if not isinstance(interaction_type, compat_str):
continue
interaction_count = int_or_none(is_e.get('userInteractionCount'))
if interaction_count is None:
continue
count_kind = INTERACTION_TYPE_MAP.get(interaction_type.split('/')[-1])
if not count_kind:
continue
count_key = '%s_count' % count_kind
if info.get(count_key) is not None:
continue
info[count_key] = interaction_count
def extract_video_object(e): def extract_video_object(e):
assert e['@type'] == 'VideoObject' assert e['@type'] == 'VideoObject'
info.update({ info.update({
@ -1129,10 +1009,9 @@ class InfoExtractor(object):
'height': int_or_none(e.get('height')), 'height': int_or_none(e.get('height')),
'view_count': int_or_none(e.get('interactionCount')), 'view_count': int_or_none(e.get('interactionCount')),
}) })
extract_interaction_statistic(e)
for e in json_ld: for e in json_ld:
if isinstance(e.get('@context'), compat_str) and re.match(r'^https?://schema.org/?$', e.get('@context')): if e.get('@context') == 'http://schema.org':
item_type = e.get('@type') item_type = e.get('@type')
if expected_type is not None and expected_type != item_type: if expected_type is not None and expected_type != item_type:
return info return info
@ -1148,7 +1027,7 @@ class InfoExtractor(object):
part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries') part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'): if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
info['series'] = unescapeHTML(part_of_series.get('name')) info['series'] = unescapeHTML(part_of_series.get('name'))
elif item_type in ('Article', 'NewsArticle'): elif item_type == 'Article':
info.update({ info.update({
'timestamp': parse_iso8601(e.get('datePublished')), 'timestamp': parse_iso8601(e.get('datePublished')),
'title': unescapeHTML(e.get('headline')), 'title': unescapeHTML(e.get('headline')),
@ -1813,24 +1692,22 @@ class InfoExtractor(object):
}) })
return subtitles return subtitles
def _extract_xspf_playlist(self, xspf_url, playlist_id, fatal=True): def _extract_xspf_playlist(self, playlist_url, playlist_id, fatal=True):
xspf = self._download_xml( xspf = self._download_xml(
xspf_url, playlist_id, 'Downloading xpsf playlist', playlist_url, playlist_id, 'Downloading xpsf playlist',
'Unable to download xspf manifest', fatal=fatal) 'Unable to download xspf manifest', fatal=fatal)
if xspf is False: if xspf is False:
return [] return []
return self._parse_xspf( return self._parse_xspf(xspf, playlist_id)
xspf, playlist_id, xspf_url=xspf_url,
xspf_base_url=base_url(xspf_url))
def _parse_xspf(self, xspf_doc, playlist_id, xspf_url=None, xspf_base_url=None): def _parse_xspf(self, playlist, playlist_id):
NS_MAP = { NS_MAP = {
'xspf': 'http://xspf.org/ns/0/', 'xspf': 'http://xspf.org/ns/0/',
's1': 'http://static.streamone.nl/player/ns/0', 's1': 'http://static.streamone.nl/player/ns/0',
} }
entries = [] entries = []
for track in xspf_doc.findall(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP)): for track in playlist.findall(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP)):
title = xpath_text( title = xpath_text(
track, xpath_with_ns('./xspf:title', NS_MAP), 'title', default=playlist_id) track, xpath_with_ns('./xspf:title', NS_MAP), 'title', default=playlist_id)
description = xpath_text( description = xpath_text(
@ -1840,18 +1717,12 @@ class InfoExtractor(object):
duration = float_or_none( duration = float_or_none(
xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'), 1000) xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'), 1000)
formats = [] formats = [{
for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP)): 'url': location.text,
format_url = urljoin(xspf_base_url, location.text)
if not format_url:
continue
formats.append({
'url': format_url,
'manifest_url': xspf_url,
'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)), 'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)),
'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))), 'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))),
'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))), 'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))),
}) } for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP))]
self._sort_formats(formats) self._sort_formats(formats)
entries.append({ entries.append({
@ -1865,18 +1736,18 @@ class InfoExtractor(object):
return entries return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}): def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}):
res = self._download_xml_handle( res = self._download_webpage_handle(
mpd_url, video_id, mpd_url, video_id,
note=note or 'Downloading MPD manifest', note=note or 'Downloading MPD manifest',
errnote=errnote or 'Failed to download MPD manifest', errnote=errnote or 'Failed to download MPD manifest',
fatal=fatal) fatal=fatal)
if res is False: if res is False:
return [] return []
mpd_doc, urlh = res mpd, urlh = res
mpd_base_url = base_url(urlh.geturl()) mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats( return self._parse_mpd_formats(
mpd_doc, mpd_id=mpd_id, mpd_base_url=mpd_base_url, compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url,
formats_dict=formats_dict, mpd_url=mpd_url) formats_dict=formats_dict, mpd_url=mpd_url)
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None): def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None):
@ -2150,16 +2021,17 @@ class InfoExtractor(object):
return formats return formats
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True): def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True):
res = self._download_xml_handle( res = self._download_webpage_handle(
ism_url, video_id, ism_url, video_id,
note=note or 'Downloading ISM manifest', note=note or 'Downloading ISM manifest',
errnote=errnote or 'Failed to download ISM manifest', errnote=errnote or 'Failed to download ISM manifest',
fatal=fatal) fatal=fatal)
if res is False: if res is False:
return [] return []
ism_doc, urlh = res ism, urlh = res
return self._parse_ism_formats(ism_doc, urlh.geturl(), ism_id) return self._parse_ism_formats(
compat_etree_fromstring(ism.encode('utf-8')), urlh.geturl(), ism_id)
def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None): def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None):
""" """
@ -2257,8 +2129,8 @@ class InfoExtractor(object):
return formats return formats
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None): def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
def absolute_url(item_url): def absolute_url(video_url):
return urljoin(base_url, item_url) return compat_urlparse.urljoin(base_url, video_url)
def parse_content_type(content_type): def parse_content_type(content_type):
if not content_type: if not content_type:
@ -2315,7 +2187,7 @@ class InfoExtractor(object):
if src: if src:
_, formats = _media_formats(src, media_type) _, formats = _media_formats(src, media_type)
media_info['formats'].extend(formats) media_info['formats'].extend(formats)
media_info['thumbnail'] = absolute_url(media_attributes.get('poster')) media_info['thumbnail'] = media_attributes.get('poster')
if media_content: if media_content:
for source_tag in re.findall(r'<source[^>]+>', media_content): for source_tag in re.findall(r'<source[^>]+>', media_content):
source_attributes = extract_attributes(source_tag) source_attributes = extract_attributes(source_tag)
@ -2376,10 +2248,9 @@ class InfoExtractor(object):
def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]): def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]):
query = compat_urlparse.urlparse(url).query query = compat_urlparse.urlparse(url).query
url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url) url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url)
mobj = re.search( url_base = self._search_regex(
r'(?:(?:http|rtmp|rtsp)(?P<s>s)?:)?(?P<url>//[^?]+)', url) r'(?:(?:https?|rtmp|rtsp):)?(//[^?]+)', url, 'format url')
url_base = mobj.group('url') http_base_url = '%s:%s' % ('http', url_base)
http_base_url = '%s%s:%s' % ('http', mobj.group('s') or '', url_base)
formats = [] formats = []
def manifest_url(manifest): def manifest_url(manifest):
@ -2479,10 +2350,7 @@ class InfoExtractor(object):
for track in tracks: for track in tracks:
if not isinstance(track, dict): if not isinstance(track, dict):
continue continue
track_kind = track.get('kind') if track.get('kind') != 'captions':
if not track_kind or not isinstance(track_kind, compat_str):
continue
if track_kind.lower() not in ('captions', 'subtitles'):
continue continue
track_url = urljoin(base_url, track.get('file')) track_url = urljoin(base_url, track.get('file'))
if not track_url: if not track_url:
@ -2536,7 +2404,7 @@ class InfoExtractor(object):
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
source_url, video_id, 'mp4', entry_protocol='m3u8_native', source_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=m3u8_id, fatal=False)) m3u8_id=m3u8_id, fatal=False))
elif source_type == 'dash' or ext == 'mpd': elif ext == 'mpd':
formats.extend(self._extract_mpd_formats( formats.extend(self._extract_mpd_formats(
source_url, video_id, mpd_id=mpd_id, fatal=False)) source_url, video_id, mpd_id=mpd_id, fatal=False))
elif ext == 'smil': elif ext == 'smil':

View File

@ -1,167 +1,141 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals, division from __future__ import unicode_literals, division
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..utils import int_or_none
compat_str,
compat_HTTPError,
)
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
parse_age_limit,
parse_duration,
ExtractorError
)
class CrackleIE(InfoExtractor): class CrackleIE(InfoExtractor):
_VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?(?:sony)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)' _GEO_COUNTRIES = ['US']
_TESTS = [{ _VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
# geo restricted to CA _TEST = {
'url': 'https://www.crackle.com/andromeda/2502343', 'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934',
'info_dict': { 'info_dict': {
'id': '2502343', 'id': '2498934',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Under The Night', 'title': 'Everybody Respects A Bloody Nose',
'description': 'md5:d2b8ca816579ae8a7bf28bfff8cefc8a', 'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). Theyre headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
'duration': 2583, 'thumbnail': r're:^https?://.*\.jpg',
'view_count': int, 'duration': 906,
'average_rating': 0, 'series': 'Comedians In Cars Getting Coffee',
'age_limit': 14, 'season_number': 8,
'genre': 'Action, Sci-Fi', 'episode_number': 4,
'creator': 'Allan Kroeker', 'subtitles': {
'artist': 'Keith Hamilton Cobb, Kevin Sorbo, Lisa Ryder, Lexa Doig, Robert Hewitt Wolfe', 'en-US': [
'release_year': 2000, {'ext': 'vtt'},
'series': 'Andromeda', {'ext': 'tt'},
'episode': 'Under The Night', ]
'season_number': 1, },
'episode_number': 1,
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} }
}, { }
'url': 'https://www.sonycrackle.com/andromeda/2502343',
'only_matching': True, _THUMBNAIL_RES = [
}] (120, 90),
(208, 156),
(220, 124),
(220, 220),
(240, 180),
(250, 141),
(315, 236),
(320, 180),
(360, 203),
(400, 300),
(421, 316),
(460, 330),
(460, 460),
(462, 260),
(480, 270),
(587, 330),
(640, 480),
(700, 330),
(700, 394),
(854, 480),
(1024, 1024),
(1920, 1080),
]
# extracted from http://legacyweb-us.crackle.com/flash/ReferrerRedirect.ashx
_MEDIA_FILE_SLOTS = {
'c544.flv': {
'width': 544,
'height': 306,
},
'360p.mp4': {
'width': 640,
'height': 360,
},
'480p.mp4': {
'width': 852,
'height': 478,
},
'480p_1mbps.mp4': {
'width': 852,
'height': 478,
},
}
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
country_code = self._downloader.params.get('geo_bypass_country', None) config_doc = self._download_xml(
countries = [country_code] if country_code else ( 'http://legacyweb-us.crackle.com/flash/QueryReferrer.ashx?site=16',
'US', 'AU', 'CA', 'AS', 'FM', 'GU', 'MP', 'PR', 'PW', 'MH', 'VI') video_id, 'Downloading config')
last_e = None item = self._download_xml(
'http://legacyweb-us.crackle.com/app/revamp/vidwallcache.aspx?flags=-1&fm=%s' % video_id,
for country in countries: video_id, headers=self.geo_verification_headers()).find('i')
try: title = item.attrib['t']
media = self._download_json(
'https://web-api-us.crackle.com/Service.svc/details/media/%s/%s'
% (video_id, country), video_id,
'Downloading media JSON as %s' % country,
'Unable to download media JSON', query={
'disableProtocols': 'true',
'format': 'json'
})
except ExtractorError as e:
# 401 means geo restriction, trying next country
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
last_e = e
continue
raise
media_urls = media.get('MediaURLs')
if not media_urls or not isinstance(media_urls, list):
continue
title = media['Title']
formats = []
for e in media['MediaURLs']:
if e.get('UseDRM') is True:
continue
format_url = e.get('Path')
if not format_url or not isinstance(format_url, compat_str):
continue
ext = determine_ext(format_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
format_url, video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
description = media.get('Description')
duration = int_or_none(media.get(
'DurationInSeconds')) or parse_duration(media.get('Duration'))
view_count = int_or_none(media.get('CountViews'))
average_rating = float_or_none(media.get('UserRating'))
age_limit = parse_age_limit(media.get('Rating'))
genre = media.get('Genre')
release_year = int_or_none(media.get('ReleaseYear'))
creator = media.get('Directors')
artist = media.get('Cast')
if media.get('MediaTypeDisplayValue') == 'Full Episode':
series = media.get('ShowName')
episode = title
season_number = int_or_none(media.get('Season'))
episode_number = int_or_none(media.get('Episode'))
else:
series = episode = season_number = episode_number = None
subtitles = {} subtitles = {}
cc_files = media.get('ClosedCaptionFiles') formats = self._extract_m3u8_formats(
if isinstance(cc_files, list): 'http://content.uplynk.com/ext/%s/%s.m3u8' % (config_doc.attrib['strUplynkOwnerId'], video_id),
for cc_file in cc_files: video_id, 'mp4', m3u8_id='hls', fatal=None)
if not isinstance(cc_file, dict):
continue
cc_url = cc_file.get('Path')
if not cc_url or not isinstance(cc_url, compat_str):
continue
lang = cc_file.get('Locale') or 'en'
subtitles.setdefault(lang, []).append({'url': cc_url})
thumbnails = [] thumbnails = []
images = media.get('Images') path = item.attrib.get('p')
if isinstance(images, list): if path:
for image_key, image_url in images.items(): for width, height in self._THUMBNAIL_RES:
mobj = re.search(r'Img_(\d+)[xX](\d+)', image_key) res = '%dx%d' % (width, height)
if not mobj:
continue
thumbnails.append({ thumbnails.append({
'url': image_url, 'id': res,
'width': int(mobj.group(1)), 'url': 'http://images-us-am.crackle.com/%stnl_%s.jpg' % (path, res),
'height': int(mobj.group(2)), 'width': width,
'height': height,
'resolution': res,
}) })
http_base_url = 'http://ahttp.crackle.com/' + path
for mfs_path, mfs_info in self._MEDIA_FILE_SLOTS.items():
formats.append({
'url': http_base_url + mfs_path,
'format_id': 'http-' + mfs_path.split('.')[0],
'width': mfs_info['width'],
'height': mfs_info['height'],
})
for cc in item.findall('cc'):
locale = cc.attrib.get('l')
v = cc.attrib.get('v')
if locale and v:
if locale not in subtitles:
subtitles[locale] = []
for url_ext, ext in (('vtt', 'vtt'), ('xml', 'tt')):
subtitles.setdefault(locale, []).append({
'url': '%s/%s%s_%s.%s' % (config_doc.attrib['strSubtitleServer'], path, locale, v, url_ext),
'ext': ext,
})
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': description, 'description': item.attrib.get('d'),
'duration': duration, 'duration': int(item.attrib.get('r'), 16) / 1000 if item.attrib.get('r') else None,
'view_count': view_count, 'series': item.attrib.get('sn'),
'average_rating': average_rating, 'season_number': int_or_none(item.attrib.get('se')),
'age_limit': age_limit, 'episode_number': int_or_none(item.attrib.get('ep')),
'genre': genre,
'creator': creator,
'artist': artist,
'release_year': release_year,
'series': series,
'episode': episode,
'season_number': season_number,
'episode_number': episode_number,
'thumbnails': thumbnails, 'thumbnails': thumbnails,
'subtitles': subtitles, 'subtitles': subtitles,
'formats': formats, 'formats': formats,
} }
raise last_e

View File

@ -3,13 +3,13 @@ from __future__ import unicode_literals
import re import re
import json import json
import base64
import zlib import zlib
from hashlib import sha1 from hashlib import sha1
from math import pow, sqrt, floor from math import pow, sqrt, floor
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import (
compat_b64decode,
compat_etree_fromstring, compat_etree_fromstring,
compat_urllib_parse_urlencode, compat_urllib_parse_urlencode,
compat_urllib_request, compat_urllib_request,
@ -49,7 +49,7 @@ class CrunchyrollBaseIE(InfoExtractor):
}) })
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return
@ -272,8 +272,8 @@ class CrunchyrollIE(CrunchyrollBaseIE):
} }
def _decrypt_subtitles(self, data, iv, id): def _decrypt_subtitles(self, data, iv, id):
data = bytes_to_intlist(compat_b64decode(data)) data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
iv = bytes_to_intlist(compat_b64decode(iv)) iv = bytes_to_intlist(base64.b64decode(iv.encode('utf-8')))
id = int(id) id = int(id)
def obfuscate_key_aux(count, modulo, start): def obfuscate_key_aux(count, modulo, start):

View File

@ -11,10 +11,10 @@ class CTVNewsIE(InfoExtractor):
_VALID_URL = r'https?://(?:.+?\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)' _VALID_URL = r'https?://(?:.+?\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.ctvnews.ca/video?clipId=901995', 'url': 'http://www.ctvnews.ca/video?clipId=901995',
'md5': '9b8624ba66351a23e0b6e1391971f9af', 'md5': '10deb320dc0ccb8d01d34d12fc2ea672',
'info_dict': { 'info_dict': {
'id': '901995', 'id': '901995',
'ext': 'flv', 'ext': 'mp4',
'title': 'Extended: \'That person cannot be me\' Johnson says', 'title': 'Extended: \'That person cannot be me\' Johnson says',
'description': 'md5:958dd3b4f5bbbf0ed4d045c790d89285', 'description': 'md5:958dd3b4f5bbbf0ed4d045c790d89285',
'timestamp': 1467286284, 'timestamp': 1467286284,

View File

@ -35,7 +35,7 @@ class CuriosityStreamBaseIE(InfoExtractor):
return result['data'] return result['data']
def _real_initialize(self): def _real_initialize(self):
email, password = self._get_login_info() (email, password) = self._get_login_info()
if email is None: if email is None:
return return
result = self._download_json( result = self._download_json(

View File

@ -1,16 +1,12 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import hashlib
import itertools
import json
import random
import re import re
import string import json
import itertools
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_struct_pack
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
error_to_compat_str, error_to_compat_str,
@ -68,6 +64,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'uploader': 'Deadline', 'uploader': 'Deadline',
'uploader_id': 'x1xm8ri', 'uploader_id': 'x1xm8ri',
'age_limit': 0, 'age_limit': 0,
'view_count': int,
}, },
}, { }, {
'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames', 'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames',
@ -170,17 +167,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
player = self._parse_json(player_v5, video_id) player = self._parse_json(player_v5, video_id)
metadata = player['metadata'] metadata = player['metadata']
if metadata.get('error', {}).get('type') == 'password_protected':
password = self._downloader.params.get('videopassword')
if password:
r = int(metadata['id'][1:], 36)
us64e = lambda x: base64.urlsafe_b64encode(x).decode().strip('=')
t = ''.join(random.choice(string.ascii_letters) for i in range(10))
n = us64e(compat_struct_pack('I', r))
i = us64e(hashlib.md5(('%s%d%s' % (password, r, t)).encode()).digest())
metadata = self._download_json(
'http://www.dailymotion.com/player/metadata/video/p' + i + t + n, video_id)
self._check_error(metadata) self._check_error(metadata)
formats = [] formats = []
@ -194,12 +180,9 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
continue continue
ext = mimetype2ext(type_) or determine_ext(media_url) ext = mimetype2ext(type_) or determine_ext(media_url)
if ext == 'm3u8': if ext == 'm3u8':
m3u8_formats = self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
media_url, video_id, 'mp4', preference=-1, media_url, video_id, 'mp4', preference=-1,
m3u8_id='hls', fatal=False) m3u8_id='hls', fatal=False))
for f in m3u8_formats:
f['url'] = f['url'].split('#')[0]
formats.append(f)
elif ext == 'f4m': elif ext == 'f4m':
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
media_url, video_id, preference=-1, f4m_id='hds', fatal=False)) media_url, video_id, preference=-1, f4m_id='hds', fatal=False))
@ -316,8 +299,8 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
def _check_error(self, info): def _check_error(self, info):
error = info.get('error') error = info.get('error')
if error: if info.get('error') is not None:
title = error.get('title') or error['message'] title = error['title']
# See https://developer.dailymotion.com/api#access-error # See https://developer.dailymotion.com/api#access-error
if error.get('code') == 'DM007': if error.get('code') == 'DM007':
self.raise_geo_restricted(msg=title) self.raise_geo_restricted(msg=title)

View File

@ -10,7 +10,6 @@ from ..aes import (
aes_cbc_decrypt, aes_cbc_decrypt,
aes_cbc_encrypt, aes_cbc_encrypt,
) )
from ..compat import compat_b64decode
from ..utils import ( from ..utils import (
bytes_to_intlist, bytes_to_intlist,
bytes_to_long, bytes_to_long,
@ -94,7 +93,7 @@ class DaisukiMottoIE(InfoExtractor):
rtn = self._parse_json( rtn = self._parse_json(
intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist( intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(
compat_b64decode(encrypted_rtn)), base64.b64decode(encrypted_rtn)),
aes_key, iv)).decode('utf-8').rstrip('\0'), aes_key, iv)).decode('utf-8').rstrip('\0'),
video_id) video_id)

View File

@ -1,56 +0,0 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import js_to_json
class DiggIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?digg\.com/video/(?P<id>[^/?#&]+)'
_TESTS = [{
# JWPlatform via provider
'url': 'http://digg.com/video/sci-fi-short-jonah-daniel-kaluuya-get-out',
'info_dict': {
'id': 'LcqvmS0b',
'ext': 'mp4',
'title': "'Get Out' Star Daniel Kaluuya Goes On 'Moby Dick'-Like Journey In Sci-Fi Short 'Jonah'",
'description': 'md5:541bb847648b6ee3d6514bc84b82efda',
'upload_date': '20180109',
'timestamp': 1515530551,
},
'params': {
'skip_download': True,
},
}, {
# Youtube via provider
'url': 'http://digg.com/video/dog-boat-seal-play',
'only_matching': True,
}, {
# vimeo as regular embed
'url': 'http://digg.com/video/dream-girl-short-film',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
info = self._parse_json(
self._search_regex(
r'(?s)video_info\s*=\s*({.+?});\n', webpage, 'video info',
default='{}'), display_id, transform_source=js_to_json,
fatal=False)
video_id = info.get('video_id')
if video_id:
provider = info.get('provider_name')
if provider == 'youtube':
return self.url_result(
video_id, ie='Youtube', video_id=video_id)
elif provider == 'jwplayer':
return self.url_result(
'jwplatform:%s' % video_id, ie='JWPlatform',
video_id=video_id)
return self.url_result(url, 'Generic')

View File

@ -5,19 +5,15 @@ import re
import string import string
from .discoverygo import DiscoveryGoBaseIE from .discoverygo import DiscoveryGoBaseIE
from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
try_get, update_url_query,
) )
from ..compat import compat_HTTPError from ..compat import compat_HTTPError
class DiscoveryIE(DiscoveryGoBaseIE): class DiscoveryIE(DiscoveryGoBaseIE):
_VALID_URL = r'''(?x)https?://(?:www\.)?(?P<site> _VALID_URL = r'''(?x)https?://(?:www\.)?(?:
discovery| discovery|
investigationdiscovery| investigationdiscovery|
discoverylife| discoverylife|
@ -48,7 +44,7 @@ class DiscoveryIE(DiscoveryGoBaseIE):
_GEO_BYPASS = False _GEO_BYPASS = False
def _real_extract(self, url): def _real_extract(self, url):
site, path, display_id = re.match(self._VALID_URL, url).groups() path, display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
react_data = self._parse_json(self._search_regex( react_data = self._parse_json(self._search_regex(
@ -58,26 +54,15 @@ class DiscoveryIE(DiscoveryGoBaseIE):
video = next(cb for cb in content_blocks if cb.get('type') == 'video')['content']['items'][0] video = next(cb for cb in content_blocks if cb.get('type') == 'video')['content']['items'][0]
video_id = video['id'] video_id = video['id']
access_token = None
cookies = self._get_cookies(url)
# prefer Affiliate Auth Token over Anonymous Auth Token
auth_storage_cookie = cookies.get('eosAf') or cookies.get('eosAn')
if auth_storage_cookie and auth_storage_cookie.value:
auth_storage = self._parse_json(compat_urllib_parse_unquote(
compat_urllib_parse_unquote(auth_storage_cookie.value)),
video_id, fatal=False) or {}
access_token = auth_storage.get('a') or auth_storage.get('access_token')
if not access_token:
access_token = self._download_json( access_token = self._download_json(
'https://www.%s.com/anonymous' % site, display_id, query={ 'https://www.discovery.com/anonymous', display_id, query={
'authRel': 'authorization', 'authLink': update_url_query(
'client_id': try_get( 'https://login.discovery.com/v1/oauth2/authorize', {
react_data, lambda x: x['application']['apiClientId'], 'client_id': react_data['application']['apiClientId'],
compat_str) or '3020a40c2356a645b4b4', 'redirect_uri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html',
'nonce': ''.join([random.choice(string.ascii_letters) for _ in range(32)]), 'response_type': 'anonymous',
'redirectUri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html?https://www.%s.com' % site, 'state': 'nonce,' + ''.join([random.choice(string.ascii_letters) for _ in range(32)]),
})
})['access_token'] })['access_token']
try: try:
@ -87,7 +72,7 @@ class DiscoveryIE(DiscoveryGoBaseIE):
'Authorization': 'Bearer ' + access_token, 'Authorization': 'Bearer ' + access_token,
}) })
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 403): if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
e_description = self._parse_json( e_description = self._parse_json(
e.cause.read().decode(), display_id)['description'] e.cause.read().decode(), display_id)['description']
if 'resource not available for country' in e_description: if 'resource not available for country' in e_description:

View File

@ -3,8 +3,8 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor
from .brightcove import BrightcoveLegacyIE from .brightcove import BrightcoveLegacyIE
from .dplay import DPlayIE
from ..compat import ( from ..compat import (
compat_parse_qs, compat_parse_qs,
compat_urlparse, compat_urlparse,
@ -12,13 +12,8 @@ from ..compat import (
from ..utils import smuggle_url from ..utils import smuggle_url
class DiscoveryNetworksDeIE(DPlayIE): class DiscoveryNetworksDeIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?(?P<site>discovery|tlc|animalplanet|dmax)\.de/ _VALID_URL = r'https?://(?:www\.)?(?:discovery|tlc|animalplanet|dmax)\.de/(?:.*#(?P<id>\d+)|(?:[^/]+/)*videos/(?P<title>[^/?#]+))'
(?:
.*\#(?P<id>\d+)|
(?:[^/]+/)*videos/(?P<display_id>[^/?#]+)|
programme/(?P<programme>[^/]+)/video/(?P<alternate_id>[^/]+)
)'''
_TESTS = [{ _TESTS = [{
'url': 'http://www.tlc.de/sendungen/breaking-amish/videos/#3235167922001', 'url': 'http://www.tlc.de/sendungen/breaking-amish/videos/#3235167922001',
@ -45,14 +40,6 @@ class DiscoveryNetworksDeIE(DPlayIE):
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
alternate_id = mobj.group('alternate_id')
if alternate_id:
self._initialize_geo_bypass({
'countries': ['DE'],
})
return self._get_disco_api_info(
url, '%s/%s' % (mobj.group('programme'), alternate_id),
'sonic-eu1-prod.disco-api.com', mobj.group('site') + 'de')
brightcove_id = mobj.group('id') brightcove_id = mobj.group('id')
if not brightcove_id: if not brightcove_id:
title = mobj.group('title') title = mobj.group('title')

View File

@ -12,28 +12,25 @@ from ..compat import (
compat_urlparse, compat_urlparse,
) )
from ..utils import ( from ..utils import (
determine_ext,
ExtractorError, ExtractorError,
float_or_none,
int_or_none, int_or_none,
remove_end, remove_end,
try_get, try_get,
unified_strdate, unified_strdate,
unified_timestamp,
update_url_query, update_url_query,
USER_AGENTS, USER_AGENTS,
) )
class DPlayIE(InfoExtractor): class DPlayIE(InfoExtractor):
_VALID_URL = r'https?://(?P<domain>www\.(?P<host>dplay\.(?P<country>dk|se|no)))/(?:video(?:er|s)/)?(?P<id>[^/]+/[^/?#]+)' _VALID_URL = r'https?://(?P<domain>www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
# non geo restricted, via secure api, unsigned download hls URL # non geo restricted, via secure api, unsigned download hls URL
'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/', 'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
'info_dict': { 'info_dict': {
'id': '3172', 'id': '3172',
'display_id': 'nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet', 'display_id': 'season-1-svensken-lar-sig-njuta-av-livet',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Svensken lär sig njuta av livet', 'title': 'Svensken lär sig njuta av livet',
'description': 'md5:d3819c9bccffd0fe458ca42451dd50d8', 'description': 'md5:d3819c9bccffd0fe458ca42451dd50d8',
@ -51,7 +48,7 @@ class DPlayIE(InfoExtractor):
'url': 'http://www.dplay.dk/mig-og-min-mor/season-6-episode-12/', 'url': 'http://www.dplay.dk/mig-og-min-mor/season-6-episode-12/',
'info_dict': { 'info_dict': {
'id': '70816', 'id': '70816',
'display_id': 'mig-og-min-mor/season-6-episode-12', 'display_id': 'season-6-episode-12',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Episode 12', 'title': 'Episode 12',
'description': 'md5:9c86e51a93f8a4401fc9641ef9894c90', 'description': 'md5:9c86e51a93f8a4401fc9641ef9894c90',
@ -68,122 +65,17 @@ class DPlayIE(InfoExtractor):
# geo restricted, via direct unsigned hls URL # geo restricted, via direct unsigned hls URL
'url': 'http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/', 'url': 'http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/',
'only_matching': True, 'only_matching': True,
}, {
# disco-api
'url': 'https://www.dplay.no/videoer/i-kongens-klr/sesong-1-episode-7',
'info_dict': {
'id': '40206',
'display_id': 'i-kongens-klr/sesong-1-episode-7',
'ext': 'mp4',
'title': 'Episode 7',
'description': 'md5:e3e1411b2b9aebeea36a6ec5d50c60cf',
'duration': 2611.16,
'timestamp': 1516726800,
'upload_date': '20180123',
'series': 'I kongens klær',
'season_number': 1,
'episode_number': 7,
},
'params': {
'format': 'bestvideo',
'skip_download': True,
},
}, {
'url': 'https://www.dplay.dk/videoer/singleliv/season-5-episode-3',
'only_matching': True,
}, {
'url': 'https://www.dplay.se/videos/sofias-anglar/sofias-anglar-1001',
'only_matching': True,
}] }]
def _get_disco_api_info(self, url, display_id, disco_host, realm):
disco_base = 'https://' + disco_host
token = self._download_json(
'%s/token' % disco_base, display_id, 'Downloading token',
query={
'realm': realm,
})['data']['attributes']['token']
headers = {
'Referer': url,
'Authorization': 'Bearer ' + token,
}
video = self._download_json(
'%s/content/videos/%s' % (disco_base, display_id), display_id,
headers=headers, query={
'include': 'show'
})
video_id = video['data']['id']
info = video['data']['attributes']
title = info['name']
formats = []
for format_id, format_dict in self._download_json(
'%s/playback/videoPlaybackInfo/%s' % (disco_base, video_id),
display_id, headers=headers)['data']['attributes']['streaming'].items():
if not isinstance(format_dict, dict):
continue
format_url = format_dict.get('url')
if not format_url:
continue
ext = determine_ext(format_url)
if format_id == 'dash' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
format_url, display_id, mpd_id='dash', fatal=False))
elif format_id == 'hls' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, display_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls',
fatal=False))
else:
formats.append({
'url': format_url,
'format_id': format_id,
})
self._sort_formats(formats)
series = None
try:
included = video.get('included')
if isinstance(included, list):
show = next(e for e in included if e.get('type') == 'show')
series = try_get(
show, lambda x: x['attributes']['name'], compat_str)
except StopIteration:
pass
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': info.get('description'),
'duration': float_or_none(
info.get('videoDuration'), scale=1000),
'timestamp': unified_timestamp(info.get('publishStart')),
'series': series,
'season_number': int_or_none(info.get('seasonNumber')),
'episode_number': int_or_none(info.get('episodeNumber')),
'age_limit': int_or_none(info.get('minimum_age')),
'formats': formats,
}
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('id') display_id = mobj.group('id')
domain = mobj.group('domain') domain = mobj.group('domain')
self._initialize_geo_bypass({
'countries': [mobj.group('country').upper()],
})
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
video_id = self._search_regex( video_id = self._search_regex(
r'data-video-id=["\'](\d+)', webpage, 'video id', default=None) r'data-video-id=["\'](\d+)', webpage, 'video id')
if not video_id:
host = mobj.group('host')
return self._get_disco_api_info(
url, display_id, 'disco-api.' + host, host.replace('.', ''))
info = self._download_json( info = self._download_json(
'http://%s/api/v2/ajax/videos?video_id=%s' % (domain, video_id), 'http://%s/api/v2/ajax/videos?video_id=%s' % (domain, video_id),

View File

@ -2,26 +2,26 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import itertools import itertools
import json
from .common import InfoExtractor from .amp import AMPIE
from ..compat import ( from ..compat import (
compat_HTTPError, compat_HTTPError,
compat_str,
compat_urlparse, compat_urlparse,
) )
from ..utils import ( from ..utils import (
clean_html,
ExtractorError, ExtractorError,
clean_html,
int_or_none, int_or_none,
parse_age_limit, remove_end,
parse_duration, sanitized_Request,
unified_timestamp, urlencode_postdata
) )
class DramaFeverBaseIE(InfoExtractor): class DramaFeverBaseIE(AMPIE):
_LOGIN_URL = 'https://www.dramafever.com/accounts/login/'
_NETRC_MACHINE = 'dramafever' _NETRC_MACHINE = 'dramafever'
_GEO_COUNTRIES = ['US', 'CA']
_CONSUMER_SECRET = 'DA59dtVXYLxajktV' _CONSUMER_SECRET = 'DA59dtVXYLxajktV'
@ -38,11 +38,11 @@ class DramaFeverBaseIE(InfoExtractor):
'consumer secret', default=self._CONSUMER_SECRET) 'consumer secret', default=self._CONSUMER_SECRET)
def _real_initialize(self): def _real_initialize(self):
self._consumer_secret = self._get_consumer_secret()
self._login() self._login()
self._consumer_secret = self._get_consumer_secret()
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return
@ -51,28 +51,18 @@ class DramaFeverBaseIE(InfoExtractor):
'password': password, 'password': password,
} }
try: request = sanitized_Request(
response = self._download_json( self._LOGIN_URL, urlencode_postdata(login_form))
'https://www.dramafever.com/api/users/login', None, 'Logging in', response = self._download_webpage(
data=json.dumps(login_form).encode('utf-8'), headers={ request, None, 'Logging in')
'x-consumer-key': self._consumer_secret,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (403, 404):
response = self._parse_json(
e.cause.read().decode('utf-8'), None)
else:
raise
# Successful login if all(logout_pattern not in response
if response.get('result') or response.get('guid') or response.get('user_guid'): for logout_pattern in ['href="/accounts/logout/"', '>Log out<']):
return error = self._html_search_regex(
r'(?s)<h\d[^>]+\bclass="hidden-xs prompt"[^>]*>(.+?)</h\d',
errors = response.get('errors') response, 'error message', default=None)
if errors and isinstance(errors, list): if error:
error = errors[0] raise ExtractorError('Unable to login: %s' % error, expected=True)
message = error.get('message') or error['reason']
raise ExtractorError('Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in') raise ExtractorError('Unable to log in')
@ -80,20 +70,18 @@ class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever' IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)' _VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.dramafever.com/drama/4274/1/Heirs/', 'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'info_dict': { 'info_dict': {
'id': '4274.1', 'id': '4512.1',
'ext': 'wvm', 'ext': 'flv',
'title': 'Heirs - Episode 1', 'title': 'Cooking with Shin',
'description': 'md5:362a24ba18209f6276e032a651c50bc2', 'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3783,
'timestamp': 1381354993,
'upload_date': '20131009',
'series': 'Heirs',
'season_number': 1,
'episode': 'Episode 1', 'episode': 'Episode 1',
'episode_number': 1, 'episode_number': 1,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1404336058,
'upload_date': '20140702',
'duration': 344,
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -122,95 +110,50 @@ class DramaFeverIE(DramaFeverBaseIE):
'only_matching': True, 'only_matching': True,
}] }]
def _call_api(self, path, video_id, note, fatal=False):
return self._download_json(
'https://www.dramafever.com/api/5/' + path,
video_id, note=note, headers={
'x-consumer-key': self._consumer_secret,
}, fatal=fatal)
def _get_subtitles(self, video_id):
subtitles = {}
subs = self._call_api(
'video/%s/subtitles/webvtt/' % video_id, video_id,
'Downloading subtitles JSON', fatal=False)
if not subs or not isinstance(subs, list):
return subtitles
for sub in subs:
if not isinstance(sub, dict):
continue
sub_url = sub.get('url')
if not sub_url or not isinstance(sub_url, compat_str):
continue
subtitles.setdefault(
sub.get('code') or sub.get('language') or 'en', []).append({
'url': sub_url
})
return subtitles
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url).replace('/', '.') video_id = self._match_id(url).replace('/', '.')
try:
info = self._extract_feed_info(
'http://www.dramafever.com/amp/episode/feed.json?guid=%s' % video_id)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError):
self.raise_geo_restricted(
msg='Currently unavailable in your country',
countries=self._GEO_COUNTRIES)
raise
# title is postfixed with video id for some reason, removing
if info.get('title'):
info['title'] = remove_end(info['title'], video_id).strip()
series_id, episode_number = video_id.split('.') series_id, episode_number = video_id.split('.')
episode_info = self._download_json(
video = self._call_api( # We only need a single episode info, so restricting page size to one episode
'series/%s/episodes/%s/' % (series_id, episode_number), video_id, # and dealing with page number as with episode number
'Downloading video JSON') r'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_number=%s&page_size=1'
% (self._consumer_secret, series_id, episode_number),
formats = [] video_id, 'Downloading episode info JSON', fatal=False)
download_assets = video.get('download_assets') if episode_info:
if download_assets and isinstance(download_assets, dict): value = episode_info.get('value')
for format_id, format_dict in download_assets.items(): if isinstance(value, list):
if not isinstance(format_dict, dict): for v in value:
continue if v.get('type') == 'Episode':
format_url = format_dict.get('url') subfile = v.get('subfile') or v.get('new_subfile')
if not format_url or not isinstance(format_url, compat_str): if subfile and subfile != 'http://www.dramafever.com/st/':
continue info.setdefault('subtitles', {}).setdefault('English', []).append({
formats.append({ 'ext': 'srt',
'url': format_url, 'url': subfile,
'format_id': format_id,
'filesize': int_or_none(video.get('filesize')),
}) })
episode_number = int_or_none(v.get('number'))
episode_fallback = 'Episode'
if episode_number:
episode_fallback += ' %d' % episode_number
info['episode'] = v.get('title') or episode_fallback
info['episode_number'] = episode_number
break
stream = self._call_api( return info
'video/%s/stream/' % video_id, video_id, 'Downloading stream JSON',
fatal=False)
if stream:
stream_url = stream.get('stream_url')
if stream_url:
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
title = video.get('title') or 'Episode %s' % episode_number
description = video.get('description')
thumbnail = video.get('thumbnail')
timestamp = unified_timestamp(video.get('release_date'))
duration = parse_duration(video.get('duration'))
age_limit = parse_age_limit(video.get('tv_rating'))
series = video.get('series_title')
season_number = int_or_none(video.get('season'))
if series:
title = '%s - %s' % (series, title)
subtitles = self.extract_subtitles(video_id)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'age_limit': age_limit,
'series': series,
'season_number': season_number,
'episode_number': int_or_none(episode_number),
'formats': formats,
'subtitles': subtitles,
}
class DramaFeverSeriesIE(DramaFeverBaseIE): class DramaFeverSeriesIE(DramaFeverBaseIE):

View File

@ -8,6 +8,7 @@ from ..utils import (
unified_strdate, unified_strdate,
xpath_text, xpath_text,
determine_ext, determine_ext,
qualities,
float_or_none, float_or_none,
ExtractorError, ExtractorError,
) )
@ -15,8 +16,7 @@ from ..utils import (
class DreiSatIE(InfoExtractor): class DreiSatIE(InfoExtractor):
IE_NAME = '3sat' IE_NAME = '3sat'
_GEO_COUNTRIES = ['DE'] _VALID_URL = r'(?:https?://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php|mediathek\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
_VALID_URL = r'https?://(?:www\.)?3sat\.de/mediathek/(?:(?:index|mediathek)\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)'
_TESTS = [ _TESTS = [
{ {
'url': 'http://www.3sat.de/mediathek/index.php?mode=play&obj=45918', 'url': 'http://www.3sat.de/mediathek/index.php?mode=play&obj=45918',
@ -43,8 +43,7 @@ class DreiSatIE(InfoExtractor):
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None): def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
param_groups = {} param_groups = {}
for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)): for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)):
group_id = param_group.get(self._xpath_ns( group_id = param_group.attrib.get(self._xpath_ns('id', 'http://www.w3.org/XML/1998/namespace'))
'id', 'http://www.w3.org/XML/1998/namespace'))
params = {} params = {}
for param in param_group: for param in param_group:
params[param.get('name')] = param.get('value') params[param.get('name')] = param.get('value')
@ -55,7 +54,7 @@ class DreiSatIE(InfoExtractor):
src = video.get('src') src = video.get('src')
if not src: if not src:
continue continue
bitrate = int_or_none(self._search_regex(r'_(\d+)k', src, 'bitrate', None)) or float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000) bitrate = float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
group_id = video.get('paramGroup') group_id = video.get('paramGroup')
param_group = param_groups[group_id] param_group = param_groups[group_id]
for proto in param_group['protocols'].split(','): for proto in param_group['protocols'].split(','):
@ -76,36 +75,66 @@ class DreiSatIE(InfoExtractor):
note='Downloading video info', note='Downloading video info',
errnote='Failed to download video info') errnote='Failed to download video info')
status_code = xpath_text(doc, './status/statuscode') status_code = doc.find('./status/statuscode')
if status_code and status_code != 'ok': if status_code is not None and status_code.text != 'ok':
if status_code == 'notVisibleAnymore': code = status_code.text
if code == 'notVisibleAnymore':
message = 'Video %s is not available' % video_id message = 'Video %s is not available' % video_id
else: else:
message = '%s returned error: %s' % (self.IE_NAME, status_code) message = '%s returned error: %s' % (self.IE_NAME, code)
raise ExtractorError(message, expected=True) raise ExtractorError(message, expected=True)
title = xpath_text(doc, './/information/title', 'title', True) title = doc.find('.//information/title').text
description = xpath_text(doc, './/information/detail', 'description')
duration = int_or_none(xpath_text(doc, './/details/lengthSec', 'duration'))
uploader = xpath_text(doc, './/details/originChannelTitle', 'uploader')
uploader_id = xpath_text(doc, './/details/originChannelId', 'uploader id')
upload_date = unified_strdate(xpath_text(doc, './/details/airtime', 'upload date'))
urls = [] def xml_to_thumbnails(fnode):
thumbnails = []
for node in fnode:
thumbnail_url = node.text
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
if 'key' in node.attrib:
m = re.match('^([0-9]+)x([0-9]+)$', node.attrib['key'])
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
return thumbnails
thumbnails = xml_to_thumbnails(doc.findall('.//teaserimages/teaserimage'))
format_nodes = doc.findall('.//formitaeten/formitaet')
quality = qualities(['veryhigh', 'high', 'med', 'low'])
def get_quality(elem):
return quality(xpath_text(elem, 'quality'))
format_nodes.sort(key=get_quality)
format_ids = []
formats = [] formats = []
for fnode in doc.findall('.//formitaeten/formitaet'): for fnode in format_nodes:
video_url = xpath_text(fnode, 'url') video_url = fnode.find('url').text
if not video_url or video_url in urls:
continue
urls.append(video_url)
is_available = 'http://www.metafilegenerator' not in video_url is_available = 'http://www.metafilegenerator' not in video_url
geoloced = 'static_geoloced_online' in video_url if not is_available:
if not is_available or geoloced:
continue continue
format_id = fnode.attrib['basetype'] format_id = fnode.attrib['basetype']
quality = xpath_text(fnode, './quality', 'quality')
format_m = re.match(r'''(?x) format_m = re.match(r'''(?x)
(?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_ (?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_
(?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+) (?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+)
''', format_id) ''', format_id)
ext = determine_ext(video_url, None) or format_m.group('container') ext = determine_ext(video_url, None) or format_m.group('container')
if ext not in ('smil', 'f4m', 'm3u8'):
format_id = format_id + '-' + quality
if format_id in format_ids:
continue
if ext == 'meta': if ext == 'meta':
continue continue
@ -118,23 +147,24 @@ class DreiSatIE(InfoExtractor):
if video_url.startswith('https://'): if video_url.startswith('https://'):
continue continue
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', 'm3u8_native', video_url, video_id, 'mp4', m3u8_id=format_id, fatal=False))
m3u8_id=format_id, fatal=False))
elif ext == 'f4m': elif ext == 'f4m':
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
video_url, video_id, f4m_id=format_id, fatal=False)) video_url, video_id, f4m_id=format_id, fatal=False))
else: else:
quality = xpath_text(fnode, './quality') proto = format_m.group('proto').lower()
if quality:
format_id += '-' + quality
abr = int_or_none(xpath_text(fnode, './audioBitrate'), 1000) abr = int_or_none(xpath_text(fnode, './audioBitrate', 'abr'), 1000)
vbr = int_or_none(xpath_text(fnode, './videoBitrate'), 1000) vbr = int_or_none(xpath_text(fnode, './videoBitrate', 'vbr'), 1000)
tbr = int_or_none(self._search_regex( width = int_or_none(xpath_text(fnode, './width', 'width'))
r'_(\d+)k', video_url, 'bitrate', None)) height = int_or_none(xpath_text(fnode, './height', 'height'))
if tbr and vbr and not abr:
abr = tbr - vbr filesize = int_or_none(xpath_text(fnode, './filesize', 'filesize'))
format_note = ''
if not format_note:
format_note = None
formats.append({ formats.append({
'format_id': format_id, 'format_id': format_id,
@ -144,50 +174,31 @@ class DreiSatIE(InfoExtractor):
'vcodec': format_m.group('vcodec'), 'vcodec': format_m.group('vcodec'),
'abr': abr, 'abr': abr,
'vbr': vbr, 'vbr': vbr,
'tbr': tbr, 'width': width,
'width': int_or_none(xpath_text(fnode, './width')), 'height': height,
'height': int_or_none(xpath_text(fnode, './height')), 'filesize': filesize,
'filesize': int_or_none(xpath_text(fnode, './filesize')), 'format_note': format_note,
'protocol': format_m.group('proto').lower(), 'protocol': proto,
'_available': is_available,
}) })
format_ids.append(format_id)
geolocation = xpath_text(doc, './/details/geolocation')
if not formats and geolocation and geolocation != 'none':
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
self._sort_formats(formats) self._sort_formats(formats)
thumbnails = []
for node in doc.findall('.//teaserimages/teaserimage'):
thumbnail_url = node.text
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
thumbnail_key = node.get('key')
if thumbnail_key:
m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key)
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
upload_date = unified_strdate(xpath_text(doc, './/details/airtime'))
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': xpath_text(doc, './/information/detail'), 'description': description,
'duration': int_or_none(xpath_text(doc, './/details/lengthSec')), 'duration': duration,
'thumbnails': thumbnails, 'thumbnails': thumbnails,
'uploader': xpath_text(doc, './/details/originChannelTitle'), 'uploader': uploader,
'uploader_id': xpath_text(doc, './/details/originChannelId'), 'uploader_id': uploader_id,
'upload_date': upload_date, 'upload_date': upload_date,
'formats': formats, 'formats': formats,
} }
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) mobj = re.match(self._VALID_URL, url)
details_url = 'http://www.3sat.de/mediathek/xmlservice/web/beitragsDetails?id=%s' % video_id video_id = mobj.group('id')
details_url = 'http://www.3sat.de/mediathek/xmlservice/web/beitragsDetails?ak=web&id=%s' % video_id
return self.extract_from_xml_url(video_id, details_url) return self.extract_from_xml_url(video_id, details_url)

View File

@ -66,9 +66,7 @@ class DrTuberIE(InfoExtractor):
self._sort_formats(formats) self._sort_formats(formats)
title = self._html_search_regex( title = self._html_search_regex(
(r'<h1[^>]+class=["\']title[^>]+>([^<]+)', (r'class="title_watch"[^>]*><(?:p|h\d+)[^>]*>([^<]+)<',
r'<title>([^<]+)\s*@\s+DrTuber',
r'class="title_watch"[^>]*><(?:p|h\d+)[^>]*>([^<]+)<',
r'<p[^>]+class="title_substrate">([^<]+)</p>', r'<p[^>]+class="title_substrate">([^<]+)</p>',
r'<title>([^<]+) - \d+'), r'<title>([^<]+) - \d+'),
webpage, 'title') webpage, 'title')

View File

@ -1,83 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from socket import timeout
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
)
class DTubeIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?d\.tube/(?:#!/)?v/(?P<uploader_id>[0-9a-z.-]+)/(?P<id>[0-9a-z]{8})'
_TEST = {
'url': 'https://d.tube/#!/v/benswann/zqd630em',
'md5': 'a03eaa186618ffa7a3145945543a251e',
'info_dict': {
'id': 'zqd630em',
'ext': 'mp4',
'title': 'Reality Check: FDA\'s Disinformation Campaign on Kratom',
'description': 'md5:700d164e066b87f9eac057949e4227c2',
'uploader_id': 'benswann',
'upload_date': '20180222',
'timestamp': 1519328958,
},
'params': {
'format': '480p',
},
}
def _real_extract(self, url):
uploader_id, video_id = re.match(self._VALID_URL, url).groups()
result = self._download_json('https://api.steemit.com/', video_id, data=json.dumps({
'jsonrpc': '2.0',
'method': 'get_content',
'params': [uploader_id, video_id],
}).encode())['result']
metadata = json.loads(result['json_metadata'])
video = metadata['video']
content = video['content']
info = video.get('info', {})
title = info.get('title') or result['title']
def canonical_url(h):
if not h:
return None
return 'https://ipfs.io/ipfs/' + h
formats = []
for q in ('240', '480', '720', '1080', ''):
video_url = canonical_url(content.get('video%shash' % q))
if not video_url:
continue
format_id = (q + 'p') if q else 'Source'
try:
self.to_screen('%s: Checking %s video format URL' % (video_id, format_id))
self._downloader._opener.open(video_url, timeout=5).close()
except timeout as e:
self.to_screen(
'%s: %s URL is invalid, skipping' % (video_id, format_id))
continue
formats.append({
'format_id': format_id,
'url': video_url,
'height': int_or_none(q),
'ext': 'mp4',
})
return {
'id': video_id,
'title': title,
'description': content.get('description'),
'thumbnail': canonical_url(info.get('snaphash')),
'tags': content.get('tags') or metadata.get('tags'),
'duration': info.get('duration'),
'formats': formats,
'timestamp': parse_iso8601(result.get('created')),
'uploader_id': uploader_id,
}

View File

@ -1,10 +1,10 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_b64decode
from ..utils import ( from ..utils import (
qualities, qualities,
sanitized_Request, sanitized_Request,
@ -42,7 +42,7 @@ class DumpertIE(InfoExtractor):
r'data-files="([^"]+)"', webpage, 'data files') r'data-files="([^"]+)"', webpage, 'data files')
files = self._parse_json( files = self._parse_json(
compat_b64decode(files_base64).decode('utf-8'), base64.b64decode(files_base64.encode('utf-8')).decode('utf-8'),
video_id) video_id)
quality = qualities(['flv', 'mobile', 'tablet', '720p']) quality = qualities(['flv', 'mobile', 'tablet', '720p'])

View File

@ -32,7 +32,7 @@ class DVTVIE(InfoExtractor):
}, { }, {
'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/', 'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/',
'info_dict': { 'info_dict': {
'title': r're:^DVTV 16\. 12\. 2014: útok Talibanu, boj o kliniku, uprchlíci', 'title': 'DVTV 16. 12. 2014: útok Talibanu, boj o kliniku, uprchlíci',
'id': '973eb3bc854e11e498be002590604f2e', 'id': '973eb3bc854e11e498be002590604f2e',
}, },
'playlist': [{ 'playlist': [{
@ -93,11 +93,8 @@ class DVTVIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
def _parse_video_metadata(self, js, video_id, live_js=None): def _parse_video_metadata(self, js, video_id):
data = self._parse_json(js, video_id, transform_source=js_to_json) data = self._parse_json(js, video_id, transform_source=js_to_json)
if live_js:
data.update(self._parse_json(
live_js, video_id, transform_source=js_to_json))
title = unescapeHTML(data['title']) title = unescapeHTML(data['title'])
@ -145,18 +142,13 @@ class DVTVIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
# live content
live_item = self._search_regex(
r'(?s)embedData[0-9a-f]{32}\.asset\.liveStarter\s*=\s*(\{.+?\});',
webpage, 'video', default=None)
# single video # single video
item = self._search_regex( item = self._search_regex(
r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});', r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});',
webpage, 'video', default=None) webpage, 'video', default=None, fatal=False)
if item: if item:
return self._parse_video_metadata(item, video_id, live_item) return self._parse_video_metadata(item, video_id)
# playlist # playlist
items = re.findall( items = re.findall(

View File

@ -1,13 +1,13 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import json import json
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import (
compat_b64decode,
compat_str,
compat_urlparse, compat_urlparse,
compat_str,
) )
from ..utils import ( from ..utils import (
extract_attributes, extract_attributes,
@ -36,9 +36,9 @@ class EinthusanIE(InfoExtractor):
# reversed from jsoncrypto.prototype.decrypt() in einthusan-PGMovieWatcher.js # reversed from jsoncrypto.prototype.decrypt() in einthusan-PGMovieWatcher.js
def _decrypt(self, encrypted_data, video_id): def _decrypt(self, encrypted_data, video_id):
return self._parse_json(compat_b64decode(( return self._parse_json(base64.b64decode((
encrypted_data[:10] + encrypted_data[-1] + encrypted_data[12:-1] encrypted_data[:10] + encrypted_data[-1] + encrypted_data[12:-1]
)).decode('utf-8'), video_id) ).encode('ascii')).decode('utf-8'), video_id)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)

View File

@ -0,0 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class ETOnlineIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?etonline\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.etonline.com/tv/211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale/',
'info_dict': {
'id': '211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale',
'title': 'md5:a21ec7d3872ed98335cbd2a046f34ee6',
'description': 'md5:8b94484063f463cca709617c79618ccd',
},
'playlist_count': 2,
}, {
'url': 'http://www.etonline.com/media/video/here_are_the_stars_who_love_bringing_their_moms_as_dates_to_the_oscars-211359/',
'only_matching': True,
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % video_id, 'BrightcoveNew', video_id)
for video_id in re.findall(
r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)]
return self.playlist_result(
entries, playlist_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage))

View File

@ -44,7 +44,6 @@ from .anysex import AnySexIE
from .aol import AolIE from .aol import AolIE
from .allocine import AllocineIE from .allocine import AllocineIE
from .aliexpress import AliExpressLiveIE from .aliexpress import AliExpressLiveIE
from .apa import APAIE
from .aparat import AparatIE from .aparat import AparatIE
from .appleconnect import AppleConnectIE from .appleconnect import AppleConnectIE
from .appletrailers import ( from .appletrailers import (
@ -138,7 +137,6 @@ from .brightcove import (
BrightcoveLegacyIE, BrightcoveLegacyIE,
BrightcoveNewIE, BrightcoveNewIE,
) )
from .businessinsider import BusinessInsiderIE
from .buzzfeed import BuzzFeedIE from .buzzfeed import BuzzFeedIE
from .byutv import BYUtvIE from .byutv import BYUtvIE
from .c56 import C56IE from .c56 import C56IE
@ -146,8 +144,6 @@ from .camdemy import (
CamdemyIE, CamdemyIE,
CamdemyFolderIE CamdemyFolderIE
) )
from .cammodels import CamModelsIE
from .camtube import CamTubeIE
from .camwithher import CamWithHerIE from .camwithher import CamWithHerIE
from .canalplus import CanalplusIE from .canalplus import CanalplusIE
from .canalc2 import Canalc2IE from .canalc2 import Canalc2IE
@ -166,7 +162,6 @@ from .cbc import (
CBCPlayerIE, CBCPlayerIE,
CBCWatchVideoIE, CBCWatchVideoIE,
CBCWatchIE, CBCWatchIE,
CBCOlympicsIE,
) )
from .cbs import CBSIE from .cbs import CBSIE
from .cbslocal import CBSLocalIE from .cbslocal import CBSLocalIE
@ -199,7 +194,6 @@ from .clippit import ClippitIE
from .cliprs import ClipRsIE from .cliprs import ClipRsIE
from .clipsyndicate import ClipsyndicateIE from .clipsyndicate import ClipsyndicateIE
from .closertotruth import CloserToTruthIE from .closertotruth import CloserToTruthIE
from .cloudflarestream import CloudflareStreamIE
from .cloudy import CloudyIE from .cloudy import CloudyIE
from .clubic import ClubicIE from .clubic import ClubicIE
from .clyp import ClypIE from .clyp import ClypIE
@ -265,7 +259,6 @@ from .deezer import DeezerPlaylistIE
from .democracynow import DemocracynowIE from .democracynow import DemocracynowIE
from .dfb import DFBIE from .dfb import DFBIE
from .dhm import DHMIE from .dhm import DHMIE
from .digg import DiggIE
from .dotsub import DotsubIE from .dotsub import DotsubIE
from .douyutv import ( from .douyutv import (
DouyuShowIE, DouyuShowIE,
@ -286,7 +279,6 @@ from .drtv import (
DRTVIE, DRTVIE,
DRTVLiveIE, DRTVLiveIE,
) )
from .dtube import DTubeIE
from .dvtv import DVTVIE from .dvtv import DVTVIE
from .dumpert import DumpertIE from .dumpert import DumpertIE
from .defense import DefenseGouvFrIE from .defense import DefenseGouvFrIE
@ -332,6 +324,7 @@ from .espn import (
FiveThirtyEightIE, FiveThirtyEightIE,
) )
from .esri import EsriVideoIE from .esri import EsriVideoIE
from .etonline import ETOnlineIE
from .europa import EuropaIE from .europa import EuropaIE
from .everyonesmixtape import EveryonesMixtapeIE from .everyonesmixtape import EveryonesMixtapeIE
from .expotv import ExpoTVIE from .expotv import ExpoTVIE
@ -379,11 +372,8 @@ from .franceculture import FranceCultureIE
from .franceinter import FranceInterIE from .franceinter import FranceInterIE
from .francetv import ( from .francetv import (
FranceTVIE, FranceTVIE,
FranceTVSiteIE,
FranceTVEmbedIE, FranceTVEmbedIE,
FranceTVInfoIE, FranceTVInfoIE,
FranceTVInfoSportIE,
FranceTVJeunesseIE,
GenerationWhatIE, GenerationWhatIE,
CultureboxIE, CultureboxIE,
) )
@ -391,10 +381,7 @@ from .freesound import FreesoundIE
from .freespeech import FreespeechIE from .freespeech import FreespeechIE
from .freshlive import FreshLiveIE from .freshlive import FreshLiveIE
from .funimation import FunimationIE from .funimation import FunimationIE
from .funk import ( from .funk import FunkIE
FunkMixIE,
FunkChannelIE,
)
from .funnyordie import FunnyOrDieIE from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE from .fusion import FusionIE
from .fxnetworks import FXNetworksIE from .fxnetworks import FXNetworksIE
@ -438,7 +425,6 @@ from .hellporno import HellPornoIE
from .helsinki import HelsinkiIE from .helsinki import HelsinkiIE
from .hentaistigma import HentaiStigmaIE from .hentaistigma import HentaiStigmaIE
from .hgtv import HGTVComShowIE from .hgtv import HGTVComShowIE
from .hidive import HiDiveIE
from .historicfilms import HistoricFilmsIE from .historicfilms import HistoricFilmsIE
from .hitbox import HitboxIE, HitboxLiveIE from .hitbox import HitboxIE, HitboxLiveIE
from .hitrecord import HitRecordIE from .hitrecord import HitRecordIE
@ -473,7 +459,10 @@ from .imgur import (
) )
from .ina import InaIE from .ina import InaIE
from .inc import IncIE from .inc import IncIE
from .indavideo import IndavideoEmbedIE from .indavideo import (
IndavideoIE,
IndavideoEmbedIE,
)
from .infoq import InfoQIE from .infoq import InfoQIE
from .instagram import InstagramIE, InstagramUserIE from .instagram import InstagramIE, InstagramUserIE
from .internazionale import InternazionaleIE from .internazionale import InternazionaleIE
@ -481,10 +470,7 @@ from .internetvideoarchive import InternetVideoArchiveIE
from .iprima import IPrimaIE from .iprima import IPrimaIE
from .iqiyi import IqiyiIE from .iqiyi import IqiyiIE
from .ir90tv import Ir90TvIE from .ir90tv import Ir90TvIE
from .itv import ( from .itv import ITVIE
ITVIE,
ITVBTCCIE,
)
from .ivi import ( from .ivi import (
IviIE, IviIE,
IviCompilationIE IviCompilationIE
@ -503,6 +489,7 @@ from .jwplatform import JWPlatformIE
from .jpopsukitv import JpopsukiIE from .jpopsukitv import JpopsukiIE
from .kakao import KakaoIE from .kakao import KakaoIE
from .kaltura import KalturaIE from .kaltura import KalturaIE
from .kamcord import KamcordIE
from .kanalplay import KanalPlayIE from .kanalplay import KanalPlayIE
from .kankan import KankanIE from .kankan import KankanIE
from .karaoketv import KaraoketvIE from .karaoketv import KaraoketvIE
@ -538,14 +525,13 @@ from .lcp import (
) )
from .learnr import LearnrIE from .learnr import LearnrIE
from .lecture2go import Lecture2GoIE from .lecture2go import Lecture2GoIE
from .lego import LEGOIE
from .lemonde import LemondeIE
from .leeco import ( from .leeco import (
LeIE, LeIE,
LePlaylistIE, LePlaylistIE,
LetvCloudIE, LetvCloudIE,
) )
from .lego import LEGOIE
from .lemonde import LemondeIE
from .lenta import LentaIE
from .libraryofcongress import LibraryOfCongressIE from .libraryofcongress import LibraryOfCongressIE
from .libsyn import LibsynIE from .libsyn import LibsynIE
from .lifenews import ( from .lifenews import (
@ -557,7 +543,6 @@ from .limelight import (
LimelightChannelIE, LimelightChannelIE,
LimelightChannelListIE, LimelightChannelListIE,
) )
from .line import LineTVIE
from .litv import LiTVIE from .litv import LiTVIE
from .liveleak import ( from .liveleak import (
LiveLeakIE, LiveLeakIE,
@ -578,11 +563,8 @@ from .lynda import (
) )
from .m6 import M6IE from .m6 import M6IE
from .macgamestore import MacGameStoreIE from .macgamestore import MacGameStoreIE
from .mailru import ( from .mailru import MailRuIE
MailRuIE, from .makerschannel import MakersChannelIE
MailRuMusicIE,
MailRuMusicSearchIE,
)
from .makertv import MakerTVIE from .makertv import MakerTVIE
from .mangomolo import ( from .mangomolo import (
MangomoloVideoIE, MangomoloVideoIE,
@ -625,6 +607,7 @@ from .mnet import MnetIE
from .moevideo import MoeVideoIE from .moevideo import MoeVideoIE
from .mofosex import MofosexIE from .mofosex import MofosexIE
from .mojvideo import MojvideoIE from .mojvideo import MojvideoIE
from .moniker import MonikerIE
from .morningstar import MorningstarIE from .morningstar import MorningstarIE
from .motherless import ( from .motherless import (
MotherlessIE, MotherlessIE,
@ -645,13 +628,9 @@ from .mtv import (
from .muenchentv import MuenchenTVIE from .muenchentv import MuenchenTVIE
from .musicplayon import MusicPlayOnIE from .musicplayon import MusicPlayOnIE
from .mwave import MwaveIE, MwaveMeetGreetIE from .mwave import MwaveIE, MwaveMeetGreetIE
from .mychannels import MyChannelsIE
from .myspace import MySpaceIE, MySpaceAlbumIE from .myspace import MySpaceIE, MySpaceAlbumIE
from .myspass import MySpassIE from .myspass import MySpassIE
from .myvi import ( from .myvi import MyviIE
MyviIE,
MyviEmbedIE,
)
from .myvidster import MyVidsterIE from .myvidster import MyVidsterIE
from .nationalgeographic import ( from .nationalgeographic import (
NationalGeographicVideoIE, NationalGeographicVideoIE,
@ -665,9 +644,7 @@ from .nbc import (
NBCIE, NBCIE,
NBCNewsIE, NBCNewsIE,
NBCOlympicsIE, NBCOlympicsIE,
NBCOlympicsStreamIE,
NBCSportsIE, NBCSportsIE,
NBCSportsStreamIE,
NBCSportsVPlayerIE, NBCSportsVPlayerIE,
) )
from .ndr import ( from .ndr import (
@ -707,7 +684,12 @@ from .nexx import (
from .nfb import NFBIE from .nfb import NFBIE
from .nfl import NFLIE from .nfl import NFLIE
from .nhk import NhkVodIE from .nhk import NhkVodIE
from .nhl import NHLIE from .nhl import (
NHLVideocenterIE,
NHLNewsIE,
NHLVideocenterCategoryIE,
NHLIE,
)
from .nick import ( from .nick import (
NickIE, NickIE,
NickBrIE, NickBrIE,
@ -716,7 +698,10 @@ from .nick import (
NickRuIE, NickRuIE,
) )
from .niconico import NiconicoIE, NiconicoPlaylistIE from .niconico import NiconicoIE, NiconicoPlaylistIE
from .ninecninemedia import NineCNineMediaIE from .ninecninemedia import (
NineCNineMediaStackIE,
NineCNineMediaIE,
)
from .ninegag import NineGagIE from .ninegag import NineGagIE
from .ninenow import NineNowIE from .ninenow import NineNowIE
from .nintendo import NintendoIE from .nintendo import NintendoIE
@ -804,7 +789,6 @@ from .parliamentliveuk import ParliamentLiveUKIE
from .patreon import PatreonIE from .patreon import PatreonIE
from .pbs import PBSIE from .pbs import PBSIE
from .pearvideo import PearVideoIE from .pearvideo import PearVideoIE
from .peertube import PeerTubeIE
from .people import PeopleIE from .people import PeopleIE
from .performgroup import PerformGroupIE from .performgroup import PerformGroupIE
from .periscope import ( from .periscope import (
@ -814,10 +798,6 @@ from .periscope import (
from .philharmoniedeparis import PhilharmonieDeParisIE from .philharmoniedeparis import PhilharmonieDeParisIE
from .phoenix import PhoenixIE from .phoenix import PhoenixIE
from .photobucket import PhotobucketIE from .photobucket import PhotobucketIE
from .picarto import (
PicartoIE,
PicartoVodIE,
)
from .piksel import PikselIE from .piksel import PikselIE
from .pinkbike import PinkbikeIE from .pinkbike import PinkbikeIE
from .pladform import PladformIE from .pladform import PladformIE
@ -880,7 +860,6 @@ from .rai import (
RaiPlayPlaylistIE, RaiPlayPlaylistIE,
RaiIE, RaiIE,
) )
from .raywenderlich import RayWenderlichIE
from .rbmaradio import RBMARadioIE from .rbmaradio import RBMARadioIE
from .rds import RDSIE from .rds import RDSIE
from .redbulltv import RedBullTVIE from .redbulltv import RedBullTVIE
@ -902,6 +881,7 @@ from .revision3 import (
Revision3IE, Revision3IE,
) )
from .rice import RICEIE from .rice import RICEIE
from .ringtv import RingTVIE
from .rmcdecouverte import RMCDecouverteIE from .rmcdecouverte import RMCDecouverteIE
from .ro220 import Ro220IE from .ro220 import Ro220IE
from .rockstargames import RockstarGamesIE from .rockstargames import RockstarGamesIE
@ -921,7 +901,6 @@ from .rtp import RTPIE
from .rts import RTSIE from .rts import RTSIE
from .rtve import RTVEALaCartaIE, RTVELiveIE, RTVEInfantilIE, RTVELiveIE, RTVETelevisionIE from .rtve import RTVEALaCartaIE, RTVELiveIE, RTVEInfantilIE, RTVELiveIE, RTVETelevisionIE
from .rtvnh import RTVNHIE from .rtvnh import RTVNHIE
from .rtvs import RTVSIE
from .rudo import RudoIE from .rudo import RudoIE
from .ruhd import RUHDIE from .ruhd import RUHDIE
from .ruleporn import RulePornIE from .ruleporn import RulePornIE
@ -954,10 +933,6 @@ from .servingsys import ServingSysIE
from .servus import ServusIE from .servus import ServusIE
from .sevenplus import SevenPlusIE from .sevenplus import SevenPlusIE
from .sexu import SexuIE from .sexu import SexuIE
from .seznamzpravy import (
SeznamZpravyIE,
SeznamZpravyArticleIE,
)
from .shahid import ( from .shahid import (
ShahidIE, ShahidIE,
ShahidShowIE, ShahidShowIE,
@ -1010,15 +985,12 @@ from .spankbang import SpankBangIE
from .spankwire import SpankwireIE from .spankwire import SpankwireIE
from .spiegel import SpiegelIE, SpiegelArticleIE from .spiegel import SpiegelIE, SpiegelArticleIE
from .spiegeltv import SpiegeltvIE from .spiegeltv import SpiegeltvIE
from .spike import ( from .spike import SpikeIE
BellatorIE,
ParamountNetworkIE,
)
from .stitcher import StitcherIE from .stitcher import StitcherIE
from .sport5 import Sport5IE from .sport5 import Sport5IE
from .sportbox import SportBoxEmbedIE from .sportbox import SportBoxEmbedIE
from .sportdeutschland import SportDeutschlandIE from .sportdeutschland import SportDeutschlandIE
from .springboardplatform import SpringboardPlatformIE from .sportschau import SportschauIE
from .sprout import SproutIE from .sprout import SproutIE
from .srgssr import ( from .srgssr import (
SRGSSRIE, SRGSSRIE,
@ -1037,7 +1009,6 @@ from .sunporno import SunPornoIE
from .svt import ( from .svt import (
SVTIE, SVTIE,
SVTPlayIE, SVTPlayIE,
SVTSeriesIE,
) )
from .swrmediathek import SWRMediathekIE from .swrmediathek import SWRMediathekIE
from .syfy import SyfyIE from .syfy import SyfyIE
@ -1063,14 +1034,9 @@ from .telebruxelles import TeleBruxellesIE
from .telecinco import TelecincoIE from .telecinco import TelecincoIE
from .telegraaf import TelegraafIE from .telegraaf import TelegraafIE
from .telemb import TeleMBIE from .telemb import TeleMBIE
from .telequebec import ( from .telequebec import TeleQuebecIE
TeleQuebecIE,
TeleQuebecEmissionIE,
TeleQuebecLiveIE,
)
from .teletask import TeleTaskIE from .teletask import TeleTaskIE
from .telewebion import TelewebionIE from .telewebion import TelewebionIE
from .tennistv import TennisTVIE
from .testurl import TestURLIE from .testurl import TestURLIE
from .tf1 import TF1IE from .tf1 import TF1IE
from .tfo import TFOIE from .tfo import TFOIE
@ -1080,6 +1046,7 @@ from .theplatform import (
ThePlatformFeedIE, ThePlatformFeedIE,
) )
from .thescene import TheSceneIE from .thescene import TheSceneIE
from .thesixtyone import TheSixtyOneIE
from .thestar import TheStarIE from .thestar import TheStarIE
from .thesun import TheSunIE from .thesun import TheSunIE
from .theweatherchannel import TheWeatherChannelIE from .theweatherchannel import TheWeatherChannelIE
@ -1101,6 +1068,7 @@ from .tnaflix import (
from .toggle import ToggleIE from .toggle import ToggleIE
from .tonline import TOnlineIE from .tonline import TOnlineIE
from .toongoggles import ToonGogglesIE from .toongoggles import ToonGogglesIE
from .totalwebcasting import TotalWebCastingIE
from .toutv import TouTvIE from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE from .traileraddict import TrailerAddictIE
@ -1139,12 +1107,10 @@ from .tvc import (
from .tvigle import TvigleIE from .tvigle import TvigleIE
from .tvland import TVLandIE from .tvland import TVLandIE
from .tvn24 import TVN24IE from .tvn24 import TVN24IE
from .tvnet import TVNetIE
from .tvnoe import TVNoeIE from .tvnoe import TVNoeIE
from .tvnow import ( from .tvnow import (
TVNowIE, TVNowIE,
TVNowListIE, TVNowListIE,
TVNowShowIE,
) )
from .tvp import ( from .tvp import (
TVPEmbedIE, TVPEmbedIE,
@ -1227,6 +1193,7 @@ from .vice import (
ViceArticleIE, ViceArticleIE,
ViceShowIE, ViceShowIE,
) )
from .viceland import VicelandIE
from .vidbit import VidbitIE from .vidbit import VidbitIE
from .viddler import ViddlerIE from .viddler import ViddlerIE
from .videa import VideaIE from .videa import VideaIE
@ -1241,7 +1208,6 @@ from .videomore import (
from .videopremium import VideoPremiumIE from .videopremium import VideoPremiumIE
from .videopress import VideoPressIE from .videopress import VideoPressIE
from .vidio import VidioIE from .vidio import VidioIE
from .vidlii import VidLiiIE
from .vidme import ( from .vidme import (
VidmeIE, VidmeIE,
VidmeUserIE, VidmeUserIE,
@ -1323,8 +1289,6 @@ from .watchbox import WatchBoxIE
from .watchindianporn import WatchIndianPornIE from .watchindianporn import WatchIndianPornIE
from .wdr import ( from .wdr import (
WDRIE, WDRIE,
WDRPageIE,
WDRElefantIE,
WDRMobileIE, WDRMobileIE,
) )
from .webcaster import ( from .webcaster import (
@ -1335,10 +1299,6 @@ from .webofstories import (
WebOfStoriesIE, WebOfStoriesIE,
WebOfStoriesPlaylistIE, WebOfStoriesPlaylistIE,
) )
from .weibo import (
WeiboIE,
WeiboMobileIE
)
from .weiqitv import WeiqiTVIE from .weiqitv import WeiqiTVIE
from .wimp import WimpIE from .wimp import WimpIE
from .wistia import WistiaIE from .wistia import WistiaIE
@ -1364,10 +1324,6 @@ from .xiami import (
XiamiArtistIE, XiamiArtistIE,
XiamiCollectionIE XiamiCollectionIE
) )
from .ximalaya import (
XimalayaIE,
XimalayaAlbumIE
)
from .xminus import XMinusIE from .xminus import XMinusIE
from .xnxx import XNXXIE from .xnxx import XNXXIE
from .xstream import XstreamIE from .xstream import XstreamIE
@ -1385,7 +1341,6 @@ from .yandexmusic import (
YandexMusicPlaylistIE, YandexMusicPlaylistIE,
) )
from .yandexdisk import YandexDiskIE from .yandexdisk import YandexDiskIE
from .yapfiles import YapFilesIE
from .yesjapan import YesJapanIE from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE from .yinyuetai import YinYueTaiIE
from .ynet import YnetIE from .ynet import YnetIE
@ -1422,11 +1377,5 @@ from .youtube import (
) )
from .zapiks import ZapiksIE from .zapiks import ZapiksIE
from .zaq1 import Zaq1IE from .zaq1 import Zaq1IE
from .zattoo import (
QuicklineIE,
QuicklineLiveIE,
ZattooIE,
ZattooLiveIE,
)
from .zdf import ZDFIE, ZDFChannelIE from .zdf import ZDFIE, ZDFChannelIE
from .zingmp3 import ZingMp3IE from .zingmp3 import ZingMp3IE

View File

@ -8,12 +8,12 @@ class ExtremeTubeIE(KeezMoviesIE):
_VALID_URL = r'https?://(?:www\.)?extremetube\.com/(?:[^/]+/)?video/(?P<id>[^/#?&]+)' _VALID_URL = r'https?://(?:www\.)?extremetube\.com/(?:[^/]+/)?video/(?P<id>[^/#?&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.extremetube.com/video/music-video-14-british-euro-brit-european-cumshots-swallow-652431', 'url': 'http://www.extremetube.com/video/music-video-14-british-euro-brit-european-cumshots-swallow-652431',
'md5': '92feaafa4b58e82f261e5419f39c60cb', 'md5': '1fb9228f5e3332ec8c057d6ac36f33e0',
'info_dict': { 'info_dict': {
'id': 'music-video-14-british-euro-brit-european-cumshots-swallow-652431', 'id': 'music-video-14-british-euro-brit-european-cumshots-swallow-652431',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Music Video 14 british euro brit european cumshots swallow', 'title': 'Music Video 14 british euro brit european cumshots swallow',
'uploader': 'anonim', 'uploader': 'unknown',
'view_count': int, 'view_count': int,
'age_limit': 18, 'age_limit': 18,
} }
@ -36,10 +36,10 @@ class ExtremeTubeIE(KeezMoviesIE):
r'<h1[^>]+title="([^"]+)"[^>]*>', webpage, 'title') r'<h1[^>]+title="([^"]+)"[^>]*>', webpage, 'title')
uploader = self._html_search_regex( uploader = self._html_search_regex(
r'Uploaded by:\s*</[^>]+>\s*<a[^>]+>(.+?)</a>', r'Uploaded by:\s*</strong>\s*(.+?)\s*</div>',
webpage, 'uploader', fatal=False) webpage, 'uploader', fatal=False)
view_count = str_to_int(self._search_regex( view_count = str_to_int(self._search_regex(
r'Views:\s*</[^>]+>\s*<[^>]+>([\d,\.]+)</', r'Views:\s*</strong>\s*<span>([\d,\.]+)</span>',
webpage, 'view count', fatal=False)) webpage, 'view count', fatal=False))
info.update({ info.update({

View File

@ -56,7 +56,6 @@ class FacebookIE(InfoExtractor):
_CHROME_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36' _CHROME_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36'
_VIDEO_PAGE_TEMPLATE = 'https://www.facebook.com/video/video.php?v=%s' _VIDEO_PAGE_TEMPLATE = 'https://www.facebook.com/video/video.php?v=%s'
_VIDEO_PAGE_TAHOE_TEMPLATE = 'https://www.facebook.com/video/tahoe/async/%s/?chain=true&isvideo=true'
_TESTS = [{ _TESTS = [{
'url': 'https://www.facebook.com/video.php?v=637842556329505&fref=nf', 'url': 'https://www.facebook.com/video.php?v=637842556329505&fref=nf',
@ -209,17 +208,6 @@ class FacebookIE(InfoExtractor):
# no title # no title
'url': 'https://www.facebook.com/onlycleverentertainment/videos/1947995502095005/', 'url': 'https://www.facebook.com/onlycleverentertainment/videos/1947995502095005/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.facebook.com/WatchESLOne/videos/359649331226507/',
'info_dict': {
'id': '359649331226507',
'ext': 'mp4',
'title': '#ESLOne VoD - Birmingham Finals Day#1 Fnatic vs. @Evil Geniuses',
'uploader': 'ESL One Dota 2',
},
'params': {
'skip_download': True,
},
}] }]
@staticmethod @staticmethod
@ -238,7 +226,7 @@ class FacebookIE(InfoExtractor):
return urls return urls
def _login(self): def _login(self):
useremail, password = self._get_login_info() (useremail, password) = self._get_login_info()
if useremail is None: if useremail is None:
return return
@ -324,18 +312,16 @@ class FacebookIE(InfoExtractor):
if server_js_data: if server_js_data:
video_data = extract_video_data(server_js_data.get('instances', [])) video_data = extract_video_data(server_js_data.get('instances', []))
def extract_from_jsmods_instances(js_data):
if js_data:
return extract_video_data(try_get(
js_data, lambda x: x['jsmods']['instances'], list) or [])
if not video_data: if not video_data:
server_js_data = self._parse_json( server_js_data = self._parse_json(
self._search_regex( self._search_regex(
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+(?:stream_pagelet|pagelet_group_mall|permalink_video_pagelet)', r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+(?:stream_pagelet|pagelet_group_mall|permalink_video_pagelet)',
webpage, 'js data', default='{}'), webpage, 'js data', default='{}'),
video_id, transform_source=js_to_json, fatal=False) video_id, transform_source=js_to_json, fatal=False)
video_data = extract_from_jsmods_instances(server_js_data) if server_js_data:
video_data = extract_video_data(try_get(
server_js_data, lambda x: x['jsmods']['instances'],
list) or [])
if not video_data: if not video_data:
if not fatal_if_no_video: if not fatal_if_no_video:
@ -347,32 +333,7 @@ class FacebookIE(InfoExtractor):
expected=True) expected=True)
elif '>You must log in to continue' in webpage: elif '>You must log in to continue' in webpage:
self.raise_login_required() self.raise_login_required()
else:
# Video info not in first request, do a secondary request using
# tahoe player specific URL
tahoe_data = self._download_webpage(
self._VIDEO_PAGE_TAHOE_TEMPLATE % video_id, video_id,
data=urlencode_postdata({
'__user': 0,
'__a': 1,
'__pc': self._search_regex(
r'pkg_cohort["\']\s*:\s*["\'](.+?)["\']', webpage,
'pkg cohort', default='PHASED:DEFAULT'),
'__rev': self._search_regex(
r'client_revision["\']\s*:\s*(\d+),', webpage,
'client revision', default='3944515'),
}),
headers={
'Content-Type': 'application/x-www-form-urlencoded',
})
tahoe_js_data = self._parse_json(
self._search_regex(
r'for\s+\(\s*;\s*;\s*\)\s*;(.+)', tahoe_data,
'tahoe js data', default='{}'),
video_id, fatal=False)
video_data = extract_from_jsmods_instances(tahoe_js_data)
if not video_data:
raise ExtractorError('Cannot parse data') raise ExtractorError('Cannot parse data')
formats = [] formats = []
@ -419,8 +380,7 @@ class FacebookIE(InfoExtractor):
video_title = 'Facebook video #%s' % video_id video_title = 'Facebook video #%s' % video_id
uploader = clean_html(get_element_by_id( uploader = clean_html(get_element_by_id(
'fbPhotoPageAuthorName', webpage)) or self._search_regex( 'fbPhotoPageAuthorName', webpage)) or self._search_regex(
r'ownerName\s*:\s*"([^"]+)"', webpage, 'uploader', r'ownerName\s*:\s*"([^"]+)"', webpage, 'uploader', fatal=False)
fatal=False) or self._og_search_title(webpage, fatal=False)
timestamp = int_or_none(self._search_regex( timestamp = int_or_none(self._search_regex(
r'<abbr[^>]+data-utime=["\'](\d+)', webpage, r'<abbr[^>]+data-utime=["\'](\d+)', webpage,
'timestamp', default=None)) 'timestamp', default=None))

View File

@ -46,7 +46,7 @@ class FC2IE(InfoExtractor):
}] }]
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None or password is None: if username is None or password is None:
return False return False

View File

@ -33,7 +33,7 @@ class FranceInterIE(InfoExtractor):
description = self._og_search_description(webpage) description = self._og_search_description(webpage)
upload_date_str = self._search_regex( upload_date_str = self._search_regex(
r'class=["\']\s*cover-emission-period\s*["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<', r'class=["\']cover-emission-period["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<',
webpage, 'upload date', fatal=False) webpage, 'upload date', fatal=False)
if upload_date_str: if upload_date_str:
upload_date_list = upload_date_str.split() upload_date_list = upload_date_str.split()

View File

@ -5,89 +5,19 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import compat_urlparse
compat_str,
compat_urlparse,
)
from ..utils import ( from ..utils import (
clean_html, clean_html,
determine_ext,
ExtractorError, ExtractorError,
int_or_none, int_or_none,
parse_duration, parse_duration,
try_get, determine_ext,
) )
from .dailymotion import DailymotionIE from .dailymotion import DailymotionIE
class FranceTVBaseInfoExtractor(InfoExtractor): class FranceTVBaseInfoExtractor(InfoExtractor):
def _make_url_result(self, video_or_full_id, catalog=None):
full_id = 'francetv:%s' % video_or_full_id
if '@' not in video_or_full_id and catalog:
full_id += '@%s' % catalog
return self.url_result(
full_id, ie=FranceTVIE.ie_key(),
video_id=video_or_full_id.split('@')[0])
class FranceTVIE(InfoExtractor):
_VALID_URL = r'''(?x)
(?:
https?://
sivideo\.webservices\.francetelevisions\.fr/tools/getInfosOeuvre/v2/\?
.*?\bidDiffusion=[^&]+|
(?:
https?://videos\.francetv\.fr/video/|
francetv:
)
(?P<id>[^@]+)(?:@(?P<catalog>.+))?
)
'''
_TESTS = [{
# without catalog
'url': 'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/?idDiffusion=162311093&callback=_jsonp_loader_callback_request_0',
'md5': 'c2248a8de38c4e65ea8fae7b5df2d84f',
'info_dict': {
'id': '162311093',
'ext': 'mp4',
'title': '13h15, le dimanche... - Les mystères de Jésus',
'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42',
'timestamp': 1502623500,
'upload_date': '20170813',
},
}, {
# with catalog
'url': 'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/?idDiffusion=NI_1004933&catalogue=Zouzous&callback=_jsonp_loader_callback_request_4',
'only_matching': True,
}, {
'url': 'http://videos.francetv.fr/video/NI_657393@Regions',
'only_matching': True,
}, {
'url': 'francetv:162311093',
'only_matching': True,
}, {
'url': 'francetv:NI_1004933@Zouzous',
'only_matching': True,
}, {
'url': 'francetv:NI_983319@Info-web',
'only_matching': True,
}, {
'url': 'francetv:NI_983319',
'only_matching': True,
}, {
'url': 'francetv:NI_657393@Regions',
'only_matching': True,
}, {
# france-3 live
'url': 'francetv:SIM_France3',
'only_matching': True,
}]
def _extract_video(self, video_id, catalogue=None): def _extract_video(self, video_id, catalogue=None):
# Videos are identified by idDiffusion so catalogue part is optional.
# However when provided, some extra formats may be returned so we pass
# it if available.
info = self._download_json( info = self._download_json(
'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/', 'https://sivideo.webservices.francetelevisions.fr/tools/getInfosOeuvre/v2/',
video_id, 'Downloading video JSON', query={ video_id, 'Downloading video JSON', query={
@ -97,8 +27,7 @@ class FranceTVIE(InfoExtractor):
if info.get('status') == 'NOK': if info.get('status') == 'NOK':
raise ExtractorError( raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, info['message']), '%s returned error: %s' % (self.IE_NAME, info['message']), expected=True)
expected=True)
allowed_countries = info['videos'][0].get('geoblocage') allowed_countries = info['videos'][0].get('geoblocage')
if allowed_countries: if allowed_countries:
georestricted = True georestricted = True
@ -113,21 +42,6 @@ class FranceTVIE(InfoExtractor):
else: else:
georestricted = False georestricted = False
def sign(manifest_url, manifest_id):
for host in ('hdfauthftv-a.akamaihd.net', 'hdfauth.francetv.fr'):
signed_url = self._download_webpage(
'https://%s/esi/TA' % host, video_id,
'Downloading signed %s manifest URL' % manifest_id,
fatal=False, query={
'url': manifest_url,
})
if (signed_url and isinstance(signed_url, compat_str) and
re.search(r'^(?:https?:)?//', signed_url)):
return signed_url
return manifest_url
is_live = None
formats = [] formats = []
for video in info['videos']: for video in info['videos']:
if video['statut'] != 'ONLINE': if video['statut'] != 'ONLINE':
@ -135,10 +49,6 @@ class FranceTVIE(InfoExtractor):
video_url = video['url'] video_url = video['url']
if not video_url: if not video_url:
continue continue
if is_live is None:
is_live = (try_get(
video, lambda x: x['plages_ouverture'][0]['direct'],
bool) is True) or '/live.francetv.fr/' in video_url
format_id = video['format'] format_id = video['format']
ext = determine_ext(video_url) ext = determine_ext(video_url)
if ext == 'f4m': if ext == 'f4m':
@ -146,14 +56,17 @@ class FranceTVIE(InfoExtractor):
# See https://github.com/rg3/youtube-dl/issues/3963 # See https://github.com/rg3/youtube-dl/issues/3963
# m3u8 urls work fine # m3u8 urls work fine
continue continue
f4m_url = self._download_webpage(
'http://hdfauth.francetv.fr/esi/TA?url=%s' % video_url,
video_id, 'Downloading f4m manifest token', fatal=False)
if f4m_url:
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
sign(video_url, format_id) + '&hdcore=3.7.0&plugin=aasp-3.7.0.39.44', f4m_url + '&hdcore=3.7.0&plugin=aasp-3.7.0.39.44',
video_id, f4m_id=format_id, fatal=False)) video_id, f4m_id=format_id, fatal=False))
elif ext == 'm3u8': elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
sign(video_url, format_id), video_id, 'mp4', video_url, video_id, 'mp4', entry_protocol='m3u8_native',
entry_protocol='m3u8_native', m3u8_id=format_id, m3u8_id=format_id, fatal=False))
fatal=False))
elif video_url.startswith('rtmp'): elif video_url.startswith('rtmp'):
formats.append({ formats.append({
'url': video_url, 'url': video_url,
@ -184,48 +97,33 @@ class FranceTVIE(InfoExtractor):
return { return {
'id': video_id, 'id': video_id,
'title': self._live_title(title) if is_live else title, 'title': title,
'description': clean_html(info['synopsis']), 'description': clean_html(info['synopsis']),
'thumbnail': compat_urlparse.urljoin('http://pluzz.francetv.fr', info['image']), 'thumbnail': compat_urlparse.urljoin('http://pluzz.francetv.fr', info['image']),
'duration': int_or_none(info.get('real_duration')) or parse_duration(info['duree']), 'duration': int_or_none(info.get('real_duration')) or parse_duration(info['duree']),
'timestamp': int_or_none(info['diffusion']['timestamp']), 'timestamp': int_or_none(info['diffusion']['timestamp']),
'is_live': is_live,
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
} }
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
catalog = mobj.group('catalog')
if not video_id: class FranceTVIE(FranceTVBaseInfoExtractor):
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
video_id = qs.get('idDiffusion', [None])[0]
catalog = qs.get('catalogue', [None])[0]
if not video_id:
raise ExtractorError('Invalid URL', expected=True)
return self._extract_video(video_id, catalog)
class FranceTVSiteIE(FranceTVBaseInfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html' _VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html'
_TESTS = [{ _TESTS = [{
'url': 'https://www.france.tv/france-2/13h15-le-dimanche/140921-les-mysteres-de-jesus.html', 'url': 'https://www.france.tv/france-2/13h15-le-dimanche/140921-les-mysteres-de-jesus.html',
'info_dict': { 'info_dict': {
'id': '162311093', 'id': '157550144',
'ext': 'mp4', 'ext': 'mp4',
'title': '13h15, le dimanche... - Les mystères de Jésus', 'title': '13h15, le dimanche... - Les mystères de Jésus',
'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42', 'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42',
'timestamp': 1502623500, 'timestamp': 1494156300,
'upload_date': '20170813', 'upload_date': '20170507',
}, },
'params': { 'params': {
# m3u8 downloads
'skip_download': True, 'skip_download': True,
}, },
'add_ie': [FranceTVIE.ie_key()],
}, { }, {
# france3 # france3
'url': 'https://www.france.tv/france-3/des-chiffres-et-des-lettres/139063-emission-du-mardi-9-mai-2017.html', 'url': 'https://www.france.tv/france-3/des-chiffres-et-des-lettres/139063-emission-du-mardi-9-mai-2017.html',
@ -258,10 +156,6 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
}, { }, {
'url': 'https://www.france.tv/142749-rouge-sang.html', 'url': 'https://www.france.tv/142749-rouge-sang.html',
'only_matching': True, 'only_matching': True,
}, {
# france-3 live
'url': 'https://www.france.tv/france-3/direct.html',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -278,14 +172,13 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
video_id, catalogue = self._html_search_regex( video_id, catalogue = self._html_search_regex(
r'(?:href=|player\.setVideo\(\s*)"http://videos?\.francetv\.fr/video/([^@]+@[^"]+)"', r'(?:href=|player\.setVideo\(\s*)"http://videos?\.francetv\.fr/video/([^@]+@[^"]+)"',
webpage, 'video ID').split('@') webpage, 'video ID').split('@')
return self._extract_video(video_id, catalogue)
return self._make_url_result(video_id, catalogue)
class FranceTVEmbedIE(FranceTVBaseInfoExtractor): class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
_VALID_URL = r'https?://embed\.francetv\.fr/*\?.*?\bue=(?P<id>[^&]+)' _VALID_URL = r'https?://embed\.francetv\.fr/*\?.*?\bue=(?P<id>[^&]+)'
_TESTS = [{ _TEST = {
'url': 'http://embed.francetv.fr/?ue=7fd581a2ccf59d2fc5719c5c13cf6961', 'url': 'http://embed.francetv.fr/?ue=7fd581a2ccf59d2fc5719c5c13cf6961',
'info_dict': { 'info_dict': {
'id': 'NI_983319', 'id': 'NI_983319',
@ -295,11 +188,7 @@ class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
'timestamp': 1493981780, 'timestamp': 1493981780,
'duration': 16, 'duration': 16,
}, },
'params': { }
'skip_download': True,
},
'add_ie': [FranceTVIE.ie_key()],
}]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
@ -308,12 +197,12 @@ class FranceTVEmbedIE(FranceTVBaseInfoExtractor):
'http://api-embed.webservices.francetelevisions.fr/key/%s' % video_id, 'http://api-embed.webservices.francetelevisions.fr/key/%s' % video_id,
video_id) video_id)
return self._make_url_result(video['video_id'], video.get('catalog')) return self._extract_video(video['video_id'], video.get('catalog'))
class FranceTVInfoIE(FranceTVBaseInfoExtractor): class FranceTVInfoIE(FranceTVBaseInfoExtractor):
IE_NAME = 'francetvinfo.fr' IE_NAME = 'francetvinfo.fr'
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&.]+)' _VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<title>[^/?#&.]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html', 'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html',
@ -328,18 +217,51 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
}, },
}, },
'params': { 'params': {
# m3u8 downloads
'skip_download': True, 'skip_download': True,
}, },
'add_ie': [FranceTVIE.ie_key()],
}, { }, {
'url': 'http://www.francetvinfo.fr/elections/europeennes/direct-europeennes-regardez-le-debat-entre-les-candidats-a-la-presidence-de-la-commission_600639.html', 'url': 'http://www.francetvinfo.fr/elections/europeennes/direct-europeennes-regardez-le-debat-entre-les-candidats-a-la-presidence-de-la-commission_600639.html',
'only_matching': True, 'info_dict': {
'id': 'EV_20019',
'ext': 'mp4',
'title': 'Débat des candidats à la Commission européenne',
'description': 'Débat des candidats à la Commission européenne',
},
'params': {
'skip_download': 'HLS (reqires ffmpeg)'
},
'skip': 'Ce direct est terminé et sera disponible en rattrapage dans quelques minutes.',
}, { }, {
'url': 'http://www.francetvinfo.fr/economie/entreprises/les-entreprises-familiales-le-secret-de-la-reussite_933271.html', 'url': 'http://www.francetvinfo.fr/economie/entreprises/les-entreprises-familiales-le-secret-de-la-reussite_933271.html',
'only_matching': True, 'md5': 'f485bda6e185e7d15dbc69b72bae993e',
'info_dict': {
'id': 'NI_173343',
'ext': 'mp4',
'title': 'Les entreprises familiales : le secret de la réussite',
'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1433273139,
'upload_date': '20150602',
},
'params': {
# m3u8 downloads
'skip_download': True,
},
}, { }, {
'url': 'http://france3-regions.francetvinfo.fr/bretagne/cotes-d-armor/thalassa-echappee-breizh-ce-venredi-dans-les-cotes-d-armor-954961.html', 'url': 'http://france3-regions.francetvinfo.fr/bretagne/cotes-d-armor/thalassa-echappee-breizh-ce-venredi-dans-les-cotes-d-armor-954961.html',
'only_matching': True, 'md5': 'f485bda6e185e7d15dbc69b72bae993e',
'info_dict': {
'id': 'NI_657393',
'ext': 'mp4',
'title': 'Olivier Monthus, réalisateur de "Bretagne, le choix de lArmor"',
'description': 'md5:a3264114c9d29aeca11ced113c37b16c',
'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1458300695,
'upload_date': '20160318',
},
'params': {
'skip_download': True,
},
}, { }, {
# Dailymotion embed # Dailymotion embed
'url': 'http://www.francetvinfo.fr/politique/notre-dame-des-landes/video-sur-france-inter-cecile-duflot-denonce-le-regard-meprisant-de-patrick-cohen_1520091.html', 'url': 'http://www.francetvinfo.fr/politique/notre-dame-des-landes/video-sur-france-inter-cecile-duflot-denonce-le-regard-meprisant-de-patrick-cohen_1520091.html',
@ -361,9 +283,9 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
}] }]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) mobj = re.match(self._VALID_URL, url)
page_title = mobj.group('title')
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, page_title)
dailymotion_urls = DailymotionIE._extract_urls(webpage) dailymotion_urls = DailymotionIE._extract_urls(webpage)
if dailymotion_urls: if dailymotion_urls:
@ -375,38 +297,12 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
(r'id-video=([^@]+@[^"]+)', (r'id-video=([^@]+@[^"]+)',
r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"'), r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"'),
webpage, 'video id').split('@') webpage, 'video id').split('@')
return self._extract_video(video_id, catalogue)
return self._make_url_result(video_id, catalogue)
class FranceTVInfoSportIE(FranceTVBaseInfoExtractor):
IE_NAME = 'sport.francetvinfo.fr'
_VALID_URL = r'https?://sport\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://sport.francetvinfo.fr/les-jeux-olympiques/retour-sur-les-meilleurs-moments-de-pyeongchang-2018',
'info_dict': {
'id': '6e49080e-3f45-11e8-b459-000d3a2439ea',
'ext': 'mp4',
'title': 'Retour sur les meilleurs moments de Pyeongchang 2018',
'timestamp': 1523639962,
'upload_date': '20180413',
},
'params': {
'skip_download': True,
},
'add_ie': [FranceTVIE.ie_key()],
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(r'data-video="([^"]+)"', webpage, 'video_id')
return self._make_url_result(video_id, 'Sport-web')
class GenerationWhatIE(InfoExtractor): class GenerationWhatIE(InfoExtractor):
IE_NAME = 'france2.fr:generation-what' IE_NAME = 'france2.fr:generation-what'
_VALID_URL = r'https?://generation-what\.francetv\.fr/[^/]+/video/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://generation-what\.francetv\.fr/[^/]+/video/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://generation-what.francetv.fr/portrait/video/present-arms', 'url': 'http://generation-what.francetv.fr/portrait/video/present-arms',
@ -418,10 +314,6 @@ class GenerationWhatIE(InfoExtractor):
'uploader_id': 'UCHH9p1eetWCgt4kXBYCb3_w', 'uploader_id': 'UCHH9p1eetWCgt4kXBYCb3_w',
'upload_date': '20160411', 'upload_date': '20160411',
}, },
'params': {
'skip_download': True,
},
'add_ie': ['Youtube'],
}, { }, {
'url': 'http://generation-what.francetv.fr/europe/video/present-arms', 'url': 'http://generation-what.francetv.fr/europe/video/present-arms',
'only_matching': True, 'only_matching': True,
@ -429,87 +321,42 @@ class GenerationWhatIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
youtube_id = self._search_regex( youtube_id = self._search_regex(
r"window\.videoURL\s*=\s*'([0-9A-Za-z_-]{11})';", r"window\.videoURL\s*=\s*'([0-9A-Za-z_-]{11})';",
webpage, 'youtube id') webpage, 'youtube id')
return self.url_result(youtube_id, 'Youtube', youtube_id)
return self.url_result(youtube_id, ie='Youtube', video_id=youtube_id)
class CultureboxIE(FranceTVBaseInfoExtractor): class CultureboxIE(FranceTVBaseInfoExtractor):
_VALID_URL = r'https?://(?:m\.)?culturebox\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&]+)' IE_NAME = 'culturebox.francetvinfo.fr'
_VALID_URL = r'https?://(?:m\.)?culturebox\.francetvinfo\.fr/(?P<name>.*?)(\?|$)'
_TESTS = [{ _TEST = {
'url': 'https://culturebox.francetvinfo.fr/opera-classique/musique-classique/c-est-baroque/concerts/cantates-bwv-4-106-et-131-de-bach-par-raphael-pichon-57-268689', 'url': 'http://culturebox.francetvinfo.fr/live/musique/musique-classique/le-livre-vermeil-de-montserrat-a-la-cathedrale-delne-214511',
'md5': '9b88dc156781c4dbebd4c3e066e0b1d6',
'info_dict': { 'info_dict': {
'id': 'EV_134885', 'id': 'EV_50111',
'ext': 'mp4', 'ext': 'flv',
'title': 'Cantates BWV 4, 106 et 131 de Bach par Raphaël Pichon 5/7', 'title': "Le Livre Vermeil de Montserrat à la Cathédrale d'Elne",
'description': 'md5:19c44af004b88219f4daa50fa9a351d4', 'description': 'md5:f8a4ad202e8fe533e2c493cc12e739d9',
'upload_date': '20180206', 'upload_date': '20150320',
'timestamp': 1517945220, 'timestamp': 1426892400,
'duration': 5981, 'duration': 2760.9,
}, },
'params': { }
'skip_download': True,
},
'add_ie': [FranceTVIE.ie_key()],
}]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) mobj = re.match(self._VALID_URL, url)
name = mobj.group('name')
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, name)
if ">Ce live n'est plus disponible en replay<" in webpage: if ">Ce live n'est plus disponible en replay<" in webpage:
raise ExtractorError( raise ExtractorError('Video %s is not available' % name, expected=True)
'Video %s is not available' % display_id, expected=True)
video_id, catalogue = self._search_regex( video_id, catalogue = self._search_regex(
r'["\'>]https?://videos\.francetv\.fr/video/([^@]+@.+?)["\'<]', r'["\'>]https?://videos\.francetv\.fr/video/([^@]+@.+?)["\'<]',
webpage, 'video id').split('@') webpage, 'video id').split('@')
return self._make_url_result(video_id, catalogue) return self._extract_video(video_id, catalogue)
class FranceTVJeunesseIE(FranceTVBaseInfoExtractor):
_VALID_URL = r'(?P<url>https?://(?:www\.)?(?:zouzous|ludo)\.fr/heros/(?P<id>[^/?#&]+))'
_TESTS = [{
'url': 'https://www.zouzous.fr/heros/simon',
'info_dict': {
'id': 'simon',
},
'playlist_count': 9,
}, {
'url': 'https://www.ludo.fr/heros/ninjago',
'info_dict': {
'id': 'ninjago',
},
'playlist_count': 10,
}, {
'url': 'https://www.zouzous.fr/heros/simon?abc',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
playlist = self._download_json(
'%s/%s' % (mobj.group('url'), 'playlist'), playlist_id)
if not playlist.get('count'):
raise ExtractorError(
'%s is not available' % playlist_id, expected=True)
entries = []
for item in playlist['items']:
identity = item.get('identity')
if identity and isinstance(identity, compat_str):
entries.append(self._make_url_result(identity))
return self.playlist_result(entries, playlist_id)

View File

@ -51,7 +51,7 @@ class FunimationIE(InfoExtractor):
}] }]
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return
try: try:

View File

@ -1,131 +1,43 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from .nexx import NexxIE from .nexx import NexxIE
from ..utils import ( from ..utils import extract_attributes
int_or_none,
try_get,
)
class FunkBaseIE(InfoExtractor): class FunkIE(InfoExtractor):
def _make_url_result(self, video): _VALID_URL = r'https?://(?:www\.)?funk\.net/(?:mix|channel)/(?:[^/]+/)*(?P<id>[^?/#]+)'
return {
'_type': 'url_transparent',
'url': 'nexx:741:%s' % video['sourceId'],
'ie_key': NexxIE.ie_key(),
'id': video['sourceId'],
'title': video.get('title'),
'description': video.get('description'),
'duration': int_or_none(video.get('duration')),
'season_number': int_or_none(video.get('seasonNr')),
'episode_number': int_or_none(video.get('episodeNr')),
}
class FunkMixIE(FunkBaseIE):
_VALID_URL = r'https?://(?:www\.)?funk\.net/mix/(?P<id>[^/]+)/(?P<alias>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.funk.net/mix/59d65d935f8b160001828b5b/die-realste-kifferdoku-aller-zeiten', 'url': 'https://www.funk.net/mix/59d65d935f8b160001828b5b/0/59d517e741dca10001252574/',
'md5': '8edf617c2f2b7c9847dfda313f199009', 'md5': '4d40974481fa3475f8bccfd20c5361f8',
'info_dict': { 'info_dict': {
'id': '123748', 'id': '716599',
'ext': 'mp4', 'ext': 'mp4',
'title': '"Die realste Kifferdoku aller Zeiten"', 'title': 'Neue Rechte Welle',
'description': 'md5:c97160f5bafa8d47ec8e2e461012aa9d', 'description': 'md5:a30a53f740ffb6bfd535314c2cc5fb69',
'timestamp': 1490274721, 'timestamp': 1501337639,
'upload_date': '20170323', 'upload_date': '20170729',
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
mix_id = mobj.group('id')
alias = mobj.group('alias')
lists = self._download_json(
'https://www.funk.net/api/v3.1/curation/curatedLists/',
mix_id, headers={
'authorization': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbGllbnROYW1lIjoiY3VyYXRpb24tdG9vbC12Mi4wIiwic2NvcGUiOiJzdGF0aWMtY29udGVudC1hcGksY3VyYXRpb24tc2VydmljZSxzZWFyY2gtYXBpIn0.SGCC1IXHLtZYoo8PvRKlU2gXH1su8YSu47sB3S4iXBI',
'Referer': url,
}, query={
'size': 100,
})['result']['lists']
metas = next(
l for l in lists
if mix_id in (l.get('entityId'), l.get('alias')))['videoMetas']
video = next(
meta['videoDataDelegate']
for meta in metas if meta.get('alias') == alias)
return self._make_url_result(video)
class FunkChannelIE(FunkBaseIE):
_VALID_URL = r'https?://(?:www\.)?funk\.net/channel/(?P<id>[^/]+)/(?P<alias>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.funk.net/channel/ba/die-lustigsten-instrumente-aus-dem-internet-teil-2',
'info_dict': {
'id': '1155821',
'ext': 'mp4',
'title': 'Die LUSTIGSTEN INSTRUMENTE aus dem Internet - Teil 2',
'description': 'md5:a691d0413ef4835588c5b03ded670c1f',
'timestamp': 1514507395,
'upload_date': '20171229',
}, },
'params': { 'params': {
'format': 'bestvideo',
'skip_download': True, 'skip_download': True,
}, },
}, { }, {
# only available via byIdList API 'url': 'https://www.funk.net/channel/59d5149841dca100012511e3/0/59d52049999264000182e79d/',
'url': 'https://www.funk.net/channel/informr/martin-sonneborn-erklaert-die-eu',
'info_dict': {
'id': '205067',
'ext': 'mp4',
'title': 'Martin Sonneborn erklärt die EU',
'description': 'md5:050f74626e4ed87edf4626d2024210c0',
'timestamp': 1494424042,
'upload_date': '20170510',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.funk.net/channel/59d5149841dca100012511e3/mein-erster-job-lovemilla-folge-1/lovemilla/',
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) video_id = self._match_id(url)
channel_id = mobj.group('id')
alias = mobj.group('alias')
headers = { webpage = self._download_webpage(url, video_id)
'authorization': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbGllbnROYW1lIjoiY3VyYXRpb24tdG9vbCIsInNjb3BlIjoic3RhdGljLWNvbnRlbnQtYXBpLGN1cmF0aW9uLWFwaSxzZWFyY2gtYXBpIn0.q4Y2xZG8PFHai24-4Pjx2gym9RmJejtmK6lMXP5wAgc',
'Referer': url,
}
video = None domain_id = NexxIE._extract_domain_id(webpage) or '741'
nexx_id = extract_attributes(self._search_regex(
r'(<div[^>]id=["\']mediaplayer-funk[^>]+>)',
webpage, 'media player'))['data-id']
by_id_list = self._download_json( return self.url_result(
'https://www.funk.net/api/v3.0/content/videos/byIdList', channel_id, 'nexx:%s:%s' % (domain_id, nexx_id), ie=NexxIE.ie_key(),
headers=headers, query={ video_id=nexx_id)
'ids': alias,
}, fatal=False)
if by_id_list:
video = try_get(by_id_list, lambda x: x['result'][0], dict)
if not video:
results = self._download_json(
'https://www.funk.net/api/v3.0/content/videos/filter', channel_id,
headers=headers, query={
'channelId': channel_id,
'size': 100,
})['result']
video = next(r for r in results if r.get('alias') == alias)
return self._make_url_result(video)

View File

@ -5,9 +5,9 @@ from .ooyala import OoyalaIE
class FusionIE(InfoExtractor): class FusionIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?fusion\.(?:net|tv)/video/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?fusion\.net/video/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'http://fusion.tv/video/201781/u-s-and-panamanian-forces-work-together-to-stop-a-vessel-smuggling-drugs/', 'url': 'http://fusion.net/video/201781/u-s-and-panamanian-forces-work-together-to-stop-a-vessel-smuggling-drugs/',
'info_dict': { 'info_dict': {
'id': 'ZpcWNoMTE6x6uVIIWYpHh0qQDjxBuq5P', 'id': 'ZpcWNoMTE6x6uVIIWYpHh0qQDjxBuq5P',
'ext': 'mp4', 'ext': 'mp4',
@ -20,7 +20,7 @@ class FusionIE(InfoExtractor):
}, },
'add_ie': ['Ooyala'], 'add_ie': ['Ooyala'],
}, { }, {
'url': 'http://fusion.tv/video/201781', 'url': 'http://fusion.net/video/201781',
'only_matching': True, 'only_matching': True,
}] }]

View File

@ -41,7 +41,7 @@ class FXNetworksIE(AdobePassIE):
if 'The content you are trying to access is not available in your region.' in webpage: if 'The content you are trying to access is not available in your region.' in webpage:
self.raise_geo_restricted() self.raise_geo_restricted()
video_data = extract_attributes(self._search_regex( video_data = extract_attributes(self._search_regex(
r'(<a.+?rel="https?://link\.theplatform\.com/s/.+?</a>)', webpage, 'video data')) r'(<a.+?rel="http://link\.theplatform\.com/s/.+?</a>)', webpage, 'video data'))
player_type = self._search_regex(r'playerType\s*=\s*[\'"]([^\'"]+)', webpage, 'player type', default=None) player_type = self._search_regex(r'playerType\s*=\s*[\'"]([^\'"]+)', webpage, 'player type', default=None)
release_url = video_data['rel'] release_url = video_data['rel']
title = video_data['data-title'] title = video_data['data-title']

Some files were not shown because too many files have changed in this diff Show More