Compare commits
2 Commits
2018.02.08
...
totalwebca
Author | SHA1 | Date | |
---|---|---|---|
97bc05116e | |||
7608a91ee7 |
7
.github/ISSUE_TEMPLATE.md
vendored
7
.github/ISSUE_TEMPLATE.md
vendored
@ -6,13 +6,12 @@
|
||||
|
||||
---
|
||||
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.02.08*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.02.08**
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.12.31*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.12.31**
|
||||
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
|
||||
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
|
||||
|
||||
### What is the purpose of your *issue*?
|
||||
- [ ] Bug report (encountered problems with youtube-dl)
|
||||
@ -36,7 +35,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2018.02.08
|
||||
[debug] youtube-dl version 2017.12.31
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
1
.github/ISSUE_TEMPLATE_tmpl.md
vendored
1
.github/ISSUE_TEMPLATE_tmpl.md
vendored
@ -12,7 +12,6 @@
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
|
||||
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
|
||||
|
||||
### What is the purpose of your *issue*?
|
||||
- [ ] Bug report (encountered problems with youtube-dl)
|
||||
|
2
AUTHORS
2
AUTHORS
@ -231,5 +231,3 @@ John Dong
|
||||
Tatsuyuki Ishi
|
||||
Daniel Weber
|
||||
Kay Bouché
|
||||
Yang Hongbo
|
||||
Lei Wang
|
||||
|
132
ChangeLog
132
ChangeLog
@ -1,139 +1,9 @@
|
||||
version 2018.02.08
|
||||
version <unreleased>
|
||||
|
||||
Extractors
|
||||
+ [myvi] Extend URL regular expression
|
||||
+ [myvi:embed] Add support for myvi.tv embeds (#15521)
|
||||
+ [prosiebensat1] Extend URL regular expression (#15520)
|
||||
* [pokemon] Relax URL regular expression and extend title extraction (#15518)
|
||||
+ [gameinformer] Use geo verification headers
|
||||
* [la7] Fix extraction (#15501, #15502)
|
||||
* [gameinformer] Fix brightcove id extraction (#15416)
|
||||
+ [afreecatv] Pass referrer to video info request (#15507)
|
||||
+ [telebruxelles] Add support for live streams
|
||||
* [telebruxelles] Relax URL regular expression
|
||||
* [telebruxelles] Fix extraction (#15504)
|
||||
* [extractor/common] Respect secure schemes in _extract_wowza_formats
|
||||
|
||||
|
||||
version 2018.02.04
|
||||
|
||||
Core
|
||||
* [downloader/http] Randomize HTTP chunk size
|
||||
+ [downloader/http] Add ability to pass downloader options via info dict
|
||||
* [downloader/http] Fix 302 infinite loops by not reusing requests
|
||||
+ Document http_chunk_size
|
||||
|
||||
Extractors
|
||||
+ [brightcove] Pass embed page URL as referrer (#15486)
|
||||
+ [youtube] Enforce using chunked HTTP downloading for DASH formats
|
||||
|
||||
|
||||
version 2018.02.03
|
||||
|
||||
Core
|
||||
+ Introduce --http-chunk-size for chunk-based HTTP downloading
|
||||
+ Add support for IronPython
|
||||
* [downloader/ism] Fix Python 3.2 support
|
||||
|
||||
Extractors
|
||||
* [redbulltv] Fix extraction (#15481)
|
||||
* [redtube] Fix metadata extraction (#15472)
|
||||
* [pladform] Respect platform id and extract HLS formats (#15468)
|
||||
- [rtlnl] Remove progressive formats (#15459)
|
||||
* [6play] Do no modify asset URLs with a token (#15248)
|
||||
* [nationalgeographic] Relax URL regular expression
|
||||
* [dplay] Relax URL regular expression (#15458)
|
||||
* [cbsinteractive] Fix data extraction (#15451)
|
||||
+ [amcnetworks] Add support for sundancetv.com (#9260)
|
||||
|
||||
|
||||
version 2018.01.27
|
||||
|
||||
Core
|
||||
* [extractor/common] Improve _json_ld for articles
|
||||
* Switch codebase to use compat_b64decode
|
||||
+ [compat] Add compat_b64decode
|
||||
|
||||
Extractors
|
||||
+ [seznamzpravy] Add support for seznam.cz and seznamzpravy.cz (#14102, #14616)
|
||||
* [dplay] Bypass geo restriction
|
||||
+ [dplay] Add support for disco-api videos (#15396)
|
||||
* [youtube] Extract precise error messages (#15284)
|
||||
* [teachertube] Capture and output error message
|
||||
* [teachertube] Fix and relax thumbnail extraction (#15403)
|
||||
+ [prosiebensat1] Add another clip id regular expression (#15378)
|
||||
* [tbs] Update tokenizer url (#15395)
|
||||
* [mixcloud] Use compat_b64decode (#15394)
|
||||
- [thesixtyone] Remove extractor (#15341)
|
||||
|
||||
|
||||
version 2018.01.21
|
||||
|
||||
Core
|
||||
* [extractor/common] Improve jwplayer DASH formats extraction (#9242, #15187)
|
||||
* [utils] Improve scientific notation handling in js_to_json (#14789)
|
||||
|
||||
Extractors
|
||||
+ [southparkdk] Add support for southparkstudios.nu
|
||||
+ [southpark] Add support for collections (#14803)
|
||||
* [franceinter] Fix upload date extraction (#14996)
|
||||
+ [rtvs] Add support for rtvs.sk (#9242, #15187)
|
||||
* [restudy] Fix extraction and extend URL regular expression (#15347)
|
||||
* [youtube:live] Improve live detection (#15365)
|
||||
+ [springboardplatform] Add support for springboardplatform.com
|
||||
* [prosiebensat1] Add another clip id regular expression (#15290)
|
||||
- [ringtv] Remove extractor (#15345)
|
||||
|
||||
|
||||
version 2018.01.18
|
||||
|
||||
Extractors
|
||||
* [soundcloud] Update client id (#15306)
|
||||
- [kamcord] Remove extractor (#15322)
|
||||
+ [spiegel] Add support for nexx videos (#15285)
|
||||
* [twitch] Fix authentication and error capture (#14090, #15264)
|
||||
* [vk] Detect more errors due to copyright complaints (#15259)
|
||||
|
||||
|
||||
version 2018.01.14
|
||||
|
||||
Extractors
|
||||
* [youtube] Fix live streams extraction (#15202)
|
||||
* [wdr] Bypass geo restriction
|
||||
* [wdr] Rework extractors (#14598)
|
||||
+ [wdr] Add support for wdrmaus.de/elefantenseite (#14598)
|
||||
+ [gamestar] Add support for gamepro.de (#3384)
|
||||
* [viafree] Skip rtmp formats (#15232)
|
||||
+ [pandoratv] Add support for mobile URLs (#12441)
|
||||
+ [pandoratv] Add support for new URL format (#15131)
|
||||
+ [ximalaya] Add support for ximalaya.com (#14687)
|
||||
+ [digg] Add support for digg.com (#15214)
|
||||
* [limelight] Tolerate empty pc formats (#15150, #15151, #15207)
|
||||
* [ndr:embed:base] Make separate formats extraction non fatal (#15203)
|
||||
+ [weibo] Add extractor (#15079)
|
||||
+ [ok] Add support for live streams
|
||||
* [canalplus] Fix extraction (#15072)
|
||||
* [bilibili] Fix extraction (#15188)
|
||||
|
||||
|
||||
version 2018.01.07
|
||||
|
||||
Core
|
||||
* [utils] Fix youtube-dl under PyPy3 on Windows
|
||||
* [YoutubeDL] Output python implementation in debug header
|
||||
|
||||
Extractors
|
||||
+ [jwplatform] Add support for multiple embeds (#15192)
|
||||
* [mitele] Fix extraction (#15186)
|
||||
+ [motherless] Add support for groups (#15124)
|
||||
* [lynda] Relax URL regular expression (#15185)
|
||||
* [soundcloud] Fallback to avatar picture for thumbnail (#12878)
|
||||
* [youku] Fix list extraction (#15135)
|
||||
* [openload] Fix extraction (#15166)
|
||||
* [lynda] Skip invalid subtitles (#15159)
|
||||
* [twitch] Pass video id to url_result when extracting playlist (#15139)
|
||||
* [rtve.es:alacarta] Fix extraction of some new URLs
|
||||
* [acast] Fix extraction (#15147)
|
||||
|
||||
|
||||
version 2017.12.31
|
||||
|
@ -46,7 +46,7 @@ Or with [MacPorts](https://www.macports.org/):
|
||||
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
|
||||
|
||||
# DESCRIPTION
|
||||
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
|
||||
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
|
||||
|
||||
youtube-dl [OPTIONS] URL [URL...]
|
||||
|
||||
@ -198,11 +198,6 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
|
||||
size. By default, the buffer size is
|
||||
automatically resized from an initial value
|
||||
of SIZE.
|
||||
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP
|
||||
downloading (e.g. 10485760 or 10M) (default
|
||||
is disabled). May be useful for bypassing
|
||||
bandwidth throttling imposed by a webserver
|
||||
(experimental)
|
||||
--playlist-reverse Download playlist videos in reverse order
|
||||
--playlist-random Download playlist videos in random order
|
||||
--xattr-set-filesize Set file xattribute ytdl.filesize with
|
||||
@ -868,7 +863,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
|
||||
|
||||
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
|
||||
|
||||
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
||||
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
||||
|
||||
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
|
||||
|
||||
|
@ -128,7 +128,7 @@
|
||||
- **CamdemyFolder**
|
||||
- **CamWithHer**
|
||||
- **canalc2.tv**
|
||||
- **Canalplus**: mycanal.fr and piwiplus.fr
|
||||
- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
|
||||
- **Canvas**
|
||||
- **CanvasEen**: canvas.be and een.be
|
||||
- **CarambaTV**
|
||||
@ -210,7 +210,6 @@
|
||||
- **defense.gouv.fr**
|
||||
- **democracynow**
|
||||
- **DHM**: Filmarchiv - Deutsches Historisches Museum
|
||||
- **Digg**
|
||||
- **DigitallySpeaking**
|
||||
- **Digiteka**
|
||||
- **Discovery**
|
||||
@ -383,6 +382,7 @@
|
||||
- **JWPlatform**
|
||||
- **Kakao**
|
||||
- **Kaltura**
|
||||
- **Kamcord**
|
||||
- **KanalPlay**: Kanal 5/9/11 Play
|
||||
- **Kankan**
|
||||
- **Karaoketv**
|
||||
@ -478,7 +478,6 @@
|
||||
- **Moniker**: allmyvideos.net and vidspot.net
|
||||
- **Morningstar**: morningstar.com
|
||||
- **Motherless**
|
||||
- **MotherlessGroup**
|
||||
- **Motorsport**: motorsport.com
|
||||
- **MovieClips**
|
||||
- **MovieFap**
|
||||
@ -502,7 +501,6 @@
|
||||
- **MySpass**
|
||||
- **Myvi**
|
||||
- **MyVidster**
|
||||
- **MyviEmbed**
|
||||
- **n-tv.de**
|
||||
- **natgeo**
|
||||
- **natgeo:episodeguide**
|
||||
@ -683,6 +681,7 @@
|
||||
- **revision**
|
||||
- **revision3:embed**
|
||||
- **RICE**
|
||||
- **RingTV**
|
||||
- **RMCDecouverte**
|
||||
- **RockstarGames**
|
||||
- **RoosterTeeth**
|
||||
@ -703,7 +702,6 @@
|
||||
- **rtve.es:live**: RTVE.es live streams
|
||||
- **rtve.es:television**
|
||||
- **RTVNH**
|
||||
- **RTVS**
|
||||
- **Rudo**
|
||||
- **RUHD**
|
||||
- **RulePorn**
|
||||
@ -733,8 +731,6 @@
|
||||
- **ServingSys**
|
||||
- **Servus**
|
||||
- **Sexu**
|
||||
- **SeznamZpravy**
|
||||
- **SeznamZpravyArticle**
|
||||
- **Shahid**
|
||||
- **ShahidShow**
|
||||
- **Shared**: shared.sx
|
||||
@ -776,7 +772,7 @@
|
||||
- **Sport5**
|
||||
- **SportBoxEmbed**
|
||||
- **SportDeutschland**
|
||||
- **SpringboardPlatform**
|
||||
- **Sportschau**
|
||||
- **Sprout**
|
||||
- **sr:mediathek**: Saarländischer Rundfunk
|
||||
- **SRGSSR**
|
||||
@ -825,6 +821,7 @@
|
||||
- **ThePlatform**
|
||||
- **ThePlatformFeed**
|
||||
- **TheScene**
|
||||
- **TheSixtyOne**
|
||||
- **TheStar**
|
||||
- **TheSun**
|
||||
- **TheWeatherChannel**
|
||||
@ -1004,14 +1001,10 @@
|
||||
- **WatchIndianPorn**: Watch Indian Porn
|
||||
- **WDR**
|
||||
- **wdr:mobile**
|
||||
- **WDRElefant**
|
||||
- **WDRPage**
|
||||
- **Webcaster**
|
||||
- **WebcasterFeed**
|
||||
- **WebOfStories**
|
||||
- **WebOfStoriesPlaylist**
|
||||
- **Weibo**
|
||||
- **WeiboMobile**
|
||||
- **WeiqiTV**: WQTV
|
||||
- **wholecloud**: WholeCloud
|
||||
- **Wimp**
|
||||
@ -1031,8 +1024,6 @@
|
||||
- **xiami:artist**: 虾米音乐 - 歌手
|
||||
- **xiami:collection**: 虾米音乐 - 精选集
|
||||
- **xiami:song**: 虾米音乐
|
||||
- **ximalaya**: 喜马拉雅FM
|
||||
- **ximalaya:album**: 喜马拉雅FM 专辑
|
||||
- **XMinus**
|
||||
- **XNXX**
|
||||
- **Xstream**
|
||||
|
@ -3,4 +3,4 @@ universal = True
|
||||
|
||||
[flake8]
|
||||
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
|
||||
ignore = E402,E501,E731,E741
|
||||
ignore = E402,E501,E731
|
||||
|
@ -92,8 +92,8 @@ class TestDownload(unittest.TestCase):
|
||||
def generator(test_case, tname):
|
||||
|
||||
def test_template(self):
|
||||
ie = youtube_dl.extractor.get_info_extractor(test_case['name'])()
|
||||
other_ies = [get_info_extractor(ie_key)() for ie_key in test_case.get('add_ie', [])]
|
||||
ie = youtube_dl.extractor.get_info_extractor(test_case['name'])
|
||||
other_ies = [get_info_extractor(ie_key) for ie_key in test_case.get('add_ie', [])]
|
||||
is_playlist = any(k.startswith('playlist') for k in test_case)
|
||||
test_cases = test_case.get(
|
||||
'playlist', [] if is_playlist else [test_case])
|
||||
|
@ -1,125 +0,0 @@
|
||||
#!/usr/bin/env python
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
# Allow direct execution
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import unittest
|
||||
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
||||
|
||||
from test.helper import try_rm
|
||||
from youtube_dl import YoutubeDL
|
||||
from youtube_dl.compat import compat_http_server
|
||||
from youtube_dl.downloader.http import HttpFD
|
||||
from youtube_dl.utils import encodeFilename
|
||||
import ssl
|
||||
import threading
|
||||
|
||||
TEST_DIR = os.path.dirname(os.path.abspath(__file__))
|
||||
|
||||
|
||||
def http_server_port(httpd):
|
||||
if os.name == 'java' and isinstance(httpd.socket, ssl.SSLSocket):
|
||||
# In Jython SSLSocket is not a subclass of socket.socket
|
||||
sock = httpd.socket.sock
|
||||
else:
|
||||
sock = httpd.socket
|
||||
return sock.getsockname()[1]
|
||||
|
||||
|
||||
TEST_SIZE = 10 * 1024
|
||||
|
||||
|
||||
class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
|
||||
def log_message(self, format, *args):
|
||||
pass
|
||||
|
||||
def send_content_range(self, total=None):
|
||||
range_header = self.headers.get('Range')
|
||||
start = end = None
|
||||
if range_header:
|
||||
mobj = re.search(r'^bytes=(\d+)-(\d+)', range_header)
|
||||
if mobj:
|
||||
start = int(mobj.group(1))
|
||||
end = int(mobj.group(2))
|
||||
valid_range = start is not None and end is not None
|
||||
if valid_range:
|
||||
content_range = 'bytes %d-%d' % (start, end)
|
||||
if total:
|
||||
content_range += '/%d' % total
|
||||
self.send_header('Content-Range', content_range)
|
||||
return (end - start + 1) if valid_range else total
|
||||
|
||||
def serve(self, range=True, content_length=True):
|
||||
self.send_response(200)
|
||||
self.send_header('Content-Type', 'video/mp4')
|
||||
size = TEST_SIZE
|
||||
if range:
|
||||
size = self.send_content_range(TEST_SIZE)
|
||||
if content_length:
|
||||
self.send_header('Content-Length', size)
|
||||
self.end_headers()
|
||||
self.wfile.write(b'#' * size)
|
||||
|
||||
def do_GET(self):
|
||||
if self.path == '/regular':
|
||||
self.serve()
|
||||
elif self.path == '/no-content-length':
|
||||
self.serve(content_length=False)
|
||||
elif self.path == '/no-range':
|
||||
self.serve(range=False)
|
||||
elif self.path == '/no-range-no-content-length':
|
||||
self.serve(range=False, content_length=False)
|
||||
else:
|
||||
assert False
|
||||
|
||||
|
||||
class FakeLogger(object):
|
||||
def debug(self, msg):
|
||||
pass
|
||||
|
||||
def warning(self, msg):
|
||||
pass
|
||||
|
||||
def error(self, msg):
|
||||
pass
|
||||
|
||||
|
||||
class TestHttpFD(unittest.TestCase):
|
||||
def setUp(self):
|
||||
self.httpd = compat_http_server.HTTPServer(
|
||||
('127.0.0.1', 0), HTTPTestRequestHandler)
|
||||
self.port = http_server_port(self.httpd)
|
||||
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
|
||||
self.server_thread.daemon = True
|
||||
self.server_thread.start()
|
||||
|
||||
def download(self, params, ep):
|
||||
params['logger'] = FakeLogger()
|
||||
ydl = YoutubeDL(params)
|
||||
downloader = HttpFD(ydl, params)
|
||||
filename = 'testfile.mp4'
|
||||
try_rm(encodeFilename(filename))
|
||||
self.assertTrue(downloader.real_download(filename, {
|
||||
'url': 'http://127.0.0.1:%d/%s' % (self.port, ep),
|
||||
}))
|
||||
self.assertEqual(os.path.getsize(encodeFilename(filename)), TEST_SIZE)
|
||||
try_rm(encodeFilename(filename))
|
||||
|
||||
def download_all(self, params):
|
||||
for ep in ('regular', 'no-content-length', 'no-range', 'no-range-no-content-length'):
|
||||
self.download(params, ep)
|
||||
|
||||
def test_regular(self):
|
||||
self.download_all({})
|
||||
|
||||
def test_chunked(self):
|
||||
self.download_all({
|
||||
'http_chunk_size': 1000,
|
||||
})
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
@ -47,7 +47,7 @@ class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
|
||||
self.end_headers()
|
||||
return
|
||||
|
||||
new_url = 'http://127.0.0.1:%d/中文.html' % http_server_port(self.server)
|
||||
new_url = 'http://localhost:%d/中文.html' % http_server_port(self.server)
|
||||
self.send_response(302)
|
||||
self.send_header(b'Location', new_url.encode('utf-8'))
|
||||
self.end_headers()
|
||||
@ -74,7 +74,7 @@ class FakeLogger(object):
|
||||
class TestHTTP(unittest.TestCase):
|
||||
def setUp(self):
|
||||
self.httpd = compat_http_server.HTTPServer(
|
||||
('127.0.0.1', 0), HTTPTestRequestHandler)
|
||||
('localhost', 0), HTTPTestRequestHandler)
|
||||
self.port = http_server_port(self.httpd)
|
||||
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
|
||||
self.server_thread.daemon = True
|
||||
@ -86,15 +86,15 @@ class TestHTTP(unittest.TestCase):
|
||||
return
|
||||
|
||||
ydl = YoutubeDL({'logger': FakeLogger()})
|
||||
r = ydl.extract_info('http://127.0.0.1:%d/302' % self.port)
|
||||
self.assertEqual(r['entries'][0]['url'], 'http://127.0.0.1:%d/vid.mp4' % self.port)
|
||||
r = ydl.extract_info('http://localhost:%d/302' % self.port)
|
||||
self.assertEqual(r['entries'][0]['url'], 'http://localhost:%d/vid.mp4' % self.port)
|
||||
|
||||
|
||||
class TestHTTPS(unittest.TestCase):
|
||||
def setUp(self):
|
||||
certfn = os.path.join(TEST_DIR, 'testcert.pem')
|
||||
self.httpd = compat_http_server.HTTPServer(
|
||||
('127.0.0.1', 0), HTTPTestRequestHandler)
|
||||
('localhost', 0), HTTPTestRequestHandler)
|
||||
self.httpd.socket = ssl.wrap_socket(
|
||||
self.httpd.socket, certfile=certfn, server_side=True)
|
||||
self.port = http_server_port(self.httpd)
|
||||
@ -107,11 +107,11 @@ class TestHTTPS(unittest.TestCase):
|
||||
ydl = YoutubeDL({'logger': FakeLogger()})
|
||||
self.assertRaises(
|
||||
Exception,
|
||||
ydl.extract_info, 'https://127.0.0.1:%d/video.html' % self.port)
|
||||
ydl.extract_info, 'https://localhost:%d/video.html' % self.port)
|
||||
|
||||
ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True})
|
||||
r = ydl.extract_info('https://127.0.0.1:%d/video.html' % self.port)
|
||||
self.assertEqual(r['entries'][0]['url'], 'https://127.0.0.1:%d/vid.mp4' % self.port)
|
||||
r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
|
||||
self.assertEqual(r['entries'][0]['url'], 'https://localhost:%d/vid.mp4' % self.port)
|
||||
|
||||
|
||||
def _build_proxy_handler(name):
|
||||
@ -132,23 +132,23 @@ def _build_proxy_handler(name):
|
||||
class TestProxy(unittest.TestCase):
|
||||
def setUp(self):
|
||||
self.proxy = compat_http_server.HTTPServer(
|
||||
('127.0.0.1', 0), _build_proxy_handler('normal'))
|
||||
('localhost', 0), _build_proxy_handler('normal'))
|
||||
self.port = http_server_port(self.proxy)
|
||||
self.proxy_thread = threading.Thread(target=self.proxy.serve_forever)
|
||||
self.proxy_thread.daemon = True
|
||||
self.proxy_thread.start()
|
||||
|
||||
self.geo_proxy = compat_http_server.HTTPServer(
|
||||
('127.0.0.1', 0), _build_proxy_handler('geo'))
|
||||
('localhost', 0), _build_proxy_handler('geo'))
|
||||
self.geo_port = http_server_port(self.geo_proxy)
|
||||
self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever)
|
||||
self.geo_proxy_thread.daemon = True
|
||||
self.geo_proxy_thread.start()
|
||||
|
||||
def test_proxy(self):
|
||||
geo_proxy = '127.0.0.1:{0}'.format(self.geo_port)
|
||||
geo_proxy = 'localhost:{0}'.format(self.geo_port)
|
||||
ydl = YoutubeDL({
|
||||
'proxy': '127.0.0.1:{0}'.format(self.port),
|
||||
'proxy': 'localhost:{0}'.format(self.port),
|
||||
'geo_verification_proxy': geo_proxy,
|
||||
})
|
||||
url = 'http://foo.com/bar'
|
||||
@ -162,7 +162,7 @@ class TestProxy(unittest.TestCase):
|
||||
|
||||
def test_proxy_with_idn(self):
|
||||
ydl = YoutubeDL({
|
||||
'proxy': '127.0.0.1:{0}'.format(self.port),
|
||||
'proxy': 'localhost:{0}'.format(self.port),
|
||||
})
|
||||
url = 'http://中文.tw/'
|
||||
response = ydl.urlopen(url).read().decode('utf-8')
|
||||
|
@ -814,9 +814,6 @@ class TestUtil(unittest.TestCase):
|
||||
inp = '''{"duration": "00:01:07"}'''
|
||||
self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''')
|
||||
|
||||
inp = '''{segments: [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}'''
|
||||
self.assertEqual(js_to_json(inp), '''{"segments": [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}''')
|
||||
|
||||
def test_js_to_json_edgecases(self):
|
||||
on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
|
||||
self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
|
||||
@ -888,13 +885,6 @@ class TestUtil(unittest.TestCase):
|
||||
on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}')
|
||||
self.assertEqual(json.loads(on), {'42': 42})
|
||||
|
||||
on = js_to_json('{42:4.2e1}')
|
||||
self.assertEqual(json.loads(on), {'42': 42.0})
|
||||
|
||||
def test_js_to_json_malformed(self):
|
||||
self.assertEqual(js_to_json('42a1'), '42"a1"')
|
||||
self.assertEqual(js_to_json('42a-1'), '42"a"-1')
|
||||
|
||||
def test_extract_attributes(self):
|
||||
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
|
||||
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
|
||||
|
@ -298,8 +298,7 @@ class YoutubeDL(object):
|
||||
the downloader (see youtube_dl/downloader/common.py):
|
||||
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
|
||||
noresizebuffer, retries, continuedl, noprogress, consoletitle,
|
||||
xattr_set_filesize, external_downloader_args, hls_use_mpegts,
|
||||
http_chunk_size.
|
||||
xattr_set_filesize, external_downloader_args, hls_use_mpegts.
|
||||
|
||||
The following options are used by the post processors:
|
||||
prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
|
||||
|
@ -191,11 +191,6 @@ def _real_main(argv=None):
|
||||
if numeric_buffersize is None:
|
||||
parser.error('invalid buffer size specified')
|
||||
opts.buffersize = numeric_buffersize
|
||||
if opts.http_chunk_size is not None:
|
||||
numeric_chunksize = FileDownloader.parse_bytes(opts.http_chunk_size)
|
||||
if not numeric_chunksize:
|
||||
parser.error('invalid http chunk size specified')
|
||||
opts.http_chunk_size = numeric_chunksize
|
||||
if opts.playliststart <= 0:
|
||||
raise ValueError('Playlist start must be positive')
|
||||
if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart:
|
||||
@ -351,7 +346,6 @@ def _real_main(argv=None):
|
||||
'keep_fragments': opts.keep_fragments,
|
||||
'buffersize': opts.buffersize,
|
||||
'noresizebuffer': opts.noresizebuffer,
|
||||
'http_chunk_size': opts.http_chunk_size,
|
||||
'continuedl': opts.continue_dl,
|
||||
'noprogress': opts.noprogress,
|
||||
'progress_with_newline': opts.progress_with_newline,
|
||||
|
@ -1,8 +1,8 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
from math import ceil
|
||||
|
||||
from .compat import compat_b64decode
|
||||
from .utils import bytes_to_intlist, intlist_to_bytes
|
||||
|
||||
BLOCK_SIZE_BYTES = 16
|
||||
@ -180,7 +180,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
|
||||
"""
|
||||
NONCE_LENGTH_BYTES = 8
|
||||
|
||||
data = bytes_to_intlist(compat_b64decode(data))
|
||||
data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
|
||||
password = bytes_to_intlist(password.encode('utf-8'))
|
||||
|
||||
key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))
|
||||
|
@ -1,7 +1,6 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import binascii
|
||||
import collections
|
||||
import ctypes
|
||||
@ -2897,24 +2896,9 @@ except TypeError:
|
||||
if isinstance(spec, compat_str):
|
||||
spec = spec.encode('ascii')
|
||||
return struct.unpack(spec, *args)
|
||||
|
||||
class compat_Struct(struct.Struct):
|
||||
def __init__(self, fmt):
|
||||
if isinstance(fmt, compat_str):
|
||||
fmt = fmt.encode('ascii')
|
||||
super(compat_Struct, self).__init__(fmt)
|
||||
else:
|
||||
compat_struct_pack = struct.pack
|
||||
compat_struct_unpack = struct.unpack
|
||||
if platform.python_implementation() == 'IronPython' and sys.version_info < (2, 7, 8):
|
||||
class compat_Struct(struct.Struct):
|
||||
def unpack(self, string):
|
||||
if not isinstance(string, buffer): # noqa: F821
|
||||
string = buffer(string) # noqa: F821
|
||||
return super(compat_Struct, self).unpack(string)
|
||||
else:
|
||||
compat_Struct = struct.Struct
|
||||
|
||||
|
||||
try:
|
||||
from future_builtins import zip as compat_zip
|
||||
@ -2924,16 +2908,6 @@ except ImportError: # not 2.6+ or is 3.x
|
||||
except ImportError:
|
||||
compat_zip = zip
|
||||
|
||||
|
||||
if sys.version_info < (3, 3):
|
||||
def compat_b64decode(s, *args, **kwargs):
|
||||
if isinstance(s, compat_str):
|
||||
s = s.encode('ascii')
|
||||
return base64.b64decode(s, *args, **kwargs)
|
||||
else:
|
||||
compat_b64decode = base64.b64decode
|
||||
|
||||
|
||||
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
|
||||
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function
|
||||
# names, see the original PyPy issue [1] and the youtube-dl one [2].
|
||||
@ -2956,8 +2930,6 @@ __all__ = [
|
||||
'compat_HTMLParseError',
|
||||
'compat_HTMLParser',
|
||||
'compat_HTTPError',
|
||||
'compat_Struct',
|
||||
'compat_b64decode',
|
||||
'compat_basestring',
|
||||
'compat_chr',
|
||||
'compat_cookiejar',
|
||||
|
@ -49,9 +49,6 @@ class FileDownloader(object):
|
||||
external_downloader_args: A list of additional command-line arguments for the
|
||||
external downloader.
|
||||
hls_use_mpegts: Use the mpegts container for HLS videos.
|
||||
http_chunk_size: Size of a chunk for chunk-based HTTP downloading.May be
|
||||
useful for bypassing bandwidth throttling imposed by
|
||||
a webserver (experimental)
|
||||
|
||||
Subclasses of this one must re-define the real_download method.
|
||||
"""
|
||||
|
@ -1,12 +1,12 @@
|
||||
from __future__ import division, unicode_literals
|
||||
|
||||
import base64
|
||||
import io
|
||||
import itertools
|
||||
import time
|
||||
|
||||
from .fragment import FragmentFD
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_etree_fromstring,
|
||||
compat_urlparse,
|
||||
compat_urllib_error,
|
||||
@ -312,7 +312,7 @@ class F4mFD(FragmentFD):
|
||||
boot_info = self._get_bootstrap_from_url(bootstrap_url)
|
||||
else:
|
||||
bootstrap_url = None
|
||||
bootstrap = compat_b64decode(node.text)
|
||||
bootstrap = base64.b64decode(node.text.encode('ascii'))
|
||||
boot_info = read_bootstrap_info(bootstrap)
|
||||
return boot_info, bootstrap_url
|
||||
|
||||
@ -349,7 +349,7 @@ class F4mFD(FragmentFD):
|
||||
live = boot_info['live']
|
||||
metadata_node = media.find(_add_ns('metadata'))
|
||||
if metadata_node is not None:
|
||||
metadata = compat_b64decode(metadata_node.text)
|
||||
metadata = base64.b64decode(metadata_node.text.encode('ascii'))
|
||||
else:
|
||||
metadata = None
|
||||
|
||||
|
@ -4,18 +4,13 @@ import errno
|
||||
import os
|
||||
import socket
|
||||
import time
|
||||
import random
|
||||
import re
|
||||
|
||||
from .common import FileDownloader
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
compat_urllib_error,
|
||||
)
|
||||
from ..compat import compat_urllib_error
|
||||
from ..utils import (
|
||||
ContentTooShortError,
|
||||
encodeFilename,
|
||||
int_or_none,
|
||||
sanitize_open,
|
||||
sanitized_Request,
|
||||
write_xattr,
|
||||
@ -43,26 +38,21 @@ class HttpFD(FileDownloader):
|
||||
add_headers = info_dict.get('http_headers')
|
||||
if add_headers:
|
||||
headers.update(add_headers)
|
||||
basic_request = sanitized_Request(url, None, headers)
|
||||
request = sanitized_Request(url, None, headers)
|
||||
|
||||
is_test = self.params.get('test', False)
|
||||
chunk_size = self._TEST_FILE_SIZE if is_test else (
|
||||
info_dict.get('downloader_options', {}).get('http_chunk_size') or
|
||||
self.params.get('http_chunk_size') or 0)
|
||||
|
||||
if is_test:
|
||||
request.add_header('Range', 'bytes=0-%s' % str(self._TEST_FILE_SIZE - 1))
|
||||
|
||||
ctx.open_mode = 'wb'
|
||||
ctx.resume_len = 0
|
||||
ctx.data_len = None
|
||||
ctx.block_size = self.params.get('buffersize', 1024)
|
||||
ctx.start_time = time.time()
|
||||
ctx.chunk_size = None
|
||||
|
||||
if self.params.get('continuedl', True):
|
||||
# Establish possible resume length
|
||||
if os.path.isfile(encodeFilename(ctx.tmpfilename)):
|
||||
ctx.resume_len = os.path.getsize(
|
||||
encodeFilename(ctx.tmpfilename))
|
||||
|
||||
ctx.is_resume = ctx.resume_len > 0
|
||||
ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename))
|
||||
|
||||
count = 0
|
||||
retries = self.params.get('retries', 0)
|
||||
@ -74,36 +64,11 @@ class HttpFD(FileDownloader):
|
||||
def __init__(self, source_error):
|
||||
self.source_error = source_error
|
||||
|
||||
class NextFragment(Exception):
|
||||
pass
|
||||
|
||||
def set_range(req, start, end):
|
||||
range_header = 'bytes=%d-' % start
|
||||
if end:
|
||||
range_header += compat_str(end)
|
||||
req.add_header('Range', range_header)
|
||||
|
||||
def establish_connection():
|
||||
ctx.chunk_size = (random.randint(int(chunk_size * 0.95), chunk_size)
|
||||
if not is_test and chunk_size else chunk_size)
|
||||
if ctx.resume_len > 0:
|
||||
range_start = ctx.resume_len
|
||||
if ctx.is_resume:
|
||||
if ctx.resume_len != 0:
|
||||
self.report_resuming_byte(ctx.resume_len)
|
||||
request.add_header('Range', 'bytes=%d-' % ctx.resume_len)
|
||||
ctx.open_mode = 'ab'
|
||||
elif ctx.chunk_size > 0:
|
||||
range_start = 0
|
||||
else:
|
||||
range_start = None
|
||||
ctx.is_resume = False
|
||||
range_end = range_start + ctx.chunk_size - 1 if ctx.chunk_size else None
|
||||
if range_end and ctx.data_len is not None and range_end >= ctx.data_len:
|
||||
range_end = ctx.data_len - 1
|
||||
has_range = range_start is not None
|
||||
ctx.has_range = has_range
|
||||
request = sanitized_Request(url, None, headers)
|
||||
if has_range:
|
||||
set_range(request, range_start, range_end)
|
||||
# Establish connection
|
||||
try:
|
||||
ctx.data = self.ydl.urlopen(request)
|
||||
@ -112,24 +77,12 @@ class HttpFD(FileDownloader):
|
||||
# that don't support resuming and serve a whole file with no Content-Range
|
||||
# set in response despite of requested Range (see
|
||||
# https://github.com/rg3/youtube-dl/issues/6057#issuecomment-126129799)
|
||||
if has_range:
|
||||
if ctx.resume_len > 0:
|
||||
content_range = ctx.data.headers.get('Content-Range')
|
||||
if content_range:
|
||||
content_range_m = re.search(r'bytes (\d+)-(\d+)?(?:/(\d+))?', content_range)
|
||||
content_range_m = re.search(r'bytes (\d+)-', content_range)
|
||||
# Content-Range is present and matches requested Range, resume is possible
|
||||
if content_range_m:
|
||||
if range_start == int(content_range_m.group(1)):
|
||||
content_range_end = int_or_none(content_range_m.group(2))
|
||||
content_len = int_or_none(content_range_m.group(3))
|
||||
accept_content_len = (
|
||||
# Non-chunked download
|
||||
not ctx.chunk_size or
|
||||
# Chunked download and requested piece or
|
||||
# its part is promised to be served
|
||||
content_range_end == range_end or
|
||||
content_len < range_end)
|
||||
if accept_content_len:
|
||||
ctx.data_len = content_len
|
||||
if content_range_m and ctx.resume_len == int(content_range_m.group(1)):
|
||||
return
|
||||
# Content-Range is either not present or invalid. Assuming remote webserver is
|
||||
# trying to send the whole file, resume is not possible, so wiping the local file
|
||||
@ -137,15 +90,16 @@ class HttpFD(FileDownloader):
|
||||
self.report_unable_to_resume()
|
||||
ctx.resume_len = 0
|
||||
ctx.open_mode = 'wb'
|
||||
ctx.data_len = int_or_none(ctx.data.info().get('Content-length', None))
|
||||
return
|
||||
except (compat_urllib_error.HTTPError, ) as err:
|
||||
if err.code == 416:
|
||||
if (err.code < 500 or err.code >= 600) and err.code != 416:
|
||||
# Unexpected HTTP error
|
||||
raise
|
||||
elif err.code == 416:
|
||||
# Unable to resume (requested range not satisfiable)
|
||||
try:
|
||||
# Open the connection again without the range header
|
||||
ctx.data = self.ydl.urlopen(
|
||||
sanitized_Request(url, None, headers))
|
||||
ctx.data = self.ydl.urlopen(basic_request)
|
||||
content_length = ctx.data.info()['Content-Length']
|
||||
except (compat_urllib_error.HTTPError, ) as err:
|
||||
if err.code < 500 or err.code >= 600:
|
||||
@ -176,9 +130,6 @@ class HttpFD(FileDownloader):
|
||||
ctx.resume_len = 0
|
||||
ctx.open_mode = 'wb'
|
||||
return
|
||||
elif err.code < 500 or err.code >= 600:
|
||||
# Unexpected HTTP error
|
||||
raise
|
||||
raise RetryDownload(err)
|
||||
except socket.error as err:
|
||||
if err.errno != errno.ECONNRESET:
|
||||
@ -209,7 +160,7 @@ class HttpFD(FileDownloader):
|
||||
return False
|
||||
|
||||
byte_counter = 0 + ctx.resume_len
|
||||
block_size = ctx.block_size
|
||||
block_size = self.params.get('buffersize', 1024)
|
||||
start = time.time()
|
||||
|
||||
# measure time over whole while-loop, so slow_down() and best_block_size() work together properly
|
||||
@ -282,30 +233,25 @@ class HttpFD(FileDownloader):
|
||||
|
||||
# Progress message
|
||||
speed = self.calc_speed(start, now, byte_counter - ctx.resume_len)
|
||||
if ctx.data_len is None:
|
||||
if data_len is None:
|
||||
eta = None
|
||||
else:
|
||||
eta = self.calc_eta(start, time.time(), ctx.data_len - ctx.resume_len, byte_counter - ctx.resume_len)
|
||||
eta = self.calc_eta(start, time.time(), data_len - ctx.resume_len, byte_counter - ctx.resume_len)
|
||||
|
||||
self._hook_progress({
|
||||
'status': 'downloading',
|
||||
'downloaded_bytes': byte_counter,
|
||||
'total_bytes': ctx.data_len,
|
||||
'total_bytes': data_len,
|
||||
'tmpfilename': ctx.tmpfilename,
|
||||
'filename': ctx.filename,
|
||||
'eta': eta,
|
||||
'speed': speed,
|
||||
'elapsed': now - ctx.start_time,
|
||||
'elapsed': now - start,
|
||||
})
|
||||
|
||||
if is_test and byte_counter == data_len:
|
||||
break
|
||||
|
||||
if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:
|
||||
ctx.resume_len = byte_counter
|
||||
# ctx.block_size = block_size
|
||||
raise NextFragment()
|
||||
|
||||
if ctx.stream is None:
|
||||
self.to_stderr('\n')
|
||||
self.report_error('Did not get any data blocks')
|
||||
@ -330,7 +276,7 @@ class HttpFD(FileDownloader):
|
||||
'total_bytes': byte_counter,
|
||||
'filename': ctx.filename,
|
||||
'status': 'finished',
|
||||
'elapsed': time.time() - ctx.start_time,
|
||||
'elapsed': time.time() - start,
|
||||
})
|
||||
|
||||
return True
|
||||
@ -344,8 +290,6 @@ class HttpFD(FileDownloader):
|
||||
if count <= retries:
|
||||
self.report_retry(e.source_error, count, retries)
|
||||
continue
|
||||
except NextFragment:
|
||||
continue
|
||||
except SucceedDownload:
|
||||
return True
|
||||
|
||||
|
@ -1,27 +1,25 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import time
|
||||
import struct
|
||||
import binascii
|
||||
import io
|
||||
|
||||
from .fragment import FragmentFD
|
||||
from ..compat import (
|
||||
compat_Struct,
|
||||
compat_urllib_error,
|
||||
)
|
||||
from ..compat import compat_urllib_error
|
||||
|
||||
|
||||
u8 = compat_Struct('>B')
|
||||
u88 = compat_Struct('>Bx')
|
||||
u16 = compat_Struct('>H')
|
||||
u1616 = compat_Struct('>Hxx')
|
||||
u32 = compat_Struct('>I')
|
||||
u64 = compat_Struct('>Q')
|
||||
u8 = struct.Struct(b'>B')
|
||||
u88 = struct.Struct(b'>Bx')
|
||||
u16 = struct.Struct(b'>H')
|
||||
u1616 = struct.Struct(b'>Hxx')
|
||||
u32 = struct.Struct(b'>I')
|
||||
u64 = struct.Struct(b'>Q')
|
||||
|
||||
s88 = compat_Struct('>bx')
|
||||
s16 = compat_Struct('>h')
|
||||
s1616 = compat_Struct('>hxx')
|
||||
s32 = compat_Struct('>i')
|
||||
s88 = struct.Struct(b'>bx')
|
||||
s16 = struct.Struct(b'>h')
|
||||
s1616 = struct.Struct(b'>hxx')
|
||||
s32 = struct.Struct(b'>i')
|
||||
|
||||
unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000)
|
||||
|
||||
@ -141,7 +139,7 @@ def write_piff_header(stream, params):
|
||||
sample_entry_payload += u16.pack(0x18) # depth
|
||||
sample_entry_payload += s16.pack(-1) # pre defined
|
||||
|
||||
codec_private_data = binascii.unhexlify(params['codec_private_data'].encode('utf-8'))
|
||||
codec_private_data = binascii.unhexlify(params['codec_private_data'])
|
||||
if fourcc in ('H264', 'AVC1'):
|
||||
sps, pps = codec_private_data.split(u32.pack(1))[1:]
|
||||
avcc_payload = u8.pack(1) # configuration version
|
||||
|
@ -1,15 +1,13 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import json
|
||||
import os
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..aes import aes_cbc_decrypt
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_ord,
|
||||
)
|
||||
from ..compat import compat_ord
|
||||
from ..utils import (
|
||||
bytes_to_intlist,
|
||||
ExtractorError,
|
||||
@ -50,9 +48,9 @@ class ADNIE(InfoExtractor):
|
||||
|
||||
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
|
||||
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
|
||||
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
|
||||
bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
|
||||
bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
|
||||
bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
|
||||
bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
|
||||
))
|
||||
subtitles_json = self._parse_json(
|
||||
dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(),
|
||||
|
@ -177,9 +177,7 @@ class AfreecaTVIE(InfoExtractor):
|
||||
|
||||
video_xml = self._download_xml(
|
||||
'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
|
||||
video_id, headers={
|
||||
'Referer': 'http://vod.afreecatv.com/embed.php',
|
||||
}, query={
|
||||
video_id, query={
|
||||
'nTitleNo': video_id,
|
||||
'partialView': 'SKIP_ADULT',
|
||||
})
|
||||
|
@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class AMCNetworksIE(ThePlatformIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
|
||||
'md5': '',
|
||||
@ -51,9 +51,6 @@ class AMCNetworksIE(ThePlatformIE):
|
||||
}, {
|
||||
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
@ -1,13 +1,11 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_urllib_parse_unquote,
|
||||
)
|
||||
from ..compat import compat_urllib_parse_unquote
|
||||
|
||||
|
||||
class BigflixIE(InfoExtractor):
|
||||
@ -41,8 +39,8 @@ class BigflixIE(InfoExtractor):
|
||||
webpage, 'title')
|
||||
|
||||
def decode_url(quoted_b64_url):
|
||||
return compat_b64decode(compat_urllib_parse_unquote(
|
||||
quoted_b64_url)).decode('utf-8')
|
||||
return base64.b64decode(compat_urllib_parse_unquote(
|
||||
quoted_b64_url).encode('ascii')).decode('utf-8')
|
||||
|
||||
formats = []
|
||||
for height, encoded_url in re.findall(
|
||||
|
@ -102,7 +102,6 @@ class BiliBiliIE(InfoExtractor):
|
||||
video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
|
||||
headers = {
|
||||
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
|
||||
'Referer': url
|
||||
}
|
||||
headers.update(self.geo_verification_headers())
|
||||
|
||||
@ -117,15 +116,10 @@ class BiliBiliIE(InfoExtractor):
|
||||
payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
|
||||
sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest()
|
||||
|
||||
headers = {
|
||||
'Referer': url
|
||||
}
|
||||
headers.update(self.geo_verification_headers())
|
||||
|
||||
video_info = self._download_json(
|
||||
'http://interface.bilibili.com/playurl?%s&sign=%s' % (payload, sign),
|
||||
video_id, note='Downloading video info page',
|
||||
headers=headers)
|
||||
headers=self.geo_verification_headers())
|
||||
|
||||
if 'durl' not in video_info:
|
||||
self._report_error(video_info)
|
||||
|
@ -690,17 +690,10 @@ class BrightcoveNewIE(AdobePassIE):
|
||||
webpage, 'policy key', group='pk')
|
||||
|
||||
api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id)
|
||||
headers = {
|
||||
'Accept': 'application/json;pk=%s' % policy_key,
|
||||
}
|
||||
referrer = smuggled_data.get('referrer')
|
||||
if referrer:
|
||||
headers.update({
|
||||
'Referer': referrer,
|
||||
'Origin': re.search(r'https?://[^/]+', referrer).group(0),
|
||||
})
|
||||
try:
|
||||
json_data = self._download_json(api_url, video_id, headers=headers)
|
||||
json_data = self._download_json(api_url, video_id, headers={
|
||||
'Accept': 'application/json;pk=%s' % policy_key
|
||||
})
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
|
||||
json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
|
||||
|
@ -4,36 +4,59 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urllib_parse_urlparse
|
||||
from ..utils import (
|
||||
dict_get,
|
||||
# ExtractorError,
|
||||
# HEADRequest,
|
||||
int_or_none,
|
||||
qualities,
|
||||
remove_end,
|
||||
unified_strdate,
|
||||
)
|
||||
|
||||
|
||||
class CanalplusIE(InfoExtractor):
|
||||
IE_DESC = 'mycanal.fr and piwiplus.fr'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<site>mycanal|piwiplus)\.fr/(?:[^/]+/)*(?P<display_id>[^?/]+)(?:\.html\?.*\bvid=|/p/)(?P<id>\d+)'
|
||||
IE_DESC = 'canalplus.fr, piwiplus.fr and d8.tv'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
(?:
|
||||
(?:(?:www|m)\.)?canalplus\.fr|
|
||||
(?:www\.)?piwiplus\.fr|
|
||||
(?:www\.)?d8\.tv|
|
||||
(?:www\.)?c8\.fr|
|
||||
(?:www\.)?d17\.tv|
|
||||
(?:(?:football|www)\.)?cstar\.fr|
|
||||
(?:www\.)?itele\.fr
|
||||
)/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
|
||||
player\.canalplus\.fr/#/(?P<id>\d+)
|
||||
)
|
||||
|
||||
'''
|
||||
_VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s?format=json'
|
||||
_SITE_ID_MAP = {
|
||||
'mycanal': 'cplus',
|
||||
'canalplus': 'cplus',
|
||||
'piwiplus': 'teletoon',
|
||||
'd8': 'd8',
|
||||
'c8': 'd8',
|
||||
'd17': 'd17',
|
||||
'cstar': 'd17',
|
||||
'itele': 'itele',
|
||||
}
|
||||
|
||||
# Only works for direct mp4 URLs
|
||||
_GEO_COUNTRIES = ['FR']
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'https://www.mycanal.fr/d17-emissions/lolywood/p/1397061',
|
||||
'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1192814',
|
||||
'info_dict': {
|
||||
'id': '1397061',
|
||||
'display_id': 'lolywood',
|
||||
'id': '1405510',
|
||||
'display_id': 'pid1830-c-zapping',
|
||||
'ext': 'mp4',
|
||||
'title': 'Euro 2016 : Je préfère te prévenir - Lolywood - Episode 34',
|
||||
'description': 'md5:7d97039d455cb29cdba0d652a0efaa5e',
|
||||
'upload_date': '20160602',
|
||||
'title': 'Zapping - 02/07/2016',
|
||||
'description': 'Le meilleur de toutes les chaînes, tous les jours',
|
||||
'upload_date': '20160702',
|
||||
},
|
||||
}, {
|
||||
# geo restricted, bypassed
|
||||
@ -47,12 +70,64 @@ class CanalplusIE(InfoExtractor):
|
||||
'upload_date': '20140724',
|
||||
},
|
||||
'expected_warnings': ['HTTP Error 403: Forbidden'],
|
||||
}, {
|
||||
# geo restricted, bypassed
|
||||
'url': 'http://www.c8.fr/c8-divertissement/ms-touche-pas-a-mon-poste/pid6318-videos-integrales.html?vid=1443684',
|
||||
'md5': 'bb6f9f343296ab7ebd88c97b660ecf8d',
|
||||
'info_dict': {
|
||||
'id': '1443684',
|
||||
'display_id': 'pid6318-videos-integrales',
|
||||
'ext': 'mp4',
|
||||
'title': 'Guess my iep ! - TPMP - 07/04/2017',
|
||||
'description': 'md5:6f005933f6e06760a9236d9b3b5f17fa',
|
||||
'upload_date': '20170407',
|
||||
},
|
||||
'expected_warnings': ['HTTP Error 403: Forbidden'],
|
||||
}, {
|
||||
'url': 'http://www.itele.fr/chroniques/invite-michael-darmon/rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
|
||||
'info_dict': {
|
||||
'id': '1420176',
|
||||
'display_id': 'rachida-dati-nicolas-sarkozy-est-le-plus-en-phase-avec-les-inquietudes-des-francais-171510',
|
||||
'ext': 'mp4',
|
||||
'title': 'L\'invité de Michaël Darmon du 14/10/2016 - ',
|
||||
'description': 'Chaque matin du lundi au vendredi, Michaël Darmon reçoit un invité politique à 8h25.',
|
||||
'upload_date': '20161014',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://football.cstar.fr/cstar-minisite-foot/pid7566-feminines-videos.html?vid=1416769',
|
||||
'info_dict': {
|
||||
'id': '1416769',
|
||||
'display_id': 'pid7566-feminines-videos',
|
||||
'ext': 'mp4',
|
||||
'title': 'France - Albanie : les temps forts de la soirée - 20/09/2016',
|
||||
'description': 'md5:c3f30f2aaac294c1c969b3294de6904e',
|
||||
'upload_date': '20160921',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://m.canalplus.fr/?vid=1398231',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.d17.tv/emissions/pid8303-lolywood.html?vid=1397061',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
site, display_id, video_id = re.match(self._VALID_URL, url).groups()
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
|
||||
site_id = self._SITE_ID_MAP[site]
|
||||
site_id = self._SITE_ID_MAP[compat_urllib_parse_urlparse(url).netloc.rsplit('.', 2)[-2]]
|
||||
|
||||
# Beware, some subclasses do not define an id group
|
||||
display_id = remove_end(dict_get(mobj.groupdict(), ('display_id', 'id', 'vid')), '.html')
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_id = self._search_regex(
|
||||
[r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
|
||||
r'id=["\']canal_video_player(?P<id>\d+)',
|
||||
r'data-video=["\'](?P<id>\d+)'],
|
||||
webpage, 'video id', default=mobj.group('vid'), group='id')
|
||||
|
||||
info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)
|
||||
video_data = self._download_json(info_url, video_id, 'Downloading video JSON')
|
||||
@ -86,7 +161,7 @@ class CanalplusIE(InfoExtractor):
|
||||
format_url + '?hdcore=2.11.3', video_id, f4m_id=format_id, fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
# the secret extracted from ya function in http://player.canalplus.fr/common/js/canalPlayer.js
|
||||
# the secret extracted ya function in http://player.canalplus.fr/common/js/canalPlayer.js
|
||||
'url': format_url + '?secret=pqzerjlsmdkjfoiuerhsdlfknaes',
|
||||
'format_id': format_id,
|
||||
'preference': preference(format_id),
|
||||
|
@ -75,10 +75,10 @@ class CBSInteractiveIE(CBSIE):
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
data_json = self._html_search_regex(
|
||||
r"data(?:-(?:cnet|zdnet))?-video(?:-(?:uvp(?:js)?|player))?-options='([^']+)'",
|
||||
r"data-(?:cnet|zdnet)-video(?:-uvp(?:js)?)?-options='([^']+)'",
|
||||
webpage, 'data json')
|
||||
data = self._parse_json(data_json, display_id)
|
||||
vdata = data.get('video') or (data.get('videos') or data.get('playlist'))[0]
|
||||
vdata = data.get('video') or data['videos'][0]
|
||||
|
||||
video_id = vdata['mpxRefId']
|
||||
|
||||
|
@ -1,11 +1,11 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import base64
|
||||
import json
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .youtube import YoutubeIE
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
ExtractorError
|
||||
@ -58,7 +58,7 @@ class ChilloutzoneIE(InfoExtractor):
|
||||
|
||||
base64_video_info = self._html_search_regex(
|
||||
r'var cozVidData = "(.+?)";', webpage, 'video data')
|
||||
decoded_video_info = compat_b64decode(base64_video_info).decode('utf-8')
|
||||
decoded_video_info = base64.b64decode(base64_video_info.encode('utf-8')).decode('utf-8')
|
||||
video_info_dict = json.loads(decoded_video_info)
|
||||
|
||||
# get video information from dict
|
||||
|
@ -1,10 +1,10 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import parse_duration
|
||||
|
||||
|
||||
@ -44,7 +44,8 @@ class ChirbitIE(InfoExtractor):
|
||||
|
||||
# Reverse engineered from https://chirb.it/js/chirbit.player.js (look
|
||||
# for soundURL)
|
||||
audio_url = compat_b64decode(data_fd[::-1]).decode('utf-8')
|
||||
audio_url = base64.b64decode(
|
||||
data_fd[::-1].encode('ascii')).decode('utf-8')
|
||||
|
||||
title = self._search_regex(
|
||||
r'class=["\']chirbit-title["\'][^>]*>([^<]+)', webpage, 'title')
|
||||
|
@ -174,8 +174,6 @@ class InfoExtractor(object):
|
||||
width : height ratio as float.
|
||||
* no_resume The server does not support resuming the
|
||||
(HTTP or RTMP) download. Boolean.
|
||||
* downloader_options A dictionary of downloader options as
|
||||
described in FileDownloader
|
||||
|
||||
url: Final video URL.
|
||||
ext: Video filename extension.
|
||||
@ -1029,7 +1027,7 @@ class InfoExtractor(object):
|
||||
part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
|
||||
if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
|
||||
info['series'] = unescapeHTML(part_of_series.get('name'))
|
||||
elif item_type in ('Article', 'NewsArticle'):
|
||||
elif item_type == 'Article':
|
||||
info.update({
|
||||
'timestamp': parse_iso8601(e.get('datePublished')),
|
||||
'title': unescapeHTML(e.get('headline')),
|
||||
@ -2250,10 +2248,9 @@ class InfoExtractor(object):
|
||||
def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]):
|
||||
query = compat_urlparse.urlparse(url).query
|
||||
url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url)
|
||||
mobj = re.search(
|
||||
r'(?:(?:http|rtmp|rtsp)(?P<s>s)?:)?(?P<url>//[^?]+)', url)
|
||||
url_base = mobj.group('url')
|
||||
http_base_url = '%s%s:%s' % ('http', mobj.group('s') or '', url_base)
|
||||
url_base = self._search_regex(
|
||||
r'(?:(?:https?|rtmp|rtsp):)?(//[^?]+)', url, 'format url')
|
||||
http_base_url = '%s:%s' % ('http', url_base)
|
||||
formats = []
|
||||
|
||||
def manifest_url(manifest):
|
||||
@ -2407,7 +2404,7 @@ class InfoExtractor(object):
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
source_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id=m3u8_id, fatal=False))
|
||||
elif source_type == 'dash' or ext == 'mpd':
|
||||
elif ext == 'mpd':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
source_url, video_id, mpd_id=mpd_id, fatal=False))
|
||||
elif ext == 'smil':
|
||||
|
@ -3,13 +3,13 @@ from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import json
|
||||
import base64
|
||||
import zlib
|
||||
|
||||
from hashlib import sha1
|
||||
from math import pow, sqrt, floor
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_etree_fromstring,
|
||||
compat_urllib_parse_urlencode,
|
||||
compat_urllib_request,
|
||||
@ -272,8 +272,8 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
}
|
||||
|
||||
def _decrypt_subtitles(self, data, iv, id):
|
||||
data = bytes_to_intlist(compat_b64decode(data))
|
||||
iv = bytes_to_intlist(compat_b64decode(iv))
|
||||
data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
|
||||
iv = bytes_to_intlist(base64.b64decode(iv.encode('utf-8')))
|
||||
id = int(id)
|
||||
|
||||
def obfuscate_key_aux(count, modulo, start):
|
||||
|
@ -10,7 +10,6 @@ from ..aes import (
|
||||
aes_cbc_decrypt,
|
||||
aes_cbc_encrypt,
|
||||
)
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
bytes_to_intlist,
|
||||
bytes_to_long,
|
||||
@ -94,7 +93,7 @@ class DaisukiMottoIE(InfoExtractor):
|
||||
|
||||
rtn = self._parse_json(
|
||||
intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(
|
||||
compat_b64decode(encrypted_rtn)),
|
||||
base64.b64decode(encrypted_rtn)),
|
||||
aes_key, iv)).decode('utf-8').rstrip('\0'),
|
||||
video_id)
|
||||
|
||||
|
@ -1,56 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import js_to_json
|
||||
|
||||
|
||||
class DiggIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?digg\.com/video/(?P<id>[^/?#&]+)'
|
||||
_TESTS = [{
|
||||
# JWPlatform via provider
|
||||
'url': 'http://digg.com/video/sci-fi-short-jonah-daniel-kaluuya-get-out',
|
||||
'info_dict': {
|
||||
'id': 'LcqvmS0b',
|
||||
'ext': 'mp4',
|
||||
'title': "'Get Out' Star Daniel Kaluuya Goes On 'Moby Dick'-Like Journey In Sci-Fi Short 'Jonah'",
|
||||
'description': 'md5:541bb847648b6ee3d6514bc84b82efda',
|
||||
'upload_date': '20180109',
|
||||
'timestamp': 1515530551,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# Youtube via provider
|
||||
'url': 'http://digg.com/video/dog-boat-seal-play',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# vimeo as regular embed
|
||||
'url': 'http://digg.com/video/dream-girl-short-film',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
info = self._parse_json(
|
||||
self._search_regex(
|
||||
r'(?s)video_info\s*=\s*({.+?});\n', webpage, 'video info',
|
||||
default='{}'), display_id, transform_source=js_to_json,
|
||||
fatal=False)
|
||||
|
||||
video_id = info.get('video_id')
|
||||
|
||||
if video_id:
|
||||
provider = info.get('provider_name')
|
||||
if provider == 'youtube':
|
||||
return self.url_result(
|
||||
video_id, ie='Youtube', video_id=video_id)
|
||||
elif provider == 'jwplayer':
|
||||
return self.url_result(
|
||||
'jwplatform:%s' % video_id, ie='JWPlatform',
|
||||
video_id=video_id)
|
||||
|
||||
return self.url_result(url, 'Generic')
|
@ -12,28 +12,25 @@ from ..compat import (
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
remove_end,
|
||||
try_get,
|
||||
unified_strdate,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
USER_AGENTS,
|
||||
)
|
||||
|
||||
|
||||
class DPlayIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?P<domain>www\.(?P<host>dplay\.(?P<country>dk|se|no)))/(?:video(?:er|s)/)?(?P<id>[^/]+/[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?P<domain>www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
|
||||
|
||||
_TESTS = [{
|
||||
# non geo restricted, via secure api, unsigned download hls URL
|
||||
'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
|
||||
'info_dict': {
|
||||
'id': '3172',
|
||||
'display_id': 'nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet',
|
||||
'display_id': 'season-1-svensken-lar-sig-njuta-av-livet',
|
||||
'ext': 'mp4',
|
||||
'title': 'Svensken lär sig njuta av livet',
|
||||
'description': 'md5:d3819c9bccffd0fe458ca42451dd50d8',
|
||||
@ -51,7 +48,7 @@ class DPlayIE(InfoExtractor):
|
||||
'url': 'http://www.dplay.dk/mig-og-min-mor/season-6-episode-12/',
|
||||
'info_dict': {
|
||||
'id': '70816',
|
||||
'display_id': 'mig-og-min-mor/season-6-episode-12',
|
||||
'display_id': 'season-6-episode-12',
|
||||
'ext': 'mp4',
|
||||
'title': 'Episode 12',
|
||||
'description': 'md5:9c86e51a93f8a4401fc9641ef9894c90',
|
||||
@ -68,33 +65,6 @@ class DPlayIE(InfoExtractor):
|
||||
# geo restricted, via direct unsigned hls URL
|
||||
'url': 'http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# disco-api
|
||||
'url': 'https://www.dplay.no/videoer/i-kongens-klr/sesong-1-episode-7',
|
||||
'info_dict': {
|
||||
'id': '40206',
|
||||
'display_id': 'i-kongens-klr/sesong-1-episode-7',
|
||||
'ext': 'mp4',
|
||||
'title': 'Episode 7',
|
||||
'description': 'md5:e3e1411b2b9aebeea36a6ec5d50c60cf',
|
||||
'duration': 2611.16,
|
||||
'timestamp': 1516726800,
|
||||
'upload_date': '20180123',
|
||||
'series': 'I kongens klær',
|
||||
'season_number': 1,
|
||||
'episode_number': 7,
|
||||
},
|
||||
'params': {
|
||||
'format': 'bestvideo',
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
|
||||
'url': 'https://www.dplay.dk/videoer/singleliv/season-5-episode-3',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.dplay.se/videos/sofias-anglar/sofias-anglar-1001',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -102,81 +72,10 @@ class DPlayIE(InfoExtractor):
|
||||
display_id = mobj.group('id')
|
||||
domain = mobj.group('domain')
|
||||
|
||||
self._initialize_geo_bypass([mobj.group('country').upper()])
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
video_id = self._search_regex(
|
||||
r'data-video-id=["\'](\d+)', webpage, 'video id', default=None)
|
||||
|
||||
if not video_id:
|
||||
host = mobj.group('host')
|
||||
disco_base = 'https://disco-api.%s' % host
|
||||
self._download_json(
|
||||
'%s/token' % disco_base, display_id, 'Downloading token',
|
||||
query={
|
||||
'realm': host.replace('.', ''),
|
||||
})
|
||||
video = self._download_json(
|
||||
'%s/content/videos/%s' % (disco_base, display_id), display_id,
|
||||
headers={
|
||||
'Referer': url,
|
||||
'x-disco-client': 'WEB:UNKNOWN:dplay-client:0.0.1',
|
||||
}, query={
|
||||
'include': 'show'
|
||||
})
|
||||
video_id = video['data']['id']
|
||||
info = video['data']['attributes']
|
||||
title = info['name']
|
||||
formats = []
|
||||
for format_id, format_dict in self._download_json(
|
||||
'%s/playback/videoPlaybackInfo/%s' % (disco_base, video_id),
|
||||
display_id)['data']['attributes']['streaming'].items():
|
||||
if not isinstance(format_dict, dict):
|
||||
continue
|
||||
format_url = format_dict.get('url')
|
||||
if not format_url:
|
||||
continue
|
||||
ext = determine_ext(format_url)
|
||||
if format_id == 'dash' or ext == 'mpd':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
format_url, display_id, mpd_id='dash', fatal=False))
|
||||
elif format_id == 'hls' or ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
format_url, display_id, 'mp4',
|
||||
entry_protocol='m3u8_native', m3u8_id='hls',
|
||||
fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
'url': format_url,
|
||||
'format_id': format_id,
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
series = None
|
||||
try:
|
||||
included = video.get('included')
|
||||
if isinstance(included, list):
|
||||
show = next(e for e in included if e.get('type') == 'show')
|
||||
series = try_get(
|
||||
show, lambda x: x['attributes']['name'], compat_str)
|
||||
except StopIteration:
|
||||
pass
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': info.get('description'),
|
||||
'duration': float_or_none(
|
||||
info.get('videoDuration'), scale=1000),
|
||||
'timestamp': unified_timestamp(info.get('publishStart')),
|
||||
'series': series,
|
||||
'season_number': int_or_none(info.get('seasonNumber')),
|
||||
'episode_number': int_or_none(info.get('episodeNumber')),
|
||||
'age_limit': int_or_none(info.get('minimum_age')),
|
||||
'formats': formats,
|
||||
}
|
||||
r'data-video-id=["\'](\d+)', webpage, 'video id')
|
||||
|
||||
info = self._download_json(
|
||||
'http://%s/api/v2/ajax/videos?video_id=%s' % (domain, video_id),
|
||||
|
@ -1,10 +1,10 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
qualities,
|
||||
sanitized_Request,
|
||||
@ -42,7 +42,7 @@ class DumpertIE(InfoExtractor):
|
||||
r'data-files="([^"]+)"', webpage, 'data files')
|
||||
|
||||
files = self._parse_json(
|
||||
compat_b64decode(files_base64).decode('utf-8'),
|
||||
base64.b64decode(files_base64.encode('utf-8')).decode('utf-8'),
|
||||
video_id)
|
||||
|
||||
quality = qualities(['flv', 'mobile', 'tablet', '720p'])
|
||||
|
@ -1,13 +1,13 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import json
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_str,
|
||||
compat_urlparse,
|
||||
compat_str,
|
||||
)
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
@ -36,9 +36,9 @@ class EinthusanIE(InfoExtractor):
|
||||
|
||||
# reversed from jsoncrypto.prototype.decrypt() in einthusan-PGMovieWatcher.js
|
||||
def _decrypt(self, encrypted_data, video_id):
|
||||
return self._parse_json(compat_b64decode((
|
||||
return self._parse_json(base64.b64decode((
|
||||
encrypted_data[:10] + encrypted_data[-1] + encrypted_data[12:-1]
|
||||
)).decode('utf-8'), video_id)
|
||||
).encode('ascii')).decode('utf-8'), video_id)
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
@ -259,7 +259,6 @@ from .deezer import DeezerPlaylistIE
|
||||
from .democracynow import DemocracynowIE
|
||||
from .dfb import DFBIE
|
||||
from .dhm import DHMIE
|
||||
from .digg import DiggIE
|
||||
from .dotsub import DotsubIE
|
||||
from .douyutv import (
|
||||
DouyuShowIE,
|
||||
@ -490,6 +489,7 @@ from .jwplatform import JWPlatformIE
|
||||
from .jpopsukitv import JpopsukiIE
|
||||
from .kakao import KakaoIE
|
||||
from .kaltura import KalturaIE
|
||||
from .kamcord import KamcordIE
|
||||
from .kanalplay import KanalPlayIE
|
||||
from .kankan import KankanIE
|
||||
from .karaoketv import KaraoketvIE
|
||||
@ -630,10 +630,7 @@ from .musicplayon import MusicPlayOnIE
|
||||
from .mwave import MwaveIE, MwaveMeetGreetIE
|
||||
from .myspace import MySpaceIE, MySpaceAlbumIE
|
||||
from .myspass import MySpassIE
|
||||
from .myvi import (
|
||||
MyviIE,
|
||||
MyviEmbedIE,
|
||||
)
|
||||
from .myvi import MyviIE
|
||||
from .myvidster import MyVidsterIE
|
||||
from .nationalgeographic import (
|
||||
NationalGeographicVideoIE,
|
||||
@ -884,6 +881,7 @@ from .revision3 import (
|
||||
Revision3IE,
|
||||
)
|
||||
from .rice import RICEIE
|
||||
from .ringtv import RingTVIE
|
||||
from .rmcdecouverte import RMCDecouverteIE
|
||||
from .ro220 import Ro220IE
|
||||
from .rockstargames import RockstarGamesIE
|
||||
@ -903,7 +901,6 @@ from .rtp import RTPIE
|
||||
from .rts import RTSIE
|
||||
from .rtve import RTVEALaCartaIE, RTVELiveIE, RTVEInfantilIE, RTVELiveIE, RTVETelevisionIE
|
||||
from .rtvnh import RTVNHIE
|
||||
from .rtvs import RTVSIE
|
||||
from .rudo import RudoIE
|
||||
from .ruhd import RUHDIE
|
||||
from .ruleporn import RulePornIE
|
||||
@ -936,10 +933,6 @@ from .servingsys import ServingSysIE
|
||||
from .servus import ServusIE
|
||||
from .sevenplus import SevenPlusIE
|
||||
from .sexu import SexuIE
|
||||
from .seznamzpravy import (
|
||||
SeznamZpravyIE,
|
||||
SeznamZpravyArticleIE,
|
||||
)
|
||||
from .shahid import (
|
||||
ShahidIE,
|
||||
ShahidShowIE,
|
||||
@ -997,7 +990,7 @@ from .stitcher import StitcherIE
|
||||
from .sport5 import Sport5IE
|
||||
from .sportbox import SportBoxEmbedIE
|
||||
from .sportdeutschland import SportDeutschlandIE
|
||||
from .springboardplatform import SpringboardPlatformIE
|
||||
from .sportschau import SportschauIE
|
||||
from .sprout import SproutIE
|
||||
from .srgssr import (
|
||||
SRGSSRIE,
|
||||
@ -1053,6 +1046,7 @@ from .theplatform import (
|
||||
ThePlatformFeedIE,
|
||||
)
|
||||
from .thescene import TheSceneIE
|
||||
from .thesixtyone import TheSixtyOneIE
|
||||
from .thestar import TheStarIE
|
||||
from .thesun import TheSunIE
|
||||
from .theweatherchannel import TheWeatherChannelIE
|
||||
@ -1074,6 +1068,7 @@ from .tnaflix import (
|
||||
from .toggle import ToggleIE
|
||||
from .tonline import TOnlineIE
|
||||
from .toongoggles import ToonGogglesIE
|
||||
from .totalwebcasting import TotalWebCastingIE
|
||||
from .toutv import TouTvIE
|
||||
from .toypics import ToypicsUserIE, ToypicsIE
|
||||
from .traileraddict import TrailerAddictIE
|
||||
@ -1294,8 +1289,6 @@ from .watchbox import WatchBoxIE
|
||||
from .watchindianporn import WatchIndianPornIE
|
||||
from .wdr import (
|
||||
WDRIE,
|
||||
WDRPageIE,
|
||||
WDRElefantIE,
|
||||
WDRMobileIE,
|
||||
)
|
||||
from .webcaster import (
|
||||
@ -1306,10 +1299,6 @@ from .webofstories import (
|
||||
WebOfStoriesIE,
|
||||
WebOfStoriesPlaylistIE,
|
||||
)
|
||||
from .weibo import (
|
||||
WeiboIE,
|
||||
WeiboMobileIE
|
||||
)
|
||||
from .weiqitv import WeiqiTVIE
|
||||
from .wimp import WimpIE
|
||||
from .wistia import WistiaIE
|
||||
@ -1335,10 +1324,6 @@ from .xiami import (
|
||||
XiamiArtistIE,
|
||||
XiamiCollectionIE
|
||||
)
|
||||
from .ximalaya import (
|
||||
XimalayaIE,
|
||||
XimalayaAlbumIE
|
||||
)
|
||||
from .xminus import XMinusIE
|
||||
from .xnxx import XNXXIE
|
||||
from .xstream import XstreamIE
|
||||
|
@ -33,7 +33,7 @@ class FranceInterIE(InfoExtractor):
|
||||
description = self._og_search_description(webpage)
|
||||
|
||||
upload_date_str = self._search_regex(
|
||||
r'class=["\']\s*cover-emission-period\s*["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<',
|
||||
r'class=["\']cover-emission-period["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<',
|
||||
webpage, 'upload date', fatal=False)
|
||||
if upload_date_str:
|
||||
upload_date_list = upload_date_str.split()
|
||||
|
@ -23,11 +23,6 @@ class GameInformerIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
url, display_id, headers=self.geo_verification_headers())
|
||||
brightcove_id = self._search_regex(
|
||||
[r'<[^>]+\bid=["\']bc_(\d+)', r"getVideo\('[^']+video_id=(\d+)"],
|
||||
webpage, 'brightcove id')
|
||||
return self.url_result(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew',
|
||||
brightcove_id)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
brightcove_id = self._search_regex(r"getVideo\('[^']+video_id=(\d+)", webpage, 'brightcove id')
|
||||
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
||||
|
@ -1,8 +1,6 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
@ -11,52 +9,44 @@ from ..utils import (
|
||||
|
||||
|
||||
class GameStarIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?game(?P<site>pro|star)\.de/videos/.*,(?P<id>[0-9]+)\.html'
|
||||
_TESTS = [{
|
||||
_VALID_URL = r'https?://(?:www\.)?gamestar\.de/videos/.*,(?P<id>[0-9]+)\.html'
|
||||
_TEST = {
|
||||
'url': 'http://www.gamestar.de/videos/trailer,3/hobbit-3-die-schlacht-der-fuenf-heere,76110.html',
|
||||
'md5': 'ee782f1f8050448c95c5cacd63bc851c',
|
||||
'md5': '96974ecbb7fd8d0d20fca5a00810cea7',
|
||||
'info_dict': {
|
||||
'id': '76110',
|
||||
'ext': 'mp4',
|
||||
'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
|
||||
'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'timestamp': 1406542380,
|
||||
'timestamp': 1406542020,
|
||||
'upload_date': '20140728',
|
||||
'duration': 17,
|
||||
'duration': 17
|
||||
}
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.gamepro.de/videos/top-10-indie-spiele-fuer-nintendo-switch-video-tolle-nindies-games-zum-download,95316.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.gamestar.de/videos/top-10-indie-spiele-fuer-nintendo-switch-video-tolle-nindies-games-zum-download,95316.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
site = mobj.group('site')
|
||||
video_id = mobj.group('id')
|
||||
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id
|
||||
|
||||
# TODO: there are multiple ld+json objects in the webpage,
|
||||
# while _search_json_ld finds only the first one
|
||||
json_ld = self._parse_json(self._search_regex(
|
||||
r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>[^<]+VideoObject[^<]+)</script>',
|
||||
webpage, 'JSON-LD', group='json_ld'), video_id)
|
||||
info_dict = self._json_ld(json_ld, video_id)
|
||||
info_dict['title'] = remove_end(
|
||||
info_dict['title'], ' - Game%s' % site.title())
|
||||
info_dict['title'] = remove_end(info_dict['title'], ' - GameStar')
|
||||
|
||||
view_count = int_or_none(json_ld.get('interactionCount'))
|
||||
view_count = json_ld.get('interactionCount')
|
||||
comment_count = int_or_none(self._html_search_regex(
|
||||
r'<span>Kommentare</span>\s*<span[^>]+class=["\']count[^>]+>\s*\(\s*([0-9]+)',
|
||||
webpage, 'comment count', fatal=False))
|
||||
r'([0-9]+) Kommentare</span>', webpage, 'comment_count',
|
||||
fatal=False))
|
||||
|
||||
info_dict.update({
|
||||
'id': video_id,
|
||||
'url': 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id,
|
||||
'url': url,
|
||||
'ext': 'mp4',
|
||||
'view_count': view_count,
|
||||
'comment_count': comment_count
|
||||
|
@ -101,7 +101,6 @@ from .vzaar import VzaarIE
|
||||
from .channel9 import Channel9IE
|
||||
from .vshare import VShareIE
|
||||
from .mediasite import MediasiteIE
|
||||
from .springboardplatform import SpringboardPlatformIE
|
||||
|
||||
|
||||
class GenericIE(InfoExtractor):
|
||||
@ -1939,21 +1938,6 @@ class GenericIE(InfoExtractor):
|
||||
'timestamp': 1474354800,
|
||||
'upload_date': '20160920',
|
||||
}
|
||||
},
|
||||
{
|
||||
'url': 'http://www.kidzworld.com/article/30935-trolls-the-beat-goes-on-interview-skylar-astin-and-amanda-leighton',
|
||||
'info_dict': {
|
||||
'id': '1731611',
|
||||
'ext': 'mp4',
|
||||
'title': 'Official Trailer | TROLLS: THE BEAT GOES ON!',
|
||||
'description': 'md5:eb5f23826a027ba95277d105f248b825',
|
||||
'timestamp': 1516100691,
|
||||
'upload_date': '20180116',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': [SpringboardPlatformIE.ie_key()],
|
||||
}
|
||||
# {
|
||||
# # TODO: find another test
|
||||
@ -2280,10 +2264,7 @@ class GenericIE(InfoExtractor):
|
||||
# Look for Brightcove New Studio embeds
|
||||
bc_urls = BrightcoveNewIE._extract_urls(self, webpage)
|
||||
if bc_urls:
|
||||
return self.playlist_from_matches(
|
||||
bc_urls, video_id, video_title,
|
||||
getter=lambda x: smuggle_url(x, {'referrer': url}),
|
||||
ie='BrightcoveNew')
|
||||
return self.playlist_from_matches(bc_urls, video_id, video_title, ie='BrightcoveNew')
|
||||
|
||||
# Look for Nexx embeds
|
||||
nexx_urls = NexxIE._extract_urls(webpage)
|
||||
@ -2727,9 +2708,9 @@ class GenericIE(InfoExtractor):
|
||||
return self.url_result(viewlift_url)
|
||||
|
||||
# Look for JWPlatform embeds
|
||||
jwplatform_urls = JWPlatformIE._extract_urls(webpage)
|
||||
if jwplatform_urls:
|
||||
return self.playlist_from_matches(jwplatform_urls, video_id, video_title, ie=JWPlatformIE.ie_key())
|
||||
jwplatform_url = JWPlatformIE._extract_url(webpage)
|
||||
if jwplatform_url:
|
||||
return self.url_result(jwplatform_url, 'JWPlatform')
|
||||
|
||||
# Look for Digiteka embeds
|
||||
digiteka_url = DigitekaIE._extract_url(webpage)
|
||||
@ -2925,12 +2906,6 @@ class GenericIE(InfoExtractor):
|
||||
for mediasite_url in mediasite_urls]
|
||||
return self.playlist_result(entries, video_id, video_title)
|
||||
|
||||
springboardplatform_urls = SpringboardPlatformIE._extract_urls(webpage)
|
||||
if springboardplatform_urls:
|
||||
return self.playlist_from_matches(
|
||||
springboardplatform_urls, video_id, video_title,
|
||||
ie=SpringboardPlatformIE.ie_key())
|
||||
|
||||
def merge_dicts(dict1, dict2):
|
||||
merged = {}
|
||||
for k, v in dict1.items():
|
||||
|
@ -1,7 +1,8 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
HEADRequest,
|
||||
@ -47,7 +48,7 @@ class HotNewHipHopIE(InfoExtractor):
|
||||
if 'mediaKey' not in mkd:
|
||||
raise ExtractorError('Did not get a media key')
|
||||
|
||||
redirect_url = compat_b64decode(video_url_base64).decode('utf-8')
|
||||
redirect_url = base64.b64decode(video_url_base64).decode('utf-8')
|
||||
redirect_req = HEADRequest(redirect_url)
|
||||
req = self._request_webpage(
|
||||
redirect_req, video_id,
|
||||
|
@ -2,8 +2,9 @@
|
||||
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_urllib_parse_unquote,
|
||||
compat_urlparse,
|
||||
)
|
||||
@ -60,7 +61,7 @@ class InfoQIE(BokeCCBaseIE):
|
||||
encoded_id = self._search_regex(
|
||||
r"jsclassref\s*=\s*'([^']*)'", webpage, 'encoded id', default=None)
|
||||
|
||||
real_id = compat_urllib_parse_unquote(compat_b64decode(encoded_id).decode('utf-8'))
|
||||
real_id = compat_urllib_parse_unquote(base64.b64decode(encoded_id.encode('ascii')).decode('utf-8'))
|
||||
playpath = 'mp4:' + real_id
|
||||
|
||||
return [{
|
||||
|
@ -23,14 +23,11 @@ class JWPlatformIE(InfoExtractor):
|
||||
|
||||
@staticmethod
|
||||
def _extract_url(webpage):
|
||||
urls = JWPlatformIE._extract_urls(webpage)
|
||||
return urls[0] if urls else None
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage):
|
||||
return re.findall(
|
||||
r'<(?:script|iframe)[^>]+?src=["\']((?:https?:)?//content\.jwplatform\.com/players/[a-zA-Z0-9]{8})',
|
||||
mobj = re.search(
|
||||
r'<(?:script|iframe)[^>]+?src=["\'](?P<url>(?:https?:)?//content.jwplatform.com/players/[a-zA-Z0-9]{8})',
|
||||
webpage)
|
||||
if mobj:
|
||||
return mobj.group('url')
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
71
youtube_dl/extractor/kamcord.py
Normal file
71
youtube_dl/extractor/kamcord.py
Normal file
@ -0,0 +1,71 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
qualities,
|
||||
)
|
||||
|
||||
|
||||
class KamcordIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?kamcord\.com/v/(?P<id>[^/?#&]+)'
|
||||
_TEST = {
|
||||
'url': 'https://www.kamcord.com/v/hNYRduDgWb4',
|
||||
'md5': 'c3180e8a9cfac2e86e1b88cb8751b54c',
|
||||
'info_dict': {
|
||||
'id': 'hNYRduDgWb4',
|
||||
'ext': 'mp4',
|
||||
'title': 'Drinking Madness',
|
||||
'uploader': 'jacksfilms',
|
||||
'uploader_id': '3044562',
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
'comment_count': int,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
video = self._parse_json(
|
||||
self._search_regex(
|
||||
r'window\.__props\s*=\s*({.+?});?(?:\n|\s*</script)',
|
||||
webpage, 'video'),
|
||||
video_id)['video']
|
||||
|
||||
title = video['title']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
video['play']['hls'], video_id, 'mp4', entry_protocol='m3u8_native')
|
||||
self._sort_formats(formats)
|
||||
|
||||
uploader = video.get('user', {}).get('username')
|
||||
uploader_id = video.get('user', {}).get('id')
|
||||
|
||||
view_count = int_or_none(video.get('viewCount'))
|
||||
like_count = int_or_none(video.get('heartCount'))
|
||||
comment_count = int_or_none(video.get('messageCount'))
|
||||
|
||||
preference_key = qualities(('small', 'medium', 'large'))
|
||||
|
||||
thumbnails = [{
|
||||
'url': thumbnail_url,
|
||||
'id': thumbnail_id,
|
||||
'preference': preference_key(thumbnail_id),
|
||||
} for thumbnail_id, thumbnail_url in (video.get('thumbnail') or {}).items()
|
||||
if isinstance(thumbnail_id, compat_str) and isinstance(thumbnail_url, compat_str)]
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'uploader': uploader,
|
||||
'uploader_id': uploader_id,
|
||||
'view_count': view_count,
|
||||
'like_count': like_count,
|
||||
'comment_count': comment_count,
|
||||
'thumbnails': thumbnails,
|
||||
'formats': formats,
|
||||
}
|
@ -49,9 +49,7 @@ class LA7IE(InfoExtractor):
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
player_data = self._parse_json(
|
||||
self._search_regex(
|
||||
[r'(?s)videoParams\s*=\s*({.+?});', r'videoLa7\(({[^;]+})\);'],
|
||||
webpage, 'player data'),
|
||||
self._search_regex(r'videoLa7\(({[^;]+})\);', webpage, 'player data'),
|
||||
video_id, transform_source=js_to_json)
|
||||
|
||||
return {
|
||||
|
@ -1,6 +1,7 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import datetime
|
||||
import hashlib
|
||||
import re
|
||||
@ -8,7 +9,6 @@ import time
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_ord,
|
||||
compat_str,
|
||||
compat_urllib_parse_urlencode,
|
||||
@ -329,7 +329,7 @@ class LetvCloudIE(InfoExtractor):
|
||||
raise ExtractorError('Letv cloud returned an unknwon error')
|
||||
|
||||
def b64decode(s):
|
||||
return compat_b64decode(s).decode('utf-8')
|
||||
return base64.b64decode(s.encode('utf-8')).decode('utf-8')
|
||||
|
||||
formats = []
|
||||
for media in play_json['data']['video_info']['media'].values():
|
||||
|
@ -10,7 +10,6 @@ from ..utils import (
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
smuggle_url,
|
||||
try_get,
|
||||
unsmuggle_url,
|
||||
ExtractorError,
|
||||
)
|
||||
@ -221,12 +220,6 @@ class LimelightBaseIE(InfoExtractor):
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
def _extract_info_helper(self, pc, mobile, i, metadata):
|
||||
return self._extract_info(
|
||||
try_get(pc, lambda x: x['playlistItems'][i]['streams'], list) or [],
|
||||
try_get(mobile, lambda x: x['mediaList'][i]['mobileUrls'], list) or [],
|
||||
metadata)
|
||||
|
||||
|
||||
class LimelightMediaIE(LimelightBaseIE):
|
||||
IE_NAME = 'limelight'
|
||||
@ -289,7 +282,10 @@ class LimelightMediaIE(LimelightBaseIE):
|
||||
'getMobilePlaylistByMediaId', 'properties',
|
||||
smuggled_data.get('source_url'))
|
||||
|
||||
return self._extract_info_helper(pc, mobile, 0, metadata)
|
||||
return self._extract_info(
|
||||
pc['playlistItems'][0].get('streams', []),
|
||||
mobile['mediaList'][0].get('mobileUrls', []) if mobile else [],
|
||||
metadata)
|
||||
|
||||
|
||||
class LimelightChannelIE(LimelightBaseIE):
|
||||
@ -330,7 +326,10 @@ class LimelightChannelIE(LimelightBaseIE):
|
||||
'media', smuggled_data.get('source_url'))
|
||||
|
||||
entries = [
|
||||
self._extract_info_helper(pc, mobile, i, medias['media_list'][i])
|
||||
self._extract_info(
|
||||
pc['playlistItems'][i].get('streams', []),
|
||||
mobile['mediaList'][i].get('mobileUrls', []) if mobile else [],
|
||||
medias['media_list'][i])
|
||||
for i in range(len(medias['media_list']))]
|
||||
|
||||
return self.playlist_result(entries, channel_id, pc['title'])
|
||||
|
@ -1,12 +1,13 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_urllib_parse_unquote,
|
||||
from ..compat import compat_urllib_parse_unquote
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
)
|
||||
from ..utils import int_or_none
|
||||
|
||||
|
||||
class MangomoloBaseIE(InfoExtractor):
|
||||
@ -50,4 +51,4 @@ class MangomoloLiveIE(MangomoloBaseIE):
|
||||
_IS_LIVE = True
|
||||
|
||||
def _get_real_id(self, page_id):
|
||||
return compat_b64decode(compat_urllib_parse_unquote(page_id)).decode()
|
||||
return base64.b64decode(compat_urllib_parse_unquote(page_id).encode()).decode()
|
||||
|
@ -1,12 +1,12 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import functools
|
||||
import itertools
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_chr,
|
||||
compat_ord,
|
||||
compat_str,
|
||||
@ -79,7 +79,7 @@ class MixcloudIE(InfoExtractor):
|
||||
|
||||
if encrypted_play_info is not None:
|
||||
# Decode
|
||||
encrypted_play_info = compat_b64decode(encrypted_play_info)
|
||||
encrypted_play_info = base64.b64decode(encrypted_play_info)
|
||||
else:
|
||||
# New path
|
||||
full_info_json = self._parse_json(self._html_search_regex(
|
||||
@ -109,7 +109,7 @@ class MixcloudIE(InfoExtractor):
|
||||
kpa_target = encrypted_play_info
|
||||
else:
|
||||
kps = ['https://', 'http://']
|
||||
kpa_target = compat_b64decode(info_json['streamInfo']['url'])
|
||||
kpa_target = base64.b64decode(info_json['streamInfo']['url'])
|
||||
for kp in kps:
|
||||
partial_key = self._decrypt_xor_cipher(kpa_target, kp)
|
||||
for quote in ["'", '"']:
|
||||
@ -165,7 +165,7 @@ class MixcloudIE(InfoExtractor):
|
||||
format_url = stream_info.get(url_key)
|
||||
if not format_url:
|
||||
continue
|
||||
decrypted = self._decrypt_xor_cipher(key, compat_b64decode(format_url))
|
||||
decrypted = self._decrypt_xor_cipher(key, base64.b64decode(format_url))
|
||||
if not decrypted:
|
||||
continue
|
||||
if url_key == 'hlsUrl':
|
||||
|
@ -3,18 +3,13 @@ from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .vimple import SprutoBaseIE
|
||||
|
||||
|
||||
class MyviIE(SprutoBaseIE):
|
||||
_VALID_URL = r'''(?x)
|
||||
(?:
|
||||
https?://
|
||||
(?:www\.)?
|
||||
myvi\.
|
||||
(?:
|
||||
(?:ru/player|tv)/
|
||||
myvi\.(?:ru/player|tv)/
|
||||
(?:
|
||||
(?:
|
||||
embed/html|
|
||||
@ -22,10 +17,6 @@ class MyviIE(SprutoBaseIE):
|
||||
api/Video/Get
|
||||
)/|
|
||||
content/preloader\.swf\?.*\bid=
|
||||
)|
|
||||
ru/watch/
|
||||
)|
|
||||
myvi:
|
||||
)
|
||||
(?P<id>[\da-zA-Z_-]+)
|
||||
'''
|
||||
@ -51,12 +42,6 @@ class MyviIE(SprutoBaseIE):
|
||||
}, {
|
||||
'url': 'http://myvi.ru/player/flash/ocp2qZrHI-eZnHKQBK4cZV60hslH8LALnk0uBfKsB-Q4WnY26SeGoYPi8HWHxu0O30',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.myvi.ru/watch/YwbqszQynUaHPn_s82sx0Q2',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'myvi:YwbqszQynUaHPn_s82sx0Q2',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
@ -73,39 +58,3 @@ class MyviIE(SprutoBaseIE):
|
||||
'http://myvi.ru/player/api/Video/Get/%s?sig' % video_id, video_id)['sprutoData']
|
||||
|
||||
return self._extract_spruto(spruto, video_id)
|
||||
|
||||
|
||||
class MyviEmbedIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?myvi\.tv/(?:[^?]+\?.*?\bv=|embed/)(?P<id>[\da-z]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.myvi.tv/embed/ccdqic3wgkqwpb36x9sxg43t4r',
|
||||
'info_dict': {
|
||||
'id': 'b3ea0663-3234-469d-873e-7fecf36b31d1',
|
||||
'ext': 'mp4',
|
||||
'title': 'Твоя (original song).mp4',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 277,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.myvi.tv/idmi6o?v=ccdqic3wgkqwpb36x9sxg43t4r#watch',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
return False if MyviIE.suitable(url) else super(MyviEmbedIE, cls).suitable(url)
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(
|
||||
'https://www.myvi.tv/embed/%s' % video_id, video_id)
|
||||
|
||||
myvi_id = self._search_regex(
|
||||
r'CreatePlayer\s*\(\s*["\'].*?\bv=([\da-zA-Z_]+)',
|
||||
webpage, 'video id')
|
||||
|
||||
return self.url_result('myvi:%s' % myvi_id, ie=MyviIE.ie_key())
|
||||
|
@ -68,7 +68,7 @@ class NationalGeographicVideoIE(InfoExtractor):
|
||||
|
||||
class NationalGeographicIE(ThePlatformIE, AdobePassIE):
|
||||
IE_NAME = 'natgeo'
|
||||
_VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:(?:wild/)?[^/]+/)?(?:videos|episodes)/(?P<id>[^/?]+)'
|
||||
_VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:wild/)?[^/]+/(?:videos|episodes)/(?P<id>[^/?]+)'
|
||||
|
||||
_TESTS = [
|
||||
{
|
||||
@ -102,10 +102,6 @@ class NationalGeographicIE(ThePlatformIE, AdobePassIE):
|
||||
{
|
||||
'url': 'http://channel.nationalgeographic.com/the-story-of-god-with-morgan-freeman/episodes/the-power-of-miracles/',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://channel.nationalgeographic.com/videos/treasures-rediscovered/',
|
||||
'only_matching': True,
|
||||
}
|
||||
]
|
||||
|
||||
|
@ -190,12 +190,10 @@ class NDREmbedBaseIE(InfoExtractor):
|
||||
ext = determine_ext(src, None)
|
||||
if ext == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
src + '?hdcore=3.7.0&plugin=aasp-3.7.0.39.44', video_id,
|
||||
f4m_id='hds', fatal=False))
|
||||
src + '?hdcore=3.7.0&plugin=aasp-3.7.0.39.44', video_id, f4m_id='hds'))
|
||||
elif ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
src, video_id, 'mp4', m3u8_id='hls',
|
||||
entry_protocol='m3u8_native', fatal=False))
|
||||
src, video_id, 'mp4', m3u8_id='hls', entry_protocol='m3u8_native'))
|
||||
else:
|
||||
quality = f.get('quality')
|
||||
ff = {
|
||||
|
@ -19,11 +19,11 @@ from ..utils import (
|
||||
|
||||
|
||||
class OdnoklassnikiIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:(?:www|m|mobile)\.)?(?:odnoklassniki|ok)\.ru/(?:video(?:embed)?|web-api/video/moviePlayer|live)/(?P<id>[\d-]+)'
|
||||
_VALID_URL = r'https?://(?:(?:www|m|mobile)\.)?(?:odnoklassniki|ok)\.ru/(?:video(?:embed)?|web-api/video/moviePlayer)/(?P<id>[\d-]+)'
|
||||
_TESTS = [{
|
||||
# metadata in JSON
|
||||
'url': 'http://ok.ru/video/20079905452',
|
||||
'md5': '0b62089b479e06681abaaca9d204f152',
|
||||
'md5': '6ba728d85d60aa2e6dd37c9e70fdc6bc',
|
||||
'info_dict': {
|
||||
'id': '20079905452',
|
||||
'ext': 'mp4',
|
||||
@ -35,6 +35,7 @@ class OdnoklassnikiIE(InfoExtractor):
|
||||
'like_count': int,
|
||||
'age_limit': 0,
|
||||
},
|
||||
'skip': 'Video has been blocked',
|
||||
}, {
|
||||
# metadataUrl
|
||||
'url': 'http://ok.ru/video/63567059965189-0?fromTime=5',
|
||||
@ -98,9 +99,6 @@ class OdnoklassnikiIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://mobile.ok.ru/video/20079905452',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.ok.ru/live/484531969818',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -186,10 +184,6 @@ class OdnoklassnikiIE(InfoExtractor):
|
||||
})
|
||||
return info
|
||||
|
||||
assert title
|
||||
if provider == 'LIVE_TV_APP':
|
||||
info['title'] = self._live_title(title)
|
||||
|
||||
quality = qualities(('4', '0', '1', '2', '3', '5'))
|
||||
|
||||
formats = [{
|
||||
@ -216,20 +210,6 @@ class OdnoklassnikiIE(InfoExtractor):
|
||||
if fmt_type:
|
||||
fmt['quality'] = quality(fmt_type)
|
||||
|
||||
# Live formats
|
||||
m3u8_url = metadata.get('hlsMasterPlaylistUrl')
|
||||
if m3u8_url:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', entry_protocol='m3u8',
|
||||
m3u8_id='hls', fatal=False))
|
||||
rtmp_url = metadata.get('rtmpUrl')
|
||||
if rtmp_url:
|
||||
formats.append({
|
||||
'url': rtmp_url,
|
||||
'format_id': 'rtmp',
|
||||
'ext': 'flv',
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
info['formats'] = formats
|
||||
|
@ -1,13 +1,9 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_str,
|
||||
compat_urllib_parse_urlencode,
|
||||
)
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
@ -16,6 +12,7 @@ from ..utils import (
|
||||
try_get,
|
||||
unsmuggle_url,
|
||||
)
|
||||
from ..compat import compat_urllib_parse_urlencode
|
||||
|
||||
|
||||
class OoyalaBaseIE(InfoExtractor):
|
||||
@ -47,7 +44,7 @@ class OoyalaBaseIE(InfoExtractor):
|
||||
url_data = try_get(stream, lambda x: x['url']['data'], compat_str)
|
||||
if not url_data:
|
||||
continue
|
||||
s_url = compat_b64decode(url_data).decode('utf-8')
|
||||
s_url = base64.b64decode(url_data.encode('ascii')).decode('utf-8')
|
||||
if not s_url or s_url in urls:
|
||||
continue
|
||||
urls.append(s_url)
|
||||
|
@ -1,8 +1,6 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
@ -20,14 +18,7 @@ from ..utils import (
|
||||
class PandoraTVIE(InfoExtractor):
|
||||
IE_NAME = 'pandora.tv'
|
||||
IE_DESC = '판도라TV'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
(?:www\.)?pandora\.tv/view/(?P<user_id>[^/]+)/(?P<id>\d+)| # new format
|
||||
(?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?| # old format
|
||||
m\.pandora\.tv/?\? # mobile
|
||||
)
|
||||
'''
|
||||
_VALID_URL = r'https?://(?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?'
|
||||
_TESTS = [{
|
||||
'url': 'http://jp.channel.pandora.tv/channel/video.ptv?c1=&prgid=53294230&ch_userid=mikakim&ref=main&lot=cate_01_2',
|
||||
'info_dict': {
|
||||
@ -62,20 +53,9 @@ class PandoraTVIE(InfoExtractor):
|
||||
# Test metadata only
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.pandora.tv/view/mikakim/53294230#36797454_new',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://m.pandora.tv/?c=view&ch_userid=mikakim&prgid=54600346',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
user_id = mobj.group('user_id')
|
||||
video_id = mobj.group('id')
|
||||
|
||||
if not user_id or not video_id:
|
||||
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
|
||||
video_id = qs.get('prgid', [None])[0]
|
||||
user_id = qs.get('ch_userid', [None])[0]
|
||||
|
@ -4,9 +4,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urlparse
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
xpath_text,
|
||||
@ -28,15 +26,17 @@ class PladformIE(InfoExtractor):
|
||||
(?P<id>\d+)
|
||||
'''
|
||||
_TESTS = [{
|
||||
'url': 'https://out.pladform.ru/player?pl=64471&videoid=3777899&vk_puid15=0&vk_puid34=0',
|
||||
'md5': '53362fac3a27352da20fa2803cc5cd6f',
|
||||
# http://muz-tv.ru/kinozal/view/7400/
|
||||
'url': 'http://out.pladform.ru/player?pl=24822&videoid=100183293',
|
||||
'md5': '61f37b575dd27f1bb2e1854777fe31f4',
|
||||
'info_dict': {
|
||||
'id': '3777899',
|
||||
'id': '100183293',
|
||||
'ext': 'mp4',
|
||||
'title': 'СТУДИЯ СОЮЗ • Шоу Студия Союз, 24 выпуск (01.02.2018) Нурлан Сабуров и Слава Комиссаренко',
|
||||
'description': 'md5:05140e8bf1b7e2d46e7ba140be57fd95',
|
||||
'title': 'Тайны перевала Дятлова • 1 серия 2 часть',
|
||||
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 3190,
|
||||
'duration': 694,
|
||||
'age_limit': 0,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://static.pladform.ru/player.swf?pl=21469&videoid=100183293&vkcid=0',
|
||||
@ -56,48 +56,22 @@ class PladformIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
|
||||
pl = qs.get('pl', ['1'])[0]
|
||||
|
||||
video = self._download_xml(
|
||||
'http://out.pladform.ru/getVideo', video_id, query={
|
||||
'pl': pl,
|
||||
'videoid': video_id,
|
||||
})
|
||||
|
||||
def fail(text):
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, text),
|
||||
expected=True)
|
||||
'http://out.pladform.ru/getVideo?pl=1&videoid=%s' % video_id,
|
||||
video_id)
|
||||
|
||||
if video.tag == 'error':
|
||||
fail(video.text)
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, video.text),
|
||||
expected=True)
|
||||
|
||||
quality = qualities(('ld', 'sd', 'hd'))
|
||||
|
||||
formats = []
|
||||
for src in video.findall('./src'):
|
||||
if src is None:
|
||||
continue
|
||||
format_url = src.text
|
||||
if not format_url:
|
||||
continue
|
||||
if src.get('type') == 'hls' or determine_ext(format_url) == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
formats = [{
|
||||
'url': src.text,
|
||||
'format_id': src.get('quality'),
|
||||
'quality': quality(src.get('quality')),
|
||||
})
|
||||
|
||||
if not formats:
|
||||
error = xpath_text(video, './cap', 'error', default=None)
|
||||
if error:
|
||||
fail(error)
|
||||
|
||||
} for src in video.findall('./src')]
|
||||
self._sort_formats(formats)
|
||||
|
||||
webpage = self._download_webpage(
|
||||
|
@ -11,34 +11,19 @@ from ..utils import (
|
||||
|
||||
|
||||
class PokemonIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/(?:[^/]+/)+(?P<display_id>[^/?#&]+))'
|
||||
_VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/[^/]+/\d+_\d+-(?P<display_id>[^/?#]+))'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.pokemon.com/us/pokemon-episodes/20_30-the-ol-raise-and-switch/',
|
||||
'md5': '2fe8eaec69768b25ef898cda9c43062e',
|
||||
'url': 'http://www.pokemon.com/us/pokemon-episodes/19_01-from-a-to-z/?play=true',
|
||||
'md5': '9fb209ae3a569aac25de0f5afc4ee08f',
|
||||
'info_dict': {
|
||||
'id': 'afe22e30f01c41f49d4f1d9eab5cd9a4',
|
||||
'id': 'd0436c00c3ce4071ac6cee8130ac54a1',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Ol’ Raise and Switch!',
|
||||
'description': 'md5:7db77f7107f98ba88401d3adc80ff7af',
|
||||
'timestamp': 1511824728,
|
||||
'upload_date': '20171127',
|
||||
},
|
||||
'add_id': ['LimelightMedia'],
|
||||
}, {
|
||||
# no data-video-title
|
||||
'url': 'https://www.pokemon.com/us/pokemon-episodes/pokemon-movies/pokemon-the-rise-of-darkrai-2008',
|
||||
'info_dict': {
|
||||
'id': '99f3bae270bf4e5097274817239ce9c8',
|
||||
'ext': 'mp4',
|
||||
'title': 'Pokémon: The Rise of Darkrai',
|
||||
'description': 'md5:ea8fbbf942e1e497d54b19025dd57d9d',
|
||||
'timestamp': 1417778347,
|
||||
'upload_date': '20141205',
|
||||
},
|
||||
'add_id': ['LimelightMedia'],
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
'title': 'From A to Z!',
|
||||
'description': 'Bonnie makes a new friend, Ash runs into an old friend, and a terrifying premonition begins to unfold!',
|
||||
'timestamp': 1460478136,
|
||||
'upload_date': '20160412',
|
||||
},
|
||||
'add_id': ['LimelightMedia']
|
||||
}, {
|
||||
'url': 'http://www.pokemon.com/uk/pokemon-episodes/?play=2e8b5c761f1d4a9286165d7748c1ece2',
|
||||
'only_matching': True,
|
||||
@ -57,9 +42,7 @@ class PokemonIE(InfoExtractor):
|
||||
r'(<[^>]+data-video-id="%s"[^>]*>)' % (video_id if video_id else '[a-z0-9]{32}'),
|
||||
webpage, 'video data element'))
|
||||
video_id = video_data['data-video-id']
|
||||
title = video_data.get('data-video-title') or self._html_search_meta(
|
||||
'pkm-title', webpage, ' title', default=None) or self._search_regex(
|
||||
r'<h1[^>]+\bclass=["\']us-title[^>]+>([^<]+)', webpage, 'title')
|
||||
title = video_data['data-video-title']
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': video_id,
|
||||
|
@ -129,7 +129,6 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
https?://
|
||||
(?:www\.)?
|
||||
(?:
|
||||
(?:beta\.)?
|
||||
(?:
|
||||
prosieben(?:maxx)?|sixx|sat1(?:gold)?|kabeleins(?:doku)?|the-voice-of-germany|7tv|advopedia
|
||||
)\.(?:de|at|ch)|
|
||||
@ -345,8 +344,6 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
r'clip[iI]d=(\d+)',
|
||||
r'clip[iI]d\s*=\s*["\'](\d+)',
|
||||
r"'itemImageUrl'\s*:\s*'/dynamic/thumbnails/full/\d+/(\d+)",
|
||||
r'proMamsId"\s*:\s*"(\d+)',
|
||||
r'proMamsId"\s*:\s*"(\d+)',
|
||||
]
|
||||
_TITLE_REGEXES = [
|
||||
r'<h2 class="subtitle" itemprop="name">\s*(.+?)</h2>',
|
||||
|
@ -5,93 +5,135 @@ from .common import InfoExtractor
|
||||
from ..compat import compat_HTTPError
|
||||
from ..utils import (
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
try_get,
|
||||
# unified_timestamp,
|
||||
ExtractorError,
|
||||
)
|
||||
|
||||
|
||||
class RedBullTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?redbull\.tv/video/(?P<id>AP-\w+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?redbull\.tv/(?:video|film|live)/(?:AP-\w+/segment/)?(?P<id>AP-\w+)'
|
||||
_TESTS = [{
|
||||
# film
|
||||
'url': 'https://www.redbull.tv/video/AP-1Q6XCDTAN1W11',
|
||||
'url': 'https://www.redbull.tv/video/AP-1Q756YYX51W11/abc-of-wrc',
|
||||
'md5': 'fb0445b98aa4394e504b413d98031d1f',
|
||||
'info_dict': {
|
||||
'id': 'AP-1Q6XCDTAN1W11',
|
||||
'id': 'AP-1Q756YYX51W11',
|
||||
'ext': 'mp4',
|
||||
'title': 'ABC of... WRC - ABC of... S1E6',
|
||||
'title': 'ABC of...WRC',
|
||||
'description': 'md5:5c7ed8f4015c8492ecf64b6ab31e7d31',
|
||||
'duration': 1582.04,
|
||||
# 'timestamp': 1488405786,
|
||||
# 'upload_date': '20170301',
|
||||
},
|
||||
}, {
|
||||
# episode
|
||||
'url': 'https://www.redbull.tv/video/AP-1PMHKJFCW1W11',
|
||||
'url': 'https://www.redbull.tv/video/AP-1PMT5JCWH1W11/grime?playlist=shows:shows-playall:web',
|
||||
'info_dict': {
|
||||
'id': 'AP-1PMHKJFCW1W11',
|
||||
'id': 'AP-1PMT5JCWH1W11',
|
||||
'ext': 'mp4',
|
||||
'title': 'Grime - Hashtags S2E4',
|
||||
'description': 'md5:b5f522b89b72e1e23216e5018810bb25',
|
||||
'title': 'Grime - Hashtags S2 E4',
|
||||
'description': 'md5:334b741c8c1ce65be057eab6773c1cf5',
|
||||
'duration': 904.6,
|
||||
# 'timestamp': 1487290093,
|
||||
# 'upload_date': '20170217',
|
||||
'series': 'Hashtags',
|
||||
'season_number': 2,
|
||||
'episode_number': 4,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# segment
|
||||
'url': 'https://www.redbull.tv/live/AP-1R5DX49XS1W11/segment/AP-1QSAQJ6V52111/semi-finals',
|
||||
'info_dict': {
|
||||
'id': 'AP-1QSAQJ6V52111',
|
||||
'ext': 'mp4',
|
||||
'title': 'Semi Finals - Vans Park Series Pro Tour',
|
||||
'description': 'md5:306a2783cdafa9e65e39aa62f514fd97',
|
||||
'duration': 11791.991,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.redbull.tv/film/AP-1MSKKF5T92111/in-motion',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
session = self._download_json(
|
||||
'https://api.redbull.tv/v3/session', video_id,
|
||||
'https://api-v2.redbull.tv/session', video_id,
|
||||
note='Downloading access token', query={
|
||||
'build': '4.370.0',
|
||||
'category': 'personal_computer',
|
||||
'os_version': '1.0',
|
||||
'os_family': 'http',
|
||||
})
|
||||
if session.get('code') == 'error':
|
||||
raise ExtractorError('%s said: %s' % (
|
||||
self.IE_NAME, session['message']))
|
||||
token = session['token']
|
||||
auth = '%s %s' % (session.get('token_type', 'Bearer'), session['access_token'])
|
||||
|
||||
try:
|
||||
video = self._download_json(
|
||||
'https://api.redbull.tv/v3/products/' + video_id,
|
||||
info = self._download_json(
|
||||
'https://api-v2.redbull.tv/content/%s' % video_id,
|
||||
video_id, note='Downloading video information',
|
||||
headers={'Authorization': token}
|
||||
headers={'Authorization': auth}
|
||||
)
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
|
||||
error_message = self._parse_json(
|
||||
e.cause.read().decode(), video_id)['error']
|
||||
e.cause.read().decode(), video_id)['message']
|
||||
raise ExtractorError('%s said: %s' % (
|
||||
self.IE_NAME, error_message), expected=True)
|
||||
raise
|
||||
|
||||
title = video['title'].strip()
|
||||
video = info['video_product']
|
||||
|
||||
title = info['title'].strip()
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
'https://dms.redbull.tv/v3/%s/%s/playlist.m3u8' % (video_id, token),
|
||||
video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls')
|
||||
video['url'], video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls')
|
||||
self._sort_formats(formats)
|
||||
|
||||
subtitles = {}
|
||||
for resource in video.get('resources', []):
|
||||
if resource.startswith('closed_caption_'):
|
||||
splitted_resource = resource.split('_')
|
||||
if splitted_resource[2]:
|
||||
subtitles.setdefault('en', []).append({
|
||||
'url': 'https://resources.redbull.tv/%s/%s' % (video_id, resource),
|
||||
'ext': splitted_resource[2],
|
||||
for _, captions in (try_get(
|
||||
video, lambda x: x['attachments']['captions'],
|
||||
dict) or {}).items():
|
||||
if not captions or not isinstance(captions, list):
|
||||
continue
|
||||
for caption in captions:
|
||||
caption_url = caption.get('url')
|
||||
if not caption_url:
|
||||
continue
|
||||
ext = caption.get('format')
|
||||
if ext == 'xml':
|
||||
ext = 'ttml'
|
||||
subtitles.setdefault(caption.get('lang') or 'en', []).append({
|
||||
'url': caption_url,
|
||||
'ext': ext,
|
||||
})
|
||||
|
||||
subheading = video.get('subheading')
|
||||
subheading = info.get('subheading')
|
||||
if subheading:
|
||||
title += ' - %s' % subheading
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': video.get('long_description') or video.get(
|
||||
'description': info.get('long_description') or info.get(
|
||||
'short_description'),
|
||||
'duration': float_or_none(video.get('duration'), scale=1000),
|
||||
# 'timestamp': unified_timestamp(info.get('published')),
|
||||
'series': info.get('show_title'),
|
||||
'season_number': int_or_none(info.get('season_number')),
|
||||
'episode_number': int_or_none(info.get('episode_number')),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
@ -46,10 +46,9 @@ class RedTubeIE(InfoExtractor):
|
||||
raise ExtractorError('Video %s has been removed' % video_id, expected=True)
|
||||
|
||||
title = self._html_search_regex(
|
||||
(r'<h(\d)[^>]+class="(?:video_title_text|videoTitle)[^"]*">(?P<title>(?:(?!\1).)+)</h\1>',
|
||||
r'(?:videoTitle|title)\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',),
|
||||
webpage, 'title', group='title',
|
||||
default=None) or self._og_search_title(webpage)
|
||||
(r'<h1 class="videoTitle[^"]*">(?P<title>.+?)</h1>',
|
||||
r'videoTitle\s*:\s*(["\'])(?P<title>)\1'),
|
||||
webpage, 'title', group='title')
|
||||
|
||||
formats = []
|
||||
sources = self._parse_json(
|
||||
@ -88,13 +87,12 @@ class RedTubeIE(InfoExtractor):
|
||||
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
upload_date = unified_strdate(self._search_regex(
|
||||
r'<span[^>]+>ADDED ([^<]+)<',
|
||||
r'<span[^>]+class="added-time"[^>]*>ADDED ([^<]+)<',
|
||||
webpage, 'upload date', fatal=False))
|
||||
duration = int_or_none(self._search_regex(
|
||||
r'videoDuration\s*:\s*(\d+)', webpage, 'duration', default=None))
|
||||
view_count = str_to_int(self._search_regex(
|
||||
(r'<div[^>]*>Views</div>\s*<div[^>]*>\s*([\d,.]+)',
|
||||
r'<span[^>]*>VIEWS</span>\s*</td>\s*<td>\s*([\d,.]+)'),
|
||||
r'<span[^>]*>VIEWS</span></td>\s*<td>([\d,.]+)',
|
||||
webpage, 'view count', fatal=False))
|
||||
|
||||
# No self-labeling, but they describe themselves as
|
||||
|
@ -5,8 +5,8 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class RestudyIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:(?:www|portal)\.)?restudy\.dk/video/[^/]+/id/(?P<id>[0-9]+)'
|
||||
_TESTS = [{
|
||||
_VALID_URL = r'https?://(?:www\.)?restudy\.dk/video/play/id/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'https://www.restudy.dk/video/play/id/1637',
|
||||
'info_dict': {
|
||||
'id': '1637',
|
||||
@ -18,10 +18,7 @@ class RestudyIE(InfoExtractor):
|
||||
# rtmp download
|
||||
'skip_download': True,
|
||||
}
|
||||
}, {
|
||||
'url': 'https://portal.restudy.dk/video/leiden-frosteffekt/id/1637',
|
||||
'only_matching': True,
|
||||
}]
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
@ -32,7 +29,7 @@ class RestudyIE(InfoExtractor):
|
||||
description = self._og_search_description(webpage).strip()
|
||||
|
||||
formats = self._extract_smil_formats(
|
||||
'https://cdn.portal.restudy.dk/dynamic/themes/front/awsmedia/SmilDirectory/video_%s.xml' % video_id,
|
||||
'https://www.restudy.dk/awsmedia/SmilDirectory/video_%s.xml' % video_id,
|
||||
video_id)
|
||||
self._sort_formats(formats)
|
||||
|
||||
|
44
youtube_dl/extractor/ringtv.py
Normal file
44
youtube_dl/extractor/ringtv.py
Normal file
@ -0,0 +1,44 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class RingTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?ringtv\.craveonline\.com/(?P<type>news|videos/video)/(?P<id>[^/?#]+)'
|
||||
_TEST = {
|
||||
'url': 'http://ringtv.craveonline.com/news/310833-luis-collazo-says-victor-ortiz-better-not-quit-on-jan-30',
|
||||
'md5': 'd25945f5df41cdca2d2587165ac28720',
|
||||
'info_dict': {
|
||||
'id': '857645',
|
||||
'ext': 'mp4',
|
||||
'title': 'Video: Luis Collazo says Victor Ortiz "better not quit on Jan. 30" - Ring TV',
|
||||
'description': 'Luis Collazo is excited about his Jan. 30 showdown with fellow former welterweight titleholder Victor Ortiz at Barclays Center in his hometown of Brooklyn. The SuperBowl week fight headlines a Golden Boy Live! card on Fox Sports 1.',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id').split('-')[0]
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
if mobj.group('type') == 'news':
|
||||
video_id = self._search_regex(
|
||||
r'''(?x)<iframe[^>]+src="http://cms\.springboardplatform\.com/
|
||||
embed_iframe/[0-9]+/video/([0-9]+)/''',
|
||||
webpage, 'real video ID')
|
||||
title = self._og_search_title(webpage)
|
||||
description = self._html_search_regex(
|
||||
r'addthis:description="([^"]+)"',
|
||||
webpage, 'description', fatal=False)
|
||||
final_url = 'http://ringtv.craveonline.springboardplatform.com/storage/ringtv.craveonline.com/conversion/%s.mp4' % video_id
|
||||
thumbnail_url = 'http://ringtv.craveonline.springboardplatform.com/storage/ringtv.craveonline.com/snapshots/%s.jpg' % video_id
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': final_url,
|
||||
'title': title,
|
||||
'thumbnail': thumbnail_url,
|
||||
'description': description,
|
||||
}
|
@ -1,12 +1,12 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..aes import aes_cbc_decrypt
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_ord,
|
||||
compat_str,
|
||||
)
|
||||
@ -142,11 +142,11 @@ class RTL2YouIE(RTL2YouBaseIE):
|
||||
stream_data = self._download_json(
|
||||
self._BACKWERK_BASE_URL + 'stream/video/' + video_id, video_id)
|
||||
|
||||
data, iv = compat_b64decode(stream_data['streamUrl']).decode().split(':')
|
||||
data, iv = base64.b64decode(stream_data['streamUrl']).decode().split(':')
|
||||
stream_url = intlist_to_bytes(aes_cbc_decrypt(
|
||||
bytes_to_intlist(compat_b64decode(data)),
|
||||
bytes_to_intlist(base64.b64decode(data)),
|
||||
bytes_to_intlist(self._AES_KEY),
|
||||
bytes_to_intlist(compat_b64decode(iv))
|
||||
bytes_to_intlist(base64.b64decode(iv))
|
||||
))
|
||||
if b'rtl2_you_video_not_found' in stream_url:
|
||||
raise ExtractorError('video not found', expected=True)
|
||||
|
@ -93,11 +93,58 @@ class RtlNlIE(InfoExtractor):
|
||||
|
||||
meta = info.get('meta', {})
|
||||
|
||||
# m3u8 streams are encrypted and may not be handled properly by older ffmpeg/avconv.
|
||||
# To workaround this previously adaptive -> flash trick was used to obtain
|
||||
# unencrypted m3u8 streams (see https://github.com/rg3/youtube-dl/issues/4118)
|
||||
# and bypass georestrictions as well.
|
||||
# Currently, unencrypted m3u8 playlists are (intentionally?) invalid and therefore
|
||||
# unusable albeit can be fixed by simple string replacement (see
|
||||
# https://github.com/rg3/youtube-dl/pull/6337)
|
||||
# Since recent ffmpeg and avconv handle encrypted streams just fine encrypted
|
||||
# streams are used now.
|
||||
videopath = material['videopath']
|
||||
m3u8_url = meta.get('videohost', 'http://manifest.us.rtl.nl') + videopath
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
m3u8_url, uuid, 'mp4', m3u8_id='hls', fatal=False)
|
||||
|
||||
video_urlpart = videopath.split('/adaptive/')[1][:-5]
|
||||
PG_URL_TEMPLATE = 'http://pg.us.rtl.nl/rtlxl/network/%s/progressive/%s.mp4'
|
||||
|
||||
PG_FORMATS = (
|
||||
('a2t', 512, 288),
|
||||
('a3t', 704, 400),
|
||||
('nettv', 1280, 720),
|
||||
)
|
||||
|
||||
def pg_format(format_id, width, height):
|
||||
return {
|
||||
'url': PG_URL_TEMPLATE % (format_id, video_urlpart),
|
||||
'format_id': 'pg-%s' % format_id,
|
||||
'protocol': 'http',
|
||||
'width': width,
|
||||
'height': height,
|
||||
}
|
||||
|
||||
if not formats:
|
||||
formats = [pg_format(*pg_tuple) for pg_tuple in PG_FORMATS]
|
||||
else:
|
||||
pg_formats = []
|
||||
for format_id, width, height in PG_FORMATS:
|
||||
try:
|
||||
# Find hls format with the same width and height corresponding
|
||||
# to progressive format and copy metadata from it.
|
||||
f = next(f for f in formats if f.get('height') == height)
|
||||
# hls formats may have invalid width
|
||||
f['width'] = width
|
||||
f_copy = f.copy()
|
||||
f_copy.update(pg_format(format_id, width, height))
|
||||
pg_formats.append(f_copy)
|
||||
except StopIteration:
|
||||
# Missing hls format does mean that no progressive format with
|
||||
# such width and height exists either.
|
||||
pass
|
||||
formats.extend(pg_formats)
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnails = []
|
||||
|
@ -7,7 +7,6 @@ import time
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_struct_unpack,
|
||||
)
|
||||
from ..utils import (
|
||||
@ -22,7 +21,7 @@ from ..utils import (
|
||||
|
||||
|
||||
def _decrypt_url(png):
|
||||
encrypted_data = compat_b64decode(png)
|
||||
encrypted_data = base64.b64decode(png.encode('utf-8'))
|
||||
text_index = encrypted_data.find(b'tEXt')
|
||||
text_chunk = encrypted_data[text_index - 4:]
|
||||
length = compat_struct_unpack('!I', text_chunk[:4])[0]
|
||||
|
@ -1,47 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class RTVSIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?rtvs\.sk/(?:radio|televizia)/archiv/\d+/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
# radio archive
|
||||
'url': 'http://www.rtvs.sk/radio/archiv/11224/414872',
|
||||
'md5': '134d5d6debdeddf8a5d761cbc9edacb8',
|
||||
'info_dict': {
|
||||
'id': '414872',
|
||||
'ext': 'mp3',
|
||||
'title': 'Ostrov pokladov 1 časť.mp3'
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
}
|
||||
}, {
|
||||
# tv archive
|
||||
'url': 'http://www.rtvs.sk/televizia/archiv/8249/63118',
|
||||
'md5': '85e2c55cf988403b70cac24f5c086dc6',
|
||||
'info_dict': {
|
||||
'id': '63118',
|
||||
'ext': 'mp4',
|
||||
'title': 'Amaro Džives - Náš deň',
|
||||
'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.'
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
}
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
playlist_url = self._search_regex(
|
||||
r'playlist["\']?\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
|
||||
'playlist url', group='url')
|
||||
|
||||
data = self._download_json(
|
||||
playlist_url, video_id, 'Downloading playlist')[0]
|
||||
return self._parse_jwplayer_data(data, video_id=video_id)
|
@ -1,169 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_str,
|
||||
compat_urllib_parse_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
urljoin,
|
||||
int_or_none,
|
||||
parse_codecs,
|
||||
try_get,
|
||||
)
|
||||
|
||||
|
||||
def _raw_id(src_url):
|
||||
return compat_urllib_parse_urlparse(src_url).path.split('/')[-1]
|
||||
|
||||
|
||||
class SeznamZpravyIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?seznamzpravy\.cz/iframe/player\?.*\bsrc='
|
||||
_TESTS = [{
|
||||
'url': 'https://www.seznamzpravy.cz/iframe/player?duration=241&serviceSlug=zpravy&src=https%3A%2F%2Fv39-a.sdn.szn.cz%2Fv_39%2Fvmd%2F5999c902ea707c67d8e267a9%3Ffl%3Dmdk%2C432f65a0%7C&itemType=video&autoPlay=false&title=Sv%C4%9Bt%20bez%20obalu%3A%20%C4%8Ce%C5%A1t%C3%AD%20voj%C3%A1ci%20na%20mis%C3%ADch%20(kr%C3%A1tk%C3%A1%20verze)&series=Sv%C4%9Bt%20bez%20obalu&serviceName=Seznam%20Zpr%C3%A1vy&poster=%2F%2Fd39-a.sdn.szn.cz%2Fd_39%2Fc_img_F_I%2FR5puJ.jpeg%3Ffl%3Dcro%2C0%2C0%2C1920%2C1080%7Cres%2C1200%2C%2C1%7Cjpg%2C80%2C%2C1&width=1920&height=1080&cutFrom=0&cutTo=0&splVersion=VOD&contentId=170889&contextId=35990&showAdvert=true&collocation=&autoplayPossible=true&embed=&isVideoTooShortForPreroll=false&isVideoTooLongForPostroll=true&videoCommentOpKey=&videoCommentId=&version=4.0.76&dotService=zpravy&gemiusPrismIdentifier=bVc1ZIb_Qax4W2v5xOPGpMeCP31kFfrTzj0SqPTLh_b.Z7&zoneIdPreroll=seznam.pack.videospot&skipOffsetPreroll=5§ionPrefixPreroll=%2Fzpravy',
|
||||
'info_dict': {
|
||||
'id': '170889',
|
||||
'ext': 'mp4',
|
||||
'title': 'Svět bez obalu: Čeští vojáci na misích (krátká verze)',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'duration': 241,
|
||||
'series': 'Svět bez obalu',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# with Location key
|
||||
'url': 'https://www.seznamzpravy.cz/iframe/player?duration=null&serviceSlug=zpravy&src=https%3A%2F%2Flive-a.sdn.szn.cz%2Fv_39%2F59e468fe454f8472a96af9fa%3Ffl%3Dmdk%2C5c1e2840%7C&itemType=livevod&autoPlay=false&title=P%C5%99edseda%20KDU-%C4%8CSL%20Pavel%20B%C4%9Blobr%C3%A1dek%20ve%20volebn%C3%AD%20V%C3%BDzv%C4%9B%20Seznamu&series=V%C3%BDzva&serviceName=Seznam%20Zpr%C3%A1vy&poster=%2F%2Fd39-a.sdn.szn.cz%2Fd_39%2Fc_img_G_J%2FjTBCs.jpeg%3Ffl%3Dcro%2C0%2C0%2C1280%2C720%7Cres%2C1200%2C%2C1%7Cjpg%2C80%2C%2C1&width=16&height=9&cutFrom=0&cutTo=0&splVersion=VOD&contentId=185688&contextId=38489&showAdvert=true&collocation=&hideFullScreen=false&hideSubtitles=false&embed=&isVideoTooShortForPreroll=false&isVideoTooShortForPreroll2=false&isVideoTooLongForPostroll=false&fakePostrollZoneID=seznam.clanky.zpravy.preroll&fakePrerollZoneID=seznam.clanky.zpravy.preroll&videoCommentId=&trim=default_16x9&noPrerollVideoLength=30&noPreroll2VideoLength=undefined&noMidrollVideoLength=0&noPostrollVideoLength=999999&autoplayPossible=true&version=5.0.41&dotService=zpravy&gemiusPrismIdentifier=zD3g7byfW5ekpXmxTVLaq5Srjw5i4hsYo0HY1aBwIe..27&zoneIdPreroll=seznam.pack.videospot&skipOffsetPreroll=5§ionPrefixPreroll=%2Fzpravy%2Fvyzva&zoneIdPostroll=seznam.pack.videospot&skipOffsetPostroll=5§ionPrefixPostroll=%2Fzpravy%2Fvyzva®ression=false',
|
||||
'info_dict': {
|
||||
'id': '185688',
|
||||
'ext': 'mp4',
|
||||
'title': 'Předseda KDU-ČSL Pavel Bělobrádek ve volební Výzvě Seznamu',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'series': 'Výzva',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage):
|
||||
return [
|
||||
mobj.group('url') for mobj in re.finditer(
|
||||
r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?seznamzpravy\.cz/iframe/player\?.*?)\1',
|
||||
webpage)]
|
||||
|
||||
def _extract_sdn_formats(self, sdn_url, video_id):
|
||||
sdn_data = self._download_json(sdn_url, video_id)
|
||||
|
||||
if sdn_data.get('Location'):
|
||||
sdn_url = sdn_data['Location']
|
||||
sdn_data = self._download_json(sdn_url, video_id)
|
||||
|
||||
formats = []
|
||||
mp4_formats = try_get(sdn_data, lambda x: x['data']['mp4'], dict) or {}
|
||||
for format_id, format_data in mp4_formats.items():
|
||||
relative_url = format_data.get('url')
|
||||
if not relative_url:
|
||||
continue
|
||||
|
||||
try:
|
||||
width, height = format_data.get('resolution')
|
||||
except (TypeError, ValueError):
|
||||
width, height = None, None
|
||||
|
||||
f = {
|
||||
'url': urljoin(sdn_url, relative_url),
|
||||
'format_id': 'http-%s' % format_id,
|
||||
'tbr': int_or_none(format_data.get('bandwidth'), scale=1000),
|
||||
'width': int_or_none(width),
|
||||
'height': int_or_none(height),
|
||||
}
|
||||
f.update(parse_codecs(format_data.get('codec')))
|
||||
formats.append(f)
|
||||
|
||||
pls = sdn_data.get('pls', {})
|
||||
|
||||
def get_url(format_id):
|
||||
return try_get(pls, lambda x: x[format_id]['url'], compat_str)
|
||||
|
||||
dash_rel_url = get_url('dash')
|
||||
if dash_rel_url:
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
urljoin(sdn_url, dash_rel_url), video_id, mpd_id='dash',
|
||||
fatal=False))
|
||||
|
||||
hls_rel_url = get_url('hls')
|
||||
if hls_rel_url:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
urljoin(sdn_url, hls_rel_url), video_id, ext='mp4',
|
||||
m3u8_id='hls', fatal=False))
|
||||
|
||||
self._sort_formats(formats)
|
||||
return formats
|
||||
|
||||
def _real_extract(self, url):
|
||||
params = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
|
||||
|
||||
src = params['src'][0]
|
||||
title = params['title'][0]
|
||||
video_id = params.get('contentId', [_raw_id(src)])[0]
|
||||
formats = self._extract_sdn_formats(src + 'spl2,2,VOD', video_id)
|
||||
|
||||
duration = int_or_none(params.get('duration', [None])[0])
|
||||
series = params.get('series', [None])[0]
|
||||
thumbnail = params.get('poster', [None])[0]
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'series': series,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class SeznamZpravyArticleIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:seznam\.cz/zpravy|seznamzpravy\.cz)/clanek/(?:[^/?#&]+)-(?P<id>\d+)'
|
||||
_API_URL = 'https://apizpravy.seznam.cz/'
|
||||
|
||||
_TESTS = [{
|
||||
# two videos on one page, with SDN URL
|
||||
'url': 'https://www.seznamzpravy.cz/clanek/jejich-svet-na-nas-utoci-je-lepsi-branit-se-na-jejich-pisecku-rika-reziser-a-major-v-zaloze-marhoul-35990',
|
||||
'info_dict': {
|
||||
'id': '35990',
|
||||
'title': 'md5:6011c877a36905f28f271fcd8dcdb0f2',
|
||||
'description': 'md5:933f7b06fa337a814ba199d3596d27ba',
|
||||
},
|
||||
'playlist_count': 2,
|
||||
}, {
|
||||
# video with live stream URL
|
||||
'url': 'https://www.seznam.cz/zpravy/clanek/znovu-do-vlady-s-ano-pavel-belobradek-ve-volebnim-specialu-seznamu-38489',
|
||||
'info_dict': {
|
||||
'id': '38489',
|
||||
'title': 'md5:8fa1afdc36fd378cf0eba2b74c5aca60',
|
||||
'description': 'md5:428e7926a1a81986ec7eb23078004fb4',
|
||||
},
|
||||
'playlist_count': 1,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
article_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, article_id)
|
||||
|
||||
info = self._search_json_ld(webpage, article_id, default={})
|
||||
|
||||
title = info.get('title') or self._og_search_title(webpage, fatal=False)
|
||||
description = info.get('description') or self._og_search_description(webpage)
|
||||
|
||||
return self.playlist_result([
|
||||
self.url_result(url, ie=SeznamZpravyIE.ie_key())
|
||||
for url in SeznamZpravyIE._extract_urls(webpage)],
|
||||
article_id, title, description)
|
@ -1,7 +1,8 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
@ -21,8 +22,8 @@ class SharedBaseIE(InfoExtractor):
|
||||
|
||||
video_url = self._extract_video_url(webpage, video_id, url)
|
||||
|
||||
title = compat_b64decode(self._html_search_meta(
|
||||
'full:title', webpage, 'title')).decode('utf-8')
|
||||
title = base64.b64decode(self._html_search_meta(
|
||||
'full:title', webpage, 'title').encode('utf-8')).decode('utf-8')
|
||||
filesize = int_or_none(self._html_search_meta(
|
||||
'full:size', webpage, 'file size', fatal=False))
|
||||
|
||||
@ -91,4 +92,5 @@ class VivoIE(SharedBaseIE):
|
||||
r'InitializeStream\s*\(\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
webpage, 'stream', group='url'),
|
||||
video_id,
|
||||
transform_source=lambda x: compat_b64decode(x).decode('utf-8'))[0]
|
||||
transform_source=lambda x: base64.b64decode(
|
||||
x.encode('ascii')).decode('utf-8'))[0]
|
||||
|
@ -4,11 +4,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_str,
|
||||
compat_urllib_parse_urlparse,
|
||||
)
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
int_or_none,
|
||||
@ -61,7 +57,7 @@ class SixPlayIE(InfoExtractor):
|
||||
container = asset.get('video_container')
|
||||
ext = determine_ext(asset_url)
|
||||
if container == 'm3u8' or ext == 'm3u8':
|
||||
if protocol == 'usp' and not compat_parse_qs(compat_urllib_parse_urlparse(asset_url).query).get('token', [None])[0]:
|
||||
if protocol == 'usp':
|
||||
asset_url = re.sub(r'/([^/]+)\.ism/[^/]*\.m3u8', r'/\1.ism/\1.m3u8', asset_url)
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
asset_url, video_id, 'mp4', 'm3u8_native',
|
||||
|
@ -157,7 +157,8 @@ class SoundcloudIE(InfoExtractor):
|
||||
},
|
||||
]
|
||||
|
||||
_CLIENT_ID = 'DQskPX1pntALRzMp4HSxya3Mc0AO66Ro'
|
||||
_CLIENT_ID = 'c6CU49JDMapyrQo06UxU9xouB9ZVzqCn'
|
||||
_IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage):
|
||||
|
@ -6,7 +6,7 @@ from .mtv import MTVServicesInfoExtractor
|
||||
|
||||
class SouthParkIE(MTVServicesInfoExtractor):
|
||||
IE_NAME = 'southpark.cc.com'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.cc\.com/(?:clips|(?:full-)?episodes|collections)/(?P<id>.+?)(\?|#|$))'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.cc\.com/(?:clips|(?:full-)?episodes)/(?P<id>.+?)(\?|#|$))'
|
||||
|
||||
_FEED_URL = 'http://www.southparkstudios.com/feeds/video-player/mrss'
|
||||
|
||||
@ -20,9 +20,6 @@ class SouthParkIE(MTVServicesInfoExtractor):
|
||||
'timestamp': 1112760000,
|
||||
'upload_date': '20050406',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://southpark.cc.com/collections/7758/fan-favorites/1',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
|
||||
@ -44,7 +41,7 @@ class SouthParkEsIE(SouthParkIE):
|
||||
|
||||
class SouthParkDeIE(SouthParkIE):
|
||||
IE_NAME = 'southpark.de'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.de/(?:clips|alle-episoden|collections)/(?P<id>.+?)(\?|#|$))'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.de/(?:clips|alle-episoden)/(?P<id>.+?)(\?|#|$))'
|
||||
_FEED_URL = 'http://www.southpark.de/feeds/video-player/mrss/'
|
||||
|
||||
_TESTS = [{
|
||||
@ -73,15 +70,12 @@ class SouthParkDeIE(SouthParkIE):
|
||||
'description': 'Kyle will mit seinem kleinen Bruder Ike Videospiele spielen. Als der nicht mehr mit ihm spielen will, hat Kyle Angst, dass er die Kids von heute nicht mehr versteht.',
|
||||
},
|
||||
'playlist_count': 3,
|
||||
}, {
|
||||
'url': 'http://www.southpark.de/collections/2476/superhero-showdown/1',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
|
||||
class SouthParkNlIE(SouthParkIE):
|
||||
IE_NAME = 'southpark.nl'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.nl/(?:clips|(?:full-)?episodes|collections)/(?P<id>.+?)(\?|#|$))'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.nl/(?:clips|(?:full-)?episodes)/(?P<id>.+?)(\?|#|$))'
|
||||
_FEED_URL = 'http://www.southpark.nl/feeds/video-player/mrss/'
|
||||
|
||||
_TESTS = [{
|
||||
@ -96,7 +90,7 @@ class SouthParkNlIE(SouthParkIE):
|
||||
|
||||
class SouthParkDkIE(SouthParkIE):
|
||||
IE_NAME = 'southparkstudios.dk'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southparkstudios\.(?:dk|nu)/(?:clips|full-episodes|collections)/(?P<id>.+?)(\?|#|$))'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<url>southparkstudios\.dk/(?:clips|full-episodes)/(?P<id>.+?)(\?|#|$))'
|
||||
_FEED_URL = 'http://www.southparkstudios.dk/feeds/video-player/mrss/'
|
||||
|
||||
_TESTS = [{
|
||||
@ -106,10 +100,4 @@ class SouthParkDkIE(SouthParkIE):
|
||||
'description': 'Butters is convinced he\'s living in a virtual reality.',
|
||||
},
|
||||
'playlist_mincount': 3,
|
||||
}, {
|
||||
'url': 'http://www.southparkstudios.dk/collections/2476/superhero-showdown/1',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.southparkstudios.nu/collections/2476/superhero-showdown/1',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
@ -4,10 +4,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .nexx import (
|
||||
NexxIE,
|
||||
NexxEmbedIE,
|
||||
)
|
||||
from .nexx import NexxEmbedIE
|
||||
from .spiegeltv import SpiegeltvIE
|
||||
from ..compat import compat_urlparse
|
||||
from ..utils import (
|
||||
@ -54,10 +51,6 @@ class SpiegelIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://www.spiegel.de/video/astronaut-alexander-gerst-von-der-iss-station-beantwortet-fragen-video-1519126-iframe.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# nexx video
|
||||
'url': 'http://www.spiegel.de/video/spiegel-tv-magazin-ueber-guellekrise-in-schleswig-holstein-video-99012776.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -68,14 +61,6 @@ class SpiegelIE(InfoExtractor):
|
||||
if SpiegeltvIE.suitable(handle.geturl()):
|
||||
return self.url_result(handle.geturl(), 'Spiegeltv')
|
||||
|
||||
nexx_id = self._search_regex(
|
||||
r'nexxOmniaId\s*:\s*(\d+)', webpage, 'nexx id', default=None)
|
||||
if nexx_id:
|
||||
domain_id = NexxIE._extract_domain_id(webpage) or '748'
|
||||
return self.url_result(
|
||||
'nexx:%s:%s' % (domain_id, nexx_id), ie=NexxIE.ie_key(),
|
||||
video_id=nexx_id)
|
||||
|
||||
video_data = extract_attributes(self._search_regex(r'(<div[^>]+id="spVideoElements"[^>]+>)', webpage, 'video element', default=''))
|
||||
|
||||
title = video_data.get('data-video-title') or get_element_by_attribute('class', 'module-title', webpage)
|
||||
|
38
youtube_dl/extractor/sportschau.py
Normal file
38
youtube_dl/extractor/sportschau.py
Normal file
@ -0,0 +1,38 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .wdr import WDRBaseIE
|
||||
from ..utils import get_element_by_attribute
|
||||
|
||||
|
||||
class SportschauIE(WDRBaseIE):
|
||||
IE_NAME = 'Sportschau'
|
||||
_VALID_URL = r'https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video-?(?P<id>[^/#?]+)\.html'
|
||||
_TEST = {
|
||||
'url': 'http://www.sportschau.de/uefaeuro2016/videos/video-dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100.html',
|
||||
'info_dict': {
|
||||
'id': 'mdb-1140188',
|
||||
'display_id': 'dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100',
|
||||
'ext': 'mp4',
|
||||
'title': 'DFB-Team geht gut gelaunt ins Spiel gegen Polen',
|
||||
'description': 'Vor dem zweiten Gruppenspiel gegen Polen herrscht gute Stimmung im deutschen Team. Insbesondere Bastian Schweinsteiger strotzt vor Optimismus nach seinem Tor gegen die Ukraine.',
|
||||
'upload_date': '20160615',
|
||||
},
|
||||
'skip': 'Geo-restricted to Germany',
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
title = get_element_by_attribute('class', 'headline', webpage)
|
||||
description = self._html_search_meta('description', webpage, 'description')
|
||||
|
||||
info = self._extract_wdr_video(webpage, video_id)
|
||||
|
||||
info.update({
|
||||
'title': title,
|
||||
'description': description,
|
||||
})
|
||||
|
||||
return info
|
@ -1,125 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
xpath_attr,
|
||||
xpath_text,
|
||||
xpath_element,
|
||||
unescapeHTML,
|
||||
unified_timestamp,
|
||||
)
|
||||
|
||||
|
||||
class SpringboardPlatformIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
cms\.springboardplatform\.com/
|
||||
(?:
|
||||
(?:previews|embed_iframe)/(?P<index>\d+)/video/(?P<id>\d+)|
|
||||
xml_feeds_advanced/index/(?P<index_2>\d+)/rss3/(?P<id_2>\d+)
|
||||
)
|
||||
'''
|
||||
_TESTS = [{
|
||||
'url': 'http://cms.springboardplatform.com/previews/159/video/981017/0/0/1',
|
||||
'md5': '5c3cb7b5c55740d482561099e920f192',
|
||||
'info_dict': {
|
||||
'id': '981017',
|
||||
'ext': 'mp4',
|
||||
'title': 'Redman "BUD like YOU" "Usher Good Kisser" REMIX',
|
||||
'description': 'Redman "BUD like YOU" "Usher Good Kisser" REMIX',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'timestamp': 1409132328,
|
||||
'upload_date': '20140827',
|
||||
'duration': 193,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://cms.springboardplatform.com/embed_iframe/159/video/981017/rab007/rapbasement.com/1/1',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://cms.springboardplatform.com/embed_iframe/20/video/1731611/ki055/kidzworld.com/10',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://cms.springboardplatform.com/xml_feeds_advanced/index/159/rss3/981017/0/0/1/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage):
|
||||
return [
|
||||
mobj.group('url')
|
||||
for mobj in re.finditer(
|
||||
r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//cms\.springboardplatform\.com/embed_iframe/\d+/video/\d+.*?)\1',
|
||||
webpage)]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id') or mobj.group('id_2')
|
||||
index = mobj.group('index') or mobj.group('index_2')
|
||||
|
||||
video = self._download_xml(
|
||||
'http://cms.springboardplatform.com/xml_feeds_advanced/index/%s/rss3/%s'
|
||||
% (index, video_id), video_id)
|
||||
|
||||
item = xpath_element(video, './/item', 'item', fatal=True)
|
||||
|
||||
content = xpath_element(
|
||||
item, './{http://search.yahoo.com/mrss/}content', 'content',
|
||||
fatal=True)
|
||||
title = unescapeHTML(xpath_text(item, './title', 'title', fatal=True))
|
||||
|
||||
video_url = content.attrib['url']
|
||||
|
||||
if 'error_video.mp4' in video_url:
|
||||
raise ExtractorError(
|
||||
'Video %s no longer exists' % video_id, expected=True)
|
||||
|
||||
duration = int_or_none(content.get('duration'))
|
||||
tbr = int_or_none(content.get('bitrate'))
|
||||
filesize = int_or_none(content.get('fileSize'))
|
||||
width = int_or_none(content.get('width'))
|
||||
height = int_or_none(content.get('height'))
|
||||
|
||||
description = unescapeHTML(xpath_text(
|
||||
item, './description', 'description'))
|
||||
thumbnail = xpath_attr(
|
||||
item, './{http://search.yahoo.com/mrss/}thumbnail', 'url',
|
||||
'thumbnail')
|
||||
|
||||
timestamp = unified_timestamp(xpath_text(
|
||||
item, './{http://cms.springboardplatform.com/namespaces.html}created',
|
||||
'timestamp'))
|
||||
|
||||
formats = [{
|
||||
'url': video_url,
|
||||
'format_id': 'http',
|
||||
'tbr': tbr,
|
||||
'filesize': filesize,
|
||||
'width': width,
|
||||
'height': height,
|
||||
}]
|
||||
|
||||
m3u8_format = formats[0].copy()
|
||||
m3u8_format.update({
|
||||
'url': re.sub(r'(https?://)cdn\.', r'\1hls.', video_url) + '.m3u8',
|
||||
'ext': 'mp4',
|
||||
'format_id': 'hls',
|
||||
'protocol': 'm3u8_native',
|
||||
})
|
||||
formats.append(m3u8_format)
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'timestamp': timestamp,
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
}
|
@ -58,7 +58,7 @@ class TBSIE(TurnerBaseIE):
|
||||
continue
|
||||
if stream_data.get('playlistProtection') == 'spe':
|
||||
m3u8_url = self._add_akamai_spe_token(
|
||||
'http://token.vgtf.net/token/token_spe',
|
||||
'http://www.%s.com/service/token_spe' % site,
|
||||
m3u8_url, media_id, {
|
||||
'url': url,
|
||||
'site_name': site[:3].upper(),
|
||||
|
@ -5,9 +5,8 @@ import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
qualities,
|
||||
determine_ext,
|
||||
)
|
||||
|
||||
|
||||
@ -18,7 +17,6 @@ class TeacherTubeIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?teachertube\.com/(viewVideo\.php\?video_id=|music\.php\?music_id=|video/(?:[\da-z-]+-)?|audio/)(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
# flowplayer
|
||||
'url': 'http://www.teachertube.com/viewVideo.php?video_id=339997',
|
||||
'md5': 'f9434ef992fd65936d72999951ee254c',
|
||||
'info_dict': {
|
||||
@ -26,10 +24,19 @@ class TeacherTubeIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Measures of dispersion from a frequency table',
|
||||
'description': 'Measures of dispersion from a frequency table',
|
||||
'thumbnail': r're:https?://.*\.(?:jpg|png)',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.teachertube.com/viewVideo.php?video_id=340064',
|
||||
'md5': '0d625ec6bc9bf50f70170942ad580676',
|
||||
'info_dict': {
|
||||
'id': '340064',
|
||||
'ext': 'mp4',
|
||||
'title': 'How to Make Paper Dolls _ Paper Art Projects',
|
||||
'description': 'Learn how to make paper dolls in this simple',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}, {
|
||||
# jwplayer
|
||||
'url': 'http://www.teachertube.com/music.php?music_id=8805',
|
||||
'md5': '01e8352006c65757caf7b961f6050e21',
|
||||
'info_dict': {
|
||||
@ -39,21 +46,20 @@ class TeacherTubeIE(InfoExtractor):
|
||||
'description': 'RADIJSKA EMISIJA ZRAKOPLOVNE TEHNI?KE ?KOLE P',
|
||||
},
|
||||
}, {
|
||||
# unavailable video
|
||||
'url': 'http://www.teachertube.com/video/intro-video-schleicher-297790',
|
||||
'only_matching': True,
|
||||
'md5': '9c79fbb2dd7154823996fc28d4a26998',
|
||||
'info_dict': {
|
||||
'id': '297790',
|
||||
'ext': 'mp4',
|
||||
'title': 'Intro Video - Schleicher',
|
||||
'description': 'Intro Video - Why to flip, how flipping will',
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
error = self._search_regex(
|
||||
r'<div\b[^>]+\bclass=["\']msgBox error[^>]+>([^<]+)', webpage,
|
||||
'error', default=None)
|
||||
if error:
|
||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, error), expected=True)
|
||||
|
||||
title = self._html_search_meta('title', webpage, 'title', fatal=True)
|
||||
TITLE_SUFFIX = ' - TeacherTube'
|
||||
if title.endswith(TITLE_SUFFIX):
|
||||
@ -78,16 +84,12 @@ class TeacherTubeIE(InfoExtractor):
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnail = self._og_search_thumbnail(
|
||||
webpage, default=None) or self._html_search_meta(
|
||||
'thumbnail', webpage)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'thumbnail': self._html_search_regex(r'\'image\'\s*:\s*["\']([^"\']+)["\']', webpage, 'thumbnail'),
|
||||
'formats': formats,
|
||||
'description': description,
|
||||
}
|
||||
|
||||
|
||||
|
@ -1,20 +1,18 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import binascii
|
||||
import re
|
||||
import json
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_ord,
|
||||
)
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
qualities,
|
||||
determine_ext,
|
||||
)
|
||||
from ..compat import compat_ord
|
||||
|
||||
|
||||
class TeamcocoIE(InfoExtractor):
|
||||
@ -99,7 +97,7 @@ class TeamcocoIE(InfoExtractor):
|
||||
for i in range(len(cur_fragments)):
|
||||
cur_sequence = (''.join(cur_fragments[i:] + cur_fragments[:i])).encode('ascii')
|
||||
try:
|
||||
raw_data = compat_b64decode(cur_sequence)
|
||||
raw_data = base64.b64decode(cur_sequence)
|
||||
if compat_ord(raw_data[0]) == compat_ord('{'):
|
||||
return json.loads(raw_data.decode('utf-8'))
|
||||
except (TypeError, binascii.Error, UnicodeDecodeError, ValueError):
|
||||
|
@ -7,7 +7,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class TeleBruxellesIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(?:[^/]+/)*(?P<id>[^/#?]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt|emission)/?(?P<id>[^/#?]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://bx1.be/news/que-risque-lauteur-dune-fausse-alerte-a-la-bombe/',
|
||||
'md5': 'a2a67a5b1c3e8c9d33109b902f474fd9',
|
||||
@ -31,16 +31,6 @@ class TeleBruxellesIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://bx1.be/emission/bxenf1-gastronomie/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://bx1.be/berchem-sainte-agathe/personnel-carrefour-de-berchem-sainte-agathe-inquiet/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://bx1.be/dernier-jt/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# live stream
|
||||
'url': 'https://bx1.be/lives/direct-tv/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -48,29 +38,22 @@ class TeleBruxellesIE(InfoExtractor):
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
article_id = self._html_search_regex(
|
||||
r'<article[^>]+\bid=["\']post-(\d+)', webpage, 'article ID', default=None)
|
||||
r"<article id=\"post-(\d+)\"", webpage, 'article ID', default=None)
|
||||
title = self._html_search_regex(
|
||||
r'<h1[^>]*>(.+?)</h1>', webpage, 'title',
|
||||
default=None) or self._og_search_title(webpage)
|
||||
r'<h1 class=\"entry-title\">(.*?)</h1>', webpage, 'title')
|
||||
description = self._og_search_description(webpage, default=None)
|
||||
|
||||
rtmp_url = self._html_search_regex(
|
||||
r'file["\']?\s*:\s*"(r(?:tm|mt)ps?://[^/]+/(?:vod/mp4:"\s*\+\s*"[^"]+"\s*\+\s*"\.mp4|stream/live))"',
|
||||
r'file\s*:\s*"(rtmp://[^/]+/vod/mp4:"\s*\+\s*"[^"]+"\s*\+\s*".mp4)"',
|
||||
webpage, 'RTMP url')
|
||||
# Yes, they have a typo in scheme name for live stream URLs (e.g.
|
||||
# https://bx1.be/lives/direct-tv/)
|
||||
rtmp_url = re.sub(r'^rmtp', 'rtmp', rtmp_url)
|
||||
rtmp_url = re.sub(r'"\s*\+\s*"', '', rtmp_url)
|
||||
formats = self._extract_wowza_formats(rtmp_url, article_id or display_id)
|
||||
self._sort_formats(formats)
|
||||
|
||||
is_live = 'stream/live' in rtmp_url
|
||||
|
||||
return {
|
||||
'id': article_id or display_id,
|
||||
'display_id': display_id,
|
||||
'title': self._live_title(title) if is_live else title,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'formats': formats,
|
||||
'is_live': is_live,
|
||||
}
|
||||
|
106
youtube_dl/extractor/thesixtyone.py
Normal file
106
youtube_dl/extractor/thesixtyone.py
Normal file
@ -0,0 +1,106 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import unified_strdate
|
||||
|
||||
|
||||
class TheSixtyOneIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)https?://(?:www\.)?thesixtyone\.com/
|
||||
(?:.*?/)*
|
||||
(?:
|
||||
s|
|
||||
song/comments/list|
|
||||
song
|
||||
)/(?:[^/]+/)?(?P<id>[A-Za-z0-9]+)/?$'''
|
||||
_SONG_URL_TEMPLATE = 'http://thesixtyone.com/s/{0:}'
|
||||
_SONG_FILE_URL_TEMPLATE = 'http://{audio_server:}/thesixtyone_production/audio/{0:}_stream'
|
||||
_THUMBNAIL_URL_TEMPLATE = '{photo_base_url:}_desktop'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://www.thesixtyone.com/s/SrE3zD7s1jt/',
|
||||
'md5': '821cc43b0530d3222e3e2b70bb4622ea',
|
||||
'info_dict': {
|
||||
'id': 'SrE3zD7s1jt',
|
||||
'ext': 'mp3',
|
||||
'title': 'CASIO - Unicorn War Mixtape',
|
||||
'thumbnail': 're:^https?://.*_desktop$',
|
||||
'upload_date': '20071217',
|
||||
'duration': 3208,
|
||||
}
|
||||
},
|
||||
{
|
||||
'url': 'http://www.thesixtyone.com/song/comments/list/SrE3zD7s1jt',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://www.thesixtyone.com/s/ULoiyjuJWli#/s/SrE3zD7s1jt/',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://www.thesixtyone.com/#/s/SrE3zD7s1jt/',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://www.thesixtyone.com/song/SrE3zD7s1jt/',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://www.thesixtyone.com/maryatmidnight/song/StrawberriesandCream/yvWtLp0c4GQ/',
|
||||
'only_matching': True,
|
||||
},
|
||||
]
|
||||
|
||||
_DECODE_MAP = {
|
||||
'x': 'a',
|
||||
'm': 'b',
|
||||
'w': 'c',
|
||||
'q': 'd',
|
||||
'n': 'e',
|
||||
'p': 'f',
|
||||
'a': '0',
|
||||
'h': '1',
|
||||
'e': '2',
|
||||
'u': '3',
|
||||
's': '4',
|
||||
'i': '5',
|
||||
'o': '6',
|
||||
'y': '7',
|
||||
'r': '8',
|
||||
'c': '9'
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
song_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(
|
||||
self._SONG_URL_TEMPLATE.format(song_id), song_id)
|
||||
|
||||
song_data = self._parse_json(self._search_regex(
|
||||
r'"%s":\s(\{.*?\})' % song_id, webpage, 'song_data'), song_id)
|
||||
|
||||
if self._search_regex(r'(t61\.s3_audio_load\s*=\s*1\.0;)', webpage, 's3_audio_load marker', default=None):
|
||||
song_data['audio_server'] = 's3.amazonaws.com'
|
||||
else:
|
||||
song_data['audio_server'] = song_data['audio_server'] + '.thesixtyone.com'
|
||||
|
||||
keys = [self._DECODE_MAP.get(s, s) for s in song_data['key']]
|
||||
url = self._SONG_FILE_URL_TEMPLATE.format(
|
||||
"".join(reversed(keys)), **song_data)
|
||||
|
||||
formats = [{
|
||||
'format_id': 'sd',
|
||||
'url': url,
|
||||
'ext': 'mp3',
|
||||
}]
|
||||
|
||||
return {
|
||||
'id': song_id,
|
||||
'title': '{artist:} - {name:}'.format(**song_data),
|
||||
'formats': formats,
|
||||
'comment_count': song_data.get('comments_count'),
|
||||
'duration': song_data.get('play_time'),
|
||||
'like_count': song_data.get('score'),
|
||||
'thumbnail': self._THUMBNAIL_URL_TEMPLATE.format(**song_data),
|
||||
'upload_date': unified_strdate(song_data.get('publish_date')),
|
||||
}
|
50
youtube_dl/extractor/totalwebcasting.py
Normal file
50
youtube_dl/extractor/totalwebcasting.py
Normal file
@ -0,0 +1,50 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class TotalWebCastingIE(InfoExtractor):
|
||||
IE_NAME = 'totalwebcasting.com'
|
||||
_VALID_URL = r'https?://www\.totalwebcasting\.com/view/\?func=VOFF.*'
|
||||
_TEST = {
|
||||
'url': 'https://www.totalwebcasting.com/view/?func=VOFF&id=columbia&date=2017-01-04&seq=1',
|
||||
'info_dict': {
|
||||
'id': '270e1c415d443924485f547403180906731570466a42740764673853041316737548',
|
||||
'title': 'Real World Cryptography Conference 2017',
|
||||
'description': 'md5:47a31e91ed537a2bb0d3a091659dc80c',
|
||||
},
|
||||
'playlist_count': 6,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
params = url.split('?', 1)[1]
|
||||
webpage = self._download_webpage(url, params)
|
||||
aprm = self._search_regex(r"startVideo\('(\w+)'", webpage, 'aprm')
|
||||
VLEV = self._download_json("https://www.totalwebcasting.com/view/?func=VLEV&aprm=%s&style=G" % aprm, aprm)
|
||||
parts = []
|
||||
for s in VLEV["aiTimes"].values():
|
||||
n = int(s[:-5])
|
||||
if n == 99:
|
||||
continue
|
||||
if n not in parts:
|
||||
parts.append(n)
|
||||
parts.sort()
|
||||
title = VLEV["title"]
|
||||
entries = []
|
||||
for p in parts:
|
||||
VLEV = self._download_json("https://www.totalwebcasting.com/view/?func=VLEV&aprm=%s&style=G&refP=1&nf=%d&time=1&cs=1&ns=1" % (aprm, p), aprm)
|
||||
for s in VLEV["playerObj"]["clip"]["sources"]:
|
||||
if s["type"] != "video/mp4":
|
||||
continue
|
||||
entries.append({
|
||||
"id": "%s_part%d" % (aprm, p),
|
||||
"url": "https:" + s["src"],
|
||||
"title": title,
|
||||
})
|
||||
return {
|
||||
'_type': 'multi_video',
|
||||
'id': aprm,
|
||||
'entries': entries,
|
||||
'title': title,
|
||||
'description': VLEV.get("desc"),
|
||||
}
|
@ -1,10 +1,9 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_parse_qs,
|
||||
)
|
||||
from ..compat import compat_parse_qs
|
||||
|
||||
|
||||
class TutvIE(InfoExtractor):
|
||||
@ -27,7 +26,7 @@ class TutvIE(InfoExtractor):
|
||||
|
||||
data_content = self._download_webpage(
|
||||
'http://tu.tv/flvurl.php?codVideo=%s' % internal_id, video_id, 'Downloading video info')
|
||||
video_url = compat_b64decode(compat_parse_qs(data_content)['kpt'][0]).decode('utf-8')
|
||||
video_url = base64.b64decode(compat_parse_qs(data_content)['kpt'][0].encode('utf-8')).decode('utf-8')
|
||||
|
||||
return {
|
||||
'id': internal_id,
|
||||
|
@ -273,8 +273,6 @@ class TVPlayIE(InfoExtractor):
|
||||
'ext': ext,
|
||||
}
|
||||
if video_url.startswith('rtmp'):
|
||||
if smuggled_data.get('skip_rtmp'):
|
||||
continue
|
||||
m = re.search(
|
||||
r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', video_url)
|
||||
if not m:
|
||||
@ -436,10 +434,6 @@ class ViafreeIE(InfoExtractor):
|
||||
return self.url_result(
|
||||
smuggle_url(
|
||||
'mtg:%s' % video_id,
|
||||
{
|
||||
'geo_countries': [
|
||||
compat_urlparse.urlparse(url).netloc.rsplit('.', 1)[-1]],
|
||||
# rtmp host mtgfs.fplive.net for viafree is unresolvable
|
||||
'skip_rtmp': True,
|
||||
}),
|
||||
{'geo_countries': [
|
||||
compat_urlparse.urlparse(url).netloc.rsplit('.', 1)[-1]]}),
|
||||
ie=TVPlayIE.ie_key(), video_id=video_id)
|
||||
|
@ -85,15 +85,10 @@ class TwitchBaseIE(InfoExtractor):
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
|
||||
response = self._parse_json(
|
||||
e.cause.read().decode('utf-8'), None)
|
||||
fail(response.get('message') or response['errors'][0])
|
||||
fail(response['message'])
|
||||
raise
|
||||
|
||||
if 'Authenticated successfully' in response.get('message', ''):
|
||||
return None, None
|
||||
|
||||
redirect_url = urljoin(
|
||||
post_url,
|
||||
response.get('redirect') or response['redirect_path'])
|
||||
redirect_url = urljoin(post_url, response['redirect'])
|
||||
return self._download_webpage_handle(
|
||||
redirect_url, None, 'Downloading login redirect page',
|
||||
headers=headers)
|
||||
@ -111,10 +106,6 @@ class TwitchBaseIE(InfoExtractor):
|
||||
'password': password,
|
||||
})
|
||||
|
||||
# Successful login
|
||||
if not redirect_page:
|
||||
return
|
||||
|
||||
if re.search(r'(?i)<form[^>]+id="two-factor-submit"', redirect_page) is not None:
|
||||
# TODO: Add mechanism to request an SMS or phone call
|
||||
tfa_token = self._get_tfa_info('two-factor authentication token')
|
||||
|
@ -318,14 +318,9 @@ class VKIE(VKBaseIE):
|
||||
'You are trying to log in from an unusual location. You should confirm ownership at vk.com to log in with this IP.',
|
||||
expected=True)
|
||||
|
||||
ERROR_COPYRIGHT = 'Video %s has been removed from public access due to rightholder complaint.'
|
||||
|
||||
ERRORS = {
|
||||
r'>Видеозапись .*? была изъята из публичного доступа в связи с обращением правообладателя.<':
|
||||
ERROR_COPYRIGHT,
|
||||
|
||||
r'>The video .*? was removed from public access by request of the copyright holder.<':
|
||||
ERROR_COPYRIGHT,
|
||||
'Video %s has been removed from public access due to rightholder complaint.',
|
||||
|
||||
r'<!>Please log in or <':
|
||||
'Video %s is only available for registered users, '
|
||||
|
@ -4,50 +4,49 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
js_to_json,
|
||||
strip_jsonp,
|
||||
try_get,
|
||||
unified_strdate,
|
||||
update_url_query,
|
||||
urlhandle_detect_ext,
|
||||
)
|
||||
|
||||
|
||||
class WDRIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://deviceids-medp\.wdr\.de/ondemand/\d+/(?P<id>\d+)\.js'
|
||||
_GEO_COUNTRIES = ['DE']
|
||||
_TEST = {
|
||||
'url': 'http://deviceids-medp.wdr.de/ondemand/155/1557833.js',
|
||||
'info_dict': {
|
||||
'id': 'mdb-1557833',
|
||||
'ext': 'mp4',
|
||||
'title': 'Biathlon-Staffel verpasst Podest bei Olympia-Generalprobe',
|
||||
'upload_date': '20180112',
|
||||
},
|
||||
}
|
||||
class WDRBaseIE(InfoExtractor):
|
||||
def _extract_wdr_video(self, webpage, display_id):
|
||||
# for wdr.de the data-extension is in a tag with the class "mediaLink"
|
||||
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
|
||||
# for wdrmaus, in a tag with the class "videoButton" (previously a link
|
||||
# to the page in a multiline "videoLink"-tag)
|
||||
json_metadata = self._html_search_regex(
|
||||
r'''(?sx)class=
|
||||
(?:
|
||||
(["\'])(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b.*?\1[^>]+|
|
||||
(["\'])videoLink\b.*?\2[\s]*>\n[^\n]*
|
||||
)data-extension=(["\'])(?P<data>(?:(?!\3).)+)\3
|
||||
''',
|
||||
webpage, 'media link', default=None, group='data')
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
if not json_metadata:
|
||||
return
|
||||
|
||||
media_link_obj = self._parse_json(json_metadata, display_id,
|
||||
transform_source=js_to_json)
|
||||
jsonp_url = media_link_obj['mediaObj']['url']
|
||||
|
||||
metadata = self._download_json(
|
||||
url, video_id, transform_source=strip_jsonp)
|
||||
jsonp_url, display_id, transform_source=strip_jsonp)
|
||||
|
||||
is_live = metadata.get('mediaType') == 'live'
|
||||
|
||||
tracker_data = metadata['trackerData']
|
||||
media_resource = metadata['mediaResource']
|
||||
metadata_tracker_data = metadata['trackerData']
|
||||
metadata_media_resource = metadata['mediaResource']
|
||||
|
||||
formats = []
|
||||
|
||||
# check if the metadata contains a direct URL to a file
|
||||
for kind, media_resource in media_resource.items():
|
||||
for kind, media_resource in metadata_media_resource.items():
|
||||
if kind not in ('dflt', 'alt'):
|
||||
continue
|
||||
|
||||
@ -58,13 +57,13 @@ class WDRIE(InfoExtractor):
|
||||
ext = determine_ext(medium_url)
|
||||
if ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
medium_url, video_id, 'mp4', 'm3u8_native',
|
||||
medium_url, display_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls'))
|
||||
elif ext == 'f4m':
|
||||
manifest_url = update_url_query(
|
||||
medium_url, {'hdcore': '3.2.0', 'plugin': 'aasp-3.2.0.77.18'})
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
manifest_url, video_id, f4m_id='hds', fatal=False))
|
||||
manifest_url, display_id, f4m_id='hds', fatal=False))
|
||||
elif ext == 'smil':
|
||||
formats.extend(self._extract_smil_formats(
|
||||
medium_url, 'stream', fatal=False))
|
||||
@ -74,7 +73,7 @@ class WDRIE(InfoExtractor):
|
||||
}
|
||||
if ext == 'unknown_video':
|
||||
urlh = self._request_webpage(
|
||||
medium_url, video_id, note='Determining extension')
|
||||
medium_url, display_id, note='Determining extension')
|
||||
ext = urlhandle_detect_ext(urlh)
|
||||
a_format['ext'] = ext
|
||||
formats.append(a_format)
|
||||
@ -82,30 +81,30 @@ class WDRIE(InfoExtractor):
|
||||
self._sort_formats(formats)
|
||||
|
||||
subtitles = {}
|
||||
caption_url = media_resource.get('captionURL')
|
||||
caption_url = metadata_media_resource.get('captionURL')
|
||||
if caption_url:
|
||||
subtitles['de'] = [{
|
||||
'url': caption_url,
|
||||
'ext': 'ttml',
|
||||
}]
|
||||
|
||||
title = tracker_data['trackerClipTitle']
|
||||
title = metadata_tracker_data['trackerClipTitle']
|
||||
|
||||
return {
|
||||
'id': tracker_data.get('trackerClipId', video_id),
|
||||
'title': self._live_title(title) if is_live else title,
|
||||
'alt_title': tracker_data.get('trackerClipSubcategory'),
|
||||
'id': metadata_tracker_data.get('trackerClipId', display_id),
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'alt_title': metadata_tracker_data.get('trackerClipSubcategory'),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'upload_date': unified_strdate(tracker_data.get('trackerClipAirTime')),
|
||||
'is_live': is_live,
|
||||
'upload_date': unified_strdate(metadata_tracker_data.get('trackerClipAirTime')),
|
||||
}
|
||||
|
||||
|
||||
class WDRPageIE(InfoExtractor):
|
||||
class WDRIE(WDRBaseIE):
|
||||
_CURRENT_MAUS_URL = r'https?://(?:www\.)wdrmaus.de/(?:[^/]+/){1,2}[^/?#]+\.php5'
|
||||
_PAGE_REGEX = r'/(?:mediathek/)?(?:[^/]+/)*(?P<display_id>[^/]+)\.html'
|
||||
_VALID_URL = r'https?://(?:www\d?\.)?(?:wdr\d?|sportschau)\.de' + _PAGE_REGEX + '|' + _CURRENT_MAUS_URL
|
||||
_PAGE_REGEX = r'/(?:mediathek/)?[^/]+/(?P<type>[^/]+)/(?P<display_id>.+)\.html'
|
||||
_VALID_URL = r'(?P<page_url>https?://(?:www\d\.)?wdr\d?\.de)' + _PAGE_REGEX + '|' + _CURRENT_MAUS_URL
|
||||
|
||||
_TESTS = [
|
||||
{
|
||||
@ -125,7 +124,6 @@ class WDRPageIE(InfoExtractor):
|
||||
'ext': 'ttml',
|
||||
}]},
|
||||
},
|
||||
'skip': 'HTTP Error 404: Not Found',
|
||||
},
|
||||
{
|
||||
'url': 'http://www1.wdr.de/mediathek/audio/wdr3/wdr3-gespraech-am-samstag/audio-schriftstellerin-juli-zeh-100.html',
|
||||
@ -141,17 +139,19 @@ class WDRPageIE(InfoExtractor):
|
||||
'is_live': False,
|
||||
'subtitles': {}
|
||||
},
|
||||
'skip': 'HTTP Error 404: Not Found',
|
||||
},
|
||||
{
|
||||
'url': 'http://www1.wdr.de/mediathek/video/live/index.html',
|
||||
'info_dict': {
|
||||
'id': 'mdb-1406149',
|
||||
'id': 'mdb-103364',
|
||||
'ext': 'mp4',
|
||||
'title': r're:^WDR Fernsehen im Livestream \(nur in Deutschland erreichbar\) [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||
'display_id': 'index',
|
||||
'title': r're:^WDR Fernsehen im Livestream [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||
'alt_title': 'WDR Fernsehen Live',
|
||||
'upload_date': '20150101',
|
||||
'upload_date': None,
|
||||
'description': 'md5:ae2ff888510623bf8d4b115f95a9b7c9',
|
||||
'is_live': True,
|
||||
'subtitles': {}
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # m3u8 download
|
||||
@ -159,18 +159,19 @@ class WDRPageIE(InfoExtractor):
|
||||
},
|
||||
{
|
||||
'url': 'http://www1.wdr.de/mediathek/video/sendungen/aktuelle-stunde/aktuelle-stunde-120.html',
|
||||
'playlist_mincount': 7,
|
||||
'playlist_mincount': 8,
|
||||
'info_dict': {
|
||||
'id': 'aktuelle-stunde-120',
|
||||
'id': 'aktuelle-stunde/aktuelle-stunde-120',
|
||||
},
|
||||
},
|
||||
{
|
||||
'url': 'http://www.wdrmaus.de/aktuelle-sendung/index.php5',
|
||||
'info_dict': {
|
||||
'id': 'mdb-1552552',
|
||||
'id': 'mdb-1323501',
|
||||
'ext': 'mp4',
|
||||
'upload_date': 're:^[0-9]{8}$',
|
||||
'title': 're:^Die Sendung mit der Maus vom [0-9.]{10}$',
|
||||
'description': 'Die Seite mit der Maus -',
|
||||
},
|
||||
'skip': 'The id changes from week to week because of the new episode'
|
||||
},
|
||||
@ -182,6 +183,7 @@ class WDRPageIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20130919',
|
||||
'title': 'Sachgeschichte - Achterbahn ',
|
||||
'description': 'Die Seite mit der Maus -',
|
||||
},
|
||||
},
|
||||
{
|
||||
@ -189,114 +191,52 @@ class WDRPageIE(InfoExtractor):
|
||||
# Live stream, MD5 unstable
|
||||
'info_dict': {
|
||||
'id': 'mdb-869971',
|
||||
'ext': 'mp4',
|
||||
'title': r're:^COSMO Livestream [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||
'ext': 'flv',
|
||||
'title': 'COSMO Livestream',
|
||||
'description': 'md5:2309992a6716c347891c045be50992e4',
|
||||
'upload_date': '20160101',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # m3u8 download
|
||||
}
|
||||
},
|
||||
{
|
||||
'url': 'http://www.sportschau.de/handballem2018/handball-nationalmannschaft-em-stolperstein-vorrunde-100.html',
|
||||
'info_dict': {
|
||||
'id': 'mdb-1556012',
|
||||
'ext': 'mp4',
|
||||
'title': 'DHB-Vizepräsident Bob Hanning - "Die Weltspitze ist extrem breit"',
|
||||
'upload_date': '20180111',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
},
|
||||
{
|
||||
'url': 'http://www.sportschau.de/handballem2018/audio-vorschau---die-handball-em-startet-mit-grossem-favoritenfeld-100.html',
|
||||
'only_matching': True,
|
||||
}
|
||||
]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
url_type = mobj.group('type')
|
||||
page_url = mobj.group('page_url')
|
||||
display_id = mobj.group('display_id')
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
entries = []
|
||||
info_dict = self._extract_wdr_video(webpage, display_id)
|
||||
|
||||
# Article with several videos
|
||||
|
||||
# for wdr.de the data-extension is in a tag with the class "mediaLink"
|
||||
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
|
||||
# for wdrmaus, in a tag with the class "videoButton" (previously a link
|
||||
# to the page in a multiline "videoLink"-tag)
|
||||
for mobj in re.finditer(
|
||||
r'''(?sx)class=
|
||||
(?:
|
||||
(["\'])(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b.*?\1[^>]+|
|
||||
(["\'])videoLink\b.*?\2[\s]*>\n[^\n]*
|
||||
)data-extension=(["\'])(?P<data>(?:(?!\3).)+)\3
|
||||
''', webpage):
|
||||
media_link_obj = self._parse_json(
|
||||
mobj.group('data'), display_id, transform_source=js_to_json,
|
||||
fatal=False)
|
||||
if not media_link_obj:
|
||||
continue
|
||||
jsonp_url = try_get(
|
||||
media_link_obj, lambda x: x['mediaObj']['url'], compat_str)
|
||||
if jsonp_url:
|
||||
entries.append(self.url_result(jsonp_url, ie=WDRIE.ie_key()))
|
||||
|
||||
# Playlist (e.g. https://www1.wdr.de/mediathek/video/sendungen/aktuelle-stunde/aktuelle-stunde-120.html)
|
||||
if not entries:
|
||||
if not info_dict:
|
||||
entries = [
|
||||
self.url_result(
|
||||
compat_urlparse.urljoin(url, mobj.group('href')),
|
||||
ie=WDRPageIE.ie_key())
|
||||
for mobj in re.finditer(
|
||||
r'<a[^>]+\bhref=(["\'])(?P<href>(?:(?!\1).)+)\1[^>]+\bdata-extension=',
|
||||
webpage) if re.match(self._PAGE_REGEX, mobj.group('href'))
|
||||
self.url_result(page_url + href[0], 'WDR')
|
||||
for href in re.findall(
|
||||
r'<a href="(%s)"[^>]+data-extension=' % self._PAGE_REGEX,
|
||||
webpage)
|
||||
]
|
||||
|
||||
if entries: # Playlist page
|
||||
return self.playlist_result(entries, playlist_id=display_id)
|
||||
|
||||
raise ExtractorError('No downloadable streams found', expected=True)
|
||||
|
||||
class WDRElefantIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)wdrmaus\.de/elefantenseite/#(?P<id>.+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.wdrmaus.de/elefantenseite/#folge_ostern_2015',
|
||||
'info_dict': {
|
||||
'title': 'Folge Oster-Spezial 2015',
|
||||
'id': 'mdb-1088195',
|
||||
'ext': 'mp4',
|
||||
'age_limit': None,
|
||||
'upload_date': '20150406'
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}
|
||||
is_live = url_type == 'live'
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
if is_live:
|
||||
info_dict.update({
|
||||
'title': self._live_title(info_dict['title']),
|
||||
'upload_date': None,
|
||||
})
|
||||
elif 'upload_date' not in info_dict:
|
||||
info_dict['upload_date'] = unified_strdate(self._html_search_meta('DC.Date', webpage, 'upload date'))
|
||||
|
||||
# Table of Contents seems to always be at this address, so fetch it directly.
|
||||
# The website fetches configurationJS.php5, which links to tableOfContentsJS.php5.
|
||||
table_of_contents = self._download_json(
|
||||
'https://www.wdrmaus.de/elefantenseite/data/tableOfContentsJS.php5',
|
||||
display_id)
|
||||
if display_id not in table_of_contents:
|
||||
raise ExtractorError(
|
||||
'No entry in site\'s table of contents for this URL. '
|
||||
'Is the fragment part of the URL (after the #) correct?',
|
||||
expected=True)
|
||||
xml_metadata_path = table_of_contents[display_id]['xmlPath']
|
||||
xml_metadata = self._download_xml(
|
||||
'https://www.wdrmaus.de/elefantenseite/' + xml_metadata_path,
|
||||
display_id)
|
||||
zmdb_url_element = xml_metadata.find('./movie/zmdb_url')
|
||||
if zmdb_url_element is None:
|
||||
raise ExtractorError(
|
||||
'%s is not a video' % display_id, expected=True)
|
||||
return self.url_result(zmdb_url_element.text, ie=WDRIE.ie_key())
|
||||
info_dict.update({
|
||||
'description': self._html_search_meta('Description', webpage),
|
||||
'is_live': is_live,
|
||||
})
|
||||
|
||||
return info_dict
|
||||
|
||||
|
||||
class WDRMobileIE(InfoExtractor):
|
||||
|
@ -1,140 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
import json
|
||||
import random
|
||||
import re
|
||||
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_str,
|
||||
)
|
||||
from ..utils import (
|
||||
js_to_json,
|
||||
strip_jsonp,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class WeiboIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://weibo\.com/[0-9]+/(?P<id>[a-zA-Z0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'https://weibo.com/6275294458/Fp6RGfbff?type=comment',
|
||||
'info_dict': {
|
||||
'id': 'Fp6RGfbff',
|
||||
'ext': 'mp4',
|
||||
'title': 'You should have servants to massage you,... 来自Hosico_猫 - 微博',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
# to get Referer url for genvisitor
|
||||
webpage, urlh = self._download_webpage_handle(url, video_id)
|
||||
|
||||
visitor_url = urlh.geturl()
|
||||
|
||||
if 'passport.weibo.com' in visitor_url:
|
||||
# first visit
|
||||
visitor_data = self._download_json(
|
||||
'https://passport.weibo.com/visitor/genvisitor', video_id,
|
||||
note='Generating first-visit data',
|
||||
transform_source=strip_jsonp,
|
||||
headers={'Referer': visitor_url},
|
||||
data=urlencode_postdata({
|
||||
'cb': 'gen_callback',
|
||||
'fp': json.dumps({
|
||||
'os': '2',
|
||||
'browser': 'Gecko57,0,0,0',
|
||||
'fonts': 'undefined',
|
||||
'screenInfo': '1440*900*24',
|
||||
'plugins': '',
|
||||
}),
|
||||
}))
|
||||
|
||||
tid = visitor_data['data']['tid']
|
||||
cnfd = '%03d' % visitor_data['data']['confidence']
|
||||
|
||||
self._download_webpage(
|
||||
'https://passport.weibo.com/visitor/visitor', video_id,
|
||||
note='Running first-visit callback',
|
||||
query={
|
||||
'a': 'incarnate',
|
||||
't': tid,
|
||||
'w': 2,
|
||||
'c': cnfd,
|
||||
'cb': 'cross_domain',
|
||||
'from': 'weibo',
|
||||
'_rand': random.random(),
|
||||
})
|
||||
|
||||
webpage = self._download_webpage(
|
||||
url, video_id, note='Revisiting webpage')
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<title>(.+?)</title>', webpage, 'title')
|
||||
|
||||
video_formats = compat_parse_qs(self._search_regex(
|
||||
r'video-sources=\\\"(.+?)\"', webpage, 'video_sources'))
|
||||
|
||||
formats = []
|
||||
supported_resolutions = (480, 720)
|
||||
for res in supported_resolutions:
|
||||
vid_urls = video_formats.get(compat_str(res))
|
||||
if not vid_urls or not isinstance(vid_urls, list):
|
||||
continue
|
||||
|
||||
vid_url = vid_urls[0]
|
||||
formats.append({
|
||||
'url': vid_url,
|
||||
'height': res,
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
uploader = self._og_search_property(
|
||||
'nick-name', webpage, 'uploader', default=None)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'uploader': uploader,
|
||||
'formats': formats
|
||||
}
|
||||
|
||||
|
||||
class WeiboMobileIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://m\.weibo\.cn/status/(?P<id>[0-9]+)(\?.+)?'
|
||||
_TEST = {
|
||||
'url': 'https://m.weibo.cn/status/4189191225395228?wm=3333_2001&sourcetype=weixin&featurecode=newtitle&from=singlemessage&isappinstalled=0',
|
||||
'info_dict': {
|
||||
'id': '4189191225395228',
|
||||
'ext': 'mp4',
|
||||
'title': '午睡当然是要甜甜蜜蜜的啦',
|
||||
'uploader': '柴犬柴犬'
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
# to get Referer url for genvisitor
|
||||
webpage = self._download_webpage(url, video_id, note='visit the page')
|
||||
|
||||
weibo_info = self._parse_json(self._search_regex(
|
||||
r'var\s+\$render_data\s*=\s*\[({.*})\]\[0\]\s*\|\|\s*{};',
|
||||
webpage, 'js_code', flags=re.DOTALL),
|
||||
video_id, transform_source=js_to_json)
|
||||
|
||||
status_data = weibo_info.get('status', {})
|
||||
page_info = status_data.get('page_info')
|
||||
title = status_data['status_title']
|
||||
uploader = status_data.get('user', {}).get('screen_name')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'uploader': uploader,
|
||||
'url': page_info['media_info']['stream_url']
|
||||
}
|
@ -1,233 +0,0 @@
|
||||
# coding: utf-8
|
||||
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import itertools
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class XimalayaBaseIE(InfoExtractor):
|
||||
_GEO_COUNTRIES = ['CN']
|
||||
|
||||
|
||||
class XimalayaIE(XimalayaBaseIE):
|
||||
IE_NAME = 'ximalaya'
|
||||
IE_DESC = '喜马拉雅FM'
|
||||
_VALID_URL = r'https?://(?:www\.|m\.)?ximalaya\.com/(?P<uid>[0-9]+)/sound/(?P<id>[0-9]+)'
|
||||
_USER_URL_FORMAT = '%s://www.ximalaya.com/zhubo/%i/'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://www.ximalaya.com/61425525/sound/47740352/',
|
||||
'info_dict': {
|
||||
'id': '47740352',
|
||||
'ext': 'm4a',
|
||||
'uploader': '小彬彬爱听书',
|
||||
'uploader_id': 61425525,
|
||||
'uploader_url': 'http://www.ximalaya.com/zhubo/61425525/',
|
||||
'title': '261.唐诗三百首.卷八.送孟浩然之广陵.李白',
|
||||
'description': "contains:《送孟浩然之广陵》\n作者:李白\n故人西辞黄鹤楼,烟花三月下扬州。\n孤帆远影碧空尽,惟见长江天际流。",
|
||||
'thumbnails': [
|
||||
{
|
||||
'name': 'cover_url',
|
||||
'url': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
{
|
||||
'name': 'cover_url_142',
|
||||
'url': r're:^https?://.*\.jpg$',
|
||||
'width': 180,
|
||||
'height': 180
|
||||
}
|
||||
],
|
||||
'categories': ['renwen', '人文'],
|
||||
'duration': 93,
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
}
|
||||
},
|
||||
{
|
||||
'url': 'http://m.ximalaya.com/61425525/sound/47740352/',
|
||||
'info_dict': {
|
||||
'id': '47740352',
|
||||
'ext': 'm4a',
|
||||
'uploader': '小彬彬爱听书',
|
||||
'uploader_id': 61425525,
|
||||
'uploader_url': 'http://www.ximalaya.com/zhubo/61425525/',
|
||||
'title': '261.唐诗三百首.卷八.送孟浩然之广陵.李白',
|
||||
'description': "contains:《送孟浩然之广陵》\n作者:李白\n故人西辞黄鹤楼,烟花三月下扬州。\n孤帆远影碧空尽,惟见长江天际流。",
|
||||
'thumbnails': [
|
||||
{
|
||||
'name': 'cover_url',
|
||||
'url': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
{
|
||||
'name': 'cover_url_142',
|
||||
'url': r're:^https?://.*\.jpg$',
|
||||
'width': 180,
|
||||
'height': 180
|
||||
}
|
||||
],
|
||||
'categories': ['renwen', '人文'],
|
||||
'duration': 93,
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
}
|
||||
},
|
||||
{
|
||||
'url': 'https://www.ximalaya.com/11045267/sound/15705996/',
|
||||
'info_dict': {
|
||||
'id': '15705996',
|
||||
'ext': 'm4a',
|
||||
'uploader': '李延隆老师',
|
||||
'uploader_id': 11045267,
|
||||
'uploader_url': 'https://www.ximalaya.com/zhubo/11045267/',
|
||||
'title': 'Lesson 1 Excuse me!',
|
||||
'description': "contains:Listen to the tape then answer\xa0this question. Whose handbag is it?\n"
|
||||
"听录音,然后回答问题,这是谁的手袋?",
|
||||
'thumbnails': [
|
||||
{
|
||||
'name': 'cover_url',
|
||||
'url': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
{
|
||||
'name': 'cover_url_142',
|
||||
'url': r're:^https?://.*\.jpg$',
|
||||
'width': 180,
|
||||
'height': 180
|
||||
}
|
||||
],
|
||||
'categories': ['train', '外语'],
|
||||
'duration': 40,
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
}
|
||||
},
|
||||
]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
is_m = 'm.ximalaya' in url
|
||||
scheme = 'https' if url.startswith('https') else 'http'
|
||||
|
||||
audio_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, audio_id,
|
||||
note='Download sound page for %s' % audio_id,
|
||||
errnote='Unable to get sound page')
|
||||
|
||||
audio_info_file = '%s://m.ximalaya.com/tracks/%s.json' % (scheme, audio_id)
|
||||
audio_info = self._download_json(audio_info_file, audio_id,
|
||||
'Downloading info json %s' % audio_info_file,
|
||||
'Unable to download info file')
|
||||
|
||||
formats = []
|
||||
for bps, k in (('24k', 'play_path_32'), ('64k', 'play_path_64')):
|
||||
if audio_info.get(k):
|
||||
formats.append({
|
||||
'format_id': bps,
|
||||
'url': audio_info[k],
|
||||
})
|
||||
|
||||
thumbnails = []
|
||||
for k in audio_info.keys():
|
||||
# cover pics kyes like: cover_url', 'cover_url_142'
|
||||
if k.startswith('cover_url'):
|
||||
thumbnail = {'name': k, 'url': audio_info[k]}
|
||||
if k == 'cover_url_142':
|
||||
thumbnail['width'] = 180
|
||||
thumbnail['height'] = 180
|
||||
thumbnails.append(thumbnail)
|
||||
|
||||
audio_uploader_id = audio_info.get('uid')
|
||||
|
||||
if is_m:
|
||||
audio_description = self._html_search_regex(r'(?s)<section\s+class=["\']content[^>]+>(.+?)</section>',
|
||||
webpage, 'audio_description', fatal=False)
|
||||
else:
|
||||
audio_description = self._html_search_regex(r'(?s)<div\s+class=["\']rich_intro[^>]*>(.+?</article>)',
|
||||
webpage, 'audio_description', fatal=False)
|
||||
|
||||
if not audio_description:
|
||||
audio_description_file = '%s://www.ximalaya.com/sounds/%s/rich_intro' % (scheme, audio_id)
|
||||
audio_description = self._download_webpage(audio_description_file, audio_id,
|
||||
note='Downloading description file %s' % audio_description_file,
|
||||
errnote='Unable to download descrip file',
|
||||
fatal=False)
|
||||
audio_description = audio_description.strip() if audio_description else None
|
||||
|
||||
return {
|
||||
'id': audio_id,
|
||||
'uploader': audio_info.get('nickname'),
|
||||
'uploader_id': audio_uploader_id,
|
||||
'uploader_url': self._USER_URL_FORMAT % (scheme, audio_uploader_id) if audio_uploader_id else None,
|
||||
'title': audio_info['title'],
|
||||
'thumbnails': thumbnails,
|
||||
'description': audio_description,
|
||||
'categories': list(filter(None, (audio_info.get('category_name'), audio_info.get('category_title')))),
|
||||
'duration': audio_info.get('duration'),
|
||||
'view_count': audio_info.get('play_count'),
|
||||
'like_count': audio_info.get('favorites_count'),
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class XimalayaAlbumIE(XimalayaBaseIE):
|
||||
IE_NAME = 'ximalaya:album'
|
||||
IE_DESC = '喜马拉雅FM 专辑'
|
||||
_VALID_URL = r'https?://(?:www\.|m\.)?ximalaya\.com/(?P<uid>[0-9]+)/album/(?P<id>[0-9]+)'
|
||||
_TEMPLATE_URL = '%s://www.ximalaya.com/%s/album/%s/'
|
||||
_BASE_URL_TEMPL = '%s://www.ximalaya.com%s'
|
||||
_LIST_VIDEO_RE = r'<a[^>]+?href="(?P<url>/%s/sound/(?P<id>\d+)/?)"[^>]+?title="(?P<title>[^>]+)">'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ximalaya.com/61425525/album/5534601/',
|
||||
'info_dict': {
|
||||
'title': '唐诗三百首(含赏析)',
|
||||
'id': '5534601',
|
||||
},
|
||||
'playlist_count': 312,
|
||||
}, {
|
||||
'url': 'http://m.ximalaya.com/61425525/album/5534601',
|
||||
'info_dict': {
|
||||
'title': '唐诗三百首(含赏析)',
|
||||
'id': '5534601',
|
||||
},
|
||||
'playlist_count': 312,
|
||||
},
|
||||
]
|
||||
|
||||
def _real_extract(self, url):
|
||||
self.scheme = scheme = 'https' if url.startswith('https') else 'http'
|
||||
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
uid, playlist_id = mobj.group('uid'), mobj.group('id')
|
||||
|
||||
webpage = self._download_webpage(self._TEMPLATE_URL % (scheme, uid, playlist_id), playlist_id,
|
||||
note='Download album page for %s' % playlist_id,
|
||||
errnote='Unable to get album info')
|
||||
|
||||
title = self._html_search_regex(r'detailContent_title[^>]*><h1(?:[^>]+)?>([^<]+)</h1>',
|
||||
webpage, 'title', fatal=False)
|
||||
|
||||
return self.playlist_result(self._entries(webpage, playlist_id, uid), playlist_id, title)
|
||||
|
||||
def _entries(self, page, playlist_id, uid):
|
||||
html = page
|
||||
for page_num in itertools.count(1):
|
||||
for entry in self._process_page(html, uid):
|
||||
yield entry
|
||||
|
||||
next_url = self._search_regex(r'<a\s+href=(["\'])(?P<more>[\S]+)\1[^>]+rel=(["\'])next\3',
|
||||
html, 'list_next_url', default=None, group='more')
|
||||
if not next_url:
|
||||
break
|
||||
|
||||
next_full_url = self._BASE_URL_TEMPL % (self.scheme, next_url)
|
||||
html = self._download_webpage(next_full_url, playlist_id)
|
||||
|
||||
def _process_page(self, html, uid):
|
||||
find_from = html.index('album_soundlist')
|
||||
for mobj in re.finditer(self._LIST_VIDEO_RE % uid, html[find_from:]):
|
||||
yield self.url_result(self._BASE_URL_TEMPL % (self.scheme, mobj.group('url')),
|
||||
XimalayaIE.ie_key(),
|
||||
mobj.group('id'),
|
||||
mobj.group('title'))
|
@ -1596,12 +1596,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
if 'token' not in video_info:
|
||||
video_info = get_video_info
|
||||
break
|
||||
|
||||
def extract_unavailable_message():
|
||||
return self._html_search_regex(
|
||||
r'(?s)<h1[^>]+id="unavailable-message"[^>]*>(.+?)</h1>',
|
||||
video_webpage, 'unavailable message', default=None)
|
||||
|
||||
if 'token' not in video_info:
|
||||
if 'reason' in video_info:
|
||||
if 'The uploader has not made this video available in your country.' in video_info['reason']:
|
||||
@ -1610,13 +1604,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
countries = regions_allowed.split(',') if regions_allowed else None
|
||||
self.raise_geo_restricted(
|
||||
msg=video_info['reason'][0], countries=countries)
|
||||
reason = video_info['reason'][0]
|
||||
if 'Invalid parameters' in reason:
|
||||
unavailable_message = extract_unavailable_message()
|
||||
if unavailable_message:
|
||||
reason = unavailable_message
|
||||
raise ExtractorError(
|
||||
'YouTube said: %s' % reason,
|
||||
'YouTube said: %s' % video_info['reason'][0],
|
||||
expected=True, video_id=video_id)
|
||||
else:
|
||||
raise ExtractorError(
|
||||
@ -1821,7 +1810,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
'url': video_info['conn'][0],
|
||||
'player_url': player_url,
|
||||
}]
|
||||
elif not is_live and (len(video_info.get('url_encoded_fmt_stream_map', [''])[0]) >= 1 or len(video_info.get('adaptive_fmts', [''])[0]) >= 1):
|
||||
elif len(video_info.get('url_encoded_fmt_stream_map', [''])[0]) >= 1 or len(video_info.get('adaptive_fmts', [''])[0]) >= 1:
|
||||
encoded_url_map = video_info.get('url_encoded_fmt_stream_map', [''])[0] + ',' + video_info.get('adaptive_fmts', [''])[0]
|
||||
if 'rtmpe%3Dyes' in encoded_url_map:
|
||||
raise ExtractorError('rtmpe downloads are not supported, see https://github.com/rg3/youtube-dl/issues/343 for more information.', expected=True)
|
||||
@ -1944,11 +1933,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
break
|
||||
if codecs:
|
||||
dct.update(parse_codecs(codecs))
|
||||
if dct.get('acodec') == 'none' or dct.get('vcodec') == 'none':
|
||||
dct['downloader_options'] = {
|
||||
# Youtube throttles chunks >~10M
|
||||
'http_chunk_size': 10485760,
|
||||
}
|
||||
formats.append(dct)
|
||||
elif video_info.get('hlsvp'):
|
||||
manifest_url = video_info['hlsvp'][0]
|
||||
@ -1969,7 +1953,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
a_format.setdefault('http_headers', {})['Youtubedl-no-compression'] = 'True'
|
||||
formats.append(a_format)
|
||||
else:
|
||||
unavailable_message = extract_unavailable_message()
|
||||
unavailable_message = self._html_search_regex(
|
||||
r'(?s)<h1[^>]+id="unavailable-message"[^>]*>(.+?)</h1>',
|
||||
video_webpage, 'unavailable message', default=None)
|
||||
if unavailable_message:
|
||||
raise ExtractorError(unavailable_message, expected=True)
|
||||
raise ExtractorError('no conn, hlsvp or url_encoded_fmt_stream_map information found in video info')
|
||||
@ -2544,11 +2530,10 @@ class YoutubeLiveIE(YoutubeBaseInfoExtractor):
|
||||
webpage = self._download_webpage(url, channel_id, fatal=False)
|
||||
if webpage:
|
||||
page_type = self._og_search_property(
|
||||
'type', webpage, 'page type', default='')
|
||||
'type', webpage, 'page type', default=None)
|
||||
video_id = self._html_search_meta(
|
||||
'videoId', webpage, 'video id', default=None)
|
||||
if page_type.startswith('video') and video_id and re.match(
|
||||
r'^[0-9A-Za-z_-]{11}$', video_id):
|
||||
if page_type == 'video' and video_id and re.match(r'^[0-9A-Za-z_-]{11}$', video_id):
|
||||
return self.url_result(video_id, YoutubeIE.ie_key())
|
||||
return self.url_result(base_url)
|
||||
|
||||
|
@ -478,11 +478,6 @@ def parseOpts(overrideArguments=None):
|
||||
'--no-resize-buffer',
|
||||
action='store_true', dest='noresizebuffer', default=False,
|
||||
help='Do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.')
|
||||
downloader.add_option(
|
||||
'--http-chunk-size',
|
||||
dest='http_chunk_size', metavar='SIZE', default=None,
|
||||
help='Size of a chunk for chunk-based HTTP downloading (e.g. 10485760 or 10M) (default is disabled). '
|
||||
'May be useful for bypassing bandwidth throttling imposed by a webserver (experimental)')
|
||||
downloader.add_option(
|
||||
'--test',
|
||||
action='store_true', dest='test', default=False,
|
||||
|
@ -866,8 +866,8 @@ def _create_http_connection(ydl_handler, http_class, is_https, *args, **kwargs):
|
||||
# expected HTTP responses to meet HTTP/1.0 or later (see also
|
||||
# https://github.com/rg3/youtube-dl/issues/6727)
|
||||
if sys.version_info < (3, 0):
|
||||
kwargs['strict'] = True
|
||||
hc = http_class(*args, **compat_kwargs(kwargs))
|
||||
kwargs[b'strict'] = True
|
||||
hc = http_class(*args, **kwargs)
|
||||
source_address = ydl_handler._params.get('source_address')
|
||||
if source_address is not None:
|
||||
sa = (source_address, 0)
|
||||
@ -2267,7 +2267,7 @@ def js_to_json(code):
|
||||
"(?:[^"\\]*(?:\\\\|\\['"nurtbfx/\n]))*[^"\\]*"|
|
||||
'(?:[^'\\]*(?:\\\\|\\['"nurtbfx/\n]))*[^'\\]*'|
|
||||
{comment}|,(?={skip}[\]}}])|
|
||||
(?:(?<![0-9])[eE]|[a-df-zA-DF-Z_])[.a-zA-Z_0-9]*|
|
||||
[a-zA-Z_][.a-zA-Z_0-9]*|
|
||||
\b(?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:{skip}:)?|
|
||||
[0-9]+(?={skip}:)
|
||||
'''.format(comment=COMMENT_RE, skip=SKIP_RE), fix_kv, code)
|
||||
|
@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2018.02.08'
|
||||
__version__ = '2017.12.31'
|
||||
|
Reference in New Issue
Block a user