Compare commits
68 Commits
2016.09.03...2016.09.08
| Author | SHA1 | Date | |
|---|---|---|---|
| | b717837190 | | |
| | 2abad67e52 | | |
| | ad0e2b3359 | | |
| | 37720844f6 | | |
| | 6cfcb8ac36 | | |
| | 7a979da8cb | | |
| | 2fdc7b0e04 | | |
| | 010d034fca | | |
| | 02e552886f | | |
| | 25042f7372 | | |
| | 3f612f0767 | | |
| | 17bf6e71cc | | |
| | 881f35479d | | |
| | 89f257d6e5 | | |
| | e78a5428b6 | | |
| | 6656a82481 | | |
| | d7e794928d | | |
| | 9c27188988 | | |
| | b84d311d53 | | |
| | f87feb4b68 | | |
| | 2841bdcebb | | |
| | 84b91dd4e3 | | |
| | 92c9c2a88b | | |
| | 9d54b02bae | | |
| | 846d8b76a0 | | |
| | aa3f9fe695 | | |
| | 8258f4457c | | |
| | 948cd5b72d | | |
| | 155bc674c4 | | |
| | c33c962adf | | |
| | bdcc046d12 | | |
| | a493f10208 | | |
| | f3eeaacb4e | | |
| | b4d6a85d60 | | |
| | 0b36a96212 | | |
| | bc22a79694 | | |
| | 340e31ca74 | | |
| | 973dee491f | | |
| | 1f85029d82 | | |
| | 95be19d436 | | |
| | 95843da529 | | |
| | abf2c79f95 | | |
| | b49ad71ce1 | | |
| | 9127e1533d | | |
| | 78e762d23c | | |
| | 4809490108 | | |
| | 8112bfeaba | | |
| | d9606d9b6c | | |
| | 433af6ad30 | | |
| | feaa5ad787 | | |
| | 100bd86a68 | | |
| | 0def758782 | | |
| | 919cf1a62f | | |
| | b29cd56591 | | |
| | 622638512b | | |
| | 37c7490ac6 | | |
| | 091624f9da | | |
| | 7e5dc339de | | |
| | 4a69fa04e0 | | |
| | 2e99cd30c3 | | |
| | 25afc2a783 | | |
| | 9603b66012 | | |
| | 45aab4d30b | | |
| | ed2bfe93aa | | |
| | cdc783510b | | |
| | cf0efe9636 | | |
| | dedb177029 | | |
| | 7be15d4097 | | |
6
.github/ISSUE_TEMPLATE.md
vendored
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.03*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.03**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.08*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.08**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.09.03
+[debug] youtube-dl version 2016.09.08
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
2
AUTHORS
@@ -183,3 +183,5 @@ Petr Zvoníček
 Pratyush Singh
 Aleksander Nitecki
 Sebastian Blunt
+Matěj Cepl
+Xie Yanbo
53
ChangeLog
@@ -1,3 +1,55 @@
+version 2016.09.08
+
+Extractors
++ [jwplatform] Extract height from format label
++ [yahoo] Extract Brightcove Legacy Studio embeds (#9345)
+* [videomore] Fix extraction (#10592)
+* [foxgay] Fix extraction (#10480)
++ [rmcdecouverte] Add extractor for rmcdecouverte.bfmtv.com (#9709)
+* [gamestar] Fix metadata extraction (#10479)
+* [puls4] Fix extraction (#10583)
++ [cctv] Add extractor for CCTV and CNTV (#8153)
++ [lci] Add extractor for lci.fr (#10573)
++ [wat] Extract DASH formats
++ [viafree] Improve video id detection (#10569)
++ [trutv] Add extractor for trutv.com (#10519)
++ [nick] Add support for nickelodeon.nl (#10559)
++ [abcotvs:clips] Add support for clips.abcotvs.com
++ [abcotvs] Add support for ABC Owned Television Stations sites (#9551)
++ [miaopai] Add extractor for miaopai.com (#10556)
+* [gamestar] Fix metadata extraction (#10479)
++ [bilibili] Add support for episodes (#10190)
++ [tvnoe] Add extractor for tvnoe.cz (#10524)
+
+
+version 2016.09.04.1
+
+Core
+* In DASH downloader if the first segment fails, abort the whole download
+  process to prevent throttling (#10497)
++ Add support for --skip-unavailable-fragments and --fragment retries in
+  hlsnative downloader (#10165, #10448).
++ Add support for --skip-unavailable-fragments in DASH downloader
++ Introduce --skip-unavailable-fragments option for fragment based downloaders
+  that allows to skip fragments unavailable due to a HTTP error
+* Fix extraction of video/audio entries with src attribute in
+  _parse_html5_media_entries (#10540)
+
+Extractors
+* [theplatform] Relax URL regular expression (#10546)
+* [youtube:playlist] Extend URL regular expression
+* [rottentomatoes] Delegate extraction to internetvideoarchive extractor
+* [internetvideoarchive] Extract all formats
+* [pornvoisines] Fix extraction (#10469)
+* [rottentomatoes] Fix extraction (#10467)
+* [espn] Extend URL regular expression (#10549)
+* [vimple] Extend URL regular expression (#10547)
+* [youtube:watchlater] Fix extraction (#10544)
+* [youjizz] Fix extraction (#10437)
++ [foxnews] Add support for FoxNews Insider (#10445)
++ [fc2] Recognize Flash player URLs (#10512)
+
+
 version 2016.09.03

 Core
@@ -5,7 +57,6 @@ Core
   _extract_m3u8_formats (#10522)
 * Handle semicolon in mimetype2ext

-
 Extractors
 + [youtube] Add support for rental videos' previews (#10532)
 * [youtube:playlist] Fallback to video extraction for video/playlist URLs when
17
README.md
@@ -89,6 +89,8 @@ which means you can modify it, redistribute it or use it however you like.
     --mark-watched                   Mark videos watched (YouTube only)
     --no-mark-watched                Do not mark videos watched (YouTube only)
     --no-color                       Do not emit color codes in output
+    --abort-on-unavailable-fragment  Abort downloading when some fragment is not
+                                     available

 ## Network Options:
     --proxy URL                      Use the specified HTTP/HTTPS/SOCKS proxy.
@@ -173,7 +175,10 @@ which means you can modify it, redistribute it or use it however you like.
     -R, --retries RETRIES            Number of retries (default is 10), or
                                      "infinite".
     --fragment-retries RETRIES       Number of retries for a fragment (default
-                                     is 10), or "infinite" (DASH only)
+                                     is 10), or "infinite" (DASH and hlsnative
+                                     only)
+    --skip-unavailable-fragments     Skip unavailable fragments (DASH and
+                                     hlsnative only)
     --buffer-size SIZE               Size of download buffer (e.g. 1024 or 16K)
                                      (default is 1024)
     --no-resize-buffer               Do not automatically adjust the buffer
@@ -846,6 +851,16 @@ will download the complete `PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re` playlist and cre

     youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"

+### Should I add `--hls-prefer-native` into my config?
+
+When youtube-dl detects an HLS video, it can download it either with the built-in downloader or ffmpeg. Since many HLS streams are slightly invalid and ffmpeg/youtube-dl each handle some invalid cases better than the other, there is an option to switch the downloader if needed.
+
+When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg.
+
+In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](http://rg3.github.io/youtube-dl/supportedsites.html) cannot mandate one specific downloader.
+
+If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case.
+
 ### Can you add support for this anime video site, or site which shows current movies for free?

 As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion to youtube-dl.
@@ -19,9 +19,10 @@
 - **9now.com.au**
 - **abc.net.au**
 - **abc.net.au:iview**
-- **Abc7News**
 - **abcnews**
 - **abcnews:video**
+- **abcotvs**: ABC Owned Television Stations
+- **abcotvs:clips**
 - **AcademicEarth:Course**
 - **acast**
 - **acast:channel**
@@ -128,6 +129,7 @@
 - **CBSNews**: CBS News
 - **CBSNewsLiveVideo**: CBS News Live Videos
 - **CBSSports**
+- **CCTV**
 - **CDA**
 - **CeskaTelevize**
 - **channel9**: Channel 9
@@ -232,6 +234,7 @@
 - **FacebookPluginsVideo**
 - **faz.net**
 - **fc2**
+- **fc2:embed**
 - **Fczenit**
 - **features.aol.com**
 - **fernsehkritik.tv**
@@ -245,6 +248,7 @@
 - **FOX**
 - **Foxgay**
 - **FoxNews**: Fox News and Fox Business Video
+- **foxnews:insider**
 - **FoxSports**
 - **france2.fr:generation-quoi**
 - **FranceCulture**
@@ -350,6 +354,7 @@
 - **kuwo:song**: 酷我音乐
 - **la7.it**
 - **Laola1Tv**
+- **LCI**
 - **Lcp**
 - **LcpPlay**
 - **Le**: 乐视网
@@ -388,6 +393,7 @@
 - **Metacritic**
 - **Mgoon**
 - **MGTV**: 芒果TV
+- **MiaoPai**
 - **Minhateca**
 - **MinistryGrid**
 - **Minoto**
@@ -574,6 +580,7 @@
 - **revision3:embed**
 - **RICE**
 - **RingTV**
+- **RMCDecouverte**
 - **RockstarGames**
 - **RoosterTeeth**
 - **RottenTomatoes**
@@ -719,6 +726,7 @@
 - **TrailerAddict** (Currently broken)
 - **Trilulilu**
 - **trollvids**
+- **TruTV**
 - **Tube8**
 - **TubiTv**
 - **tudou**
@@ -740,6 +748,7 @@
 - **TVCArticle**
 - **tvigle**: Интернет-телевидение Tvigle.ru
 - **tvland.com**
+- **TVNoe**
 - **tvp**: Telewizja Polska
 - **tvp:embed**: Telewizja Polska
 - **tvp:series**
@@ -318,6 +318,7 @@ def _real_main(argv=None):
         'nooverwrites': opts.nooverwrites,
         'retries': opts.retries,
         'fragment_retries': opts.fragment_retries,
+        'skip_unavailable_fragments': opts.skip_unavailable_fragments,
         'buffersize': opts.buffersize,
         'noresizebuffer': opts.noresizebuffer,
         'continuedl': opts.continue_dl,
@@ -38,8 +38,10 @@ class DashSegmentsFD(FragmentFD):
         segments_filenames = []

         fragment_retries = self.params.get('fragment_retries', 0)
+        skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)

-        def append_url_to_file(target_url, tmp_filename, segment_name):
+        def process_segment(segment, tmp_filename, fatal):
+            target_url, segment_name = segment
             target_filename = '%s-%s' % (tmp_filename, segment_name)
             count = 0
             while count <= fragment_retries:
@@ -52,26 +54,35 @@ class DashSegmentsFD(FragmentFD):
                     down.close()
                     segments_filenames.append(target_sanitized)
                     break
-                except (compat_urllib_error.HTTPError, ) as err:
+                except compat_urllib_error.HTTPError as err:
                     # YouTube may often return 404 HTTP error for a fragment causing the
                     # whole download to fail. However if the same fragment is immediately
                     # retried with the same request data this usually succeeds (1-2 attemps
                     # is usually enough) thus allowing to download the whole file successfully.
-                    # So, we will retry all fragments that fail with 404 HTTP error for now.
-                    if err.code != 404:
-                        raise
-                    # Retry fragment
+                    # To be future-proof we will retry all fragments that fail with any
+                    # HTTP error.
                     count += 1
                     if count <= fragment_retries:
-                        self.report_retry_fragment(segment_name, count, fragment_retries)
+                        self.report_retry_fragment(err, segment_name, count, fragment_retries)
             if count > fragment_retries:
+                if not fatal:
+                    self.report_skip_fragment(segment_name)
+                    return True
                 self.report_error('giving up after %s fragment retries' % fragment_retries)
                 return False
+            return True

-        if initialization_url:
-            append_url_to_file(initialization_url, ctx['tmpfilename'], 'Init')
-        for i, segment_url in enumerate(segment_urls):
-            append_url_to_file(segment_url, ctx['tmpfilename'], 'Seg%d' % i)
+        segments_to_download = [(initialization_url, 'Init')] if initialization_url else []
+        segments_to_download.extend([
+            (segment_url, 'Seg%d' % i)
+            for i, segment_url in enumerate(segment_urls)])
+
+        for i, segment in enumerate(segments_to_download):
+            # In DASH, the first segment contains necessary headers to
+            # generate a valid MP4 file, so always abort for the first segment
+            fatal = i == 0 or not skip_unavailable_fragments
+            if not process_segment(segment, ctx['tmpfilename'], fatal):
+                return False

         self._finish_frag_download(ctx)

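The retry-then-skip control flow that `process_segment` introduces above can be sketched in isolation. This is a minimal sketch, not youtube-dl's code: `download_segments` and the `fetch` callable are hypothetical names, and the standard library's `urllib.error.HTTPError` stands in for `compat_urllib_error.HTTPError`.

```python
from urllib.error import HTTPError


def download_segments(fetch, segments, fragment_retries=0, skip_unavailable=True):
    """Fetch every (url, name) pair in order.

    `fetch` is a hypothetical callable that returns bytes or raises HTTPError.
    Returns the list of downloaded payloads, or None if the download is aborted.
    """
    downloaded = []
    for index, (url, _name) in enumerate(segments):
        # The first segment carries the container headers, so it is never skipped.
        fatal = index == 0 or not skip_unavailable
        count = 0
        while count <= fragment_retries:
            try:
                downloaded.append(fetch(url))
                break
            except HTTPError:
                # Any HTTP error counts as one failed attempt.
                count += 1
        if count > fragment_retries and fatal:
            return None  # give up on the whole download
        # A non-fatal exhausted segment is simply skipped.
    return downloaded
```

With `fragment_retries=1`, a segment that fails once is retried and kept, while a segment that keeps failing is dropped unless it is the first one or skipping is disabled.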
@@ -6,6 +6,7 @@ import time
 from .common import FileDownloader
 from .http import HttpFD
 from ..utils import (
+    error_to_compat_str,
     encodeFilename,
     sanitize_open,
 )
@@ -22,13 +23,19 @@ class FragmentFD(FileDownloader):

     Available options:

-    fragment_retries:   Number of times to retry a fragment for HTTP error (DASH only)
+    fragment_retries:   Number of times to retry a fragment for HTTP error (DASH
+                        and hlsnative only)
+    skip_unavailable_fragments:
+                        Skip unavailable fragments (DASH and hlsnative only)
     """

-    def report_retry_fragment(self, fragment_name, count, retries):
+    def report_retry_fragment(self, err, fragment_name, count, retries):
         self.to_screen(
-            '[download] Got server HTTP error. Retrying fragment %s (attempt %d of %s)...'
-            % (fragment_name, count, self.format_retries(retries)))
+            '[download] Got server HTTP error: %s. Retrying fragment %s (attempt %d of %s)...'
+            % (error_to_compat_str(err), fragment_name, count, self.format_retries(retries)))
+
+    def report_skip_fragment(self, fragment_name):
+        self.to_screen('[download] Skipping fragment %s...' % fragment_name)

     def _prepare_and_start_frag_download(self, ctx):
         self._prepare_frag_download(ctx)
@@ -13,6 +13,7 @@ from .fragment import FragmentFD
 from .external import FFmpegFD

 from ..compat import (
+    compat_urllib_error,
     compat_urlparse,
     compat_struct_pack,
 )
@@ -83,6 +84,10 @@ class HlsFD(FragmentFD):

         self._prepare_and_start_frag_download(ctx)

+        fragment_retries = self.params.get('fragment_retries', 0)
+        skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
+        test = self.params.get('test', False)
+
         extra_query = None
         extra_param_to_segment_url = info_dict.get('extra_param_to_segment_url')
         if extra_param_to_segment_url:
@@ -99,15 +104,37 @@ class HlsFD(FragmentFD):
                         line
                         if re.match(r'^https?://', line)
                         else compat_urlparse.urljoin(man_url, line))
-                    frag_filename = '%s-Frag%d' % (ctx['tmpfilename'], i)
+                    frag_name = 'Frag%d' % i
+                    frag_filename = '%s-%s' % (ctx['tmpfilename'], frag_name)
                     if extra_query:
                         frag_url = update_url_query(frag_url, extra_query)
-                    success = ctx['dl'].download(frag_filename, {'url': frag_url})
-                    if not success:
+                    count = 0
+                    while count <= fragment_retries:
+                        try:
+                            success = ctx['dl'].download(frag_filename, {'url': frag_url})
+                            if not success:
+                                return False
+                            down, frag_sanitized = sanitize_open(frag_filename, 'rb')
+                            frag_content = down.read()
+                            down.close()
+                            break
+                        except compat_urllib_error.HTTPError as err:
+                            # Unavailable (possibly temporary) fragments may be served.
+                            # First we try to retry then either skip or abort.
+                            # See https://github.com/rg3/youtube-dl/issues/10165,
+                            # https://github.com/rg3/youtube-dl/issues/10448).
+                            count += 1
+                            if count <= fragment_retries:
+                                self.report_retry_fragment(err, frag_name, count, fragment_retries)
+                    if count > fragment_retries:
+                        if skip_unavailable_fragments:
+                            i += 1
+                            media_sequence += 1
+                            self.report_skip_fragment(frag_name)
+                            continue
+                        self.report_error(
+                            'giving up after %s fragment retries' % fragment_retries)
                         return False
-                    down, frag_sanitized = sanitize_open(frag_filename, 'rb')
-                    frag_content = down.read()
-                    down.close()
                     if decrypt_info['METHOD'] == 'AES-128':
                         iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
                         frag_content = AES.new(
@@ -115,7 +142,7 @@ class HlsFD(FragmentFD):
                     ctx['dest_stream'].write(frag_content)
                     frags_filenames.append(frag_sanitized)
                     # We only download the first fragment during the test
-                    if self.params.get('test', False):
+                    if test:
                         break
                     i += 1
                     media_sequence += 1
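The hunk above still increments `media_sequence` when a fragment is skipped. A likely reason is that, for AES-128 streams without an explicit `IV`, the code derives the IV from that counter via `compat_struct_pack('>8xq', media_sequence)`. The sketch below uses the standard `struct.pack` as a stand-in for `compat_struct_pack`; `default_hls_iv` is a hypothetical helper name.

```python
import struct


def default_hls_iv(media_sequence):
    """Build a 16-byte IV from a media sequence number.

    '>8xq' means big-endian: 8 zero pad bytes followed by a signed
    64-bit integer, which together fill one AES block.
    """
    return struct.pack('>8xq', media_sequence)


iv = default_hls_iv(5)
print(len(iv), iv.hex())
```

If a skipped fragment did not advance the counter, every later fragment would be paired with the wrong sequence number and therefore the wrong derived IV.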
@@ -12,7 +12,7 @@ from ..compat import compat_urlparse

 class AbcNewsVideoIE(AMPIE):
     IE_NAME = 'abcnews:video'
-    _VALID_URL = 'http://abcnews.go.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'
+    _VALID_URL = r'https?://abcnews\.go\.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'

     _TESTS = [{
         'url': 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932',
@@ -49,7 +49,7 @@ class AbcNewsVideoIE(AMPIE):

 class AbcNewsIE(InfoExtractor):
     IE_NAME = 'abcnews'
-    _VALID_URL = 'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
+    _VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'

     _TESTS = [{
         'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
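The `AbcNewsVideoIE` change above does three things at once: it makes the pattern a raw string, escapes the dots in the host name, and accepts `https`. A small sketch of the practical difference follows. The old pattern text is written as a raw string here only so that it compiles without invalid-escape warnings on current Python, and the `abcnewsXgo.com` and `clip-1` URLs are made up for illustration.

```python
import re

# Pattern text before and after the change shown in the diff.
OLD_VALID_URL = r'http://abcnews.go.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'
NEW_VALID_URL = r'https?://abcnews\.go\.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'

url = 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932'
match = re.match(NEW_VALID_URL, url)
print(match.group('display_id'), match.group('id'))

# An unescaped dot matches any character, so the old pattern accepts a bogus host.
bogus = 'http://abcnewsXgo.com/ThisWeek/video/clip-1'
print(bool(re.match(OLD_VALID_URL, bogus)), bool(re.match(NEW_VALID_URL, bogus)))

# Only the new pattern accepts the https scheme.
secure = 'https://abcnews.go.com/ThisWeek/video/clip-1'
print(bool(re.match(OLD_VALID_URL, secure)), bool(re.match(NEW_VALID_URL, secure)))
```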
@@ -1,13 +1,19 @@
+# coding: utf-8
 from __future__ import unicode_literals

 import re

 from .common import InfoExtractor
-from ..utils import parse_iso8601
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+)


-class Abc7NewsIE(InfoExtractor):
-    _VALID_URL = r'https?://abc7news\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
+class ABCOTVSIE(InfoExtractor):
+    IE_NAME = 'abcotvs'
+    IE_DESC = 'ABC Owned Television Stations'
+    _VALID_URL = r'https?://(?:abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
     _TESTS = [
         {
             'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
@@ -15,7 +21,7 @@ class Abc7NewsIE(InfoExtractor):
                 'id': '472581',
                 'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
                 'ext': 'mp4',
-                'title': 'East Bay museum celebrates history of synthesized music',
+                'title': 'East Bay museum celebrates vintage synthesizers',
                 'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
                 'thumbnail': 're:^https?://.*\.jpg$',
                 'timestamp': 1421123075,
@@ -41,7 +47,7 @@ class Abc7NewsIE(InfoExtractor):
         webpage = self._download_webpage(url, display_id)

         m3u8 = self._html_search_meta(
-            'contentURL', webpage, 'm3u8 url', fatal=True)
+            'contentURL', webpage, 'm3u8 url', fatal=True).split('?')[0]

         formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
         self._sort_formats(formats)
@@ -66,3 +72,41 @@ class Abc7NewsIE(InfoExtractor):
             'uploader': uploader,
             'formats': formats,
         }
+
+
+class ABCOTVSClipsIE(InfoExtractor):
+    IE_NAME = 'abcotvs:clips'
+    _VALID_URL = r'https?://clips\.abcotvs\.com/(?:[^/]+/)*video/(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://clips.abcotvs.com/kabc/video/214814',
+        'info_dict': {
+            'id': '214814',
+            'ext': 'mp4',
+            'title': 'SpaceX launch pad explosion destroys rocket, satellite',
+            'description': 'md5:9f186e5ad8f490f65409965ee9c7be1b',
+            'upload_date': '20160901',
+            'timestamp': 1472756695,
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        video_data = self._download_json('https://clips.abcotvs.com/vogo/video/getByIds?ids=' + video_id, video_id)['results'][0]
+        title = video_data['title']
+        formats = self._extract_m3u8_formats(
+            video_data['videoURL'].split('?')[0], video_id, 'mp4')
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': video_data.get('description'),
+            'thumbnail': video_data.get('thumbnailURL'),
+            'duration': int_or_none(video_data.get('duration')),
+            'timestamp': int_or_none(video_data.get('pubDate')),
+            'formats': formats,
+        }
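Both extractors in the diff above pass the manifest URL through `.split('?')[0]` before handing it to `_extract_m3u8_formats`, which drops the query string. A tiny sketch of that step is below; `strip_query` is a hypothetical helper name and the `example.com` URLs are made up.

```python
def strip_query(url):
    """Return the URL without its query string, as `.split('?')[0]` does."""
    return url.split('?')[0]


print(strip_query('https://example.com/path/master.m3u8?token=abc&rate=2'))
print(strip_query('https://example.com/path/master.m3u8'))
```

A URL without a `?` is returned unchanged, so the call is safe to apply unconditionally.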
@@ -238,7 +238,7 @@ class ARDMediathekIE(InfoExtractor):


 class ARDIE(InfoExtractor):
-    _VALID_URL = '(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
+    _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
     _TEST = {
         'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
         'md5': 'd216c3a86493f9322545e045ddc3eb35',
@@ -10,11 +10,12 @@ from ..utils import (
     int_or_none,
     float_or_none,
     unified_timestamp,
+    urlencode_postdata,
 )


 class BiliBiliIE(InfoExtractor):
-    _VALID_URL = r'https?://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/v/)(?P<id>\d+)'

     _TESTS = [{
         'url': 'http://www.bilibili.tv/video/av1074402/',
@@ -77,6 +78,17 @@ class BiliBiliIE(InfoExtractor):
             'skip_download': True,
         },
         'expected_warnings': ['upload time'],
+    }, {
+        'url': 'http://bangumi.bilibili.com/anime/v/40068',
+        'md5': '08d539a0884f3deb7b698fb13ba69696',
+        'info_dict': {
+            'id': '40068',
+            'ext': 'mp4',
+            'duration': 1402.357,
+            'title': '混沌武士 : 第7集 四面楚歌 A Risky Racket',
+            'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
+            'thumbnail': 're:^http?://.+\.jpg',
+        },
     }]

     _APP_KEY = '6f90a59ac58a4123'
@@ -84,13 +96,19 @@ class BiliBiliIE(InfoExtractor):

     def _real_extract(self, url):
         video_id = self._match_id(url)
-
         webpage = self._download_webpage(url, video_id)

-        cid = compat_parse_qs(self._search_regex(
-            [r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
-             r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
-            webpage, 'player parameters'))['cid'][0]
+        if 'anime/v' not in url:
+            cid = compat_parse_qs(self._search_regex(
+                [r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
+                 r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
+                webpage, 'player parameters'))['cid'][0]
+        else:
+            js = self._download_json(
+                'http://bangumi.bilibili.com/web_api/get_source', video_id,
+                data=urlencode_postdata({'episode_id': video_id}),
+                headers={'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8'})
+            cid = js['result']['cid']

         payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
         sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest()
@@ -125,6 +143,7 @@ class BiliBiliIE(InfoExtractor):
         description = self._html_search_meta('description', webpage)
         timestamp = unified_timestamp(self._html_search_regex(
             r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', fatal=False))
+        thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)

         # TODO 'view_count' requires deobfuscating Javascript
         info = {
@@ -132,7 +151,7 @@ class BiliBiliIE(InfoExtractor):
             'title': title,
             'description': description,
             'timestamp': timestamp,
-            'thumbnail': self._html_search_meta('thumbnailUrl', webpage),
+            'thumbnail': thumbnail,
             'duration': float_or_none(video_info.get('timelength'), scale=1000),
         }

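Whichever branch produces `cid`, the context lines above build the same signed payload: a fixed query string followed by an MD5 digest over the payload plus a secret. The sketch below reproduces only those two lines. `sign_payload` is a hypothetical helper, and `SECRET` is a placeholder because the value of `_BILIBILI_KEY` does not appear in this diff.

```python
import hashlib

APP_KEY = '6f90a59ac58a4123'         # _APP_KEY as shown in the diff
SECRET = 'placeholder-not-real-key'  # hypothetical stand-in for _BILIBILI_KEY


def sign_payload(cid, app_key=APP_KEY, secret=SECRET):
    """Return (payload, sign) built the same way as the two lines in the diff."""
    payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (app_key, cid)
    sign = hashlib.md5((payload + secret).encode('utf-8')).hexdigest()
    return payload, sign


payload, sign = sign_payload('12345')
print(payload)
print(sign)
```

How the pair is then sent to the server is outside this hunk, so the sketch stops at computing it.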
@@ -30,7 +30,7 @@ class CartoonNetworkIE(TurnerBaseIE):
         return self._extract_cvp_info(
             'http://www.cartoonnetwork.com/video-seo-svc/episodeservices/getCvpPlaylist?networkName=CN2&' + query, video_id, {
                 'secure': {
-                    'media_src': 'http://apple-secure.cdn.turner.com/toon/big',
+                    'media_src': 'http://androidhls-secure.cdn.turner.com/toon/big',
                     'tokenizer_src': 'http://www.cartoonnetwork.com/cntv/mvpd/processors/services/token_ipadAdobe.do',
                 },
             })
53
youtube_dl/extractor/cctv.py
Normal file
@@ -0,0 +1,53 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import float_or_none
+
+
+class CCTVIE(InfoExtractor):
+    _VALID_URL = r'''(?x)https?://(?:.+?\.)?
+        (?:
+            cctv\.(?:com|cn)|
+            cntv\.cn
+        )/
+        (?:
+            video/[^/]+/(?P<id>[0-9a-f]{32})|
+            \d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
+        )'''
+    _TESTS = [{
+        'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
+        'md5': '819c7b49fc3927d529fb4cd555621823',
+        'info_dict': {
+            'id': '454368eb19ad44a1925bf1eb96140a61',
+            'ext': 'mp4',
+            'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
+        }
+    }, {
+        'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
+        'only_matching': True,
+    }, {
+        'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
+        'only_matching': True
+    }]
+
+    def _real_extract(self, url):
+        video_id, display_id = re.match(self._VALID_URL, url).groups()
+        if not video_id:
+            webpage = self._download_webpage(url, display_id)
+            video_id = self._search_regex(
+                r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
+                webpage, 'video_id')
+        api_data = self._download_json(
+            'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
+        m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
+
+        return {
+            'id': video_id,
+            'title': api_data['title'],
+            'formats': self._extract_m3u8_formats(
+                m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
+            'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
+        }
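The new extractor relies on two regular expressions: the verbose `_VALID_URL`, which yields either a 32-character hex `id` or a `display_id`, and the `re.sub` call that removes the `maxbr` parameter from the HLS URL. The sketch below runs both outside youtube-dl. `parse_ids` and `strip_maxbr` are hypothetical helper names, and the `example.com` manifest URL is made up.

```python
import re

VALID_URL = r'''(?x)https?://(?:.+?\.)?
    (?:
        cctv\.(?:com|cn)|
        cntv\.cn
    )/
    (?:
        video/[^/]+/(?P<id>[0-9a-f]{32})|
        \d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
    )'''


def parse_ids(url):
    """Return (video_id, display_id); exactly one of them is None for a match."""
    return re.match(VALID_URL, url).groups()


def strip_maxbr(hls_url):
    """Drop the maxbr=<digits> parameter, as the extractor does before parsing."""
    return re.sub(r'maxbr=\d+&?', '', hls_url)


print(parse_ids('http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml'))
print(parse_ids('http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44'))
print(strip_maxbr('http://example.com/hls/main.m3u8?maxbr=850&contentid=1'))
```

When `maxbr` is the only parameter, the substitution leaves a trailing `?` behind.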
@@ -1163,13 +1163,6 @@ class InfoExtractor(object):
                               m3u8_id=None, note=None, errnote=None,
                               fatal=True, live=False):

-        formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
-
-        format_url = lambda u: (
-            u
-            if re.match(r'^https?://', u)
-            else compat_urlparse.urljoin(m3u8_url, u))
-
         res = self._download_webpage_handle(
             m3u8_url, video_id,
             note=note or 'Downloading m3u8 information',
@@ -1180,6 +1173,13 @@ class InfoExtractor(object):
         m3u8_doc, urlh = res
         m3u8_url = urlh.geturl()

+        formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
+
+        format_url = lambda u: (
+            u
+            if re.match(r'^https?://', u)
+            else compat_urlparse.urljoin(m3u8_url, u))
+
         # We should try extracting formats only from master playlists [1], i.e.
         # playlists that describe available qualities. On the other hand media
         # playlists [2] should be returned as is since they contain just the media
@@ -1749,7 +1749,7 @@ class InfoExtractor(object):
             media_attributes = extract_attributes(media_tag)
             src = media_attributes.get('src')
             if src:
-                _, formats = _media_formats(src)
+                _, formats = _media_formats(src, media_type)
                 media_info['formats'].extend(formats)
             media_info['thumbnail'] = media_attributes.get('poster')
             if media_content:
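The first two hunks above move the `formats` and `format_url` setup to after `m3u8_url = urlh.geturl()`, so the meta format is built from the final URL and relative playlist entries are resolved against the URL the server actually answered from, not the one originally requested. The sketch below shows the effect on `format_url`, with the standard `urllib.parse.urljoin` standing in for `compat_urlparse.urljoin`. `make_format_url` and the `example` hosts are hypothetical.

```python
import re
from urllib.parse import urljoin


def make_format_url(m3u8_url):
    """Build a resolver equivalent to the `format_url` lambda in the diff."""
    def format_url(u):
        # Absolute URLs pass through; relative ones are joined to the manifest URL.
        return u if re.match(r'^https?://', u) else urljoin(m3u8_url, u)
    return format_url


requested = 'http://example.com/play/master.m3u8'
redirected = 'http://cdn.example.net/hls/v1/master.m3u8'

print(make_format_url(requested)('low/index.m3u8'))
print(make_format_url(redirected)('low/index.m3u8'))
```

With a redirect in play, only the second resolver points at the host that serves the variant playlists.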
@@ -394,7 +394,7 @@ class DailymotionUserIE(DailymotionPlaylistIE):


 class DailymotionCloudIE(DailymotionBaseInfoExtractor):
-    _VALID_URL_PREFIX = r'http://api\.dmcloud\.net/(?:player/)?embed/'
+    _VALID_URL_PREFIX = r'https?://api\.dmcloud\.net/(?:player/)?embed/'
     _VALID_URL = r'%s[^/]+/(?P<id>[^/?]+)' % _VALID_URL_PREFIX
     _VALID_EMBED_URL = r'%s[^/]+/[^\'"]+' % _VALID_URL_PREFIX

@@ -5,7 +5,7 @@ from ..utils import remove_end


 class ESPNIE(InfoExtractor):
-    _VALID_URL = r'https?://espn\.go\.com/(?:[^/]+/)*(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:espn\.go|(?:www\.)?espn)\.com/(?:[^/]+/)*(?P<id>[^/]+)'
     _TESTS = [{
         'url': 'http://espn.go.com/video/clip?id=10365079',
         'md5': '60e5d097a523e767d06479335d1bdc58',
@@ -47,6 +47,9 @@ class ESPNIE(InfoExtractor):
     }, {
         'url': 'http://espn.go.com/nba/playoffs/2015/story/_/id/12887571/john-wall-washington-wizards-no-swelling-left-hand-wrist-game-5-return',
         'only_matching': True,
+    }, {
+        'url': 'http://www.espn.com/video/clip?id=10365079',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -5,11 +5,14 @@ from .abc import (
     ABCIE,
     ABCIViewIE,
 )
-from .abc7news import Abc7NewsIE
 from .abcnews import (
     AbcNewsIE,
     AbcNewsVideoIE,
 )
+from .abcotvs import (
+    ABCOTVSIE,
+    ABCOTVSClipsIE,
+)
 from .academicearth import AcademicEarthCourseIE
 from .acast import (
     ACastIE,
@@ -143,6 +146,7 @@ from .cbsnews import (
 )
 from .cbssports import CBSSportsIE
 from .ccc import CCCIE
+from .cctv import CCTVIE
 from .cda import CDAIE
 from .ceskatelevize import CeskaTelevizeIE
 from .channel9 import Channel9IE
@@ -269,7 +273,10 @@ from .facebook import (
     FacebookPluginsVideoIE,
 )
 from .faz import FazIE
-from .fc2 import FC2IE
+from .fc2 import (
+    FC2IE,
+    FC2EmbedIE,
+)
 from .fczenit import FczenitIE
 from .firstpost import FirstpostIE
 from .firsttv import FirstTVIE
@@ -284,7 +291,10 @@ from .formula1 import Formula1IE
 from .fourtube import FourTubeIE
 from .fox import FOXIE
 from .foxgay import FoxgayIE
-from .foxnews import FoxNewsIE
+from .foxnews import (
+    FoxNewsIE,
+    FoxNewsInsiderIE,
+)
 from .foxsports import FoxSportsIE
 from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
@@ -415,6 +425,7 @@ from .kuwo import (
 )
 from .la7 import LA7IE
 from .laola1tv import Laola1TvIE
+from .lci import LCIIE
 from .lcp import (
     LcpPlayIE,
     LcpIE,
@@ -465,6 +476,7 @@ from .metacafe import MetacafeIE
 from .metacritic import MetacriticIE
 from .mgoon import MgoonIE
 from .mgtv import MGTVIE
+from .miaopai import MiaoPaiIE
 from .microsoftvirtualacademy import (
     MicrosoftVirtualAcademyIE,
     MicrosoftVirtualAcademyCourseIE,
@@ -712,6 +724,7 @@ from .revision3 import (
 )
 from .rice import RICEIE
 from .ringtv import RingTVIE
+from .rmcdecouverte import RMCDecouverteIE
 from .ro220 import Ro220IE
 from .rockstargames import RockstarGamesIE
 from .roosterteeth import RoosterTeethIE
@@ -881,6 +894,7 @@ from .toypics import ToypicsUserIE, ToypicsIE
 from .traileraddict import TrailerAddictIE
 from .trilulilu import TriluliluIE
 from .trollvids import TrollvidsIE
+from .trutv import TruTVIE
 from .tube8 import Tube8IE
 from .tubitv import TubiTvIE
 from .tudou import (
@@ -910,6 +924,7 @@ from .tvc import (
 )
 from .tvigle import TvigleIE
 from .tvland import TVLandIE
+from .tvnoe import TVNoeIE
 from .tvp import (
     TVPEmbedIE,
     TVPIE,
@@ -1,10 +1,12 @@
-#! -*- coding: utf-8 -*-
+# coding: utf-8
 from __future__ import unicode_literals

 import hashlib
+import re

 from .common import InfoExtractor
 from ..compat import (
+    compat_parse_qs,
     compat_urllib_request,
     compat_urlparse,
 )
@@ -16,7 +18,7 @@ from ..utils import (


 class FC2IE(InfoExtractor):
-    _VALID_URL = r'^https?://video\.fc2\.com/(?:[^/]+/)*content/(?P<id>[^/]+)'
+    _VALID_URL = r'^(?:https?://video\.fc2\.com/(?:[^/]+/)*content/|fc2:)(?P<id>[^/]+)'
     IE_NAME = 'fc2'
     _NETRC_MACHINE = 'fc2'
     _TESTS = [{
@@ -75,12 +77,17 @@ class FC2IE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
         self._login()
-        webpage = self._download_webpage(url, video_id)
-        self._downloader.cookiejar.clear_session_cookies()  # must clear
-        self._login()
+        webpage = None
+        if not url.startswith('fc2:'):
+            webpage = self._download_webpage(url, video_id)
+            self._downloader.cookiejar.clear_session_cookies()  # must clear
+            self._login()

-        title = self._og_search_title(webpage)
-        thumbnail = self._og_search_thumbnail(webpage)
+        title = 'FC2 video %s' % video_id
+        thumbnail = None
+        if webpage is not None:
+            title = self._og_search_title(webpage)
+            thumbnail = self._og_search_thumbnail(webpage)
         refer = url.replace('/content/', '/a/content/') if '/a/content/' not in url else url

         mimi = hashlib.md5((video_id + '_gGddgPfeaf_gzyr').encode('utf-8')).hexdigest()
@@ -113,3 +120,41 @@ class FC2IE(InfoExtractor):
             'ext': 'flv',
             'thumbnail': thumbnail,
         }
+
+
+class FC2EmbedIE(InfoExtractor):
+    _VALID_URL = r'https?://video\.fc2\.com/flv2\.swf\?(?P<query>.+)'
+    IE_NAME = 'fc2:embed'
+
+    _TEST = {
+        'url': 'http://video.fc2.com/flv2.swf?t=201404182936758512407645&i=20130316kwishtfitaknmcgd76kjd864hso93htfjcnaogz629mcgfs6rbfk0hsycma7shkf85937cbchfygd74&i=201403223kCqB3Ez&d=2625&sj=11&lang=ja&rel=1&from=11&cmt=1&tk=TlRBM09EQTNNekU9&tl=プリズン・ブレイク%20S1-01%20マイケル%20【吹替】',
+        'md5': 'b8aae5334cb691bdb1193a88a6ab5d5a',
+        'info_dict': {
+            'id': '201403223kCqB3Ez',
+            'ext': 'flv',
+            'title': 'プリズン・ブレイク S1-01 マイケル 【吹替】',
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        query = compat_parse_qs(mobj.group('query'))
+
+        video_id = query['i'][-1]
+        title = query.get('tl', ['FC2 video %s' % video_id])[0]
+
+        sj = query.get('sj', [None])[0]
+        thumbnail = None
+        if sj:
+            # See thumbnailImagePath() in ServerConst.as of flv2.swf
+            thumbnail = 'http://video%s-thumbnail.fc2.com/up/pic/%s.jpg' % (
+                sj, '/'.join((video_id[:6], video_id[6:8], video_id[-2], video_id[-1], video_id)))
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': FC2IE.ie_key(),
+            'url': 'fc2:%s' % video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+        }
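Review note: the thumbnail path built in `FC2EmbedIE._real_extract` is easier to follow with a concrete value. The following is a minimal standalone sketch of that query handling, assuming Python 3 and using `urllib.parse.parse_qs` in place of youtube-dl's `compat_parse_qs`; the function name `fc2_embed_info` and the shortened sample query string are made up for illustration.

```python
from urllib.parse import parse_qs


def fc2_embed_info(query_string):
    """Derive (video_id, title, thumbnail) from a flv2.swf query string.

    Hypothetical helper mirroring the logic of the added extractor.
    """
    query = parse_qs(query_string)
    # The embed URL may carry several 'i' parameters; the diff uses the last one.
    video_id = query['i'][-1]
    # Fall back to a generic title when no 'tl' parameter is present.
    title = query.get('tl', ['FC2 video %s' % video_id])[0]
    sj = query.get('sj', [None])[0]
    thumbnail = None
    if sj:
        # Path layout: first 6 chars / next 2 chars / second-to-last char / last char / full id
        thumbnail = 'http://video%s-thumbnail.fc2.com/up/pic/%s.jpg' % (
            sj, '/'.join((video_id[:6], video_id[6:8], video_id[-2], video_id[-1], video_id)))
    return video_id, title, thumbnail


if __name__ == '__main__':
    print(fc2_embed_info('i=ignored&i=201403223kCqB3Ez&sj=11'))
```

For the id from the test case, `201403223kCqB3Ez`, with `sj=11`, the path segments are `201403`, `22`, `E` and `z`.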
@@ -1,18 +1,24 @@
 from __future__ import unicode_literals

+import itertools
+
 from .common import InfoExtractor
+from ..utils import (
+    get_element_by_id,
+    remove_end,
+)


 class FoxgayIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml'
     _TEST = {
         'url': 'http://foxgay.com/videos/fuck-turkish-style-2582.shtml',
-        'md5': '80d72beab5d04e1655a56ad37afe6841',
+        'md5': '344558ccfea74d33b7adbce22e577f54',
         'info_dict': {
             'id': '2582',
             'ext': 'mp4',
-            'title': 'md5:6122f7ae0fc6b21ebdf59c5e083ce25a',
-            'description': 'md5:5e51dc4405f1fd315f7927daed2ce5cf',
+            'title': 'Fuck Turkish-style',
+            'description': 'md5:6ae2d9486921891efe89231ace13ffdf',
             'age_limit': 18,
             'thumbnail': 're:https?://.*\.jpg$',
         },
@@ -22,27 +28,35 @@ class FoxgayIE(InfoExtractor):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)

-        title = self._html_search_regex(
-            r'<title>(?P<title>.*?)</title>',
-            webpage, 'title', fatal=False)
-        description = self._html_search_regex(
-            r'<div class="ico_desc"><h2>(?P<description>.*?)</h2>',
-            webpage, 'description', fatal=False)
+        title = remove_end(self._html_search_regex(
+            r'<title>([^<]+)</title>', webpage, 'title'), ' - Foxgay.com')
+        description = get_element_by_id('inf_tit', webpage)

+        # The default user-agent with foxgay cookies leads to pages without videos
+        self._downloader.cookiejar.clear('.foxgay.com')
         # Find the URL for the iFrame which contains the actual video.
+        iframe_url = self._html_search_regex(
+            r'<iframe[^>]+src=([\'"])(?P<url>[^\'"]+)\1', webpage,
+            'video frame', group='url')
         iframe = self._download_webpage(
-            self._html_search_regex(r'iframe src="(?P<frame>.*?)"', webpage, 'video frame'),
-            video_id)
-        video_url = self._html_search_regex(
-            r"v_path = '(?P<vid>http://.*?)'", iframe, 'url')
-        thumb_url = self._html_search_regex(
-            r"t_path = '(?P<thumb>http://.*?)'", iframe, 'thumbnail', fatal=False)
+            iframe_url, video_id, headers={'User-Agent': 'curl/7.50.1'},
+            note='Downloading video frame')
+        video_data = self._parse_json(self._search_regex(
+            r'video_data\s*=\s*([^;]+);', iframe, 'video data'), video_id)
+
+        formats = [{
+            'url': source,
+            'height': resolution,
+        } for source, resolution in zip(
+            video_data['sources'], video_data.get('resolutions', itertools.repeat(None)))]
+
+        self._sort_formats(formats)

         return {
             'id': video_id,
             'title': title,
-            'url': video_url,
+            'formats': formats,
             'description': description,
-            'thumbnail': thumb_url,
+            'thumbnail': video_data.get('act_vid', {}).get('thumb'),
             'age_limit': 18,
         }
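Review note: the `zip(..., itertools.repeat(None))` idiom in the new Foxgay format list pairs every source with a height when `resolutions` is present and with `None` when it is missing. A minimal sketch, with a hypothetical function name and made-up sample data:

```python
import itertools


def build_formats(video_data):
    """Pair each source URL with a height, or with None when no resolutions are listed."""
    return [{
        'url': source,
        'height': resolution,
    } for source, resolution in zip(
        video_data['sources'],
        # repeat(None) is an endless iterator; zip stops once 'sources' is exhausted
        video_data.get('resolutions', itertools.repeat(None)))]


if __name__ == '__main__':
    print(build_formats({'sources': ['a.mp4', 'b.mp4'], 'resolutions': [360, 720]}))
    print(build_formats({'sources': ['a.mp4', 'b.mp4']}))
```

One property worth keeping in mind: if `resolutions` is present but shorter than `sources`, `zip` silently drops the extra sources.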
@@ -3,11 +3,12 @@ from __future__ import unicode_literals
 import re

 from .amp import AMPIE
+from .common import InfoExtractor


 class FoxNewsIE(AMPIE):
     IE_DESC = 'Fox News and Fox Business Video'
-    _VALID_URL = r'https?://(?P<host>video\.fox(?:news|business)\.com)/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)'
+    _VALID_URL = r'https?://(?P<host>video\.(?:insider\.)?fox(?:news|business)\.com)/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)'
     _TESTS = [
         {
             'url': 'http://video.foxnews.com/v/3937480/frozen-in-time/#sp=show-clips',
@@ -49,6 +50,11 @@ class FoxNewsIE(AMPIE):
             'url': 'http://video.foxbusiness.com/v/4442309889001',
             'only_matching': True,
         },
+        {
+            # From http://insider.foxnews.com/2016/08/25/univ-wisconsin-student-group-pushing-silence-certain-words
+            'url': 'http://video.insider.foxnews.com/v/video-embed.html?video_id=5099377331001&autoplay=true&share_url=http://insider.foxnews.com/2016/08/25/univ-wisconsin-student-group-pushing-silence-certain-words&share_title=Student%20Group:%20Saying%20%27Politically%20Correct,%27%20%27Trash%27%20and%20%27Lame%27%20Is%20Offensive&share=true',
+            'only_matching': True,
+        },
     ]

     def _real_extract(self, url):
@@ -58,3 +64,43 @@ class FoxNewsIE(AMPIE):
             'http://%s/v/feed/video/%s.js?template=fox' % (host, video_id))
         info['id'] = video_id
         return info
+
+
+class FoxNewsInsiderIE(InfoExtractor):
+    _VALID_URL = r'https?://insider\.foxnews\.com/([^/]+/)+(?P<id>[a-z-]+)'
+    IE_NAME = 'foxnews:insider'
+
+    _TEST = {
+        'url': 'http://insider.foxnews.com/2016/08/25/univ-wisconsin-student-group-pushing-silence-certain-words',
+        'md5': 'a10c755e582d28120c62749b4feb4c0c',
+        'info_dict': {
+            'id': '5099377331001',
+            'display_id': 'univ-wisconsin-student-group-pushing-silence-certain-words',
+            'ext': 'mp4',
+            'title': 'Student Group: Saying \'Politically Correct,\' \'Trash\' and \'Lame\' Is Offensive',
+            'description': 'Is campus censorship getting out of control?',
+            'timestamp': 1472168725,
+            'upload_date': '20160825',
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+        'add_ie': [FoxNewsIE.ie_key()],
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        embed_url = self._html_search_meta('embedUrl', webpage, 'embed URL')
+
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': FoxNewsIE.ie_key(),
+            'url': embed_url,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+        }
@@ -1,14 +1,10 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..utils import (
     int_or_none,
-    parse_duration,
-    str_to_int,
-    unified_strdate,
+    remove_end,
 )


@@ -21,8 +17,9 @@ class GameStarIE(InfoExtractor):
             'id': '76110',
             'ext': 'mp4',
             'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
-            'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den vollständigen Trailer an.',
-            'thumbnail': 'http://images.gamestar.de/images/idgwpgsgp/bdb/2494525/600x.jpg',
+            'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'timestamp': 1406542020,
             'upload_date': '20140728',
             'duration': 17
         }
@@ -32,41 +29,27 @@ class GameStarIE(InfoExtractor):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)

-        og_title = self._og_search_title(webpage)
-        title = re.sub(r'\s*- Video (bei|-) GameStar\.de$', '', og_title)
-
         url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id

-        description = self._og_search_description(webpage).strip()
-
-        thumbnail = self._proto_relative_url(
-            self._og_search_thumbnail(webpage), scheme='http:')
-
-        upload_date = unified_strdate(self._html_search_regex(
-            r'<span style="float:left;font-size:11px;">Datum: ([0-9]+\.[0-9]+\.[0-9]+)&nbsp;&nbsp;',
-            webpage, 'upload_date', fatal=False))
-
-        duration = parse_duration(self._html_search_regex(
-            r'&nbsp;&nbsp;Länge: ([0-9]+:[0-9]+)</span>', webpage, 'duration',
-            fatal=False))
-
-        view_count = str_to_int(self._html_search_regex(
-            r'&nbsp;&nbsp;Zuschauer: ([0-9\.]+)&nbsp;&nbsp;', webpage,
-            'view_count', fatal=False))
+        # TODO: there are multiple ld+json objects in the webpage,
+        # while _search_json_ld finds only the first one
+        json_ld = self._parse_json(self._search_regex(
+            r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>[^<]+VideoObject[^<]+)</script>',
+            webpage, 'JSON-LD', group='json_ld'), video_id)
+        info_dict = self._json_ld(json_ld, video_id)
+        info_dict['title'] = remove_end(info_dict['title'], ' - GameStar')

+        view_count = json_ld.get('interactionCount')
         comment_count = int_or_none(self._html_search_regex(
-            r'>Kommentieren \(([0-9]+)\)</a>', webpage, 'comment_count',
+            r'([0-9]+) Kommentare</span>', webpage, 'comment_count',
             fatal=False))

-        return {
+        info_dict.update({
             'id': video_id,
-            'title': title,
             'url': url,
             'ext': 'mp4',
-            'thumbnail': thumbnail,
-            'description': description,
-            'upload_date': upload_date,
-            'duration': duration,
             'view_count': view_count,
             'comment_count': comment_count
-        }
+        })
+
+        return info_dict
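Review note: the GameStar change selects the JSON-LD block that mentions `VideoObject` rather than the first `ld+json` script on the page. The sketch below exercises the same regular expression with only the standard library; the sample markup and the function name `find_video_json_ld` are invented for illustration and do not reflect the real page.

```python
import json
import re

# Regular expression copied from the added lines of the hunk.
JSON_LD_RE = (
    r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>'
    r'(?P<json_ld>[^<]+VideoObject[^<]+)</script>')

# Made-up page with two JSON-LD blocks; only the second describes a video.
SAMPLE_PAGE = '''
<script type="application/ld+json">{"@type": "WebSite", "name": "Example"}</script>
<script type="application/ld+json">{"@type": "VideoObject", "name": "Clip", "interactionCount": "42"}</script>
'''


def find_video_json_ld(webpage):
    """Return the parsed JSON-LD object containing 'VideoObject', or None."""
    mobj = re.search(JSON_LD_RE, webpage)
    return json.loads(mobj.group('json_ld')) if mobj else None


if __name__ == '__main__':
    print(find_video_json_ld(SAMPLE_PAGE))
```

Because `[^<]+` cannot cross a `<`, the match cannot spill from the first script block into the second, which is what lets the expression skip non-video blocks.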
@@ -19,7 +19,7 @@ from ..utils import (


 class GloboIE(InfoExtractor):
-    _VALID_URL = '(?:globo:|https?://.+?\.globo\.com/(?:[^/]+/)*(?:v/(?:[^/]+/)?|videos/))(?P<id>\d{7,})'
+    _VALID_URL = r'(?:globo:|https?://.+?\.globo\.com/(?:[^/]+/)*(?:v/(?:[^/]+/)?|videos/))(?P<id>\d{7,})'

     _API_URL_TEMPLATE = 'http://api.globovideos.com/videos/%s/playlist'
     _SECURITY_URL_TEMPLATE = 'http://security.video.globo.com/videos/%s/hash?player=flash&version=17.0.0.132&resource_id=%s'
@@ -396,7 +396,7 @@ class GloboIE(InfoExtractor):


 class GloboArticleIE(InfoExtractor):
-    _VALID_URL = 'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/]+)(?:\.html)?'
+    _VALID_URL = r'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/]+)(?:\.html)?'

     _VIDEOID_REGEXES = [
         r'\bdata-video-id=["\'](\d{7,})',
@@ -48,13 +48,23 @@ class InternetVideoArchiveIE(InfoExtractor):
             # There are multiple videos in the playlist whlie only the first one
             # matches the video played in browsers
             video_info = configuration['playlist'][0]
+            title = video_info['title']

             formats = []
             for source in video_info['sources']:
                 file_url = source['file']
                 if determine_ext(file_url) == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        file_url, video_id, ext='mp4', m3u8_id='hls'))
+                    m3u8_formats = self._extract_m3u8_formats(
+                        file_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
+                    if m3u8_formats:
+                        formats.extend(m3u8_formats)
+                        file_url = m3u8_formats[0]['url']
+                        formats.extend(self._extract_f4m_formats(
+                            file_url.replace('.m3u8', '.f4m'),
+                            video_id, f4m_id='hds', fatal=False))
+                        formats.extend(self._extract_mpd_formats(
+                            file_url.replace('.m3u8', '.mpd'),
+                            video_id, mpd_id='dash', fatal=False))
                 else:
                     a_format = {
                         'url': file_url,
@@ -70,7 +80,6 @@ class InternetVideoArchiveIE(InfoExtractor):

             self._sort_formats(formats)

-            title = video_info['title']
             description = video_info.get('description')
             thumbnail = video_info.get('image')
         else:
@@ -63,10 +63,17 @@ class JWPlatformBaseIE(InfoExtractor):
                         'ext': ext,
                     })
                 else:
+                    height = int_or_none(source.get('height'))
+                    if height is None:
+                        # Often no height is provided but there is a label in
+                        # format like 1080p.
+                        height = int_or_none(self._search_regex(
+                            r'^(\d{3,})[pP]$', source.get('label') or '',
+                            'height', default=None))
                     a_format = {
                         'url': source_url,
                         'width': int_or_none(source.get('width')),
-                        'height': int_or_none(source.get('height')),
+                        'height': height,
                         'ext': ext,
                     }
                     if source_url.startswith('rtmp'):
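Review note: the JWPlatform fallback reads a height out of labels such as `1080p` when the source has no explicit `height`. A standalone sketch of that logic follows, assuming plain dictionaries as input; `int_or_none` here is a simplified stand-in for the youtube-dl utility of the same name, and `height_from_source` is a hypothetical name.

```python
import re


def int_or_none(value):
    """Simplified stand-in for youtube-dl's int_or_none (assumption: no scaling arguments)."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def height_from_source(source):
    """Prefer an explicit height; otherwise parse a label of the form '<digits>p'."""
    height = int_or_none(source.get('height'))
    if height is None:
        # The pattern from the diff: at least three digits followed by 'p' or 'P'.
        mobj = re.search(r'^(\d{3,})[pP]$', source.get('label') or '')
        height = int_or_none(mobj.group(1)) if mobj else None
    return height


if __name__ == '__main__':
    for src in ({'height': '720'}, {'label': '1080p'}, {'label': 'HD'}, {}):
        print(src, height_from_source(src))
```

The `{3,}` quantifier means short labels like `96p` are ignored, and the anchors keep labels with extra text (for example `1080p HD`) from matching.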
@@ -5,7 +5,7 @@ from .common import InfoExtractor


 class KaraoketvIE(InfoExtractor):
-    _VALID_URL = r'http://www.karaoketv.co.il/[^/]+/(?P<id>\d+)'
+    _VALID_URL = r'https?://www\.karaoketv\.co\.il/[^/]+/(?P<id>\d+)'
     _TEST = {
         'url': 'http://www.karaoketv.co.il/%D7%A9%D7%99%D7%A8%D7%99_%D7%A7%D7%A8%D7%99%D7%95%D7%A7%D7%99/58356/%D7%90%D7%99%D7%96%D7%95%D7%9F',
         'info_dict': {
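Review note: the Karaoketv change does two things, accepting `https` and escaping the dots. The short sketch below shows why the unescaped dots mattered, using the two patterns from the hunk; the deliberately wrong host in the sample is invented.

```python
import re

# Patterns copied from the removed and added _VALID_URL lines of the hunk.
OLD_VALID_URL = r'http://www.karaoketv.co.il/[^/]+/(?P<id>\d+)'
NEW_VALID_URL = r'https?://www\.karaoketv\.co\.il/[^/]+/(?P<id>\d+)'

# In the old pattern every '.' matches any character, so this bogus host is accepted.
BOGUS_URL = 'http://wwwXkaraoketvYcoZil/songs/58356'
HTTPS_URL = 'https://www.karaoketv.co.il/songs/58356'

if __name__ == '__main__':
    print(bool(re.match(OLD_VALID_URL, BOGUS_URL)))  # old pattern is too permissive
    print(bool(re.match(NEW_VALID_URL, BOGUS_URL)))
    print(re.match(NEW_VALID_URL, HTTPS_URL).group('id'))
```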
24
youtube_dl/extractor/lci.py
Normal file
@@ -0,0 +1,24 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class LCIIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?lci\.fr/[^/]+/[\w-]+-(?P<id>\d+)\.html'
+    _TEST = {
+        'url': 'http://www.lci.fr/international/etats-unis-a-j-62-hillary-clinton-reste-sans-voix-2001679.html',
+        'md5': '2fdb2538b884d4d695f9bd2bde137e6c',
+        'info_dict': {
+            'id': '13244802',
+            'ext': 'mp4',
+            'title': 'Hillary Clinton et sa quinte de toux, en plein meeting',
+            'description': 'md5:a4363e3a960860132f8124b62f4a01c9',
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+        wat_id = self._search_regex(r'data-watid=[\'"](\d+)', webpage, 'wat id')
+        return self.url_result('wat:' + wat_id, 'Wat', wat_id)
40
youtube_dl/extractor/miaopai.py
Normal file
@@ -0,0 +1,40 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class MiaoPaiIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?miaopai\.com/show/(?P<id>[-A-Za-z0-9~_]+)'
+    _TEST = {
+        'url': 'http://www.miaopai.com/show/n~0hO7sfV1nBEw4Y29-Hqg__.htm',
+        'md5': '095ed3f1cd96b821add957bdc29f845b',
+        'info_dict': {
+            'id': 'n~0hO7sfV1nBEw4Y29-Hqg__',
+            'ext': 'mp4',
+            'title': '西游记音乐会的秒拍视频',
+            'thumbnail': 're:^https?://.*/n~0hO7sfV1nBEw4Y29-Hqg___m.jpg',
+        }
+    }
+
+    _USER_AGENT_IPAD = 'Mozilla/5.0 (iPad; CPU OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1'
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(
+            url, video_id, headers={'User-Agent': self._USER_AGENT_IPAD})
+
+        title = self._html_search_regex(
+            r'<title>([^<]+)</title>', webpage, 'title')
+        thumbnail = self._html_search_regex(
+            r'<div[^>]+class=(?P<q1>[\'"]).*\bvideo_img\b.*(?P=q1)[^>]+data-url=(?P<q2>[\'"])(?P<url>[^\'"]+)(?P=q2)',
+            webpage, 'thumbnail', fatal=False, group='url')
+        videos = self._parse_html5_media_entries(url, webpage, video_id)
+        info = videos[0]
+
+        info.update({
+            'id': video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+        })
+        return info
@@ -35,7 +35,8 @@ class MoeVideoIE(InfoExtractor):
                 'height': 360,
                 'duration': 179,
                 'filesize': 17822500,
-            }
+            },
+            'skip': 'Video has been removed',
         },
         {
             'url': 'http://playreplay.net/video/77107.7f325710a627383d40540d8e991a',
@@ -69,13 +69,16 @@ class NickIE(MTVServicesInfoExtractor):

 class NickDeIE(MTVServicesInfoExtractor):
     IE_NAME = 'nick.de'
-    _VALID_URL = r'https?://(?:www\.)?nick\.de/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:nick\.de|nickelodeon\.nl)/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
     _TESTS = [{
         'url': 'http://www.nick.de/playlist/3773-top-videos/videos/episode/17306-zu-wasser-und-zu-land-rauchende-erdnusse',
         'only_matching': True,
     }, {
         'url': 'http://www.nick.de/shows/342-icarly',
         'only_matching': True,
+    }, {
+        'url': 'http://www.nickelodeon.nl/shows/474-spongebob/videos/17403-een-kijkje-in-de-keuken-met-sandy-van-binnenuit',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -90,7 +90,7 @@ class OnetBaseIE(InfoExtractor):


 class OnetIE(OnetBaseIE):
-    _VALID_URL = 'https?://(?:www\.)?onet\.tv/[a-z]/[a-z]+/(?P<display_id>[0-9a-z-]+)/(?P<id>[0-9a-z]+)'
+    _VALID_URL = r'https?://(?:www\.)?onet\.tv/[a-z]/[a-z]+/(?P<display_id>[0-9a-z-]+)/(?P<id>[0-9a-z]+)'
     IE_NAME = 'onet.tv'

     _TEST = {
@@ -2,7 +2,6 @@
 from __future__ import unicode_literals

 import re
-import random

 from .common import InfoExtractor
 from ..utils import (
@@ -13,61 +12,69 @@ from ..utils import (


 class PornoVoisinesIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?pornovoisines\.com/showvideo/(?P<id>\d+)/(?P<display_id>[^/]+)'
-
-    _VIDEO_URL_TEMPLATE = 'http://stream%d.pornovoisines.com' \
-        '/static/media/video/transcoded/%s-640x360-1000-trscded.mp4'
-
-    _SERVER_NUMBERS = (1, 2)
+    _VALID_URL = r'https?://(?:www\.)?pornovoisines\.com/videos/show/(?P<id>\d+)/(?P<display_id>[^/.]+)'

     _TEST = {
-        'url': 'http://www.pornovoisines.com/showvideo/1285/recherche-appartement/',
-        'md5': '5ac670803bc12e9e7f9f662ce64cf1d1',
+        'url': 'http://www.pornovoisines.com/videos/show/919/recherche-appartement.html',
+        'md5': '6f8aca6a058592ab49fe701c8ba8317b',
         'info_dict': {
-            'id': '1285',
+            'id': '919',
             'display_id': 'recherche-appartement',
             'ext': 'mp4',
             'title': 'Recherche appartement',
-            'description': 'md5:819ea0b785e2a04667a1a01cdc89594e',
+            'description': 'md5:fe10cb92ae2dd3ed94bb4080d11ff493',
             'thumbnail': 're:^https?://.*\.jpg$',
             'upload_date': '20140925',
             'duration': 120,
             'view_count': int,
             'average_rating': float,
-            'categories': ['Débutantes', 'Scénario', 'Sodomie'],
+            'categories': ['Débutante', 'Débutantes', 'Scénario', 'Sodomie'],
             'age_limit': 18,
+            'subtitles': {
+                'fr': [{
+                    'ext': 'vtt',
+                }]
+            },
         }
     }

-    @classmethod
-    def build_video_url(cls, num):
-        return cls._VIDEO_URL_TEMPLATE % (random.choice(cls._SERVER_NUMBERS), num)
-
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         video_id = mobj.group('id')
         display_id = mobj.group('display_id')

+        settings_url = self._download_json(
+            'http://www.pornovoisines.com/api/video/%s/getsettingsurl/' % video_id,
+            video_id, note='Getting settings URL')['video_settings_url']
+        settings = self._download_json(settings_url, video_id)['data']
+
+        formats = []
+        for kind, data in settings['variants'].items():
+            if kind == 'HLS':
+                formats.extend(self._extract_m3u8_formats(
+                    data, video_id, ext='mp4', entry_protocol='m3u8_native', m3u8_id='hls'))
+            elif kind == 'MP4':
+                for item in data:
+                    formats.append({
+                        'url': item['url'],
+                        'height': item.get('height'),
+                        'bitrate': item.get('bitrate'),
+                    })
+        self._sort_formats(formats)
+
         webpage = self._download_webpage(url, video_id)

-        video_url = self.build_video_url(video_id)
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)

-        title = self._html_search_regex(
-            r'<h1>(.+?)</h1>', webpage, 'title', flags=re.DOTALL)
-        description = self._html_search_regex(
-            r'<article id="descriptif">(.+?)</article>',
-            webpage, 'description', fatal=False, flags=re.DOTALL)
-
-        thumbnail = self._search_regex(
-            r'<div id="mediaspace%s">\s*<img src="/?([^"]+)"' % video_id,
-            webpage, 'thumbnail', fatal=False)
-        if thumbnail:
-            thumbnail = 'http://www.pornovoisines.com/%s' % thumbnail
+        # The webpage has a bug - there's no space between "thumb" and src=
+        thumbnail = self._html_search_regex(
+            r'<img[^>]+class=([\'"])thumb\1[^>]*src=([\'"])(?P<url>[^"]+)\2',
+            webpage, 'thumbnail', fatal=False, group='url')

         upload_date = unified_strdate(self._search_regex(
-            r'Publié le ([\d-]+)', webpage, 'upload date', fatal=False))
-        duration = int_or_none(self._search_regex(
-            'Durée (\d+)', webpage, 'duration', fatal=False))
+            r'Le\s*<b>([\d/]+)', webpage, 'upload date', fatal=False))
+        duration = settings.get('main', {}).get('duration')
         view_count = int_or_none(self._search_regex(
             r'(\d+) vues', webpage, 'view count', fatal=False))
         average_rating = self._search_regex(
@@ -75,15 +82,19 @@ class PornoVoisinesIE(InfoExtractor):
         if average_rating:
             average_rating = float_or_none(average_rating.replace(',', '.'))

-        categories = self._html_search_meta(
-            'keywords', webpage, 'categories', fatal=False)
+        categories = self._html_search_regex(
+            r'(?s)Catégories\s*:\s*<b>(.+?)</b>', webpage, 'categories', fatal=False)
         if categories:
             categories = [category.strip() for category in categories.split(',')]

+        subtitles = {'fr': [{
+            'url': subtitle,
+        } for subtitle in settings.get('main', {}).get('vtt_tracks', {}).values()]}
+
         return {
             'id': video_id,
             'display_id': display_id,
-            'url': video_url,
+            'formats': formats,
             'title': title,
             'description': description,
             'thumbnail': thumbnail,
@@ -93,4 +104,5 @@ class PornoVoisinesIE(InfoExtractor):
             'average_rating': average_rating,
             'categories': categories,
             'age_limit': 18,
+            'subtitles': subtitles,
         }
@@ -15,7 +15,111 @@ from ..utils import (
 )


-class ProSiebenSat1IE(InfoExtractor):
+class ProSiebenSat1BaseIE(InfoExtractor):
+    def _extract_video_info(self, url, clip_id):
+        client_location = url
+
+        video = self._download_json(
+            'http://vas.sim-technik.de/vas/live/v2/videos',
+            clip_id, 'Downloading videos JSON', query={
+                'access_token': self._TOKEN,
+                'client_location': client_location,
+                'client_name': self._CLIENT_NAME,
+                'ids': clip_id,
+            })[0]
+
+        if video.get('is_protected') is True:
+            raise ExtractorError('This video is DRM protected.', expected=True)
+
+        duration = float_or_none(video.get('duration'))
+        source_ids = [compat_str(source['id']) for source in video['sources']]
+
+        client_id = self._SALT[:2] + sha1(''.join([clip_id, self._SALT, self._TOKEN, client_location, self._SALT, self._CLIENT_NAME]).encode('utf-8')).hexdigest()
+
+        sources = self._download_json(
+            'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources' % clip_id,
+            clip_id, 'Downloading sources JSON', query={
+                'access_token': self._TOKEN,
+                'client_id': client_id,
+                'client_location': client_location,
+                'client_name': self._CLIENT_NAME,
+            })
+        server_id = sources['server_id']
+
+        def fix_bitrate(bitrate):
+            bitrate = int_or_none(bitrate)
+            if not bitrate:
+                return None
+            return (bitrate // 1000) if bitrate % 1000 == 0 else bitrate
+
+        formats = []
+        for source_id in source_ids:
+            client_id = self._SALT[:2] + sha1(''.join([self._SALT, clip_id, self._TOKEN, server_id, client_location, source_id, self._SALT, self._CLIENT_NAME]).encode('utf-8')).hexdigest()
+            urls = self._download_json(
+                'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources/url' % clip_id,
+                clip_id, 'Downloading urls JSON', fatal=False, query={
+                    'access_token': self._TOKEN,
+                    'client_id': client_id,
+                    'client_location': client_location,
+                    'client_name': self._CLIENT_NAME,
+                    'server_id': server_id,
+                    'source_ids': source_id,
+                })
+            if not urls:
+                continue
+            if urls.get('status_code') != 0:
+                raise ExtractorError('This video is unavailable', expected=True)
+            urls_sources = urls['sources']
+            if isinstance(urls_sources, dict):
+                urls_sources = urls_sources.values()
+            for source in urls_sources:
+                source_url = source.get('url')
+                if not source_url:
+                    continue
+                protocol = source.get('protocol')
+                mimetype = source.get('mimetype')
+                if mimetype == 'application/f4m+xml' or 'f4mgenerator' in source_url or determine_ext(source_url) == 'f4m':
+                    formats.extend(self._extract_f4m_formats(
+                        source_url, clip_id, f4m_id='hds', fatal=False))
+                elif mimetype == 'application/x-mpegURL':
+                    formats.extend(self._extract_m3u8_formats(
+                        source_url, clip_id, 'mp4', 'm3u8_native',
+                        m3u8_id='hls', fatal=False))
+                else:
+                    tbr = fix_bitrate(source['bitrate'])
+                    if protocol in ('rtmp', 'rtmpe'):
+                        mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
+                        if not mobj:
+                            continue
+                        path = mobj.group('path')
+                        mp4colon_index = path.rfind('mp4:')
+                        app = path[:mp4colon_index]
+                        play_path = path[mp4colon_index:]
+                        formats.append({
+                            'url': '%s/%s' % (mobj.group('url'), app),
+                            'app': app,
+                            'play_path': play_path,
+                            'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
+                            'page_url': 'http://www.prosieben.de',
+                            'tbr': tbr,
+                            'ext': 'flv',
+                            'format_id': 'rtmp%s' % ('-%d' % tbr if tbr else ''),
+                        })
+                    else:
+                        formats.append({
+                            'url': source_url,
+                            'tbr': tbr,
+                            'format_id': 'http%s' % ('-%d' % tbr if tbr else ''),
+                        })
+        self._sort_formats(formats)
+
+        return {
+            'duration': duration,
+            'formats': formats,
+        }
+
+
+class ProSiebenSat1IE(ProSiebenSat1BaseIE):
     IE_NAME = 'prosiebensat1'
     IE_DESC = 'ProSiebenSat.1 Digital'
     _VALID_URL = r'https?://(?:www\.)?(?:(?:prosieben|prosiebenmaxx|sixx|sat1|kabeleins|the-voice-of-germany|7tv)\.(?:de|at|ch)|ran\.de|fem\.com)/(?P<id>.+)'
@@ -188,6 +292,9 @@ class ProSiebenSat1IE(InfoExtractor):
         },
     ]

+    _TOKEN = 'prosieben'
+    _SALT = '01!8d8F_)r9]4s[qeuXfP%'
+    _CLIENT_NAME = 'kolibri-2.0.19-splec4'
     _CLIPID_REGEXES = [
         r'"clip_id"\s*:\s+"(\d+)"',
         r'clipid: "(\d+)"',
@@ -234,123 +341,22 @@ class ProSiebenSat1IE(InfoExtractor):
|
||||
def _extract_clip(self, url, webpage):
|
||||
clip_id = self._html_search_regex(
|
||||
self._CLIPID_REGEXES, webpage, 'clip id')
|
||||
|
||||
access_token = 'prosieben'
|
||||
client_name = 'kolibri-2.0.19-splec4'
|
||||
client_location = url
|
||||
|
||||
video = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos',
|
||||
clip_id, 'Downloading videos JSON', query={
|
||||
'access_token': access_token,
|
||||
'client_location': client_location,
|
||||
'client_name': client_name,
|
||||
'ids': clip_id,
|
||||
})[0]
|
||||
|
||||
if video.get('is_protected') is True:
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
|
||||
duration = float_or_none(video.get('duration'))
|
||||
source_ids = [compat_str(source['id']) for source in video['sources']]
|
||||
|
||||
g = '01!8d8F_)r9]4s[qeuXfP%'
|
||||
client_id = g[:2] + sha1(''.join([clip_id, g, access_token, client_location, g, client_name]).encode('utf-8')).hexdigest()
|
||||
|
||||
sources = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources' % clip_id,
|
||||
clip_id, 'Downloading sources JSON', query={
|
||||
'access_token': access_token,
|
||||
'client_id': client_id,
|
||||
'client_location': client_location,
|
||||
'client_name': client_name,
|
||||
})
|
||||
server_id = sources['server_id']
|
||||
|
||||
title = self._html_search_regex(self._TITLE_REGEXES, webpage, 'title')
|
||||
|
||||
def fix_bitrate(bitrate):
|
||||
bitrate = int_or_none(bitrate)
|
||||
if not bitrate:
|
||||
return None
|
||||
return (bitrate // 1000) if bitrate % 1000 == 0 else bitrate
|
||||
|
||||
formats = []
|
||||
for source_id in source_ids:
|
||||
client_id = g[:2] + sha1(''.join([g, clip_id, access_token, server_id, client_location, source_id, g, client_name]).encode('utf-8')).hexdigest()
|
||||
urls = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources/url' % clip_id,
|
||||
clip_id, 'Downloading urls JSON', fatal=False, query={
|
||||
'access_token': access_token,
|
||||
'client_id': client_id,
|
||||
'client_location': client_location,
|
||||
'client_name': client_name,
|
||||
'server_id': server_id,
|
||||
'source_ids': source_id,
|
||||
})
|
||||
if not urls:
|
||||
continue
|
||||
if urls.get('status_code') != 0:
|
||||
raise ExtractorError('This video is unavailable', expected=True)
|
||||
urls_sources = urls['sources']
|
||||
if isinstance(urls_sources, dict):
|
||||
urls_sources = urls_sources.values()
|
||||
for source in urls_sources:
|
||||
source_url = source.get('url')
|
||||
if not source_url:
|
||||
continue
|
||||
protocol = source.get('protocol')
|
||||
mimetype = source.get('mimetype')
|
||||
if mimetype == 'application/f4m+xml' or 'f4mgenerator' in source_url or determine_ext(source_url) == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
source_url, clip_id, f4m_id='hds', fatal=False))
|
||||
elif mimetype == 'application/x-mpegURL':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
source_url, clip_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
else:
|
||||
tbr = fix_bitrate(source['bitrate'])
|
||||
if protocol in ('rtmp', 'rtmpe'):
|
||||
mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
|
||||
if not mobj:
|
||||
continue
|
||||
path = mobj.group('path')
|
||||
mp4colon_index = path.rfind('mp4:')
|
||||
app = path[:mp4colon_index]
|
||||
play_path = path[mp4colon_index:]
|
||||
formats.append({
|
||||
'url': '%s/%s' % (mobj.group('url'), app),
|
||||
'app': app,
|
||||
'play_path': play_path,
|
||||
'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
|
||||
'page_url': 'http://www.prosieben.de',
|
||||
'tbr': tbr,
|
||||
'ext': 'flv',
|
||||
'format_id': 'rtmp%s' % ('-%d' % tbr if tbr else ''),
|
||||
})
|
||||
else:
|
||||
formats.append({
|
||||
'url': source_url,
|
||||
'tbr': tbr,
|
||||
'format_id': 'http%s' % ('-%d' % tbr if tbr else ''),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
info = self._extract_video_info(url, clip_id)
|
||||
description = self._html_search_regex(
|
||||
self._DESCRIPTION_REGEXES, webpage, 'description', fatal=False)
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
upload_date = unified_strdate(self._html_search_regex(
|
||||
self._UPLOAD_DATE_REGEXES, webpage, 'upload date', default=None))
|
||||
|
||||
return {
|
||||
info.update({
|
||||
'id': clip_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'upload_date': upload_date,
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
}
|
||||
})
|
||||
return info
|
||||
|
||||
def _extract_playlist(self, url, webpage):
|
||||
playlist_id = self._html_search_regex(
|
||||
|
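Review note: both the removed inline code and the new `_SALT` / `_TOKEN` / `_CLIENT_NAME` class attributes feed the same `client_id` derivation, which is the first two characters of the salt followed by a SHA-1 hex digest of the concatenated request fields. Below is a minimal sketch of the first of the two derivations shown in the hunk above. The helper name `make_client_id`, the clip id, and the page URL are made up for illustration and do not appear in the diff.

```python
from hashlib import sha1


def make_client_id(salt, parts):
    # Hypothetical helper (not in the diff): salt prefix + SHA-1 hex digest
    # of the parts joined without a separator.
    return salt[:2] + sha1(''.join(parts).encode('utf-8')).hexdigest()


salt = '01!8d8F_)r9]4s[qeuXfP%'         # value of _SALT in the diff
token = 'prosieben'                      # value of _TOKEN in the diff
client_name = 'kolibri-2.0.19-splec4'    # value of _CLIENT_NAME in the diff
clip_id = '1234567'                      # made-up clip id
location = 'http://www.prosieben.de/x'   # made-up page URL

# Same field order as the "sources" request in the removed code.
client_id = make_client_id(
    salt, [clip_id, salt, token, location, salt, client_name])
print(client_id)
```

Moving the three constants to class attributes lets a subclass such as `Puls4IE` override only the values while reusing the derivation.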
@@ -1,88 +1,51 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .prosiebensat1 import ProSiebenSat1BaseIE
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
unified_strdate,
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
compat_str,
|
||||
)
|
||||
|
||||
|
||||
class Puls4IE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?puls4\.com/video/[^/]+/play/(?P<id>[0-9]+)'
|
||||
class Puls4IE(ProSiebenSat1BaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?puls4\.com/(?P<id>(?:[^/]+/)*?videos/[^?#]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.puls4.com/video/pro-und-contra/play/2716816',
|
||||
'md5': '49f6a6629747eeec43cef6a46b5df81d',
|
||||
'url': 'http://www.puls4.com/2-minuten-2-millionen/staffel-3/videos/2min2miotalk/Tobias-Homberger-von-myclubs-im-2min2miotalk-118118',
|
||||
'md5': 'fd3c6b0903ac72c9d004f04bc6bb3e03',
|
||||
'info_dict': {
|
||||
'id': '2716816',
|
||||
'ext': 'mp4',
|
||||
'title': 'Pro und Contra vom 23.02.2015',
|
||||
'description': 'md5:293e44634d9477a67122489994675db6',
|
||||
'duration': 2989,
|
||||
'upload_date': '20150224',
|
||||
'id': '118118',
|
||||
'ext': 'flv',
|
||||
'title': 'Tobias Homberger von myclubs im #2min2miotalk',
|
||||
'description': 'md5:f9def7c5e8745d6026d8885487d91955',
|
||||
'upload_date': '20160830',
|
||||
'uploader': 'PULS_4',
|
||||
},
|
||||
'skip': 'Only works from Germany',
|
||||
}, {
|
||||
'url': 'http://www.puls4.com/video/kult-spielfilme/play/1298106',
|
||||
'md5': '6a48316c8903ece8dab9b9a7bf7a59ec',
|
||||
'info_dict': {
|
||||
'id': '1298106',
|
||||
'ext': 'mp4',
|
||||
'title': 'Lucky Fritz',
|
||||
},
|
||||
'skip': 'Only works from Germany',
|
||||
}]
|
||||
_TOKEN = 'puls4'
|
||||
_SALT = '01!kaNgaiNgah1Ie4AeSha'
|
||||
_CLIENT_NAME = ''
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
error_message = self._html_search_regex(
|
||||
r'<div[^>]+class="message-error"[^>]*>(.+?)</div>',
|
||||
webpage, 'error message', default=None)
|
||||
if error_message:
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
|
||||
|
||||
real_url = self._html_search_regex(
|
||||
r'\"fsk-button\".+?href=\"([^"]+)',
|
||||
webpage, 'fsk_button', default=None)
|
||||
if real_url:
|
||||
webpage = self._download_webpage(real_url, video_id)
|
||||
|
||||
player = self._search_regex(
|
||||
r'p4_video_player(?:_iframe)?\("video_\d+_container"\s*,(.+?)\);\s*\}',
|
||||
webpage, 'player')
|
||||
|
||||
player_json = self._parse_json(
|
||||
'[%s]' % player, video_id,
|
||||
transform_source=lambda s: s.replace('undefined,', ''))
|
||||
|
||||
formats = None
|
||||
result = None
|
||||
|
||||
for v in player_json:
|
||||
if isinstance(v, list) and not formats:
|
||||
formats = [{
|
||||
'url': f['url'],
|
||||
'format': 'hd' if f.get('hd') else 'sd',
|
||||
'width': int_or_none(f.get('size_x')),
|
||||
'height': int_or_none(f.get('size_y')),
|
||||
'tbr': int_or_none(f.get('bitrate')),
|
||||
} for f in v]
|
||||
self._sort_formats(formats)
|
||||
elif isinstance(v, dict) and not result:
|
||||
result = {
|
||||
'id': video_id,
|
||||
'title': v['videopartname'].strip(),
|
||||
'description': v.get('videotitle'),
|
||||
'duration': int_or_none(v.get('videoduration') or v.get('episodeduration')),
|
||||
'upload_date': unified_strdate(v.get('clipreleasetime')),
|
||||
'uploader': v.get('channel'),
|
||||
}
|
||||
|
||||
result['formats'] = formats
|
||||
|
||||
return result
|
||||
path = self._match_id(url)
|
||||
content_path = self._download_json(
|
||||
'http://www.puls4.com/api/json-fe/page/' + path, path)['content'][0]['url']
|
||||
media = self._download_json(
|
||||
'http://www.puls4.com' + content_path,
|
||||
content_path)['mediaCurrent']
|
||||
player_content = media['playerContent']
|
||||
info = self._extract_video_info(url, player_content['id'])
|
||||
info.update({
|
||||
'id': compat_str(media['objectId']),
|
||||
'title': player_content['title'],
|
||||
'description': media.get('description'),
|
||||
'thumbnail': media.get('previewLink'),
|
||||
'upload_date': unified_strdate(media.get('date')),
|
||||
'duration': parse_duration(player_content.get('duration')),
|
||||
'episode': player_content.get('episodePartName'),
|
||||
'show': media.get('channel'),
|
||||
'season_id': player_content.get('seasonId'),
|
||||
'uploader': player_content.get('sourceCompany'),
|
||||
})
|
||||
return info
|
||||
|
39
youtube_dl/extractor/rmcdecouverte.py
Normal file
@@ -0,0 +1,39 @@
# encoding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from .brightcove import BrightcoveLegacyIE
from ..compat import (
    compat_parse_qs,
    compat_urlparse,
)


class RMCDecouverteIE(InfoExtractor):
    _VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/mediaplayer-replay.*?\bid=(?P<id>\d+)'

    _TEST = {
        'url': 'http://rmcdecouverte.bfmtv.com/mediaplayer-replay/?id=1430&title=LES%20HEROS%20DU%2088e%20ETAGE',
        'info_dict': {
            'id': '5111223049001',
            'ext': 'mp4',
            'title': ': LES HEROS DU 88e ETAGE',
            'description': 'Découvrez comment la bravoure de deux hommes dans la Tour Nord du World Trade Center a sauvé la vie d\'innombrables personnes le 11 septembre 2001.',
            'uploader_id': '1969646226001',
            'upload_date': '20160904',
            'timestamp': 1472951103,
        },
        'params': {
            # rtmp download
            'skip_download': True,
        },
        'skip': 'Only works from France',
    }
    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1969646226001/default_default/index.html?videoId=%s'

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
        brightcove_id = compat_parse_qs(compat_urlparse.urlparse(brightcove_legacy_url).query)['@videoPlayer'][0]
        return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
@@ -1,7 +1,6 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urlparse
|
||||
from .internetvideoarchive import InternetVideoArchiveIE
|
||||
|
||||
|
||||
@@ -11,21 +10,23 @@ class RottenTomatoesIE(InfoExtractor):
|
||||
_TEST = {
|
||||
'url': 'http://www.rottentomatoes.com/m/toy_story_3/trailers/11028566/',
|
||||
'info_dict': {
|
||||
'id': '613340',
|
||||
'id': '11028566',
|
||||
'ext': 'mp4',
|
||||
'title': 'Toy Story 3',
|
||||
'description': 'From the creators of the beloved TOY STORY films, comes a story that will reunite the gang in a whole new way.',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
og_video = self._og_search_video_url(webpage)
|
||||
query = compat_urlparse.urlparse(og_video).query
|
||||
iva_id = self._search_regex(r'publishedid=(\d+)', webpage, 'internet video archive id')
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': InternetVideoArchiveIE._build_xml_url(query),
|
||||
'url': 'http://video.internetvideoarchive.net/player/6/configuration.ashx?domain=www.videodetective.com&customerid=69249&playerid=641&publishedid=' + iva_id,
|
||||
'ie_key': InternetVideoArchiveIE.ie_key(),
|
||||
'id': video_id,
|
||||
'title': self._og_search_title(webpage),
|
||||
}
|
||||
|
@@ -88,7 +88,7 @@ class RutubeIE(InfoExtractor):
|
||||
class RutubeEmbedIE(InfoExtractor):
|
||||
IE_NAME = 'rutube:embed'
|
||||
IE_DESC = 'Rutube embedded videos'
|
||||
_VALID_URL = 'https?://rutube\.ru/(?:video|play)/embed/(?P<id>[0-9]+)'
|
||||
_VALID_URL = r'https?://rutube\.ru/(?:video|play)/embed/(?P<id>[0-9]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://rutube.ru/video/embed/6722881?vk_puid37=&vk_puid38=',
|
||||
|
@@ -103,7 +103,7 @@ class SpiegelIE(InfoExtractor):
|
||||
|
||||
|
||||
class SpiegelArticleIE(InfoExtractor):
|
||||
_VALID_URL = 'https?://www\.spiegel\.de/(?!video/)[^?#]*?-(?P<id>[0-9]+)\.html'
|
||||
_VALID_URL = r'https?://www\.spiegel\.de/(?!video/)[^?#]*?-(?P<id>[0-9]+)\.html'
|
||||
IE_NAME = 'Spiegel:Article'
|
||||
IE_DESC = 'Articles on spiegel.de'
|
||||
_TESTS = [{
|
||||
|
@@ -53,7 +53,7 @@ class TBSIE(TurnerBaseIE):
|
||||
'media_src': 'http://ht.cdn.turner.com/%s/big' % site,
|
||||
},
|
||||
'secure': {
|
||||
'media_src': 'http://apple-secure.cdn.turner.com/%s/big' % site,
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/%s/big' % site,
|
||||
'tokenizer_src': 'http://www.%s.com/video/processors/services/token_ipadAdobe.do' % domain,
|
||||
},
|
||||
})
|
||||
|
@@ -96,7 +96,7 @@ class ThePlatformBaseIE(OnceIE):
|
||||
class ThePlatformIE(ThePlatformBaseIE, AdobePassIE):
|
||||
_VALID_URL = r'''(?x)
|
||||
(?:https?://(?:link|player)\.theplatform\.com/[sp]/(?P<provider_id>[^/]+)/
|
||||
(?:(?:(?:[^/]+/)+select/)?(?P<media>media/(?:guid/\d+/)?)|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
|
||||
(?:(?:(?:[^/]+/)+select/)?(?P<media>media/(?:guid/\d+/)?)?|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
|
||||
|theplatform:)(?P<id>[^/\?&]+)'''
|
||||
|
||||
_TESTS = [{
|
||||
@@ -116,6 +116,7 @@ class ThePlatformIE(ThePlatformBaseIE, AdobePassIE):
|
||||
# rtmp download
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': '404 Not Found',
|
||||
}, {
|
||||
# from http://www.cnet.com/videos/tesla-model-s-a-second-step-towards-a-cleaner-motoring-future/
|
||||
'url': 'http://link.theplatform.com/s/kYEXFC/22d_qsQ6MIRT',
|
||||
|
@@ -1,10 +1,14 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .brightcove import BrightcoveLegacyIE
|
||||
from ..compat import compat_parse_qs
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
|
||||
|
||||
class TlcDeIE(InfoExtractor):
|
||||
@@ -35,5 +39,5 @@ class TlcDeIE(InfoExtractor):
|
||||
title = mobj.group('title')
|
||||
webpage = self._download_webpage(url, title)
|
||||
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
brightcove_id = compat_parse_qs(brightcove_legacy_url)['@videoPlayer'][0]
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(brightcove_legacy_url).query)['@videoPlayer'][0]
|
||||
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
||||
|
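Review note: the one-line change above matters because a query-string parser applied to a full URL treats everything before the first `=` as the first key, so `@videoPlayer` can be missed when it is the first parameter. The sketch below uses Python 3's `urllib.parse` directly instead of the `compat_parse_qs` / `compat_urlparse` wrappers from the diff (assumed here to behave the same way), and the URL is made up.

```python
from urllib.parse import parse_qs, urlparse

# Made-up legacy embed URL with @videoPlayer as the first query parameter.
url = 'http://example.invalid/viewer?%40videoPlayer=5111223049001&playerID=1'

# Old behaviour: the whole URL is parsed as if it were a query string,
# so the first key becomes 'http://example.invalid/viewer?@videoPlayer'.
naive = parse_qs(url)

# New behaviour: only the query component is parsed.
fixed = parse_qs(urlparse(url).query)

print(sorted(naive))
print(fixed['@videoPlayer'][0])
```

With the old form, the lookup only works when `@videoPlayer` happens not to be the first parameter.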
35
youtube_dl/extractor/trutv.py
Normal file
@@ -0,0 +1,35 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .turner import TurnerBaseIE


class TruTVIE(TurnerBaseIE):
    _VALID_URL = r'https?://(?:www\.)?trutv\.com(?:(?P<path>/shows/[^/]+/videos/[^/?#]+?)\.html|/full-episodes/[^/]+/(?P<id>\d+))'
    _TEST = {
        'url': 'http://www.trutv.com/shows/10-things/videos/you-wont-believe-these-sports-bets.html',
        'md5': '2cdc844f317579fed1a7251b087ff417',
        'info_dict': {
            'id': '/shows/10-things/videos/you-wont-believe-these-sports-bets',
            'ext': 'mp4',
            'title': 'You Won\'t Believe These Sports Bets',
            'description': 'Jamie Lee sits down with a bookie to discuss the bizarre world of illegal sports betting.',
            'upload_date': '20130305',
        }
    }

    def _real_extract(self, url):
        path, video_id = re.match(self._VALID_URL, url).groups()
        if path:
            data_src = 'http://www.trutv.com/video/cvp/v2/xml/content.xml?id=%s.xml' % path
        else:
            data_src = 'http://www.trutv.com/tveverywhere/services/cvpXML.do?titleId=' + video_id
        return self._extract_cvp_info(
            data_src, path, {
                'secure': {
                    'media_src': 'http://androidhls-secure.cdn.turner.com/trutv/big',
                    'tokenizer_src': 'http://www.trutv.com/tveverywhere/processors/services/token_ipadAdobe.do',
                },
            })
@@ -12,7 +12,7 @@ from ..utils import (
|
||||
parse_duration,
|
||||
xpath_attr,
|
||||
update_url_query,
|
||||
compat_urlparse,
|
||||
ExtractorError,
|
||||
)
|
||||
|
||||
|
||||
@@ -24,6 +24,7 @@ class TurnerBaseIE(InfoExtractor):
|
||||
video_data = self._download_xml(data_src, video_id)
|
||||
video_id = video_data.attrib['id']
|
||||
title = xpath_text(video_data, 'headline', fatal=True)
|
||||
content_id = xpath_text(video_data, 'contentId') or video_id
|
||||
# rtmp_src = xpath_text(video_data, 'akamai/src')
|
||||
# if rtmp_src:
|
||||
# splited_rtmp_src = rtmp_src.split(',')
|
||||
@@ -54,7 +55,7 @@ class TurnerBaseIE(InfoExtractor):
|
||||
# auth = self._download_webpage(
|
||||
# protected_path_data['tokenizer_src'], query={
|
||||
# 'path': protected_path,
|
||||
# 'videoId': video_id,
|
||||
# 'videoId': content_id,
|
||||
# 'aifp': aifp,
|
||||
# })
|
||||
# token = xpath_text(auth, 'token')
|
||||
@@ -72,8 +73,11 @@ class TurnerBaseIE(InfoExtractor):
|
||||
auth = self._download_xml(
|
||||
secure_path_data['tokenizer_src'], video_id, query={
|
||||
'path': secure_path,
|
||||
'videoId': video_id,
|
||||
'videoId': content_id,
|
||||
})
|
||||
error_msg = xpath_text(auth, 'error/msg')
|
||||
if error_msg:
|
||||
raise ExtractorError(error_msg, expected=True)
|
||||
token = xpath_text(auth, 'token')
|
||||
if not token:
|
||||
continue
|
||||
@@ -93,19 +97,9 @@ class TurnerBaseIE(InfoExtractor):
|
||||
formats.extend(self._extract_smil_formats(
|
||||
video_url, video_id, fatal=False))
|
||||
elif ext == 'm3u8':
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
video_url, video_id, 'mp4', m3u8_id=format_id or 'hls',
|
||||
fatal=False)
|
||||
if m3u8_formats:
|
||||
# Sometimes final URLs inside m3u8 are unsigned, let's fix this
|
||||
# ourselves
|
||||
qs = compat_urlparse.urlparse(video_url).query
|
||||
if qs:
|
||||
query = compat_urlparse.parse_qs(qs)
|
||||
for m3u8_format in m3u8_formats:
|
||||
m3u8_format['url'] = update_url_query(m3u8_format['url'], query)
|
||||
m3u8_format['extra_param_to_segment_url'] = qs
|
||||
formats.extend(m3u8_formats)
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
video_url, video_id, 'mp4',
|
||||
m3u8_id=format_id or 'hls', fatal=False))
|
||||
elif ext == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
update_url_query(video_url, {'hdcore': '3.7.0'}),
|
||||
|
49
youtube_dl/extractor/tvnoe.py
Normal file
@@ -0,0 +1,49 @@
# coding: utf-8
from __future__ import unicode_literals

from .jwplatform import JWPlatformBaseIE
from ..utils import (
    clean_html,
    get_element_by_class,
    js_to_json,
)


class TVNoeIE(JWPlatformBaseIE):
    _VALID_URL = r'https?://(?:www\.)?tvnoe\.cz/video/(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://www.tvnoe.cz/video/10362',
        'md5': 'aee983f279aab96ec45ab6e2abb3c2ca',
        'info_dict': {
            'id': '10362',
            'ext': 'mp4',
            'series': 'Noční univerzita',
            'title': 'prof. Tomáš Halík, Th.D. - Návrat náboženství a střet civilizací',
            'description': 'md5:f337bae384e1a531a52c55ebc50fff41',
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        iframe_url = self._search_regex(
            r'<iframe[^>]+src="([^"]+)"', webpage, 'iframe URL')

        ifs_page = self._download_webpage(iframe_url, video_id)
        jwplayer_data = self._parse_json(
            self._find_jwplayer_data(ifs_page),
            video_id, transform_source=js_to_json)
        info_dict = self._parse_jwplayer_data(
            jwplayer_data, video_id, require_title=False, base_url=iframe_url)

        info_dict.update({
            'id': video_id,
            'title': clean_html(get_element_by_class(
                'field-name-field-podnazev', webpage)),
            'description': clean_html(get_element_by_class(
                'field-name-body', webpage)),
            'series': clean_html(get_element_by_class('title', webpage))
        })

        return info_dict
@@ -348,6 +348,25 @@ class ViafreeIE(InfoExtractor):
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': [TVPlayIE.ie_key()],
|
||||
}, {
|
||||
# with relatedClips
|
||||
'url': 'http://www.viafree.se/program/reality/sommaren-med-youtube-stjarnorna/sasong-1/avsnitt-1',
|
||||
'info_dict': {
|
||||
'id': '758770',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sommaren med YouTube-stjärnorna S01E01',
|
||||
'description': 'md5:2bc69dce2c4bb48391e858539bbb0e3f',
|
||||
'series': 'Sommaren med YouTube-stjärnorna',
|
||||
'season': 'Säsong 1',
|
||||
'season_number': 1,
|
||||
'duration': 1326,
|
||||
'timestamp': 1470905572,
|
||||
'upload_date': '20160811',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': [TVPlayIE.ie_key()],
|
||||
}, {
|
||||
'url': 'http://www.viafree.no/programmer/underholdning/det-beste-vorspielet/sesong-2/episode-1',
|
||||
'only_matching': True,
|
||||
@@ -365,8 +384,17 @@ class ViafreeIE(InfoExtractor):
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
video_id = self._search_regex(
|
||||
r'currentVideo["\']\s*:\s*.+?["\']id["\']\s*:\s*["\'](?P<id>\d{6,})',
|
||||
webpage, 'video id')
|
||||
video_id = None
|
||||
|
||||
thumbnail = self._og_search_thumbnail(webpage, default=None)
|
||||
if thumbnail:
|
||||
video_id = self._search_regex(
|
||||
r'https?://[^/]+/imagecache/(?:[^/]+/)+seasons/\d+/(\d{6,})/',
|
||||
thumbnail, 'video id', default=None)
|
||||
|
||||
if not video_id:
|
||||
video_id = self._search_regex(
|
||||
r'currentVideo["\']\s*:\s*.+?["\']id["\']\s*:\s*["\'](\d{6,})',
|
||||
webpage, 'video id')
|
||||
|
||||
return self.url_result('mtg:%s' % video_id, TVPlayIE.ie_key())
|
||||
|
@@ -342,7 +342,7 @@ class TwitterIE(InfoExtractor):
|
||||
|
||||
class TwitterAmplifyIE(TwitterBaseIE):
|
||||
IE_NAME = 'twitter:amplify'
|
||||
_VALID_URL = 'https?://amp\.twimg\.com/v/(?P<id>[0-9a-f\-]{36})'
|
||||
_VALID_URL = r'https?://amp\.twimg\.com/v/(?P<id>[0-9a-f\-]{36})'
|
||||
|
||||
_TEST = {
|
||||
'url': 'https://amp.twimg.com/v/0ba0c3c7-0af3-4c0a-bed5-7efd1ffa2951',
|
||||
|
@@ -6,8 +6,7 @@ import re
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_age_limit,
|
||||
parse_iso8601,
|
||||
xpath_element,
|
||||
xpath_text,
|
||||
)
|
||||
|
||||
@@ -17,38 +16,32 @@ class VideomoreIE(InfoExtractor):
|
||||
_VALID_URL = r'videomore:(?P<sid>\d+)$|https?://videomore\.ru/(?:(?:embed|[^/]+/[^/]+)/|[^/]+\?.*\btrack_id=)(?P<id>\d+)(?:[/?#&]|\.(?:xml|json)|$)'
|
||||
_TESTS = [{
|
||||
'url': 'http://videomore.ru/kino_v_detalayah/5_sezon/367617',
|
||||
'md5': '70875fbf57a1cd004709920381587185',
|
||||
'md5': '44455a346edc0d509ac5b5a5b531dc35',
|
||||
'info_dict': {
|
||||
'id': '367617',
|
||||
'ext': 'flv',
|
||||
'title': 'В гостях Алексей Чумаков и Юлия Ковальчук',
|
||||
'description': 'В гостях – лучшие романтические комедии года, «Выживший» Иньярриту и «Стив Джобс» Дэнни Бойла.',
|
||||
'title': 'Кино в деталях 5 сезон В гостях Алексей Чумаков и Юлия Ковальчук',
|
||||
'series': 'Кино в деталях',
|
||||
'episode': 'В гостях Алексей Чумаков и Юлия Ковальчук',
|
||||
'episode_number': None,
|
||||
'season': 'Сезон 2015',
|
||||
'season_number': 5,
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 2910,
|
||||
'age_limit': 16,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'age_limit': 16,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://videomore.ru/embed/259974',
|
||||
'info_dict': {
|
||||
'id': '259974',
|
||||
'ext': 'flv',
|
||||
'title': '80 серия',
|
||||
'description': '«Медведей» ждет решающий матч. Макеев выясняет отношения со Стрельцовым. Парни узнают подробности прошлого Макеева.',
|
||||
'title': 'Молодежка 2 сезон 40 серия',
|
||||
'series': 'Молодежка',
|
||||
'episode': '80 серия',
|
||||
'episode_number': 40,
|
||||
'season': '2 сезон',
|
||||
'season_number': 2,
|
||||
'episode': '40 серия',
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 2809,
|
||||
'age_limit': 16,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'age_limit': 16,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@@ -58,13 +51,8 @@ class VideomoreIE(InfoExtractor):
|
||||
'info_dict': {
|
||||
'id': '341073',
|
||||
'ext': 'flv',
|
||||
'title': 'Команда проиграла из-за Бакина?',
|
||||
'description': 'Молодежка 3 сезон скоро',
|
||||
'series': 'Молодежка',
|
||||
'title': 'Промо Команда проиграла из-за Бакина?',
|
||||
'episode': 'Команда проиграла из-за Бакина?',
|
||||
'episode_number': None,
|
||||
'season': 'Промо',
|
||||
'season_number': 99,
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 29,
|
||||
'age_limit': 16,
|
||||
@@ -109,43 +97,33 @@ class VideomoreIE(InfoExtractor):
|
||||
'http://videomore.ru/video/tracks/%s.xml' % video_id,
|
||||
video_id, 'Downloading video XML')
|
||||
|
||||
video_url = xpath_text(video, './/video_url', 'video url', fatal=True)
|
||||
item = xpath_element(video, './/playlist/item', fatal=True)
|
||||
|
||||
title = xpath_text(
|
||||
item, ('./title', './episode_name'), 'title', fatal=True)
|
||||
|
||||
video_url = xpath_text(item, './video_url', 'video url', fatal=True)
|
||||
formats = self._extract_f4m_formats(video_url, video_id, f4m_id='hds')
|
||||
self._sort_formats(formats)
|
||||
|
||||
data = self._download_json(
|
||||
'http://videomore.ru/video/tracks/%s.json' % video_id,
|
||||
video_id, 'Downloading video JSON')
|
||||
thumbnail = xpath_text(item, './thumbnail_url')
|
||||
duration = int_or_none(xpath_text(item, './duration'))
|
||||
view_count = int_or_none(xpath_text(item, './views'))
|
||||
comment_count = int_or_none(xpath_text(item, './count_comments'))
|
||||
age_limit = int_or_none(xpath_text(item, './min_age'))
|
||||
|
||||
title = data.get('title') or data['project_title']
|
||||
description = data.get('description') or data.get('description_raw')
|
||||
timestamp = parse_iso8601(data.get('published_at'))
|
||||
duration = int_or_none(data.get('duration'))
|
||||
view_count = int_or_none(data.get('views'))
|
||||
age_limit = parse_age_limit(data.get('min_age'))
|
||||
thumbnails = [{
|
||||
'url': thumbnail,
|
||||
} for thumbnail in data.get('big_thumbnail_urls', [])]
|
||||
|
||||
series = data.get('project_title')
|
||||
episode = data.get('title')
|
||||
episode_number = int_or_none(data.get('episode_of_season') or None)
|
||||
season = data.get('season_title')
|
||||
season_number = int_or_none(data.get('season_pos') or None)
|
||||
series = xpath_text(item, './project_name')
|
||||
episode = xpath_text(item, './episode_name')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'series': series,
|
||||
'episode': episode,
|
||||
'episode_number': episode_number,
|
||||
'season': season,
|
||||
'season_number': season_number,
|
||||
'thumbnails': thumbnails,
|
||||
'timestamp': timestamp,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'view_count': view_count,
|
||||
'comment_count': comment_count,
|
||||
'age_limit': age_limit,
|
||||
'formats': formats,
|
||||
}
|
||||
|
@@ -28,23 +28,24 @@ class SprutoBaseIE(InfoExtractor):
|
||||
|
||||
class VimpleIE(SprutoBaseIE):
|
||||
IE_DESC = 'Vimple - one-click video hosting'
|
||||
_VALID_URL = r'https?://(?:player\.vimple\.ru/iframe|vimple\.ru)/(?P<id>[\da-f-]{32,36})'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://vimple.ru/c0f6b1687dcd4000a97ebe70068039cf',
|
||||
'md5': '2e750a330ed211d3fd41821c6ad9a279',
|
||||
'info_dict': {
|
||||
'id': 'c0f6b168-7dcd-4000-a97e-be70068039cf',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sunset',
|
||||
'duration': 20,
|
||||
'thumbnail': 're:https?://.*?\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://player.vimple.ru/iframe/52e1beec-1314-4a83-aeac-c61562eadbf9',
|
||||
'only_matching': True,
|
||||
}
|
||||
]
|
||||
_VALID_URL = r'https?://(?:player\.vimple\.(?:ru|co)/iframe|vimple\.(?:ru|co))/(?P<id>[\da-f-]{32,36})'
|
||||
_TESTS = [{
|
||||
'url': 'http://vimple.ru/c0f6b1687dcd4000a97ebe70068039cf',
|
||||
'md5': '2e750a330ed211d3fd41821c6ad9a279',
|
||||
'info_dict': {
|
||||
'id': 'c0f6b168-7dcd-4000-a97e-be70068039cf',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sunset',
|
||||
'duration': 20,
|
||||
'thumbnail': 're:https?://.*?\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://player.vimple.ru/iframe/52e1beec-1314-4a83-aeac-c61562eadbf9',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://vimple.co/04506a053f124483b8fb05ed73899f19',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
@@ -86,38 +86,50 @@ class WatIE(InfoExtractor):
 
         def extract_url(path_template, url_type):
             req_url = 'http://www.wat.tv/get/%s' % (path_template % video_id)
-            head = self._request_webpage(HEADRequest(req_url), video_id, 'Extracting %s url' % url_type)
-            red_url = head.geturl()
-            if req_url == red_url:
-                raise ExtractorError(
-                    '%s said: Sorry, this video is not available from your country.' % self.IE_NAME,
-                    expected=True)
-            return red_url
+            head = self._request_webpage(HEADRequest(req_url), video_id, 'Extracting %s url' % url_type, fatal=False)
+            if head:
+                red_url = head.geturl()
+                if req_url != red_url:
+                    return red_url
+            return None
 
+        def remove_bitrate_limit(manifest_url):
+            return re.sub(r'(?:max|min)_bitrate=\d+&?', '', manifest_url)
+
         formats = []
         try:
-            http_url = extract_url('android5/%s.mp4', 'http')
-            m3u8_url = extract_url('ipad/%s.m3u8', 'm3u8')
-            m3u8_formats = self._extract_m3u8_formats(
-                m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
-            formats.extend(m3u8_formats)
-            formats.extend(self._extract_f4m_formats(
-                m3u8_url.replace('ios.', 'web.').replace('.m3u8', '.f4m'),
-                video_id, f4m_id='hds', fatal=False))
-            for m3u8_format in m3u8_formats:
-                vbr, abr = m3u8_format.get('vbr'), m3u8_format.get('abr')
-                if not vbr or not abr:
-                    continue
-                format_id = m3u8_format['format_id'].replace('hls', 'http')
-                fmt_url = re.sub(r'%s-\d+00-\d+' % video_id, '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)), http_url)
-                if self._is_valid_url(fmt_url, video_id, format_id):
-                    f = m3u8_format.copy()
-                    f.update({
-                        'url': fmt_url,
-                        'format_id': format_id,
-                        'protocol': 'http',
-                    })
-                    formats.append(f)
+            manifest_urls = self._download_json(
+                'http://www.wat.tv/get/webhtml/' + video_id, video_id)
+            m3u8_url = manifest_urls.get('hls')
+            if m3u8_url:
+                m3u8_url = remove_bitrate_limit(m3u8_url)
+                m3u8_formats = self._extract_m3u8_formats(
+                    m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
+                if m3u8_formats:
+                    formats.extend(m3u8_formats)
+                    formats.extend(self._extract_f4m_formats(
+                        m3u8_url.replace('ios', 'web').replace('.m3u8', '.f4m'),
+                        video_id, f4m_id='hds', fatal=False))
+                    http_url = extract_url('android5/%s.mp4', 'http')
+                    if http_url:
+                        for m3u8_format in m3u8_formats:
+                            vbr, abr = m3u8_format.get('vbr'), m3u8_format.get('abr')
+                            if not vbr or not abr:
+                                continue
+                            format_id = m3u8_format['format_id'].replace('hls', 'http')
+                            fmt_url = re.sub(r'%s-\d+00-\d+' % video_id, '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)), http_url)
+                            if self._is_valid_url(fmt_url, video_id, format_id):
+                                f = m3u8_format.copy()
+                                f.update({
+                                    'url': fmt_url,
+                                    'format_id': format_id,
+                                    'protocol': 'http',
+                                })
+                                formats.append(f)
+            mpd_url = manifest_urls.get('mpd')
+            if mpd_url:
+                formats.extend(self._extract_mpd_formats(remove_bitrate_limit(
+                    mpd_url), video_id, mpd_id='dash', fatal=False))
             self._sort_formats(formats)
         except ExtractorError:
             abr = 64
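The two string rewrites in the WatIE hunk can be tried in isolation. The following is a minimal sketch, not the extractor itself: `remove_bitrate_limit` copies the regex from the diff, while `build_http_url` is a hypothetical wrapper around the `fmt_url` expression, and every URL and bitrate value is made up for illustration.

```python
import re


def remove_bitrate_limit(manifest_url):
    # Same pattern as in the diff: drop max_bitrate=/min_bitrate= query
    # parameters, plus a directly following '&' if there is one.
    return re.sub(r'(?:max|min)_bitrate=\d+&?', '', manifest_url)


def build_http_url(http_url, video_id, vbr, abr):
    # Hypothetical helper wrapping the fmt_url expression from the diff:
    # replace the "<id>-<vbr>00-<abr>" segment of the progressive URL with
    # the bitrates reported for one HLS variant.
    return re.sub(
        r'%s-\d+00-\d+' % video_id,
        '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)),
        http_url)


# Illustrative inputs only; real wat.tv URLs may look different.
manifest = ('http://example.com/ios/12345.m3u8'
            '?min_bitrate=100&max_bitrate=1500&token=abc')
print(remove_bitrate_limit(manifest))

progressive = 'http://example.com/12345-600-64.mp4'
print(build_http_url(progressive, '12345', vbr=1480, abr=128))
```

Note that when a bitrate parameter is the last one in the query string, the pattern leaves the preceding `&` in place.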
@@ -19,7 +19,10 @@ from ..utils import (
     determine_ext,
 )
 
-from .brightcove import BrightcoveNewIE
+from .brightcove import (
+    BrightcoveLegacyIE,
+    BrightcoveNewIE,
+)
 from .nbc import NBCSportsVPlayerIE
 
 
@@ -223,6 +226,11 @@ class YahooIE(InfoExtractor):
         if nbc_sports_url:
             return self.url_result(nbc_sports_url, NBCSportsVPlayerIE.ie_key())
 
+        # Look for Brightcove Legacy Studio embeds
+        bc_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
+        if bc_url:
+            return self.url_result(bc_url, BrightcoveLegacyIE.ie_key())
+
         # Look for Brightcove New Studio embeds
         bc_url = BrightcoveNewIE._extract_url(webpage)
         if bc_url:
@@ -1,21 +1,16 @@
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-)
 
 
 class YouJizzIE(InfoExtractor):
     _VALID_URL = r'https?://(?:\w+\.)?youjizz\.com/videos/(?:[^/#?]+)?-(?P<id>[0-9]+)\.html(?:$|[?#])'
     _TESTS = [{
         'url': 'http://www.youjizz.com/videos/zeichentrick-1-2189178.html',
-        'md5': '07e15fa469ba384c7693fd246905547c',
+        'md5': '78fc1901148284c69af12640e01c6310',
         'info_dict': {
             'id': '2189178',
-            'ext': 'flv',
+            'ext': 'mp4',
             'title': 'Zeichentrick 1',
             'age_limit': 18,
         }
@@ -27,38 +22,18 @@ class YouJizzIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
+        # YouJizz's HTML5 player has invalid HTML
+        webpage = webpage.replace('"controls', '" controls')
         age_limit = self._rta_search(webpage)
         video_title = self._html_search_regex(
             r'<title>\s*(.*)\s*</title>', webpage, 'title')
 
-        embed_page_url = self._search_regex(
-            r'(https?://www.youjizz.com/videos/embed/[0-9]+)',
-            webpage, 'embed page')
-        webpage = self._download_webpage(
-            embed_page_url, video_id, note='downloading embed page')
+        info_dict = self._parse_html5_media_entries(url, webpage, video_id)[0]
 
-        # Get the video URL
-        m_playlist = re.search(r'so.addVariable\("playlist", ?"(?P<playlist>.+?)"\);', webpage)
-        if m_playlist is not None:
-            playlist_url = m_playlist.group('playlist')
-            playlist_page = self._download_webpage(playlist_url, video_id,
-                                                   'Downloading playlist page')
-            m_levels = list(re.finditer(r'<level bitrate="(\d+?)" file="(.*?)"', playlist_page))
-            if len(m_levels) == 0:
-                raise ExtractorError('Unable to extract video url')
-            videos = [(int(m.group(1)), m.group(2)) for m in m_levels]
-            (_, video_url) = sorted(videos)[0]
-            video_url = video_url.replace('%252F', '%2F')
-        else:
-            video_url = self._search_regex(r'so.addVariable\("file",encodeURIComponent\("(?P<source>[^"]+)"\)\);',
-                                           webpage, 'video URL')
-
-        return {
+        info_dict.update({
             'id': video_id,
-            'url': video_url,
             'title': video_title,
-            'ext': 'flv',
-            'format': 'flv',
-            'player_url': embed_page_url,
             'age_limit': age_limit,
-        }
+        })
+
+        return info_dict
@@ -264,7 +264,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                          )
                      )? # all until now is optional -> you can pass the naked ID
                      ([0-9A-Za-z_-]{11}) # here is it! the YouTube video ID
-                     (?!.*?&list=) # combined list/video URLs are handled by the playlist IE
+                     (?!.*?\blist=) # combined list/video URLs are handled by the playlist IE
                      (?(1).+)? # if we found the ID, everything can follow
                      $"""
     _NEXT_URL_RE = r'[\?&]next_url=([^&]+)'
@@ -1778,11 +1778,14 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
     _VALID_URL = r"""(?x)(?:
                         (?:https?://)?
                         (?:\w+\.)?
-                        youtube\.com/
                         (?:
-                           (?:course|view_play_list|my_playlists|artist|playlist|watch|embed/videoseries)
-                           \? (?:.*?[&;])*? (?:p|a|list)=
-                        | p/
+                            youtube\.com/
+                            (?:
+                               (?:course|view_play_list|my_playlists|artist|playlist|watch|embed/videoseries)
+                               \? (?:.*?[&;])*? (?:p|a|list)=
+                            | p/
+                            )|
+                            youtu\.be/[0-9A-Za-z_-]{11}\?.*?\blist=
                         )
                         (
                             (?:PL|LL|EC|UU|FL|RD|UL)?[0-9A-Za-z-_]{10,}
@@ -1887,6 +1890,9 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
             'skip_download': True,
         },
         'add_ie': [YoutubeIE.ie_key()],
+    }, {
+        'url': 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',
+        'only_matching': True,
     }]
 
     def _real_initialize(self):
@@ -2376,7 +2382,7 @@ class YoutubeWatchLaterIE(YoutubePlaylistIE):
     }]
 
     def _real_extract(self, url):
-        video = self._check_download_just_video(url, 'WL')
+        _, video = self._check_download_just_video(url, 'WL')
         if video:
             return video
         _, playlist = self._extract_playlist('WL')
@@ -2411,7 +2417,7 @@ class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
 
 class YoutubeHistoryIE(YoutubeFeedsInfoExtractor):
     IE_DESC = 'Youtube watch history, ":ythistory" for short (requires authentication)'
-    _VALID_URL = 'https?://www\.youtube\.com/feed/history|:ythistory'
+    _VALID_URL = r'https?://www\.youtube\.com/feed/history|:ythistory'
     _FEED_NAME = 'history'
     _PLAYLIST_TITLE = 'Youtube History'
 
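The change from `&list=` to `\blist=` in the negative lookahead matters for short links, where `list` is the first query parameter and follows `?` rather than `&`. The sketch below uses two simplified stand-in patterns (assumptions, far smaller than the real `_VALID_URL`) to compare the old and new guards on the test URL added in this diff.

```python
import re

# Simplified stand-ins for the video-ID part of YoutubeIE._VALID_URL;
# only the lookahead differs between the two.
OLD = re.compile(r'https?://youtu\.be/([0-9A-Za-z_-]{11})(?!.*?&list=)')
NEW = re.compile(r'https?://youtu\.be/([0-9A-Za-z_-]{11})(?!.*?\blist=)')

urls = [
    'https://youtu.be/uWyaPkt-VOI',                            # plain video
    'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',    # list is first parameter
    'https://youtu.be/uWyaPkt-VOI?t=10&list=PL9D9FC436B881BA21',
    'https://youtu.be/uWyaPkt-VOI?playlist=abc',               # no word boundary before "list="
]

for url in urls:
    # True means the video pattern would claim the URL.
    print(url, bool(OLD.match(url)), bool(NEW.match(url)))
```

With the old guard the second URL is still claimed by the video pattern; with `\blist=` it is rejected, which leaves it for a playlist pattern such as the `youtu\.be/...\?.*?\blist=` alternative added above. The word boundary also keeps a parameter like `playlist=` from triggering the guard.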
@@ -423,7 +423,15 @@ def parseOpts(overrideArguments=None):
     downloader.add_option(
         '--fragment-retries',
         dest='fragment_retries', metavar='RETRIES', default=10,
-        help='Number of retries for a fragment (default is %default), or "infinite" (DASH only)')
+        help='Number of retries for a fragment (default is %default), or "infinite" (DASH and hlsnative only)')
+    downloader.add_option(
+        '--skip-unavailable-fragments',
+        action='store_true', dest='skip_unavailable_fragments', default=True,
+        help='Skip unavailable fragments (DASH and hlsnative only)')
+    general.add_option(
+        '--abort-on-unavailable-fragment',
+        action='store_false', dest='skip_unavailable_fragments',
+        help='Abort downloading when some fragment is not available')
     downloader.add_option(
         '--buffer-size',
         dest='buffersize', metavar='SIZE', default='1024',
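The two new flags share one `dest`, so they act as an on/off pair for a single setting. A minimal sketch of that `optparse` pattern follows. It registers both options on a bare `OptionParser`, whereas the diff uses the `downloader` and `general` option groups, so the grouping here is a simplification.

```python
import optparse

parser = optparse.OptionParser()
parser.add_option(
    '--skip-unavailable-fragments',
    action='store_true', dest='skip_unavailable_fragments', default=True,
    help='Skip unavailable fragments')
parser.add_option(
    '--abort-on-unavailable-fragment',
    action='store_false', dest='skip_unavailable_fragments',
    help='Abort downloading when some fragment is not available')


def parse(argv):
    # Return the final value of the shared destination for a given argv.
    opts, _args = parser.parse_args(argv)
    return opts.skip_unavailable_fragments


print(parse([]))                                   # default applies
print(parse(['--abort-on-unavailable-fragment']))  # store_false flips it
print(parse(['--abort-on-unavailable-fragment',
             '--skip-unavailable-fragments']))     # last flag wins
```

Because the second option declares no default of its own, the `default=True` from the first registration stays in effect, so skipping is the behaviour unless the abort flag is passed.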
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2016.09.03'
+__version__ = '2016.09.08'