Compare commits


77 Commits

Author SHA1 Message Date
Sergey M․
7d539ee10a release 2017.03.16 2017-03-16 22:42:12 +07:00
Sergey M․
6ad476079d [ChangeLog] Actualize 2017-03-16 22:39:48 +07:00
Philipp Hagemeister
0efbc6b56d [options] Mention flac support and sort alphabetically among the audio formats 2017-03-16 12:54:47 +01:00
Philipp Hagemeister
21bfcd3d6e [postprocessor/ffmpeg] Add support for flac
Requested at http://stackoverflow.com/q/42828041/35070
2017-03-16 12:50:45 +01:00
Sergey M․
b51dc9db0e [extractor/common] Extract SMIL formats from jwplayer 2017-03-16 03:30:53 +07:00
Sergey M․
a309684285 [extractor/generic] Add forgotten return for jwplayer formats 2017-03-16 03:28:01 +07:00
Remita Amine
ba448445b8 [redbull] improve extraction
- extract 1080p quality
- correct ttml subtitle ext
- catch api errors
- reduce request size
2017-03-15 01:40:54 +01:00
Sergey M․
5db83d79bf release 2017.03.15 2017-03-15 02:01:24 +07:00
Sergey M․
2a751e137f [ChangeLog] Actualize 2017-03-15 02:00:10 +07:00
Vijay Singh
398887b4c0 [Openload] Fixed Extraction
They changed it again.
2017-03-14 14:03:52 +08:00
Sergey M․
66bf351f80 [facebook] Make title optional (closes #12443) 2017-03-14 00:38:07 +07:00
Sergey M․
9d08963022 [telecinco] Add test for #12430 2017-03-13 22:41:28 +07:00
Sergey M․
e313d209c2 [mitele] Add support for ooyala videos (closes #12430) 2017-03-13 22:39:15 +07:00
Vijay Singh
ff9d509d20 [openload] Fix extraction
Just a minor fix for openload
2017-03-13 04:22:35 +08:00
Lucas M
c1795ca6c8 [streamable] Update API URL 2017-03-13 02:51:59 +08:00
Starsam80
8c99623259 [crunchyroll] Extract season name 2017-03-12 12:18:10 +08:00
Sergey M․
57b0ddb35f [discoverygo] Actualize test 2017-03-11 23:21:08 +07:00
Sergey M․
a28f8d7396 [discoverygo] Bypass geo restriction 2017-03-11 23:18:42 +07:00
Sergey M․
7049799470 [discoverygo:playlist] Add extractor (closes #12424) 2017-03-11 23:16:51 +07:00
Yen Chi Hsuan
4605c94d1a [__init__] Fix missing subtitles if --add-metadata is used (#12423)
The previous fix for #5594 is incorrect
2017-03-11 19:37:45 +08:00
Sergey M․
a8e687a4da release 2017.03.10 2017-03-10 23:26:28 +07:00
Sergey M․
f9e5c92c94 [ChangeLog] Actualize 2017-03-10 23:23:24 +07:00
Sergey M․
c2ee861c6d [extractor/generic] Make title optional for jwplayer embeds (closes #12410) 2017-03-10 23:16:53 +07:00
Sergey M․
bd34c32bd7 [wdr] Actualize comment 2017-03-10 23:07:36 +07:00
runningbits
f802c48660 [wdr:maus] Fix extraction and update tests 2017-03-10 23:59:32 +08:00
Sergey M․
76bee08fe7 [prosiebensat1] Improve title extraction and add test 2017-03-09 23:42:07 +07:00
Thomas Christlieb
2913821723 [prosiebensat1] Improve title extraction (closes #12318) 2017-03-10 00:18:37 +08:00
Sergey M․
0e7f9a9b48 [dplayit] Relax playback info URL extraction 2017-03-08 21:30:30 +07:00
Sergey M․
0cf2352e85 [dplayit] Separate and rewrite extractor and bypass geo restriction (closes #12393) 2017-03-08 21:20:01 +07:00
Yen Chi Hsuan
0f6b87d067 [miomio] Fix extraction
Closes #12291
Closes #12388
Closes #12402
2017-03-08 19:46:58 +08:00
Sergey M․
d7344d33b1 [telequebec] Fix description extraction and update test (closes #12399) 2017-03-08 18:25:59 +07:00
denneboomyo
b08cc749d6 [openload] Fix extraction 2017-03-08 06:01:27 +08:00
Sergey M․
b68a812ea8 [extractor/generic] Add test for brightcove UUID-like videoPlayer 2017-03-07 23:00:21 +07:00
Sergey M․
2e76bdc850 [brightcove:legacy] Relax videoPlayer validation check (closes #12381) 2017-03-07 22:59:33 +07:00
Yen Chi Hsuan
fe646a2f10 [twitch] PEP8 2017-03-07 15:34:06 +08:00
Sergey M․
9df53ea36e Credit @puxlit for twitch 2fa (#11974) 2017-03-07 04:05:47 +07:00
Sergey M․
d7d7f84c95 Credit @benages for redbull.tv (#11948) 2017-03-07 04:05:47 +07:00
Sergey M․
dccd0ab35d release 2017.03.07 2017-03-07 03:59:22 +07:00
Sergey M․
80146dcc6c [ChangeLog] Actualize 2017-03-07 03:57:54 +07:00
Sergey M․
e30ccf7047 [soundcloud] Update client id (closes #12376) 2017-03-06 23:05:38 +07:00
Yen Chi Hsuan
54a3a8827b [__init__] Metadata should be added after conversion
Fixes #5594
2017-03-06 18:09:12 +08:00
Yen Chi Hsuan
92cb5763f4 [ChangeLog] Update after #12357 2017-03-06 18:04:19 +08:00
denneboomyo
da92da4b88 Openload fix extraction (#12357)
* Fix extraction
2017-03-06 18:00:17 +08:00
Sergey M․
1664702626 release 2017.03.06 2017-03-06 04:04:39 +07:00
Sergey M․
3f116b189b [ChangeLog] Actualize 2017-03-06 04:01:21 +07:00
Sergey M․
4b5de77bdb [utils] Process bytestrings in urljoin (closes #12369) 2017-03-06 03:57:46 +07:00
Sergey M․
96182695e4 [drtv] Add geo countries to GeoRestrictedError 2017-03-06 03:23:42 +07:00
Sergey M․
fc11ad3833 [drtv:live] Bypass geo restriction 2017-03-06 03:23:42 +07:00
Yen Chi Hsuan
d2b64e04b4 [addanime] Skip an invalid test 2017-03-06 00:35:04 +08:00
Sergey M․
5dd376345b [tunepk] Add extractor (closes #12197, closes #12243) 2017-03-05 23:31:38 +07:00
Sergey M․
1a2192cb90 [extractor/common] Pass arguments to _parse_jwplayer_formats and PEP8 2017-03-05 23:29:17 +07:00
Sergey M․
0236cd0dfd [extractor/common] Improve height extraction and extract bitrate 2017-03-05 23:25:03 +07:00
Sergey M․
ed0cf9b383 [extractor/common] Move jwplayer formats extraction in separate method 2017-03-05 23:22:27 +07:00
Sergey M․
a50862b735 [downloader/external] Add missing import and PEP8 2017-03-05 10:24:29 +07:00
John Hawkinson
6d0fe752bf [external:ffmpeg] In test harness, limit to 10k download size
Otherwise, if you screw up a playlist test by including a playlist
dictionary key, you'll be there for eons while it downloads all the
files before erroring out.
2017-03-05 11:19:44 +08:00
Sergey M․
afa4597618 release 2017.03.05 2017-03-05 02:23:08 +07:00
Sergey M․
75027364ba [ChangeLog] Actualize 2017-03-05 02:22:02 +07:00
Sergey M․
5316566edc [twitch] Use better naming and simplify (closes #11974) 2017-03-05 02:06:33 +07:00
Xiao Di Guan
c64c03be35 [twitch] Add basic support for two-factor authentication 2017-03-05 01:06:27 +07:00
Sergey M․
bcefc59279 Credit @vierbergenlars for vijf.be (#12304) 2017-03-05 00:03:59 +07:00
Sergey M․
6f211dc936 Credit @obilodeau for vrak (#11452) 2017-03-05 00:03:59 +07:00
Sergey M․
f24c1e5584 Credit @TobiX for #9725 2017-03-05 00:03:59 +07:00
Sergey M․
466274fe9a Credit @p2004a for vodpl (#12122) 2017-03-05 00:03:59 +07:00
Sergey M․
30f8f142d4 Credit @ThomasChr for #12015 and #12245 2017-03-05 00:03:59 +07:00
Lars Vierbergen
a3ba8a7acf [vier] Add support for vijf.be
vier.be and vijf.be run on the same CMS and are property of the same company,
so the same extractor can be used for both of them.
2017-03-05 00:47:19 +08:00
Sergey M․
054a587de8 [redbulltv] Improve extraction (closes #11948, closes #3919) 2017-03-04 23:28:21 +07:00
Juanjo Benages
64b7ccef3e [redbulltv] Add extractor 2017-03-04 23:26:15 +07:00
Yen Chi Hsuan
6f4e4132d8 [douyutv] Switch to the PC API to escape the 5-min limitation
Thanks @spacemeowx2 for the algo.

Ref: https://gist.github.com/spacemeowx2/629b1d131bd7e240a7d28742048e80fc

Closes #12316
2017-03-04 23:23:18 +08:00
Sergey M․
eb3079b6ce [generic] Add support for rutube embeds 2017-03-04 00:46:33 +07:00
Sergey M․
bc82f22879 [rutube] Relax _VALID_URL 2017-03-04 00:42:51 +07:00
Sergey M․
4d058c9862 [vrak] Improve and update test (closes #11452) 2017-03-03 23:58:16 +07:00
Sergey M․
d16f27ca27 [brightcove:new] Add ability to smuggle geo_countries into URL 2017-03-03 23:58:03 +07:00
Olivier Bilodeau
cbb127568a [vrak] Add extractor 2017-03-03 23:54:21 +07:00
Sergey M․
d02d4fa0a9 [brightcove:new] Raise GeoRestrictedError 2017-03-03 22:49:48 +07:00
Sergey M․
692fa200ca [go] Relax _VALID_URL (closes #12341) 2017-03-03 22:28:34 +07:00
Sergey M․
9bae185ba6 [24video] Use original host for requests (closes #12339) 2017-03-03 22:16:00 +07:00
Sergey M․
4d345bf17b [ruutu] Disable DASH formats (closes #12322)
Due to causing out of sync issue
2017-03-02 23:53:46 +07:00
41 changed files with 988 additions and 213 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.03.02*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.03.02**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.03.16*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.03.16**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.03.02
[debug] youtube-dl version 2017.03.16
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -202,3 +202,10 @@ Fabian Stahl
Bagira
Odd Stråbø
Philip Herzog
Thomas Christlieb
Marek Rusinowski
Tobias Gruetzmacher
Olivier Bilodeau
Lars Vierbergen
Juanjo Benages
Xiao Di Guan

View File

@@ -1,3 +1,83 @@
version 2017.03.16
Core
+ [postprocessor/ffmpeg] Add support for flac
+ [extractor/common] Extract SMIL formats from jwplayer
Extractors
+ [generic] Add forgotten return for jwplayer formats
* [redbulltv] Improve extraction
version 2017.03.15
Core
* Fix missing subtitles if --add-metadata is used (#12423)
Extractors
* [facebook] Make title optional (#12443)
+ [mitele] Add support for ooyala videos (#12430)
* [openload] Fix extraction (#12435, #12446)
* [streamable] Update API URL (#12433)
+ [crunchyroll] Extract season name (#12428)
* [discoverygo] Bypass geo restriction
+ [discoverygo:playlist] Add support for playlists (#12424)
version 2017.03.10
Extractors
* [generic] Make title optional for jwplayer embeds (#12410)
* [wdr:maus] Fix extraction (#12373)
* [prosiebensat1] Improve title extraction (#12318, #12327)
* [dplayit] Separate and rewrite extractor and bypass geo restriction (#12393)
* [miomio] Fix extraction (#12291, #12388, #12402)
* [telequebec] Fix description extraction (#12399)
* [openload] Fix extraction (#12357)
* [brightcove:legacy] Relax videoPlayer validation check (#12381)
version 2017.03.07
Core
* Metadata are now added after conversion (#5594)
Extractors
* [soundcloud] Update client id (#12376)
* [openload] Fix extraction (#10408, #12357)
version 2017.03.06
Core
+ [utils] Process bytestrings in urljoin (#12369)
* [extractor/common] Improve height extraction and extract bitrate
* [extractor/common] Move jwplayer formats extraction in separate method
+ [external:ffmpeg] Limit test download size to 10KiB (#12362)
Extractors
+ [drtv] Add geo countries to GeoRestrictedError
+ [drtv:live] Bypass geo restriction
+ [tunepk] Add extractor (#12197, #12243)
version 2017.03.05
Extractors
+ [twitch] Add basic support for two-factor authentication (#11974)
+ [vier] Add support for vijf.be (#12304)
+ [redbulltv] Add support for redbull.tv (#3919, #11948)
* [douyutv] Switch to the PC API to escape the 5-min limitation (#12316)
+ [generic] Add support for rutube embeds
+ [rutube] Relax URL regular expression
+ [vrak] Add support for vrak.tv (#11452)
+ [brightcove:new] Add ability to smuggle geo_countries into URL
+ [brightcove:new] Raise GeoRestrictedError
* [go] Relax URL regular expression (#12341)
* [24video] Use original host for requests (#12339)
* [ruutu] Disable DASH formats (#12322)
version 2017.03.02
Core

View File

@@ -375,8 +375,9 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
(requires ffmpeg or avconv and ffprobe or
avprobe)
--audio-format FORMAT Specify audio format: "best", "aac",
"vorbis", "mp3", "m4a", "opus", or "wav";
"best" by default; No effect without -x
"flac", "mp3", "m4a", "opus", "vorbis", or
"wav"; "best" by default; No effect without
-x
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
a value between 0 (better) and 9 (worse)
for VBR or a specific bitrate like 128K
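
The new flac choice is also usable when youtube-dl is embedded as a library. A minimal sketch, assuming the usual FFmpegExtractAudio post-processor options from the README's embedding section (ffmpeg or avconv required; the URL is the project's customary test video):

from __future__ import unicode_literals
import youtube_dl

# Illustrative only: extract audio and convert it to flac,
# mirroring "-x --audio-format flac" on the command line.
ydl_opts = {
    'format': 'bestaudio/best',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'flac',
    }],
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])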

View File

@@ -208,10 +208,12 @@
- **Digiteka**
- **Discovery**
- **DiscoveryGo**
- **DiscoveryGoPlaylist**
- **Disney**
- **Dotsub**
- **DouyuTV**: 斗鱼
- **DPlay**
- **DPlayIt**
- **dramafever**
- **dramafever:series**
- **DRBonanza**
@@ -626,6 +628,7 @@
- **RaiTV**
- **RBMARadio**
- **RDS**: RDS.ca
- **RedBullTV**
- **RedTube**
- **RegioTV**
- **RENTV**
@@ -797,6 +800,7 @@
- **tunein:program**
- **tunein:station**
- **tunein:topic**
- **TunePk**
- **Turbo**
- **Tutv**
- **tv.dfb.de**
@@ -916,6 +920,7 @@
- **VoxMedia**
- **Vporn**
- **vpro**: npo.nl and ntr.nl
- **Vrak**
- **VRT**
- **vube**: Vube.com
- **VuClip**

View File

@@ -455,6 +455,9 @@ class TestUtil(unittest.TestCase):
def test_urljoin(self):
self.assertEqual(urljoin('http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin(b'http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de/', b'/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin(b'http://foo.de/', b'/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('//foo.de/', '/a/b/c.txt'), '//foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de/', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
self.assertEqual(urljoin('http://foo.de', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
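
The new assertions above cover bytestring arguments. A hedged sketch of what the urljoin change amounts to, assuming utf-8 decoding of bytes inputs (the exact implementation in utils.py may differ in details):

import re
from youtube_dl.compat import compat_str, compat_urlparse

def urljoin(base, path):
    # Decode bytes so urljoin(b'http://foo.de/', b'/a/b/c.txt') behaves
    # like the str/str case exercised by the tests above.
    if isinstance(path, bytes):
        path = path.decode('utf-8')
    if not isinstance(path, compat_str) or not path:
        return None
    if re.match(r'^(?:https?:)?//', path):
        return path
    if isinstance(base, bytes):
        base = base.decode('utf-8')
    if not isinstance(base, compat_str) or not re.match(r'^(?:https?:)?//', base):
        return None
    return compat_urlparse.urljoin(base, path)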

View File

@@ -196,7 +196,7 @@ def _real_main(argv=None):
if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart:
raise ValueError('Playlist end must be greater than playlist start')
if opts.extractaudio:
if opts.audioformat not in ['best', 'aac', 'mp3', 'm4a', 'opus', 'vorbis', 'wav']:
if opts.audioformat not in ['best', 'aac', 'flac', 'mp3', 'm4a', 'opus', 'vorbis', 'wav']:
parser.error('invalid audio format specified')
if opts.audioquality:
opts.audioquality = opts.audioquality.strip('k').strip('K')
@@ -242,14 +242,11 @@ def _real_main(argv=None):
# PostProcessors
postprocessors = []
# Add the metadata pp first, the other pps will copy it
if opts.metafromtitle:
postprocessors.append({
'key': 'MetadataFromTitle',
'titleformat': opts.metafromtitle
})
if opts.addmetadata:
postprocessors.append({'key': 'FFmpegMetadata'})
if opts.extractaudio:
postprocessors.append({
'key': 'FFmpegExtractAudio',
@@ -262,6 +259,16 @@ def _real_main(argv=None):
'key': 'FFmpegVideoConvertor',
'preferedformat': opts.recodevideo,
})
# FFmpegMetadataPP should be run after FFmpegVideoConvertorPP and
# FFmpegExtractAudioPP as containers before conversion may not support
# metadata (3gp, webm, etc.)
# And this post-processor should be placed before other metadata
# manipulating post-processors (FFmpegEmbedSubtitle) to prevent loss of
# extra metadata. By default ffmpeg preserves metadata applicable for both
# source and target containers. From this point the container won't change,
# so metadata can be added here.
if opts.addmetadata:
postprocessors.append({'key': 'FFmpegMetadata'})
if opts.convertsubtitles:
postprocessors.append({
'key': 'FFmpegSubtitlesConvertor',

View File

@@ -6,7 +6,10 @@ import sys
import re
from .common import FileDownloader
from ..compat import compat_setenv
from ..compat import (
compat_setenv,
compat_str,
)
from ..postprocessor.ffmpeg import FFmpegPostProcessor, EXT_TO_OUT_FORMATS
from ..utils import (
cli_option,
@@ -270,6 +273,10 @@ class FFmpegFD(ExternalFD):
args += ['-rtmp_live', 'live']
args += ['-i', url, '-c', 'copy']
if self.params.get('test', False):
args += ['-fs', compat_str(self._TEST_FILE_SIZE)]
if protocol in ('m3u8', 'm3u8_native'):
if self.params.get('hls_use_mpegts', False) or tmpfilename == '-':
args += ['-f', 'mpegts']

View File

@@ -25,7 +25,8 @@ class AddAnimeIE(InfoExtractor):
'ext': 'mp4',
'description': 'One Piece 606',
'title': 'One Piece 606',
}
},
'skip': 'Video is gone',
}, {
'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
'only_matching': True,

View File

@@ -193,7 +193,13 @@ class BrightcoveLegacyIE(InfoExtractor):
if videoPlayer is not None:
if isinstance(videoPlayer, list):
videoPlayer = videoPlayer[0]
if not (videoPlayer.isdigit() or videoPlayer.startswith('ref:')):
videoPlayer = videoPlayer.strip()
# UUID is also possible for videoPlayer (e.g.
# http://www.popcornflix.com/hoodies-vs-hooligans/7f2d2b87-bbf2-4623-acfb-ea942b4f01dd
# or http://www8.hp.com/cn/zh/home.html)
if not (re.match(
r'^(?:\d+|[\da-fA-F]{8}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{12})$',
videoPlayer) or videoPlayer.startswith('ref:')):
return None
params['@videoPlayer'] = videoPlayer
linkBase = find_param('linkBaseURL')
@@ -515,6 +521,9 @@ class BrightcoveNewIE(InfoExtractor):
return entries
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass(smuggled_data.get('geo_countries'))
account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(
@@ -544,8 +553,10 @@ class BrightcoveNewIE(InfoExtractor):
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
raise ExtractorError(
json_data.get('message') or json_data['error_code'], expected=True)
message = json_data.get('message') or json_data['error_code']
if json_data.get('error_subcode') == 'CLIENT_GEO':
self.raise_geo_restricted(msg=message)
raise ExtractorError(message, expected=True)
raise
title = json_data['name'].strip()
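
For reference, the relaxed videoPlayer check above accepts plain numeric IDs, "ref:" IDs and UUIDs with or without dashes. A standalone illustration using the same regular expression (the "ref:" value is hypothetical; the other two come from tests in this diff):

import re

VIDEO_PLAYER_RE = r'^(?:\d+|[\da-fA-F]{8}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{12})$'

for candidate in (
        '5255815316001',                        # numeric ID (HP generic test)
        '7f2d2b87-bbf2-4623-acfb-ea942b4f01dd',  # UUID (popcornflix example above)
        'ref:some-reference-id'):                # hypothetical ref: ID
    accepted = bool(re.match(VIDEO_PLAYER_RE, candidate)) or candidate.startswith('ref:')
    print('%s -> %s' % (candidate, accepted))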

View File

@@ -2198,56 +2198,9 @@ class InfoExtractor(object):
this_video_id = video_id or video_data['mediaid']
formats = []
for source in video_data['sources']:
source_url = self._proto_relative_url(source['file'])
if base_url:
source_url = compat_urlparse.urljoin(base_url, source_url)
source_type = source.get('type') or ''
ext = mimetype2ext(source_type) or determine_ext(source_url)
if source_type == 'hls' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
source_url, this_video_id, 'mp4', 'm3u8_native', m3u8_id=m3u8_id, fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
source_url, this_video_id, mpd_id=mpd_id, fatal=False))
# https://github.com/jwplayer/jwplayer/blob/master/src/js/providers/default.js#L67
elif source_type.startswith('audio') or ext in ('oga', 'aac', 'mp3', 'mpeg', 'vorbis'):
formats.append({
'url': source_url,
'vcodec': 'none',
'ext': ext,
})
else:
height = int_or_none(source.get('height'))
if height is None:
# Often no height is provided but there is a label in
# format like 1080p.
height = int_or_none(self._search_regex(
r'^(\d{3,})[pP]$', source.get('label') or '',
'height', default=None))
a_format = {
'url': source_url,
'width': int_or_none(source.get('width')),
'height': height,
'ext': ext,
}
if source_url.startswith('rtmp'):
a_format['ext'] = 'flv'
# See com/longtailvideo/jwplayer/media/RTMPMediaProvider.as
# of jwplayer.flash.swf
rtmp_url_parts = re.split(
r'((?:mp4|mp3|flv):)', source_url, 1)
if len(rtmp_url_parts) == 3:
rtmp_url, prefix, play_path = rtmp_url_parts
a_format.update({
'url': rtmp_url,
'play_path': prefix + play_path,
})
if rtmp_params:
a_format.update(rtmp_params)
formats.append(a_format)
formats = self._parse_jwplayer_formats(
video_data['sources'], video_id=this_video_id, m3u8_id=m3u8_id,
mpd_id=mpd_id, rtmp_params=rtmp_params, base_url=base_url)
self._sort_formats(formats)
subtitles = {}
@@ -2278,6 +2231,65 @@ class InfoExtractor(object):
else:
return self.playlist_result(entries)
def _parse_jwplayer_formats(self, jwplayer_sources_data, video_id=None,
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
formats = []
for source in jwplayer_sources_data:
source_url = self._proto_relative_url(source['file'])
if base_url:
source_url = compat_urlparse.urljoin(base_url, source_url)
source_type = source.get('type') or ''
ext = mimetype2ext(source_type) or determine_ext(source_url)
if source_type == 'hls' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
source_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=m3u8_id, fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
source_url, video_id, mpd_id=mpd_id, fatal=False))
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
source_url, video_id, fatal=False))
# https://github.com/jwplayer/jwplayer/blob/master/src/js/providers/default.js#L67
elif source_type.startswith('audio') or ext in (
'oga', 'aac', 'mp3', 'mpeg', 'vorbis'):
formats.append({
'url': source_url,
'vcodec': 'none',
'ext': ext,
})
else:
height = int_or_none(source.get('height'))
if height is None:
# Often no height is provided but there is a label in
# format like "1080p", "720p SD", or 1080.
height = int_or_none(self._search_regex(
r'^(\d{3,4})[pP]?(?:\b|$)', compat_str(source.get('label') or ''),
'height', default=None))
a_format = {
'url': source_url,
'width': int_or_none(source.get('width')),
'height': height,
'tbr': int_or_none(source.get('bitrate')),
'ext': ext,
}
if source_url.startswith('rtmp'):
a_format['ext'] = 'flv'
# See com/longtailvideo/jwplayer/media/RTMPMediaProvider.as
# of jwplayer.flash.swf
rtmp_url_parts = re.split(
r'((?:mp4|mp3|flv):)', source_url, 1)
if len(rtmp_url_parts) == 3:
rtmp_url, prefix, play_path = rtmp_url_parts
a_format.update({
'url': rtmp_url,
'play_path': prefix + play_path,
})
if rtmp_params:
a_format.update(rtmp_params)
formats.append(a_format)
return formats
def _live_title(self, name):
""" Generate the title for a live video """
now = datetime.datetime.now()
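
With the formats extraction split into _parse_jwplayer_formats, other extractors can feed JWPlayer-style source lists straight into the helper (the TunePk extractor later in this diff does exactly that). A hedged sketch of the calling pattern, with a hypothetical site and hard-coded sources standing in for data parsed from the page:

from youtube_dl.extractor.common import InfoExtractor

class ExampleJWPlayerIE(InfoExtractor):
    # Hypothetical extractor, for illustration only.
    _VALID_URL = r'https?://example\.com/video/(?P<id>\d+)'

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        # In a real extractor these would come from the page's JWPlayer setup JSON.
        sources = [
            {'file': 'https://example.com/video.m3u8', 'type': 'hls'},
            {'file': 'https://example.com/video-720.mp4', 'label': '720p', 'bitrate': 1500},
        ]
        formats = self._parse_jwplayer_formats(
            sources, video_id=video_id, m3u8_id='hls')
        self._sort_formats(formats)
        return {
            'id': video_id,
            'title': self._og_search_title(webpage),
            'formats': formats,
        }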

View File

@@ -177,6 +177,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'uploader': 'Kadokawa Pictures Inc.',
'upload_date': '20170118',
'series': "KONOSUBA -God's blessing on this wonderful world!",
'season': "KONOSUBA -God's blessing on this wonderful world! 2",
'season_number': 2,
'episode': 'Give Me Deliverance from this Judicial Injustice!',
'episode_number': 1,
@@ -222,6 +223,23 @@ class CrunchyrollIE(CrunchyrollBaseIE):
# just test metadata extraction
'skip_download': True,
},
}, {
# A video with a vastly different season name compared to the series name
'url': 'http://www.crunchyroll.com/nyarko-san-another-crawling-chaos/episode-1-test-590532',
'info_dict': {
'id': '590532',
'ext': 'mp4',
'title': 'Haiyoru! Nyaruani (ONA) Episode 1 Test',
'description': 'Mahiro and Nyaruko talk about official certification.',
'uploader': 'TV TOKYO',
'upload_date': '20120305',
'series': 'Nyarko-san: Another Crawling Chaos',
'season': 'Haiyoru! Nyaruani (ONA)',
},
'params': {
# Just test metadata extraction
'skip_download': True,
},
}]
_FORMAT_IDS = {
@@ -491,7 +509,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
# webpage provide more accurate data than series_title from XML
series = self._html_search_regex(
r'id=["\']showmedia_about_episode_num[^>]+>\s*<a[^>]+>([^<]+)',
webpage, 'series', default=xpath_text(metadata, 'series_title'))
webpage, 'series', fatal=False)
season = xpath_text(metadata, 'series_title')
episode = xpath_text(metadata, 'episode_title')
episode_number = int_or_none(xpath_text(metadata, 'episode_number'))
@@ -508,6 +527,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
'uploader': video_uploader,
'upload_date': video_upload_date,
'series': series,
'season': season,
'season_number': season_number,
'episode': episode,
'episode_number': episode_number,

View File

@@ -1,17 +1,21 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
extract_attributes,
ExtractorError,
int_or_none,
parse_age_limit,
ExtractorError,
remove_end,
unescapeHTML,
)
class DiscoveryGoIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?(?:
class DiscoveryGoBaseIE(InfoExtractor):
_VALID_URL_TEMPLATE = r'''(?x)https?://(?:www\.)?(?:
discovery|
investigationdiscovery|
discoverylife|
@@ -21,18 +25,23 @@ class DiscoveryGoIE(InfoExtractor):
sciencechannel|
tlc|
velocitychannel
)go\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'''
)go\.com/%s(?P<id>[^/?#&]+)'''
class DiscoveryGoIE(DiscoveryGoBaseIE):
_VALID_URL = DiscoveryGoBaseIE._VALID_URL_TEMPLATE % r'(?:[^/]+/)+'
_GEO_COUNTRIES = ['US']
_TEST = {
'url': 'https://www.discoverygo.com/love-at-first-kiss/kiss-first-ask-questions-later/',
'url': 'https://www.discoverygo.com/bering-sea-gold/reaper-madness/',
'info_dict': {
'id': '57a33c536b66d1cd0345eeb1',
'id': '58c167d86b66d12f2addeb01',
'ext': 'mp4',
'title': 'Kiss First, Ask Questions Later!',
'description': 'md5:fe923ba34050eae468bffae10831cb22',
'duration': 2579,
'series': 'Love at First Kiss',
'season_number': 1,
'episode_number': 1,
'title': 'Reaper Madness',
'description': 'md5:09f2c625c99afb8946ed4fb7865f6e78',
'duration': 2519,
'series': 'Bering Sea Gold',
'season_number': 8,
'episode_number': 6,
'age_limit': 14,
},
}
@@ -113,3 +122,46 @@ class DiscoveryGoIE(InfoExtractor):
'formats': formats,
'subtitles': subtitles,
}
class DiscoveryGoPlaylistIE(DiscoveryGoBaseIE):
_VALID_URL = DiscoveryGoBaseIE._VALID_URL_TEMPLATE % ''
_TEST = {
'url': 'https://www.discoverygo.com/bering-sea-gold/',
'info_dict': {
'id': 'bering-sea-gold',
'title': 'Bering Sea Gold',
'description': 'md5:cc5c6489835949043c0cc3ad66c2fa0e',
},
'playlist_mincount': 6,
}
@classmethod
def suitable(cls, url):
return False if DiscoveryGoIE.suitable(url) else super(
DiscoveryGoPlaylistIE, cls).suitable(url)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
entries = []
for mobj in re.finditer(r'data-json=(["\'])(?P<json>{.+?})\1', webpage):
data = self._parse_json(
mobj.group('json'), display_id,
transform_source=unescapeHTML, fatal=False)
if not isinstance(data, dict) or data.get('type') != 'episode':
continue
episode_url = data.get('socialUrl')
if not episode_url:
continue
entries.append(self.url_result(
episode_url, ie=DiscoveryGoIE.ie_key(),
video_id=data.get('id')))
return self.playlist_result(
entries, display_id,
remove_end(self._og_search_title(
webpage, fatal=False), ' | Discovery GO'),
self._og_search_description(webpage))

View File

@@ -1,6 +1,9 @@
# coding: utf-8
from __future__ import unicode_literals
import time
import hashlib
from .common import InfoExtractor
from ..utils import (
ExtractorError,
@@ -16,7 +19,7 @@ class DouyuTVIE(InfoExtractor):
'info_dict': {
'id': '17732',
'display_id': 'iseven',
'ext': 'mp4',
'ext': 'flv',
'title': 're:^清晨醒脑T-ARA根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
@@ -31,7 +34,7 @@ class DouyuTVIE(InfoExtractor):
'info_dict': {
'id': '85982',
'display_id': '85982',
'ext': 'mp4',
'ext': 'flv',
'title': 're:^小漠从零单排记——CSOL2躲猫猫 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:746a2f7a253966a06755a912f0acc0d2',
'thumbnail': r're:^https?://.*\.jpg$',
@@ -47,7 +50,7 @@ class DouyuTVIE(InfoExtractor):
'info_dict': {
'id': '17732',
'display_id': '17732',
'ext': 'mp4',
'ext': 'flv',
'title': 're:^清晨醒脑T-ARA根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
@@ -66,10 +69,6 @@ class DouyuTVIE(InfoExtractor):
'only_matching': True,
}]
# Decompile core.swf in webpage by ffdec "Search SWFs in memory". core.swf
# is encrypted originally, but ffdec can dump memory to get the decrypted one.
_API_KEY = 'A12Svb&%1UUmf@hC'
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -80,6 +79,7 @@ class DouyuTVIE(InfoExtractor):
room_id = self._html_search_regex(
r'"room_id\\?"\s*:\s*(\d+),', page, 'room id')
# Grab metadata from mobile API
room = self._download_json(
'http://m.douyu.com/html5/live?roomId=%s' % room_id, video_id,
note='Downloading room info')['data']
@@ -88,8 +88,19 @@ class DouyuTVIE(InfoExtractor):
if room.get('show_status') == '2':
raise ExtractorError('Live stream is offline', expected=True)
formats = self._extract_m3u8_formats(
room['hls_url'], video_id, ext='mp4')
# Grab the URL from PC client API
# The m3u8 url from mobile API requires re-authentication every 5 minutes
tt = int(time.time())
signContent = 'lapi/live/thirdPart/getPlay/%s?aid=pcclient&rate=0&time=%d9TUk5fjjUjg9qIMH3sdnh' % (room_id, tt)
sign = hashlib.md5(signContent.encode('ascii')).hexdigest()
video_url = self._download_json(
'http://coapi.douyucdn.cn/lapi/live/thirdPart/getPlay/' + room_id,
video_id, note='Downloading video URL info',
query={'rate': 0}, headers={
'auth': sign,
'time': str(tt),
'aid': 'pcclient'
})['data']['live_url']
title = self._live_title(unescapeHTML(room['room_name']))
description = room.get('show_details')
@@ -99,7 +110,7 @@ class DouyuTVIE(InfoExtractor):
return {
'id': room_id,
'display_id': video_id,
'formats': formats,
'url': video_url,
'title': title,
'description': description,
'thumbnail': thumbnail,
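
The authentication against the PC API above boils down to an MD5 over the request path plus a fixed salt, per the referenced gist. Isolated for clarity, using the room id from the test above:

import hashlib
import time

room_id = '17732'
tt = int(time.time())
# Same signing scheme as in the diff above: path + query + salt, hashed with MD5.
sign_content = (
    'lapi/live/thirdPart/getPlay/%s?aid=pcclient&rate=0&time=%d'
    '9TUk5fjjUjg9qIMH3sdnh' % (room_id, tt))
sign = hashlib.md5(sign_content.encode('ascii')).hexdigest()
headers = {'auth': sign, 'time': str(tt), 'aid': 'pcclient'}
print(headers)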

View File

@@ -6,37 +6,24 @@ import re
import time
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..compat import (
compat_urlparse,
compat_HTTPError,
)
from ..utils import (
USER_AGENTS,
ExtractorError,
int_or_none,
unified_strdate,
remove_end,
update_url_query,
)
class DPlayIE(InfoExtractor):
_VALID_URL = r'https?://(?P<domain>it\.dplay\.com|www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?P<domain>www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
_TESTS = [{
# geo restricted, via direct unsigned hls URL
'url': 'http://it.dplay.com/take-me-out/stagione-1-episodio-25/',
'info_dict': {
'id': '1255600',
'display_id': 'stagione-1-episodio-25',
'ext': 'mp4',
'title': 'Episodio 25',
'description': 'md5:cae5f40ad988811b197d2d27a53227eb',
'duration': 2761,
'timestamp': 1454701800,
'upload_date': '20160205',
'creator': 'RTIT',
'series': 'Take me out',
'season_number': 1,
'episode_number': 25,
'age_limit': 0,
},
'expected_warnings': ['Unable to download f4m manifest'],
}, {
# non geo restricted, via secure api, unsigned download hls URL
'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
'info_dict': {
@@ -168,3 +155,90 @@ class DPlayIE(InfoExtractor):
'formats': formats,
'subtitles': subtitles,
}
class DPlayItIE(InfoExtractor):
_VALID_URL = r'https?://it\.dplay\.com/[^/]+/[^/]+/(?P<id>[^/?#]+)'
_GEO_COUNTRIES = ['IT']
_TEST = {
'url': 'http://it.dplay.com/nove/biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij/',
'md5': '2b808ffb00fc47b884a172ca5d13053c',
'info_dict': {
'id': '6918',
'display_id': 'luigi-di-maio-la-psicosi-di-stanislawskij',
'ext': 'mp4',
'title': 'Biografie imbarazzanti: Luigi Di Maio: la psicosi di Stanislawskij',
'description': 'md5:3c7a4303aef85868f867a26f5cc14813',
'thumbnail': r're:^https?://.*\.jpe?g',
'upload_date': '20160524',
'series': 'Biografie imbarazzanti',
'season_number': 1,
'episode': 'Luigi Di Maio: la psicosi di Stanislawskij',
'episode_number': 1,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
info_url = self._search_regex(
r'url\s*:\s*["\']((?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)',
webpage, 'video id')
title = remove_end(self._og_search_title(webpage), ' | Dplay')
try:
info = self._download_json(
info_url, display_id, headers={
'Authorization': 'Bearer %s' % self._get_cookies(url).get(
'dplayit_token').value,
'Referer': url,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 403):
info = self._parse_json(e.cause.read().decode('utf-8'), display_id)
error = info['errors'][0]
if error.get('code') == 'access.denied.geoblocked':
self.raise_geo_restricted(
msg=error.get('detail'), countries=self._GEO_COUNTRIES)
raise ExtractorError(info['errors'][0]['detail'], expected=True)
raise
hls_url = info['data']['attributes']['streaming']['hls']['url']
formats = self._extract_m3u8_formats(
hls_url, display_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id='hls')
series = self._html_search_regex(
r'(?s)<h1[^>]+class=["\'].*?\bshow_title\b.*?["\'][^>]*>(.+?)</h1>',
webpage, 'series', fatal=False)
episode = self._search_regex(
r'<p[^>]+class=["\'].*?\bdesc_ep\b.*?["\'][^>]*>\s*<br/>\s*<b>([^<]+)',
webpage, 'episode', fatal=False)
mobj = re.search(
r'(?s)<span[^>]+class=["\']dates["\'][^>]*>.+?\bS\.(?P<season_number>\d+)\s+E\.(?P<episode_number>\d+)\s*-\s*(?P<upload_date>\d{2}/\d{2}/\d{4})',
webpage)
if mobj:
season_number = int(mobj.group('season_number'))
episode_number = int(mobj.group('episode_number'))
upload_date = unified_strdate(mobj.group('upload_date'))
else:
season_number = episode_number = upload_date = None
return {
'id': info_url.rpartition('/')[-1],
'display_id': display_id,
'title': title,
'description': self._og_search_description(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'series': series,
'season_number': season_number,
'episode': episode,
'episode_number': episode_number,
'upload_date': upload_date,
'formats': formats,
}

View File

@@ -15,6 +15,8 @@ from ..utils import (
class DRTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dr\.dk/(?:tv/se|nyheder|radio/ondemand)/(?:[^/]+/)*(?P<id>[\da-z-]+)(?:[/#?]|$)'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['DK']
IE_NAME = 'drtv'
_TESTS = [{
'url': 'https://www.dr.dk/tv/se/boern/ultra/klassen-ultra/klassen-darlig-taber-10',
@@ -137,7 +139,7 @@ class DRTVIE(InfoExtractor):
if not formats and restricted_to_denmark:
self.raise_geo_restricted(
'Unfortunately, DR is not allowed to show this program outside Denmark.',
expected=True)
countries=self._GEO_COUNTRIES)
self._sort_formats(formats)
@@ -156,6 +158,7 @@ class DRTVIE(InfoExtractor):
class DRTVLiveIE(InfoExtractor):
IE_NAME = 'drtv:live'
_VALID_URL = r'https?://(?:www\.)?dr\.dk/(?:tv|TV)/live/(?P<id>[\da-z-]+)'
_GEO_COUNTRIES = ['DK']
_TEST = {
'url': 'https://www.dr.dk/tv/live/dr1',
'info_dict': {

View File

@@ -246,7 +246,10 @@ from .dfb import DFBIE
from .dhm import DHMIE
from .dotsub import DotsubIE
from .douyutv import DouyuTVIE
from .dplay import DPlayIE
from .dplay import (
DPlayIE,
DPlayItIE,
)
from .dramafever import (
DramaFeverIE,
DramaFeverSeriesIE,
@@ -262,7 +265,10 @@ from .dvtv import DVTVIE
from .dumpert import DumpertIE
from .defense import DefenseGouvFrIE
from .discovery import DiscoveryIE
from .discoverygo import DiscoveryGoIE
from .discoverygo import (
DiscoveryGoIE,
DiscoveryGoPlaylistIE,
)
from .disney import DisneyIE
from .dispeak import DigitallySpeakingIE
from .dropbox import DropboxIE
@@ -793,6 +799,7 @@ from .rai import (
)
from .rbmaradio import RBMARadioIE
from .rds import RDSIE
from .redbulltv import RedBullTVIE
from .redtube import RedTubeIE
from .regiotv import RegioTVIE
from .rentv import (
@@ -999,6 +1006,7 @@ from .tunein import (
TuneInTopicIE,
TuneInShortenerIE,
)
from .tunepk import TunePkIE
from .turbo import TurboIE
from .tutv import TutvIE
from .tv2 import (
@@ -1165,6 +1173,7 @@ from .voicerepublic import VoiceRepublicIE
from .voxmedia import VoxMediaIE
from .vporn import VpornIE
from .vrt import VRTIE
from .vrak import VrakIE
from .vube import VubeIE
from .vuclip import VuClipIE
from .vvvvid import VVVVIDIE

View File

@@ -196,6 +196,10 @@ class FacebookIE(InfoExtractor):
}, {
'url': 'https://www.facebookcorewwwi.onion/video.php?v=274175099429670',
'only_matching': True,
}, {
# no title
'url': 'https://www.facebook.com/onlycleverentertainment/videos/1947995502095005/',
'only_matching': True,
}]
@staticmethod
@@ -353,15 +357,15 @@ class FacebookIE(InfoExtractor):
self._sort_formats(formats)
video_title = self._html_search_regex(
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage, 'title',
default=None)
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage,
'title', default=None)
if not video_title:
video_title = self._html_search_regex(
r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(.*?)</span>',
webpage, 'alternative title', default=None)
if not video_title:
video_title = self._html_search_meta(
'description', webpage, 'title')
'description', webpage, 'title', default=None)
if video_title:
video_title = limit_length(video_title, 80)
else:

View File

@@ -84,6 +84,7 @@ from .twentymin import TwentyMinutenIE
from .ustream import UstreamIE
from .openload import OpenloadIE
from .videopress import VideoPressIE
from .rutube import RutubeIE
class GenericIE(InfoExtractor):
@@ -448,6 +449,23 @@ class GenericIE(InfoExtractor):
},
}],
},
{
# Brightcove with UUID in videoPlayer
'url': 'http://www8.hp.com/cn/zh/home.html',
'info_dict': {
'id': '5255815316001',
'ext': 'mp4',
'title': 'Sprocket Video - China',
'description': 'Sprocket Video - China',
'uploader': 'HP-Video Gallery',
'timestamp': 1482263210,
'upload_date': '20161220',
'uploader_id': '1107601872001',
},
'params': {
'skip_download': True, # m3u8 download
},
},
# ooyala video
{
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
@@ -1502,6 +1520,23 @@ class GenericIE(InfoExtractor):
},
'add_ie': [VideoPressIE.ie_key()],
},
{
# Rutube embed
'url': 'http://magazzino.friday.ru/videos/vipuski/kazan-2',
'info_dict': {
'id': '9b3d5bee0a8740bf70dfd29d3ea43541',
'ext': 'flv',
'title': 'Магаззино: Казань 2',
'description': 'md5:99bccdfac2269f0e8fdbc4bbc9db184a',
'uploader': 'Магаззино',
'upload_date': '20170228',
'uploader_id': '996642',
},
'params': {
'skip_download': True,
},
'add_ie': [RutubeIE.ie_key()],
},
{
# ThePlatform embedded with whitespaces in URLs
'url': 'http://www.golfchannel.com/topics/shows/golftalkcentral.htm',
@@ -2480,6 +2515,12 @@ class GenericIE(InfoExtractor):
return _playlist_from_matches(
videopress_urls, ie=VideoPressIE.ie_key())
# Look for Rutube embeds
rutube_urls = RutubeIE._extract_urls(webpage)
if rutube_urls:
return _playlist_from_matches(
rutube_urls, ie=RutubeIE.ie_key())
# Looking for http://schema.org/VideoObject
json_ld = self._search_json_ld(
webpage, video_id, default={}, expected_type='VideoObject')
@@ -2509,7 +2550,11 @@ class GenericIE(InfoExtractor):
try:
jwplayer_data = self._parse_json(
jwplayer_data_str, video_id, transform_source=js_to_json)
return self._parse_jwplayer_data(jwplayer_data, video_id)
info = self._parse_jwplayer_data(
jwplayer_data, video_id, require_title=False)
if not info.get('title'):
info['title'] = video_title
return info
except ExtractorError:
pass

View File

@@ -36,7 +36,7 @@ class GoIE(AdobePassIE):
'requestor_id': 'DisneyXD',
}
}
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|season-\d+/\d+-(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys())
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|(?:[^/]+/)*(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys())
_TESTS = [{
'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx',
'info_dict': {
@@ -52,6 +52,12 @@ class GoIE(AdobePassIE):
}, {
'url': 'http://abc.go.com/shows/after-paradise/video/most-recent/vdka3335601',
'only_matching': True,
}, {
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
'only_matching': True,
}, {
'url': 'http://abc.go.com/shows/world-news-tonight/episode-guide/2017-02/17-021717-intense-stand-off-between-man-with-rifle-and-police-in-oakland',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -51,6 +51,7 @@ class MioMioIE(InfoExtractor):
'ext': 'mp4',
'title': 'マツコの知らない世界【劇的進化SPビニール傘冷凍食品2016】 1_2 - 16 05 31',
},
'skip': 'Unable to load videos',
}]
def _extract_mioplayer(self, webpage, video_id, title, http_headers):
@@ -94,9 +95,18 @@ class MioMioIE(InfoExtractor):
return entries
def _download_chinese_webpage(self, *args, **kwargs):
# Requests with English locales return garbage
headers = {
'Accept-Language': 'zh-TW,en-US;q=0.7,en;q=0.3',
}
kwargs.setdefault('headers', {}).update(headers)
return self._download_webpage(*args, **kwargs)
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage = self._download_chinese_webpage(
url, video_id)
title = self._html_search_meta(
'description', webpage, 'title', fatal=True)
@@ -106,7 +116,7 @@ class MioMioIE(InfoExtractor):
if '_h5' in mioplayer_path:
player_url = compat_urlparse.urljoin(url, mioplayer_path)
player_webpage = self._download_webpage(
player_webpage = self._download_chinese_webpage(
player_url, video_id,
note='Downloading player webpage', headers={'Referer': url})
entries = self._parse_html5_media_entries(player_url, player_webpage, video_id)

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import uuid
from .common import InfoExtractor
from .ooyala import OoyalaIE
from ..compat import (
compat_str,
compat_urllib_parse_urlencode,
@@ -24,6 +25,9 @@ class MiTeleBaseIE(InfoExtractor):
r'(?s)(<ms-video-player.+?</ms-video-player>)',
webpage, 'ms video player'))
video_id = player_data['data-media-id']
if player_data.get('data-cms-id') == 'ooyala':
return self.url_result(
'ooyala:%s' % video_id, ie=OoyalaIE.ie_key(), video_id=video_id)
config_url = compat_urlparse.urljoin(url, player_data['data-config'])
config = self._download_json(
config_url, video_id, 'Downloading config JSON')

View File

@@ -75,22 +75,40 @@ class OpenloadIE(InfoExtractor):
'<span[^>]+id="[^"]+"[^>]*>([0-9A-Za-z]+)</span>',
webpage, 'openload ID')
first_char = int(ol_id[0])
urlcode = []
num = 1
video_url_chars = []
while num < len(ol_id):
i = ord(ol_id[num])
key = 0
if i <= 90:
key = i - 65
elif i >= 97:
key = 25 + i - 97
urlcode.append((key, compat_chr(int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char)))
num += 5
first_char = ord(ol_id[0])
key = first_char - 50
maxKey = max(2, key)
key = min(maxKey, len(ol_id) - 22)
t = ol_id[key:key + 20]
video_url = 'https://openload.co/stream/' + ''.join(
[value for _, value in sorted(urlcode, key=lambda x: x[0])])
hashMap = {}
v = ol_id.replace(t, "")
h = 0
while h < len(t):
f = t[h:h + 2]
i = int(f, 16)
hashMap[h / 2] = i
h += 2
h = 0
while h < len(v):
B = v[h:h + 3]
i = int(B, 16)
if (h / 3) % 3 == 0:
i = int(B, 8)
index = (h / 3) % 10
A = hashMap[index]
i = i ^ 47
i = i ^ A
video_url_chars.append(compat_chr(i))
h += 3
video_url = 'https://openload.co/stream/%s?mime=true'
video_url = video_url % (''.join(video_url_chars))
title = self._og_search_title(webpage, default=None) or self._search_regex(
r'<span[^>]+class=["\']title["\'][^>]*>([^<]+)', webpage,

View File

@@ -300,6 +300,21 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
'skip_download': True,
},
},
{
# title in <h2 class="subtitle">
'url': 'http://www.prosieben.de/stars/oscar-award/videos/jetzt-erst-enthuellt-das-geheimnis-von-emma-stones-oscar-robe-clip',
'info_dict': {
'id': '4895826',
'ext': 'mp4',
'title': 'Jetzt erst enthüllt: Das Geheimnis von Emma Stones Oscar-Robe',
'description': 'md5:e5ace2bc43fadf7b63adc6187e9450b9',
'upload_date': '20170302',
},
'params': {
'skip_download': True,
},
'skip': 'geo restricted to Germany',
},
{
# geo restricted to Germany
'url': 'http://www.kabeleinsdoku.de/tv/mayday-alarm-im-cockpit/video/102-notlandung-im-hudson-river-ganze-folge',
@@ -338,6 +353,7 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
r'<header class="module_header">\s*<h2>([^<]+)</h2>\s*</header>',
r'<h2 class="video-title" itemprop="name">\s*(.+?)</h2>',
r'<div[^>]+id="veeseoTitle"[^>]*>(.+?)</div>',
r'<h2[^>]+class="subtitle"[^>]*>([^<]+)</h2>',
]
_DESCRIPTION_REGEXES = [
r'<p itemprop="description">\s*(.+?)</p>',
@@ -369,7 +385,9 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
def _extract_clip(self, url, webpage):
clip_id = self._html_search_regex(
self._CLIPID_REGEXES, webpage, 'clip id')
title = self._html_search_regex(self._TITLE_REGEXES, webpage, 'title')
title = self._html_search_regex(
self._TITLE_REGEXES, webpage, 'title',
default=None) or self._og_search_title(webpage)
info = self._extract_video_info(url, clip_id)
description = self._html_search_regex(
self._DESCRIPTION_REGEXES, webpage, 'description', default=None)

View File

@@ -0,0 +1,122 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import (
float_or_none,
int_or_none,
try_get,
# unified_timestamp,
ExtractorError,
)
class RedBullTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?redbull\.tv/(?:video|film)/(?P<id>AP-\w+)'
_TESTS = [{
# film
'url': 'https://www.redbull.tv/video/AP-1Q756YYX51W11/abc-of-wrc',
'md5': 'fb0445b98aa4394e504b413d98031d1f',
'info_dict': {
'id': 'AP-1Q756YYX51W11',
'ext': 'mp4',
'title': 'ABC of...WRC',
'description': 'md5:5c7ed8f4015c8492ecf64b6ab31e7d31',
'duration': 1582.04,
# 'timestamp': 1488405786,
# 'upload_date': '20170301',
},
}, {
# episode
'url': 'https://www.redbull.tv/video/AP-1PMT5JCWH1W11/grime?playlist=shows:shows-playall:web',
'info_dict': {
'id': 'AP-1PMT5JCWH1W11',
'ext': 'mp4',
'title': 'Grime - Hashtags S2 E4',
'description': 'md5:334b741c8c1ce65be057eab6773c1cf5',
'duration': 904.6,
# 'timestamp': 1487290093,
# 'upload_date': '20170217',
'series': 'Hashtags',
'season_number': 2,
'episode_number': 4,
},
}, {
'url': 'https://www.redbull.tv/film/AP-1MSKKF5T92111/in-motion',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
session = self._download_json(
'https://api-v2.redbull.tv/session', video_id,
note='Downloading access token', query={
'build': '4.370.0',
'category': 'personal_computer',
'os_version': '1.0',
'os_family': 'http',
})
if session.get('code') == 'error':
raise ExtractorError('%s said: %s' % (
self.IE_NAME, session['message']))
auth = '%s %s' % (session.get('token_type', 'Bearer'), session['access_token'])
try:
info = self._download_json(
'https://api-v2.redbull.tv/content/%s' % video_id,
video_id, note='Downloading video information',
headers={'Authorization': auth}
)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
error_message = self._parse_json(
e.cause.read().decode(), video_id)['message']
raise ExtractorError('%s said: %s' % (
self.IE_NAME, error_message), expected=True)
raise
video = info['video_product']
title = info['title'].strip()
formats = self._extract_m3u8_formats(
video['url'], video_id, 'mp4', 'm3u8_native')
self._sort_formats(formats)
subtitles = {}
for _, captions in (try_get(
video, lambda x: x['attachments']['captions'],
dict) or {}).items():
if not captions or not isinstance(captions, list):
continue
for caption in captions:
caption_url = caption.get('url')
if not caption_url:
continue
ext = caption.get('format')
if ext == 'xml':
ext = 'ttml'
subtitles.setdefault(caption.get('lang') or 'en', []).append({
'url': caption_url,
'ext': ext,
})
subheading = info.get('subheading')
if subheading:
title += ' - %s' % subheading
return {
'id': video_id,
'title': title,
'description': info.get('long_description') or info.get(
'short_description'),
'duration': float_or_none(video.get('duration'), scale=1000),
# 'timestamp': unified_timestamp(info.get('published')),
'series': info.get('show_title'),
'season_number': int_or_none(info.get('season_number')),
'episode_number': int_or_none(info.get('episode_number')),
'formats': formats,
'subtitles': subtitles,
}

View File

@@ -17,7 +17,7 @@ from ..utils import (
class RutubeIE(InfoExtractor):
IE_NAME = 'rutube'
IE_DESC = 'Rutube videos'
_VALID_URL = r'https?://rutube\.ru/(?:video|play/embed)/(?P<id>[\da-z]{32})'
_VALID_URL = r'https?://rutube\.ru/(?:video|(?:play/)?embed)/(?P<id>[\da-z]{32})'
_TESTS = [{
'url': 'http://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/',
@@ -39,8 +39,17 @@ class RutubeIE(InfoExtractor):
}, {
'url': 'http://rutube.ru/play/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True,
}, {
'url': 'http://rutube.ru/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return [mobj.group('url') for mobj in re.finditer(
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//rutube\.ru/embed/[\da-z]{32}.*?)\1',
webpage)]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
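
The new _extract_urls helper is what the generic extractor calls to find Rutube embeds. A quick standalone check of the iframe pattern, with a made-up HTML snippet around the embed id from the magazzino.friday.ru test earlier in this diff:

import re

RUTUBE_EMBED_RE = (
    r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//rutube\.ru/embed/[\da-z]{32}.*?)\1')

html = '<iframe src="//rutube.ru/embed/9b3d5bee0a8740bf70dfd29d3ea43541" frameborder="0"></iframe>'
print([m.group('url') for m in re.finditer(RUTUBE_EMBED_RE, html)])
# ['//rutube.ru/embed/9b3d5bee0a8740bf70dfd29d3ea43541']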

View File

@@ -82,6 +82,9 @@ class RuutuIE(InfoExtractor):
formats.extend(self._extract_f4m_formats(
video_url, video_id, f4m_id='hds', fatal=False))
elif ext == 'mpd':
# video-only and audio-only streams are of different
# duration resulting in out of sync issue
continue
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:

View File

@@ -121,7 +121,7 @@ class SoundcloudIE(InfoExtractor):
},
]
_CLIENT_ID = 'fDoItMDbsbZz8dY16ZzARCZmzgHBPotA'
_CLIENT_ID = '2t9loNQH90kzJcsFCODdigxfp325aq4z'
_IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'
@staticmethod

View File

@@ -65,7 +65,7 @@ class StreamableIE(InfoExtractor):
# to return video info like the title properly sometimes, and doesn't
# include info like the video duration
video = self._download_json(
'https://streamable.com/ajax/videos/%s' % video_id, video_id)
'https://ajax.streamable.com/videos/%s' % video_id, video_id)
# Format IDs:
# 0 The video is being uploaded

View File

@@ -44,6 +44,10 @@ class TelecincoIE(MiTeleBaseIE):
}, {
'url': 'http://www.telecinco.es/espanasinirmaslejos/Espana-gran-destino-turistico_2_1240605043.html',
'only_matching': True,
}, {
# ooyala video
'url': 'http://www.cuatro.com/chesterinlove/a-carta/chester-chester_in_love-chester_edu_2_2331030022.html',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -2,15 +2,17 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
smuggle_url,
try_get,
)
class TeleQuebecIE(InfoExtractor):
_VALID_URL = r'https?://zonevideo\.telequebec\.tv/media/(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://zonevideo.telequebec.tv/media/20984/le-couronnement-de-new-york/couronnement-de-new-york',
'md5': 'fe95a0957e5707b1b01f5013e725c90f',
'info_dict': {
@@ -18,10 +20,14 @@ class TeleQuebecIE(InfoExtractor):
'ext': 'mp4',
'title': 'Le couronnement de New York',
'description': 'md5:f5b3d27a689ec6c1486132b2d687d432',
'upload_date': '20160220',
'timestamp': 1455965438,
'upload_date': '20170201',
'timestamp': 1485972222,
}
}
}, {
# no description
'url': 'http://zonevideo.telequebec.tv/media/30261',
'only_matching': True,
}]
def _real_extract(self, url):
media_id = self._match_id(url)
@@ -31,9 +37,13 @@ class TeleQuebecIE(InfoExtractor):
return {
'_type': 'url_transparent',
'id': media_id,
'url': smuggle_url('limelight:media:' + media_data['streamInfo']['sourceId'], {'geo_countries': ['CA']}),
'url': smuggle_url(
'limelight:media:' + media_data['streamInfo']['sourceId'],
{'geo_countries': ['CA']}),
'title': media_data['title'],
'description': media_data.get('descriptions', [{'text': None}])[0].get('text'),
'duration': int_or_none(media_data.get('durationInMilliseconds'), 1000),
'description': try_get(
media_data, lambda x: x['descriptions'][0]['text'], compat_str),
'duration': int_or_none(
media_data.get('durationInMilliseconds'), 1000),
'ie_key': 'LimelightMedia',
}

View File

@@ -0,0 +1,90 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
try_get,
unified_timestamp,
)
class TunePkIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
(?:www\.)?tune\.pk/(?:video/|player/embed_player.php?.*?\bvid=)|
embed\.tune\.pk/play/
)
(?P<id>\d+)
'''
_TESTS = [{
'url': 'https://tune.pk/video/6919541/maudie-2017-international-trailer-1-ft-ethan-hawke-sally-hawkins',
'md5': '0c537163b7f6f97da3c5dd1e3ef6dd55',
'info_dict': {
'id': '6919541',
'ext': 'mp4',
'title': 'Maudie (2017) | International Trailer # 1 ft Ethan Hawke, Sally Hawkins',
'description': 'md5:eb5a04114fafef5cec90799a93a2d09c',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1487327564,
'upload_date': '20170217',
'uploader': 'Movie Trailers',
'duration': 107,
'view_count': int,
}
}, {
'url': 'https://tune.pk/player/embed_player.php?vid=6919541&folder=2017/02/17/&width=600&height=350&autoplay=no',
'only_matching': True,
}, {
'url': 'https://embed.tune.pk/play/6919541?autoplay=no&ssl=yes&inline=true',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'https://tune.pk/video/%s' % video_id, video_id)
details = self._parse_json(
self._search_regex(
r'new\s+TunePlayer\(({.+?})\)\s*;\s*\n', webpage, 'tune player'),
video_id)['details']
video = details['video']
title = video.get('title') or self._og_search_title(
webpage, default=None) or self._html_search_meta(
'title', webpage, 'title', fatal=True)
formats = self._parse_jwplayer_formats(
details['player']['sources'], video_id)
self._sort_formats(formats)
description = self._og_search_description(
webpage, default=None) or self._html_search_meta(
'description', webpage, 'description')
thumbnail = video.get('thumb') or self._og_search_thumbnail(
webpage, default=None) or self._html_search_meta(
'thumbnail', webpage, 'thumbnail')
timestamp = unified_timestamp(video.get('date_added'))
uploader = try_get(
video, lambda x: x['uploader']['name'],
compat_str) or self._html_search_meta('author', webpage, 'author')
duration = int_or_none(video.get('duration'))
view_count = int_or_none(video.get('views'))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'uploader': uploader,
'duration': duration,
'view_count': view_count,
'formats': formats,
}

View File

@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
parse_iso8601,
@@ -12,7 +14,7 @@ from ..utils import (
class TwentyFourVideoIE(InfoExtractor):
IE_NAME = '24video'
_VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx|sex|tube)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_VALID_URL = r'https?://(?P<host>(?:www\.)?24video\.(?:net|me|xxx|sex|tube))/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.24video.net/video/view/1044982',
@@ -43,10 +45,12 @@ class TwentyFourVideoIE(InfoExtractor):
}]
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
host = mobj.group('host')
webpage = self._download_webpage(
'http://www.24video.sex/video/view/%s' % video_id, video_id)
'http://%s/video/view/%s' % (host, video_id), video_id)
title = self._og_search_title(webpage)
description = self._html_search_regex(
@@ -72,11 +76,11 @@ class TwentyFourVideoIE(InfoExtractor):
# Sets some cookies
self._download_xml(
r'http://www.24video.sex/video/xml/%s?mode=init' % video_id,
r'http://%s/video/xml/%s?mode=init' % (host, video_id),
video_id, 'Downloading init XML')
video_xml = self._download_xml(
'http://www.24video.sex/video/xml/%s?mode=play' % video_id,
'http://%s/video/xml/%s?mode=play' % (host, video_id),
video_id, 'Downloading video XML')
video = xpath_element(video_xml, './/video', 'video', fatal=True)

View File

@@ -12,7 +12,6 @@ from ..compat import (
compat_str,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse,
compat_urlparse,
)
from ..utils import (
clean_html,
@@ -24,6 +23,7 @@ from ..utils import (
parse_iso8601,
update_url_query,
urlencode_postdata,
urljoin,
)
@@ -32,7 +32,7 @@ class TwitchBaseIE(InfoExtractor):
_API_BASE = 'https://api.twitch.tv'
_USHER_BASE = 'https://usher.ttvnw.net'
_LOGIN_URL = 'http://www.twitch.tv/login'
_LOGIN_URL = 'https://www.twitch.tv/login'
_CLIENT_ID = 'jzkbprff40iqj646a697cyrvl0zt2m6'
_NETRC_MACHINE = 'twitch'
@@ -64,6 +64,35 @@ class TwitchBaseIE(InfoExtractor):
raise ExtractorError(
'Unable to login. Twitch said: %s' % message, expected=True)
def login_step(page, urlh, note, data):
form = self._hidden_inputs(page)
form.update(data)
page_url = urlh.geturl()
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', page,
'post url', default=page_url, group='url')
post_url = urljoin(page_url, post_url)
headers = {'Referer': page_url}
try:
response = self._download_json(
post_url, None, note,
data=urlencode_postdata(form),
headers=headers)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
response = self._parse_json(
e.cause.read().decode('utf-8'), None)
fail(response['message'])
raise
redirect_url = urljoin(post_url, response['redirect'])
return self._download_webpage_handle(
redirect_url, None, 'Downloading login redirect page',
headers=headers)
login_page, handle = self._download_webpage_handle(
self._LOGIN_URL, None, 'Downloading login page')
@@ -71,40 +100,19 @@ class TwitchBaseIE(InfoExtractor):
if 'blacklist_message' in login_page:
fail(clean_html(login_page))
login_form = self._hidden_inputs(login_page)
redirect_page, handle = login_step(
login_page, handle, 'Logging in as %s' % username, {
'username': username,
'password': password,
})
login_form.update({
'username': username,
'password': password,
})
redirect_url = handle.geturl()
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page,
'post url', default=redirect_url, group='url')
if not post_url.startswith('http'):
post_url = compat_urlparse.urljoin(redirect_url, post_url)
headers = {'Referer': redirect_url}
try:
response = self._download_json(
post_url, None, 'Logging in as %s' % username,
data=urlencode_postdata(login_form),
headers=headers)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
response = self._parse_json(
e.cause.read().decode('utf-8'), None)
fail(response['message'])
raise
if response.get('redirect'):
self._download_webpage(
response['redirect'], None, 'Downloading login redirect page',
headers=headers)
if re.search(r'(?i)<form[^>]+id="two-factor-submit"', redirect_page) is not None:
# TODO: Add mechanism to request an SMS or phone call
tfa_token = self._get_tfa_info('two-factor authentication token')
login_step(redirect_page, handle, 'Submitting TFA token', {
'authy_token': tfa_token,
'remember_2fa': 'true',
})
def _prefer_source(self, formats):
try:

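Inside the new login_step helper, the form action is resolved against the page URL with the urljoin helper from utils rather than the old startswith('http') check, and the same helper is reused for the two-factor submit. A standalone sketch of just that resolution step, using an invented form fragment (real Twitch login pages are more complex):

    import re
    from youtube_dl.utils import urljoin

    page_url = 'https://www.twitch.tv/login'
    # Invented markup for illustration only.
    page = '<form method="post" action="/sessions/new">...</form>'

    post_url = re.search(
        r'<form[^>]+action=(["\'])(?P<url>.+?)\1', page).group('url')
    post_url = urljoin(page_url, post_url)
    # -> 'https://www.twitch.tv/sessions/new'; the POST is then sent with a
    # Referer header pointing back at page_url, as in the diff above.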

@@ -9,7 +9,7 @@ from .common import InfoExtractor
class VierIE(InfoExtractor):
IE_NAME = 'vier'
_VALID_URL = r'https?://(?:www\.)?vier\.be/(?:[^/]+/videos/(?P<display_id>[^/]+)(?:/(?P<id>\d+))?|video/v3/embed/(?P<embed_id>\d+))'
_VALID_URL = r'https?://(?:www\.)?(?P<site>vier|vijf)\.be/(?:[^/]+/videos/(?P<display_id>[^/]+)(?:/(?P<id>\d+))?|video/v3/embed/(?P<embed_id>\d+))'
_TESTS = [{
'url': 'http://www.vier.be/planb/videos/het-wordt-warm-de-moestuin/16129',
'info_dict': {
@@ -23,6 +23,19 @@ class VierIE(InfoExtractor):
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://www.vijf.be/temptationisland/videos/zo-grappig-temptation-island-hosts-moeten-kiezen-tussen-onmogelijke-dilemmas/2561614',
'info_dict': {
'id': '2561614',
'display_id': 'zo-grappig-temptation-island-hosts-moeten-kiezen-tussen-onmogelijke-dilemmas',
'ext': 'mp4',
'title': 'ZO grappig: Temptation Island hosts moeten kiezen tussen onmogelijke dilemma\'s',
'description': 'Het spel is simpel: Annelien Coorevits en Rick Brandsteder krijgen telkens 2 dilemma\'s voorgeschoteld en ze MOETEN een keuze maken.',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://www.vier.be/planb/videos/mieren-herders-van-de-bladluizen',
'only_matching': True,
@@ -35,6 +48,7 @@ class VierIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url)
embed_id = mobj.group('embed_id')
display_id = mobj.group('display_id') or embed_id
site = mobj.group('site')
webpage = self._download_webpage(url, display_id)
@@ -43,7 +57,7 @@ class VierIE(InfoExtractor):
webpage, 'video id')
application = self._search_regex(
[r'data-application="([^"]+)"', r'"application"\s*:\s*"([^"]+)"'],
webpage, 'application', default='vier_vod')
webpage, 'application', default=site + '_vod')
filename = self._search_regex(
[r'data-filename="([^"]+)"', r'"filename"\s*:\s*"([^"]+)"'],
webpage, 'filename')
@@ -68,13 +82,19 @@ class VierIE(InfoExtractor):
class VierVideosIE(InfoExtractor):
IE_NAME = 'vier:videos'
_VALID_URL = r'https?://(?:www\.)?vier\.be/(?P<program>[^/]+)/videos(?:\?.*\bpage=(?P<page>\d+)|$)'
_VALID_URL = r'https?://(?:www\.)?(?P<site>vier|vijf)\.be/(?P<program>[^/]+)/videos(?:\?.*\bpage=(?P<page>\d+)|$)'
_TESTS = [{
'url': 'http://www.vier.be/demoestuin/videos',
'info_dict': {
'id': 'demoestuin',
},
'playlist_mincount': 153,
}, {
'url': 'http://www.vijf.be/temptationisland/videos',
'info_dict': {
'id': 'temptationisland',
},
'playlist_mincount': 159,
}, {
'url': 'http://www.vier.be/demoestuin/videos?page=6',
'info_dict': {
@@ -92,6 +112,7 @@ class VierVideosIE(InfoExtractor):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
program = mobj.group('program')
site = mobj.group('site')
page_id = mobj.group('page')
if page_id:
@@ -105,13 +126,13 @@ class VierVideosIE(InfoExtractor):
entries = []
for current_page_id in itertools.count(start_page):
current_page = self._download_webpage(
'http://www.vier.be/%s/videos?page=%d' % (program, current_page_id),
'http://www.%s.be/%s/videos?page=%d' % (site, program, current_page_id),
program,
'Downloading page %d' % (current_page_id + 1))
page_entries = [
self.url_result('http://www.vier.be' + video_url, 'Vier')
self.url_result('http://www.' + site + '.be' + video_url, 'Vier')
for video_url in re.findall(
r'<h3><a href="(/[^/]+/videos/[^/]+(?:/\d+)?)">', current_page)]
r'<h[23]><a href="(/[^/]+/videos/[^/]+(?:/\d+)?)">', current_page)]
entries.extend(page_entries)
if page_id or '>Meer<' not in current_page:
break

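Because the site name is now a capture group, one extractor serves both vier.be and vijf.be: the captured value feeds the page URLs, the url_result links and the default application name. A standalone sketch using the URL from the new test:

    import re

    VALID_URL = r'https?://(?:www\.)?(?P<site>vier|vijf)\.be/(?:[^/]+/videos/(?P<display_id>[^/]+)(?:/(?P<id>\d+))?|video/v3/embed/(?P<embed_id>\d+))'

    url = ('http://www.vijf.be/temptationisland/videos/'
           'zo-grappig-temptation-island-hosts-moeten-kiezen-tussen-onmogelijke-dilemmas/2561614')
    mobj = re.match(VALID_URL, url)
    site = mobj.group('site')            # 'vijf'
    application = site + '_vod'          # default no longer hard-coded to 'vier_vod'
    page_url = 'http://www.%s.be/%s/videos?page=%d' % (site, 'temptationisland', 1)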

@@ -0,0 +1,80 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .brightcove import BrightcoveNewIE
from ..utils import (
int_or_none,
parse_age_limit,
smuggle_url,
unescapeHTML,
)
class VrakIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vrak\.tv/videos\?.*?\btarget=(?P<id>[\d.]+)'
_TEST = {
'url': 'http://www.vrak.tv/videos?target=1.2306782&filtre=emission&id=1.1806721',
'info_dict': {
'id': '5345661243001',
'ext': 'mp4',
'title': 'Obésité, film de hockey et Roseline Filion',
'timestamp': 1488492126,
'upload_date': '20170302',
'uploader_id': '2890187628001',
'creator': 'VRAK.TV',
'age_limit': 8,
'series': 'ALT (Actualité Légèrement Tordue)',
'episode': 'Obésité, film de hockey et Roseline Filion',
'tags': list,
},
'params': {
'skip_download': True,
},
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/2890187628001/default_default/index.html?videoId=%s'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(
r'<h\d\b[^>]+\bclass=["\']videoTitle["\'][^>]*>([^<]+)',
webpage, 'title', default=None) or self._og_search_title(webpage)
content = self._parse_json(
self._search_regex(
r'data-player-options-content=(["\'])(?P<content>{.+?})\1',
webpage, 'content', default='{}', group='content'),
video_id, transform_source=unescapeHTML)
ref_id = content.get('refId') or self._search_regex(
r'refId&quot;:&quot;([^&]+)&quot;', webpage, 'ref id')
brightcove_id = self._search_regex(
r'''(?x)
java\.lang\.String\s+value\s*=\s*["']brightcove\.article\.\d+\.%s
[^>]*
java\.lang\.String\s+value\s*=\s*["'](\d+)
''' % re.escape(ref_id), webpage, 'brightcove id')
return {
'_type': 'url_transparent',
'ie_key': BrightcoveNewIE.ie_key(),
'url': smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
{'geo_countries': ['CA']}),
'id': brightcove_id,
'description': content.get('description'),
'creator': content.get('brand'),
'age_limit': parse_age_limit(content.get('rating')),
'series': content.get('showName') or content.get(
'episodeName'), # this is intentional
'season_number': int_or_none(content.get('seasonNumber')),
'episode': title,
'episode_number': int_or_none(content.get('episodeNumber')),
'tags': content.get('tags', []),
}

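The player options sit in an HTML-escaped JSON attribute, which is why the extractor parses them with transform_source=unescapeHTML before falling back to the refId regex. A hedged, standalone illustration with a simplified, made-up attribute value (real vrak.tv markup may differ):

    import json
    import re
    from youtube_dl.utils import unescapeHTML

    webpage = ('<div data-player-options-content='
               '"{&quot;refId&quot;:&quot;abc123&quot;,&quot;brand&quot;:&quot;VRAK.TV&quot;}"></div>')
    raw = re.search(
        r'data-player-options-content=(["\'])(?P<content>{.+?})\1',
        webpage).group('content')
    content = json.loads(unescapeHTML(raw))
    # {'refId': 'abc123', 'brand': 'VRAK.TV'}; the refId is then looked up in
    # the page to find the numeric Brightcove id that gets smuggled to
    # BrightcoveNewIE together with the geo_countries hint.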

@@ -19,9 +19,10 @@ class WDRBaseIE(InfoExtractor):
def _extract_wdr_video(self, webpage, display_id):
# for wdr.de the data-extension is in a tag with the class "mediaLink"
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
# for wdrmaus its in a link to the page in a multiline "videoLink"-tag
# for wdrmaus, in a tag with the class "videoButton" (previously a link
# to the page in a multiline "videoLink"-tag)
json_metadata = self._html_search_regex(
r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
webpage, 'media link', default=None, flags=re.MULTILINE)
if not json_metadata:
@@ -32,7 +33,7 @@ class WDRBaseIE(InfoExtractor):
jsonp_url = media_link_obj['mediaObj']['url']
metadata = self._download_json(
jsonp_url, 'metadata', transform_source=strip_jsonp)
jsonp_url, display_id, transform_source=strip_jsonp)
metadata_tracker_data = metadata['trackerData']
metadata_media_resource = metadata['mediaResource']
@@ -161,23 +162,23 @@ class WDRIE(WDRBaseIE):
{
'url': 'http://www.wdrmaus.de/aktuelle-sendung/index.php5',
'info_dict': {
'id': 'mdb-1096487',
'ext': 'flv',
'id': 'mdb-1323501',
'ext': 'mp4',
'upload_date': 're:^[0-9]{8}$',
'title': 're:^Die Sendung mit der Maus vom [0-9.]{10}$',
'description': '- Die Sendung mit der Maus -',
'description': 'Die Seite mit der Maus -',
},
'skip': 'The id changes from week to week because of the new episode'
},
{
'url': 'http://www.wdrmaus.de/sachgeschichten/sachgeschichten/achterbahn.php5',
'url': 'http://www.wdrmaus.de/filme/sachgeschichten/achterbahn.php5',
'md5': '803138901f6368ee497b4d195bb164f2',
'info_dict': {
'id': 'mdb-186083',
'ext': 'mp4',
'upload_date': '20130919',
'title': 'Sachgeschichte - Achterbahn ',
'description': '- Die Sendung mit der Maus -',
'description': 'Die Seite mit der Maus -',
},
},
{
@@ -186,7 +187,7 @@ class WDRIE(WDRBaseIE):
'info_dict': {
'id': 'mdb-869971',
'ext': 'flv',
'title': 'Funkhaus Europa Livestream',
'title': 'COSMO Livestream',
'description': 'md5:2309992a6716c347891c045be50992e4',
'upload_date': '20160101',
},

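The extended media-link regex now also accepts the wdrmaus "videoButton" markup alongside the existing classes. A quick check against invented markup fragments (real WDR pages are more involved):

    import re

    MEDIA_LINK_RE = (r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b[^"]*"[^>]+'
                     r'|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"')

    samples = [
        '<a class="mediaLink" href="#" data-extension="{&quot;mediaObj&quot;:{}}">',
        '<button class="videoButton" data-extension="{&quot;mediaObj&quot;:{}}">',
    ]
    for fragment in samples:
        m = re.search(MEDIA_LINK_RE, fragment, flags=re.MULTILINE)
        print(bool(m), m.group(1) if m else None)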

@@ -773,7 +773,7 @@ def parseOpts(overrideArguments=None):
help='Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
postproc.add_option(
'--audio-format', metavar='FORMAT', dest='audioformat', default='best',
help='Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default; No effect without -x')
help='Specify audio format: "best", "aac", "flac", "mp3", "m4a", "opus", "vorbis", or "wav"; "%default" by default; No effect without -x')
postproc.add_option(
'--audio-quality', metavar='QUALITY',
dest='audioquality', default='5',

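With flac listed in the --audio-format help and wired into the postprocessor, lossless extraction is requested the same way as the other codecs (on the command line: -x --audio-format flac). A sketch of the equivalent embedded use; the URL is just a placeholder:

    import youtube_dl

    ydl_opts = {
        'format': 'bestaudio/best',
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'flac',
        }],
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])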

@@ -26,15 +26,25 @@ from ..utils import (
EXT_TO_OUT_FORMATS = {
"aac": "adts",
"m4a": "ipod",
"mka": "matroska",
"mkv": "matroska",
"mpg": "mpeg",
"ogv": "ogg",
"ts": "mpegts",
"wma": "asf",
"wmv": "asf",
'aac': 'adts',
'flac': 'flac',
'm4a': 'ipod',
'mka': 'matroska',
'mkv': 'matroska',
'mpg': 'mpeg',
'ogv': 'ogg',
'ts': 'mpegts',
'wma': 'asf',
'wmv': 'asf',
}
ACODECS = {
'mp3': 'libmp3lame',
'aac': 'aac',
'flac': 'flac',
'm4a': 'aac',
'opus': 'opus',
'vorbis': 'libvorbis',
'wav': None,
}
@@ -237,7 +247,7 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
acodec = 'copy'
extension = 'm4a'
more_opts = ['-bsf:a', 'aac_adtstoasc']
elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
elif filecodec in ['aac', 'flac', 'mp3', 'vorbis', 'opus']:
# Lossless if possible
acodec = 'copy'
extension = filecodec
@@ -256,8 +266,8 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
else:
more_opts += ['-b:a', self._preferredquality + 'k']
else:
# We convert the audio (lossy)
acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
# We convert the audio (lossy if codec is lossy)
acodec = ACODECS[self._preferredcodec]
extension = self._preferredcodec
more_opts = []
if self._preferredquality is not None:

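Lifting the codec table into the module-level ACODECS dict replaces the inline mapping in the lossy-conversion branch and is where the new flac entry lives. A standalone sketch of the lookup (table copied from the diff; the ffmpeg flag handling is summarised in comments):

    ACODECS = {
        'mp3': 'libmp3lame',
        'aac': 'aac',
        'flac': 'flac',
        'm4a': 'aac',
        'opus': 'opus',
        'vorbis': 'libvorbis',
        'wav': None,
    }

    preferredcodec = 'flac'
    acodec = ACODECS[preferredcodec]   # 'flac' -> passed to ffmpeg as '-acodec flac'
    extension = preferredcodec         # output file ends up as .flac
    # 'wav' maps to None: no '-acodec' is passed and ffmpeg falls back to the
    # container default (PCM) for .wav output.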

@@ -1748,11 +1748,16 @@ def base_url(url):
def urljoin(base, path):
if isinstance(path, bytes):
path = path.decode('utf-8')
if not isinstance(path, compat_str) or not path:
return None
if re.match(r'^(?:https?:)?//', path):
return path
if not isinstance(base, compat_str) or not re.match(r'^(?:https?:)?//', base):
if isinstance(base, bytes):
base = base.decode('utf-8')
if not isinstance(base, compat_str) or not re.match(
r'^(?:https?:)?//', base):
return None
return compat_urlparse.urljoin(base, path)

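The helper now also tolerates a bytes base (decoding it before validation) while still refusing anything that does not look like an HTTP(S) URL. A few illustrative calls (example URLs are made up); the expected results follow directly from the code above:

    from youtube_dl.utils import urljoin

    urljoin('https://example.com/a/', 'b/c')                 # 'https://example.com/a/b/c'
    urljoin(b'https://example.com/a/', 'b/c')                # bytes base is decoded first, same result
    urljoin('https://example.com/', '//cdn.example.com/x')   # protocol-relative path returned unchanged
    urljoin('not a url', 'b/c')                              # None: base fails the URL check
    urljoin('https://example.com/', None)                    # None: unusable path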

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2017.03.02'
__version__ = '2017.03.16'