Compare commits

38 Commits

Author SHA1 Message Date
3255fe7141 release 2015.02.09.2 2015-02-09 14:46:30 +01:00
e98b8e79ea [generic] Improve SBS detection (Fixes #4899) 2015-02-09 14:46:10 +01:00
196121c51b release 2015.02.09.1 2015-02-09 10:49:10 +01:00
5269028951 [rtlnow] Add test for @mmue's extension (#4908) 2015-02-09 10:47:19 +01:00
f7bc056b5a Merge remote-tracking branch 'mmue/fix-rtlnow' 2015-02-09 10:44:55 +01:00
a0f7198544 [generic] Add support for jwPlayer YouTube videos
This makes nationalarchives.gov.uk work (Fixes #4907, fixes #4876)
2015-02-09 10:43:01 +01:00
dd8930684e release 2015.02.09 2015-02-09 10:28:16 +01:00
bdb186f3b0 fix rtlnow for newer series like "Der Bachelor" season 5 2015-02-08 21:55:39 +01:00
64f9baa084 [options] Mention asr as possible filter 2015-02-09 01:35:16 +06:00
b29231c040 release 2015.02.08 2015-02-08 20:28:38 +01:00
6128bf07a9 [options] Update help on string comparisons 2015-02-09 01:27:27 +06:00
2ec19e9558 [YoutubeDL] Allow filtering by audio sampling rate 2015-02-09 01:09:45 +06:00
9ddb6925bf [YoutubeDL] Allow filtering by string properties (#4906) 2015-02-09 01:07:43 +06:00
12931e1c6e Credit @robin007bond for tweakers (#4881) and gamekings fixes (#4901) 2015-02-08 23:33:29 +06:00
41c23b0da5 [gamekings] Support videos from news pages 2015-02-08 23:12:59 +06:00
2578ab19e4 Merge branch 'robin007bond-gamekings' 2015-02-08 23:03:31 +06:00
d87ec897e9 [gamekings] Improve extraction 2015-02-08 23:03:12 +06:00
3bd4bffb1c Merge branch 'gamekings' of https://github.com/robin007bond/youtube-dl into robin007bond-gamekings 2015-02-08 22:46:43 +06:00
c36b09a502 [Gamekings] Use thumbnail in return statement 2015-02-08 16:46:13 +01:00
641eb10d34 Use _family_friendly_search for determining age_limit 2015-02-08 17:45:38 +02:00
955c5505e7 [Gamekings] Use xpath
XPath is used for extracting the video url and the thumbnail
2015-02-08 16:44:25 +01:00
69319969de [extractor/common] Add new helper method _family_friendly_search 2015-02-08 17:39:00 +02:00
a14292e848 [soulanime] Remove extractor (#4554)
Was supposed to be deleted by 67c2bcd
2015-02-08 16:57:07 +02:00
5d678df64a [Gamekings] Download playlist
Todo: URL and Thumbnail should be extracted with XPath
2015-02-08 15:34:37 +01:00
8ca8cbe2bd [Gamekings] Check string for vimeo, fix test
The test now doesn't fail anymore. It just checks the string for having
"vimeo" in it, instead of using the method for URL-checking, since it
returns an error.

The tests don't fail, and the extractor works fine now.
2015-02-08 14:41:14 +01:00
ba322d8209 [Gamekings] Added test and replaced video_url
Quick and dirty fix for the Gamekings extractor. It gives an error about
the video_url, but it downloads it now instead of giving a 404 error on
newer Gamekings videos
2015-02-08 14:23:37 +01:00
2f38289b79 [Gamekings] Fix order of replacement string
Oops.
2015-02-08 13:49:32 +01:00
f23a3ca699 [Gamekings] Fixed typo in URL replacement 2015-02-08 13:47:27 +01:00
77d2b106cc [Gamekings] Fix 404 when large isn't available
When trying to download some GameKings videos, not all worked. This was
because not all videos had a "/large"-URL available. The extractor
checks now if the /large URL is available, if it isn't, it tries to get
the normal URL.
2015-02-08 13:42:41 +01:00
c0e46412e9 [aparat] Fix extraction (Closes #4897) 2015-02-08 17:30:29 +06:00
0161353d7d [test/test_YoutubeDL] Remove debug print call 2015-02-06 23:58:01 +01:00
2b4ecde2c8 [test/YoutubeDL] Add a simple test for postprocessors
Just checks that the 'keepvideo' option works as intended.
2015-02-06 23:54:25 +01:00
b3a286d69d [YoutubeDL] _calc_cookies: add get_header method to _PseudoRequest (#4861) 2015-02-06 22:23:06 +01:00
467d3c9a0c [ffmpeg] --extract-audio: Use the same options for avconv and ffmpeg
They have been available in ffmpeg since version 0.9, and we require 1.0 or higher.
2015-02-06 22:05:11 +01:00
ad5747bad1 [rtp] Construct regular HTTP download URLs (#4882) 2015-02-06 23:00:54 +02:00
d6eb66ed3c [aftenposten] Add extractor (Closes #4863) 2015-02-07 01:46:54 +06:00
7f2a9f1b49 [tvigle] Add support for cloud URLs (Closes #4887) 2015-02-06 21:15:01 +06:00
1e1896f2de [extractor/common] Correct sort order.
We should look at height and width before ext_preference.
2015-02-06 15:16:45 +01:00
21 changed files with 357 additions and 164 deletions

@@ -109,3 +109,4 @@ David Luhmer
 Shaya Goldberg
 Paul Hartmann
 Frans de Jonge
+Robin de Rooij

@@ -292,18 +292,20 @@ which means you can modify it, redistribute it or use it however you like.
 video results by putting a condition in
 brackets, as in -f "best[height=720]" (or
 -f "[filesize>10M]"). This works for
-filesize, height, width, tbr, abr, vbr, and
-fps and the comparisons <, <=, >, >=, =, !=
-. Formats for which the value is not known
-are excluded unless you put a question mark
-(?) after the operator. You can combine
-format filters, so -f "[height <=?
-720][tbr>500]" selects up to 720p videos
-(or videos where the height is not known)
-with a bitrate of at least 500 KBit/s. By
-default, youtube-dl will pick the best
-quality. Use commas to download multiple
-audio formats, such as -f
+filesize, height, width, tbr, abr, vbr,
+asr, and fps and the comparisons <, <=, >,
+>=, =, != and for ext, acodec, vcodec,
+container, and protocol and the comparisons
+=, != . Formats for which the value is not
+known are excluded unless you put a
+question mark (?) after the operator. You
+can combine format filters, so -f "[height
+<=? 720][tbr>500]" selects up to 720p
+videos (or videos where the height is not
+known) with a bitrate of at least 500
+KBit/s. By default, youtube-dl will pick
+the best quality. Use commas to download
+multiple audio formats, such as -f
 136/137/mp4/bestvideo,140/m4a/bestaudio.
 You can merge the video and audio of two
 formats into a single file using -f <video-
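
The bracketed numeric filters described in this help text can be illustrated with a small standalone sketch. It is not youtube-dl's implementation: the names `parse_filters` and `apply_filters` and the sample format list are assumptions made for illustration, and size suffixes such as `10M` are deliberately not handled.

```python
import operator
import re

# Illustrative only: these names are hypothetical, not youtube-dl's API.
NUMERIC_OPS = {
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    '=': operator.eq, '!=': operator.ne,
}

# One "[key op value]" group; a "?" after the operator admits unknown values.
FILTER_RE = re.compile(r'''(?x)
    \[\s*(?P<key>\w+)\s*
    (?P<op><=|>=|!=|<|>|=)
    (?P<unknown_ok>\s*\?)?\s*
    (?P<value>[0-9.]+)\s*\]
''')


def parse_filters(spec):
    """Return (key, op, unknown_ok, value) for each bracketed filter."""
    return [
        (m.group('key'), NUMERIC_OPS[m.group('op')],
         bool(m.group('unknown_ok')), float(m.group('value')))
        for m in FILTER_RE.finditer(spec)
    ]


def apply_filters(formats, spec):
    """Keep only the formats that satisfy every filter in spec."""
    filters = parse_filters(spec)

    def keep(fmt):
        for key, op, unknown_ok, value in filters:
            actual = fmt.get(key)
            if actual is None:
                # Unknown values are excluded unless "?" was given.
                if not unknown_ok:
                    return False
                continue
            if not op(actual, value):
                return False
        return True

    return [f for f in formats if keep(f)]


# Made-up sample data.
formats = [
    {'format_id': 'a', 'height': 1080, 'tbr': 4000},
    {'format_id': 'b', 'height': 720, 'tbr': 1500},
    {'format_id': 'c', 'tbr': 800},  # height unknown
    {'format_id': 'd', 'height': 480, 'tbr': 300},
]
selected = apply_filters(formats, '[height <=? 720][tbr>500]')
print([f['format_id'] for f in selected])
```

With this sample data the combined filter keeps the 720p entry and the entry whose height is unknown, which mirrors the behaviour the help text describes for `<=?`.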

@@ -14,6 +14,7 @@
  - **AddAnime**
  - **AdobeTV**
  - **AdultSwim**
+ - **Aftenposten**
  - **Aftonbladet**
  - **AlJazeera**
  - **Allocine**

@@ -13,6 +13,7 @@ import copy
 from test.helper import FakeYDL, assertRegexpMatches
 from youtube_dl import YoutubeDL
 from youtube_dl.extractor import YoutubeIE
+from youtube_dl.postprocessor.common import PostProcessor
 class YDL(FakeYDL):
@@ -370,5 +371,35 @@ class TestFormatSelection(unittest.TestCase):
             'vbr': 10,
         }), '^\s*10k$')
+    def test_postprocessors(self):
+        filename = 'post-processor-testfile.mp4'
+        audiofile = filename + '.mp3'
+
+        class SimplePP(PostProcessor):
+            def run(self, info):
+                with open(audiofile, 'wt') as f:
+                    f.write('EXAMPLE')
+                info['filepath']
+                return False, info
+
+        def run_pp(params):
+            with open(filename, 'wt') as f:
+                f.write('EXAMPLE')
+            ydl = YoutubeDL(params)
+            ydl.add_post_processor(SimplePP())
+            ydl.post_process(filename, {'filepath': filename})
+
+        run_pp({'keepvideo': True})
+        self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
+        self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
+        os.unlink(filename)
+        os.unlink(audiofile)
+
+        run_pp({'keepvideo': False})
+        self.assertFalse(os.path.exists(filename), '%s exists' % filename)
+        self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
+        os.unlink(audiofile)
+
 if __name__ == '__main__':
     unittest.main()

@@ -826,15 +826,13 @@ class YoutubeDL(object):
             '!=': operator.ne,
         }
         operator_rex = re.compile(r'''(?x)\s*\[
-            (?P<key>width|height|tbr|abr|vbr|filesize|fps)
+            (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
             \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
             (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
             \]$
             ''' % '|'.join(map(re.escape, OPERATORS.keys())))
         m = operator_rex.search(format_spec)
-        if not m:
-            raise ValueError('Invalid format specification %r' % format_spec)
+        if m:
             try:
                 comparison_value = int(m.group('value'))
             except ValueError:
@@ -847,6 +845,25 @@ class YoutubeDL(object):
                             m.group('value'), format_spec))
             op = OPERATORS[m.group('op')]
+        if not m:
+            STR_OPERATORS = {
+                '=': operator.eq,
+                '!=': operator.ne,
+            }
+            str_operator_rex = re.compile(r'''(?x)\s*\[
+                \s*(?P<key>ext|acodec|vcodec|container|protocol)
+                \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
+                \s*(?P<value>[a-zA-Z0-9_-]+)
+                \s*\]$
+                ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
+            m = str_operator_rex.search(format_spec)
+            if m:
+                comparison_value = m.group('value')
+                op = STR_OPERATORS[m.group('op')]
+
+        if not m:
+            raise ValueError('Invalid format specification %r' % format_spec)
+
         def _filter(f):
             actual_value = f.get(m.group('key'))
             if actual_value is None:
@@ -938,6 +955,9 @@ class YoutubeDL(object):
             def has_header(self, h):
                 return h in self.headers
+            def get_header(self, h, default=None):
+                return self.headers.get(h, default)
+
         pr = _PseudoRequest(info_dict['url'])
         self.cookiejar.add_cookie_header(pr)
         return pr.headers.get('Cookie')
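
The string-property filters added in this file can be tried in isolation. The sketch below reuses the regular expression from the hunk, but the wrapper function `make_string_filter` and the sample formats are assumptions for illustration rather than part of the project.

```python
import operator
import re

STR_OPERATORS = {
    '=': operator.eq,
    '!=': operator.ne,
}

# Same pattern as in the diff: "[key op value]" with an optional "?".
STR_OPERATOR_REX = re.compile(r'''(?x)\s*\[
    \s*(?P<key>ext|acodec|vcodec|container|protocol)
    \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
    \s*(?P<value>[a-zA-Z0-9_-]+)
    \s*\]$
    ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))


def make_string_filter(format_spec):
    """Hypothetical helper: turn one string filter into a predicate."""
    m = STR_OPERATOR_REX.search(format_spec)
    if not m:
        raise ValueError('Invalid format specification %r' % format_spec)
    key = m.group('key')
    op = STR_OPERATORS[m.group('op')]
    value = m.group('value')
    none_inclusive = bool(m.group('none_inclusive'))

    def _filter(f):
        actual = f.get(key)
        if actual is None:
            # Unknown values pass only when "?" follows the operator.
            return none_inclusive
        return op(actual, value)

    return _filter


# Made-up sample data.
formats = [
    {'format_id': '1', 'ext': 'mp4'},
    {'format_id': '2', 'ext': 'webm'},
    {'format_id': '3'},  # ext unknown
]
only_mp4 = [f['format_id'] for f in formats if make_string_filter('[ext=mp4]')(f)]
not_mp4 = [f['format_id'] for f in formats if make_string_filter('[ext!=?mp4]')(f)]
print(only_mp4, not_mp4)
```

A numeric key such as `[height>5]` is rejected by this pattern, which is why the patched code tries the numeric expression first and only then falls back to the string one.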

@@ -6,6 +6,7 @@ from .academicearth import AcademicEarthCourseIE
 from .addanime import AddAnimeIE
 from .adobetv import AdobeTVIE
 from .adultswim import AdultSwimIE
+from .aftenposten import AftenpostenIE
 from .aftonbladet import AftonbladetIE
 from .aljazeera import AlJazeeraIE
 from .alphaporno import AlphaPornoIE

@@ -0,0 +1,103 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+    xpath_with_ns,
+    xpath_text,
+    find_xpath_attr,
+)
+
+
+class AftenpostenIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/([^/]+/)*(?P<id>[^/]+)-\d+\.html'
+
+    _TEST = {
+        'url': 'http://www.aftenposten.no/webtv/serier-og-programmer/sweatshopenglish/TRAILER-SWEATSHOP---I-cant-take-any-more-7800835.html?paging=&section=webtv_serierogprogrammer_sweatshop_sweatshopenglish',
+        'md5': 'fd828cd29774a729bf4d4425fe192972',
+        'info_dict': {
+            'id': '21039',
+            'ext': 'mov',
+            'title': 'TRAILER: "Sweatshop" - I can´t take any more',
+            'description': 'md5:21891f2b0dd7ec2f78d84a50e54f8238',
+            'timestamp': 1416927969,
+            'upload_date': '20141125',
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._html_search_regex(
+            r'data-xs-id="(\d+)"', webpage, 'video id')
+
+        data = self._download_xml(
+            'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)
+
+        NS_MAP = {
+            'atom': 'http://www.w3.org/2005/Atom',
+            'xt': 'http://xstream.dk/',
+            'media': 'http://search.yahoo.com/mrss/',
+        }
+
+        entry = data.find(xpath_with_ns('./atom:entry', NS_MAP))
+
+        title = xpath_text(
+            entry, xpath_with_ns('./atom:title', NS_MAP), 'title')
+        description = xpath_text(
+            entry, xpath_with_ns('./atom:summary', NS_MAP), 'description')
+        timestamp = parse_iso8601(xpath_text(
+            entry, xpath_with_ns('./atom:published', NS_MAP), 'upload date'))
+
+        formats = []
+        media_group = entry.find(xpath_with_ns('./media:group', NS_MAP))
+        for media_content in media_group.findall(xpath_with_ns('./media:content', NS_MAP)):
+            media_url = media_content.get('url')
+            if not media_url:
+                continue
+            tbr = int_or_none(media_content.get('bitrate'))
+            mobj = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', media_url)
+            if mobj:
+                formats.append({
+                    'url': mobj.group('url'),
+                    'play_path': 'mp4:%s' % mobj.group('playpath'),
+                    'app': mobj.group('app'),
+                    'ext': 'flv',
+                    'tbr': tbr,
+                    'format_id': 'rtmp-%d' % tbr,
+                })
+            else:
+                formats.append({
+                    'url': media_url,
+                    'tbr': tbr,
+                })
+        self._sort_formats(formats)
+
+        link = find_xpath_attr(
+            entry, xpath_with_ns('./atom:link', NS_MAP), 'rel', 'original')
+        if link is not None:
+            formats.append({
+                'url': link.get('href'),
+                'format_id': link.get('rel'),
+            })
+
+        thumbnails = [{
+            'url': splash.get('url'),
+            'width': int_or_none(splash.get('width')),
+            'height': int_or_none(splash.get('height')),
+        } for splash in media_group.findall(xpath_with_ns('./xt:splash', NS_MAP))]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'timestamp': timestamp,
+            'formats': formats,
+            'thumbnails': thumbnails,
+        }
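
The RTMP handling in this extractor splits a media URL into a connection URL, an application name and a play path. A minimal standalone sketch of that split follows; the function name `build_format` and the sample URL are assumptions, and unlike the extractor it omits `format_id` when no bitrate is known, since `'rtmp-%d' % None` would raise a `TypeError`.

```python
import re

# Same pattern as in the extractor above.
RTMP_RE = re.compile(
    r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$')


def build_format(media_url, tbr=None):
    """Hypothetical helper: build a format dict for one media URL."""
    mobj = RTMP_RE.search(media_url)
    if mobj is None:
        # Not RTMP: pass the URL through unchanged.
        return {'url': media_url, 'tbr': tbr}
    fmt = {
        'url': mobj.group('url'),
        'play_path': 'mp4:%s' % mobj.group('playpath'),
        'app': mobj.group('app'),
        'ext': 'flv',
        'tbr': tbr,
    }
    if tbr is not None:
        # Assumption: skip format_id rather than format a missing bitrate.
        fmt['format_id'] = 'rtmp-%d' % tbr
    return fmt


# Made-up URL for illustration.
fmt = build_format('rtmp://stream.example.invalid/vod/clips/trailer.mp4', 1200)
print(fmt)
```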

@@ -20,6 +20,7 @@ class AparatIE(InfoExtractor):
             'id': 'wP8On',
             'ext': 'mp4',
             'title': 'تیم گلکسی 11 - زومیت',
+            'age_limit': 0,
         },
         # 'skip': 'Extremely unreliable',
     }
@@ -34,7 +35,8 @@ class AparatIE(InfoExtractor):
                      video_id + '/vt/frame')
         webpage = self._download_webpage(embed_url, video_id)
-        video_urls = re.findall(r'fileList\[[0-9]+\]\s*=\s*"([^"]+)"', webpage)
+        video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
+            r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
         for i, video_url in enumerate(video_urls):
             req = HEADRequest(video_url)
             res = self._request_webpage(
@@ -46,7 +48,7 @@ class AparatIE(InfoExtractor):
         title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
         thumbnail = self._search_regex(
-            r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
+            r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
         return {
             'id': video_id,
@@ -54,4 +56,5 @@ class AparatIE(InfoExtractor):
             'url': video_url,
             'ext': 'mp4',
             'thumbnail': thumbnail,
+            'age_limit': self._family_friendly_search(webpage),
         }

@@ -656,6 +656,21 @@ class InfoExtractor(object):
         }
         return RATING_TABLE.get(rating.lower(), None)
+    def _family_friendly_search(self, html):
+        # See http://schema.org/VideoObj
+        family_friendly = self._html_search_meta('isFamilyFriendly', html)
+
+        if not family_friendly:
+            return None
+
+        RATING_TABLE = {
+            '1': 0,
+            'true': 0,
+            '0': 18,
+            'false': 18,
+        }
+        return RATING_TABLE.get(family_friendly.lower(), None)
+
     def _twitter_search_player(self, html):
         return self._html_search_meta('twitter:player', html,
                                       'twitter card player')
@@ -707,9 +722,9 @@ class InfoExtractor(object):
                 f.get('quality') if f.get('quality') is not None else -1,
                 f.get('tbr') if f.get('tbr') is not None else -1,
                 f.get('vbr') if f.get('vbr') is not None else -1,
-                ext_preference,
                 f.get('height') if f.get('height') is not None else -1,
                 f.get('width') if f.get('width') is not None else -1,
+                ext_preference,
                 f.get('abr') if f.get('abr') is not None else -1,
                 audio_ext_preference,
                 f.get('fps') if f.get('fps') is not None else -1,
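
The effect of moving `ext_preference` behind height and width can be seen with a reduced sort key. This is only a sketch: the three-field key, the `EXT_PREFERENCE` table and the sample formats are assumptions, and the real key in the extractor has many more fields.

```python
# Hypothetical preference table; higher means more preferred.
EXT_PREFERENCE = {'flv': 0, 'webm': 1, 'mp4': 2}


def sort_key(fmt):
    """Tuple key in which resolution is compared before container preference."""
    return (
        fmt.get('height') if fmt.get('height') is not None else -1,
        fmt.get('width') if fmt.get('width') is not None else -1,
        EXT_PREFERENCE.get(fmt.get('ext'), -1),
    )


# Made-up sample data.
formats = [
    {'format_id': 'mp4-360', 'ext': 'mp4', 'height': 360, 'width': 640},
    {'format_id': 'webm-720', 'ext': 'webm', 'height': 720, 'width': 1280},
    {'format_id': 'mp4-720', 'ext': 'mp4', 'height': 720, 'width': 1280},
]
formats.sort(key=sort_key)  # worst first, best last
best = formats[-1]
print([f['format_id'] for f in formats])
```

Because tuples compare element by element, putting the container preference first would rank the 360p MP4 above the 720p WebM; with resolution first, the container only breaks ties between equally sized formats.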

@@ -1,41 +1,67 @@
+# coding: utf-8
 from __future__ import unicode_literals
-import re
 from .common import InfoExtractor
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+)
 class GamekingsIE(InfoExtractor):
-    _VALID_URL = r'http://www\.gamekings\.tv/videos/(?P<name>[0-9a-z\-]+)'
-    _TEST = {
+    _VALID_URL = r'http://www\.gamekings\.tv/(?:videos|nieuws)/(?P<id>[^/]+)'
+    _TESTS = [{
         'url': 'http://www.gamekings.tv/videos/phoenix-wright-ace-attorney-dual-destinies-review/',
         # MD5 is flaky, seems to change regularly
         # 'md5': '2f32b1f7b80fdc5cb616efb4f387f8a3',
         'info_dict': {
-            'id': '20130811',
+            'id': 'phoenix-wright-ace-attorney-dual-destinies-review',
             'ext': 'mp4',
             'title': 'Phoenix Wright: Ace Attorney \u2013 Dual Destinies Review',
             'description': 'md5:36fd701e57e8c15ac8682a2374c99731',
-        }
-    }
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+    }, {
+        # vimeo video
+        'url': 'http://www.gamekings.tv/videos/the-legend-of-zelda-majoras-mask/',
+        'md5': '12bf04dfd238e70058046937657ea68d',
+        'info_dict': {
+            'id': 'the-legend-of-zelda-majoras-mask',
+            'ext': 'mp4',
+            'title': 'The Legend of Zelda: Majoras Mask',
+            'description': 'md5:9917825fe0e9f4057601fe1e38860de3',
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+    }, {
+        'url': 'http://www.gamekings.tv/nieuws/gamekings-extra-shelly-en-david-bereiden-zich-voor-op-de-livestream/',
+        'only_matching': True,
+    }]
     def _real_extract(self, url):
+        video_id = self._match_id(url)
-        mobj = re.match(self._VALID_URL, url)
-        name = mobj.group('name')
-        webpage = self._download_webpage(url, name)
-        video_url = self._og_search_video_url(webpage)
+        webpage = self._download_webpage(url, video_id)
-        video = re.search(r'[0-9]+', video_url)
-        video_id = video.group(0)
+        playlist_id = self._search_regex(
+            r'gogoVideo\(\s*\d+\s*,\s*"([^"]+)', webpage, 'playlist id')
-        # Todo: add medium format
-        video_url = video_url.replace(video_id, 'large/' + video_id)
+        playlist = self._download_xml(
+            'http://www.gamekings.tv/wp-content/themes/gk2010/rss_playlist.php?id=%s' % playlist_id,
+            video_id)
+
+        NS_MAP = {
+            'jwplayer': 'http://rss.jwpcdn.com/'
+        }
+
+        item = playlist.find('./channel/item')
+
+        thumbnail = xpath_text(item, xpath_with_ns('./jwplayer:image', NS_MAP), 'thumbnail')
+        video_url = item.find(xpath_with_ns('./jwplayer:source', NS_MAP)).get('file')
         return {
             'id': video_id,
+            'ext': 'mp4',
             'url': video_url,
             'title': self._og_search_title(webpage),
             'description': self._og_search_description(webpage),
+            'thumbnail': thumbnail,
         }
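
The rewritten extractor reads the thumbnail and the video URL from a JW Player RSS playlist. The sketch below shows the same namespaced lookups using only the standard library's `xml.etree.ElementTree` instead of the project's `xpath_with_ns` and `xpath_text` helpers; the sample XML, its URLs and the function name `first_item_media` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up playlist in the shape the extractor expects.
SAMPLE_PLAYLIST = '''<rss xmlns:jwplayer="http://rss.jwpcdn.com/" version="2.0">
  <channel>
    <item>
      <title>Example item</title>
      <jwplayer:image>http://example.invalid/thumb.jpg</jwplayer:image>
      <jwplayer:source file="http://example.invalid/video.mp4" />
    </item>
  </channel>
</rss>'''

NS_MAP = {'jwplayer': 'http://rss.jwpcdn.com/'}


def first_item_media(xml_text):
    """Hypothetical helper: thumbnail and file URL of the first item."""
    playlist = ET.fromstring(xml_text)
    item = playlist.find('./channel/item')
    # The prefix is resolved through NS_MAP, not through the document.
    thumbnail = item.findtext('jwplayer:image', namespaces=NS_MAP)
    source = item.find('jwplayer:source', NS_MAP)
    return {
        'thumbnail': thumbnail,
        'url': None if source is None else source.get('file'),
    }


media = first_item_media(SAMPLE_PLAYLIST)
print(media)
```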

@@ -524,6 +524,19 @@ class GenericIE(InfoExtractor):
                 'upload_date': '20150126',
             },
             'add_ie': ['Viddler'],
+        },
+        # jwplayer YouTube
+        {
+            'url': 'http://media.nationalarchives.gov.uk/index.php/webinar-using-discovery-national-archives-online-catalogue/',
+            'info_dict': {
+                'id': 'Mrj4DVp2zeA',
+                'ext': 'mp4',
+                'upload_date': '20150204',
+                'uploader': 'The National Archives UK',
+                'description': 'md5:a236581cd2449dd2df4f93412f3f01c6',
+                'uploader_id': 'NationalArchives08',
+                'title': 'Webinar: Using Discovery, The National Archives online catalogue',
+            },
         }
     ]
@@ -1034,7 +1047,12 @@ class GenericIE(InfoExtractor):
         # Look for embedded sbs.com.au player
         mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:www\.)sbs\.com\.au/ondemand/video/single/.+?)\1',
+            r'''(?x)
+            (?:
+                <meta\s+property="og:video"\s+content=|
+                <iframe[^>]+?src=
+            )
+            (["\'])(?P<url>https?://(?:www\.)?sbs\.com\.au/ondemand/video/.+?)\1''',
             webpage)
         if mobj is not None:
             return self.url_result(mobj.group('url'), 'SBS')
@@ -1065,6 +1083,8 @@ class GenericIE(InfoExtractor):
             return self.url_result(mobj.group('url'), 'Livestream')
         def check_video(vurl):
+            if YoutubeIE.suitable(vurl):
+                return True
             vpath = compat_urlparse.urlparse(vurl).path
             vext = determine_ext(vpath)
             return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml')
@@ -1082,7 +1102,8 @@ class GenericIE(InfoExtractor):
                 JWPlayerOptions|
                 jwplayer\s*\(\s*["'][^'"]+["']\s*\)\s*\.setup
             )
-            .*?file\s*:\s*["\'](.*?)["\']''', webpage))
+            .*?
+            ['"]?file['"]?\s*:\s*["\'](.*?)["\']''', webpage))
         if not found:
             # Broaden the search a little bit
             found = filter_video(re.findall(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage))

@@ -34,8 +34,6 @@ class GoshgayIE(InfoExtractor):
         duration = parse_duration(self._html_search_regex(
             r'<span class="duration">\s*-?\s*(.*?)</span>',
             webpage, 'duration', fatal=False))
-        family_friendly = self._html_search_meta(
-            'isFamilyFriendly', webpage, default='false')
         flashvars = compat_parse_qs(self._html_search_regex(
             r'<embed.+?id="flash-player-embed".+?flashvars="([^"]+)"',
@@ -49,5 +47,5 @@ class GoshgayIE(InfoExtractor):
             'title': title,
             'thumbnail': thumbnail,
             'duration': duration,
-            'age_limit': 0 if family_friendly == 'true' else 18,
+            'age_limit': self._family_friendly_search(webpage),
         }

@@ -80,9 +80,6 @@ class IzleseneIE(InfoExtractor):
             r'comment_count\s*=\s*\'([^\']+)\';',
             webpage, 'comment_count', fatal=False)
-        family_friendly = self._html_search_meta(
-            'isFamilyFriendly', webpage, 'age limit', fatal=False)
-
         content_url = self._html_search_meta(
             'contentURL', webpage, 'content URL', fatal=False)
         ext = determine_ext(content_url, 'mp4')
@@ -120,6 +117,6 @@ class IzleseneIE(InfoExtractor):
             'duration': duration,
             'view_count': int_or_none(view_count),
             'comment_count': int_or_none(comment_count),
-            'age_limit': 18 if family_friendly == 'False' else 0,
+            'age_limit': self._family_friendly_search(webpage),
             'formats': formats,
         }
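
The per-extractor checks removed here compared the `isFamilyFriendly` value against one exact spelling each (`'true'` in one extractor, `'False'` in another). The shared helper instead lowercases the value and looks it up in a table. A standalone sketch of that mapping, with the hypothetical function name `age_limit_from_family_friendly` and without the HTML meta lookup:

```python
def age_limit_from_family_friendly(value):
    """Map an isFamilyFriendly meta value to an age limit.

    Returns None when the value is missing or not recognised.
    """
    if not value:
        return None
    rating_table = {
        '1': 0,
        'true': 0,
        '0': 18,
        'false': 18,
    }
    return rating_table.get(value.lower())


for raw in ('True', 'false', '0', None, 'maybe'):
    print(repr(raw), '->', age_limit_from_family_friendly(raw))
```

Returning `None` for a missing or unrecognised value leaves the age limit undetermined rather than guessing 0 or 18.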

@@ -91,6 +91,15 @@ class RTLnowIE(InfoExtractor):
             },
         },
         {
+            'url': 'http://rtl-now.rtl.de/der-bachelor/folge-4.php?film_id=188729&player=1&season=5',
+            'info_dict': {
+                'id': '188729',
+                'ext': 'flv',
+                'upload_date': '20150204',
+                'description': 'md5:5e1ce23095e61a79c166d134b683cecc',
+                'title': 'Der Bachelor - Folge 4',
+            }
+        }, {
             'url': 'http://www.n-tvnow.de/deluxe-alles-was-spass-macht/thema-ua-luxushotel-fuer-vierbeiner.php?container_id=153819&player=1&season=0',
             'only_matching': True,
         },
@@ -133,6 +142,15 @@ class RTLnowIE(InfoExtractor):
                     'page_url': video_page_url,
                     'player_url': video_page_url + 'includes/vodplayer.swf',
                 }
+            else:
+                mobj = re.search(r'.*/(?P<hoster>[^/]+)/videos/(?P<play_path>.+)\.f4m', filename.text)
+                if mobj:
+                    fmt = {
+                        'url': 'rtmpe://fmspay-fra2.rtl.de/' + mobj.group('hoster'),
+                        'play_path': 'mp4:' + mobj.group('play_path'),
+                        'page_url': url,
+                        'player_url': video_page_url + 'includes/vodplayer.swf',
+                    }
                 else:
                     fmt = {
                         'url': filename.text,
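
The new branch derives an RTMPE URL and play path from an `.f4m` manifest address. A standalone sketch of that derivation follows; the pattern and the `rtmpe://fmspay-fra2.rtl.de/` prefix come from the hunk, while the function name `rtmpe_format` and the sample manifest URL are assumptions.

```python
import re

# Same pattern as in the new branch above.
F4M_RE = re.compile(r'.*/(?P<hoster>[^/]+)/videos/(?P<play_path>.+)\.f4m')


def rtmpe_format(manifest_url, page_url, player_url):
    """Hypothetical helper: build an RTMPE format dict, or None if no match."""
    mobj = F4M_RE.search(manifest_url)
    if mobj is None:
        return None
    return {
        'url': 'rtmpe://fmspay-fra2.rtl.de/' + mobj.group('hoster'),
        'play_path': 'mp4:' + mobj.group('play_path'),
        'page_url': page_url,
        'player_url': player_url,
    }


# Made-up manifest URL for illustration.
fmt = rtmpe_format(
    'http://hds.example.invalid/hds/rtlnow/videos/12345/V_67890_clip.f4m',
    'http://example.invalid/page',
    'http://example.invalid/includes/vodplayer.swf')
print(fmt)
```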

@@ -1,16 +1,16 @@
 # coding: utf-8
 from __future__ import unicode_literals
-import json
+import re
 from .common import InfoExtractor
-from ..utils import js_to_json
 class RTPIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)/?'
     _TESTS = [{
         'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
+        'md5': 'e736ce0c665e459ddb818546220b4ef8',
         'info_dict': {
             'id': 'e174042',
             'ext': 'mp3',
@@ -18,9 +18,6 @@ class RTPIE(InfoExtractor):
             'description': 'As paixões musicais de António Cartaxo e António Macedo',
             'thumbnail': 're:^https?://.*\.jpg',
         },
-        'params': {
-            'skip_download': True,  # RTMP download
-        },
     }, {
         'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
         'only_matching': True,
@@ -37,21 +34,48 @@ class RTPIE(InfoExtractor):
         player_config = self._search_regex(
             r'(?s)RTPPLAY\.player\.newPlayer\(\s*(\{.*?\})\s*\)', webpage, 'player config')
-        config = json.loads(js_to_json(player_config))
+        config = self._parse_json(player_config, video_id)
         path, ext = config.get('file').rsplit('.', 1)
         formats = [{
+            'format_id': 'rtmp',
+            'ext': ext,
+            'vcodec': config.get('type') == 'audio' and 'none' or None,
+            'preference': -2,
+            'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
             'app': config.get('application'),
             'play_path': '{ext:s}:{path:s}'.format(ext=ext, path=path),
             'page_url': url,
-            'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
             'rtmp_live': config.get('live', False),
-            'ext': ext,
-            'vcodec': config.get('type') == 'audio' and 'none' or None,
             'player_url': 'http://programas.rtp.pt/play/player.swf?v3',
             'rtmp_real_time': True,
         }]
+        # Construct regular HTTP download URLs
+        replacements = {
+            'audio': {
+                'format_id': 'mp3',
+                'pattern': r'^nas2\.share/wavrss/',
+                'repl': 'http://rsspod.rtp.pt/podcasts/',
+                'vcodec': 'none',
+            },
+            'video': {
+                'format_id': 'mp4_h264',
+                'pattern': r'^nas2\.share/h264/',
+                'repl': 'http://rsspod.rtp.pt/videocasts/',
+                'vcodec': 'h264',
+            },
+        }
+        r = replacements[config['type']]
+        if re.match(r['pattern'], config['file']) is not None:
+            formats.append({
+                'format_id': r['format_id'],
+                'url': re.sub(r['pattern'], r['repl'], config['file']),
+                'vcodec': r['vcodec'],
+            })
+
+        self._sort_formats(formats)
+
         return {
             'id': video_id,
             'title': title,
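
The HTTP URL construction added above rewrites an internal storage prefix into a public podcast host. A standalone sketch of that rewrite follows. The replacement table is copied from the hunk; the function name `http_format` and the sample file paths are assumptions, and the sketch returns `None` for an unknown `type` where the patched code would raise a `KeyError`.

```python
import re

# Replacement table as it appears in the diff.
REPLACEMENTS = {
    'audio': {
        'format_id': 'mp3',
        'pattern': r'^nas2\.share/wavrss/',
        'repl': 'http://rsspod.rtp.pt/podcasts/',
        'vcodec': 'none',
    },
    'video': {
        'format_id': 'mp4_h264',
        'pattern': r'^nas2\.share/h264/',
        'repl': 'http://rsspod.rtp.pt/videocasts/',
        'vcodec': 'h264',
    },
}


def http_format(config):
    """Hypothetical helper: HTTP format for a player config, or None."""
    r = REPLACEMENTS.get(config.get('type'))
    if r is None or re.match(r['pattern'], config['file']) is None:
        return None
    return {
        'format_id': r['format_id'],
        'url': re.sub(r['pattern'], r['repl'], config['file']),
        'vcodec': r['vcodec'],
    }


# Made-up file path for illustration.
audio = http_format({'type': 'audio', 'file': 'nas2.share/wavrss/at3/1502/example.mp3'})
print(audio)
```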

@@ -1,80 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    HEADRequest,
-    urlhandle_detect_ext,
-)
-
-
-class SoulAnimeWatchingIE(InfoExtractor):
-    IE_NAME = "soulanime:watching"
-    IE_DESC = "SoulAnime video"
-    _TEST = {
-        'url': 'http://www.soul-anime.net/watching/seirei-tsukai-no-blade-dance-episode-9/',
-        'md5': '05fae04abf72298098b528e98abf4298',
-        'info_dict': {
-            'id': 'seirei-tsukai-no-blade-dance-episode-9',
-            'ext': 'mp4',
-            'title': 'seirei-tsukai-no-blade-dance-episode-9',
-            'description': 'seirei-tsukai-no-blade-dance-episode-9'
-        }
-    }
-    _VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/watch[^/]*/(?P<id>[^/]+)'
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        domain = mobj.group('domain')
-
-        page = self._download_webpage(url, video_id)
-
-        video_url_encoded = self._html_search_regex(
-            r'<div id="download">[^<]*<a href="(?P<url>[^"]+)"', page, 'url')
-        video_url = "http://www.soul-anime." + domain + video_url_encoded
-
-        ext_req = HEADRequest(video_url)
-        ext_handle = self._request_webpage(
-            ext_req, video_id, note='Determining extension')
-        ext = urlhandle_detect_ext(ext_handle)
-
-        return {
-            'id': video_id,
-            'url': video_url,
-            'ext': ext,
-            'title': video_id,
-            'description': video_id
-        }
-
-
-class SoulAnimeSeriesIE(InfoExtractor):
-    IE_NAME = "soulanime:series"
-    IE_DESC = "SoulAnime Series"
-
-    _VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/anime./(?P<id>[^/]+)'
-
-    _EPISODE_REGEX = r'<option value="(/watch[^/]*/[^"]+)">[^<]*</option>'
-
-    _TEST = {
-        'url': 'http://www.soul-anime.net/anime1/black-rock-shooter-tv/',
-        'info_dict': {
-            'id': 'black-rock-shooter-tv'
-        },
-        'playlist_count': 8
-    }
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        series_id = mobj.group('id')
-        domain = mobj.group('domain')
-
-        pattern = re.compile(self._EPISODE_REGEX)
-
-        page = self._download_webpage(url, series_id, "Downloading series page")
-        mobj = pattern.findall(page)
-
-        entries = [self.url_result("http://www.soul-anime." + domain + obj) for obj in mobj]
-
-        return self.playlist_result(entries, series_id)

@@ -15,7 +15,8 @@ class TeamcocoIE(InfoExtractor):
                 'id': '80187',
                 'ext': 'mp4',
                 'title': 'Conan Becomes A Mary Kay Beauty Consultant',
-                'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.'
+                'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.',
+                'age_limit': 0,
             }
         }, {
             'url': 'http://teamcoco.com/video/louis-ck-interview-george-w-bush',
@@ -24,7 +25,8 @@ class TeamcocoIE(InfoExtractor):
                 'id': '19705',
                 'ext': 'mp4',
                 "description": "Louis C.K. got starstruck by George W. Bush, so what? Part one.",
-                "title": "Louis C.K. Interview Pt. 1 11/3/11"
+                "title": "Louis C.K. Interview Pt. 1 11/3/11",
+                'age_limit': 0,
             }
         }
     ]
@@ -83,4 +85,5 @@ class TeamcocoIE(InfoExtractor):
             'title': self._og_search_title(webpage),
             'thumbnail': self._og_search_thumbnail(webpage),
             'description': self._og_search_description(webpage),
+            'age_limit': self._family_friendly_search(webpage),
         }

@ -1,6 +1,8 @@
# encoding: utf-8 # encoding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
float_or_none, float_or_none,
@ -11,7 +13,7 @@ from ..utils import (
 class TvigleIE(InfoExtractor):
     IE_NAME = 'tvigle'
     IE_DESC = 'Интернет-телевидение Tvigle.ru'
-    _VALID_URL = r'http://(?:www\.)?tvigle\.ru/(?:[^/]+/)+(?P<id>[^/]+)/$'
+    _VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'
     _TESTS = [
         {
@ -38,16 +40,22 @@ class TvigleIE(InfoExtractor):
                 'duration': 186.080,
                 'age_limit': 0,
             },
-        },
+        }, {
+            'url': 'https://cloud.tvigle.ru/video/5267604/',
+            'only_matching': True,
+        }
     ]
     def _real_extract(self, url):
-        display_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        display_id = mobj.group('display_id')
+        if not video_id:
-        webpage = self._download_webpage(url, display_id)
-        video_id = self._html_search_regex(
-            r'<li class="video-preview current_playing" id="(\d+)">', webpage, 'video id')
+            webpage = self._download_webpage(url, display_id)
+            video_id = self._html_search_regex(
+                r'<li class="video-preview current_playing" id="(\d+)">',
+                webpage, 'video id')
         video_data = self._download_json(
             'http://cloud.tvigle.ru/api/play/video/%s/' % video_id, display_id)
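Review note: the new `_VALID_URL` has two alternatives, and only one of the named groups is set for any given URL. The snippet below is a minimal standalone sketch of that behaviour, not the extractor itself. The helper name `split_ids` and the first sample URL are illustrative assumptions; the pattern and the `cloud.tvigle.ru` URL are copied from the diff.

```python
import re

# Pattern copied from the new side of the diff above.
VALID_URL = (
    r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$'
    r'|cloud\.tvigle\.ru/video/(?P<id>\d+))'
)


def split_ids(url):
    """Hypothetical helper: return (video_id, display_id) or None if no match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    # Exactly one of the two groups is populated, the other is None.
    return mobj.group('id'), mobj.group('display_id')


# Site page: only display_id is known, so the web page has to be downloaded
# to find the numeric id (the "if not video_id" branch in the patch).
site = split_ids('http://www.tvigle.ru/video/sokrat/')

# Cloud URL: the numeric id is in the URL, so no page download is needed.
cloud = split_ids('https://cloud.tvigle.ru/video/5267604/')
```

One consequence worth keeping in mind: for cloud URLs `display_id` is `None`, and the patch still passes it as the second argument to `_download_json`.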

View File

@ -297,8 +297,10 @@ def parseOpts(overrideArguments=None):
             ' You can filter the video results by putting a condition in'
             ' brackets, as in -f "best[height=720]"'
             ' (or -f "[filesize>10M]"). '
-            ' This works for filesize, height, width, tbr, abr, vbr, and fps'
-            ' and the comparisons <, <=, >, >=, =, != .'
+            ' This works for filesize, height, width, tbr, abr, vbr, asr, and fps'
+            ' and the comparisons <, <=, >, >=, =, !='
+            ' and for ext, acodec, vcodec, container, and protocol'
+            ' and the comparisons =, != .'
             ' Formats for which the value is not known are excluded unless you'
             ' put a question mark (?) after the operator.'
             ' You can combine format filters, so '
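Review note: the rules in this help text (numeric keys take all six comparisons, string keys take only `=` and `!=`, and a `?` after the operator keeps formats whose value is unknown) can be illustrated with a small predicate builder. This is a sketch under stated assumptions and not the parser in `YoutubeDL.py`. The names `build_filter` and `parse_number` are hypothetical, and treating `K`/`M`/`G` as powers of 1000 is an assumption made only for this example.

```python
import operator
import re

NUMERIC_KEYS = {'filesize', 'height', 'width', 'tbr', 'abr', 'vbr', 'asr', 'fps'}
STRING_KEYS = {'ext', 'acodec', 'vcodec', 'container', 'protocol'}
NUMERIC_OPS = {
    '<': operator.lt, '<=': operator.le, '>': operator.gt,
    '>=': operator.ge, '=': operator.eq, '!=': operator.ne,
}
STRING_OPS = {'=': operator.eq, '!=': operator.ne}
# Assumption for this sketch only: decimal multipliers.
MULTIPLIERS = {'': 1, 'K': 1000, 'M': 1000 ** 2, 'G': 1000 ** 3}

FILTER_RE = re.compile(
    r'^(?P<key>[a-z]+)\s*(?P<op><=|>=|!=|<|>|=)(?P<unknown_ok>\?)?\s*(?P<value>.+)$')


def parse_number(text):
    """Parse '720' or '10M' into a float (hypothetical helper)."""
    m = re.match(r'^(\d+(?:\.\d+)?)([KMG]?)$', text)
    if m is None:
        raise ValueError('invalid number: %r' % text)
    return float(m.group(1)) * MULTIPLIERS[m.group(2)]


def build_filter(spec):
    """Turn a condition such as 'height<=?720' into a predicate on format dicts."""
    m = FILTER_RE.match(spec)
    if m is None:
        raise ValueError('invalid filter: %r' % spec)
    key, op, value = m.group('key'), m.group('op'), m.group('value')
    unknown_ok = m.group('unknown_ok') is not None
    if key in NUMERIC_KEYS:
        compare, wanted = NUMERIC_OPS[op], parse_number(value)
    elif key in STRING_KEYS and op in STRING_OPS:
        compare, wanted = STRING_OPS[op], value
    else:
        raise ValueError('unsupported filter: %r' % spec)

    def predicate(fmt):
        actual = fmt.get(key)
        if actual is None:
            # Unknown values are excluded unless '?' followed the operator.
            return unknown_ok
        return compare(actual, wanted)

    return predicate


formats = [
    {'height': 720, 'ext': 'mp4'},
    {'height': 1080, 'ext': 'webm'},
    {'ext': 'mp4'},  # height unknown
]
exact = [build_filter('height=720')(f) for f in formats]
lenient = [build_filter('height<=?720')(f) for f in formats]
not_webm = [build_filter('ext!=webm')(f) for f in formats]
```

The third format shows the effect of the question mark: it is dropped by `height=720` but kept by `height<=?720`.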

View File

@ -166,14 +166,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
         if filecodec is None:
             raise PostProcessingError('WARNING: unable to obtain file audio codec with ffprobe')
-        uses_avconv = self._uses_avconv()
         more_opts = []
         if self._preferredcodec == 'best' or self._preferredcodec == filecodec or (self._preferredcodec == 'm4a' and filecodec == 'aac'):
             if filecodec == 'aac' and self._preferredcodec in ['m4a', 'best']:
                 # Lossless, but in another container
                 acodec = 'copy'
                 extension = 'm4a'
-                more_opts = ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
+                more_opts = ['-bsf:a', 'aac_adtstoasc']
             elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
                 # Lossless if possible
                 acodec = 'copy'
@ -189,9 +188,9 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
                 more_opts = []
                 if self._preferredquality is not None:
                     if int(self._preferredquality) < 10:
-                        more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
+                        more_opts += ['-q:a', self._preferredquality]
                     else:
-                        more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
+                        more_opts += ['-b:a', self._preferredquality + 'k']
         else:
             # We convert the audio (lossy)
             acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
@ -200,13 +199,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
             if self._preferredquality is not None:
                 # The opus codec doesn't support the -aq option
                 if int(self._preferredquality) < 10 and extension != 'opus':
-                    more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
+                    more_opts += ['-q:a', self._preferredquality]
                 else:
-                    more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
+                    more_opts += ['-b:a', self._preferredquality + 'k']
             if self._preferredcodec == 'aac':
                 more_opts += ['-f', 'adts']
             if self._preferredcodec == 'm4a':
-                more_opts += ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
+                more_opts += ['-bsf:a', 'aac_adtstoasc']
             if self._preferredcodec == 'vorbis':
                 extension = 'ogg'
             if self._preferredcodec == 'wav':
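Review note: after this change the quality flags no longer depend on `uses_avconv`, so the branch in the lossy-conversion path can be read as a small pure function. The sketch below restates that logic outside the class for illustration only. The function name `quality_opts` is hypothetical and does not exist in the patch, and the quality value is assumed to be a string, as the `+ 'k'` concatenation in the diff suggests.

```python
def quality_opts(preferredquality, extension=None):
    """Hypothetical helper mirroring the new quality branch of the diff.

    preferredquality is a string such as '5' or '192', or None.
    """
    if preferredquality is None:
        return []
    # Values below 10 are treated as a VBR quality level, except for opus,
    # which (per the comment in the diff) does not support that option.
    if int(preferredquality) < 10 and extension != 'opus':
        return ['-q:a', preferredquality]
    # Everything else is treated as a bitrate in kbit/s.
    return ['-b:a', preferredquality + 'k']


vbr = quality_opts('5', 'mp3')
cbr = quality_opts('192', 'mp3')
opus = quality_opts('5', 'opus')
unset = quality_opts(None)
```

Pulling the branch into one function like this would also remove the duplication between the two hunks that build the same option pair, though the patch itself leaves both copies in place.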

View File

@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2015.02.06'
+__version__ = '2015.02.09.2'