Compare commits

47 Commits

Author SHA1 Message Date
Philipp Hagemeister
b29231c040 release 2015.02.08 2015-02-08 20:28:38 +01:00
Sergey M․
6128bf07a9 [options] Update help on string comparisons 2015-02-09 01:27:27 +06:00
Sergey M․
2ec19e9558 [YoutubeDL] Allow filtering by audio sampling rate 2015-02-09 01:09:45 +06:00
Sergey M․
9ddb6925bf [YoutubeDL] Allow filtering by string properties (#4906) 2015-02-09 01:07:43 +06:00
Sergey M․
12931e1c6e Credit @robin007bond for tweakers (#4881) and gamekings fixes (#4901) 2015-02-08 23:33:29 +06:00
Sergey M․
41c23b0da5 [gamekings] Support videos from news pages 2015-02-08 23:12:59 +06:00
Sergey M․
2578ab19e4 Merge branch 'robin007bond-gamekings' 2015-02-08 23:03:31 +06:00
Sergey M․
d87ec897e9 [gamekings] Improve extraction 2015-02-08 23:03:12 +06:00
Sergey M․
3bd4bffb1c Merge branch 'gamekings' of https://github.com/robin007bond/youtube-dl into robin007bond-gamekings 2015-02-08 22:46:43 +06:00
robin
c36b09a502 [Gamekings] Use thumbnail in return statement 2015-02-08 16:46:13 +01:00
Naglis Jonaitis
641eb10d34 Use _family_friendly_search for determining age_limit 2015-02-08 17:45:38 +02:00
robin
955c5505e7 [Gamekings] Use xpath
XPath is used for extracting the video url and the thumbnail
2015-02-08 16:44:25 +01:00
Naglis Jonaitis
69319969de [extractor/common] Add new helper method _family_friendly_search 2015-02-08 17:39:00 +02:00
Naglis Jonaitis
a14292e848 [soulanime] Remove extractor (#4554)
Was supposed to be deleted by 67c2bcd
2015-02-08 16:57:07 +02:00
robin
5d678df64a [Gamekings] Download playlist
Todo: URL and Thumbnail should be extracted with XPath
2015-02-08 15:34:37 +01:00
robin
8ca8cbe2bd [Gamekings] Check string for vimeo, fix test
The test no longer fails. It just checks whether the string contains
"vimeo", instead of using the method for URL-checking, since that
method returns an error.

The tests don't fail, and the extractor works fine now.
2015-02-08 14:41:14 +01:00
robin
ba322d8209 [Gamekings] Added test and replaced video_url
Quick and dirty fix for the Gamekings extractor. It gives an error about
the video_url, but it downloads it now instead of giving a 404 error on
newer Gamekings videos
2015-02-08 14:23:37 +01:00
robin
2f38289b79 [Gamekings] Fix order of replacement string
Oops.
2015-02-08 13:49:32 +01:00
robin
f23a3ca699 [Gamekings] Fixed typo in URL replacement 2015-02-08 13:47:27 +01:00
robin
77d2b106cc [Gamekings] Fix 404 when large isn't available
When trying to download some GameKings videos, not all of them worked.
This was because not all videos had a "/large" URL available. The
extractor now checks whether the /large URL is available; if it isn't,
it tries to get the normal URL.
2015-02-08 13:42:41 +01:00
Sergey M․
c0e46412e9 [aparat] Fix extraction (Closes #4897) 2015-02-08 17:30:29 +06:00
Jaime Marquínez Ferrándiz
0161353d7d [test/test_YoutubeDL] Remove debug print call 2015-02-06 23:58:01 +01:00
Jaime Marquínez Ferrándiz
2b4ecde2c8 [test/YoutubeDL] Add a simple test for postprocessors
Just checks that the 'keepvideo' option works as intended.
2015-02-06 23:54:25 +01:00
Jaime Marquínez Ferrándiz
b3a286d69d [YoutubeDL] _calc_cookies: add get_header method to _PseudoRequest (#4861) 2015-02-06 22:23:06 +01:00
Jaime Marquínez Ferrándiz
467d3c9a0c [ffmpeg] --extract-audio: Use the same options for avconv and ffmpeg
They have been available in ffmpeg since version 0.9, and we require 1.0 or higher.
2015-02-06 22:05:11 +01:00
Naglis Jonaitis
ad5747bad1 [rtp] Construct regular HTTP download URLs (#4882) 2015-02-06 23:00:54 +02:00
Sergey M․
d6eb66ed3c [aftenposten] Add extractor (Closes #4863) 2015-02-07 01:46:54 +06:00
Sergey M․
7f2a9f1b49 [tvigle] Add support for cloud URLs (Closes #4887) 2015-02-06 21:15:01 +06:00
Philipp Hagemeister
1e1896f2de [extractor/common] Correct sort order.
We should look at height and width before ext_preference.
2015-02-06 15:16:45 +01:00
Philipp Hagemeister
c831973366 release 2015.02.06 2015-02-06 14:38:30 +01:00
Naglis Jonaitis
1a2548d9e9 [rtp] Pass --realtime to rtmpdump (Fixes #4882)
A workaround for video jumping back in time.
2015-02-06 13:44:46 +02:00
Sergey M․
3900eec27c [extractor/common] Fix 2.0 manifest extraction (Closes #4830) 2015-02-06 04:29:29 +06:00
Sergey M․
a02d212638 Merge branch 'robin007bond-tweakers' 2015-02-06 03:23:56 +06:00
Sergey M․
9c91a8fa70 [tweakers] Switch extraction to xspf playlist, extract all formats and meta (#4881) 2015-02-06 03:23:42 +06:00
Sergey M․
41469f335e Merge branch 'tweakers' of https://github.com/robin007bond/youtube-dl into robin007bond-tweakers 2015-02-06 02:59:33 +06:00
robin
67ce4f8820 Use match_id method instead of split URL 2015-02-05 21:49:13 +01:00
robin
bc63d56cca Remove unnecessary TODO-comments 2015-02-05 21:40:18 +01:00
robin
c893d70805 Remove player-url in tweakers.py
Player-url is only needed for RTMP, not for regular URLs
2015-02-05 21:38:35 +01:00
robin
3ee6e02564 Edit Tweakers extractor
Fixed code conventions (mainly adding two or more spaces before making
an inline comment)
2015-02-05 19:59:36 +01:00
robin
e3aaace400 [tweakers] Add new extractor 2015-02-05 19:55:41 +01:00
Sergey M․
300753a069 [YoutubeDL] Fix video+audio format field (Closes #4880) 2015-02-06 00:51:16 +06:00
Sergey M․
f13b88c616 [rts] Fix f4m and m3u8 extraction (Closes #4873) 2015-02-05 22:17:50 +06:00
Sergey M․
60ca389c64 [extractor/common] Prefix f4m/m3u8 entries with identifier 2015-02-05 22:16:27 +06:00
Sergey M․
1b0f3919c1 Merge branch 'Frenzie-npo' 2015-02-05 20:15:13 +06:00
Sergey M․
6a348cf7d5 Credit @Frenzie for npo subtitles (#4878) 2015-02-05 20:14:56 +06:00
Sergey M․
9e91449c8d [npo] Fix subtitles (Closes #3638) 2015-02-05 20:13:28 +06:00
Frans de Jonge
25e5ebf382 Add NPO.nl subtitles
Implements #3638
2015-02-05 12:51:33 +01:00
22 changed files with 417 additions and 163 deletions

View File

@@ -108,3 +108,5 @@ Enam Mijbah Noor
David Luhmer
Shaya Goldberg
Paul Hartmann
Frans de Jonge
Robin de Rooij

View File

@@ -294,7 +294,9 @@ which means you can modify it, redistribute it or use it however you like.
-f "[filesize>10M]"). This works for
filesize, height, width, tbr, abr, vbr, and
fps and the comparisons <, <=, >, >=, =, !=
. Formats for which the value is not known
and for ext, acodec, vcodec, container and
protocol and the comparisons =, != .
Formats for which the value is not known
are excluded unless you put a question mark
(?) after the operator. You can combine
format filters, so -f "[height <=?

View File

@@ -14,6 +14,7 @@
- **AddAnime**
- **AdobeTV**
- **AdultSwim**
- **Aftenposten**
- **Aftonbladet**
- **AlJazeera**
- **Allocine**
@@ -441,6 +442,7 @@
- **tvp.pl**
- **tvp.pl:Series**
- **TVPlay**: TV3Play and related services
- **Tweakers**
- **twitch:bookmarks**
- **twitch:chapter**
- **twitch:past_broadcasts**

View File

@@ -13,6 +13,7 @@ import copy
from test.helper import FakeYDL, assertRegexpMatches
from youtube_dl import YoutubeDL
from youtube_dl.extractor import YoutubeIE
from youtube_dl.postprocessor.common import PostProcessor
class YDL(FakeYDL):
@@ -370,5 +371,35 @@ class TestFormatSelection(unittest.TestCase):
'vbr': 10,
}), '^\s*10k$')
def test_postprocessors(self):
filename = 'post-processor-testfile.mp4'
audiofile = filename + '.mp3'
class SimplePP(PostProcessor):
def run(self, info):
with open(audiofile, 'wt') as f:
f.write('EXAMPLE')
info['filepath']
return False, info
def run_pp(params):
with open(filename, 'wt') as f:
f.write('EXAMPLE')
ydl = YoutubeDL(params)
ydl.add_post_processor(SimplePP())
ydl.post_process(filename, {'filepath': filename})
run_pp({'keepvideo': True})
self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
os.unlink(filename)
os.unlink(audiofile)
run_pp({'keepvideo': False})
self.assertFalse(os.path.exists(filename), '%s exists' % filename)
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
os.unlink(audiofile)
if __name__ == '__main__':
unittest.main()

View File

@@ -826,27 +826,44 @@ class YoutubeDL(object):
'!=': operator.ne,
}
operator_rex = re.compile(r'''(?x)\s*\[
(?P<key>width|height|tbr|abr|vbr|filesize|fps)
(?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
(?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
\]$
''' % '|'.join(map(re.escape, OPERATORS.keys())))
m = operator_rex.search(format_spec)
if m:
try:
comparison_value = int(m.group('value'))
except ValueError:
comparison_value = parse_filesize(m.group('value'))
if comparison_value is None:
comparison_value = parse_filesize(m.group('value') + 'B')
if comparison_value is None:
raise ValueError(
'Invalid value %r in format specification %r' % (
m.group('value'), format_spec))
op = OPERATORS[m.group('op')]
if not m:
STR_OPERATORS = {
'=': operator.eq,
'!=': operator.ne,
}
str_operator_rex = re.compile(r'''(?x)\s*\[
\s*(?P<key>ext|acodec|vcodec|container|protocol)
\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
\s*(?P<value>[a-zA-Z0-9_-]+)
\s*\]$
''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
m = str_operator_rex.search(format_spec)
if m:
comparison_value = m.group('value')
op = STR_OPERATORS[m.group('op')]
if not m:
raise ValueError('Invalid format specification %r' % format_spec)
try:
comparison_value = int(m.group('value'))
except ValueError:
comparison_value = parse_filesize(m.group('value'))
if comparison_value is None:
comparison_value = parse_filesize(m.group('value') + 'B')
if comparison_value is None:
raise ValueError(
'Invalid value %r in format specification %r' % (
m.group('value'), format_spec))
op = OPERATORS[m.group('op')]
def _filter(f):
actual_value = f.get(m.group('key'))
if actual_value is None:
@@ -938,6 +955,9 @@ class YoutubeDL(object):
def has_header(self, h):
return h in self.headers
def get_header(self, h, default=None):
return self.headers.get(h, default)
pr = _PseudoRequest(info_dict['url'])
self.cookiejar.add_cookie_header(pr)
return pr.headers.get('Cookie')
@@ -1076,7 +1096,8 @@ class YoutubeDL(object):
else self.params['merge_output_format'])
selected_format = {
'requested_formats': formats_info,
'format': rf,
'format': '%s+%s' % (formats_info[0].get('format'),
formats_info[1].get('format')),
'format_id': '%s+%s' % (formats_info[0].get('format_id'),
formats_info[1].get('format_id')),
'width': formats_info[0].get('width'),

View File

@@ -6,6 +6,7 @@ from .academicearth import AcademicEarthCourseIE
from .addanime import AddAnimeIE
from .adobetv import AdobeTVIE
from .adultswim import AdultSwimIE
from .aftenposten import AftenpostenIE
from .aftonbladet import AftonbladetIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
@@ -475,6 +476,7 @@ from .tutv import TutvIE
from .tvigle import TvigleIE
from .tvp import TvpIE, TvpSeriesIE
from .tvplay import TVPlayIE
from .tweakers import TweakersIE
from .twentyfourvideo import TwentyFourVideoIE
from .twitch import (
TwitchVideoIE,

View File

@@ -0,0 +1,103 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
xpath_with_ns,
xpath_text,
find_xpath_attr,
)
class AftenpostenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/([^/]+/)*(?P<id>[^/]+)-\d+\.html'
_TEST = {
'url': 'http://www.aftenposten.no/webtv/serier-og-programmer/sweatshopenglish/TRAILER-SWEATSHOP---I-cant-take-any-more-7800835.html?paging=&section=webtv_serierogprogrammer_sweatshop_sweatshopenglish',
'md5': 'fd828cd29774a729bf4d4425fe192972',
'info_dict': {
'id': '21039',
'ext': 'mov',
'title': 'TRAILER: "Sweatshop" - I can´t take any more',
'description': 'md5:21891f2b0dd7ec2f78d84a50e54f8238',
'timestamp': 1416927969,
'upload_date': '20141125',
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._html_search_regex(
r'data-xs-id="(\d+)"', webpage, 'video id')
data = self._download_xml(
'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)
NS_MAP = {
'atom': 'http://www.w3.org/2005/Atom',
'xt': 'http://xstream.dk/',
'media': 'http://search.yahoo.com/mrss/',
}
entry = data.find(xpath_with_ns('./atom:entry', NS_MAP))
title = xpath_text(
entry, xpath_with_ns('./atom:title', NS_MAP), 'title')
description = xpath_text(
entry, xpath_with_ns('./atom:summary', NS_MAP), 'description')
timestamp = parse_iso8601(xpath_text(
entry, xpath_with_ns('./atom:published', NS_MAP), 'upload date'))
formats = []
media_group = entry.find(xpath_with_ns('./media:group', NS_MAP))
for media_content in media_group.findall(xpath_with_ns('./media:content', NS_MAP)):
media_url = media_content.get('url')
if not media_url:
continue
tbr = int_or_none(media_content.get('bitrate'))
mobj = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', media_url)
if mobj:
formats.append({
'url': mobj.group('url'),
'play_path': 'mp4:%s' % mobj.group('playpath'),
'app': mobj.group('app'),
'ext': 'flv',
'tbr': tbr,
'format_id': 'rtmp-%d' % tbr,
})
else:
formats.append({
'url': media_url,
'tbr': tbr,
})
self._sort_formats(formats)
link = find_xpath_attr(
entry, xpath_with_ns('./atom:link', NS_MAP), 'rel', 'original')
if link is not None:
formats.append({
'url': link.get('href'),
'format_id': link.get('rel'),
})
thumbnails = [{
'url': splash.get('url'),
'width': int_or_none(splash.get('width')),
'height': int_or_none(splash.get('height')),
} for splash in media_group.findall(xpath_with_ns('./xt:splash', NS_MAP))]
return {
'id': video_id,
'title': title,
'description': description,
'timestamp': timestamp,
'formats': formats,
'thumbnails': thumbnails,
}

View File

@@ -20,6 +20,7 @@ class AparatIE(InfoExtractor):
'id': 'wP8On',
'ext': 'mp4',
'title': 'تیم گلکسی 11 - زومیت',
'age_limit': 0,
},
# 'skip': 'Extremely unreliable',
}
@@ -34,7 +35,8 @@ class AparatIE(InfoExtractor):
video_id + '/vt/frame')
webpage = self._download_webpage(embed_url, video_id)
video_urls = re.findall(r'fileList\[[0-9]+\]\s*=\s*"([^"]+)"', webpage)
video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
for i, video_url in enumerate(video_urls):
req = HEADRequest(video_url)
res = self._request_webpage(
@@ -46,7 +48,7 @@ class AparatIE(InfoExtractor):
title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
thumbnail = self._search_regex(
r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
return {
'id': video_id,
@@ -54,4 +56,5 @@ class AparatIE(InfoExtractor):
'url': video_url,
'ext': 'mp4',
'thumbnail': thumbnail,
'age_limit': self._family_friendly_search(webpage),
}

View File

@@ -656,6 +656,21 @@ class InfoExtractor(object):
}
return RATING_TABLE.get(rating.lower(), None)
def _family_friendly_search(self, html):
# See http://schema.org/VideoObj
family_friendly = self._html_search_meta('isFamilyFriendly', html)
if not family_friendly:
return None
RATING_TABLE = {
'1': 0,
'true': 0,
'0': 18,
'false': 18,
}
return RATING_TABLE.get(family_friendly.lower(), None)
def _twitter_search_player(self, html):
return self._html_search_meta('twitter:player', html,
'twitter card player')
@@ -707,9 +722,9 @@ class InfoExtractor(object):
f.get('quality') if f.get('quality') is not None else -1,
f.get('tbr') if f.get('tbr') is not None else -1,
f.get('vbr') if f.get('vbr') is not None else -1,
ext_preference,
f.get('height') if f.get('height') is not None else -1,
f.get('width') if f.get('width') is not None else -1,
ext_preference,
f.get('abr') if f.get('abr') is not None else -1,
audio_ext_preference,
f.get('fps') if f.get('fps') is not None else -1,
@@ -765,7 +780,7 @@ class InfoExtractor(object):
self.to_screen(msg)
time.sleep(timeout)
def _extract_f4m_formats(self, manifest_url, video_id):
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None):
manifest = self._download_xml(
manifest_url, video_id, 'Downloading f4m manifest',
'Unable to download f4m manifest')
@@ -778,26 +793,28 @@ class InfoExtractor(object):
media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
for i, media_el in enumerate(media_nodes):
if manifest_version == '2.0':
manifest_url = '/'.join(manifest_url.split('/')[:-1]) + '/' + media_el.attrib.get('href')
manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/'
+ (media_el.attrib.get('href') or media_el.attrib.get('url')))
tbr = int_or_none(media_el.attrib.get('bitrate'))
format_id = 'f4m-%d' % (i if tbr is None else tbr)
formats.append({
'format_id': format_id,
'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
'url': manifest_url,
'ext': 'flv',
'tbr': tbr,
'width': int_or_none(media_el.attrib.get('width')),
'height': int_or_none(media_el.attrib.get('height')),
'preference': preference,
})
self._sort_formats(formats)
return formats
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
entry_protocol='m3u8', preference=None):
entry_protocol='m3u8', preference=None,
m3u8_id=None):
formats = [{
'format_id': 'm3u8-meta',
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-meta'])),
'url': m3u8_url,
'ext': ext,
'protocol': 'm3u8',
@@ -833,9 +850,8 @@ class InfoExtractor(object):
formats.append({'url': format_url(line)})
continue
tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
f = {
'format_id': 'm3u8-%d' % (tbr if tbr else len(formats)),
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-%d' % (tbr if tbr else len(formats))])),
'url': format_url(line.strip()),
'tbr': tbr,
'ext': ext,
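
The new f4m_id and m3u8_id parameters prefix each generated format_id with an identifier when one is given. A minimal sketch of that joining idiom, assuming nothing beyond the expressions shown in the hunks above; the helper names prefixed_format_id and f4m_format_id are hypothetical and not part of the extractor API.

```python
def prefixed_format_id(prefix, base):
    """Join an optional prefix and a base id with '-', skipping empty parts."""
    # filter(None, ...) drops None and empty strings, so a missing prefix
    # leaves the base id unchanged.
    return '-'.join(filter(None, [prefix, base]))


def f4m_format_id(index, tbr, f4m_id=None):
    """Build an f4m format id from the bitrate, or the index if it is unknown."""
    base = 'f4m-%d' % (index if tbr is None else tbr)
    return prefixed_format_id(f4m_id, base)
```

Because the prefix is optional, callers that do not pass an identifier keep getting the same format ids as before the change.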

View File

@@ -1,41 +1,67 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
xpath_text,
xpath_with_ns,
)
class GamekingsIE(InfoExtractor):
_VALID_URL = r'http://www\.gamekings\.tv/videos/(?P<name>[0-9a-z\-]+)'
_TEST = {
_VALID_URL = r'http://www\.gamekings\.tv/(?:videos|nieuws)/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.gamekings.tv/videos/phoenix-wright-ace-attorney-dual-destinies-review/',
# MD5 is flaky, seems to change regularly
# 'md5': '2f32b1f7b80fdc5cb616efb4f387f8a3',
'info_dict': {
'id': '20130811',
'id': 'phoenix-wright-ace-attorney-dual-destinies-review',
'ext': 'mp4',
'title': 'Phoenix Wright: Ace Attorney \u2013 Dual Destinies Review',
'description': 'md5:36fd701e57e8c15ac8682a2374c99731',
}
}
'thumbnail': 're:^https?://.*\.jpg$',
},
}, {
# vimeo video
'url': 'http://www.gamekings.tv/videos/the-legend-of-zelda-majoras-mask/',
'md5': '12bf04dfd238e70058046937657ea68d',
'info_dict': {
'id': 'the-legend-of-zelda-majoras-mask',
'ext': 'mp4',
'title': 'The Legend of Zelda: Majoras Mask',
'description': 'md5:9917825fe0e9f4057601fe1e38860de3',
'thumbnail': 're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.gamekings.tv/nieuws/gamekings-extra-shelly-en-david-bereiden-zich-voor-op-de-livestream/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
name = mobj.group('name')
webpage = self._download_webpage(url, name)
video_url = self._og_search_video_url(webpage)
webpage = self._download_webpage(url, video_id)
video = re.search(r'[0-9]+', video_url)
video_id = video.group(0)
playlist_id = self._search_regex(
r'gogoVideo\(\s*\d+\s*,\s*"([^"]+)', webpage, 'playlist id')
# Todo: add medium format
video_url = video_url.replace(video_id, 'large/' + video_id)
playlist = self._download_xml(
'http://www.gamekings.tv/wp-content/themes/gk2010/rss_playlist.php?id=%s' % playlist_id,
video_id)
NS_MAP = {
'jwplayer': 'http://rss.jwpcdn.com/'
}
item = playlist.find('./channel/item')
thumbnail = xpath_text(item, xpath_with_ns('./jwplayer:image', NS_MAP), 'thumbnail')
video_url = item.find(xpath_with_ns('./jwplayer:source', NS_MAP)).get('file')
return {
'id': video_id,
'ext': 'mp4',
'url': video_url,
'title': self._og_search_title(webpage),
'description': self._og_search_description(webpage),
'thumbnail': thumbnail,
}

View File

@@ -34,8 +34,6 @@ class GoshgayIE(InfoExtractor):
duration = parse_duration(self._html_search_regex(
r'<span class="duration">\s*-?\s*(.*?)</span>',
webpage, 'duration', fatal=False))
family_friendly = self._html_search_meta(
'isFamilyFriendly', webpage, default='false')
flashvars = compat_parse_qs(self._html_search_regex(
r'<embed.+?id="flash-player-embed".+?flashvars="([^"]+)"',
@@ -49,5 +47,5 @@ class GoshgayIE(InfoExtractor):
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'age_limit': 0 if family_friendly == 'true' else 18,
'age_limit': self._family_friendly_search(webpage),
}

View File

@@ -80,9 +80,6 @@ class IzleseneIE(InfoExtractor):
r'comment_count\s*=\s*\'([^\']+)\';',
webpage, 'comment_count', fatal=False)
family_friendly = self._html_search_meta(
'isFamilyFriendly', webpage, 'age limit', fatal=False)
content_url = self._html_search_meta(
'contentURL', webpage, 'content URL', fatal=False)
ext = determine_ext(content_url, 'mp4')
@@ -120,6 +117,6 @@ class IzleseneIE(InfoExtractor):
'duration': duration,
'view_count': int_or_none(view_count),
'comment_count': int_or_none(comment_count),
'age_limit': 18 if family_friendly == 'False' else 0,
'age_limit': self._family_friendly_search(webpage),
'formats': formats,
}
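
The extractors above previously compared the isFamilyFriendly value against differently-cased literals ('true' in one, 'False' in another). The shared helper normalises the value before looking it up. A stand-alone sketch of that mapping, assuming the meta value has already been extracted as a string; the function name age_limit_from_family_friendly is hypothetical.

```python
def age_limit_from_family_friendly(value):
    """Map an isFamilyFriendly meta value to an age limit, or None if unknown."""
    if not value:
        return None
    rating_table = {
        '1': 0,
        'true': 0,
        '0': 18,
        'false': 18,
    }
    # Lower-casing makes 'True', 'true' and 'TRUE' equivalent.
    return rating_table.get(value.lower())
```

Returning None for a missing or unrecognised value differs from the old per-extractor code, which always fell back to either 0 or 18.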

View File

@@ -1,6 +1,6 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from .subtitles import SubtitlesInfoExtractor
from ..utils import (
fix_xml_ampersands,
parse_duration,
@@ -11,7 +11,7 @@ from ..utils import (
)
class NPOBaseIE(InfoExtractor):
class NPOBaseIE(SubtitlesInfoExtractor):
def _get_token(self, video_id):
token_page = self._download_webpage(
'http://ida.omroep.nl/npoplayer/i.js',
@@ -161,6 +161,16 @@ class NPOIE(NPOBaseIE):
self._sort_formats(formats)
subtitles = {}
if metadata.get('tt888') == 'ja':
subtitles['nl'] = 'http://e.omroep.nl/tt888/%s' % video_id
if self._downloader.params.get('listsubtitles', False):
self._list_available_subtitles(video_id, subtitles)
return
subtitles = self.extract_subtitles(video_id, subtitles)
return {
'id': video_id,
'title': metadata['titel'],
@@ -169,6 +179,7 @@ class NPOIE(NPOBaseIE):
'upload_date': unified_strdate(metadata.get('gidsdatum')),
'duration': parse_duration(metadata.get('tijdsduur')),
'formats': formats,
'subtitles': subtitles,
}

View File

@@ -1,16 +1,16 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import js_to_json
class RTPIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)/?'
_TESTS = [{
'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
'md5': 'e736ce0c665e459ddb818546220b4ef8',
'info_dict': {
'id': 'e174042',
'ext': 'mp3',
@@ -18,9 +18,6 @@ class RTPIE(InfoExtractor):
'description': 'As paixões musicais de António Cartaxo e António Macedo',
'thumbnail': 're:^https?://.*\.jpg',
},
'params': {
'skip_download': True, # RTMP download
},
}, {
'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
'only_matching': True,
@@ -37,20 +34,48 @@ class RTPIE(InfoExtractor):
player_config = self._search_regex(
r'(?s)RTPPLAY\.player\.newPlayer\(\s*(\{.*?\})\s*\)', webpage, 'player config')
config = json.loads(js_to_json(player_config))
config = self._parse_json(player_config, video_id)
path, ext = config.get('file').rsplit('.', 1)
formats = [{
'format_id': 'rtmp',
'ext': ext,
'vcodec': config.get('type') == 'audio' and 'none' or None,
'preference': -2,
'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
'app': config.get('application'),
'play_path': '{ext:s}:{path:s}'.format(ext=ext, path=path),
'page_url': url,
'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
'rtmp_live': config.get('live', False),
'ext': ext,
'vcodec': config.get('type') == 'audio' and 'none' or None,
'player_url': 'http://programas.rtp.pt/play/player.swf?v3',
'rtmp_real_time': True,
}]
# Construct regular HTTP download URLs
replacements = {
'audio': {
'format_id': 'mp3',
'pattern': r'^nas2\.share/wavrss/',
'repl': 'http://rsspod.rtp.pt/podcasts/',
'vcodec': 'none',
},
'video': {
'format_id': 'mp4_h264',
'pattern': r'^nas2\.share/h264/',
'repl': 'http://rsspod.rtp.pt/videocasts/',
'vcodec': 'h264',
},
}
r = replacements[config['type']]
if re.match(r['pattern'], config['file']) is not None:
formats.append({
'format_id': r['format_id'],
'url': re.sub(r['pattern'], r['repl'], config['file']),
'vcodec': r['vcodec'],
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
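
The HTTP URL construction in the hunk above is a prefix rewrite driven by the media type. A small sketch of the same idea, assuming a config dict with 'type' and 'file' keys as in the diff. The function name http_format and the sample file path in the usage note are hypothetical, and unlike the diff this sketch returns None for an unknown type instead of raising KeyError.

```python
import re

# Prefix rewrites per media type, copied from the hunk above.
REPLACEMENTS = {
    'audio': {
        'format_id': 'mp3',
        'pattern': r'^nas2\.share/wavrss/',
        'repl': 'http://rsspod.rtp.pt/podcasts/',
        'vcodec': 'none',
    },
    'video': {
        'format_id': 'mp4_h264',
        'pattern': r'^nas2\.share/h264/',
        'repl': 'http://rsspod.rtp.pt/videocasts/',
        'vcodec': 'h264',
    },
}


def http_format(config):
    """Return an HTTP format dict for config, or None if no rewrite applies."""
    r = REPLACEMENTS.get(config.get('type'))
    if r is None:
        return None
    if re.match(r['pattern'], config['file']) is None:
        return None
    return {
        'format_id': r['format_id'],
        'url': re.sub(r['pattern'], r['repl'], config['file']),
        'vcodec': r['vcodec'],
    }
```

With a made-up path such as 'nas2.share/wavrss/example/file.mp3' and type 'audio', the resulting URL would start with http://rsspod.rtp.pt/podcasts/.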

View File

@@ -6,12 +6,14 @@ import re
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urllib_parse_urlparse,
)
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
unescapeHTML,
xpath_text,
)
@@ -159,11 +161,27 @@ class RTSIE(InfoExtractor):
return int_or_none(self._search_regex(
r'-([0-9]+)k\.', url, 'bitrate', default=None))
formats = [{
'format_id': fid,
'url': furl,
'tbr': extract_bitrate(furl),
} for fid, furl in info['streams'].items()]
formats = []
for format_id, format_url in info['streams'].items():
if format_url.endswith('.f4m'):
token = self._download_xml(
'http://tp.srgssr.ch/token/akahd.xml?stream=%s/*' % compat_urllib_parse_urlparse(format_url).path,
video_id, 'Downloading %s token' % format_id)
auth_params = xpath_text(token, './/authparams', 'auth params')
if not auth_params:
continue
formats.extend(self._extract_f4m_formats(
'%s?%s&hdcore=3.4.0&plugin=aasp-3.4.0.132.66' % (format_url, auth_params),
video_id, f4m_id=format_id))
elif format_url.endswith('.m3u8'):
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', m3u8_id=format_id))
else:
formats.append({
'format_id': format_id,
'url': format_url,
'tbr': extract_bitrate(format_url),
})
if 'media' in info:
formats.extend([{

View File

@@ -1,80 +0,0 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
HEADRequest,
urlhandle_detect_ext,
)
class SoulAnimeWatchingIE(InfoExtractor):
IE_NAME = "soulanime:watching"
IE_DESC = "SoulAnime video"
_TEST = {
'url': 'http://www.soul-anime.net/watching/seirei-tsukai-no-blade-dance-episode-9/',
'md5': '05fae04abf72298098b528e98abf4298',
'info_dict': {
'id': 'seirei-tsukai-no-blade-dance-episode-9',
'ext': 'mp4',
'title': 'seirei-tsukai-no-blade-dance-episode-9',
'description': 'seirei-tsukai-no-blade-dance-episode-9'
}
}
_VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/watch[^/]*/(?P<id>[^/]+)'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
domain = mobj.group('domain')
page = self._download_webpage(url, video_id)
video_url_encoded = self._html_search_regex(
r'<div id="download">[^<]*<a href="(?P<url>[^"]+)"', page, 'url')
video_url = "http://www.soul-anime." + domain + video_url_encoded
ext_req = HEADRequest(video_url)
ext_handle = self._request_webpage(
ext_req, video_id, note='Determining extension')
ext = urlhandle_detect_ext(ext_handle)
return {
'id': video_id,
'url': video_url,
'ext': ext,
'title': video_id,
'description': video_id
}
class SoulAnimeSeriesIE(InfoExtractor):
IE_NAME = "soulanime:series"
IE_DESC = "SoulAnime Series"
_VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/anime./(?P<id>[^/]+)'
_EPISODE_REGEX = r'<option value="(/watch[^/]*/[^"]+)">[^<]*</option>'
_TEST = {
'url': 'http://www.soul-anime.net/anime1/black-rock-shooter-tv/',
'info_dict': {
'id': 'black-rock-shooter-tv'
},
'playlist_count': 8
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
series_id = mobj.group('id')
domain = mobj.group('domain')
pattern = re.compile(self._EPISODE_REGEX)
page = self._download_webpage(url, series_id, "Downloading series page")
mobj = pattern.findall(page)
entries = [self.url_result("http://www.soul-anime." + domain + obj) for obj in mobj]
return self.playlist_result(entries, series_id)

View File

@@ -15,7 +15,8 @@ class TeamcocoIE(InfoExtractor):
'id': '80187',
'ext': 'mp4',
'title': 'Conan Becomes A Mary Kay Beauty Consultant',
'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.'
'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.',
'age_limit': 0,
}
}, {
'url': 'http://teamcoco.com/video/louis-ck-interview-george-w-bush',
@@ -24,7 +25,8 @@ class TeamcocoIE(InfoExtractor):
'id': '19705',
'ext': 'mp4',
"description": "Louis C.K. got starstruck by George W. Bush, so what? Part one.",
"title": "Louis C.K. Interview Pt. 1 11/3/11"
"title": "Louis C.K. Interview Pt. 1 11/3/11",
'age_limit': 0,
}
}
]
@@ -83,4 +85,5 @@ class TeamcocoIE(InfoExtractor):
'title': self._og_search_title(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'description': self._og_search_description(webpage),
'age_limit': self._family_friendly_search(webpage),
}

View File

@@ -1,6 +1,8 @@
# encoding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
float_or_none,
@@ -11,7 +13,7 @@ from ..utils import (
class TvigleIE(InfoExtractor):
IE_NAME = 'tvigle'
IE_DESC = 'Интернет-телевидение Tvigle.ru'
_VALID_URL = r'http://(?:www\.)?tvigle\.ru/(?:[^/]+/)+(?P<id>[^/]+)/$'
_VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'
_TESTS = [
{
@@ -38,16 +40,22 @@ class TvigleIE(InfoExtractor):
'duration': 186.080,
'age_limit': 0,
},
},
}, {
'url': 'https://cloud.tvigle.ru/video/5267604/',
'only_matching': True,
}
]
def _real_extract(self, url):
display_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('display_id')
webpage = self._download_webpage(url, display_id)
video_id = self._html_search_regex(
r'<li class="video-preview current_playing" id="(\d+)">', webpage, 'video id')
if not video_id:
webpage = self._download_webpage(url, display_id)
video_id = self._html_search_regex(
r'<li class="video-preview current_playing" id="(\d+)">',
webpage, 'video id')
video_data = self._download_json(
'http://cloud.tvigle.ru/api/play/video/%s/' % video_id, display_id)
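
The updated _VALID_URL has two alternatives: regular pages yield only a display_id, while cloud URLs yield the numeric id directly, which is why the webpage download becomes conditional. A sketch of how the two named groups behave, using the pattern from the diff; the function name parse_tvigle_url and the regular-page URL in the usage note are hypothetical.

```python
import re

# Pattern copied from the updated extractor above.
VALID_URL = (
    r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$'
    r'|cloud\.tvigle\.ru/video/(?P<id>\d+))')


def parse_tvigle_url(url):
    """Return (video_id, display_id); either may be None depending on the URL."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        raise ValueError('Unsupported URL: %r' % url)
    return mobj.group('id'), mobj.group('display_id')
```

For 'https://cloud.tvigle.ru/video/5267604/' only the id group is set, whereas a page URL such as 'http://www.tvigle.ru/video/some-category/some-title/' sets only display_id, so the id still has to be looked up in the page.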

View File

@@ -0,0 +1,65 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
xpath_text,
xpath_with_ns,
int_or_none,
float_or_none,
)
class TweakersIE(InfoExtractor):
_VALID_URL = r'https?://tweakers\.net/video/(?P<id>\d+)'
_TEST = {
'url': 'https://tweakers.net/video/9926/new-nintendo-3ds-xl-op-alle-fronten-beter.html',
'md5': '1b5afa817403bb5baa08359dca31e6df',
'info_dict': {
'id': '9926',
'ext': 'mp4',
'title': 'New Nintendo 3DS XL - Op alle fronten beter',
'description': 'md5:f97324cc71e86e11c853f0763820e3ba',
'thumbnail': 're:^https?://.*\.jpe?g$',
'duration': 386,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
playlist = self._download_xml(
'https://tweakers.net/video/s1playlist/%s/playlist.xspf' % video_id,
video_id)
NS_MAP = {
'xspf': 'http://xspf.org/ns/0/',
's1': 'http://static.streamone.nl/player/ns/0',
}
track = playlist.find(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP))
title = xpath_text(
track, xpath_with_ns('./xspf:title', NS_MAP), 'title')
description = xpath_text(
track, xpath_with_ns('./xspf:annotation', NS_MAP), 'description')
thumbnail = xpath_text(
track, xpath_with_ns('./xspf:image', NS_MAP), 'thumbnail')
duration = float_or_none(
xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'),
1000)
formats = [{
'url': location.text,
'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)),
'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))),
'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))),
} for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP))]
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}

View File

@@ -298,7 +298,9 @@ def parseOpts(overrideArguments=None):
' brackets, as in -f "best[height=720]"'
' (or -f "[filesize>10M]"). '
' This works for filesize, height, width, tbr, abr, vbr, and fps'
' and the comparisons <, <=, >, >=, =, != .'
' and the comparisons <, <=, >, >=, =, !='
' and for ext, acodec, vcodec, container and protocol'
' and the comparisons =, != .'
' Formats for which the value is not known are excluded unless you'
' put a question mark (?) after the operator.'
' You can combine format filters, so '

View File

@@ -166,14 +166,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
if filecodec is None:
raise PostProcessingError('WARNING: unable to obtain file audio codec with ffprobe')
uses_avconv = self._uses_avconv()
more_opts = []
if self._preferredcodec == 'best' or self._preferredcodec == filecodec or (self._preferredcodec == 'm4a' and filecodec == 'aac'):
if filecodec == 'aac' and self._preferredcodec in ['m4a', 'best']:
# Lossless, but in another container
acodec = 'copy'
extension = 'm4a'
more_opts = ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
more_opts = ['-bsf:a', 'aac_adtstoasc']
elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
# Lossless if possible
acodec = 'copy'
@@ -189,9 +188,9 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
more_opts = []
if self._preferredquality is not None:
if int(self._preferredquality) < 10:
more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
more_opts += ['-q:a', self._preferredquality]
else:
more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
more_opts += ['-b:a', self._preferredquality + 'k']
else:
# We convert the audio (lossy)
acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
@@ -200,13 +199,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
if self._preferredquality is not None:
# The opus codec doesn't support the -aq option
if int(self._preferredquality) < 10 and extension != 'opus':
more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
more_opts += ['-q:a', self._preferredquality]
else:
more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
more_opts += ['-b:a', self._preferredquality + 'k']
if self._preferredcodec == 'aac':
more_opts += ['-f', 'adts']
if self._preferredcodec == 'm4a':
more_opts += ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
more_opts += ['-bsf:a', 'aac_adtstoasc']
if self._preferredcodec == 'vorbis':
extension = 'ogg'
if self._preferredcodec == 'wav':
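
With the avconv/ffmpeg distinction removed, the quality handling reduces to one rule: values below 10 are passed as a VBR quality via -q:a, anything else as a bitrate via -b:a. A simplified sketch of that rule, assuming the quality arrives as a string as in the diff; the function name quality_opts is hypothetical, and the opus special case from the conversion branch is modelled with an optional extension argument.

```python
def quality_opts(preferredquality, extension=None):
    """Return the audio quality options for the given quality setting."""
    if preferredquality is None:
        return []
    # Per the comment in the diff, opus does not support the quality option,
    # so it always gets a bitrate.
    if int(preferredquality) < 10 and extension != 'opus':
        return ['-q:a', preferredquality]
    return ['-b:a', preferredquality + 'k']
```

The same option names are now emitted regardless of which binary is in use, which is the point of the change.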

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2015.02.04'
__version__ = '2015.02.08'