Compare commits

55 Commits

Author SHA1 Message Date
1ac1af9b47 release 2015.02.19.2 2015-02-19 01:43:28 +01:00
3bf5705316 [imgur] Add new extractor 2015-02-19 01:43:20 +01:00
1c2528c8a3 [cbs] Modernize 2015-02-19 01:22:50 +01:00
7bd15b1a03 release 2015.02.19.1 2015-02-19 01:04:24 +01:00
6b961a85fd [patreon] Add support for embedlies (fixes #4969) 2015-02-19 01:04:19 +01:00
7707004043 [patreon] Modernize 2015-02-19 00:38:05 +01:00
a025d3c5a5 release 2015.02.19 2015-02-19 00:31:23 +01:00
c460bdd56b [sandia] Add new extractor (#4974) 2015-02-19 00:31:01 +01:00
b81a359eb6 [YoutubeDL] Use render_table for format listing 2015-02-19 00:28:58 +01:00
d61aefb24c Merge remote-tracking branch 'origin/master' 2015-02-19 00:01:14 +01:00
d305dd73a3 [utils] Fix js_to_json
Previously, the runtime could be atrocious for longer inputs.
2015-02-18 23:59:51 +01:00
93a16ba238 [vimeo] Raise the ExtractorError with expected=True when no video password is given 2015-02-18 22:00:12 +01:00
85d5866177 [yahoo] Remove md5sum from test case
The md5 sum has changed repeatedly, and we check whether it looks like a video anyways nowadays.
2015-02-18 20:03:04 +01:00
9789d7535d [xtube] Fix test case 2015-02-18 19:58:41 +01:00
d8443cd3f7 [wsj] Correct test case 2015-02-18 19:56:24 +01:00
d47c26e168 [brightcove] Correct keys in playlists 2015-02-18 19:56:10 +01:00
81975f4693 release 2015.02.18.1 2015-02-18 10:54:56 +01:00
b8b928d5cb [README] Add an FAQ entry for the player change in anticipation of many more bug reports 2015-02-18 10:54:45 +01:00
3eff81fbf7 [jsinterp] Disable comment support
We need a proper lexer to be able to understand YouTube's code, which contains /* inside of strings.
For now it's sufficient to just disable comment support altogether.

Fixes #4976, fixes #4979, fixes #4980, fixes #4981, fixes #4982.
Closes #4977.
2015-02-18 10:47:42 +01:00
785521bf4f [youtube] Remove useless if 2015-02-18 10:42:23 +01:00
6d1a55a521 [youtube] Show entire player URL when -v is given 2015-02-18 10:39:14 +01:00
9cad27008b release 2015.02.18 2015-02-18 00:49:34 +01:00
11e611a7fa Extend various playlist tests 2015-02-18 00:49:10 +01:00
72c1f8de06 [bandcamp:album] Fix extractor results and associated test 2015-02-18 00:48:52 +01:00
6e99868e4c [buzzfeed] Fix playlist test case 2015-02-18 00:41:45 +01:00
4d278fde64 [ign] Amend playlist test 2015-02-18 00:38:55 +01:00
f21e915fb9 [test/helper] Render info_dict with a final comma 2015-02-18 00:38:42 +01:00
6f53c63df6 [test/helper] Only output a newline for forgotten keys if keys are really missing 2015-02-18 00:37:54 +01:00
1def5f359e [livestream] Correct playlist ID and add a test for it 2015-02-18 00:34:45 +01:00
15ec669374 [vk] Amend playlist test 2015-02-18 00:33:41 +01:00
a3fa5da496 [vimeo] Amend playlist tests 2015-02-18 00:33:31 +01:00
30965ac66a [vimeo] Prevent infinite loops if video password verification fails
We're seeing this in the tests¹ right now, which do not terminate.

¹  https://travis-ci.org/jaimeMF/youtube-dl/jobs/51135858
2015-02-18 00:27:58 +01:00
09ab40b7d1 Merge branch 'progress-as-hook2' 2015-02-17 23:41:48 +01:00
fa15607773 PEP8 fixes 2015-02-17 21:46:20 +01:00
a91a2c1a83 [downloader] Remove various unneeded assignments and imports 2015-02-17 21:44:41 +01:00
16e7711e22 [downloader/http] Remove gruesome import 2015-02-17 21:42:31 +01:00
5cda4eda72 [YoutubeDL] Use a progress hook for progress reporting
Instead of every downloader calling two helper functions, let our progress report be an ordinary progress hook like everyone else's.
Closes #4875.
2015-02-17 21:40:35 +01:00
98f000409f [radio.de] Fix extraction 2015-02-17 21:40:09 +01:00
4a8d4a53b1 [videolecturesnet] Fix rtmp stream glitches (Closes #4968) 2015-02-18 01:16:49 +06:00
4cd95bcbc3 [twitch:stream] Prefer the 'source' format (fixes #4972) 2015-02-17 18:57:01 +01:00
be24c8697f release 2015.02.17.2 2015-02-17 17:38:31 +01:00
0d93378887 [videolecturesnet] Check http format URLs (Closes #4968) 2015-02-17 22:35:27 +06:00
4069766c52 [extractor/common] Test URLs with GET 2015-02-17 22:35:27 +06:00
7010577720 release 2015.02.17.1 2015-02-17 17:35:08 +01:00
8ac27a68e6 [hls] Switch to available as a property 2015-02-17 17:35:03 +01:00
46312e0b46 release 2015.02.17 2015-02-17 17:29:32 +01:00
f9216ed6ad Merge remote-tracking branch 'origin/master' 2015-02-17 17:28:51 +01:00
65bf37ef83 [ffmpeg] Remove trivial helper method 2015-02-17 17:27:29 +01:00
f740fae2a4 [ffmpeg] Make available a property 2015-02-17 17:26:41 +01:00
fbc503d696 [downloader/hls] Fix detection of ffmpeg/avconv (reported in #4966) 2015-02-17 16:40:42 +01:00
662435f728 [YoutubeDL] Use a Request object for getting the cookies (fixes #4970)
So that we don't have to implement all the methods used by the cookiejar.
2015-02-17 16:29:24 +01:00
163d966707 [downloader/external] curl: Add the '--location' flag
curl doesn't follow redirections by default
2015-02-17 16:21:02 +01:00
85729c51af [downloader] Add --hls-prefer-native to use the native HLS downloader (#4966) 2015-02-17 12:09:12 +01:00
1db5fbcfe3 release 2015.02.16.1 2015-02-16 15:47:13 +01:00
59b8ab5834 [rtlnl|generic] Add support for rtl.nl embeds (Fixes #4959) 2015-02-16 15:45:45 +01:00
45 changed files with 642 additions and 225 deletions

View File

@ -161,6 +161,8 @@ which means you can modify it, redistribute it or use it however you like.
--playlist-reverse Download playlist videos in reverse order
--xattr-set-filesize (experimental) set file xattribute
ytdl.filesize with expected filesize
--hls-prefer-native (experimental) Use the native HLS
downloader instead of ffmpeg.
--external-downloader COMMAND (experimental) Use the specified external
downloader. Currently supports
aria2c,curl,wget
@ -513,11 +515,15 @@ If you want to play the video on a machine that is not running youtube-dl, you c
### ERROR: no fmt_url_map or conn information found in video info
youtube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
YouTube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### ERROR: unable to download video ###
youtube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
YouTube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### ExtractorError: Could not find JS function u'OF'
In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### SyntaxError: Non-ASCII character ###

View File

@ -121,6 +121,7 @@
- **EllenTV**
- **EllenTV:clips**
- **ElPais**: El País
- **Embedly**
- **EMPFlix**
- **Engadget**
- **Eporner**
@ -190,6 +191,7 @@
- **ign.com**
- **imdb**: Internet Movie Database trailers
- **imdb:list**: Internet Movie Database lists
- **Imgur**
- **Ina**
- **InfoQ**
- **Instagram**
@ -338,9 +340,9 @@
- **Roxwel**
- **RTBF**
- **Rte**
- **rtl.nl**: rtl.nl and rtlxl.nl
- **RTL2**
- **RTLnow**
- **rtlxl.nl**
- **RTP**
- **RTS**: RTS.ch
- **rtve.es:alacarta**: RTVE a la carta
@ -352,6 +354,7 @@
- **rutube:movie**: Rutube movies
- **rutube:person**: Rutube person videos
- **RUTV**: RUTV.RU
- **Sandia**: Sandia National Laboratories
- **Sapo**: SAPO Vídeos
- **savefrom.net**
- **SBS**: sbs.com.au

View File

@ -113,6 +113,16 @@ def expect_info_dict(self, got_dict, expected_dict):
self.assertTrue(
got.startswith(start_str),
'field %s (value: %r) should start with %r' % (info_field, got, start_str))
elif isinstance(expected, compat_str) and expected.startswith('contains:'):
got = got_dict.get(info_field)
contains_str = expected[len('contains:'):]
self.assertTrue(
isinstance(got, compat_str),
'Expected a %s object, but got %s for field %s' % (
compat_str.__name__, type(got).__name__, info_field))
self.assertTrue(
contains_str in got,
'field %s (value: %r) should contain %r' % (info_field, got, contains_str))
elif isinstance(expected, type):
got = got_dict.get(info_field)
self.assertTrue(isinstance(got, expected),
@ -163,12 +173,14 @@ def expect_info_dict(self, got_dict, expected_dict):
info_dict_str += ''.join(
' %s: %s,\n' % (_repr(k), _repr(v))
for k, v in test_info_dict.items() if k not in missing_keys)
info_dict_str += '\n'
if info_dict_str:
info_dict_str += '\n'
info_dict_str += ''.join(
' %s: %s,\n' % (_repr(k), _repr(test_info_dict[k]))
for k in missing_keys)
write_string(
'\n\'info_dict\': {\n' + info_dict_str + '}\n', out=sys.stderr)
'\n\'info_dict\': {\n' + info_dict_str + '},\n', out=sys.stderr)
self.assertFalse(
missing_keys,
'Missing keys in test definition: %s' % (
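
Note on the hunk above: the new `contains:` prefix sits next to the existing `startswith:` check. The matching rule can be sketched outside the test harness as follows. This is a minimal sketch for Python 3; `field_matches` is a hypothetical helper for illustration, not a function in youtube-dl, and it returns a boolean where the real helper raises assertion failures.

```python
def field_matches(expected, got):
    """Hypothetical helper mirroring the 'startswith:' and 'contains:' checks."""
    checks = (
        ('startswith:', lambda value, arg: value.startswith(arg)),
        ('contains:', lambda value, arg: arg in value),
    )
    for prefix, check in checks:
        if expected.startswith(prefix):
            # Like the test helper, require a string before comparing.
            return isinstance(got, str) and check(got, expected[len(prefix):])
    # No recognised prefix: fall back to plain equality.
    return got == expected
```

In a test definition, this corresponds to writing an expected value such as `'contains:Sandia'` for a field whose exact text is not stable.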

View File

@ -70,6 +70,8 @@ class TestJSInterpreter(unittest.TestCase):
self.assertEqual(jsi.call_function('f'), -11)
def test_comments(self):
'Skipping: Not yet fully implemented'
return
jsi = JSInterpreter('''
function x() {
var x = /* 1 + */ 2;
@ -80,6 +82,15 @@ class TestJSInterpreter(unittest.TestCase):
''')
self.assertEqual(jsi.call_function('x'), 52)
jsi = JSInterpreter('''
function f() {
var x = "/*";
var y = 1 /* comment */ + 2;
return y;
}
''')
self.assertEqual(jsi.call_function('f'), 3)
def test_precedence(self):
jsi = JSInterpreter('''
function x() {

View File

@ -370,6 +370,10 @@ class TestUtil(unittest.TestCase):
"playlist":[{"controls":{"all":null}}]
}''')
inp = '"SAND Number: SAND 2013-7800P\\nPresenter: Tom Russo\\nHabanero Software Training - Xyce Software\\nXyce, Sandia\\u0027s"'
json_code = js_to_json(inp)
self.assertEqual(json.loads(json_code), json.loads(inp))
def test_js_to_json_edgecases(self):
on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})

View File

@ -64,6 +64,12 @@ _TESTS = [
'js',
'4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
'82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B'
),
(
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
'js',
'312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
'112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
)
]

View File

@ -199,18 +199,25 @@ class YoutubeDL(object):
postprocessor.
progress_hooks: A list of functions that get called on download
progress, with a dictionary with the entries
* status: One of "downloading" and "finished".
* status: One of "downloading", "error", or "finished".
Check this first and ignore unknown values.
If status is one of "downloading" or "finished", the
If status is one of "downloading", or "finished", the
following properties may also be present:
* filename: The final filename (always present)
* tmpfilename: The filename we're currently writing to
* downloaded_bytes: Bytes on disk
* total_bytes: Size of the whole file, None if unknown
* tmpfilename: The filename we're currently writing to
* total_bytes_estimate: Guess of the eventual file size,
None if unavailable.
* elapsed: The number of seconds since download started.
* eta: The estimated time in seconds, None if unknown
* speed: The download speed in bytes/second, None if
unknown
* fragment_index: The counter of the currently
downloaded video fragment.
* fragment_count: The number of fragments (= individual
files that will be merged)
Progress hooks are guaranteed to be called at least once
(with status "finished") if the download is successful.
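
Note on the docstring above: a hook should check `status` first and ignore values it does not know. A minimal sketch of such a hook follows. `describe_progress` is a hypothetical name, and it returns a string only so the behaviour is easy to inspect; a real hook would more likely print or log.

```python
def describe_progress(d):
    """Hypothetical progress hook: summarise the status dictionary."""
    status = d.get('status')
    if status == 'finished':
        return 'finished %s' % d['filename']
    if status == 'error':
        # The error dictionary may carry fewer keys, so read defensively.
        return 'error after %s bytes' % d.get('downloaded_bytes', '?')
    if status == 'downloading':
        return 'downloading %s (%s bytes so far)' % (
            d['filename'], d.get('downloaded_bytes', '?'))
    # Unknown status values are ignored, as the docstring asks.
    return None
```

Per the docstring, a function like this would be passed in the `progress_hooks` list of the options dictionary.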
@ -225,7 +232,6 @@ class YoutubeDL(object):
call_home: Boolean, true iff we are allowed to contact the
youtube-dl servers for debugging.
sleep_interval: Number of seconds to sleep before each download.
external_downloader: Executable of the external downloader to call.
listformats: Print an overview of available video formats and exit.
list_thumbnails: Print a table of all thumbnails and exit.
match_filter: A function that gets called with the info_dict of
@ -235,6 +241,10 @@ class YoutubeDL(object):
match_filter_func in utils.py is one example for this.
no_color: Do not emit color codes in output.
The following options determine which downloader is picked:
external_downloader: Executable of the external downloader to call.
None or unset for standard (built-in) downloader.
hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv.
The following parameters are not used by YoutubeDL itself, they are used by
the FileDownloader:
@ -951,30 +961,9 @@ class YoutubeDL(object):
return res
def _calc_cookies(self, info_dict):
class _PseudoRequest(object):
def __init__(self, url):
self.url = url
self.headers = {}
self.unverifiable = False
def add_unredirected_header(self, k, v):
self.headers[k] = v
def get_full_url(self):
return self.url
def is_unverifiable(self):
return self.unverifiable
def has_header(self, h):
return h in self.headers
def get_header(self, h, default=None):
return self.headers.get(h, default)
pr = _PseudoRequest(info_dict['url'])
pr = compat_urllib_request.Request(info_dict['url'])
self.cookiejar.add_cookie_header(pr)
return pr.headers.get('Cookie')
return pr.get_header('Cookie')
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
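
Note on the `_calc_cookies` hunk above: a real request object already implements everything the cookie jar calls. The sketch below uses the Python 3 standard library directly instead of the `compat_urllib_request` wrapper; `cookie_header_for` and `demo_jar` are hypothetical names and the cookie values are made up. The jar attaches `Cookie` as an unredirected header, so `get_header` is the reliable way to read it back and `headers.get` may miss it.

```python
from http.cookiejar import Cookie, CookieJar
from urllib.request import Request


def cookie_header_for(jar, url):
    """Return the Cookie header the jar would send for url, or None."""
    req = Request(url)
    jar.add_cookie_header(req)
    # get_header also consults unredirected headers, where the jar puts Cookie.
    return req.get_header('Cookie')


def demo_jar():
    """Build a jar holding one made-up cookie for example.com."""
    jar = CookieJar()
    jar.set_cookie(Cookie(
        0, 'k', 'v', None, False,
        'example.com', True, False,
        '/', True, False, None, True, None, None, {}))
    return jar
```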
@ -1298,7 +1287,7 @@ class YoutubeDL(object):
downloaded = []
success = True
merger = FFmpegMergerPP(self, not self.params.get('keepvideo'))
if not merger.available():
if not merger.available:
postprocessors = []
self.report_warning('You have requested multiple '
'formats but ffmpeg or avconv are not installed.'
@ -1545,29 +1534,18 @@ class YoutubeDL(object):
return res
def list_formats(self, info_dict):
def line(format, idlen=20):
return (('%-' + compat_str(idlen + 1) + 's%-10s%-12s%s') % (
format['format_id'],
format['ext'],
self.format_resolution(format),
self._format_note(format),
))
formats = info_dict.get('formats', [info_dict])
idlen = max(len('format code'),
max(len(f['format_id']) for f in formats))
formats_s = [
line(f, idlen) for f in formats
table = [
[f['format_id'], f['ext'], self.format_resolution(f), self._format_note(f)]
for f in formats
if f.get('preference') is None or f['preference'] >= -1000]
if len(formats) > 1:
formats_s[-1] += (' ' if self._format_note(formats[-1]) else '') + '(best)'
table[-1][-1] += (' ' if table[-1][-1] else '') + '(best)'
header_line = line({
'format_id': 'format code', 'ext': 'extension',
'resolution': 'resolution', 'format_note': 'note'}, idlen=idlen)
header_line = ['format code', 'extension', 'resolution', 'note']
self.to_screen(
'[info] Available formats for %s:\n%s\n%s' %
(info_dict['id'], header_line, '\n'.join(formats_s)))
'[info] Available formats for %s:\n%s' %
(info_dict['id'], render_table(header_line, table)))
def list_thumbnails(self, info_dict):
thumbnails = info_dict.get('thumbnails')

View File

@ -351,6 +351,7 @@ def _real_main(argv=None):
'match_filter': match_filter,
'no_color': opts.no_color,
'ffmpeg_location': opts.ffmpeg_location,
'hls_prefer_native': opts.hls_prefer_native,
}
with YoutubeDL(ydl_opts) as ydl:

View File

@ -34,6 +34,9 @@ def get_suitable_downloader(info_dict, params={}):
if ed.supports(info_dict):
return ed
if protocol == 'm3u8' and params.get('hls_prefer_native'):
return NativeHlsFD
return PROTOCOL_MAP.get(protocol, HttpFD)
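
Note on the hunk above: the new check runs before the protocol map lookup, so the option only affects `m3u8` downloads. A minimal sketch of that ordering follows. The classes are empty stand-ins, `FfmpegHlsFD` is a hypothetical name, the contents of `PROTOCOL_MAP` are assumed, and the external-downloader check that precedes this code in the source is left out.

```python
class HttpFD(object):
    """Stand-in for the default HTTP downloader."""


class FfmpegHlsFD(object):
    """Hypothetical stand-in for the ffmpeg-based HLS downloader."""


class NativeHlsFD(object):
    """Stand-in for the native HLS downloader."""


# Assumed mapping for illustration; the real map is not shown in the diff.
PROTOCOL_MAP = {'m3u8': FfmpegHlsFD}


def pick_downloader(protocol, params=None):
    params = params or {}
    # The native HLS downloader is used only when explicitly requested.
    if protocol == 'm3u8' and params.get('hls_prefer_native'):
        return NativeHlsFD
    # Everything else falls back to the map, then to plain HTTP.
    return PROTOCOL_MAP.get(protocol, HttpFD)
```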

View File

@ -1,4 +1,4 @@
from __future__ import unicode_literals
from __future__ import division, unicode_literals
import os
import re
@ -54,6 +54,7 @@ class FileDownloader(object):
self.ydl = ydl
self._progress_hooks = []
self.params = params
self.add_progress_hook(self.report_progress)
@staticmethod
def format_seconds(seconds):
@ -226,42 +227,64 @@ class FileDownloader(object):
self.to_screen(clear_line + fullmsg, skip_eol=not is_last_line)
self.to_console_title('youtube-dl ' + msg)
def report_progress(self, percent, data_len_str, speed, eta):
"""Report download progress."""
if self.params.get('noprogress', False):
def report_progress(self, s):
if s['status'] == 'finished':
if self.params.get('noprogress', False):
self.to_screen('[download] Download completed')
else:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
if s.get('elapsed') is not None:
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
else:
msg_template = '100%% of %(_total_bytes_str)s'
self._report_progress_status(
msg_template % s, is_last_line=True)
if self.params.get('noprogress'):
return
if eta is not None:
eta_str = self.format_eta(eta)
else:
eta_str = 'Unknown ETA'
if percent is not None:
percent_str = self.format_percent(percent)
else:
percent_str = 'Unknown %'
speed_str = self.format_speed(speed)
msg = ('%s of %s at %s ETA %s' %
(percent_str, data_len_str, speed_str, eta_str))
self._report_progress_status(msg)
def report_progress_live_stream(self, downloaded_data_len, speed, elapsed):
if self.params.get('noprogress', False):
if s['status'] != 'downloading':
return
downloaded_str = format_bytes(downloaded_data_len)
speed_str = self.format_speed(speed)
elapsed_str = FileDownloader.format_seconds(elapsed)
msg = '%s at %s (%s)' % (downloaded_str, speed_str, elapsed_str)
self._report_progress_status(msg)
def report_finish(self, data_len_str, tot_time):
"""Report download finished."""
if self.params.get('noprogress', False):
self.to_screen('[download] Download completed')
if s.get('eta') is not None:
s['_eta_str'] = self.format_eta(s['eta'])
else:
self._report_progress_status(
('100%% of %s in %s' %
(data_len_str, self.format_seconds(tot_time))),
is_last_line=True)
s['_eta_str'] = 'Unknown ETA'
if s.get('total_bytes') and s.get('downloaded_bytes') is not None:
s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes'])
elif s.get('total_bytes_estimate') and s.get('downloaded_bytes') is not None:
s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes_estimate'])
else:
if s.get('downloaded_bytes') == 0:
s['_percent_str'] = self.format_percent(0)
else:
s['_percent_str'] = 'Unknown %'
if s.get('speed') is not None:
s['_speed_str'] = self.format_speed(s['speed'])
else:
s['_speed_str'] = 'Unknown speed'
if s.get('total_bytes') is not None:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template = '%(_percent_str)s of %(_total_bytes_str)s at %(_speed_str)s ETA %(_eta_str)s'
elif s.get('total_bytes_estimate') is not None:
s['_total_bytes_estimate_str'] = format_bytes(s['total_bytes_estimate'])
msg_template = '%(_percent_str)s of ~%(_total_bytes_estimate_str)s at %(_speed_str)s ETA %(_eta_str)s'
else:
if s.get('downloaded_bytes') is not None:
s['_downloaded_bytes_str'] = format_bytes(s['downloaded_bytes'])
if s.get('elapsed'):
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s (%(_elapsed_str)s)'
else:
msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s'
else:
msg_template = '%(_percent_str)s % at %(_speed_str)s ETA %(_eta_str)s'
self._report_progress_status(msg_template % s)
def report_resuming_byte(self, resume_len):
"""Report attempt to resume at given byte."""

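Note on the `report_progress` hunk above: the percentage is taken from `total_bytes` when known, then from `total_bytes_estimate`, and otherwise reported as unknown unless nothing has been downloaded yet. A minimal sketch of that fallback follows, assuming Python 3 division. `percent_str` is a hypothetical helper, and the `'%.1f%%'` format is an assumption, not necessarily what `format_percent` produces.

```python
def percent_str(s):
    """Hypothetical helper: pick the best available percentage string."""
    total = s.get('total_bytes') or s.get('total_bytes_estimate')
    downloaded = s.get('downloaded_bytes')
    if total and downloaded is not None:
        return '%.1f%%' % (100 * downloaded / total)
    if downloaded == 0:
        # Nothing downloaded yet: 0% is correct even without a total.
        return '%.1f%%' % 0
    return 'Unknown %'
```
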
View File

@ -75,7 +75,7 @@ class ExternalFD(FileDownloader):
class CurlFD(ExternalFD):
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-o', tmpfilename]
cmd = [self.exe, '--location', '-o', tmpfilename]
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._source_address('--interface')

View File

@ -1,4 +1,4 @@
from __future__ import unicode_literals
from __future__ import division, unicode_literals
import base64
import io
@ -15,7 +15,6 @@ from ..compat import (
from ..utils import (
struct_pack,
struct_unpack,
format_bytes,
encodeFilename,
sanitize_open,
xpath_text,
@ -252,17 +251,6 @@ class F4mFD(FileDownloader):
requested_bitrate = info_dict.get('tbr')
self.to_screen('[download] Downloading f4m manifest')
manifest = self.ydl.urlopen(man_url).read()
self.report_destination(filename)
http_dl = HttpQuietDownloader(
self.ydl,
{
'continuedl': True,
'quiet': True,
'noprogress': True,
'ratelimit': self.params.get('ratelimit', None),
'test': self.params.get('test', False),
}
)
doc = etree.fromstring(manifest)
formats = [(int(f.attrib.get('bitrate', -1)), f)
@ -298,39 +286,65 @@ class F4mFD(FileDownloader):
# For some akamai manifests we'll need to add a query to the fragment url
akamai_pv = xpath_text(doc, _add_ns('pv-2.0'))
self.report_destination(filename)
http_dl = HttpQuietDownloader(
self.ydl,
{
'continuedl': True,
'quiet': True,
'noprogress': True,
'ratelimit': self.params.get('ratelimit', None),
'test': self.params.get('test', False),
}
)
tmpfilename = self.temp_name(filename)
(dest_stream, tmpfilename) = sanitize_open(tmpfilename, 'wb')
write_flv_header(dest_stream)
write_metadata_tag(dest_stream, metadata)
# This dict stores the download progress, it's updated by the progress
# hook
state = {
'status': 'downloading',
'downloaded_bytes': 0,
'frag_counter': 0,
'frag_index': 0,
'frag_count': total_frags,
'filename': filename,
'tmpfilename': tmpfilename,
}
start = time.time()
def frag_progress_hook(status):
frag_total_bytes = status.get('total_bytes', 0)
estimated_size = (state['downloaded_bytes'] +
(total_frags - state['frag_counter']) * frag_total_bytes)
if status['status'] == 'finished':
def frag_progress_hook(s):
if s['status'] not in ('downloading', 'finished'):
return
frag_total_bytes = s.get('total_bytes', 0)
if s['status'] == 'finished':
state['downloaded_bytes'] += frag_total_bytes
state['frag_counter'] += 1
progress = self.calc_percent(state['frag_counter'], total_frags)
byte_counter = state['downloaded_bytes']
state['frag_index'] += 1
estimated_size = (
(state['downloaded_bytes'] + frag_total_bytes)
/ (state['frag_index'] + 1) * total_frags)
time_now = time.time()
state['total_bytes_estimate'] = estimated_size
state['elapsed'] = time_now - start
if s['status'] == 'finished':
progress = self.calc_percent(state['frag_index'], total_frags)
else:
frag_downloaded_bytes = status['downloaded_bytes']
byte_counter = state['downloaded_bytes'] + frag_downloaded_bytes
frag_downloaded_bytes = s['downloaded_bytes']
frag_progress = self.calc_percent(frag_downloaded_bytes,
frag_total_bytes)
progress = self.calc_percent(state['frag_counter'], total_frags)
progress = self.calc_percent(state['frag_index'], total_frags)
progress += frag_progress / float(total_frags)
eta = self.calc_eta(start, time.time(), estimated_size, byte_counter)
self.report_progress(progress, format_bytes(estimated_size),
status.get('speed'), eta)
state['eta'] = self.calc_eta(
start, time_now, estimated_size, state['downloaded_bytes'] + frag_downloaded_bytes)
state['speed'] = s.get('speed')
self._hook_progress(state)
http_dl.add_progress_hook(frag_progress_hook)
frags_filenames = []
@ -354,8 +368,8 @@ class F4mFD(FileDownloader):
frags_filenames.append(frag_filename)
dest_stream.close()
self.report_finish(format_bytes(state['downloaded_bytes']), time.time() - start)
elapsed = time.time() - start
self.try_rename(tmpfilename, filename)
for frag_file in frags_filenames:
os.remove(frag_file)
@ -366,6 +380,7 @@ class F4mFD(FileDownloader):
'total_bytes': fsize,
'filename': filename,
'status': 'finished',
'elapsed': elapsed,
})
return True
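
Note on the `frag_progress_hook` change above: the size estimate is now the average size of the fragments seen so far, scaled to the total fragment count. A minimal sketch of that arithmetic for the case where a fragment is still downloading follows, assuming Python 3 division. `estimate_total_bytes` is a hypothetical helper and the numbers in the test are made up.

```python
def estimate_total_bytes(downloaded_bytes, frag_total_bytes, frag_index, total_frags):
    """Hypothetical helper: average fragment size so far times fragment count.

    downloaded_bytes counts completed fragments only, frag_total_bytes is the
    size of the fragment in progress, frag_index is the number completed.
    """
    return (downloaded_bytes + frag_total_bytes) / (frag_index + 1) * total_frags
```

For example, with three completed fragments totalling 300 bytes, a fourth fragment of 100 bytes in progress, and ten fragments overall, the estimate is 1000 bytes.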

View File

@ -1,10 +1,9 @@
from __future__ import unicode_literals
import os
import time
from socket import error as SocketError
import errno
import os
import socket
import time
from .common import FileDownloader
from ..compat import (
@ -15,7 +14,6 @@ from ..utils import (
ContentTooShortError,
encodeFilename,
sanitize_open,
format_bytes,
)
@ -102,7 +100,7 @@ class HttpFD(FileDownloader):
resume_len = 0
open_mode = 'wb'
break
except SocketError as e:
except socket.error as e:
if e.errno != errno.ECONNRESET:
# Connection reset is no problem, just retry
raise
@ -137,7 +135,6 @@ class HttpFD(FileDownloader):
self.to_screen('\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
return False
data_len_str = format_bytes(data_len)
byte_counter = 0 + resume_len
block_size = self.params.get('buffersize', 1024)
start = time.time()
@ -196,20 +193,19 @@ class HttpFD(FileDownloader):
# Progress message
speed = self.calc_speed(start, now, byte_counter - resume_len)
if data_len is None:
eta = percent = None
eta = None
else:
percent = self.calc_percent(byte_counter, data_len)
eta = self.calc_eta(start, time.time(), data_len - resume_len, byte_counter - resume_len)
self.report_progress(percent, data_len_str, speed, eta)
self._hook_progress({
'status': 'downloading',
'downloaded_bytes': byte_counter,
'total_bytes': data_len,
'tmpfilename': tmpfilename,
'filename': filename,
'status': 'downloading',
'eta': eta,
'speed': speed,
'elapsed': now - start,
})
if is_test and byte_counter == data_len:
@ -221,7 +217,13 @@ class HttpFD(FileDownloader):
return False
if tmpfilename != '-':
stream.close()
self.report_finish(data_len_str, (time.time() - start))
self._hook_progress({
'downloaded_bytes': byte_counter,
'total_bytes': data_len,
'tmpfilename': tmpfilename,
'status': 'error',
})
if data_len is not None and byte_counter != data_len:
raise ContentTooShortError(byte_counter, int(data_len))
self.try_rename(tmpfilename, filename)
@ -235,6 +237,7 @@ class HttpFD(FileDownloader):
'total_bytes': byte_counter,
'filename': filename,
'status': 'finished',
'elapsed': time.time() - start,
})
return True

View File

@ -11,7 +11,6 @@ from ..compat import compat_str
from ..utils import (
check_executable,
encodeFilename,
format_bytes,
get_exe_version,
)
@ -51,23 +50,23 @@ class RtmpFD(FileDownloader):
if not resume_percent:
resume_percent = percent
resume_downloaded_data_len = downloaded_data_len
eta = self.calc_eta(start, time.time(), 100 - resume_percent, percent - resume_percent)
speed = self.calc_speed(start, time.time(), downloaded_data_len - resume_downloaded_data_len)
time_now = time.time()
eta = self.calc_eta(start, time_now, 100 - resume_percent, percent - resume_percent)
speed = self.calc_speed(start, time_now, downloaded_data_len - resume_downloaded_data_len)
data_len = None
if percent > 0:
data_len = int(downloaded_data_len * 100 / percent)
data_len_str = '~' + format_bytes(data_len)
self.report_progress(percent, data_len_str, speed, eta)
cursor_in_new_line = False
self._hook_progress({
'status': 'downloading',
'downloaded_bytes': downloaded_data_len,
'total_bytes': data_len,
'total_bytes_estimate': data_len,
'tmpfilename': tmpfilename,
'filename': filename,
'status': 'downloading',
'eta': eta,
'elapsed': time_now - start,
'speed': speed,
})
cursor_in_new_line = False
else:
# no percent for live streams
mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec', line)
@ -75,15 +74,15 @@ class RtmpFD(FileDownloader):
downloaded_data_len = int(float(mobj.group(1)) * 1024)
time_now = time.time()
speed = self.calc_speed(start, time_now, downloaded_data_len)
self.report_progress_live_stream(downloaded_data_len, speed, time_now - start)
cursor_in_new_line = False
self._hook_progress({
'downloaded_bytes': downloaded_data_len,
'tmpfilename': tmpfilename,
'filename': filename,
'status': 'downloading',
'elapsed': time_now - start,
'speed': speed,
})
cursor_in_new_line = False
elif self.params.get('verbose', False):
if not cursor_in_new_line:
self.to_screen('')

View File

@ -121,6 +121,7 @@ from .ellentv import (
EllenTVClipsIE,
)
from .elpais import ElPaisIE
from .embedly import EmbedlyIE
from .empflix import EMPFlixIE
from .engadget import EngadgetIE
from .eporner import EpornerIE
@ -204,6 +205,7 @@ from .imdb import (
ImdbIE,
ImdbListIE
)
from .imgur import ImgurIE
from .ina import InaIE
from .infoq import InfoQIE
from .instagram import InstagramIE, InstagramUserIE
@ -371,7 +373,7 @@ from .rottentomatoes import RottenTomatoesIE
from .roxwel import RoxwelIE
from .rtbf import RTBFIE
from .rte import RteIE
from .rtlnl import RtlXlIE
from .rtlnl import RtlNlIE
from .rtlnow import RTLnowIE
from .rtl2 import RTL2IE
from .rtp import RTPIE
@ -386,6 +388,7 @@ from .rutube import (
RutubePersonIE,
)
from .rutv import RUTVIE
from .sandia import SandiaIE
from .sapo import SapoIE
from .savefrom import SaveFromIE
from .sbs import SBSIE

View File

@ -38,6 +38,7 @@ class AdultSwimIE(InfoExtractor):
},
],
'info_dict': {
'id': 'rQxZvXQ4ROaSOqq-or2Mow',
'title': 'Rick and Morty - Pilot',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
}
@ -55,6 +56,7 @@ class AdultSwimIE(InfoExtractor):
}
],
'info_dict': {
'id': '-t8CamQlQ2aYZ49ItZCFog',
'title': 'American Dad - Putting Francine Out of Business',
'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
},

View File

@ -14,6 +14,9 @@ class AppleTrailersIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/trailers/(?P<company>[^/]+)/(?P<movie>[^/]+)'
_TEST = {
"url": "http://trailers.apple.com/trailers/wb/manofsteel/",
'info_dict': {
'id': 'manofsteel',
},
"playlist": [
{
"md5": "d97a8e575432dbcb81b7c3acb741f8a8",

View File

@ -109,7 +109,7 @@ class BandcampIE(InfoExtractor):
class BandcampAlbumIE(InfoExtractor):
IE_NAME = 'Bandcamp:album'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+)|/?(?:$|[?#]))'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^?#]+)|/?(?:$|[?#]))'
_TESTS = [{
'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@ -133,31 +133,37 @@ class BandcampAlbumIE(InfoExtractor):
],
'info_dict': {
'title': 'Jazz Format Mixtape vol.1',
'id': 'jazz-format-mixtape-vol-1',
'uploader_id': 'blazo',
},
'params': {
'playlistend': 2
},
'skip': 'Bandcamp imposes download limits. See test_playlists:test_bandcamp_album for the playlist test'
'skip': 'Bandcamp imposes download limits.'
}, {
'url': 'http://nightbringer.bandcamp.com/album/hierophany-of-the-open-grave',
'info_dict': {
'title': 'Hierophany of the Open Grave',
'uploader_id': 'nightbringer',
'id': 'hierophany-of-the-open-grave',
},
'playlist_mincount': 9,
}, {
'url': 'http://dotscale.bandcamp.com',
'info_dict': {
'title': 'Loom',
'id': 'dotscale',
'uploader_id': 'dotscale',
},
'playlist_mincount': 7,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('subdomain')
title = mobj.group('title')
display_id = title or playlist_id
webpage = self._download_webpage(url, display_id)
uploader_id = mobj.group('subdomain')
album_id = mobj.group('album_id')
playlist_id = album_id or uploader_id
webpage = self._download_webpage(url, playlist_id)
tracks_paths = re.findall(r'<a href="(.*?)" itemprop="url">', webpage)
if not tracks_paths:
raise ExtractorError('The page doesn\'t contain any tracks')
@ -168,8 +174,8 @@ class BandcampAlbumIE(InfoExtractor):
r'album_title\s*:\s*"(.*?)"', webpage, 'title', fatal=False)
return {
'_type': 'playlist',
'uploader_id': uploader_id,
'id': playlist_id,
'display_id': display_id,
'title': title,
'entries': entries,
}

View File

@ -95,6 +95,7 @@ class BrightcoveIE(InfoExtractor):
'url': 'http://c.brightcove.com/services/viewer/htmlFederated?playerID=3550052898001&playerKey=AQ%7E%7E%2CAAABmA9XpXk%7E%2C-Kp7jNgisre1fG5OdqpAFUTcs0lP_ZoL',
'info_dict': {
'title': 'Sealife',
'id': '3550319591001',
},
'playlist_mincount': 7,
},
@ -247,7 +248,7 @@ class BrightcoveIE(InfoExtractor):
playlist_info = json_data['videoList']
videos = [self._extract_video_info(video_info) for video_info in playlist_info['mediaCollectionDTO']['videoDTOs']]
return self.playlist_result(videos, playlist_id=playlist_info['id'],
return self.playlist_result(videos, playlist_id='%s' % playlist_info['id'],
playlist_title=playlist_info['mediaCollectionDTO']['displayName'])
def _extract_video_info(self, video_info):

@ -33,6 +33,7 @@ class BuzzFeedIE(InfoExtractor):
'skip_download': True, # Got enough YouTube download tests
},
'info_dict': {
'id': 'look-at-this-cute-dog-omg',
'description': 're:Munchkin the Teddy Bear is back ?!',
'title': 'You Need To Stop What You\'re Doing And Watching This Dog Walk On A Treadmill',
},
@ -42,8 +43,8 @@ class BuzzFeedIE(InfoExtractor):
'ext': 'mp4',
'upload_date': '20141124',
'uploader_id': 'CindysMunchkin',
'description': 're:© 2014 Munchkin the Shih Tzu',
'uploader': 'Munchkin the Shih Tzu',
'description': 're:© 2014 Munchkin the',
'uploader': 're:^Munchkin the',
'title': 're:Munchkin the Teddy Bear gets her exercise',
},
}]

@ -1,7 +1,5 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
@ -39,8 +37,7 @@ class CBSIE(InfoExtractor):
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
real_id = self._search_regex(
r"video\.settings\.pid\s*=\s*'([^']+)';",

@ -27,7 +27,6 @@ from ..utils import (
compiled_regex_type,
ExtractorError,
float_or_none,
HEADRequest,
int_or_none,
RegexNotFoundError,
sanitize_filename,
@ -753,9 +752,7 @@ class InfoExtractor(object):
def _is_valid_url(self, url, video_id, item='video'):
try:
self._request_webpage(
HEADRequest(url), video_id,
'Checking %s URL' % item)
self._request_webpage(url, video_id, 'Checking %s URL' % item)
return True
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError):
@ -841,6 +838,7 @@ class InfoExtractor(object):
note='Downloading m3u8 information',
errnote='Failed to download m3u8 information')
last_info = None
last_media = None
kv_rex = re.compile(
r'(?P<key>[a-zA-Z_-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)')
for line in m3u8_doc.splitlines():
@ -851,6 +849,13 @@ class InfoExtractor(object):
if v.startswith('"'):
v = v[1:-1]
last_info[m.group('key')] = v
elif line.startswith('#EXT-X-MEDIA:'):
last_media = {}
for m in kv_rex.finditer(line):
v = m.group('val')
if v.startswith('"'):
v = v[1:-1]
last_media[m.group('key')] = v
elif line.startswith('#') or not line.strip():
continue
else:
@ -879,6 +884,9 @@ class InfoExtractor(object):
width_str, height_str = resolution.split('x')
f['width'] = int(width_str)
f['height'] = int(height_str)
if last_media is not None:
f['m3u8_media'] = last_media
last_media = None
formats.append(f)
last_info = {}
self._sort_formats(formats)
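
The `#EXT-X-MEDIA` branch added above reuses `kv_rex` to turn an attribute list into a dict. A minimal standalone sketch of that step follows; the function name `parse_media_attributes` and the sample playlist line are assumptions made for illustration and are not part of the commit.

```python
import re

# Same attribute regex as kv_rex in the hunk above.
KV_REX = re.compile(
    r'(?P<key>[a-zA-Z_-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)')


def parse_media_attributes(line):
    """Return the attributes of one #EXT-X-MEDIA line as a dict.

    Hypothetical helper: the commit inlines this loop in the extractor.
    """
    attrs = {}
    for m in KV_REX.finditer(line):
        v = m.group('val')
        if v.startswith('"'):
            v = v[1:-1]  # strip the surrounding double quotes
        attrs[m.group('key')] = v
    return attrs


# Made-up playlist line, for illustration only.
sample = '#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="chunked",NAME="Source",DEFAULT=YES'
media = parse_media_attributes(sample)
print(media['NAME'])
```

In the diff, the resulting dict is attached to the next format as `f['m3u8_media']` and then reset, so each media entry is associated with at most one stream.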

@ -194,6 +194,7 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
'url': 'http://www.dailymotion.com/playlist/xv4bw_nqtv_sport/1#video=xl8v3q',
'info_dict': {
'title': 'SPORT',
'id': 'xv4bw_nqtv_sport',
},
'playlist_mincount': 20,
}]

@ -0,0 +1,16 @@
# encoding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
class EmbedlyIE(InfoExtractor):
_VALID_URL = r'https?://(?:www|cdn\.)?embedly\.com/widgets/media\.html\?(?:[^#]*?&)?url=(?P<id>[^#&]+)'
_TESTS = [{
'url': 'https://cdn.embedly.com/widgets/media.html?src=http%3A%2F%2Fwww.youtube.com%2Fembed%2Fvideoseries%3Flist%3DUUGLim4T2loE5rwCMdpCIPVg&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DSU4fj_aEMVw%26list%3DUUGLim4T2loE5rwCMdpCIPVg&image=http%3A%2F%2Fi.ytimg.com%2Fvi%2FSU4fj_aEMVw%2Fhqdefault.jpg&key=8ee8a2e6a8cc47aab1a5ee67f9a178e0&type=text%2Fhtml&schema=youtube&autoplay=1',
'only_matching': True,
}]
def _real_extract(self, url):
return self.url_result(compat_urllib_parse_unquote(self._match_id(url)))
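
The new extractor only unwraps the percent-encoded `url=` query parameter and hands the result to another extractor. A rough standalone sketch of that unwrapping, assuming Python 3's `urllib.parse.unquote` as a stand-in for `compat_urllib_parse_unquote`; the helper name `unwrap_embedly` and the shortened sample URL are illustrative only.

```python
import re
from urllib.parse import unquote

# Pattern copied from _VALID_URL in the new file above.
VALID_URL = (
    r'https?://(?:www|cdn\.)?embedly\.com/widgets/media\.html'
    r'\?(?:[^#]*?&)?url=(?P<id>[^#&]+)')


def unwrap_embedly(url):
    """Return the embedded target URL, or None if the URL does not match.

    Hypothetical helper: the extractor itself calls self._match_id(url).
    """
    m = re.match(VALID_URL, url)
    if m is None:
        return None
    return unquote(m.group('id'))


# Shortened variant of the test URL, for illustration only.
sample = ('https://cdn.embedly.com/widgets/media.html?schema=youtube'
          '&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DSU4fj_aEMVw'
          '&autoplay=1')
target = unwrap_embedly(sample)
print(target)
```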

@ -473,6 +473,7 @@ class GenericIE(InfoExtractor):
{
'url': 'http://discourse.ubuntu.com/t/unity-8-desktop-mode-windows-on-mir/1986',
'info_dict': {
'id': '1986',
'title': 'Unity 8 desktop-mode windows on Mir! - Ubuntu Discourse',
},
'playlist_mincount': 2,
@ -537,6 +538,15 @@ class GenericIE(InfoExtractor):
'uploader_id': 'NationalArchives08',
'title': 'Webinar: Using Discovery, The National Archives online catalogue',
},
},
# rtl.nl embed
{
'url': 'http://www.rtlnieuws.nl/nieuws/buitenland/aanslagen-kopenhagen',
'playlist_mincount': 5,
'info_dict': {
'id': 'aanslagen-kopenhagen',
'title': 'Aanslagen Kopenhagen | RTL Nieuws',
}
}
]
@ -782,6 +792,13 @@ class GenericIE(InfoExtractor):
'entries': entries,
}
# Look for embedded rtl.nl player
matches = re.findall(
r'<iframe\s+(?:[a-zA-Z-]+="[^"]+"\s+)*?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+video_embed[^"]+)"',
webpage)
if matches:
return _playlist_from_matches(matches, ie='RtlNl')
# Look for embedded (iframe) Vimeo player
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//player\.vimeo\.com/video/.+?)\1', webpage)
@ -789,7 +806,6 @@ class GenericIE(InfoExtractor):
player_url = unescapeHTML(mobj.group('url'))
surl = smuggle_url(player_url, {'Referer': url})
return self.url_result(surl)
# Look for embedded (swf embed) Vimeo player
mobj = re.search(
r'<embed[^>]+?src="((?:https?:)?//(?:www\.)?vimeo\.com/moogaloop\.swf.+?)"', webpage)

@ -34,6 +34,9 @@ class IGNIE(InfoExtractor):
},
{
'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
'info_dict': {
'id': '100-little-things-in-gta-5-that-will-blow-your-mind',
},
'playlist': [
{
'info_dict': {

@ -0,0 +1,84 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
js_to_json,
mimetype2ext,
)
class ImgurIE(InfoExtractor):
_VALID_URL = r'https?://i\.imgur\.com/(?P<id>[a-zA-Z0-9]+)\.(?:mp4|gifv)'
_TESTS = [{
'url': 'https://i.imgur.com/A61SaA1.gifv',
'info_dict': {
'id': 'A61SaA1',
'ext': 'mp4',
'title': 'MRW gifv is up and running without any bugs',
'description': 'The Internet\'s visual storytelling community. Explore, share, and discuss the best visual stories the Internet has to offer.',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
width = int_or_none(self._search_regex(
r'<param name="width" value="([0-9]+)"',
webpage, 'width', fatal=False))
height = int_or_none(self._search_regex(
r'<param name="height" value="([0-9]+)"',
webpage, 'height', fatal=False))
formats = []
video_elements = self._search_regex(
r'(?s)<div class="video-elements">(.*?)</div>',
webpage, 'video elements')
for m in re.finditer(r'<source\s+src="(?P<src>[^"]+)"\s+type="(?P<type>[^"]+)"', video_elements):
formats.append({
'format_id': m.group('type').partition('/')[2],
'url': self._proto_relative_url(m.group('src')),
'ext': mimetype2ext(m.group('type')),
'acodec': 'none',
'width': width,
'height': height,
'http_headers': {
'User-Agent': 'youtube-dl (like wget)',
},
})
gif_json = self._search_regex(
r'(?s)var\s+videoItem\s*=\s*(\{.*?\})',
webpage, 'GIF code', fatal=False)
if gif_json:
gifd = self._parse_json(
gif_json, video_id, transform_source=js_to_json)
formats.append({
'format_id': 'gif',
'preference': -10,
'width': width,
'height': height,
'ext': 'gif',
'acodec': 'none',
'vcodec': 'gif',
'container': 'gif',
'url': self._proto_relative_url(gifd['gifUrl']),
'filesize': gifd.get('size'),
'http_headers': {
'User-Agent': 'youtube-dl (like wget)',
},
})
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
'description': self._og_search_description(webpage),
'title': self._og_search_title(webpage),
}

@ -37,6 +37,7 @@ class LivestreamIE(InfoExtractor):
'url': 'http://new.livestream.com/tedx/cityenglish',
'info_dict': {
'title': 'TEDCity2.0 (English)',
'id': '2245590',
},
'playlist_mincount': 4,
}, {
@ -148,7 +149,8 @@ class LivestreamIE(InfoExtractor):
if is_relevant(video_data, video_id)]
if video_id is None:
# This is an event page:
return self.playlist_result(videos, info['id'], info['full_name'])
return self.playlist_result(
videos, '%s' % info['id'], info['full_name'])
else:
if not videos:
raise ExtractorError('Cannot find video %s' % video_id)

@ -1,9 +1,6 @@
# encoding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
js_to_json,
@ -11,7 +8,7 @@ from ..utils import (
class PatreonIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?patreon\.com/creation\?hid=(.+)'
_VALID_URL = r'https?://(?:www\.)?patreon\.com/creation\?hid=(?P<id>[^&#]+)'
_TESTS = [
{
'url': 'http://www.patreon.com/creation?hid=743933',
@ -35,6 +32,23 @@ class PatreonIE(InfoExtractor):
'thumbnail': 're:^https?://.*$',
},
},
{
'url': 'https://www.patreon.com/creation?hid=1682498',
'info_dict': {
'id': 'SU4fj_aEMVw',
'ext': 'mp4',
'title': 'I\'m on Patreon!',
'uploader': 'TraciJHines',
'thumbnail': 're:^https?://.*$',
'upload_date': '20150211',
'description': 'md5:c5a706b1f687817a3de09db1eb93acd4',
'uploader_id': 'TraciJHines',
},
'params': {
'noplaylist': True,
'skip_download': True,
}
}
]
# Currently Patreon exposes download URL via hidden CSS, so login is not
@ -65,26 +79,29 @@ class PatreonIE(InfoExtractor):
'''
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group(1)
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._og_search_title(webpage).strip()
attach_fn = self._html_search_regex(
r'<div class="attach"><a target="_blank" href="([^"]+)">',
webpage, 'attachment URL', default=None)
embed = self._html_search_regex(
r'<div id="watchCreation">\s*<iframe class="embedly-embed" src="([^"]+)"',
webpage, 'embedded URL', default=None)
if attach_fn is not None:
video_url = 'http://www.patreon.com' + attach_fn
thumbnail = self._og_search_thumbnail(webpage)
uploader = self._html_search_regex(
r'<strong>(.*?)</strong> is creating', webpage, 'uploader')
elif embed is not None:
return self.url_result(embed)
else:
playlist_js = self._search_regex(
playlist = self._parse_json(self._search_regex(
r'(?s)new\s+jPlayerPlaylist\(\s*\{\s*[^}]*},\s*(\[.*?,?\s*\])',
webpage, 'playlist JSON')
playlist_json = js_to_json(playlist_js)
playlist = json.loads(playlist_json)
webpage, 'playlist JSON'),
video_id, transform_source=js_to_json)
data = playlist[0]
video_url = self._proto_relative_url(data['mp3'])
thumbnail = self._proto_relative_url(data.get('cover'))

@ -1,7 +1,5 @@
from __future__ import unicode_literals
import json
from .common import InfoExtractor
@ -10,13 +8,13 @@ class RadioDeIE(InfoExtractor):
_VALID_URL = r'https?://(?P<id>.+?)\.(?:radio\.(?:de|at|fr|pt|es|pl|it)|rad\.io)'
_TEST = {
'url': 'http://ndr2.radio.de/',
'md5': '3b4cdd011bc59174596b6145cda474a4',
'info_dict': {
'id': 'ndr2',
'ext': 'mp3',
'title': 're:^NDR 2 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:591c49c702db1a33751625ebfb67f273',
'thumbnail': 're:^https?://.*\.png',
'is_live': True,
},
'params': {
'skip_download': True,
@ -25,16 +23,15 @@ class RadioDeIE(InfoExtractor):
def _real_extract(self, url):
radio_id = self._match_id(url)
webpage = self._download_webpage(url, radio_id)
jscode = self._search_regex(
r"'components/station/stationService':\s*\{\s*'?station'?:\s*(\{.*?\s*\}),\n",
webpage, 'broadcast')
broadcast = json.loads(self._search_regex(
r'_getBroadcast\s*=\s*function\(\s*\)\s*{\s*return\s+({.+?})\s*;\s*}',
webpage, 'broadcast'))
broadcast = self._parse_json(jscode, radio_id)
title = self._live_title(broadcast['name'])
description = broadcast.get('description') or broadcast.get('shortDescription')
thumbnail = broadcast.get('picture4Url') or broadcast.get('picture4TransUrl')
thumbnail = broadcast.get('picture4Url') or broadcast.get('picture4TransUrl') or broadcast.get('logo100x100')
formats = [{
'url': stream['streamUrl'],

@ -1,16 +1,25 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import parse_duration
from ..utils import (
int_or_none,
parse_duration,
)
class RtlXlIE(InfoExtractor):
IE_NAME = 'rtlxl.nl'
_VALID_URL = r'https?://(www\.)?rtlxl\.nl/#!/[^/]+/(?P<uuid>[^/?]+)'
class RtlNlIE(InfoExtractor):
IE_NAME = 'rtl.nl'
IE_DESC = 'rtl.nl and rtlxl.nl'
_VALID_URL = r'''(?x)
https?://(www\.)?
(?:
rtlxl\.nl/\#!/[^/]+/|
rtl\.nl/system/videoplayer/[^?#]+?/video_embed\.html\#uuid=
)
(?P<id>[0-9a-f-]+)'''
_TEST = {
_TESTS = [{
'url': 'http://www.rtlxl.nl/#!/rtl-nieuws-132237/6e4203a6-0a5e-3596-8424-c599a59e0677',
'md5': 'cc16baa36a6c169391f0764fa6b16654',
'info_dict': {
@ -22,21 +31,30 @@ class RtlXlIE(InfoExtractor):
'upload_date': '20140814',
'duration': 576.880,
},
}
}, {
'url': 'http://www.rtl.nl/system/videoplayer/derden/rtlnieuws/video_embed.html#uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed/autoplay=false',
'md5': 'dea7474214af1271d91ef332fb8be7ea',
'info_dict': {
'id': '84ae5571-ac25-4225-ae0c-ef8d9efb2aed',
'ext': 'mp4',
'timestamp': 1424039400,
'title': 'RTL Nieuws - Nieuwe beelden Kopenhagen: chaos direct na aanslag',
'thumbnail': 're:^https?://screenshots\.rtl\.nl/system/thumb/sz=[0-9]+x[0-9]+/uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed$',
'upload_date': '20150215',
'description': 'Er zijn nieuwe beelden vrijgegeven die vlak na de aanslag in Kopenhagen zijn gemaakt. Op de video is goed te zien hoe omstanders zich bekommeren om één van de slachtoffers, terwijl de eerste agenten ter plaatse komen.',
}
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
uuid = mobj.group('uuid')
uuid = self._match_id(url)
info = self._download_json(
'http://www.rtl.nl/system/s4m/vfd/version=2/uuid=%s/fmt=flash/' % uuid,
uuid)
material = info['material'][0]
episode_info = info['episodes'][0]
progname = info['abstracts'][0]['name']
subtitle = material['title'] or info['episodes'][0]['name']
description = material.get('synopsis') or info['episodes'][0]['synopsis']
# Use unencrypted m3u8 streams (See https://github.com/rg3/youtube-dl/issues/4118)
videopath = material['videopath'].replace('.f4m', '.m3u8')
@ -58,14 +76,29 @@ class RtlXlIE(InfoExtractor):
'quality': 0,
}
])
self._sort_formats(formats)
thumbnails = []
meta = info.get('meta', {})
for p in ('poster_base_url', 'thumb_base_url'):
if not meta.get(p):
continue
thumbnails.append({
'url': self._proto_relative_url(meta[p] + uuid),
'width': int_or_none(self._search_regex(
r'/sz=([0-9]+)', meta[p], 'thumbnail width', fatal=False)),
'height': int_or_none(self._search_regex(
r'/sz=[0-9]+x([0-9]+)',
meta[p], 'thumbnail height', fatal=False))
})
return {
'id': uuid,
'title': '%s - %s' % (progname, subtitle),
'formats': formats,
'timestamp': material['original_date'],
'description': episode_info['synopsis'],
'description': description,
'duration': parse_duration(material.get('duration')),
'thumbnails': thumbnails,
}

@ -0,0 +1,117 @@
# coding: utf-8
from __future__ import unicode_literals
import itertools
import json
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_request,
compat_urlparse,
)
from ..utils import (
int_or_none,
js_to_json,
mimetype2ext,
unified_strdate,
)
class SandiaIE(InfoExtractor):
IE_DESC = 'Sandia National Laboratories'
_VALID_URL = r'https?://digitalops\.sandia\.gov/Mediasite/Play/(?P<id>[0-9a-f]+)'
_TEST = {
'url': 'http://digitalops.sandia.gov/Mediasite/Play/24aace4429fc450fb5b38cdbf424a66e1d',
'md5': '9422edc9b9a60151727e4b6d8bef393d',
'info_dict': {
'id': '24aace4429fc450fb5b38cdbf424a66e1d',
'ext': 'mp4',
'title': 'Xyce Software Training - Section 1',
'description': 're:(?s)SAND Number: SAND 2013-7800.{200,}',
'upload_date': '20120904',
'duration': 7794,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
req = compat_urllib_request.Request(url)
req.add_header('Cookie', 'MediasitePlayerCaps=ClientPlugins=4')
webpage = self._download_webpage(req, video_id)
js_path = self._search_regex(
r'<script type="text/javascript" src="(/Mediasite/FileServer/Presentation/[^"]+)"',
webpage, 'JS code URL')
js_url = compat_urlparse.urljoin(url, js_path)
js_code = self._download_webpage(
js_url, video_id, note='Downloading player')
def extract_str(key, **args):
return self._search_regex(
r'Mediasite\.PlaybackManifest\.%s\s*=\s*(.+);\s*?\n' % re.escape(key),
js_code, key, **args)
def extract_data(key, **args):
data_json = extract_str(key, **args)
if data_json is None:
return data_json
return self._parse_json(
data_json, video_id, transform_source=js_to_json)
formats = []
for i in itertools.count():
fd = extract_data('VideoUrls[%d]' % i, default=None)
if fd is None:
break
formats.append({
'format_id': '%s' % i,
'format_note': fd['MimeType'].partition('/')[2],
'ext': mimetype2ext(fd['MimeType']),
'url': fd['Location'],
'protocol': 'f4m' if fd['MimeType'] == 'video/x-mp4-fragmented' else None,
})
self._sort_formats(formats)
slide_baseurl = compat_urlparse.urljoin(
url, extract_data('SlideBaseUrl'))
slide_template = slide_baseurl + re.sub(
r'\{0:D?([0-9+])\}', r'%0\1d', extract_data('SlideImageFileNameTemplate'))
slides = []
last_slide_time = 0
for i in itertools.count(1):
sd = extract_str('Slides[%d]' % i, default=None)
if sd is None:
break
timestamp = int_or_none(self._search_regex(
r'^Mediasite\.PlaybackManifest\.CreateSlide\("[^"]*"\s*,\s*([0-9]+),',
sd, 'slide %s timestamp' % i, fatal=False))
slides.append({
'url': slide_template % i,
'duration': timestamp - last_slide_time,
})
last_slide_time = timestamp
formats.append({
'format_id': 'slides',
'protocol': 'slideshow',
'url': json.dumps(slides),
'preference': -10000, # Downloader not yet written
})
self._sort_formats(formats)
title = extract_data('Title')
description = extract_data('Description', fatal=False)
duration = int_or_none(extract_data(
'Duration', fatal=False), scale=1000)
upload_date = unified_strdate(extract_data('AirDate', fatal=False))
return {
'id': video_id,
'title': title,
'description': description,
'formats': formats,
'upload_date': upload_date,
'duration': duration,
}
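
The `slide_template` line above rewrites a .NET-style placeholder such as `{0:D4}` into a printf-style one so that `slide_template % i` yields zero-padded file names. A small sketch of that conversion, using the same `re.sub` call; the helper name `dotnet_to_printf` and the sample template are assumptions, since the real template comes from the player's manifest.

```python
import re


def dotnet_to_printf(template):
    """Convert a '{0:D4}'-style placeholder into '%04d'.

    Hypothetical helper wrapping the re.sub call from the extractor.
    """
    return re.sub(r'\{0:D?([0-9+])\}', r'%0\1d', template)


# Made-up file name template, for illustration only.
slide_template = dotnet_to_printf('slide_{0:D4}.jpg')
print(slide_template % 12)
```

Note that the character class `[0-9+]` accepts a single digit (or a literal `+`), so this sketch, like the original line, only covers single-digit widths.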

@ -349,6 +349,13 @@ class TwitchStreamIE(TwitchBaseIE):
% (self._USHER_BASE, channel_id, compat_urllib_parse.urlencode(query).encode('utf-8')),
channel_id, 'mp4')
# prefer the 'source' stream, the others are limited to 30 fps
def _sort_source(f):
if f.get('m3u8_media') is not None and f['m3u8_media'].get('NAME') == 'Source':
return 1
return 0
formats = sorted(formats, key=_sort_source)
view_count = stream.get('viewers')
timestamp = parse_iso8601(stream.get('created_at'))
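
The sort key above moves the stream named `Source` to the end of the list while leaving the relative order of the other formats unchanged, because `sorted` is stable. The sketch below reproduces that with made-up format dicts; it assumes, as elsewhere in youtube-dl, that later entries in `formats` are treated as preferred.

```python
def _sort_source(f):
    # 1 for the 'Source' stream, 0 for everything else (same key as the diff).
    if f.get('m3u8_media') is not None and f['m3u8_media'].get('NAME') == 'Source':
        return 1
    return 0


# Made-up format entries, for illustration only.
formats = [
    {'format_id': 'source', 'm3u8_media': {'NAME': 'Source'}},
    {'format_id': 'high', 'm3u8_media': {'NAME': 'High'}},
    {'format_id': 'audio'},  # no m3u8_media key at all
]
ordered = sorted(formats, key=_sort_source)
print([f['format_id'] for f in ordered])
```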

@ -49,15 +49,31 @@ class VideoLecturesNetIE(InfoExtractor):
thumbnail = (
None if thumbnail_el is None else thumbnail_el.attrib.get('src'))
formats = [{
'url': v.attrib['src'],
'width': int_or_none(v.attrib.get('width')),
'height': int_or_none(v.attrib.get('height')),
'filesize': int_or_none(v.attrib.get('size')),
'tbr': int_or_none(v.attrib.get('systemBitrate')) / 1000.0,
'ext': v.attrib.get('ext'),
} for v in switch.findall('./video')
if v.attrib.get('proto') == 'http']
formats = []
for v in switch.findall('./video'):
proto = v.attrib.get('proto')
if proto not in ['http', 'rtmp']:
continue
f = {
'width': int_or_none(v.attrib.get('width')),
'height': int_or_none(v.attrib.get('height')),
'filesize': int_or_none(v.attrib.get('size')),
'tbr': int_or_none(v.attrib.get('systemBitrate')) / 1000.0,
'ext': v.attrib.get('ext'),
}
src = v.attrib['src']
if proto == 'http':
if self._is_valid_url(src, video_id):
f['url'] = src
formats.append(f)
elif proto == 'rtmp':
f.update({
'url': v.attrib['streamer'],
'play_path': src,
'rtmp_real_time': True,
})
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,

@ -18,6 +18,7 @@ from ..utils import (
InAdvancePagedList,
int_or_none,
RegexNotFoundError,
smuggle_url,
std_headers,
unsmuggle_url,
urlencode_postdata,
@ -174,7 +175,7 @@ class VimeoIE(VimeoBaseInfoExtractor, SubtitlesInfoExtractor):
def _verify_video_password(self, url, video_id, webpage):
password = self._downloader.params.get('videopassword', None)
if password is None:
raise ExtractorError('This video is protected by a password, use the --video-password option')
raise ExtractorError('This video is protected by a password, use the --video-password option', expected=True)
token = self._search_regex(r'xsrft: \'(.*?)\'', webpage, 'login token')
data = compat_urllib_parse.urlencode({
'password': password,
@ -267,8 +268,11 @@ class VimeoIE(VimeoBaseInfoExtractor, SubtitlesInfoExtractor):
raise ExtractorError('The author has restricted the access to this video, try with the "--referer" option')
if re.search(r'<form[^>]+?id="pw_form"', webpage) is not None:
if data and '_video_password_verified' in data:
raise ExtractorError('video password verification failed!')
self._verify_video_password(url, video_id, webpage)
return self._real_extract(url)
return self._real_extract(
smuggle_url(url, {'_video_password_verified': 'verified'}))
else:
raise ExtractorError('Unable to extract info section',
cause=e)
@ -401,6 +405,7 @@ class VimeoChannelIE(InfoExtractor):
_TESTS = [{
'url': 'http://vimeo.com/channels/tributes',
'info_dict': {
'id': 'tributes',
'title': 'Vimeo Tributes',
},
'playlist_mincount': 25,
@ -479,6 +484,7 @@ class VimeoUserIE(VimeoChannelIE):
'url': 'http://vimeo.com/nkistudio/videos',
'info_dict': {
'title': 'Nki',
'id': 'nkistudio',
},
'playlist_mincount': 66,
}]
@ -496,6 +502,7 @@ class VimeoAlbumIE(VimeoChannelIE):
_TESTS = [{
'url': 'http://vimeo.com/album/2632481',
'info_dict': {
'id': '2632481',
'title': 'Staff Favorites: November 2013',
},
'playlist_mincount': 13,
@ -526,6 +533,7 @@ class VimeoGroupsIE(VimeoAlbumIE):
_TESTS = [{
'url': 'http://vimeo.com/groups/rolexawards',
'info_dict': {
'id': 'rolexawards',
'title': 'Rolex Awards for Enterprise',
},
'playlist_mincount': 73,
@ -608,6 +616,7 @@ class VimeoLikesIE(InfoExtractor):
'url': 'https://vimeo.com/user755559/likes/',
'playlist_mincount': 293,
"info_dict": {
'id': 'user755559_likes',
"description": "See all the videos urza likes",
"title": 'Videos urza likes',
},

@ -217,6 +217,9 @@ class VKUserVideosIE(InfoExtractor):
_TEMPLATE_URL = 'https://vk.com/videos'
_TEST = {
'url': 'http://vk.com/videos205387401',
'info_dict': {
'id': '205387401',
},
'playlist_mincount': 4,
}

@ -18,8 +18,8 @@ class WSJIE(InfoExtractor):
'id': '1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
'ext': 'mp4',
'upload_date': '20150202',
'uploader_id': 'bbright',
'creator': 'bbright',
'uploader_id': 'jdesai',
'creator': 'jdesai',
'categories': list, # a long list
'duration': 90,
'title': 'Bills Coach Rex Ryan Updates His Old Jets Tattoo',

@ -22,7 +22,7 @@ class XTubeIE(InfoExtractor):
'id': 'kVTUy_G222_',
'ext': 'mp4',
'title': 'strange erotica',
'description': 'http://www.xtube.com an ET kind of thing',
'description': 'contains:an ET kind of thing',
'uploader': 'greenshowers',
'duration': 450,
'age_limit': 18,

@ -24,7 +24,6 @@ class YahooIE(InfoExtractor):
_TESTS = [
{
'url': 'http://screen.yahoo.com/julian-smith-travis-legg-watch-214727115.html',
'md5': '4962b075c08be8690a922ee026d05e69',
'info_dict': {
'id': '2d25e626-2378-391f-ada0-ddaf1417e588',
'ext': 'mp4',

@ -541,26 +541,30 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
if cache_spec is not None:
return lambda s: ''.join(s[i] for i in cache_spec)
download_note = (
'Downloading player %s' % player_url
if self._downloader.params.get('verbose') else
'Downloading %s player %s' % (player_type, player_id)
)
if player_type == 'js':
code = self._download_webpage(
player_url, video_id,
note='Downloading %s player %s' % (player_type, player_id),
note=download_note,
errnote='Download of %s failed' % player_url)
res = self._parse_sig_js(code)
elif player_type == 'swf':
urlh = self._request_webpage(
player_url, video_id,
note='Downloading %s player %s' % (player_type, player_id),
note=download_note,
errnote='Download of %s failed' % player_url)
code = urlh.read()
res = self._parse_sig_swf(code)
else:
assert False, 'Invalid player type %r' % player_type
if cache_spec is None:
test_string = ''.join(map(compat_chr, range(len(example_sig))))
cache_res = res(test_string)
cache_spec = [ord(c) for c in cache_res]
test_string = ''.join(map(compat_chr, range(len(example_sig))))
cache_res = res(test_string)
cache_spec = [ord(c) for c in cache_res]
self._downloader.cache.store('youtube-sigfuncs', func_id, cache_spec)
return res

@ -30,13 +30,10 @@ class JSInterpreter(object):
def __init__(self, code, objects=None):
if objects is None:
objects = {}
self.code = self._remove_comments(code)
self.code = code
self._functions = {}
self._objects = objects
def _remove_comments(self, code):
return re.sub(r'(?s)/\*.*?\*/', '', code)
def interpret_statement(self, stmt, local_vars, allow_recursion=100):
if allow_recursion < 0:
raise ExtractorError('Recursion limit reached')

@ -424,6 +424,10 @@ def parseOpts(overrideArguments=None):
'--xattr-set-filesize',
dest='xattr_set_filesize', action='store_true',
help='(experimental) set file xattribute ytdl.filesize with expected filesize')
downloader.add_option(
'--hls-prefer-native',
dest='hls_prefer_native', action='store_true',
help='(experimental) Use the native HLS downloader instead of ffmpeg.')
downloader.add_option(
'--external-downloader',
dest='external_downloader', metavar='COMMAND',

@ -34,10 +34,10 @@ class FFmpegPostProcessor(PostProcessor):
self._determine_executables()
def check_version(self):
if not self.available():
if not self.available:
raise FFmpegPostProcessorError('ffmpeg or avconv not found. Please install one.')
required_version = '10-0' if self._uses_avconv() else '1.0'
required_version = '10-0' if self.basename == 'avconv' else '1.0'
if is_outdated_version(
self._versions[self.basename], required_version):
warning = 'Your copy of %s is outdated, update %s to version %s or newer if you encounter any errors.' % (
@ -108,12 +108,10 @@ class FFmpegPostProcessor(PostProcessor):
self.probe_basename = p
break
@property
def available(self):
return self.basename is not None
def _uses_avconv(self):
return self.basename == 'avconv'
@property
def executable(self):
return self._paths[self.basename]

@ -1560,8 +1560,8 @@ def js_to_json(code):
return '"%s"' % v
res = re.sub(r'''(?x)
"(?:[^"\\]*(?:\\\\|\\")?)*"|
'(?:[^'\\]*(?:\\\\|\\')?)*'|
"(?:[^"\\]*(?:\\\\|\\['"nu]))*[^"\\]*"|
'(?:[^'\\]*(?:\\\\|\\['"nu]))*[^'\\]*'|
[a-zA-Z_][.a-zA-Z_0-9]*
''', fix_kv, code)
res = re.sub(r',(\s*\])', lambda m: m.group(1), res)
@ -1616,6 +1616,15 @@ def args_to_str(args):
return ' '.join(shlex_quote(a) for a in args)
def mimetype2ext(mt):
_, _, res = mt.rpartition('/')
return {
'x-ms-wmv': 'wmv',
'x-mp4-fragmented': 'mp4',
}.get(res, res)
def urlhandle_detect_ext(url_handle):
try:
url_handle.headers
@ -1631,7 +1640,7 @@ def urlhandle_detect_ext(url_handle):
if e:
return e
return getheader('Content-Type').split("/")[1]
return mimetype2ext(getheader('Content-Type'))
def age_restricted(content_limit, age_limit):
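
The new `mimetype2ext` helper can be tried in isolation. The copy below is taken from the hunk above; the sample MIME types are chosen for illustration. One visible difference from the old `split("/")[1]` expression is that `rpartition` does not raise when the input contains no slash.

```python
def mimetype2ext(mt):
    # Keep only the subtype, then map the known special cases.
    _, _, res = mt.rpartition('/')
    return {
        'x-ms-wmv': 'wmv',
        'x-mp4-fragmented': 'mp4',
    }.get(res, res)


print(mimetype2ext('video/x-mp4-fragmented'))
print(mimetype2ext('video/webm'))
```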

@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2015.02.16'
__version__ = '2015.02.19.2'