Compare commits

..

41 Commits

Author SHA1 Message Date
1cf64ee468 release 2013.10.23.2 2013-10-23 18:38:09 +02:00
cdec0190c4 [dailymotion] Extract all the available formats (closes #1028) 2013-10-23 17:33:38 +02:00
2450bcb28b [nowvideo] Fix key extraction
Extract it from the embed page
2013-10-23 17:00:33 +02:00
3126050c0f Hide the video password on verbose mode 2013-10-23 16:32:17 +02:00
93b22c7828 [vimeo] fix the extraction for videos protected with password
Added a test video.
2013-10-23 16:31:53 +02:00
0a89b2852e release 2013.10.23.1 2013-10-23 15:12:33 +02:00
55b3e45bba [vimeo] Fix pro videos and player.vimeo.com urls
The old process can still be used for those videos.
Added RegexNotFoundError, which is raised by _search_regex if it can't extract the info.
2013-10-23 14:38:03 +02:00
365bcf6d97 Merge remote-tracking branch 'origin/master' 2013-10-23 11:40:46 +02:00
71907db3ba [vimeo] Fix normal videos (Fixes #1642)
Vimeo Pro Videos are still broken
2013-10-23 11:38:53 +02:00
6803655ced Merge pull request #1622 from rbrito/fix-extension
extractor: youtube: Set extension of AAC audio formats to m4a.
2013-10-22 15:16:26 -07:00
df1c39ec5c release 2013.10.23 2013-10-23 00:07:27 +02:00
80f55a9511 release 2013.10.22 2013-10-22 22:35:13 +02:00
7853cc5ae1 Merge remote-tracking branch 'origin/master'
Conflicts:
	youtube_dl/YoutubeDL.py
2013-10-22 22:30:06 +02:00
586a91b67f Expand tilde in template (Fixes #1639) 2013-10-22 22:28:26 +02:00
b028e96144 [arte.tv:creative] Update the title of the test 2013-10-22 21:06:06 +02:00
ce68b5907c [nhl:videocenter] Fix playlist title extraction 2013-10-22 21:01:16 +02:00
fe7e0c9825 Style fixes in YoutubeDL.py
Fixed some of the problems reported by pep8
2013-10-22 14:49:34 +02:00
12893efe01 Respect the download parameter in YoutubeDL.process_video_result if the extractor handle the format selection 2013-10-22 00:01:59 +02:00
a6387bfd3c [vimeo] Implement the new format selection system (closes PR #996)
Rebased and deleted some parts to use the new system instead of copying the one from YoutubeIE
2013-10-21 23:16:11 +02:00
f6a54188c2 [youtube] Use 'node is None' when checking if the video has automatic captions
It had stopped working and it reports a FutureWarning
2013-10-21 16:28:55 +02:00
cbbd9a9c69 Fix the duration field for the VideoDetective and InternetVideoArchive tests
Also remove the use of the old format system and the comment
2013-10-21 15:07:33 +02:00
685a9cd2f1 [googleplus] Fix upload_date extraction 2013-10-21 15:00:21 +02:00
182a107877 [arte] Set the format_note and the format_id fields (closes #1628) 2013-10-21 14:42:30 +02:00
8c51aa6506 The 'format' field now defaults to '{format_id} - {width}x{height}{format_note}'
Following the YoutubeIE format. The 'format_note' gives additional info about the format, for example '3D' or 'DASH video'.
2013-10-21 14:42:06 +02:00
3fd39e37f2 YoutubeDL: remove method that came from FileDownloader 2013-10-21 13:52:24 +02:00
49e86983e7 Allow to use the extension for the format selection
The best format with the extension is downloaded.
2013-10-21 13:31:55 +02:00
a9c58ad945 Accept requested formats to be in the format 35/best (closes #1552)
The format selection code is now an independent function.
2013-10-21 13:19:58 +02:00
f8b45beacc Merge remote-tracking branch 'rbrito/set-age'
Conflicts:
	youtube_dl/extractor/xhamster.py
2013-10-19 21:16:14 +02:00
9d92015d43 [xhamster] Add support for age_limit (Instead of #1627) 2013-10-19 21:09:48 +02:00
50a6150ed9 extractor: Set age limit on some adult-related extractors.
More age limit of videos for adult-related sites.

Note that, for redtube, I explicitly left the variable containing the age
limit, since the comment justifying the age limit is a good thing to have.

That being said, I included the age limit field on the test, to better
reflect what the information extractor does (even if it may not break the
automated tests).

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-10-19 14:19:25 -03:00
284acd57d6 Add an author email 2013-10-19 11:14:20 +02:00
8ed6b34477 extractor: Set age limit on some adult-related extractors.
This is similar in spirit to what was done in commit 8e590a117f.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-10-18 19:32:37 -03:00
f6f1fc9286 extractor: youtube: Fix extension of dash formats.
While we are at it, separate the audio formats from the video formats.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-10-18 18:53:00 -03:00
8e590a117f [xnxx] Add age_limit 2013-10-18 23:35:17 +02:00
d5594202aa Simplify release process 2013-10-18 23:34:55 +02:00
b186d949cf release 2013.10.18.2 2013-10-18 23:22:54 +02:00
3d2986063c [bash-completion] Do not use dash in function name (Fixes #1623) 2013-10-18 23:13:46 +02:00
41fd7c7e60 Add new option --abort-on-error 2013-10-18 23:09:32 +02:00
fdefe96bf2 Document %(format)s (#1612) 2013-10-18 23:09:08 +02:00
16f36a6fc9 extractor: youtube: Set extension of AAC audio formats to m4a.
This, in particular, eases downloading both audio and videos in DASH formats
before muxing them, which alleviates the problem that I exposed on issue

Furthermore, one may argue that this is, indeed, the case for correctness's
sake.

Signed-off-by: Rogério Brito <rbrito@ime.usp.br>
2013-10-18 17:50:55 -03:00
cce722b79c Add metavar to --cache-dir 2013-10-18 11:50:48 +02:00
24 changed files with 291 additions and 163 deletions

View File

@ -21,6 +21,8 @@ which means you can modify it, redistribute it or use it however you like.
sudo if needed)
-i, --ignore-errors continue on download errors, for example to to
skip unavailable videos in a playlist
--abort-on-error Abort downloading of further videos (in the
playlist or the command line) if an error occurs
--dump-user-agent display the current browser identification
--user-agent UA specify a custom user agent
--referer REF specify a custom referer, use if the video access
@ -30,7 +32,7 @@ which means you can modify it, redistribute it or use it however you like.
--extractor-descriptions Output descriptions of all supported extractors
--proxy URL Use the specified HTTP/HTTPS proxy
--no-check-certificate Suppress HTTPS certificate validation.
--cache-dir None Location in the filesystem where youtube-dl can
--cache-dir DIR Location in the filesystem where youtube-dl can
store downloaded information permanently. By
default $XDG_CACHE_HOME/youtube-dl or ~/.cache
/youtube-dl .
@ -76,15 +78,17 @@ which means you can modify it, redistribute it or use it however you like.
%(uploader_id)s for the uploader nickname if
different, %(autonumber)s to get an automatically
incremented number, %(ext)s for the filename
extension, %(upload_date)s for the upload date
(YYYYMMDD), %(extractor)s for the provider
(youtube, metacafe, etc), %(id)s for the video id
, %(playlist)s for the playlist the video is in,
%(playlist_index)s for the position in the
playlist and %% for a literal percent. Use - to
output to stdout. Can also be used to download to
a different directory, for example with -o '/my/d
ownloads/%(uploader)s/%(title)s-%(id)s.%(ext)s' .
extension, %(format)s for the format description
(like "22 - 1280x720" or "HD")%(upload_date)s for
the upload date (YYYYMMDD), %(extractor)s for the
provider (youtube, metacafe, etc), %(id)s for the
video id , %(playlist)s for the playlist the
video is in, %(playlist_index)s for the position
in the playlist and %% for a literal percent. Use
- to output to stdout. Can also be used to
download to a different directory, for example
with -o '/my/downloads/%(uploader)s/%(title)s-%(i
d)s.%(ext)s' .
--autonumber-size NUMBER Specifies the number of digits in %(autonumber)s
when it is present in output filename template or
--autonumber option is given

View File

@ -1,4 +1,4 @@
__youtube-dl()
__youtube_dl()
{
local cur prev opts
COMPREPLY=()
@ -15,4 +15,4 @@ __youtube-dl()
fi
}
complete -F __youtube-dl youtube-dl
complete -F __youtube_dl youtube-dl

View File

@ -88,10 +88,6 @@ ROOT=$(pwd)
"$ROOT/devscripts/gh-pages/update-sites.py"
git add *.html *.html.in update
git commit -m "release $version"
git show HEAD
read -p "Is it good, can I push? (y/n) " -n 1
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
echo
git push "$ROOT" gh-pages
git push "$ORIGIN_URL" gh-pages
)

View File

@ -63,6 +63,7 @@ setup(
' YouTube.com and other video sites.',
url='https://github.com/rg3/youtube-dl',
author='Ricardo Garcia',
author_email='ytdl@yt-dl.org',
maintainer='Philipp Hagemeister',
maintainer_email='phihag@phihag.de',
packages=['youtube_dl', 'youtube_dl.extractor'],

View File

@ -94,6 +94,40 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded[u'format_id'], u'excellent')
def test_format_selection(self):
formats = [
{u'format_id': u'35', u'ext': u'mp4'},
{u'format_id': u'45', u'ext': u'webm'},
{u'format_id': u'47', u'ext': u'webm'},
{u'format_id': u'2', u'ext': u'flv'},
]
info_dict = {u'formats': formats, u'extractor': u'test'}
ydl = YDL({'format': u'20/47'})
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], u'47')
ydl = YDL({'format': u'20/71/worst'})
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], u'35')
ydl = YDL()
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], u'2')
ydl = YDL({'format': u'webm/mp4'})
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], u'47')
ydl = YDL({'format': u'3gp/40/mp4'})
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], u'35')
if __name__ == '__main__':
unittest.main()

View File

@ -91,7 +91,7 @@ class YoutubeDL(object):
downloadarchive: File name of a file where all downloads are recorded.
Videos already present in the file are not downloaded
again.
The following parameters are not used by YoutubeDL itself, they are used by
the FileDownloader:
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
@ -216,10 +216,10 @@ class YoutubeDL(object):
If stderr is a tty file the 'WARNING:' will be colored
'''
if sys.stderr.isatty() and os.name != 'nt':
_msg_header=u'\033[0;33mWARNING:\033[0m'
_msg_header = u'\033[0;33mWARNING:\033[0m'
else:
_msg_header=u'WARNING:'
warning_message=u'%s %s' % (_msg_header,message)
_msg_header = u'WARNING:'
warning_message = u'%s %s' % (_msg_header, message)
self.to_stderr(warning_message)
def report_error(self, message, tb=None):
@ -234,19 +234,6 @@ class YoutubeDL(object):
error_message = u'%s %s' % (_msg_header, message)
self.trouble(error_message, tb)
def slow_down(self, start_time, byte_counter):
"""Sleep if the download speed is over the rate limit."""
rate_limit = self.params.get('ratelimit', None)
if rate_limit is None or byte_counter == 0:
return
now = time.time()
elapsed = now - start_time
if elapsed <= 0.0:
return
speed = float(byte_counter) / elapsed
if speed > rate_limit:
time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
def report_writedescription(self, descfn):
""" Report that the description file is being written """
self.to_screen(u'[info] Writing video description to: ' + descfn)
@ -288,13 +275,15 @@ class YoutubeDL(object):
if template_dict['playlist_index'] is not None:
template_dict['playlist_index'] = u'%05d' % template_dict['playlist_index']
sanitize = lambda k,v: sanitize_filename(
sanitize = lambda k, v: sanitize_filename(
u'NA' if v is None else compat_str(v),
restricted=self.params.get('restrictfilenames'),
is_id=(k==u'id'))
template_dict = dict((k, sanitize(k, v)) for k,v in template_dict.items())
is_id=(k == u'id'))
template_dict = dict((k, sanitize(k, v))
for k, v in template_dict.items())
filename = self.params['outtmpl'] % template_dict
tmpl = os.path.expanduser(self.params['outtmpl'])
filename = tmpl % template_dict
return filename
except KeyError as err:
self.report_error(u'Erroneous output template')
@ -328,14 +317,14 @@ class YoutubeDL(object):
return (u'%(title)s has already been recorded in archive'
% info_dict)
return None
def extract_info(self, url, download=True, ie_key=None, extra_info={}):
'''
Returns a list with a dictionary for each video we find.
If 'download', also downloads the videos.
extra_info is a dict containing the extra values to add to each result
'''
if ie_key:
ies = [self.get_info_extractor(ie_key)]
else:
@ -377,7 +366,7 @@ class YoutubeDL(object):
raise
else:
self.report_error(u'no suitable InfoExtractor: %s' % url)
def process_ie_result(self, ie_result, download=True, extra_info={}):
"""
Take the result of the ie(may be modified) and resolve all unresolved
@ -401,7 +390,7 @@ class YoutubeDL(object):
elif result_type == 'playlist':
# We process each entry in the playlist
playlist = ie_result.get('title', None) or ie_result.get('id', None)
self.to_screen(u'[download] Downloading playlist: %s' % playlist)
self.to_screen(u'[download] Downloading playlist: %s' % playlist)
playlist_results = []
@ -419,12 +408,12 @@ class YoutubeDL(object):
self.to_screen(u"[%s] playlist '%s': Collected %d video ids (downloading %d of them)" %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
for i,entry in enumerate(entries,1):
self.to_screen(u'[download] Downloading video #%s of %s' %(i, n_entries))
for i, entry in enumerate(entries, 1):
self.to_screen(u'[download] Downloading video #%s of %s' % (i, n_entries))
extra = {
'playlist': playlist,
'playlist_index': i + playliststart,
}
'playlist': playlist,
'playlist_index': i + playliststart,
}
if not 'extractor' in entry:
# We set the extractor, if it's an url it will be set then to
# the new extractor, but if it's already a video we must make
@ -448,6 +437,22 @@ class YoutubeDL(object):
else:
raise Exception('Invalid result type: %s' % result_type)
def select_format(self, format_spec, available_formats):
if format_spec == 'best' or format_spec is None:
return available_formats[-1]
elif format_spec == 'worst':
return available_formats[0]
else:
extensions = [u'mp4', u'flv', u'webm', u'3gp']
if format_spec in extensions:
filter_f = lambda f: f['ext'] == format_spec
else:
filter_f = lambda f: f['format_id'] == format_spec
matches = list(filter(filter_f, available_formats))
if matches:
return matches[-1]
return None
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
@ -458,7 +463,8 @@ class YoutubeDL(object):
# This extractors handle format selection themselves
if info_dict['extractor'] in [u'youtube', u'Youku', u'YouPorn', u'mixcloud']:
self.process_info(info_dict)
if download:
self.process_info(info_dict)
return info_dict
# We now pick which formats have to be downloaded
@ -470,17 +476,14 @@ class YoutubeDL(object):
# We check that all the formats have the format and format_id fields
for (i, format) in enumerate(formats):
if format.get('format') is None:
if format.get('height') is not None:
if format.get('width') is not None:
format_desc = u'%sx%s' % (format['width'], format['height'])
else:
format_desc = u'%sp' % format['height']
else:
format_desc = '???'
format['format'] = format_desc
if format.get('format_id') is None:
format['format_id'] = compat_str(i)
if format.get('format') is None:
format['format'] = u'{id} - {res}{note}'.format(
id=format['format_id'],
res=self.format_resolution(format),
note=u' ({})'.format(format['format_note']) if format.get('format_note') is not None else '',
)
if self.params.get('listformats', None):
self.list_formats(info_dict)
@ -502,22 +505,20 @@ class YoutubeDL(object):
formats = sorted(formats, key=_free_formats_key)
req_format = self.params.get('format', 'best')
if req_format is None:
req_format = 'best'
formats_to_download = []
if req_format == 'best' or req_format is None:
formats_to_download = [formats[-1]]
elif req_format == 'worst':
formats_to_download = [formats[0]]
# The -1 is for supporting YoutubeIE
elif req_format in ('-1', 'all'):
if req_format in ('-1', 'all'):
formats_to_download = formats
else:
# We can accept formats requestd in the format: 34/10/5, we pick
# We can accept formats requestd in the format: 34/5/best, we pick
# the first that is available, starting from left
req_formats = req_format.split('/')
for rf in req_formats:
matches = filter(lambda f:f['format_id'] == rf ,formats)
if matches:
formats_to_download = [matches[0]]
selected_format = self.select_format(rf, formats)
if selected_format is not None:
formats_to_download = [selected_format]
break
if not formats_to_download:
raise ExtractorError(u'requested format not available')
@ -608,20 +609,20 @@ class YoutubeDL(object):
if self.params.get('writeannotations', False):
try:
annofn = filename + u'.annotations.xml'
self.report_writeannotations(annofn)
with io.open(encodeFilename(annofn), 'w', encoding='utf-8') as annofile:
annofile.write(info_dict['annotations'])
annofn = filename + u'.annotations.xml'
self.report_writeannotations(annofn)
with io.open(encodeFilename(annofn), 'w', encoding='utf-8') as annofile:
annofile.write(info_dict['annotations'])
except (KeyError, TypeError):
self.report_warning(u'There are no annotations to write.')
except (OSError, IOError):
self.report_error(u'Cannot write annotations file: ' + annofn)
return
self.report_error(u'Cannot write annotations file: ' + annofn)
return
subtitles_are_requested = any([self.params.get('writesubtitles', False),
self.params.get('writeautomaticsub')])
if subtitles_are_requested and 'subtitles' in info_dict and info_dict['subtitles']:
if subtitles_are_requested and 'subtitles' in info_dict and info_dict['subtitles']:
# subtitles download errors are already managed as troubles in relevant IE
# that way it will silently go on when used with unsupporting IE
subtitles = info_dict['subtitles']
@ -643,7 +644,7 @@ class YoutubeDL(object):
infofn = filename + u'.info.json'
self.report_writeinfojson(infofn)
try:
json_info_dict = dict((k, v) for k,v in info_dict.items() if not k in ['urlhandle'])
json_info_dict = dict((k, v) for k, v in info_dict.items() if not k in ['urlhandle'])
write_json_file(json_info_dict, encodeFilename(infofn))
except (OSError, IOError):
self.report_error(u'Cannot write metadata to JSON file ' + infofn)
@ -713,7 +714,7 @@ class YoutubeDL(object):
keep_video = None
for pp in self._pps:
try:
keep_video_wish,new_info = pp.run(info)
keep_video_wish, new_info = pp.run(info)
if keep_video_wish is not None:
if keep_video_wish:
keep_video = keep_video_wish
@ -752,16 +753,31 @@ class YoutubeDL(object):
with locked_file(fn, 'a', encoding='utf-8') as archive_file:
archive_file.write(vid_id + u'\n')
@staticmethod
def format_resolution(format):
if format.get('height') is not None:
if format.get('width') is not None:
res = u'%sx%s' % (format['width'], format['height'])
else:
res = u'%sp' % format['height']
else:
res = '???'
return res
def list_formats(self, info_dict):
formats_s = []
for format in info_dict.get('formats', [info_dict]):
formats_s.append("%s\t:\t%s\t[%s]" % (format['format_id'],
format['ext'],
format.get('format', '???'),
)
)
formats_s.append(u'%-15s: %-5s %-15s[%s]' % (
format['format_id'],
format['ext'],
format.get('format_note') or '-',
self.format_resolution(format),
)
)
if len(formats_s) != 1:
formats_s[0] += ' (worst)'
formats_s[0] += ' (worst)'
formats_s[-1] += ' (best)'
formats_s = "\n".join(formats_s)
self.to_screen(u"[info] Available formats for %s:\nformat code\textension\n%s" % (info_dict['id'], formats_s))
self.to_screen(u'[info] Available formats for %s:\n'
u'format code extension note resolution\n%s' % (
info_dict['id'], formats_s))

View File

@ -133,7 +133,7 @@ def parseOpts(overrideArguments=None):
def _hide_login_info(opts):
opts = list(opts)
for private_opt in ['-p', '--password', '-u', '--username']:
for private_opt in ['-p', '--password', '-u', '--username', '--video-password']:
try:
i = opts.index(private_opt)
opts[i+1] = '<PRIVATE>'
@ -179,6 +179,9 @@ def parseOpts(overrideArguments=None):
action='store_true', dest='update_self', help='update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)')
general.add_option('-i', '--ignore-errors',
action='store_true', dest='ignoreerrors', help='continue on download errors, for example to to skip unavailable videos in a playlist', default=False)
general.add_option('--abort-on-error',
action='store_false', dest='ignoreerrors',
help='Abort downloading of further videos (in the playlist or the command line) if an error occurs')
general.add_option('--dump-user-agent',
action='store_true', dest='dump_user_agent',
help='display the current browser identification', default=False)
@ -196,7 +199,7 @@ def parseOpts(overrideArguments=None):
general.add_option('--proxy', dest='proxy', default=None, help='Use the specified HTTP/HTTPS proxy', metavar='URL')
general.add_option('--no-check-certificate', action='store_true', dest='no_check_certificate', default=False, help='Suppress HTTPS certificate validation.')
general.add_option(
'--cache-dir', dest='cachedir', default=get_cachedir(),
'--cache-dir', dest='cachedir', default=get_cachedir(), metavar='DIR',
help='Location in the filesystem where youtube-dl can store downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl or ~/.cache/youtube-dl .')
general.add_option(
'--no-cache-dir', action='store_const', const=None, dest='cachedir',
@ -332,7 +335,9 @@ def parseOpts(overrideArguments=None):
help=('output filename template. Use %(title)s to get the title, '
'%(uploader)s for the uploader name, %(uploader_id)s for the uploader nickname if different, '
'%(autonumber)s to get an automatically incremented number, '
'%(ext)s for the filename extension, %(upload_date)s for the upload date (YYYYMMDD), '
'%(ext)s for the filename extension, '
'%(format)s for the format description (like "22 - 1280x720" or "HD")'
'%(upload_date)s for the upload date (YYYYMMDD), '
'%(extractor)s for the provider (youtube, metacafe, etc), '
'%(id)s for the video id , %(playlist)s for the playlist the video is in, '
'%(playlist_index)s for the position in the playlist and %% for a literal percent. '

View File

@ -174,12 +174,27 @@ class ArteTVPlus7IE(InfoExtractor):
# Some formats use the m3u8 protocol
formats = filter(lambda f: f.get('videoFormat') != 'M3U8', formats)
# We order the formats by quality
formats = sorted(formats, key=lambda f: int(f.get('height',-1)))
formats = list(formats) # in python3 filter returns an iterator
if re.match(r'[A-Z]Q', formats[0]['quality']) is not None:
sort_key = lambda f: ['HQ', 'MQ', 'EQ', 'SQ'].index(f['quality'])
else:
sort_key = lambda f: int(f.get('height',-1))
formats = sorted(formats, key=sort_key)
# Prefer videos without subtitles in the same language
formats = sorted(formats, key=lambda f: re.match(r'VO(F|A)-STM\1', f.get('versionCode', '')) is None)
# Pick the best quality
def _format(format_info):
quality = format_info['quality']
m_quality = re.match(r'\w*? - (\d*)p', quality)
if m_quality is not None:
quality = m_quality.group(1)
if format_info.get('versionCode') is not None:
format_id = u'%s-%s' % (quality, format_info['versionCode'])
else:
format_id = quality
info = {
'format_id': format_id,
'format_note': format_info.get('versionLibelle'),
'width': format_info.get('width'),
'height': format_info.get('height'),
}
@ -192,8 +207,6 @@ class ArteTVPlus7IE(InfoExtractor):
info['ext'] = determine_ext(info['url'])
return info
info_dict['formats'] = [_format(f) for f in formats]
# TODO: Remove when #980 has been merged
info_dict.update(info_dict['formats'][-1])
return info_dict
@ -207,7 +220,7 @@ class ArteTVCreativeIE(ArteTVPlus7IE):
u'url': u'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design',
u'file': u'050489-002.mp4',
u'info_dict': {
u'title': u'Agentur Amateur #2 - Corporate Design',
u'title': u'Agentur Amateur / Agence Amateur #2 : Corporate Design',
},
}

View File

@ -14,6 +14,7 @@ from ..utils import (
clean_html,
compiled_regex_type,
ExtractorError,
RegexNotFoundError,
unescapeHTML,
)
@ -61,9 +62,12 @@ class InfoExtractor(object):
* ext Will be calculated from url if missing
* format A human-readable description of the format
("mp4 container with h264/opus").
Calculated from width and height if missing.
Calculated from the format_id, width, height
and format_note fields if missing.
* format_id A short description of the format
("mp4_h264_opus" or "19")
* format_note Additional info about the format
("3D" or "DASH video")
* width Width of the video, if known
* height Height of the video, if known
@ -228,7 +232,7 @@ class InfoExtractor(object):
Perform a regex search on the given string, using a single or a list of
patterns returning the first matching group.
In case of failure return a default value or raise a WARNING or a
ExtractorError, depending on fatal, specifying the field name.
RegexNotFoundError, depending on fatal, specifying the field name.
"""
if isinstance(pattern, (str, compat_str, compiled_regex_type)):
mobj = re.search(pattern, string, flags)
@ -248,7 +252,7 @@ class InfoExtractor(object):
elif default is not None:
return default
elif fatal:
raise ExtractorError(u'Unable to extract %s' % _name)
raise RegexNotFoundError(u'Unable to extract %s' % _name)
else:
self._downloader.report_warning(u'unable to extract %s; '
u'please report this issue on http://yt-dl.org/bug' % _name)

View File

@ -28,6 +28,15 @@ class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
_VALID_URL = r'(?i)(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/(?:embed/)?video/([^/]+)'
IE_NAME = u'dailymotion'
_FORMATS = [
(u'stream_h264_ld_url', u'ld'),
(u'stream_h264_url', u'standard'),
(u'stream_h264_hq_url', u'hq'),
(u'stream_h264_hd_url', u'hd'),
(u'stream_h264_hd1080_url', u'hd180'),
]
_TESTS = [
{
u'url': u'http://www.dailymotion.com/video/x33vw9_tutoriel-de-youtubeur-dl-des-video_tech',
@ -60,7 +69,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
video_id = mobj.group(1).split('_')[0].split('?')[0]
video_extension = 'mp4'
url = 'http://www.dailymotion.com/video/%s' % video_id
# Retrieve video webpage to extract further information
@ -99,18 +107,24 @@ class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
msg = 'Couldn\'t get video, Dailymotion says: %s' % info['error']['title']
raise ExtractorError(msg, expected=True)
# TODO: support choosing qualities
for key in ['stream_h264_hd1080_url','stream_h264_hd_url',
'stream_h264_hq_url','stream_h264_url',
'stream_h264_ld_url']:
if info.get(key):#key in info and info[key]:
max_quality = key
self.to_screen(u'Using %s' % key)
break
else:
formats = []
for (key, format_id) in self._FORMATS:
video_url = info.get(key)
if video_url is not None:
m_size = re.search(r'H264-(\d+)x(\d+)', video_url)
if m_size is not None:
width, height = m_size.group(1), m_size.group(2)
else:
width, height = None, None
formats.append({
'url': video_url,
'ext': 'mp4',
'format_id': format_id,
'width': width,
'height': height,
})
if not formats:
raise ExtractorError(u'Unable to extract video URL')
video_url = info[max_quality]
# subtitles
video_subtitles = self.extract_subtitles(video_id)
@ -120,11 +134,10 @@ class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
return [{
'id': video_id,
'url': video_url,
'formats': formats,
'uploader': video_uploader,
'upload_date': video_upload_date,
'title': self._og_search_title(webpage),
'ext': video_extension,
'subtitles': video_subtitles,
'thumbnail': info['thumbnail_url']
}]

View File

@ -41,9 +41,9 @@ class GooglePlusIE(InfoExtractor):
# Extract update date
upload_date = self._html_search_regex(
r'''(?x)<a.+?class="o-T-s\s[^"]+"\s+style="display:\s*none"\s*>
r'''(?x)<a.+?class="o-U-s\s[^"]+"\s+style="display:\s*none"\s*>
([0-9]{4}-[0-9]{2}-[0-9]{2})</a>''',
webpage, u'upload date', fatal=False)
webpage, u'upload date', fatal=False, flags=re.VERBOSE)
if upload_date:
# Convert timestring to a format suitable for filename
upload_date = datetime.datetime.strptime(upload_date, "%Y-%m-%d")

View File

@ -19,7 +19,7 @@ class InternetVideoArchiveIE(InfoExtractor):
u'info_dict': {
u'title': u'SKYFALL',
u'description': u'In SKYFALL, Bond\'s loyalty to M is tested as her past comes back to haunt her. As MI6 comes under attack, 007 must track down and destroy the threat, no matter how personal the cost.',
u'duration': 156,
u'duration': 153,
},
}
@ -74,7 +74,7 @@ class InternetVideoArchiveIE(InfoExtractor):
})
formats = sorted(formats, key=lambda f: f['bitrate'])
info = {
return {
'id': video_id,
'title': item.find('title').text,
'formats': formats,
@ -82,6 +82,3 @@ class InternetVideoArchiveIE(InfoExtractor):
'description': item.find('description').text,
'duration': int(attr['duration']),
}
# TODO: Remove when #980 has been merged
info.update(formats[-1])
return info

View File

@ -90,8 +90,8 @@ class NHLVideocenterIE(NHLBaseInfoExtractor):
r'{statusIndex:0,index:0,.*?id:(.*?),'],
webpage, u'category id')
playlist_title = self._html_search_regex(
r'\?catid=%s">(.*?)</a>' % cat_id,
webpage, u'playlist title', flags=re.DOTALL)
r'tab0"[^>]*?>(.*?)</td>',
webpage, u'playlist title', flags=re.DOTALL).lower().capitalize()
data = compat_urllib_parse.urlencode({
'cid': cat_id,

View File

@ -20,7 +20,10 @@ class NowVideoIE(InfoExtractor):
video_id = mobj.group('id')
webpage_url = 'http://www.nowvideo.ch/video/' + video_id
embed_url = 'http://embed.nowvideo.ch/embed.php?v=' + video_id
webpage = self._download_webpage(webpage_url, video_id)
embed_page = self._download_webpage(embed_url, video_id,
u'Downloading embed page')
self.report_extraction(video_id)
@ -28,7 +31,7 @@ class NowVideoIE(InfoExtractor):
webpage, u'video title')
video_key = self._search_regex(r'var fkzd="(.*)";',
webpage, u'video key')
embed_page, u'video key')
api_call = "http://www.nowvideo.ch/api/player.api.php?file={0}&numOfErrors=0&cid=1&key={1}".format(video_id, video_key)
api_response = self._download_webpage(api_call, video_id,

View File

@ -10,7 +10,8 @@ class RedTubeIE(InfoExtractor):
u'file': u'66418.mp4',
u'md5': u'7b8c22b5e7098a3e1c09709df1126d2d',
u'info_dict': {
u"title": u"Sucked on a toilet"
u"title": u"Sucked on a toilet",
u"age_limit": 18,
}
}

View File

@ -16,7 +16,7 @@ class VideoDetectiveIE(InfoExtractor):
u'info_dict': {
u'title': u'KICK-ASS 2',
u'description': u'md5:65ba37ad619165afac7d432eaded6013',
u'duration': 138,
u'duration': 135,
},
}

View File

@ -1,3 +1,4 @@
# encoding: utf-8
import json
import re
import itertools
@ -10,6 +11,7 @@ from ..utils import (
clean_html,
get_element_by_attribute,
ExtractorError,
RegexNotFoundError,
std_headers,
unsmuggle_url,
)
@ -25,7 +27,7 @@ class VimeoIE(InfoExtractor):
{
u'url': u'http://vimeo.com/56015672',
u'file': u'56015672.mp4',
u'md5': u'8879b6cc097e987f02484baf890129e5',
u'md5': u'ae7a1d8b183758a0506b0622f37dfa14',
u'info_dict': {
u"upload_date": u"20121220",
u"description": u"This is a test case for youtube-dl.\nFor more information, see github.com/rg3/youtube-dl\nTest chars: \u2605 \" ' \u5e78 / \\ \u00e4 \u21ad \U0001d550",
@ -54,7 +56,22 @@ class VimeoIE(InfoExtractor):
u'title': u'Kathy Sierra: Building the minimum Badass User, Business of Software',
u'uploader': u'The BLN & Business of Software',
},
}
},
{
u'url': u'http://vimeo.com/68375962',
u'file': u'68375962.mp4',
u'md5': u'aaf896bdb7ddd6476df50007a0ac0ae7',
u'note': u'Video protected with password',
u'info_dict': {
u'title': u'youtube-dl password protected test video',
u'upload_date': u'20130614',
u'uploader_id': u'user18948128',
u'uploader': u'Jaime Marquínez Ferrándiz',
},
u'params': {
u'videopassword': u'youtube-dl',
},
},
]
def _login(self):
@ -129,18 +146,26 @@ class VimeoIE(InfoExtractor):
# Extract the config JSON
try:
config = self._search_regex([r' = {config:({.+?}),assets:', r'c=({.+?);'],
webpage, u'info section', flags=re.DOTALL)
config = json.loads(config)
except:
try:
config_url = self._html_search_regex(
r' data-config-url="(.+?)"', webpage, u'config URL')
config_json = self._download_webpage(config_url, video_id)
config = json.loads(config_json)
except RegexNotFoundError:
# For pro videos or player.vimeo.com urls
config = self._search_regex([r' = {config:({.+?}),assets:', r'c=({.+?);'],
webpage, u'info section', flags=re.DOTALL)
config = json.loads(config)
except Exception as e:
if re.search('The creator of this video has not given you permission to embed it on this domain.', webpage):
raise ExtractorError(u'The author has restricted the access to this video, try with the "--referer" option')
if re.search('If so please provide the correct password.', webpage):
if re.search('<form[^>]+?id="pw_form"', webpage) is not None:
self._verify_video_password(url, video_id, webpage)
return self._real_extract(url)
else:
raise ExtractorError(u'Unable to extract info section')
raise ExtractorError(u'Unable to extract info section',
cause=e)
# Extract title
video_title = config["video"]["title"]
@ -179,46 +204,45 @@ class VimeoIE(InfoExtractor):
# Vimeo specific: extract video codec and quality information
# First consider quality, then codecs, then take everything
# TODO bind to format param
codecs = [('h264', 'mp4'), ('vp8', 'flv'), ('vp6', 'flv')]
codecs = [('vp6', 'flv'), ('vp8', 'flv'), ('h264', 'mp4')]
files = { 'hd': [], 'sd': [], 'other': []}
config_files = config["video"].get("files") or config["request"].get("files")
for codec_name, codec_extension in codecs:
if codec_name in config_files:
if 'hd' in config_files[codec_name]:
files['hd'].append((codec_name, codec_extension, 'hd'))
elif 'sd' in config_files[codec_name]:
files['sd'].append((codec_name, codec_extension, 'sd'))
for quality in config_files.get(codec_name, []):
format_id = '-'.join((codec_name, quality)).lower()
key = quality if quality in files else 'other'
video_url = None
if isinstance(config_files[codec_name], dict):
file_info = config_files[codec_name][quality]
video_url = file_info.get('url')
else:
files['other'].append((codec_name, codec_extension, config_files[codec_name][0]))
file_info = {}
if video_url is None:
video_url = "http://player.vimeo.com/play_redirect?clip_id=%s&sig=%s&time=%s&quality=%s&codecs=%s&type=moogaloop_local&embed_location=" \
%(video_id, sig, timestamp, quality, codec_name.upper())
for quality in ('hd', 'sd', 'other'):
if len(files[quality]) > 0:
video_quality = files[quality][0][2]
video_codec = files[quality][0][0]
video_extension = files[quality][0][1]
self.to_screen(u'%s: Downloading %s file at %s quality' % (video_id, video_codec.upper(), video_quality))
break
else:
files[key].append({
'ext': codec_extension,
'url': video_url,
'format_id': format_id,
'width': file_info.get('width'),
'height': file_info.get('height'),
})
formats = []
for key in ('other', 'sd', 'hd'):
formats += files[key]
if len(formats) == 0:
raise ExtractorError(u'No known codec found')
video_url = None
if isinstance(config_files[video_codec], dict):
video_url = config_files[video_codec][video_quality].get("url")
if video_url is None:
video_url = "http://player.vimeo.com/play_redirect?clip_id=%s&sig=%s&time=%s&quality=%s&codecs=%s&type=moogaloop_local&embed_location=" \
%(video_id, sig, timestamp, video_quality, video_codec.upper())
return [{
'id': video_id,
'url': video_url,
'uploader': video_uploader,
'uploader_id': video_uploader_id,
'upload_date': video_upload_date,
'title': video_title,
'ext': video_extension,
'thumbnail': video_thumbnail,
'description': video_description,
'formats': formats,
}]

View File

@ -19,7 +19,8 @@ class XHamsterIE(InfoExtractor):
u'info_dict': {
u"upload_date": u"20121014",
u"uploader_id": u"Ruseful2011",
u"title": u"FemaleAgent Shy beauty takes the bait"
u"title": u"FemaleAgent Shy beauty takes the bait",
u"age_limit": 18,
}
},
{
@ -27,9 +28,10 @@ class XHamsterIE(InfoExtractor):
u'file': u'2221348.flv',
u'md5': u'e767b9475de189320f691f49c679c4c7',
u'info_dict': {
u"upload_date": u"20130914",
u"uploader_id": u"jojo747400",
u"title": u"Britney Spears Sexy Booty"
u"upload_date": u"20130914",
u"uploader_id": u"jojo747400",
u"title": u"Britney Spears Sexy Booty",
u"age_limit": 18,
}
}]
@ -72,6 +74,8 @@ class XHamsterIE(InfoExtractor):
video_thumbnail = self._search_regex(r'\'image\':\'(?P<thumbnail>[^\']+)\'',
webpage, u'thumbnail', fatal=False)
age_limit = self._rta_search(webpage)
return [{
'id': video_id,
'url': video_url,
@ -80,5 +84,6 @@ class XHamsterIE(InfoExtractor):
'description': video_description,
'upload_date': video_upload_date,
'uploader_id': video_uploader_id,
'thumbnail': video_thumbnail
'thumbnail': video_thumbnail,
'age_limit': age_limit,
}]

View File

@ -18,7 +18,8 @@ class XNXXIE(InfoExtractor):
u'file': u'1135332.flv',
u'md5': u'0831677e2b4761795f68d417e0b7b445',
u'info_dict': {
u"title": u"lida \u00bb Naked Funny Actress (5)"
u"title": u"lida \u00bb Naked Funny Actress (5)",
u"age_limit": 18,
}
}
@ -50,4 +51,5 @@ class XNXXIE(InfoExtractor):
'ext': 'flv',
'thumbnail': video_thumbnail,
'description': None,
'age_limit': 18,
}]

View File

@ -13,7 +13,8 @@ class XVideosIE(InfoExtractor):
u'file': u'939581.flv',
u'md5': u'1d0c835822f0a71a7bf011855db929d0',
u'info_dict': {
u"title": u"Funny Porns By >>>>S<<<<<< -1"
u"title": u"Funny Porns By >>>>S<<<<<< -1",
u"age_limit": 18,
}
}
@ -46,6 +47,7 @@ class XVideosIE(InfoExtractor):
'ext': 'flv',
'thumbnail': video_thumbnail,
'description': None,
'age_limit': 18,
}
return [info]

View File

@ -26,7 +26,8 @@ class YouPornIE(InfoExtractor):
u"upload_date": u"20101221",
u"description": u"Love & Sex Answers: http://bit.ly/DanAndJenn -- Is It Unhealthy To Masturbate Daily?",
u"uploader": u"Ask Dan And Jennifer",
u"title": u"Sex Ed: Is It Safe To Masturbate Daily?"
u"title": u"Sex Ed: Is It Safe To Masturbate Daily?",
u"age_limit": 18,
}
}

View File

@ -236,11 +236,13 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
'136': 'mp4',
'137': 'mp4',
'138': 'mp4',
'139': 'mp4',
'140': 'mp4',
'141': 'mp4',
'160': 'mp4',
# Dash mp4 audio
'139': 'm4a',
'140': 'm4a',
'141': 'm4a',
# Dash webm
'171': 'webm',
'172': 'webm',
@ -1150,7 +1152,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
list_page = self._download_webpage(list_url, video_id)
caption_list = xml.etree.ElementTree.fromstring(list_page.encode('utf-8'))
original_lang_node = caption_list.find('track')
if not original_lang_node or original_lang_node.attrib.get('kind') != 'asr' :
if original_lang_node is None or original_lang_node.attrib.get('kind') != 'asr' :
self._downloader.report_warning(u'Video doesn\'t have automatic captions')
return {}
original_lang = original_lang_node.attrib['lang_code']

View File

@ -572,6 +572,11 @@ class ExtractorError(Exception):
return u''.join(traceback.format_tb(self.traceback))
class RegexNotFoundError(ExtractorError):
"""Error when a regex didn't match"""
pass
class DownloadError(Exception):
"""Download Error exception.

View File

@ -1,2 +1,2 @@
__version__ = '2013.10.18.1'
__version__ = '2013.10.23.2'