release 2015.02.10

[extractor/common] Wrap extractor errors (Fixes #1194)
For now, we just wrap some common errors. More may follow. We do not want to catch actual programming errors in the extractors, such as 1 // 0.
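
A minimal, self-contained sketch of the wrapping behaviour described above. It is not the project's code: `ExtractorError` here is a simplified stand-in, and `SketchExtractor`, `MissingKeyExtractor`, `BuggyExtractor` and `classify` are hypothetical names used only for illustration. It assumes Python 3 and uses `http.client` directly instead of the project's compat layer.

```python
import http.client


class ExtractorError(Exception):
    """Simplified stand-in for the project's ExtractorError (assumption)."""

    def __init__(self, msg, cause=None, expected=False):
        super().__init__(msg)
        self.cause = cause
        self.expected = expected


class SketchExtractor:
    """Hypothetical base class that wraps a few common errors in extract()."""

    def _real_extract(self, url):
        raise NotImplementedError

    def extract(self, url):
        try:
            return self._real_extract(url)
        except ExtractorError:
            # Already the right type: pass it through untouched.
            raise
        except http.client.IncompleteRead as e:
            # Network trouble is expected and not a bug in the extractor.
            raise ExtractorError(
                'A network error has occurred.', cause=e, expected=True)
        except KeyError as e:
            # A missing key usually means the site changed its data layout.
            raise ExtractorError('An extractor error has occurred.', cause=e)


class MissingKeyExtractor(SketchExtractor):
    def _real_extract(self, url):
        return {}['title']  # raises KeyError, which gets wrapped


class BuggyExtractor(SketchExtractor):
    def _real_extract(self, url):
        return 1 // 0  # programming error, deliberately not wrapped


def classify(extractor, url='http://example.com/'):
    """Report whether an error reached the caller wrapped or unwrapped."""
    try:
        extractor.extract(url)
    except ExtractorError as e:
        return 'wrapped:' + type(e.cause).__name__
    except Exception as e:
        return 'unwrapped:' + type(e).__name__
    return 'ok'
```

With these definitions, `classify(MissingKeyExtractor())` reports a wrapped `KeyError`, while `classify(BuggyExtractor())` lets the `ZeroDivisionError` propagate unchanged, so a genuine bug still surfaces with its original traceback.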
2015-02-10 01:19:52 +01:00 · 2015-02-10 01:17:23 +01:00 · 2015-02-09 19:08:51 +01:00 · 2015-02-09 16:05:01 +01:00 · 2015-02-09 15:59:19 +01:00 · 2015-02-09 15:59:14 +01:00
29 changed files with 633 additions and 214 deletions
--- a/2
+++ b/2
@ -108,3 +108,5 @@ Enam Mijbah Noor
 David Luhmer
 Shaya Goldberg
 Paul Hartmann
+Frans de Jonge
+Robin de Rooij
--- a/README.md
+++ b/README.md
@ -292,18 +292,20 @@ which means you can modify it, redistribute it or use it however you like.
                                     video results by putting a condition in
                                     brackets, as in -f "best[height=720]" (or
                                     -f "[filesize>10M]").  This works for
-                                     filesize, height, width, tbr, abr, vbr, and
-                                     fps and the comparisons <, <=, >, >=, =, !=
-                                     . Formats for which the value is not known
-                                     are excluded unless you put a question mark
-                                     (?) after the operator. You can combine
-                                     format filters, so  -f "[height <=?
-                                     720][tbr>500]" selects up to 720p videos
-                                     (or videos where the height is not known)
-                                     with a bitrate of at least 500 KBit/s. By
-                                     default, youtube-dl will pick the best
-                                     quality. Use commas to download multiple
-                                     audio formats, such as -f
+                                     filesize, height, width, tbr, abr, vbr,
+                                     asr, and fps and the comparisons <, <=, >,
+                                     >=, =, != and for ext, acodec, vcodec,
+                                     container, and protocol and the comparisons
+                                     =, != . Formats for which the value is not
+                                     known are excluded unless you put a
+                                     question mark (?) after the operator. You
+                                     can combine format filters, so  -f "[height
+                                     <=? 720][tbr>500]" selects up to 720p
+                                     videos (or videos where the height is not
+                                     known) with a bitrate of at least 500
+                                     KBit/s. By default, youtube-dl will pick
+                                     the best quality. Use commas to download
+                                     multiple audio formats, such as -f
                                     136/137/mp4/bestvideo,140/m4a/bestaudio.
                                     You can merge the video and audio of two
                                     formats into a single file using -f <video-
@ -532,6 +534,14 @@ Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the opt
    youtube-dl -- -wNyEUrxzFU
    youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU"

+### Can you add support for this anime video site, or site which shows current movies for free?
+
+As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. one that has been uploaded by the creator or the creator's distributor, or is published under a free license), the service is probably unfit for inclusion in youtube-dl.
+
+A notice on the service stating that it does not host the infringing content, but merely links to those who do, is evidence that the service should **not** be included in youtube-dl. The same goes for any DMCA notice when the whole front page of the service is filled with videos it is not allowed to distribute. A "fair use" notice is equally unconvincing if the service shows copyright-protected videos in full without authorization.
+
+Support requests for services that **do** purchase the rights to distribute their content are perfectly fine though. If in doubt, you can simply include a source that mentions the legitimate purchase of content.
+
 ### How can I detect whether a given URL is supported by youtube-dl?

 For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl reports a URL of a service in that list as unsupported. In that case, simply report a bug.
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@ -14,6 +14,7 @@
 - **AddAnime**
 - **AdobeTV**
 - **AdultSwim**
+ - **Aftenposten**
 - **Aftonbladet**
 - **AlJazeera**
 - **Allocine**
@ -391,6 +392,7 @@
 - **StreamCZ**
 - **StreetVoice**
 - **SunPorno**
+ - **SVTPlay**
 - **SWRMediathek**
 - **Syfy**
 - **SztvHu**
@ -441,6 +443,7 @@
 - **tvp.pl**
 - **tvp.pl:Series**
 - **TVPlay**: TV3Play and related services
+ - **Tweakers**
 - **twitch:bookmarks**
 - **twitch:chapter**
 - **twitch:past_broadcasts**
--- a/test/test_YoutubeDL.py
+++ b/test/test_YoutubeDL.py
@ -13,6 +13,7 @@ import copy
 from test.helper import FakeYDL, assertRegexpMatches
 from youtube_dl import YoutubeDL
 from youtube_dl.extractor import YoutubeIE
+from youtube_dl.postprocessor.common import PostProcessor


 class YDL(FakeYDL):
@ -370,5 +371,35 @@ class TestFormatSelection(unittest.TestCase):
            'vbr': 10,
        }), '^\s*10k$')

+    def test_postprocessors(self):
+        filename = 'post-processor-testfile.mp4'
+        audiofile = filename + '.mp3'
+
+        class SimplePP(PostProcessor):
+            def run(self, info):
+                with open(audiofile, 'wt') as f:
+                    f.write('EXAMPLE')
+                info['filepath']
+                return False, info
+
+        def run_pp(params):
+            with open(filename, 'wt') as f:
+                f.write('EXAMPLE')
+            ydl = YoutubeDL(params)
+            ydl.add_post_processor(SimplePP())
+            ydl.post_process(filename, {'filepath': filename})
+
+        run_pp({'keepvideo': True})
+        self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
+        self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
+        os.unlink(filename)
+        os.unlink(audiofile)
+
+        run_pp({'keepvideo': False})
+        self.assertFalse(os.path.exists(filename), '%s exists' % filename)
+        self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
+        os.unlink(audiofile)
+
+
 if __name__ == '__main__':
    unittest.main()
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@ -826,27 +826,44 @@ class YoutubeDL(object):
            '!=': operator.ne,
        }
        operator_rex = re.compile(r'''(?x)\s*\[
-            (?P<key>width|height|tbr|abr|vbr|filesize|fps)
+            (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
            \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
            (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
            \]$
            ''' % '|'.join(map(re.escape, OPERATORS.keys())))
        m = operator_rex.search(format_spec)
+        if m:
+            try:
+                comparison_value = int(m.group('value'))
+            except ValueError:
+                comparison_value = parse_filesize(m.group('value'))
+                if comparison_value is None:
+                    comparison_value = parse_filesize(m.group('value') + 'B')
+                if comparison_value is None:
+                    raise ValueError(
+                        'Invalid value %r in format specification %r' % (
+                            m.group('value'), format_spec))
+            op = OPERATORS[m.group('op')]
+
+        if not m:
+            STR_OPERATORS = {
+                '=': operator.eq,
+                '!=': operator.ne,
+            }
+            str_operator_rex = re.compile(r'''(?x)\s*\[
+                \s*(?P<key>ext|acodec|vcodec|container|protocol)
+                \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
+                \s*(?P<value>[a-zA-Z0-9_-]+)
+                \s*\]$
+                ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
+            m = str_operator_rex.search(format_spec)
+            if m:
+                comparison_value = m.group('value')
+                op = STR_OPERATORS[m.group('op')]
+
        if not m:
            raise ValueError('Invalid format specification %r' % format_spec)

-        try:
-            comparison_value = int(m.group('value'))
-        except ValueError:
-            comparison_value = parse_filesize(m.group('value'))
-            if comparison_value is None:
-                comparison_value = parse_filesize(m.group('value') + 'B')
-            if comparison_value is None:
-                raise ValueError(
-                    'Invalid value %r in format specification %r' % (
-                        m.group('value'), format_spec))
-        op = OPERATORS[m.group('op')]
-
        def _filter(f):
            actual_value = f.get(m.group('key'))
            if actual_value is None:
@ -938,6 +955,9 @@ class YoutubeDL(object):
            def has_header(self, h):
                return h in self.headers

+            def get_header(self, h, default=None):
+                return self.headers.get(h, default)
+
        pr = _PseudoRequest(info_dict['url'])
        self.cookiejar.add_cookie_header(pr)
        return pr.headers.get('Cookie')
@ -1076,7 +1096,8 @@ class YoutubeDL(object):
                                else self.params['merge_output_format'])
                            selected_format = {
                                'requested_formats': formats_info,
-                                'format': rf,
+                                'format': '%s+%s' % (formats_info[0].get('format'),
+                                                     formats_info[1].get('format')),
                                'format_id': '%s+%s' % (formats_info[0].get('format_id'),
                                                        formats_info[1].get('format_id')),
                                'width': formats_info[0].get('width'),
@ -1525,7 +1546,6 @@ class YoutubeDL(object):
            line(f, idlen) for f in formats
            if f.get('preference') is None or f['preference'] >= -1000]
        if len(formats) > 1:
-            formats_s[0] += (' ' if self._format_note(formats[0]) else '') + '(worst)'
            formats_s[-1] += (' ' if self._format_note(formats[-1]) else '') + '(best)'

        header_line = line({
--- a/youtube_dl/extractor/__init__.py
+++ b/youtube_dl/extractor/__init__.py
@ -6,6 +6,7 @@ from .academicearth import AcademicEarthCourseIE
 from .addanime import AddAnimeIE
 from .adobetv import AdobeTVIE
 from .adultswim import AdultSwimIE
+from .aftenposten import AftenpostenIE
 from .aftonbladet import AftonbladetIE
 from .aljazeera import AlJazeeraIE
 from .alphaporno import AlphaPornoIE
@ -427,6 +428,7 @@ from .streamcloud import StreamcloudIE
 from .streamcz import StreamCZIE
 from .streetvoice import StreetVoiceIE
 from .sunporno import SunPornoIE
+from .svtplay import SVTPlayIE
 from .swrmediathek import SWRMediathekIE
 from .syfy import SyfyIE
 from .sztvhu import SztvHuIE
@ -475,6 +477,7 @@ from .tutv import TutvIE
 from .tvigle import TvigleIE
 from .tvp import TvpIE, TvpSeriesIE
 from .tvplay import TVPlayIE
+from .tweakers import TweakersIE
 from .twentyfourvideo import TwentyFourVideoIE
 from .twitch import (
    TwitchVideoIE,
--- a/youtube_dl/extractor/aftenposten.py
+++ b/youtube_dl/extractor/aftenposten.py
@ -0,0 +1,103 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+    xpath_with_ns,
+    xpath_text,
+    find_xpath_attr,
+)
+
+
+class AftenpostenIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/([^/]+/)*(?P<id>[^/]+)-\d+\.html'
+
+    _TEST = {
+        'url': 'http://www.aftenposten.no/webtv/serier-og-programmer/sweatshopenglish/TRAILER-SWEATSHOP---I-cant-take-any-more-7800835.html?paging=&section=webtv_serierogprogrammer_sweatshop_sweatshopenglish',
+        'md5': 'fd828cd29774a729bf4d4425fe192972',
+        'info_dict': {
+            'id': '21039',
+            'ext': 'mov',
+            'title': 'TRAILER: "Sweatshop" - I can´t take any more',
+            'description': 'md5:21891f2b0dd7ec2f78d84a50e54f8238',
+            'timestamp': 1416927969,
+            'upload_date': '20141125',
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._html_search_regex(
+            r'data-xs-id="(\d+)"', webpage, 'video id')
+
+        data = self._download_xml(
+            'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)
+
+        NS_MAP = {
+            'atom': 'http://www.w3.org/2005/Atom',
+            'xt': 'http://xstream.dk/',
+            'media': 'http://search.yahoo.com/mrss/',
+        }
+
+        entry = data.find(xpath_with_ns('./atom:entry', NS_MAP))
+
+        title = xpath_text(
+            entry, xpath_with_ns('./atom:title', NS_MAP), 'title')
+        description = xpath_text(
+            entry, xpath_with_ns('./atom:summary', NS_MAP), 'description')
+        timestamp = parse_iso8601(xpath_text(
+            entry, xpath_with_ns('./atom:published', NS_MAP), 'upload date'))
+
+        formats = []
+        media_group = entry.find(xpath_with_ns('./media:group', NS_MAP))
+        for media_content in media_group.findall(xpath_with_ns('./media:content', NS_MAP)):
+            media_url = media_content.get('url')
+            if not media_url:
+                continue
+            tbr = int_or_none(media_content.get('bitrate'))
+            mobj = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', media_url)
+            if mobj:
+                formats.append({
+                    'url': mobj.group('url'),
+                    'play_path': 'mp4:%s' % mobj.group('playpath'),
+                    'app': mobj.group('app'),
+                    'ext': 'flv',
+                    'tbr': tbr,
+                    'format_id': 'rtmp-%d' % tbr,
+                })
+            else:
+                formats.append({
+                    'url': media_url,
+                    'tbr': tbr,
+                })
+        self._sort_formats(formats)
+
+        link = find_xpath_attr(
+            entry, xpath_with_ns('./atom:link', NS_MAP), 'rel', 'original')
+        if link is not None:
+            formats.append({
+                'url': link.get('href'),
+                'format_id': link.get('rel'),
+            })
+
+        thumbnails = [{
+            'url': splash.get('url'),
+            'width': int_or_none(splash.get('width')),
+            'height': int_or_none(splash.get('height')),
+        } for splash in media_group.findall(xpath_with_ns('./xt:splash', NS_MAP))]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'timestamp': timestamp,
+            'formats': formats,
+            'thumbnails': thumbnails,
+        }
--- a/youtube_dl/extractor/aparat.py
+++ b/youtube_dl/extractor/aparat.py
@ -20,6 +20,7 @@ class AparatIE(InfoExtractor):
            'id': 'wP8On',
            'ext': 'mp4',
            'title': 'تیم گلکسی 11 - زومیت',
+            'age_limit': 0,
        },
        # 'skip': 'Extremely unreliable',
    }
@ -34,7 +35,8 @@ class AparatIE(InfoExtractor):
                     video_id + '/vt/frame')
        webpage = self._download_webpage(embed_url, video_id)

-        video_urls = re.findall(r'fileList\[[0-9]+\]\s*=\s*"([^"]+)"', webpage)
+        video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
+            r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
        for i, video_url in enumerate(video_urls):
            req = HEADRequest(video_url)
            res = self._request_webpage(
@ -46,7 +48,7 @@ class AparatIE(InfoExtractor):

        title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
        thumbnail = self._search_regex(
-            r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
+            r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)

        return {
            'id': video_id,
@ -54,4 +56,5 @@ class AparatIE(InfoExtractor):
            'url': video_url,
            'ext': 'mp4',
            'thumbnail': thumbnail,
+            'age_limit': self._family_friendly_search(webpage),
        }
--- a/youtube_dl/extractor/bandcamp.py
+++ b/youtube_dl/extractor/bandcamp.py
@ -72,26 +72,29 @@ class BandcampIE(InfoExtractor):

        download_link = m_download.group(1)
        video_id = self._search_regex(
-            r'var TralbumData = {.*?id: (?P<id>\d+),?$',
-            webpage, 'video id', flags=re.MULTILINE | re.DOTALL)
+            r'(?ms)var TralbumData = {.*?id: (?P<id>\d+),?$',
+            webpage, 'video id')

        download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')
        # We get the dictionary of the track from some javascript code
-        info = re.search(r'items: (.*?),$', download_webpage, re.MULTILINE).group(1)
-        info = json.loads(info)[0]
+        all_info = self._parse_json(self._search_regex(
+            r'(?sm)items: (.*?),$', download_webpage, 'items'), video_id)
+        info = all_info[0]
        # We pick mp3-320 for now, until format selection can be easily implemented.
        mp3_info = info['downloads']['mp3-320']
        # If we try to use this url it says the link has expired
        initial_url = mp3_info['url']
-        re_url = r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$'
-        m_url = re.match(re_url, initial_url)
+        m_url = re.match(
+            r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$',
+            initial_url)
        # We build the url we will use to get the final track url
        # This url is build in Bandcamp in the script download_bunde_*.js
        request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), video_id, m_url.group('ts'))
        final_url_webpage = self._download_webpage(request_url, video_id, 'Requesting download url')
        # If we could correctly generate the .rand field the url would be
        # in the "download_url" key
-        final_url = re.search(r'"retry_url":"(.*?)"', final_url_webpage).group(1)
+        final_url = self._search_regex(
+            r'"retry_url":"(.*?)"', final_url_webpage, 'final video URL')

        return {
            'id': video_id,
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@ -264,8 +264,15 @@ class InfoExtractor(object):

    def extract(self, url):
        """Extracts URL information and returns it in list of dicts."""
-        self.initialize()
-        return self._real_extract(url)
+        try:
+            self.initialize()
+            return self._real_extract(url)
+        except ExtractorError:
+            raise
+        except compat_http_client.IncompleteRead as e:
+            raise ExtractorError('A network error has occurred.', cause=e, expected=True)
+        except (KeyError,) as e:
+            raise ExtractorError('An extractor error has occurred.', cause=e)

    def set_downloader(self, downloader):
        """Sets the downloader for this IE."""
@ -656,6 +663,21 @@ class InfoExtractor(object):
        }
        return RATING_TABLE.get(rating.lower(), None)

+    def _family_friendly_search(self, html):
+        # See http://schema.org/VideoObj
+        family_friendly = self._html_search_meta('isFamilyFriendly', html)
+
+        if not family_friendly:
+            return None
+
+        RATING_TABLE = {
+            '1': 0,
+            'true': 0,
+            '0': 18,
+            'false': 18,
+        }
+        return RATING_TABLE.get(family_friendly.lower(), None)
+
    def _twitter_search_player(self, html):
        return self._html_search_meta('twitter:player', html,
                                      'twitter card player')
@ -707,9 +729,9 @@ class InfoExtractor(object):
                f.get('quality') if f.get('quality') is not None else -1,
                f.get('tbr') if f.get('tbr') is not None else -1,
                f.get('vbr') if f.get('vbr') is not None else -1,
-                ext_preference,
                f.get('height') if f.get('height') is not None else -1,
                f.get('width') if f.get('width') is not None else -1,
+                ext_preference,
                f.get('abr') if f.get('abr') is not None else -1,
                audio_ext_preference,
                f.get('fps') if f.get('fps') is not None else -1,
@ -765,7 +787,7 @@ class InfoExtractor(object):
        self.to_screen(msg)
        time.sleep(timeout)

-    def _extract_f4m_formats(self, manifest_url, video_id):
+    def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None):
        manifest = self._download_xml(
            manifest_url, video_id, 'Downloading f4m manifest',
            'Unable to download f4m manifest')
@ -778,26 +800,28 @@ class InfoExtractor(object):
            media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
        for i, media_el in enumerate(media_nodes):
            if manifest_version == '2.0':
-                manifest_url = '/'.join(manifest_url.split('/')[:-1]) + '/' + media_el.attrib.get('href')
+                manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/'
+                                + (media_el.attrib.get('href') or media_el.attrib.get('url')))
            tbr = int_or_none(media_el.attrib.get('bitrate'))
-            format_id = 'f4m-%d' % (i if tbr is None else tbr)
            formats.append({
-                'format_id': format_id,
+                'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
                'url': manifest_url,
                'ext': 'flv',
                'tbr': tbr,
                'width': int_or_none(media_el.attrib.get('width')),
                'height': int_or_none(media_el.attrib.get('height')),
+                'preference': preference,
            })
        self._sort_formats(formats)

        return formats

    def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
-                              entry_protocol='m3u8', preference=None):
+                              entry_protocol='m3u8', preference=None,
+                              m3u8_id=None):

        formats = [{
-            'format_id': 'm3u8-meta',
+            'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-meta'])),
            'url': m3u8_url,
            'ext': ext,
            'protocol': 'm3u8',
@ -833,9 +857,8 @@ class InfoExtractor(object):
                    formats.append({'url': format_url(line)})
                    continue
                tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
-
                f = {
-                    'format_id': 'm3u8-%d' % (tbr if tbr else len(formats)),
+                    'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-%d' % (tbr if tbr else len(formats))])),
                    'url': format_url(line.strip()),
                    'tbr': tbr,
                    'ext': ext,
--- a/youtube_dl/extractor/gamekings.py
+++ b/youtube_dl/extractor/gamekings.py
@ -1,41 +1,67 @@
+# coding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+)


 class GamekingsIE(InfoExtractor):
-    _VALID_URL = r'http://www\.gamekings\.tv/videos/(?P<name>[0-9a-z\-]+)'
-    _TEST = {
+    _VALID_URL = r'http://www\.gamekings\.tv/(?:videos|nieuws)/(?P<id>[^/]+)'
+    _TESTS = [{
        'url': 'http://www.gamekings.tv/videos/phoenix-wright-ace-attorney-dual-destinies-review/',
        # MD5 is flaky, seems to change regularly
        # 'md5': '2f32b1f7b80fdc5cb616efb4f387f8a3',
        'info_dict': {
-            'id': '20130811',
+            'id': 'phoenix-wright-ace-attorney-dual-destinies-review',
            'ext': 'mp4',
            'title': 'Phoenix Wright: Ace Attorney \u2013 Dual Destinies Review',
            'description': 'md5:36fd701e57e8c15ac8682a2374c99731',
-        }
-    }
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+    }, {
+        # vimeo video
+        'url': 'http://www.gamekings.tv/videos/the-legend-of-zelda-majoras-mask/',
+        'md5': '12bf04dfd238e70058046937657ea68d',
+        'info_dict': {
+            'id': 'the-legend-of-zelda-majoras-mask',
+            'ext': 'mp4',
+            'title': 'The Legend of Zelda: Majora’s Mask',
+            'description': 'md5:9917825fe0e9f4057601fe1e38860de3',
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+    }, {
+        'url': 'http://www.gamekings.tv/nieuws/gamekings-extra-shelly-en-david-bereiden-zich-voor-op-de-livestream/',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
+        video_id = self._match_id(url)

-        mobj = re.match(self._VALID_URL, url)
-        name = mobj.group('name')
-        webpage = self._download_webpage(url, name)
-        video_url = self._og_search_video_url(webpage)
+        webpage = self._download_webpage(url, video_id)

-        video = re.search(r'[0-9]+', video_url)
-        video_id = video.group(0)
+        playlist_id = self._search_regex(
+            r'gogoVideo\(\s*\d+\s*,\s*"([^"]+)', webpage, 'playlist id')

-        # Todo: add medium format
-        video_url = video_url.replace(video_id, 'large/' + video_id)
+        playlist = self._download_xml(
+            'http://www.gamekings.tv/wp-content/themes/gk2010/rss_playlist.php?id=%s' % playlist_id,
+            video_id)
+
+        NS_MAP = {
+            'jwplayer': 'http://rss.jwpcdn.com/'
+        }
+
+        item = playlist.find('./channel/item')
+
+        thumbnail = xpath_text(item, xpath_with_ns('./jwplayer:image', NS_MAP), 'thumbnail')
+        video_url = item.find(xpath_with_ns('./jwplayer:source', NS_MAP)).get('file')

        return {
            'id': video_id,
-            'ext': 'mp4',
            'url': video_url,
            'title': self._og_search_title(webpage),
            'description': self._og_search_description(webpage),
+            'thumbnail': thumbnail,
        }
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@ -140,6 +140,19 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': ['Ooyala'],
        },
+        # multiple ooyala embeds on SBN network websites
+        {
+            'url': 'http://www.sbnation.com/college-football-recruiting/2015/2/3/7970291/national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
+            'info_dict': {
+                'id': 'national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
+                'title': '25 lies you will tell yourself on National Signing Day - SBNation.com',
+            },
+            'playlist_mincount': 3,
+            'params': {
+                'skip_download': True,
+            },
+            'add_ie': ['Ooyala'],
+        },
        # google redirect
        {
            'url': 'http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCUQtwIwAA&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcmQHVoWB5FY&ei=F-sNU-LLCaXk4QT52ICQBQ&usg=AFQjCNEw4hL29zgOohLXvpJ-Bdh2bils1Q&bvm=bv.61965928,d.bGE',
@ -511,6 +524,19 @@ class GenericIE(InfoExtractor):
                'upload_date': '20150126',
            },
            'add_ie': ['Viddler'],
+        },
+        # jwplayer YouTube
+        {
+            'url': 'http://media.nationalarchives.gov.uk/index.php/webinar-using-discovery-national-archives-online-catalogue/',
+            'info_dict': {
+                'id': 'Mrj4DVp2zeA',
+                'ext': 'mp4',
+                'upload_date': '20150204',
+                'uploader': 'The National Archives UK',
+                'description': 'md5:a236581cd2449dd2df4f93412f3f01c6',
+                'uploader_id': 'NationalArchives08',
+                'title': 'Webinar: Using Discovery, The National Archives’ online catalogue',
+            },
        }
    ]

@ -882,10 +908,19 @@ class GenericIE(InfoExtractor):

        # Look for Ooyala videos
        mobj = (re.search(r'player\.ooyala\.com/[^"?]+\?[^"]*?(?:embedCode|ec)=(?P<ec>[^"&]+)', webpage) or
-                re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage))
+                re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage) or
+                re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage))
        if mobj is not None:
            return OoyalaIE._build_url_result(mobj.group('ec'))

+        # Look for multiple Ooyala embeds on SBN network websites
+        mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
+        if mobj is not None:
+            embeds = self._parse_json(mobj.group(1), video_id, fatal=False)
+            if embeds:
+                return _playlist_from_matches(
+                    embeds, getter=lambda v: OoyalaIE._url_for_embed_code(v['provider_video_id']), ie='Ooyala')
+
        # Look for Aparat videos
        mobj = re.search(r'<iframe .*?src="(http://www\.aparat\.com/video/[^"]+)"', webpage)
        if mobj is not None:
@ -1012,7 +1047,12 @@ class GenericIE(InfoExtractor):

        # Look for embedded sbs.com.au player
        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:www\.)sbs\.com\.au/ondemand/video/single/.+?)\1',
+            r'''(?x)
+            (?:
+                <meta\s+property="og:video"\s+content=|
+                <iframe[^>]+?src=
+            )
+            (["\'])(?P<url>https?://(?:www\.)?sbs\.com\.au/ondemand/video/.+?)\1''',
            webpage)
        if mobj is not None:
            return self.url_result(mobj.group('url'), 'SBS')
@ -1043,6 +1083,8 @@ class GenericIE(InfoExtractor):
            return self.url_result(mobj.group('url'), 'Livestream')

        def check_video(vurl):
+            if YoutubeIE.suitable(vurl):
+                return True
            vpath = compat_urlparse.urlparse(vurl).path
            vext = determine_ext(vpath)
            return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml')
@ -1060,7 +1102,8 @@ class GenericIE(InfoExtractor):
                    JWPlayerOptions|
                    jwplayer\s*\(\s*["'][^'"]+["']\s*\)\s*\.setup
                )
-                .*?file\s*:\s*["\'](.*?)["\']''', webpage))
+                .*?
+                ['"]?file['"]?\s*:\s*["\'](.*?)["\']''', webpage))
        if not found:
            # Broaden the search a little bit
            found = filter_video(re.findall(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage))
--- a/youtube_dl/extractor/goshgay.py
+++ b/youtube_dl/extractor/goshgay.py
@ -34,8 +34,6 @@ class GoshgayIE(InfoExtractor):
        duration = parse_duration(self._html_search_regex(
            r'<span class="duration">\s*-?\s*(.*?)</span>',
            webpage, 'duration', fatal=False))
-        family_friendly = self._html_search_meta(
-            'isFamilyFriendly', webpage, default='false')

        flashvars = compat_parse_qs(self._html_search_regex(
            r'<embed.+?id="flash-player-embed".+?flashvars="([^"]+)"',
@ -49,5 +47,5 @@ class GoshgayIE(InfoExtractor):
            'title': title,
            'thumbnail': thumbnail,
            'duration': duration,
-            'age_limit': 0 if family_friendly == 'true' else 18,
+            'age_limit': self._family_friendly_search(webpage),
        }
--- a/youtube_dl/extractor/izlesene.py
+++ b/youtube_dl/extractor/izlesene.py
@ -80,9 +80,6 @@ class IzleseneIE(InfoExtractor):
            r'comment_count\s*=\s*\'([^\']+)\';',
            webpage, 'comment_count', fatal=False)

-        family_friendly = self._html_search_meta(
-            'isFamilyFriendly', webpage, 'age limit', fatal=False)
-
        content_url = self._html_search_meta(
            'contentURL', webpage, 'content URL', fatal=False)
        ext = determine_ext(content_url, 'mp4')
@ -120,6 +117,6 @@ class IzleseneIE(InfoExtractor):
            'duration': duration,
            'view_count': int_or_none(view_count),
            'comment_count': int_or_none(comment_count),
-            'age_limit': 18 if family_friendly == 'False' else 0,
+            'age_limit': self._family_friendly_search(webpage),
            'formats': formats,
        }
--- a/youtube_dl/extractor/mixcloud.py
+++ b/youtube_dl/extractor/mixcloud.py
@ -18,7 +18,7 @@ class MixcloudIE(InfoExtractor):
    _VALID_URL = r'^(?:https?://)?(?:www\.)?mixcloud\.com/([^/]+)/([^/]+)'
    IE_NAME = 'mixcloud'

-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.mixcloud.com/dholbach/cryptkeeper/',
        'info_dict': {
            'id': 'dholbach-cryptkeeper',
@ -33,7 +33,20 @@ class MixcloudIE(InfoExtractor):
            'view_count': int,
            'like_count': int,
        },
-    }
+    }, {
+        'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
+        'info_dict': {
+            'id': 'gillespeterson-caribou-7-inch-vinyl-mix-chat',
+            'ext': 'm4a',
+            'title': 'Electric Relaxation vol. 3',
+            'description': 'md5:2b8aec6adce69f9d41724647c65875e8',
+            'uploader': 'Daniel Drumz',
+            'uploader_id': 'gillespeterson',
+            'thumbnail': 're:https?://.*\.jpg',
+            'view_count': int,
+            'like_count': int,
+        },
+    }]

    def _get_url(self, track_id, template_url):
        server_count = 30
@ -60,7 +73,7 @@ class MixcloudIE(InfoExtractor):
        webpage = self._download_webpage(url, track_id)

        preview_url = self._search_regex(
-            r'\s(?:data-preview-url|m-preview)="(.+?)"', webpage, 'preview url')
+            r'\s(?:data-preview-url|m-preview)="([^"]+)"', webpage, 'preview url')
        song_url = preview_url.replace('/previews/', '/c/originals/')
        template_url = re.sub(r'(stream\d*)', 'stream%d', song_url)
        final_song_url = self._get_url(track_id, template_url)
--- a/youtube_dl/extractor/npo.py
+++ b/youtube_dl/extractor/npo.py
@ -1,6 +1,6 @@
 from __future__ import unicode_literals

-from .common import InfoExtractor
+from .subtitles import SubtitlesInfoExtractor
 from ..utils import (
    fix_xml_ampersands,
    parse_duration,
@ -11,7 +11,7 @@ from ..utils import (
 )


-class NPOBaseIE(InfoExtractor):
+class NPOBaseIE(SubtitlesInfoExtractor):
    def _get_token(self, video_id):
        token_page = self._download_webpage(
            'http://ida.omroep.nl/npoplayer/i.js',
@ -161,6 +161,16 @@ class NPOIE(NPOBaseIE):

        self._sort_formats(formats)

+        subtitles = {}
+        if metadata.get('tt888') == 'ja':
+            subtitles['nl'] = 'http://e.omroep.nl/tt888/%s' % video_id
+
+        if self._downloader.params.get('listsubtitles', False):
+            self._list_available_subtitles(video_id, subtitles)
+            return
+
+        subtitles = self.extract_subtitles(video_id, subtitles)
+
        return {
            'id': video_id,
            'title': metadata['titel'],
@ -169,6 +179,7 @@ class NPOIE(NPOBaseIE):
            'upload_date': unified_strdate(metadata.get('gidsdatum')),
            'duration': parse_duration(metadata.get('tijdsduur')),
            'formats': formats,
+            'subtitles': subtitles,
        }


--- a/youtube_dl/extractor/rtlnow.py
+++ b/youtube_dl/extractor/rtlnow.py
@ -91,6 +91,15 @@ class RTLnowIE(InfoExtractor):
            },
        },
        {
+            'url': 'http://rtl-now.rtl.de/der-bachelor/folge-4.php?film_id=188729&player=1&season=5',
+            'info_dict': {
+                'id': '188729',
+                'ext': 'flv',
+                'upload_date': '20150204',
+                'description': 'md5:5e1ce23095e61a79c166d134b683cecc',
+                'title': 'Der Bachelor - Folge 4',
+            }
+        }, {
            'url': 'http://www.n-tvnow.de/deluxe-alles-was-spass-macht/thema-ua-luxushotel-fuer-vierbeiner.php?container_id=153819&player=1&season=0',
            'only_matching': True,
        },
@ -134,9 +143,18 @@ class RTLnowIE(InfoExtractor):
                    'player_url': video_page_url + 'includes/vodplayer.swf',
                }
            else:
-                fmt = {
-                    'url': filename.text,
-                }
+                mobj = re.search(r'.*/(?P<hoster>[^/]+)/videos/(?P<play_path>.+)\.f4m', filename.text)
+                if mobj:
+                    fmt = {
+                        'url': 'rtmpe://fmspay-fra2.rtl.de/' + mobj.group('hoster'),
+                        'play_path': 'mp4:' + mobj.group('play_path'),
+                        'page_url': url,
+                        'player_url': video_page_url + 'includes/vodplayer.swf',
+                    }
+                else:
+                    fmt = {
+                        'url': filename.text,
+                    }
            fmt.update({
                'width': int_or_none(filename.get('width')),
                'height': int_or_none(filename.get('height')),
--- a/youtube_dl/extractor/rtp.py
+++ b/youtube_dl/extractor/rtp.py
@ -1,16 +1,16 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
+import re

 from .common import InfoExtractor
-from ..utils import js_to_json


 class RTPIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)/?'
    _TESTS = [{
        'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
+        'md5': 'e736ce0c665e459ddb818546220b4ef8',
        'info_dict': {
            'id': 'e174042',
            'ext': 'mp3',
@ -18,9 +18,6 @@ class RTPIE(InfoExtractor):
            'description': 'As paixões musicais de António Cartaxo e António Macedo',
            'thumbnail': 're:^https?://.*\.jpg',
        },
-        'params': {
-            'skip_download': True,  # RTMP download
-        },
    }, {
        'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
        'only_matching': True,
@ -37,20 +34,48 @@ class RTPIE(InfoExtractor):

        player_config = self._search_regex(
            r'(?s)RTPPLAY\.player\.newPlayer\(\s*(\{.*?\})\s*\)', webpage, 'player config')
-        config = json.loads(js_to_json(player_config))
+        config = self._parse_json(player_config, video_id)

        path, ext = config.get('file').rsplit('.', 1)
        formats = [{
+            'format_id': 'rtmp',
+            'ext': ext,
+            'vcodec': config.get('type') == 'audio' and 'none' or None,
+            'preference': -2,
+            'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
            'app': config.get('application'),
            'play_path': '{ext:s}:{path:s}'.format(ext=ext, path=path),
            'page_url': url,
-            'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
            'rtmp_live': config.get('live', False),
-            'ext': ext,
-            'vcodec': config.get('type') == 'audio' and 'none' or None,
            'player_url': 'http://programas.rtp.pt/play/player.swf?v3',
+            'rtmp_real_time': True,
        }]

+        # Construct regular HTTP download URLs
+        replacements = {
+            'audio': {
+                'format_id': 'mp3',
+                'pattern': r'^nas2\.share/wavrss/',
+                'repl': 'http://rsspod.rtp.pt/podcasts/',
+                'vcodec': 'none',
+            },
+            'video': {
+                'format_id': 'mp4_h264',
+                'pattern': r'^nas2\.share/h264/',
+                'repl': 'http://rsspod.rtp.pt/videocasts/',
+                'vcodec': 'h264',
+            },
+        }
+        r = replacements[config['type']]
+        if re.match(r['pattern'], config['file']) is not None:
+            formats.append({
+                'format_id': r['format_id'],
+                'url': re.sub(r['pattern'], r['repl'], config['file']),
+                'vcodec': r['vcodec'],
+            })
+
+        self._sort_formats(formats)
+
        return {
            'id': video_id,
            'title': title,
--- a/youtube_dl/extractor/rts.py
+++ b/youtube_dl/extractor/rts.py
@ -6,12 +6,14 @@ import re
 from .common import InfoExtractor
 from ..compat import (
    compat_str,
+    compat_urllib_parse_urlparse,
 )
 from ..utils import (
    int_or_none,
    parse_duration,
    parse_iso8601,
    unescapeHTML,
+    xpath_text,
 )


@ -159,11 +161,27 @@ class RTSIE(InfoExtractor):
            return int_or_none(self._search_regex(
                r'-([0-9]+)k\.', url, 'bitrate', default=None))

-        formats = [{
-            'format_id': fid,
-            'url': furl,
-            'tbr': extract_bitrate(furl),
-        } for fid, furl in info['streams'].items()]
+        formats = []
+        for format_id, format_url in info['streams'].items():
+            if format_url.endswith('.f4m'):
+                token = self._download_xml(
+                    'http://tp.srgssr.ch/token/akahd.xml?stream=%s/*' % compat_urllib_parse_urlparse(format_url).path,
+                    video_id, 'Downloading %s token' % format_id)
+                auth_params = xpath_text(token, './/authparams', 'auth params')
+                if not auth_params:
+                    continue
+                formats.extend(self._extract_f4m_formats(
+                    '%s?%s&hdcore=3.4.0&plugin=aasp-3.4.0.132.66' % (format_url, auth_params),
+                    video_id, f4m_id=format_id))
+            elif format_url.endswith('.m3u8'):
+                formats.extend(self._extract_m3u8_formats(
+                    format_url, video_id, 'mp4', m3u8_id=format_id))
+            else:
+                formats.append({
+                    'format_id': format_id,
+                    'url': format_url,
+                    'tbr': extract_bitrate(format_url),
+                })

        if 'media' in info:
            formats.extend([{
--- a/youtube_dl/extractor/soulanime.py
+++ b/youtube_dl/extractor/soulanime.py
@ -1,80 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    HEADRequest,
-    urlhandle_detect_ext,
-)
-
-
-class SoulAnimeWatchingIE(InfoExtractor):
-    IE_NAME = "soulanime:watching"
-    IE_DESC = "SoulAnime video"
-    _TEST = {
-        'url': 'http://www.soul-anime.net/watching/seirei-tsukai-no-blade-dance-episode-9/',
-        'md5': '05fae04abf72298098b528e98abf4298',
-        'info_dict': {
-            'id': 'seirei-tsukai-no-blade-dance-episode-9',
-            'ext': 'mp4',
-            'title': 'seirei-tsukai-no-blade-dance-episode-9',
-            'description': 'seirei-tsukai-no-blade-dance-episode-9'
-        }
-    }
-    _VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/watch[^/]*/(?P<id>[^/]+)'
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        domain = mobj.group('domain')
-
-        page = self._download_webpage(url, video_id)
-
-        video_url_encoded = self._html_search_regex(
-            r'<div id="download">[^<]*<a href="(?P<url>[^"]+)"', page, 'url')
-        video_url = "http://www.soul-anime." + domain + video_url_encoded
-
-        ext_req = HEADRequest(video_url)
-        ext_handle = self._request_webpage(
-            ext_req, video_id, note='Determining extension')
-        ext = urlhandle_detect_ext(ext_handle)
-
-        return {
-            'id': video_id,
-            'url': video_url,
-            'ext': ext,
-            'title': video_id,
-            'description': video_id
-        }
-
-
-class SoulAnimeSeriesIE(InfoExtractor):
-    IE_NAME = "soulanime:series"
-    IE_DESC = "SoulAnime Series"
-
-    _VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/anime./(?P<id>[^/]+)'
-
-    _EPISODE_REGEX = r'<option value="(/watch[^/]*/[^"]+)">[^<]*</option>'
-
-    _TEST = {
-        'url': 'http://www.soul-anime.net/anime1/black-rock-shooter-tv/',
-        'info_dict': {
-            'id': 'black-rock-shooter-tv'
-        },
-        'playlist_count': 8
-    }
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        series_id = mobj.group('id')
-        domain = mobj.group('domain')
-
-        pattern = re.compile(self._EPISODE_REGEX)
-
-        page = self._download_webpage(url, series_id, "Downloading series page")
-        mobj = pattern.findall(page)
-
-        entries = [self.url_result("http://www.soul-anime." + domain + obj) for obj in mobj]
-
-        return self.playlist_result(entries, series_id)
--- a/youtube_dl/extractor/svtplay.py
+++ b/youtube_dl/extractor/svtplay.py
@ -0,0 +1,56 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    determine_ext,
+)
+
+
+class SVTPlayIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?svtplay\.se/video/(?P<id>[0-9]+)'
+    _TEST = {
+        'url': 'http://www.svtplay.se/video/2609989/sm-veckan/sm-veckan-rally-final-sasong-1-sm-veckan-rally-final',
+        'md5': 'f4a184968bc9c802a9b41316657aaa80',
+        'info_dict': {
+            'id': '2609989',
+            'ext': 'mp4',
+            'title': 'SM veckan vinter, Örebro - Rally, final',
+            'duration': 4500,
+            'thumbnail': 're:^https?://.*[\.-]jpg$',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        info = self._download_json(
+            'http://www.svtplay.se/video/%s?output=json' % video_id, video_id)
+
+        title = info['context']['title']
+        thumbnail = info['context'].get('thumbnailImage')
+
+        video_info = info['video']
+        formats = []
+        for vr in video_info['videoReferences']:
+            vurl = vr['url']
+            if determine_ext(vurl) == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    vurl, video_id,
+                    ext='mp4', entry_protocol='m3u8_native',
+                    m3u8_id=vr.get('playerType')))
+            else:
+                formats.append({
+                    'format_id': vr.get('playerType'),
+                    'url': vurl,
+                })
+        self._sort_formats(formats)
+
+        duration = video_info.get('materialLength')
+
+        return {
+            'id': video_id,
+            'title': title,
+            'formats': formats,
+            'thumbnail': thumbnail,
+            'duration': duration,
+        }
--- a/youtube_dl/extractor/teamcoco.py
+++ b/youtube_dl/extractor/teamcoco.py
@ -15,7 +15,8 @@ class TeamcocoIE(InfoExtractor):
                'id': '80187',
                'ext': 'mp4',
                'title': 'Conan Becomes A Mary Kay Beauty Consultant',
-                'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.'
+                'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.',
+                'age_limit': 0,
            }
        }, {
            'url': 'http://teamcoco.com/video/louis-ck-interview-george-w-bush',
@ -24,7 +25,8 @@ class TeamcocoIE(InfoExtractor):
                'id': '19705',
                'ext': 'mp4',
                "description": "Louis C.K. got starstruck by George W. Bush, so what? Part one.",
-                "title": "Louis C.K. Interview Pt. 1 11/3/11"
+                "title": "Louis C.K. Interview Pt. 1 11/3/11",
+                'age_limit': 0,
            }
        }
    ]
@ -83,4 +85,5 @@ class TeamcocoIE(InfoExtractor):
            'title': self._og_search_title(webpage),
            'thumbnail': self._og_search_thumbnail(webpage),
            'description': self._og_search_description(webpage),
+            'age_limit': self._family_friendly_search(webpage),
        }
--- a/youtube_dl/extractor/trilulilu.py
+++ b/youtube_dl/extractor/trilulilu.py
@ -1,40 +1,55 @@
+# coding: utf-8
 from __future__ import unicode_literals

-import json
+import re

 from .common import InfoExtractor
+from ..utils import ExtractorError


 class TriluliluIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?trilulilu\.ro/video-[^/]+/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?trilulilu\.ro/(?:video-[^/]+/)?(?P<id>[^/#\?]+)'
    _TEST = {
        'url': 'http://www.trilulilu.ro/video-animatie/big-buck-bunny-1',
+        'md5': 'c1450a00da251e2769b74b9005601cac',
        'info_dict': {
-            'id': 'big-buck-bunny-1',
+            'id': 'ae2899e124140b',
            'ext': 'mp4',
            'title': 'Big Buck Bunny',
            'description': ':) pentru copilul din noi',
        },
-        # Server ignores Range headers (--test)
-        'params': {
-            'skip_download': True
-        }
    }

    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)

+        if re.search(r'Fişierul nu este disponibil pentru vizionare în ţara dumneavoastră', webpage):
+            raise ExtractorError(
+                'This video is not available in your country.', expected=True)
+        elif re.search('Fişierul poate fi accesat doar de către prietenii lui', webpage):
+            raise ExtractorError('This video is private.', expected=True)
+
+        flashvars_str = self._search_regex(
+            r'block_flash_vars\s*=\s*(\{[^\}]+\})', webpage, 'flashvars', fatal=False, default=None)
+
+        if flashvars_str:
+            flashvars = self._parse_json(flashvars_str, display_id)
+        else:
+            raise ExtractorError(
+                'This page does not contain videos', expected=True)
+
+        if flashvars['isMP3'] == 'true':
+            raise ExtractorError(
+                'Audio downloads are currently not supported', expected=True)
+
+        video_id = flashvars['hash']
        title = self._og_search_title(webpage)
        thumbnail = self._og_search_thumbnail(webpage)
-        description = self._og_search_description(webpage)
-
-        log_str = self._search_regex(
-            r'block_flash_vars[ ]=[ ]({[^}]+})', webpage, 'log info')
-        log = json.loads(log_str)
+        description = self._og_search_description(webpage, default=None)

        format_url = ('http://fs%(server)s.trilulilu.ro/%(hash)s/'
-                      'video-formats2' % log)
+                      'video-formats2' % flashvars)
        format_doc = self._download_xml(
            format_url, video_id,
            note='Downloading formats',
@ -44,10 +59,10 @@ class TriluliluIE(InfoExtractor):
            'http://fs%(server)s.trilulilu.ro/stream.php?type=video'
            '&source=site&hash=%(hash)s&username=%(userid)s&'
            'key=ministhebest&format=%%s&sig=&exp=' %
-            log)
+            flashvars)
        formats = [
            {
-                'format': fnode.text,
+                'format_id': fnode.text.partition('-')[2],
                'url': video_url_template % fnode.text,
                'ext': fnode.text.partition('-')[0]
            }
@ -56,8 +71,8 @@ class TriluliluIE(InfoExtractor):
        ]

        return {
-            '_type': 'video',
            'id': video_id,
+            'display_id': display_id,
            'formats': formats,
            'title': title,
            'description': description,
--- a/youtube_dl/extractor/tvigle.py
+++ b/youtube_dl/extractor/tvigle.py
@ -1,6 +1,8 @@
 # encoding: utf-8
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..utils import (
    float_or_none,
@ -11,7 +13,7 @@ from ..utils import (
 class TvigleIE(InfoExtractor):
    IE_NAME = 'tvigle'
    IE_DESC = 'Интернет-телевидение Tvigle.ru'
-    _VALID_URL = r'http://(?:www\.)?tvigle\.ru/(?:[^/]+/)+(?P<id>[^/]+)/$'
+    _VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'

    _TESTS = [
        {
@ -38,16 +40,22 @@ class TvigleIE(InfoExtractor):
                'duration': 186.080,
                'age_limit': 0,
            },
-        },
+        }, {
+            'url': 'https://cloud.tvigle.ru/video/5267604/',
+            'only_matching': True,
+        }
    ]

    def _real_extract(self, url):
-        display_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        display_id = mobj.group('display_id')

-        webpage = self._download_webpage(url, display_id)
-
-        video_id = self._html_search_regex(
-            r'<li class="video-preview current_playing" id="(\d+)">', webpage, 'video id')
+        if not video_id:
+            webpage = self._download_webpage(url, display_id)
+            video_id = self._html_search_regex(
+                r'<li class="video-preview current_playing" id="(\d+)">',
+                webpage, 'video id')

        video_data = self._download_json(
            'http://cloud.tvigle.ru/api/play/video/%s/' % video_id, display_id)
--- a/youtube_dl/extractor/tweakers.py
+++ b/youtube_dl/extractor/tweakers.py
@ -0,0 +1,65 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+    int_or_none,
+    float_or_none,
+)
+
+
+class TweakersIE(InfoExtractor):
+    _VALID_URL = r'https?://tweakers\.net/video/(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://tweakers.net/video/9926/new-nintendo-3ds-xl-op-alle-fronten-beter.html',
+        'md5': '1b5afa817403bb5baa08359dca31e6df',
+        'info_dict': {
+            'id': '9926',
+            'ext': 'mp4',
+            'title': 'New Nintendo 3DS XL - Op alle fronten beter',
+            'description': 'md5:f97324cc71e86e11c853f0763820e3ba',
+            'thumbnail': 're:^https?://.*\.jpe?g$',
+            'duration': 386,
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        playlist = self._download_xml(
+            'https://tweakers.net/video/s1playlist/%s/playlist.xspf' % video_id,
+            video_id)
+
+        NS_MAP = {
+            'xspf': 'http://xspf.org/ns/0/',
+            's1': 'http://static.streamone.nl/player/ns/0',
+        }
+
+        track = playlist.find(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP))
+
+        title = xpath_text(
+            track, xpath_with_ns('./xspf:title', NS_MAP), 'title')
+        description = xpath_text(
+            track, xpath_with_ns('./xspf:annotation', NS_MAP), 'description')
+        thumbnail = xpath_text(
+            track, xpath_with_ns('./xspf:image', NS_MAP), 'thumbnail')
+        duration = float_or_none(
+            xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'),
+            1000)
+
+        formats = [{
+            'url': location.text,
+            'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)),
+            'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))),
+            'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))),
+        } for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP))]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@ -780,8 +780,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
                    fo for fo in formats
                    if fo['format_id'] == format_id)
            except StopIteration:
-                f.update(self._formats.get(format_id, {}).items())
-                formats.append(f)
+                full_info = self._formats.get(format_id, {}).copy()
+                full_info.update(f)
+                formats.append(full_info)
            else:
                existing_format.update(f)
        return formats
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@ -297,8 +297,10 @@ def parseOpts(overrideArguments=None):
            ' You can filter the video results by putting a condition in'
            ' brackets, as in -f "best[height=720]"'
            ' (or -f "[filesize>10M]"). '
-            ' This works for filesize, height, width, tbr, abr, vbr, and fps'
-            ' and the comparisons <, <=, >, >=, =, != .'
+            ' This works for filesize, height, width, tbr, abr, vbr, asr, and fps'
+            ' and the comparisons <, <=, >, >=, =, !='
+            ' and for ext, acodec, vcodec, container, and protocol'
+            ' and the comparisons =, != .'
            ' Formats for which the value is not known are excluded unless you'
            ' put a question mark (?) after the operator.'
            ' You can combine format filters, so  '
--- a/youtube_dl/postprocessor/ffmpeg.py
+++ b/youtube_dl/postprocessor/ffmpeg.py
@ -166,14 +166,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
        if filecodec is None:
            raise PostProcessingError('WARNING: unable to obtain file audio codec with ffprobe')

-        uses_avconv = self._uses_avconv()
        more_opts = []
        if self._preferredcodec == 'best' or self._preferredcodec == filecodec or (self._preferredcodec == 'm4a' and filecodec == 'aac'):
            if filecodec == 'aac' and self._preferredcodec in ['m4a', 'best']:
                # Lossless, but in another container
                acodec = 'copy'
                extension = 'm4a'
-                more_opts = ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
+                more_opts = ['-bsf:a', 'aac_adtstoasc']
            elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
                # Lossless if possible
                acodec = 'copy'
@ -189,9 +188,9 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
                more_opts = []
                if self._preferredquality is not None:
                    if int(self._preferredquality) < 10:
-                        more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
+                        more_opts += ['-q:a', self._preferredquality]
                    else:
-                        more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
+                        more_opts += ['-b:a', self._preferredquality + 'k']
        else:
            # We convert the audio (lossy)
            acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
@ -200,13 +199,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
            if self._preferredquality is not None:
                # The opus codec doesn't support the -aq option
                if int(self._preferredquality) < 10 and extension != 'opus':
-                    more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
+                    more_opts += ['-q:a', self._preferredquality]
                else:
-                    more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
+                    more_opts += ['-b:a', self._preferredquality + 'k']
            if self._preferredcodec == 'aac':
                more_opts += ['-f', 'adts']
            if self._preferredcodec == 'm4a':
-                more_opts += ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
+                more_opts += ['-bsf:a', 'aac_adtstoasc']
            if self._preferredcodec == 'vorbis':
                extension = 'ogg'
            if self._preferredcodec == 'wav':
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2015.02.03.1'
+__version__ = '2015.02.10'
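
The youtube.py hunk above reverses the merge order between the static format table and the values parsed from the DASH manifest. The following is a minimal sketch of the difference, not the extractor's code: `STATIC_FORMATS`, `merge_old`, and `merge_new` are hypothetical names, and the values are made up for illustration.

```python
# Hypothetical stand-in for YoutubeIE._formats; the values are illustrative only.
STATIC_FORMATS = {'137': {'ext': 'mp4', 'height': 1080, 'tbr': 4000}}


def merge_old(format_id, f):
    # Old behaviour: entries from the static table overwrite the manifest values.
    # (The original code updated f in place; a copy is used here to keep the demo pure.)
    merged = dict(f)
    merged.update(STATIC_FORMATS.get(format_id, {}).items())
    return merged


def merge_new(format_id, f):
    # New behaviour: start from a copy of the static entry, then let the
    # manifest values win. The copy keeps the shared table unmodified.
    full_info = STATIC_FORMATS.get(format_id, {}).copy()
    full_info.update(f)
    return full_info


manifest_entry = {'format_id': '137', 'tbr': 4400}
print(merge_old('137', manifest_entry)['tbr'])  # static value wins
print(merge_new('137', manifest_entry)['tbr'])  # manifest value wins
```

With the new order, a key present in both places takes the manifest value, while keys known only to the static table (here `height`) are still filled in.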
Author	SHA1	Message	Date
Philipp Hagemeister	34814eb66e	release 2015.02.10	2015-02-10 01:19:52 +01:00
Philipp Hagemeister	3a5bcd0326	[extractor/common] Wrap extractor errors (Fixes #1194 ) For now, we just wrap some common errors. More may follow. We do not want to catch actual programming errors in the extractors, such as 1 // 0.	2015-02-10 01:17:23 +01:00
Philipp Hagemeister	99c2398bc6	[bandcamp] Use our API to get more stable error messages (#1194 )	2015-02-09 19:08:51 +01:00
Philipp Hagemeister	28f1272870	[svtplay] Correct test case	2015-02-09 16:05:01 +01:00
Philipp Hagemeister	f18e3a2fc0	release 2015.02.09.3	2015-02-09 15:59:19 +01:00
Philipp Hagemeister	c4c5dc27cb	Merge branch 'master' of github.com:rg3/youtube-dl	2015-02-09 15:59:14 +01:00
Naglis Jonaitis	2caf182f37	[trilulilu] Add support for videos without category in the URL (Closes #4067 ) Also, update the testcase, detect private/country restricted videos and modernize a bit.	2015-02-09 17:00:05 +02:00
Philipp Hagemeister	43f244b6d5	[YoutubeDL] Do not show worst in --list-formats output Nobody wants to know what the worst possible format is. And if they do, they can still provide -f worst.	2015-02-09 15:57:42 +01:00
Philipp Hagemeister	1309b396d0	[svtplay] Add new extractor (Fixes #4914 )	2015-02-09 15:56:59 +01:00
Jaime Marquínez Ferrándiz	ba61796458	[youtube] Don't override format info from the dash manifest (fixes #4911 )	2015-02-09 15:04:22 +01:00
Philipp Hagemeister	3255fe7141	release 2015.02.09.2	2015-02-09 14:46:30 +01:00
Philipp Hagemeister	e98b8e79ea	[generic] Improve SBS detection (Fixes #4899 )	2015-02-09 14:46:10 +01:00
Philipp Hagemeister	196121c51b	release 2015.02.09.1	2015-02-09 10:49:10 +01:00
Philipp Hagemeister	5269028951	[rtlnow] Add test for @mmue's extension (#4908 )	2015-02-09 10:47:19 +01:00
Philipp Hagemeister	f7bc056b5a	Merge remote-tracking branch 'mmue/fix-rtlnow'	2015-02-09 10:44:55 +01:00
Philipp Hagemeister	a0f7198544	[generic] Add support for jwPlayer YouTube videos This makes nationalarchives.gov.uk work (Fixes #4907, fixes #4876)	2015-02-09 10:43:01 +01:00
Philipp Hagemeister	dd8930684e	release 2015.02.09	2015-02-09 10:28:16 +01:00
Markus Müller	bdb186f3b0	fix rtlnow for newer series like "Der Bachelor" season 5	2015-02-08 21:55:39 +01:00
Sergey M․	64f9baa084	[options] Mention asr as possible filter	2015-02-09 01:35:16 +06:00
Philipp Hagemeister	b29231c040	release 2015.02.08	2015-02-08 20:28:38 +01:00
Sergey M․	6128bf07a9	[options] Update help on string comparisons	2015-02-09 01:27:27 +06:00
Sergey M․	2ec19e9558	[YoutubeDL] Allow filtering by audio sampling rate	2015-02-09 01:09:45 +06:00
Sergey M․	9ddb6925bf	[YoutubeDL] Allow filtering by string properties (#4906 )	2015-02-09 01:07:43 +06:00
Sergey M․	12931e1c6e	Credit @robin007bond for tweakers (#4881 ) and gamekings fixes (#4901 )	2015-02-08 23:33:29 +06:00
Sergey M․	41c23b0da5	[gamekings] Support videos from news pages	2015-02-08 23:12:59 +06:00
Sergey M․	2578ab19e4	Merge branch 'robin007bond-gamekings'	2015-02-08 23:03:31 +06:00
Sergey M․	d87ec897e9	[gamekings] Improve extraction	2015-02-08 23:03:12 +06:00
Sergey M․	3bd4bffb1c	Merge branch 'gamekings' of https://github.com/robin007bond/youtube-dl into robin007bond-gamekings	2015-02-08 22:46:43 +06:00
robin	c36b09a502	[Gamekings] Use thumbnail in return statement	2015-02-08 16:46:13 +01:00
Naglis Jonaitis	641eb10d34	Use _family_friendly_search for determining age_limit	2015-02-08 17:45:38 +02:00
robin	955c5505e7	[Gamekings] Use xpath XPath is used for extracting the video url and the thumbnail	2015-02-08 16:44:25 +01:00
Naglis Jonaitis	69319969de	[extractor/common] Add new helper method _family_friendly_search	2015-02-08 17:39:00 +02:00
Naglis Jonaitis	a14292e848	[soulanime] Remove extractor (#4554 ) Was supposed to be deleted by `67c2bcd`	2015-02-08 16:57:07 +02:00
robin	5d678df64a	[Gamekings] Download playlist Todo: URL and Thumbnail should be extracted with XPath	2015-02-08 15:34:37 +01:00
robin	8ca8cbe2bd	[Gamekings] Check string for vimeo, fix test The test no longer fails. It just checks the string for having "vimeo" in it, instead of using the method for URL-checking, since that returns an error. The tests don't fail, and the extractor works fine now.	2015-02-08 14:41:14 +01:00
robin	ba322d8209	[Gamekings] Added test and replaced video_url Quick and dirty fix for the Gamekings extractor. It gives an error about the video_url, but it downloads it now instead of giving a 404 error on newer Gamekings videos	2015-02-08 14:23:37 +01:00
robin	2f38289b79	[Gamekings] Fix order of replacement string Oops.	2015-02-08 13:49:32 +01:00
robin	f23a3ca699	[Gamekings] Fixed typo in URL replacement	2015-02-08 13:47:27 +01:00
robin	77d2b106cc	[Gamekings] Fix 404 when large isn't available When trying to download some GameKings videos, not all worked. This was because not all videos had a "/large"-URL available. The extractor now checks whether the /large URL is available; if it isn't, it tries to get the normal URL.	2015-02-08 13:42:41 +01:00
Sergey M․	c0e46412e9	[aparat] Fix extraction (Closes #4897 )	2015-02-08 17:30:29 +06:00
Jaime Marquínez Ferrándiz	0161353d7d	[test/test_YoutubeDL] Remove debug print call	2015-02-06 23:58:01 +01:00
Jaime Marquínez Ferrándiz	2b4ecde2c8	[test/YoutubeDL] Add a simple test for postprocessors Just checks that the 'keepvideo' option works as intended.	2015-02-06 23:54:25 +01:00
Jaime Marquínez Ferrándiz	b3a286d69d	[YoutubeDL] _calc_cookies: add get_header method to _PseudoRequest (#4861 )	2015-02-06 22:23:06 +01:00
Jaime Marquínez Ferrándiz	467d3c9a0c	[ffmpeg] --extract-audio: Use the same options for avconv and ffmpeg They have been available in ffmpeg since version 0.9, and we require 1.0 or higher.	2015-02-06 22:05:11 +01:00
Naglis Jonaitis	ad5747bad1	[rtp] Construct regular HTTP download URLs (#4882 )	2015-02-06 23:00:54 +02:00
Sergey M․	d6eb66ed3c	[aftenposten] Add extractor (Closes #4863 )	2015-02-07 01:46:54 +06:00
Sergey M․	7f2a9f1b49	[tvigle] Add support for cloud URLs (Closes #4887 )	2015-02-06 21:15:01 +06:00
Philipp Hagemeister	1e1896f2de	[extractor/common] Correct sort order. We should look at height and width before ext_preference.	2015-02-06 15:16:45 +01:00
Philipp Hagemeister	c831973366	release 2015.02.06	2015-02-06 14:38:30 +01:00
Naglis Jonaitis	1a2548d9e9	[rtp] Pass --realtime to rtmpdump (Fixes #4882 ) A workaround for video jumping back in time.	2015-02-06 13:44:46 +02:00
Sergey M․	3900eec27c	[extractor/common] Fix 2.0 manifest extraction (Closes #4830 )	2015-02-06 04:29:29 +06:00
Sergey M․	a02d212638	Merge branch 'robin007bond-tweakers'	2015-02-06 03:23:56 +06:00
Sergey M․	9c91a8fa70	[tweakers] Switch extraction to xspf playlist, extract all formats and meta (#4881 )	2015-02-06 03:23:42 +06:00
Sergey M․	41469f335e	Merge branch 'tweakers' of https://github.com/robin007bond/youtube-dl into robin007bond-tweakers	2015-02-06 02:59:33 +06:00
robin	67ce4f8820	Use match_id method instead of split URL	2015-02-05 21:49:13 +01:00
robin	bc63d56cca	Remove unnecessary TODO-comments	2015-02-05 21:40:18 +01:00
robin	c893d70805	Remove player-url in tweakers.py Player-url only needed for rtmp, not for regular URLs	2015-02-05 21:38:35 +01:00
robin	3ee6e02564	Edit Tweakers extractor Fixed code conventions (mainly adding two or more spaces before making an inline comment)	2015-02-05 19:59:36 +01:00
robin	e3aaace400	[tweakers] Add new extractor	2015-02-05 19:55:41 +01:00
Sergey M․	300753a069	[YoutubeDL] Fix video+audio format field (Closes #4880 )	2015-02-06 00:51:16 +06:00
Sergey M․	f13b88c616	[rts] Fix f4m and m3u8 extraction (Closes #4873 )	2015-02-05 22:17:50 +06:00
Sergey M․	60ca389c64	[extractor/common] Prefix f4m/m3u8 entries with identifier	2015-02-05 22:16:27 +06:00
Sergey M․	1b0f3919c1	Merge branch 'Frenzie-npo'	2015-02-05 20:15:13 +06:00
Sergey M․	6a348cf7d5	Credit @Frenzie for npo subtitles (#4878 )	2015-02-05 20:14:56 +06:00
Sergey M․	9e91449c8d	[npo] Fix subtitles (Closes #3638 )	2015-02-05 20:13:28 +06:00
Frans de Jonge	25e5ebf382	Add NPO.nl subtitles Implements #3638	2015-02-05 12:51:33 +01:00
Philipp Hagemeister	7dfc356625	release 2015.02.04	2015-02-04 16:09:35 +01:00
Sergey M․	58ba6c0160	[mixcloud] Fix extraction (Closes #4862 )	2015-02-04 19:47:55 +06:00
naglis	f076b63821	[generic/ooyala] Add support for Ooyala embeds on SBN network websites (Fixes #4859 )	2015-02-04 15:33:37 +02:00
Philipp Hagemeister	12f0454cd6	[README] Add an FAQ entry about anime sites	2015-02-03 14:18:15 +01:00