release 2016.06.16

[cda] Fix extraction (Closes #9803)
[wimp] Fix extraction and update _TESTS
2016-06-16 22:40:55 +07:00 · 2016-06-16 22:33:12 +07:00 · 2016-06-16 12:27:21 +08:00 · 2016-06-16 12:26:45 +08:00 · 2016-06-16 11:00:54 +08:00 · 2016-06-15 22:34:55 +07:00
21 changed files with 357 additions and 60 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.12**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.16*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.16**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.06.12
+[debug] youtube-dl version 2016.06.16
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/AUTHORS
+++ b/AUTHORS
@@ -173,3 +173,5 @@ Kevin Deldycke
 inondle
 Tomáš Čech
 Déstin Reed
+Roman Tsiupa
+Artur Krysiak
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -142,9 +142,9 @@ After you have ensured this site is distributing it's content legally, you can f
    ```
 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
-7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
-8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
-9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
+7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
+8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L148-L252) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
+9. Check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](http://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
 10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:

        $ git add youtube_dl/extractor/extractors.py
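Step 8 above can be illustrated with a small standalone sketch. It does not use youtube-dl itself: `search_regex` is a hypothetical stand-in for an extractor regex helper called with `fatal=False`, and `meta` and `webpage` are made-up sample data.

```python
import re


def search_regex(pattern, string, fatal=True):
    """Hypothetical stand-in for an extractor regex helper."""
    mobj = re.search(pattern, string)
    if mobj:
        return mobj.group(1)
    if fatal:
        raise ValueError('pattern not found: %s' % pattern)
    return None


# Made-up sample data: the optional sources are absent on purpose.
meta = {'id': '42', 'title': 'Example video', 'url': 'http://example.com/v.mp4'}
webpage = '<html><body>no uploader markup here</body></html>'

info = {
    # Mandatory fields: a missing key should fail loudly.
    'id': meta['id'],
    'title': meta['title'],
    'url': meta['url'],
    # Optional fields: tolerate missing sources and fall back to None.
    'description': meta.get('summary'),
    'uploader': search_regex(r'data-uploader="([^"]+)"', webpage, fatal=False),
}
```

With this shape, a site redesign that drops the optional markup only loses `description` or `uploader`, while the mandatory fields still decide whether extraction succeeds.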
--- a/README.md
+++ b/README.md
@@ -935,9 +935,9 @@ After you have ensured this site is distributing it's content legally, you can f
    ```
 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
-7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
-8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
-9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
+7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
+8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L148-L252) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
+9. Check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](http://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
 10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:

        $ git add youtube_dl/extractor/extractors.py
@@ -964,7 +964,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
 ```

-Most likely, you'll want to use various options. For a list of what can be done, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L121-L269). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
+Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.

 Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:

--- a/devscripts/release.sh
+++ b/devscripts/release.sh
@@ -15,6 +15,7 @@
 set -e

 skip_tests=true
+gpg_sign_commits=""
 buildserver='localhost:8142'

 while true
@@ -24,6 +25,10 @@ case "$1" in
        skip_tests=false
        shift
    ;;
+    --gpg-sign-commits|-S)
+        gpg_sign_commits="-S"
+        shift
+    ;;
    --buildserver)
        buildserver="$2"
        shift 2
@@ -69,7 +74,7 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
 /bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
 make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
 git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py
-git commit -m "release $version"
+git commit $gpg_sign_commits -m "release $version"

 /bin/echo -e "\n### Now tagging, signing and pushing..."
 git tag -s -m "Release $version" "$version"
@@ -116,7 +121,7 @@ git clone --branch gh-pages --single-branch . build/gh-pages
    "$ROOT/devscripts/gh-pages/update-copyright.py"
    "$ROOT/devscripts/gh-pages/update-sites.py"
    git add *.html *.html.in update
-    git commit -m "release $version"
+    git commit $gpg_sign_commits -m "release $version"
    git push "$ROOT" gh-pages
    git push "$ORIGIN_URL" gh-pages
 )
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -44,8 +44,8 @@
 - **appletrailers:section**
 - **archive.org**: archive.org videos
 - **ARD**
- - **ARD:mediathek**: Saarländischer Rundfunk
 - **ARD:mediathek**
+ - **ARD:mediathek**: Saarländischer Rundfunk
 - **arte.tv**
 - **arte.tv:+7**
 - **arte.tv:cinema**
@@ -535,6 +535,7 @@
 - **revision3:embed**
 - **RICE**
 - **RingTV**
+ - **RockstarGames**
 - **RottenTomatoes**
 - **Roxwel**
 - **RTBF**
@@ -699,6 +700,7 @@
 - **TVPlay**: TV3Play and related services
 - **Tweakers**
 - **twitch:chapter**
+ - **twitch:clips**
 - **twitch:past_broadcasts**
 - **twitch:profile**
 - **twitch:stream**
@@ -793,10 +795,11 @@
 - **WNL**
 - **WorldStarHipHop**
 - **wrzuta.pl**
+ - **wrzuta.pl:playlist**
 - **WSJ**: Wall Street Journal
 - **XBef**
 - **XboxClips**
- - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To
+ - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE
 - **XHamster**
 - **XHamsterEmbed**
 - **xiami:album**: 虾米音乐 - 专辑
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -640,6 +640,9 @@ class TestUtil(unittest.TestCase):
            "1":{"src":"skipped", "type": "application/vnd.apple.mpegURL"}
        }''')

+        inp = '''{"foo":101}'''
+        self.assertEqual(js_to_json(inp), '''{"foo":101}''')
+
    def test_js_to_json_edgecases(self):
        on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
        self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
--- a/youtube_dl/downloader/external.py
+++ b/youtube_dl/downloader/external.py
@@ -85,7 +85,7 @@ class ExternalFD(FileDownloader):
            cmd, stderr=subprocess.PIPE)
        _, stderr = p.communicate()
        if p.returncode != 0:
-            self.to_stderr(stderr)
+            self.to_stderr(stderr.decode('utf-8', 'replace'))
        return p.returncode
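A brief illustration of what the added decode does, assuming Python 3, where piped output from `communicate()` arrives as `bytes`. The sample bytes are invented for the example.

```python
# Invented stderr bytes containing one byte (0xff) that is not valid UTF-8.
raw_stderr = b'error: caf\xc3\xa9 not found \xff\n'

# The 'replace' handler substitutes U+FFFD for undecodable bytes
# instead of raising UnicodeDecodeError.
text = raw_stderr.decode('utf-8', 'replace')
```

The result is always a text string, so reporting an external downloader's error output cannot itself fail on malformed bytes.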


--- a/youtube_dl/extractor/cda.py
+++ b/youtube_dl/extractor/cda.py
@@ -58,7 +58,8 @@ class CDAIE(InfoExtractor):
        def extract_format(page, version):
            unpacked = decode_packed_codes(page)
            format_url = self._search_regex(
-                r"url:\\'(.+?)\\'", unpacked, '%s url' % version, fatal=False)
+                r"(?:file|url)\s*:\s*(\\?[\"'])(?P<url>http.+?)\1", unpacked,
+                '%s url' % version, fatal=False, group='url')
            if not format_url:
                return
            f = {
@@ -75,7 +76,8 @@ class CDAIE(InfoExtractor):
            info_dict['formats'].append(f)
            if not info_dict['duration']:
                info_dict['duration'] = parse_duration(self._search_regex(
-                    r"duration:\\'(.+?)\\'", unpacked, 'duration', fatal=False))
+                    r"duration\s*:\s*(\\?[\"'])(?P<duration>.+?)\1",
+                    unpacked, 'duration', fatal=False, group='duration'))

        extract_format(webpage, 'default')
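The relaxed URL pattern from the hunk above can be tried in isolation. This is only a sketch: the two input strings are invented samples of a backslash-escaped and a plain-quoted form, not real cda.pl output.

```python
import re

# The pattern introduced by the cda.py hunk above.
URL_RE = r"(?:file|url)\s*:\s*(\\?[\"'])(?P<url>http.+?)\1"

# Invented samples: backslash-escaped single quotes and plain double quotes.
old_style = r"url:\'http://example.com/a.mp4\'"
new_style = 'file: "http://example.com/b.mp4"'

# Group 1 captures the opening quote (with an optional backslash) and the
# backreference \1 requires the same sequence to close the value.
urls = [re.search(URL_RE, s).group('url') for s in (old_style, new_style)]
```

Capturing the quote and reusing it through `\1` is what lets one pattern accept either quoting style without matching across mismatched quotes.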

--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -649,6 +649,7 @@ from .revision3 import (
 from .rice import RICEIE
 from .ringtv import RingTVIE
 from .ro220 import Ro220IE
+from .rockstargames import RockstarGamesIE
 from .rottentomatoes import RottenTomatoesIE
 from .roxwel import RoxwelIE
 from .rtbf import RTBFIE
@@ -862,6 +863,7 @@ from .twitch import (
    TwitchProfileIE,
    TwitchPastBroadcastsIE,
    TwitchStreamIE,
+    TwitchClipsIE,
 )
 from .twitter import (
    TwitterCardIE,
@@ -978,7 +980,10 @@ from .weiqitv import WeiqiTVIE
 from .wimp import WimpIE
 from .wistia import WistiaIE
 from .worldstarhiphop import WorldStarHipHopIE
-from .wrzuta import WrzutaIE
+from .wrzuta import (
+    WrzutaIE,
+    WrzutaPlaylistIE,
+)
 from .wsj import WSJIE
 from .xbef import XBefIE
 from .xboxclips import XboxClipsIE
--- a/youtube_dl/extractor/imdb.py
+++ b/youtube_dl/extractor/imdb.py
@@ -12,7 +12,7 @@ from ..utils import (
 class ImdbIE(InfoExtractor):
    IE_NAME = 'imdb'
    IE_DESC = 'Internet Movie Database trailers'
-    _VALID_URL = r'https?://(?:www|m)\.imdb\.com/video/[^/]+/vi(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video/[^/]+/|title/tt\d+.*?#lb-)vi(?P<id>\d+)'

    _TESTS = [{
        'url': 'http://www.imdb.com/video/imdb/vi2524815897',
@@ -25,6 +25,12 @@ class ImdbIE(InfoExtractor):
    }, {
        'url': 'http://www.imdb.com/video/_/vi2524815897',
        'only_matching': True,
+    }, {
+        'url': 'http://www.imdb.com/title/tt1667889/?ref_=ext_shr_eml_vi#lb-vi2524815897',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.imdb.com/title/tt1667889/#lb-vi2524815897',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
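As a quick check of the widened `_VALID_URL`, the pattern can be matched against the test URLs listed in the hunk above using only the standard `re` module. The variable names are illustrative.

```python
import re

# Pattern and URLs copied from the imdb.py hunk above.
VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video/[^/]+/|title/tt\d+.*?#lb-)vi(?P<id>\d+)'

urls = [
    'http://www.imdb.com/video/imdb/vi2524815897',
    'http://www.imdb.com/title/tt1667889/?ref_=ext_shr_eml_vi#lb-vi2524815897',
    'http://www.imdb.com/title/tt1667889/#lb-vi2524815897',
]

# Both the /video/ form and the /title/...#lb- form yield the same video id.
ids = [re.match(VALID_URL, u).group('id') for u in urls]
```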
--- a/youtube_dl/extractor/jwplatform.py
+++ b/youtube_dl/extractor/jwplatform.py
@@ -12,9 +12,35 @@ from ..utils import (


 class JWPlatformBaseIE(InfoExtractor):
+    @staticmethod
+    def _find_jwplayer_data(webpage):
+        # TODO: Merge this with JWPlayer-related codes in generic.py
+
+        mobj = re.search(
+            r'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)\.setup\((?P<options>[^)]+)\)',
+            webpage)
+        if mobj:
+            return mobj.group('options')
+
+    def _extract_jwplayer_data(self, webpage, video_id, *args, **kwargs):
+        jwplayer_data = self._parse_json(
+            self._find_jwplayer_data(webpage), video_id)
+        return self._parse_jwplayer_data(
+            jwplayer_data, video_id, *args, **kwargs)
+
    def _parse_jwplayer_data(self, jwplayer_data, video_id, require_title=True, m3u8_id=None, rtmp_params=None):
+        # JWPlayer backward compatibility: flattened playlists
+        # https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/api/config.js#L81-L96
+        if 'playlist' not in jwplayer_data:
+            jwplayer_data = {'playlist': [jwplayer_data]}
+
        video_data = jwplayer_data['playlist'][0]

+        # JWPlayer backward compatibility: flattened sources
+        # https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/playlist/item.js#L29-L35
+        if 'sources' not in video_data:
+            video_data['sources'] = [video_data]
+
        formats = []
        for source in video_data['sources']:
            source_url = self._proto_relative_url(source['file'])
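The two backward-compatibility branches above can be sketched as a standalone function. `normalize_jwplayer_data` is a hypothetical name used only for illustration, and the sample config is invented.

```python
def normalize_jwplayer_data(jwplayer_data):
    """Wrap flattened configs so callers can always read playlist[0]['sources'].

    Note: like the hunk above, this mutates the first playlist item in place.
    """
    # Flattened playlist: the config itself describes a single item.
    if 'playlist' not in jwplayer_data:
        jwplayer_data = {'playlist': [jwplayer_data]}
    video_data = jwplayer_data['playlist'][0]
    # Flattened sources: the item itself describes a single source.
    if 'sources' not in video_data:
        video_data['sources'] = [video_data]
    return jwplayer_data


# Invented sample of a fully flattened setup() config.
flat = {'file': '//example.com/video.mp4', 'title': 'Example'}
normalized = normalize_jwplayer_data(flat)
first_source = normalized['playlist'][0]['sources'][0]
```

After normalization the rest of the parser can iterate over `sources` without caring which config style the page used.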
--- a/youtube_dl/extractor/lynda.py
+++ b/youtube_dl/extractor/lynda.py
@@ -95,7 +95,6 @@ class LyndaIE(LyndaBaseIE):
    IE_NAME = 'lynda'
    IE_DESC = 'lynda.com videos'
    _VALID_URL = r'https?://www\.lynda\.com/(?:[^/]+/[^/]+/\d+|player/embed)/(?P<id>\d+)'
-    _NETRC_MACHINE = 'lynda'

    _TIMECODE_REGEX = r'\[(?P<timecode>\d+:\d+:\d+[\.,]\d+)\]'

--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -1,3 +1,4 @@
+# coding: utf-8
 from __future__ import unicode_literals

 import itertools
@@ -39,7 +40,25 @@ class PornHubIE(InfoExtractor):
            'dislike_count': int,
            'comment_count': int,
            'age_limit': 18,
-        }
+        },
+    }, {
+        # non-ASCII title
+        'url': 'http://www.pornhub.com/view_video.php?viewkey=1331683002',
+        'info_dict': {
+            'id': '1331683002',
+            'ext': 'mp4',
+            'title': '重庆婷婷女王足交',
+            'uploader': 'cj397186295',
+            'duration': 1753,
+            'view_count': int,
+            'like_count': int,
+            'dislike_count': int,
+            'comment_count': int,
+            'age_limit': 18,
+        },
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'http://www.pornhub.com/view_video.php?viewkey=ph557bbb6676d2d',
        'only_matching': True,
@@ -76,19 +95,25 @@ class PornHubIE(InfoExtractor):
                'PornHub said: %s' % error_msg,
                expected=True, video_id=video_id)

+        # video_title from flashvars contains whitespace instead of non-ASCII (see
+        # http://www.pornhub.com/view_video.php?viewkey=1331683002), not relying
+        # on that anymore.
+        title = self._html_search_meta(
+            'twitter:title', webpage, default=None) or self._search_regex(
+            (r'<h1[^>]+class=["\']title["\'][^>]*>(?P<title>[^<]+)',
+             r'<div[^>]+data-video-title=(["\'])(?P<title>.+?)\1',
+             r'shareTitle\s*=\s*(["\'])(?P<title>.+?)\1'),
+            webpage, 'title', group='title')
+
        flashvars = self._parse_json(
            self._search_regex(
                r'var\s+flashvars_\d+\s*=\s*({.+?});', webpage, 'flashvars', default='{}'),
            video_id)
        if flashvars:
-            video_title = flashvars.get('video_title')
            thumbnail = flashvars.get('image_url')
            duration = int_or_none(flashvars.get('video_duration'))
        else:
-            video_title, thumbnail, duration = [None] * 3
-
-        if not video_title:
-            video_title = self._html_search_regex(r'<h1 [^>]+>([^<]+)', webpage, 'title')
+            title, thumbnail, duration = [None] * 3

        video_uploader = self._html_search_regex(
            r'(?s)From:&nbsp;.+?<(?:a href="/users/|a href="/channels/|span class="username)[^>]+>(.+?)<',
@@ -137,7 +162,7 @@ class PornHubIE(InfoExtractor):
        return {
            'id': video_id,
            'uploader': video_uploader,
-            'title': video_title,
+            'title': title,
            'thumbnail': thumbnail,
            'duration': duration,
            'view_count': view_count,
--- a/youtube_dl/extractor/rockstargames.py
+++ b/youtube_dl/extractor/rockstargames.py
@@ -0,0 +1,69 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+)
+
+
+class RockstarGamesIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?rockstargames\.com/videos(?:/video/|#?/?\?.*\bvideo=)(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://www.rockstargames.com/videos/video/11544/',
+        'md5': '03b5caa6e357a4bd50e3143fc03e5733',
+        'info_dict': {
+            'id': '11544',
+            'ext': 'mp4',
+            'title': 'Further Adventures in Finance and Felony Trailer',
+            'description': 'md5:6d31f55f30cb101b5476c4a379e324a3',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'timestamp': 1464876000,
+            'upload_date': '20160602',
+        }
+    }, {
+        'url': 'http://www.rockstargames.com/videos#/?video=48',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        video = self._download_json(
+            'https://www.rockstargames.com/videoplayer/videos/get-video.json',
+            video_id, query={
+                'id': video_id,
+                'locale': 'en_us',
+            })['video']
+
+        title = video['title']
+
+        formats = []
+        for video_file in video['files_processed']['video/mp4']:
+            if not video_file.get('src'):
+                continue
+            resolution = video_file.get('resolution')
+            height = int_or_none(self._search_regex(
+                r'^(\d+)[pP]$', resolution or '', 'height', default=None))
+            formats.append({
+                'url': self._proto_relative_url(video_file['src']),
+                'format_id': resolution,
+                'height': height,
+            })
+
+        if not formats:
+            youtube_id = video.get('youtube_id')
+            if youtube_id:
+                return self.url_result(youtube_id, 'Youtube')
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': video.get('description'),
+            'thumbnail': self._proto_relative_url(video.get('screencap')),
+            'timestamp': parse_iso8601(video.get('created')),
+            'formats': formats,
+        }
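The height parsing in the format loop above can be sketched without the extractor helpers. `height_from_resolution` is a hypothetical name, and the labels are invented samples rather than values confirmed from the site's JSON.

```python
import re


def height_from_resolution(resolution):
    """Return the integer height for labels such as '720p', else None."""
    # `resolution or ''` keeps re.search from receiving None.
    mobj = re.search(r'^(\d+)[pP]$', resolution or '')
    return int(mobj.group(1)) if mobj else None


heights = [height_from_resolution(r) for r in ('720p', '1080P', 'hd', None)]
```

Returning `None` for unrecognized labels keeps `height` optional, so an unexpected resolution string does not break extraction of the format URL.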
--- a/youtube_dl/extractor/twitch.py
+++ b/youtube_dl/extractor/twitch.py
@@ -16,6 +16,7 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    int_or_none,
+    js_to_json,
    orderedSet,
    parse_duration,
    parse_iso8601,
@@ -454,3 +455,45 @@ class TwitchStreamIE(TwitchBaseIE):
            'formats': formats,
            'is_live': True,
        }
+
+
+class TwitchClipsIE(InfoExtractor):
+    IE_NAME = 'twitch:clips'
+    _VALID_URL = r'https?://clips\.twitch\.tv/(?:[^/]+/)*(?P<id>[^/?#&]+)'
+
+    _TEST = {
+        'url': 'https://clips.twitch.tv/ea/AggressiveCobraPoooound',
+        'md5': '761769e1eafce0ffebfb4089cb3847cd',
+        'info_dict': {
+            'id': 'AggressiveCobraPoooound',
+            'ext': 'mp4',
+            'title': 'EA Play 2016 Live from the Novo Theatre',
+            'thumbnail': 're:^https?://.*\.jpg',
+            'creator': 'EA',
+            'uploader': 'stereotype_',
+            'uploader_id': 'stereotype_',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        clip = self._parse_json(
+            self._search_regex(
+                r'(?s)clipInfo\s*=\s*({.+?});', webpage, 'clip info'),
+            video_id, transform_source=js_to_json)
+
+        video_url = clip['clip_video_url']
+        title = clip['channel_title']
+
+        return {
+            'id': video_id,
+            'url': video_url,
+            'title': title,
+            'thumbnail': self._og_search_thumbnail(webpage),
+            'creator': clip.get('broadcaster_display_name') or clip.get('broadcaster_login'),
+            'uploader': clip.get('curator_display_name'),
+            'uploader_id': clip.get('curator_login'),
+        }
--- a/youtube_dl/extractor/wimp.py
+++ b/youtube_dl/extractor/wimp.py
@@ -1,29 +1,33 @@
 from __future__ import unicode_literals

-from .common import InfoExtractor
 from .youtube import YoutubeIE
+from .jwplatform import JWPlatformBaseIE


-class WimpIE(InfoExtractor):
+class WimpIE(JWPlatformBaseIE):
    _VALID_URL = r'https?://(?:www\.)?wimp\.com/(?P<id>[^/]+)'
    _TESTS = [{
-        'url': 'http://www.wimp.com/maruexhausted/',
+        'url': 'http://www.wimp.com/maru-is-exhausted/',
        'md5': 'ee21217ffd66d058e8b16be340b74883',
        'info_dict': {
-            'id': 'maruexhausted',
+            'id': 'maru-is-exhausted',
            'ext': 'mp4',
            'title': 'Maru is exhausted.',
            'description': 'md5:57e099e857c0a4ea312542b684a869b8',
        }
    }, {
        'url': 'http://www.wimp.com/clowncar/',
-        'md5': '4e2986c793694b55b37cf92521d12bb4',
+        'md5': '5c31ad862a90dc5b1f023956faec13fe',
        'info_dict': {
-            'id': 'clowncar',
+            'id': 'cG4CEr2aiSg',
            'ext': 'webm',
-            'title': 'It\'s like a clown car.',
-            'description': 'md5:0e56db1370a6e49c5c1d19124c0d2fb2',
+            'title': 'Basset hound clown car...incredible!',
+            'description': '5 of my Bassets crawled in this dog loo! www.bellinghambassets.com\n\nFor licensing/usage please contact: licensing(at)jukinmediadotcom',
+            'upload_date': '20140303',
+            'uploader': 'Gretchen Hoey',
+            'uploader_id': 'gretchenandjeff1',
        },
+        'add_ie': ['Youtube'],
    }]

    def _real_extract(self, url):
@@ -41,14 +45,13 @@ class WimpIE(InfoExtractor):
                'ie_key': YoutubeIE.ie_key(),
            }

-        video_url = self._search_regex(
-            r'<video[^>]+>\s*<source[^>]+src=(["\'])(?P<url>.+?)\1',
-            webpage, 'video URL', group='url')
+        info_dict = self._extract_jwplayer_data(
+            webpage, video_id, require_title=False)

-        return {
+        info_dict.update({
            'id': video_id,
-            'url': video_url,
            'title': self._og_search_title(webpage),
-            'thumbnail': self._og_search_thumbnail(webpage),
            'description': self._og_search_description(webpage),
-        }
+        })
+
+        return info_dict
--- a/youtube_dl/extractor/wrzuta.py
+++ b/youtube_dl/extractor/wrzuta.py
@@ -5,8 +5,10 @@ import re

 from .common import InfoExtractor
 from ..utils import (
+    ExtractorError,
    int_or_none,
    qualities,
+    remove_start,
 )


@@ -26,16 +28,17 @@ class WrzutaIE(InfoExtractor):
            'uploader_id': 'laboratoriumdextera',
            'description': 'md5:7fb5ef3c21c5893375fda51d9b15d9cd',
        },
+        'skip': 'Redirected to wrzuta.pl',
    }, {
-        'url': 'http://jolka85.wrzuta.pl/audio/063jOPX5ue2/liber_natalia_szroeder_-_teraz_ty',
-        'md5': 'bc78077859bea7bcfe4295d7d7fc9025',
+        'url': 'http://vexling.wrzuta.pl/audio/01xBFabGXu6/james_horner_-_into_the_na_39_vi_world_bonus',
+        'md5': 'f80564fb5a2ec6ec59705ae2bf2ba56d',
        'info_dict': {
-            'id': '063jOPX5ue2',
-            'ext': 'ogg',
-            'title': 'Liber & Natalia Szroeder - Teraz Ty',
-            'duration': 203,
-            'uploader_id': 'jolka85',
-            'description': 'md5:2d2b6340f9188c8c4cd891580e481096',
+            'id': '01xBFabGXu6',
+            'ext': 'mp3',
+            'title': 'James Horner - Into The Na\'vi World [Bonus]',
+            'description': 'md5:30a70718b2cd9df3120fce4445b0263b',
+            'duration': 95,
+            'uploader_id': 'vexling',
        },
    }]

@@ -45,7 +48,10 @@ class WrzutaIE(InfoExtractor):
        typ = mobj.group('typ')
        uploader = mobj.group('uploader')

-        webpage = self._download_webpage(url, video_id)
+        webpage, urlh = self._download_webpage_handle(url, video_id)
+
+        if urlh.geturl() == 'http://www.wrzuta.pl/':
+            raise ExtractorError('Video removed', expected=True)

        quality = qualities(['SD', 'MQ', 'HQ', 'HD'])

@@ -80,3 +86,73 @@ class WrzutaIE(InfoExtractor):
            'description': self._og_search_description(webpage),
            'age_limit': embedpage.get('minimalAge', 0),
        }
+
+
+class WrzutaPlaylistIE(InfoExtractor):
+    """
+        This class covers extraction of wrzuta playlist entries.
+        The extraction process is based on the following steps:
+        * collect the playlist size
+        * download all entries provided on
+          the playlist webpage (the playlist is split
+          across two pages: the first is reached directly from the
+          webpage, the second is downloaded on demand by an ajax call
+          and rendered using the ajax call response)
+        * if the number of extracted entries is below the total number
+          of entries, use the ajax call to collect the remaining entries
+    """
+
+    IE_NAME = 'wrzuta.pl:playlist'
+    _VALID_URL = r'https?://(?P<uploader>[0-9a-zA-Z]+)\.wrzuta\.pl/playlista/(?P<id>[0-9a-zA-Z]+)'
+    _TESTS = [{
+        'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR/moja_muza',
+        'playlist_mincount': 14,
+        'info_dict': {
+            'id': '7XfO4vE84iR',
+            'title': 'Moja muza',
+        },
+    }, {
+        'url': 'http://heroesf70.wrzuta.pl/playlista/6Nj3wQHx756/lipiec_-_lato_2015_muzyka_swiata',
+        'playlist_mincount': 144,
+        'info_dict': {
+            'id': '6Nj3wQHx756',
+            'title': 'Lipiec - Lato 2015 Muzyka Świata',
+        },
+    }, {
+        'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        playlist_id = mobj.group('id')
+        uploader = mobj.group('uploader')
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        playlist_size = int_or_none(self._html_search_regex(
+            (r'<div[^>]+class=["\']playlist-counter["\'][^>]*>\d+/(\d+)',
+             r'<div[^>]+class=["\']all-counter["\'][^>]*>(.+?)</div>'),
+            webpage, 'playlist size', default=None))
+
+        playlist_title = remove_start(
+            self._og_search_title(webpage), 'Playlista: ')
+
+        entries = []
+        if playlist_size:
+            entries = [
+                self.url_result(entry_url)
+                for _, entry_url in re.findall(
+                    r'<a[^>]+href=(["\'])(http.+?)\1[^>]+class=["\']playlist-file-page',
+                    webpage)]
+            if playlist_size > len(entries):
+                playlist_content = self._download_json(
+                    'http://%s.wrzuta.pl/xhr/get_playlist_offset/%s' % (uploader, playlist_id),
+                    playlist_id,
+                    'Downloading playlist JSON',
+                    'Unable to download playlist JSON')
+                entries.extend([
+                    self.url_result(entry['filelink'])
+                    for entry in playlist_content.get('files', []) if entry.get('filelink')])
+
+        return self.playlist_result(entries, playlist_id, playlist_title)
--- a/youtube_dl/extractor/xfileshare.py
+++ b/youtube_dl/extractor/xfileshare.py
@@ -5,8 +5,10 @@ import re

 from .common import InfoExtractor
 from ..utils import (
+    decode_packed_codes,
    ExtractorError,
    int_or_none,
+    NO_DEFAULT,
    sanitized_Request,
    urlencode_postdata,
 )
@@ -23,20 +25,24 @@ class XFileShareIE(InfoExtractor):
        ('thevideobee.to', 'TheVideoBee'),
        ('vidto.me', 'Vidto'),
        ('streamin.to', 'Streamin.To'),
+        ('xvidstage.com', 'XVIDSTAGE'),
    )

    IE_DESC = 'XFileShare based sites: %s' % ', '.join(list(zip(*_SITES))[1])
    _VALID_URL = (r'https?://(?P<host>(?:www\.)?(?:%s))/(?:embed-)?(?P<id>[0-9a-zA-Z]+)'
                  % '|'.join(re.escape(site) for site in list(zip(*_SITES))[0]))

-    _FILE_NOT_FOUND_REGEX = r'>(?:404 - )?File Not Found<'
+    _FILE_NOT_FOUND_REGEXES = (
+        r'>(?:404 - )?File Not Found<',
+        r'>The file was removed by administrator<',
+    )

    _TESTS = [{
        'url': 'http://gorillavid.in/06y9juieqpmi',
        'md5': '5ae4a3580620380619678ee4875893ba',
        'info_dict': {
            'id': '06y9juieqpmi',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Rebecca Black My Moment Official Music Video Reaction-6GK87Rc8bzQ',
            'thumbnail': 're:http://.*\.jpg',
        },
@@ -78,6 +84,17 @@ class XFileShareIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Big Buck Bunny trailer',
        },
+    }, {
+        'url': 'http://xvidstage.com/e0qcnl03co6z',
+        'info_dict': {
+            'id': 'e0qcnl03co6z',
+            'ext': 'mp4',
+            'title': 'Chucky Prank 2015.mp4',
+        },
+    }, {
+        # removed by administrator
+        'url': 'http://xvidstage.com/amfy7atlkx25',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -87,7 +104,7 @@ class XFileShareIE(InfoExtractor):
        url = 'http://%s/%s' % (mobj.group('host'), video_id)
        webpage = self._download_webpage(url, video_id)

-        if re.search(self._FILE_NOT_FOUND_REGEX, webpage) is not None:
+        if any(re.search(p, webpage) for p in self._FILE_NOT_FOUND_REGEXES):
            raise ExtractorError('Video %s does not exist' % video_id, expected=True)

        fields = self._hidden_inputs(webpage)
@@ -113,10 +130,23 @@ class XFileShareIE(InfoExtractor):
             r'>Watch (.+) ',
             r'<h2 class="video-page-head">([^<]+)</h2>'],
            webpage, 'title', default=None) or self._og_search_title(webpage)).strip()
-        video_url = self._search_regex(
-            [r'file\s*:\s*["\'](http[^"\']+)["\'],',
-             r'file_link\s*=\s*\'(https?:\/\/[0-9a-zA-z.\/\-_]+)'],
-            webpage, 'file url')
+
+        def extract_video_url(default=NO_DEFAULT):
+            return self._search_regex(
+                (r'file\s*:\s*(["\'])(?P<url>http.+?)\1,',
+                 r'file_link\s*=\s*(["\'])(?P<url>http.+?)\1',
+                 r'addVariable\((\\?["\'])file\1\s*,\s*(\\?["\'])(?P<url>http.+?)\2\)',
+                 r'<embed[^>]+src=(["\'])(?P<url>http.+?)\1'),
+                webpage, 'file url', default=default, group='url')
+
+        video_url = extract_video_url(default=None)
+
+        if not video_url:
+            webpage = decode_packed_codes(self._search_regex(
+                r"(}\('(.+)',(\d+),(\d+),'[^']*\b(?:file|embed)\b[^']*'\.split\('\|'\))",
+                webpage, 'packed code'))
+            video_url = extract_video_url()
+
        thumbnail = self._search_regex(
            r'image\s*:\s*["\'](http[^"\']+)["\'],', webpage, 'thumbnail', default=None)

--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -1970,7 +1970,7 @@ def js_to_json(code):
        '(?:[^'\\]*(?:\\\\|\\['"nurtbfx/\n]))*[^'\\]*'|
        /\*.*?\*/|,(?=\s*[\]}])|
        [a-zA-Z_][.a-zA-Z_0-9]*|
-        (?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:\s*:)?|
+        \b(?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:\s*:)?|
        [0-9]+(?=\s*:)
        ''', fix_kv, code)

--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.06.12'
+__version__ = '2016.06.16'
Author	SHA1	Message	Date
Sergey M․	d2161cade5	release 2016.06.16	2016-06-16 22:40:55 +07:00
Sergey M․	27e5fa8198	[cda] Fix extraction (Closes #9803)	2016-06-16 22:33:12 +07:00
Yen Chi Hsuan	efbd1eb51a	[wimp] Fix extraction and update _TESTS	2016-06-16 12:27:21 +08:00
Yen Chi Hsuan	369ff75081	[jwplatform] Improved JWPlayer support	2016-06-16 12:26:45 +08:00
Yen Chi Hsuan	47212f7bcb	[utils] Don't transform numbers not starting with a zero Fix test_Viidea and maybe others	2016-06-16 11:00:54 +08:00
Sergey M․	4c93ee8d14	[imdb] Improve _VALID_URL (Closes #9788)	2016-06-15 22:34:55 +07:00
Yen Chi Hsuan	8bc4dbb1af	[wrzuta.pl] Detect error and update _TESTS	2016-06-14 11:14:59 +08:00
Sergey M․	6c3760292c	[pornhub] Improve title extraction (Closes #9777)	2016-06-14 04:57:59 +07:00
Sergey M․	4cef70db6c	[devscripts/release.sh] Add flag for gpg-sign commits	2016-06-14 03:16:56 +07:00
Sergey M․	ff4af6ec59	[lynda] Remove superfluous _NETRC_MACHINE	2016-06-14 02:49:33 +07:00
Sergey M․	d01fb21d4c	release 2016.06.14	2016-06-14 02:19:42 +07:00
Sergey M․	a4ea28eee6	Credit @venth for wrzuta:playlist (#9341)	2016-06-14 02:15:47 +07:00
Sergey M․	bc2a871f3e	Credit @dracony for rockstargames (#9737)	2016-06-14 02:15:09 +07:00
Sergey M․	1759672eed	[wrzuta:playlist] Improve and simplify (Closes #9341)	2016-06-14 02:13:54 +07:00
venth	fea55ef4a9	[wrzuta.pl:playlist] Added playlist extraction from wrzuta.pl	2016-06-14 02:10:48 +07:00
Sergey M․	16b6bd01d2	[rockstargames] Improve and add Youtube fallback (Closes #9737)	2016-06-14 01:11:24 +07:00
Dracony	14d0f4e0f3	Added extractor for rockstargames.com	2016-06-14 01:09:35 +07:00
Sergey M․	778f969447	[twitch:clips] Add extractor (Closes #9767)	2016-06-14 00:06:31 +07:00
Sergey M․	79cd8b3d8a	[README.md] Suggest checking extractor code under all Python versions	2016-06-13 10:04:04 +07:00
Sergey M․	b4663f12b1	[README.md] Update links to info dict metafields	2016-06-13 07:16:35 +07:00
Sergey M․	b50e02c1e4	[README.md] Update links to options available for YoutubeDL	2016-06-13 07:05:32 +07:00
Sergey M․	33b72ce64e	[xfileshare] Improve removed videos detection	2016-06-13 01:19:54 +07:00
Sergey M․	cf2bf840ba	[xfileshare] Fix test	2016-06-13 01:11:14 +07:00
Sergey M․	bccdac6874	[xfileshare:xvidstage] Add support for videos with packed codes (Closes #4335)	2016-06-13 01:11:04 +07:00
Sergey M․	e69f9f5d68	[downloader/external] Decode error string before writing to stderr	2016-06-12 16:45:07 +07:00
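
Note on the `js_to_json` hunk in `youtube_dl/utils.py` (commit 47212f7bcb): the only change is a leading `\b` on the hex/octal alternative. The sketch below is a simplified illustration of why that matters, not the real `js_to_json`: it keeps just that one alternative, and the names `OLD`, `NEW`, `to_decimal` and `convert` are hypothetical helpers standing in for the actual pattern and `fix_kv`.

```python
import re

# Simplified stand-ins for the hex/octal alternative before and after the
# change. The real pattern in utils.py has several more alternatives.
OLD = r'(?:0[xX][0-9a-fA-F]+|0+[0-7]+)'
NEW = r'\b' + OLD


def to_decimal(match):
    # Hypothetical replacement callback: rewrite a hex or octal literal
    # as a decimal one.
    literal = match.group(0)
    base = 16 if literal.lower().startswith('0x') else 8
    return str(int(literal, base))


def convert(pattern, text):
    return re.sub(pattern, to_decimal, text)


text = '{"a": 1007, "b": 010}'

# Without \b the pattern also matches the "007" inside 1007.
print(convert(OLD, text))  # {"a": 17, "b": 8}

# With \b a match can only start at a word boundary, so 1007 is untouched.
print(convert(NEW, text))  # {"a": 1007, "b": 8}
```

This is consistent with the commit subject, "Don't transform numbers not starting with a zero": a word boundary cannot occur between two digits, so digits in the middle of a number are no longer treated as an octal literal.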