release 2017.09.15

[ChangeLog] Actualize
[condenast] Fix extraction (closes #14196 , closes #14207 )
2017-09-15 21:48:06 +07:00 · 2017-09-15 21:45:18 +07:00 · 2017-09-15 02:01:17 +07:00 · 2017-09-14 20:47:23 +02:00 · 2017-09-14 20:37:46 +02:00 · 2017-09-14 23:50:19 +07:00
11 changed files with 240 additions and 75 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.09.11*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.09.11**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.09.15*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.09.15**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2017.09.11
+[debug] youtube-dl version 2017.09.15
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/15
+++ b/15
@@ -1,3 +1,18 @@
+version 2017.09.15
+
+Core
+* [downloader/fragment] Restart inconsistent incomplete fragment downloads
+  (#13731)
+* [YoutubeDL] Download raw subtitles files (#12909, #14191)
+
+Extractors
+* [condenast] Fix extraction (#14196, #14207)
+ [orf] Add support for f4m stories
+* [tv4] Relax URL regular expression (#14206)
+* [animeondemand] Bypass geo restriction
+ [animeondemand] Add support for flash videos (#9944)
+
+
 version 2017.09.11

 Extractors
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -593,6 +593,7 @@
 - **Openload**
 - **OraTV**
 - **orf:fm4**: radio FM4
+ - **orf:fm4:story**: fm4.orf.at stories
 - **orf:iptv**: iptv.ORF.at
 - **orf:oe1**: Radio Österreich 1
 - **orf:tvthek**: ORF TVthek
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -1763,29 +1763,30 @@ class YoutubeDL(object):
            ie = self.get_info_extractor(info_dict['extractor_key'])
            for sub_lang, sub_info in subtitles.items():
                sub_format = sub_info['ext']
-                if sub_info.get('data') is not None:
-                    sub_data = sub_info['data']
+                sub_filename = subtitles_filename(filename, sub_lang, sub_format)
+                if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
+                    self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
                else:
-                    try:
-                        sub_data = ie._download_webpage(
-                            sub_info['url'], info_dict['id'], note=False)
-                    except ExtractorError as err:
-                        self.report_warning('Unable to download subtitle for "%s": %s' %
-                                            (sub_lang, error_to_compat_str(err.cause)))
-                        continue
-                try:
-                    sub_filename = subtitles_filename(filename, sub_lang, sub_format)
-                    if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
-                        self.to_screen('[info] Video subtitle %s.%s is already_present' % (sub_lang, sub_format))
+                    self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
+                    if sub_info.get('data') is not None:
+                        try:
+                            # Use newline='' to prevent conversion of newline characters
+                            # See https://github.com/rg3/youtube-dl/issues/10268
+                            with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
+                                subfile.write(sub_info['data'])
+                        except (OSError, IOError):
+                            self.report_error('Cannot write subtitles file ' + sub_filename)
+                            return
                    else:
-                        self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
-                        # Use newline='' to prevent conversion of newline characters
-                        # See https://github.com/rg3/youtube-dl/issues/10268
-                        with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
-                            subfile.write(sub_data)
-                except (OSError, IOError):
-                    self.report_error('Cannot write subtitles file ' + sub_filename)
-                    return
+                        try:
+                            sub_data = ie._request_webpage(
+                                sub_info['url'], info_dict['id'], note=False).read()
+                            with io.open(encodeFilename(sub_filename), 'wb') as subfile:
+                                subfile.write(sub_data)
+                        except (ExtractorError, IOError, OSError, ValueError) as err:
+                            self.report_warning('Unable to download subtitle for "%s": %s' %
+                                                (sub_lang, error_to_compat_str(err)))
+                            continue

        if self.params.get('writeinfojson', False):
            infofn = replace_extension(filename, 'info.json', info_dict.get('ext'))
--- a/youtube_dl/downloader/fragment.py
+++ b/youtube_dl/downloader/fragment.py
@@ -151,10 +151,15 @@ class FragmentFD(FileDownloader):
        if self.__do_ytdl_file(ctx):
            if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))):
                self._read_ytdl_file(ctx)
+                if ctx['fragment_index'] > 0 and resume_len == 0:
+                    self.report_error(
+                        'Inconsistent state of incomplete fragment download. '
+                        'Restarting from the beginning...')
+                    ctx['fragment_index'] = resume_len = 0
+                    self._write_ytdl_file(ctx)
            else:
                self._write_ytdl_file(ctx)
-            if ctx['fragment_index'] > 0:
-                assert resume_len > 0
+                assert ctx['fragment_index'] == 0

        dest_stream, tmpfilename = sanitize_open(tmpfilename, open_mode)

--- a/youtube_dl/extractor/animeondemand.py
+++ b/youtube_dl/extractor/animeondemand.py
@@ -3,16 +3,13 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..compat import (
-    compat_urlparse,
-    compat_str,
-)
+from ..compat import compat_str
 from ..utils import (
    determine_ext,
    extract_attributes,
    ExtractorError,
-    sanitized_Request,
    urlencode_postdata,
+    urljoin,
 )


@@ -21,6 +18,8 @@ class AnimeOnDemandIE(InfoExtractor):
    _LOGIN_URL = 'https://www.anime-on-demand.de/users/sign_in'
    _APPLY_HTML5_URL = 'https://www.anime-on-demand.de/html5apply'
    _NETRC_MACHINE = 'animeondemand'
+    # German-speaking countries of Europe
+    _GEO_COUNTRIES = ['AT', 'CH', 'DE', 'LI', 'LU']
    _TESTS = [{
        # jap, OmU
        'url': 'https://www.anime-on-demand.de/anime/161',
@@ -46,6 +45,10 @@ class AnimeOnDemandIE(InfoExtractor):
        # Full length film, non-series, ger/jap, Dub/OmU, account required
        'url': 'https://www.anime-on-demand.de/anime/185',
        'only_matching': True,
+    }, {
+        # Flash videos
+        'url': 'https://www.anime-on-demand.de/anime/12',
+        'only_matching': True,
    }]

    def _login(self):
@@ -72,14 +75,13 @@ class AnimeOnDemandIE(InfoExtractor):
            'post url', default=self._LOGIN_URL, group='url')

        if not post_url.startswith('http'):
-            post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
-
-        request = sanitized_Request(
-            post_url, urlencode_postdata(login_form))
-        request.add_header('Referer', self._LOGIN_URL)
+            post_url = urljoin(self._LOGIN_URL, post_url)

        response = self._download_webpage(
-            request, None, 'Logging in as %s' % username)
+            post_url, None, 'Logging in as %s' % username,
+            data=urlencode_postdata(login_form), headers={
+                'Referer': self._LOGIN_URL,
+            })

        if all(p not in response for p in ('>Logout<', 'href="/users/sign_out"')):
            error = self._search_regex(
@@ -120,10 +122,11 @@ class AnimeOnDemandIE(InfoExtractor):
            formats = []

            for input_ in re.findall(
-                    r'<input[^>]+class=["\'].*?streamstarter_html5[^>]+>', html):
+                    r'<input[^>]+class=["\'].*?streamstarter[^>]+>', html):
                attributes = extract_attributes(input_)
+                title = attributes.get('data-dialog-header')
                playlist_urls = []
-                for playlist_key in ('data-playlist', 'data-otherplaylist'):
+                for playlist_key in ('data-playlist', 'data-otherplaylist', 'data-stream'):
                    playlist_url = attributes.get(playlist_key)
                    if isinstance(playlist_url, compat_str) and re.match(
                            r'/?[\da-zA-Z]+', playlist_url):
@@ -147,19 +150,38 @@ class AnimeOnDemandIE(InfoExtractor):
                        format_id_list.append(compat_str(num))
                    format_id = '-'.join(format_id_list)
                    format_note = ', '.join(filter(None, (kind, lang_note)))
-                    request = sanitized_Request(
-                        compat_urlparse.urljoin(url, playlist_url),
+                    item_id_list = []
+                    if format_id:
+                        item_id_list.append(format_id)
+                    item_id_list.append('videomaterial')
+                    playlist = self._download_json(
+                        urljoin(url, playlist_url), video_id,
+                        'Downloading %s JSON' % ' '.join(item_id_list),
                        headers={
                            'X-Requested-With': 'XMLHttpRequest',
                            'X-CSRF-Token': csrf_token,
                            'Referer': url,
                            'Accept': 'application/json, text/javascript, */*; q=0.01',
-                        })
-                    playlist = self._download_json(
-                        request, video_id, 'Downloading %s playlist JSON' % format_id,
-                        fatal=False)
+                        }, fatal=False)
                    if not playlist:
                        continue
+                    stream_url = playlist.get('streamurl')
+                    if stream_url:
+                        rtmp = re.search(
+                            r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+/))(?P<playpath>mp[34]:.+)',
+                            stream_url)
+                        if rtmp:
+                            formats.append({
+                                'url': rtmp.group('url'),
+                                'app': rtmp.group('app'),
+                                'play_path': rtmp.group('playpath'),
+                                'page_url': url,
+                                'player_url': 'https://www.anime-on-demand.de/assets/jwplayer.flash-55abfb34080700304d49125ce9ffb4a6.swf',
+                                'rtmp_real_time': True,
+                                'format_id': 'rtmp',
+                                'ext': 'flv',
+                            })
+                            continue
                    start_video = playlist.get('startvideo', 0)
                    playlist = playlist.get('playlist')
                    if not playlist or not isinstance(playlist, list):
@@ -222,7 +244,7 @@ class AnimeOnDemandIE(InfoExtractor):
                    f.update({
                        'id': '%s-%s' % (f['id'], m.group('kind').lower()),
                        'title': m.group('title'),
-                        'url': compat_urlparse.urljoin(url, m.group('href')),
+                        'url': urljoin(url, m.group('href')),
                    })
                    entries.append(f)

--- a/youtube_dl/extractor/condenast.py
+++ b/youtube_dl/extractor/condenast.py
@@ -116,16 +116,16 @@ class CondeNastIE(InfoExtractor):
        entries = [self.url_result(build_url(path), 'CondeNast') for path in paths]
        return self.playlist_result(entries, playlist_title=title)

-    def _extract_video_params(self, webpage):
-        query = {}
-        params = self._search_regex(
-            r'(?s)var params = {(.+?)}[;,]', webpage, 'player params', default=None)
-        if params:
-            query.update({
-                'videoId': self._search_regex(r'videoId: [\'"](.+?)[\'"]', params, 'video id'),
-                'playerId': self._search_regex(r'playerId: [\'"](.+?)[\'"]', params, 'player id'),
-                'target': self._search_regex(r'target: [\'"](.+?)[\'"]', params, 'target'),
-            })
+    def _extract_video_params(self, webpage, display_id):
+        query = self._parse_json(
+            self._search_regex(
+                r'(?s)var\s+params\s*=\s*({.+?})[;,]', webpage, 'player params',
+                default='{}'),
+            display_id, transform_source=js_to_json, fatal=False)
+        if query:
+            query['videoId'] = self._search_regex(
+                r'(?:data-video-id=|currentVideoId\s*=\s*)["\']([\da-f]+)',
+                webpage, 'video id', default=None)
        else:
            params = extract_attributes(self._search_regex(
                r'(<[^>]+data-js="video-player"[^>]+>)',
@@ -141,17 +141,27 @@ class CondeNastIE(InfoExtractor):
        video_id = params['videoId']

        video_info = None
-        if params.get('playerId'):
-            info_page = self._download_json(
-                'http://player.cnevids.com/player/video.js',
-                video_id, 'Downloading video info', fatal=False, query=params)
-            if info_page:
-                video_info = info_page.get('video')
-            if not video_info:
-                info_page = self._download_webpage(
-                    'http://player.cnevids.com/player/loader.js',
-                    video_id, 'Downloading loader info', query=params)
-        else:
+
+        # New API path
+        query = params.copy()
+        query['embedType'] = 'inline'
+        info_page = self._download_json(
+            'http://player.cnevids.com/embed-api.json', video_id,
+            'Downloading embed info', fatal=False, query=query)
+
+        # Old fallbacks
+        if not info_page:
+            if params.get('playerId'):
+                info_page = self._download_json(
+                    'http://player.cnevids.com/player/video.js', video_id,
+                    'Downloading video info', fatal=False, query=params)
+        if info_page:
+            video_info = info_page.get('video')
+        if not video_info:
+            info_page = self._download_webpage(
+                'http://player.cnevids.com/player/loader.js',
+                video_id, 'Downloading loader info', query=params)
+        if not video_info:
            info_page = self._download_webpage(
                'https://player.cnevids.com/inline/video/%s.js' % video_id,
                video_id, 'Downloading inline info', query={
@@ -215,7 +225,7 @@ class CondeNastIE(InfoExtractor):
        if url_type == 'series':
            return self._extract_series(url, webpage)
        else:
-            params = self._extract_video_params(webpage)
+            params = self._extract_video_params(webpage, display_id)
            info = self._search_json_ld(
                webpage, display_id, fatal=False)
            info.update(self._extract_video(params))
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -768,6 +768,7 @@ from .ora import OraTVIE
 from .orf import (
    ORFTVthekIE,
    ORFFM4IE,
+    ORFFM4StoryIE,
    ORFOE1IE,
    ORFIPTVIE,
 )
--- a/youtube_dl/extractor/orf.py
+++ b/youtube_dl/extractor/orf.py
@@ -6,14 +6,15 @@ import re
 from .common import InfoExtractor
 from ..compat import compat_str
 from ..utils import (
-    HEADRequest,
-    unified_strdate,
-    strip_jsonp,
-    int_or_none,
-    float_or_none,
    determine_ext,
+    float_or_none,
+    HEADRequest,
+    int_or_none,
+    orderedSet,
    remove_end,
+    strip_jsonp,
    unescapeHTML,
+    unified_strdate,
 )


@@ -307,3 +308,108 @@ class ORFIPTVIE(InfoExtractor):
            'upload_date': upload_date,
            'formats': formats,
        }
+
+
+class ORFFM4StoryIE(InfoExtractor):
+    IE_NAME = 'orf:fm4:story'
+    IE_DESC = 'fm4.orf.at stories'
+    _VALID_URL = r'https?://fm4\.orf\.at/stories/(?P<id>\d+)'
+
+    _TEST = {
+        'url': 'http://fm4.orf.at/stories/2865738/',
+        'playlist': [{
+            'md5': 'e1c2c706c45c7b34cf478bbf409907ca',
+            'info_dict': {
+                'id': '547792',
+                'ext': 'flv',
+                'title': 'Manu Delago und Inner Tongue live',
+                'description': 'Manu Delago und Inner Tongue haben bei der FM4 Soundpark Session live alles gegeben. Hier gibt es Fotos und die gesamte Session als Video.',
+                'duration': 1748.52,
+                'thumbnail': r're:^https?://.*\.jpg$',
+                'upload_date': '20170913',
+            },
+        }, {
+            'md5': 'c6dd2179731f86f4f55a7b49899d515f',
+            'info_dict': {
+                'id': '547798',
+                'ext': 'flv',
+                'title': 'Manu Delago und Inner Tongue live (2)',
+                'duration': 1504.08,
+                'thumbnail': r're:^https?://.*\.jpg$',
+                'upload_date': '20170913',
+                'description': 'Manu Delago und Inner Tongue haben bei der FM4 Soundpark Session live alles gegeben. Hier gibt es Fotos und die gesamte Session als Video.',
+            },
+        }],
+    }
+
+    def _real_extract(self, url):
+        story_id = self._match_id(url)
+        webpage = self._download_webpage(url, story_id)
+
+        entries = []
+        all_ids = orderedSet(re.findall(r'data-video(?:id)?="(\d+)"', webpage))
+        for idx, video_id in enumerate(all_ids):
+            data = self._download_json(
+                'http://bits.orf.at/filehandler/static-api/json/current/data.json?file=%s' % video_id,
+                video_id)[0]
+
+            duration = float_or_none(data['duration'], 1000)
+
+            video = data['sources']['q8c']
+            load_balancer_url = video['loadBalancerUrl']
+            abr = int_or_none(video.get('audioBitrate'))
+            vbr = int_or_none(video.get('bitrate'))
+            fps = int_or_none(video.get('videoFps'))
+            width = int_or_none(video.get('videoWidth'))
+            height = int_or_none(video.get('videoHeight'))
+            thumbnail = video.get('preview')
+
+            rendition = self._download_json(
+                load_balancer_url, video_id, transform_source=strip_jsonp)
+
+            f = {
+                'abr': abr,
+                'vbr': vbr,
+                'fps': fps,
+                'width': width,
+                'height': height,
+            }
+
+            formats = []
+            for format_id, format_url in rendition['redirect'].items():
+                if format_id == 'rtmp':
+                    ff = f.copy()
+                    ff.update({
+                        'url': format_url,
+                        'format_id': format_id,
+                    })
+                    formats.append(ff)
+                elif determine_ext(format_url) == 'f4m':
+                    formats.extend(self._extract_f4m_formats(
+                        format_url, video_id, f4m_id=format_id))
+                elif determine_ext(format_url) == 'm3u8':
+                    formats.extend(self._extract_m3u8_formats(
+                        format_url, video_id, 'mp4', m3u8_id=format_id))
+                else:
+                    continue
+            self._sort_formats(formats)
+
+            title = remove_end(self._og_search_title(webpage), ' - fm4.ORF.at')
+            if idx >= 1:
+                # Titles are duplicates, make them unique
+                title += ' (' + str(idx + 1) + ')'
+            description = self._og_search_description(webpage)
+            upload_date = unified_strdate(self._html_search_meta(
+                'dc.date', webpage, 'upload date'))
+
+            entries.append({
+                'id': video_id,
+                'title': title,
+                'description': description,
+                'duration': duration,
+                'thumbnail': thumbnail,
+                'upload_date': upload_date,
+                'formats': formats,
+            })
+
+        return self.playlist_result(entries)
--- a/youtube_dl/extractor/tv4.py
+++ b/youtube_dl/extractor/tv4.py
@@ -18,7 +18,7 @@ class TV4IE(InfoExtractor):
            tv4\.se/(?:[^/]+)/klipp/(?:.*)-|
            tv4play\.se/
            (?:
-                (?:program|barn)/(?:[^\?]+)\?video_id=|
+                (?:program|barn)/(?:[^/]+/|(?:[^\?]+)\?video_id=)|
                iframe/video/|
                film/|
                sport/|
@@ -63,6 +63,10 @@ class TV4IE(InfoExtractor):
            'url': 'http://www.tv4play.se/barn/looney-tunes?video_id=3062412',
            'only_matching': True,
        },
+        {
+            'url': 'http://www.tv4play.se/program/farang/3922081',
+            'only_matching': True,
+        }
    ]

    def _real_extract(self, url):
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2017.09.11'
+__version__ = '2017.09.15'
Author	SHA1	Message	Date
Sergey M․	159d304a9f	release 2017.09.15	2017-09-15 21:48:06 +07:00
Sergey M․	86e55e317c	[ChangeLog] Actualize	2017-09-15 21:45:18 +07:00
Sergey M․	c46680fb2a	[condenast] Fix extraction (closes #14196 , closes #14207 )	2017-09-15 02:01:17 +07:00
Philipp Hagemeister	fad9fc537d	[tv4] fix a test URL	2017-09-14 20:47:23 +02:00
Philipp Hagemeister	0732a90579	[orf] Add new extractor for f4m stories	2017-09-14 20:37:46 +02:00
Sergey M․	319fc70676	[tv4] Relax _VALID_URL (closes #14206 )	2017-09-14 23:50:19 +07:00
Sergey M․	e7c3e33456	[downloader/fragment] Restart inconsistent incomplete fragment downloads (#13731 )	2017-09-14 23:19:53 +07:00
Yen Chi Hsuan	757984af90	Merge pull request #12909 from remitamine/raw-sub [YoutubeDL] write raw subtitle files	2017-09-13 17:36:40 +08:00
Sergey M․	2f483758bc	[animeondemand] Improve and modernize	2017-09-11 04:32:35 +07:00
Sergey M․	018cc61549	[animeondemand] Bypass geo restriction	2017-09-11 04:23:42 +07:00
Sergey M․	2709d9fa28	[animeondemand] Add support for flash videos (closes #9944 )	2017-09-11 04:23:42 +07:00
Remita Amine	5ff1bc0cc1	[YoutubeDL] write raw subtitle files	2017-04-29 20:03:03 +01:00