release 2015.02.03.1

[wsj] Add new extractor (Fixes #4854)
[sort_formats] Prefer bitrate over video size
2015-02-03 10:59:27 +01:00 · 2015-02-03 10:58:28 +01:00 · 2015-02-03 10:53:07 +01:00 · 2015-02-03 10:52:22 +01:00 · 2015-02-03 10:18:32 +01:00 · 2015-02-03 10:17:13 +01:00
28 changed files with 523 additions and 108 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@ -1,4 +1,6 @@
-Please include the full output of the command when run with `--verbose`. The output (including the first lines) contain important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
+**Please include the full output of youtube-dl when run with `-v`**.
+
+The output (including the first lines) contain important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.

 Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):

@ -122,7 +124,7 @@ If you want to add support for a new site, you can follow this quick list (assum
 5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will be then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
 7. Have a look at [`youtube_dl/common/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
-8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
+8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
 9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:

        $ git add youtube_dl/extractor/__init__.py
--- a/Makefile
+++ b/Makefile
@ -1,10 +1,7 @@
 all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites

 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json CONTRIBUTING.md.tmp
-
-cleanall: clean
-	rm -f youtube-dl youtube-dl.exe
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe

 PREFIX ?= /usr/local
 BINDIR ?= $(PREFIX)/bin
--- a/README.md
+++ b/README.md
@ -368,11 +368,11 @@ which means you can modify it, redistribute it or use it however you like.
    --add-metadata                   write metadata to the video file
    --xattrs                         write metadata to the video file's xattrs
                                     (using dublin core and xdg standards)
-    --fixup POLICY                   (experimental) Automatically correct known
-                                     faults of the file. One of never (do
-                                     nothing), warn (only emit a warning),
-                                     detect_or_warn(check whether we can do
-                                     anything about it, warn otherwise
+    --fixup POLICY                   Automatically correct known faults of the
+                                     file. One of never (do nothing), warn (only
+                                     emit a warning), detect_or_warn(the
+                                     default; fix file if we can, warn
+                                     otherwise)
    --prefer-avconv                  Prefer avconv over ffmpeg for running the
                                     postprocessors (default)
    --prefer-ffmpeg                  Prefer ffmpeg over avconv for running the
@ -728,7 +728,7 @@ In particular, every site support request issue should only pertain to services

 ###  Is anyone going to need the feature?

-Only post features that you (or an incapicated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
+Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.

 ###  Is your question about youtube-dl?

--- a/devscripts/release.sh
+++ b/devscripts/release.sh
@ -35,7 +35,7 @@ if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dl: $us
 if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi

 /bin/echo -e "\n### First of all, testing..."
-make cleanall
+make clean
 if $skip_tests ; then
    echo 'SKIPPING TESTS'
 else
@ -45,9 +45,9 @@ fi
 /bin/echo -e "\n### Changing version in version.py..."
 sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py

-/bin/echo -e "\n### Committing README.md and youtube_dl/version.py..."
-make README.md
-git add README.md youtube_dl/version.py
+/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
+make README.md CONTRIBUTING.md supportedsites
+git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
 git commit -m "release $version"

 /bin/echo -e "\n### Now tagging, signing and pushing..."
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@ -9,6 +9,7 @@
 - **8tracks**
 - **9gag**
 - **abc.net.au**
+ - **Abc7News**
 - **AcademicEarth:Course**
 - **AddAnime**
 - **AdobeTV**
@ -16,9 +17,12 @@
 - **Aftonbladet**
 - **AlJazeera**
 - **Allocine**
+ - **AlphaPorno**
 - **anitube.se**
 - **AnySex**
 - **Aparat**
+ - **AppleDailyAnimationNews**
+ - **AppleDailyRealtimeNews**
 - **AppleTrailers**
 - **archive.org**: archive.org videos
 - **ARD**
@ -30,8 +34,10 @@
 - **arte.tv:ddc**
 - **arte.tv:embed**
 - **arte.tv:future**
+ - **AtresPlayer**
+ - **ATTTechChannel**
 - **audiomack**
- - **AUEngine**
+ - **audiomack:album**
 - **Azubu**
 - **bambuser**
 - **bambuser:channel**
@ -71,8 +77,10 @@
 - **cmt.com**
 - **CNET**
 - **CNN**
+ - **CNNArticle**
 - **CNNBlogs**
 - **CollegeHumor**
+ - **CollegeRama**
 - **ComCarCoff**
 - **ComedyCentral**
 - **ComedyCentralShows**: The Daily Show / The Colbert Report
@ -82,23 +90,27 @@
 - **Crunchyroll**
 - **crunchyroll:playlist**
 - **CSpan**: C-SPAN
+ - **CtsNews**
 - **culturebox.francetvinfo.fr**
 - **dailymotion**
 - **dailymotion:playlist**
 - **dailymotion:user**
 - **daum.net**
 - **DBTV**
+ - **DctpTv**
 - **DeezerPlaylist**
 - **defense.gouv.fr**
 - **Discovery**
 - **divxstage**: DivxStage
 - **Dotsub**
+ - **DRBonanza**
 - **Dropbox**
 - **DrTuber**
 - **DRTV**
 - **Dump**
 - **dvtv**: http://video.aktualne.cz/
 - **EbaumsWorld**
+ - **EchoMsk**
 - **eHow**
 - **Einthusan**
 - **eitb.tv**
@ -108,6 +120,7 @@
 - **EMPFlix**
 - **Engadget**
 - **Eporner**
+ - **EroProfile**
 - **Escapist**
 - **EveryonesMixtape**
 - **exfm**: ex.fm
@ -143,6 +156,7 @@
 - **GDCVault**
 - **generic**: Generic downloader that works on some sites
 - **GiantBomb**
+ - **Giga**
 - **Glide**: Glide mobile video messages (glide.me)
 - **Globo**
 - **GodTube**
@ -153,9 +167,14 @@
 - **Grooveshark**
 - **Groupon**
 - **Hark**
+ - **HearThisAt**
 - **Heise**
+ - **HellPorno**
 - **Helsinki**: helsinki.fi
 - **HentaiStigma**
+ - **HistoricFilms**
+ - **hitbox**
+ - **hitbox:live**
 - **HornBunny**
 - **HostingBulk**
 - **HotNewHipHop**
@ -182,6 +201,7 @@
 - **jpopsuki.tv**
 - **Jukebox**
 - **Kankan**
+ - **Karaoketv**
 - **keek**
 - **KeezMovies**
 - **KhanAcademy**
@ -195,6 +215,7 @@
 - **LiveLeak**
 - **livestream**
 - **livestream:original**
+ - **LnkGo**
 - **lrt.lt**
 - **lynda**: lynda.com videos
 - **lynda:course**: lynda.com online courses
@ -235,6 +256,7 @@
 - **MySpass**
 - **myvideo**
 - **MyVidster**
+ - **n-tv.de**
 - **Naver**
 - **NBA**
 - **NBC**
@ -242,11 +264,16 @@
 - **ndr**: NDR.de - Mediathek
 - **NDTV**
 - **NerdCubedFeed**
+ - **Nerdist**
+ - **Netzkino**
 - **Newgrounds**
 - **Newstube**
+ - **NextMedia**
+ - **NextMediaActionNews**
 - **nfb**: National Film Board of Canada
 - **nfl.com**
 - **nhl.com**
+ - **nhl.com:news**: NHL news
 - **nhl.com:videocenter**: NHL videocenter category
 - **niconico**: ニコニコ動画
 - **NiconicoPlaylist**
@ -257,18 +284,20 @@
 - **Nowness**
 - **nowvideo**: NowVideo
 - **npo.nl**
+ - **npo.nl:live**
 - **NRK**
 - **NRKTV**
- - **NTV**
+ - **ntv.ru**
 - **Nuvid**
 - **NYTimes**
 - **ocw.mit.edu**
 - **OktoberfestTV**
 - **on.aol.com**
 - **Ooyala**
+ - **OpenFilm**
+ - **orf:fm4**: radio FM4
 - **orf:oe1**: Radio Österreich 1
 - **orf:tvthek**: ORF TVthek
- - **ORFFM4**: radio FM4
 - **parliamentlive.tv**: UK parliament videos
 - **Patreon**
 - **PBS**
@ -290,6 +319,7 @@
 - **Pyvideo**
 - **QuickVid**
 - **radio.de**
+ - **radiobremen**
 - **radiofrance**
 - **Rai**
 - **RBMARadio**
@ -300,6 +330,8 @@
 - **RottenTomatoes**
 - **Roxwel**
 - **RTBF**
+ - **Rte**
+ - **RTL2**
 - **RTLnow**
 - **rtlxl.nl**
 - **RTP**
@ -309,6 +341,7 @@
 - **RUHD**
 - **rutube**: Rutube videos
 - **rutube:channel**: Rutube channels
+ - **rutube:embed**: Rutube embedded videos
 - **rutube:movie**: Rutube movies
 - **rutube:person**: Rutube person videos
 - **RUTV**: RUTV.RU
@ -351,11 +384,12 @@
 - **Sport5**
 - **SportBox**
 - **SportDeutschland**
- - **SRMediathek**: Süddeutscher Rundfunk
+ - **SRMediathek**: Saarländischer Rundfunk
 - **stanfordoc**: Stanford Open ClassRoom
 - **Steam**
 - **streamcloud.eu**
 - **StreamCZ**
+ - **StreetVoice**
 - **SunPorno**
 - **SWRMediathek**
 - **Syfy**
@ -375,7 +409,9 @@
 - **TeleBruxelles**
 - **telecinco.es**
 - **TeleMB**
+ - **TeleTask**
 - **TenPlay**
+ - **TestTube**
 - **TF1**
 - **TheOnion**
 - **ThePlatform**
@ -403,8 +439,15 @@
 - **tv.dfb.de**
 - **tvigle**: Интернет-телевидение Tvigle.ru
 - **tvp.pl**
+ - **tvp.pl:Series**
 - **TVPlay**: TV3Play and related services
- - **Twitch**
+ - **twitch:bookmarks**
+ - **twitch:chapter**
+ - **twitch:past_broadcasts**
+ - **twitch:profile**
+ - **twitch:stream**
+ - **twitch:video**
+ - **twitch:vod**
 - **Ubu**
 - **udemy**
 - **udemy:course**
@ -433,6 +476,8 @@
 - **videoweed**: VideoWeed
 - **Vidme**
 - **Vidzi**
+ - **vier**
+ - **vier:videos**
 - **viki**
 - **vimeo**
 - **vimeo:album**
@ -460,11 +505,13 @@
 - **WDR**
 - **wdr:mobile**
 - **WDRMaus**: Sendung mit der Maus
+ - **WebOfStories**
 - **Weibo**
 - **Wimp**
 - **Wistia**
 - **WorldStarHipHop**
 - **wrzuta.pl**
+ - **WSJ**: Wall Street Journal
 - **XBef**
 - **XboxClips**
 - **XHamster**
@ -472,7 +519,9 @@
 - **XNXX**
 - **XTube**
 - **XTubeUser**: XTube user profile
+ - **Xuite**
 - **XVideos**
+ - **XXXYMovies**
 - **Yahoo**: Yahoo screen and movies
 - **YesJapan**
 - **Ynet**
@ -491,7 +540,6 @@
 - **youtube:search_url**: YouTube.com search URLs
 - **youtube:show**: YouTube.com (multi-season) shows
 - **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
- - **youtube:toplist**: YouTube.com top lists, "yttoplist:{channel}:{list title}" (Example: "yttoplist:music:Top Tracks")
 - **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
 - **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
 - **ZDF**
--- a/test/helper.py
+++ b/test/helper.py
@ -103,6 +103,16 @@ def expect_info_dict(self, got_dict, expected_dict):
            self.assertTrue(
                match_rex.match(got),
                'field %s (value: %r) should match %r' % (info_field, got, match_str))
+        elif isinstance(expected, compat_str) and expected.startswith('startswith:'):
+            got = got_dict.get(info_field)
+            start_str = expected[len('startswith:'):]
+            self.assertTrue(
+                isinstance(got, compat_str),
+                'Expected a %s object, but got %s for field %s' % (
+                    compat_str.__name__, type(got).__name__, info_field))
+            self.assertTrue(
+                got.startswith(start_str),
+                'field %s (value: %r) should start with %r' % (info_field, got, start_str))
        elif isinstance(expected, type):
            got = got_dict.get(info_field)
            self.assertTrue(isinstance(got, expected),
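
Note on the `startswith:` hunk above: the new branch lets a test expectation match only the beginning of a field, which helps when a site truncates or appends to descriptions. A minimal standalone sketch of the same check follows. `check_expected` is a hypothetical helper (not part of `test/helper.py`), and plain `str` stands in for `compat_str`.

```python
def check_expected(got, expected):
    """Return True if `got` satisfies `expected` (hypothetical helper)."""
    prefix = 'startswith:'
    if isinstance(expected, str) and expected.startswith(prefix):
        # Strip the marker and compare only the beginning of the value.
        start_str = expected[len(prefix):]
        return isinstance(got, str) and got.startswith(start_str)
    # Anything else falls back to plain equality in this sketch.
    return got == expected


print(check_expected('Avec : some long description ...', 'startswith:Avec :'))
```

The type check comes first so that a missing field (`None`) is reported as a mismatch instead of raising `AttributeError`.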
--- a/test/test_utils.py
+++ b/test/test_utils.py
@ -156,6 +156,9 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(
            unified_strdate('11/26/2014 11:30:00 AM PST', day_first=False),
            '20141126')
+        self.assertEqual(
+            unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
+            '20150202')

    def test_find_xpath_attr(self):
        testxml = '''<root>
@ -238,6 +241,8 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(parse_duration('5 s'), 5)
        self.assertEqual(parse_duration('3 min'), 180)
        self.assertEqual(parse_duration('2.5 hours'), 9000)
+        self.assertEqual(parse_duration('02:03:04'), 7384)
+        self.assertEqual(parse_duration('01:02:03:04'), 93784)

    def test_fix_xml_ampersands(self):
        self.assertEqual(
@ -371,6 +376,16 @@ class TestUtil(unittest.TestCase):
        on = js_to_json('{"abc": true}')
        self.assertEqual(json.loads(on), {'abc': True})

+        # Ignore JavaScript code as well
+        on = js_to_json('''{
+            "x": 1,
+            y: "a",
+            z: some.code
+        }''')
+        d = json.loads(on)
+        self.assertEqual(d['x'], 1)
+        self.assertEqual(d['y'], 'a')
+
    def test_clean_html(self):
        self.assertEqual(clean_html('a:\nb'), 'a: b')
        self.assertEqual(clean_html('a:\n   "b"'), 'a:    "b"')
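
Note on the new `parse_duration` assertions above: they expect colon-separated values to be read as `[[DD:]HH:]MM:SS`. The sketch below reproduces only that arithmetic. `colon_duration_to_seconds` is a hypothetical name, and the real `parse_duration` in `youtube_dl/utils.py` handles more formats and may be implemented differently.

```python
def colon_duration_to_seconds(text):
    """Convert 'SS', 'MM:SS', 'HH:MM:SS' or 'DD:HH:MM:SS' to seconds."""
    parts = [int(p) for p in text.split(':')]
    if not 1 <= len(parts) <= 4:
        raise ValueError('unsupported duration: %r' % text)
    # Multipliers for seconds, minutes, hours, days, matched right to left.
    multipliers = [1, 60, 3600, 86400]
    return sum(v * m for v, m in zip(reversed(parts), multipliers))


print(colon_duration_to_seconds('02:03:04'))
print(colon_duration_to_seconds('01:02:03:04'))
```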
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@ -964,9 +964,11 @@ class YoutubeDL(object):
            thumbnails.sort(key=lambda t: (
                t.get('preference'), t.get('width'), t.get('height'),
                t.get('id'), t.get('url')))
-            for t in thumbnails:
+            for i, t in enumerate(thumbnails):
                if 'width' in t and 'height' in t:
                    t['resolution'] = '%dx%d' % (t['width'], t['height'])
+                if t.get('id') is None:
+                    t['id'] = '%d' % i

        if thumbnails and 'thumbnail' not in info_dict:
            info_dict['thumbnail'] = thumbnails[-1]['url']
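
Note on the thumbnail hunk above: after sorting, each thumbnail without an explicit `id` now receives its list index as a string. A standalone sketch of that loop, assuming the list is already sorted (`assign_thumbnail_ids` is a hypothetical name and the sample dictionaries are made up):

```python
def assign_thumbnail_ids(thumbnails):
    """Fill in 'resolution' and a fallback 'id' for each thumbnail dict."""
    for i, t in enumerate(thumbnails):
        if 'width' in t and 'height' in t:
            t['resolution'] = '%dx%d' % (t['width'], t['height'])
        if t.get('id') is None:
            # Fall back to the position in the (already sorted) list.
            t['id'] = '%d' % i
    return thumbnails


thumbs = assign_thumbnail_ids([
    {'url': 'http://example.com/small.jpg'},
    {'url': 'http://example.com/big.jpg', 'id': 'big',
     'width': 640, 'height': 360},
])
print([t['id'] for t in thumbs])
```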
--- a/youtube_dl/downloader/external.py
+++ b/youtube_dl/downloader/external.py
@ -45,6 +45,12 @@ class ExternalFD(FileDownloader):
    def supports(cls, info_dict):
        return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps')

+    def _source_address(self, command_option):
+        source_address = self.params.get('source_address')
+        if source_address is None:
+            return []
+        return [command_option, source_address]
+
    def _call_downloader(self, tmpfilename, info_dict):
        """ Either overwrite this or implement _make_cmd """
        cmd = self._make_cmd(tmpfilename, info_dict)
@ -72,6 +78,7 @@ class CurlFD(ExternalFD):
        cmd = [self.exe, '-o', tmpfilename]
        for key, val in info_dict['http_headers'].items():
            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--interface')
        cmd += ['--', info_dict['url']]
        return cmd

@ -81,6 +88,7 @@ class WgetFD(ExternalFD):
        cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
        for key, val in info_dict['http_headers'].items():
            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--bind-address')
        cmd += ['--', info_dict['url']]
        return cmd

@ -96,6 +104,7 @@ class Aria2cFD(ExternalFD):
        cmd += ['--out', os.path.basename(tmpfilename)]
        for key, val in info_dict['http_headers'].items():
            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--interface')
        cmd += ['--', info_dict['url']]
        return cmd
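
Note on the `_source_address` hunks above: the helper returns either an empty list or an option/value pair, so callers can always use `cmd += ...` without a conditional. The sketch below shows the pattern outside the class. `source_address_args` and `make_wget_cmd` are hypothetical free functions, and `192.0.2.10` is a documentation-range placeholder address.

```python
def source_address_args(params, command_option):
    """Return [option, address] if a source address is set, else []."""
    source_address = params.get('source_address')
    if source_address is None:
        return []
    return [command_option, source_address]


def make_wget_cmd(params, url, tmpfilename):
    """Build a wget command line along the lines of WgetFD._make_cmd."""
    cmd = ['wget', '-O', tmpfilename, '-nv', '--no-cookies']
    cmd += source_address_args(params, '--bind-address')
    cmd += ['--', url]
    return cmd


print(make_wget_cmd({'source_address': '192.0.2.10'},
                    'http://example.com/v.mp4', 'v.mp4.part'))
```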

--- a/youtube_dl/downloader/http.py
+++ b/youtube_dl/downloader/http.py
@ -3,6 +3,9 @@ from __future__ import unicode_literals
 import os
 import time

+from socket import error as SocketError
+import errno
+
 from .common import FileDownloader
 from ..compat import (
    compat_urllib_request,
@ -99,6 +102,11 @@ class HttpFD(FileDownloader):
                            resume_len = 0
                            open_mode = 'wb'
                            break
+            except SocketError as e:
+                if e.errno != errno.ECONNRESET:
+                    # Connection reset is no problem, just retry
+                    raise
+
            # Retry
            count += 1
            if count <= retries:
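
Note on the `SocketError` hunk above: a connection reset is swallowed so that control falls through to the retry counter, while any other socket error is re-raised. A reduced sketch of that control flow, under the assumption that the download is a plain callable (`fetch_with_retries` and `flaky` are hypothetical; the real loop lives inside `HttpFD.real_download`):

```python
import errno
import socket


def fetch_with_retries(fetch, retries):
    """Call fetch(), retrying only when the connection was reset."""
    count = 0
    while count <= retries:
        try:
            return fetch()
        except socket.error as e:
            if e.errno != errno.ECONNRESET:
                # Any other socket error is a real failure.
                raise
            # Connection reset: fall through and retry.
        count += 1
    raise RuntimeError('giving up after %d retries' % retries)


attempts = []


def flaky():
    """Fail with ECONNRESET twice, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise socket.error(errno.ECONNRESET, 'Connection reset by peer')
    return 'ok'


print(fetch_with_retries(flaky, retries=3))
```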
--- a/youtube_dl/extractor/__init__.py
+++ b/youtube_dl/extractor/__init__.py
@ -182,6 +182,7 @@ from .heise import HeiseIE
 from .hellporno import HellPornoIE
 from .helsinki import HelsinkiIE
 from .hentaistigma import HentaiStigmaIE
+from .historicfilms import HistoricFilmsIE
 from .hitbox import HitboxIE, HitboxLiveIE
 from .hornbunny import HornBunnyIE
 from .hostingbulk import HostingBulkIE
@ -284,6 +285,7 @@ from .ndr import NDRIE
 from .ndtv import NDTVIE
 from .netzkino import NetzkinoIE
 from .nerdcubed import NerdCubedFeedIE
+from .nerdist import NerdistIE
 from .newgrounds import NewgroundsIE
 from .newstube import NewstubeIE
 from .nextmedia import (
@ -316,7 +318,8 @@ from .nrk import (
    NRKIE,
    NRKTVIE,
 )
-from .ntv import NTVIE
+from .ntvde import NTVDeIE
+from .ntvru import NTVRuIE
 from .nytimes import NYTimesIE
 from .nuvid import NuvidIE
 from .oktoberfesttv import OktoberfestTVIE
@ -551,6 +554,7 @@ from .wimp import WimpIE
 from .wistia import WistiaIE
 from .worldstarhiphop import WorldStarHipHopIE
 from .wrzuta import WrzutaIE
+from .wsj import WSJIE
 from .xbef import XBefIE
 from .xboxclips import XboxClipsIE
 from .xhamster import XHamsterIE
--- a/youtube_dl/extractor/aftonbladet.py
+++ b/youtube_dl/extractor/aftonbladet.py
@ -1,8 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor


@ -21,9 +19,7 @@ class AftonbladetIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.search(self._VALID_URL, url)
-
-        video_id = mobj.group('video_id')
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        # find internal video meta data
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dl/extractor/brightcove.py
@ -108,7 +108,7 @@ class BrightcoveIE(InfoExtractor):
        """

        # Fix up some stupid HTML, see https://github.com/rg3/youtube-dl/issues/1553
-        object_str = re.sub(r'(<param name="[^"]+" value="[^"]+")>',
+        object_str = re.sub(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>',
                            lambda m: m.group(1) + '/>', object_str)
        # Fix up some stupid XML, see https://github.com/rg3/youtube-dl/issues/1608
        object_str = object_str.replace('<--', '<!--')
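
Note on the regex change above: the old pattern only closed `<param>` tags that had exactly `name` then `value`, both non-empty. The new one accepts any number of attributes in any order, including empty values. The comparison below uses a made-up sample string, and `close_params` is a hypothetical wrapper around the same `re.sub` call.

```python
import re

OLD_PATTERN = r'(<param name="[^"]+" value="[^"]+")>'
NEW_PATTERN = r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>'


def close_params(pattern, object_str):
    """Turn unclosed <param ...> tags into self-closing ones."""
    return re.sub(pattern, lambda m: m.group(1) + '/>', object_str)


# Sample markup (an assumption for illustration): one empty value,
# one tag with its attributes in the opposite order.
sample = ('<object><param name="flashVars" value="">'
          '<param value="1" name="x"></object>')

print(close_params(OLD_PATTERN, sample))
print(close_params(NEW_PATTERN, sample))
```

Tags that are already self-closing are left alone by the new pattern, because a `/` before `>` prevents a match.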
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@ -145,6 +145,7 @@ class InfoExtractor(object):
    thumbnail:      Full URL to a video thumbnail image.
    description:    Full video description.
    uploader:       Full name of the video uploader.
+    creator:        The main artist who created the video.
    timestamp:      UNIX timestamp of the moment the video became available.
    upload_date:    Video upload date (YYYYMMDD).
                    If not explicitly set, calculated from timestamp.
@ -704,11 +705,11 @@ class InfoExtractor(object):
                preference,
                f.get('language_preference') if f.get('language_preference') is not None else -1,
                f.get('quality') if f.get('quality') is not None else -1,
-                f.get('height') if f.get('height') is not None else -1,
-                f.get('width') if f.get('width') is not None else -1,
-                ext_preference,
                f.get('tbr') if f.get('tbr') is not None else -1,
                f.get('vbr') if f.get('vbr') is not None else -1,
+                ext_preference,
+                f.get('height') if f.get('height') is not None else -1,
+                f.get('width') if f.get('width') is not None else -1,
                f.get('abr') if f.get('abr') is not None else -1,
                audio_ext_preference,
                f.get('fps') if f.get('fps') is not None else -1,
@ -860,10 +861,13 @@ class InfoExtractor(object):
        return formats

    # TODO: improve extraction
-    def _extract_smil_formats(self, smil_url, video_id):
+    def _extract_smil_formats(self, smil_url, video_id, fatal=True):
        smil = self._download_xml(
            smil_url, video_id, 'Downloading SMIL file',
-            'Unable to download SMIL file')
+            'Unable to download SMIL file', fatal=fatal)
+        if smil is False:
+            assert not fatal
+            return []

        base = smil.find('./head/meta').get('base')
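
Note on the `_sort_formats` hunk above: moving `tbr`/`vbr` ahead of `height`/`width` in the key tuple means that bitrate now decides before frame size. The sketch below uses a deliberately simplified key with only three fields. The real tuple also contains preference, quality, extension preference and more, and the two sample formats are invented.

```python
def _or_minus_one(value):
    """Treat a missing field as -1, as the real sort key does."""
    return value if value is not None else -1


def old_key(f):
    # Simplified pre-change ordering: size first, then bitrate.
    return (_or_minus_one(f.get('height')), _or_minus_one(f.get('width')),
            _or_minus_one(f.get('tbr')))


def new_key(f):
    # Simplified post-change ordering: bitrate first, then size.
    return (_or_minus_one(f.get('tbr')), _or_minus_one(f.get('height')),
            _or_minus_one(f.get('width')))


formats = [
    {'format_id': 'large-lowrate', 'width': 1280, 'height': 720, 'tbr': 1000},
    {'format_id': 'small-highrate', 'width': 960, 'height': 540, 'tbr': 2500},
]

# Formats are sorted ascending, so the last entry is the preferred one.
print(sorted(formats, key=old_key)[-1]['format_id'])
print(sorted(formats, key=new_key)[-1]['format_id'])
```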

--- a/youtube_dl/extractor/drtv.py
+++ b/youtube_dl/extractor/drtv.py
@ -25,9 +25,15 @@ class DRTVIE(SubtitlesInfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

-        programcard = self._download_json(
-            'http://www.dr.dk/mu/programcard/expanded/%s' % video_id, video_id, 'Downloading video JSON')
+        webpage = self._download_webpage(url, video_id)

+        video_id = self._search_regex(
+            r'data-(?:material-identifier|episode-slug)="([^"]+)"',
+            webpage, 'video id')
+
+        programcard = self._download_json(
+            'http://www.dr.dk/mu/programcard/expanded/%s' % video_id,
+            video_id, 'Downloading video JSON')
        data = programcard['Data'][0]

        title = data['Title']
--- a/youtube_dl/extractor/franceculture.py
+++ b/youtube_dl/extractor/franceculture.py
@ -1,77 +1,69 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
-import re
-
 from .common import InfoExtractor
 from ..compat import (
-    compat_parse_qs,
    compat_urlparse,
 )
+from ..utils import (
+    determine_ext,
+    int_or_none,
+)


 class FranceCultureIE(InfoExtractor):
-    _VALID_URL = r'(?P<baseurl>http://(?:www\.)?franceculture\.fr/)player/reecouter\?play=(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?franceculture\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://www.franceculture.fr/player/reecouter?play=4795174',
        'info_dict': {
            'id': '4795174',
            'ext': 'mp3',
            'title': 'Rendez-vous au pays des geeks',
+            'alt_title': 'Carnet nomade | 13-14',
            'vcodec': 'none',
-            'uploader': 'Colette Fellous',
            'upload_date': '20140301',
-            'duration': 3601,
            'thumbnail': r're:^http://www\.franceculture\.fr/.*/images/player/Carnet-nomade\.jpg$',
-            'description': 'Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats-Unis dans la S ...',
+            'description': 'startswith:Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats',
+            'timestamp': 1393700400,
        }
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        baseurl = mobj.group('baseurl')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        params_code = self._search_regex(
-            r"<param name='movie' value='/sites/all/modules/rf/rf_player/swf/loader.swf\?([^']+)' />",
-            webpage, 'parameter code')
-        params = compat_parse_qs(params_code)
-        video_url = compat_urlparse.urljoin(baseurl, params['urlAOD'][0])
+
+        video_path = self._search_regex(
+            r'<a id="player".*?href="([^"]+)"', webpage, 'video path')
+        video_url = compat_urlparse.urljoin(url, video_path)
+        timestamp = int_or_none(self._search_regex(
+            r'<a id="player".*?data-date="([0-9]+)"',
+            webpage, 'upload date', fatal=False))
+        thumbnail = self._search_regex(
+            r'<a id="player".*?>\s+<img src="([^"]+)"',
+            webpage, 'thumbnail', fatal=False)

        title = self._html_search_regex(
-            r'<h1 class="title[^"]+">(.+?)</h1>', webpage, 'title')
+            r'<span class="title-diffusion">(.*?)</span>', webpage, 'title')
+        alt_title = self._html_search_regex(
+            r'<span class="title">(.*?)</span>',
+            webpage, 'alt_title', fatal=False)
+        description = self._html_search_regex(
+            r'<span class="description">(.*?)</span>',
+            webpage, 'description', fatal=False)
+
        uploader = self._html_search_regex(
            r'(?s)<div id="emission".*?<span class="author">(.*?)</span>',
-            webpage, 'uploader', fatal=False)
-        thumbnail_part = self._html_search_regex(
-            r'(?s)<div id="emission".*?<img src="([^"]+)"', webpage,
-            'thumbnail', fatal=False)
-        if thumbnail_part is None:
-            thumbnail = None
-        else:
-            thumbnail = compat_urlparse.urljoin(baseurl, thumbnail_part)
-        description = self._html_search_regex(
-            r'(?s)<p class="desc">(.*?)</p>', webpage, 'description')
-
-        info = json.loads(params['infoData'][0])[0]
-        duration = info.get('media_length')
-        upload_date_candidate = info.get('media_section5')
-        upload_date = (
-            upload_date_candidate
-            if (upload_date_candidate is not None and
-                re.match(r'[0-9]{8}$', upload_date_candidate))
-            else None)
+            webpage, 'uploader', default=None)
+        vcodec = 'none' if determine_ext(video_url.lower()) == 'mp3' else None

        return {
            'id': video_id,
            'url': video_url,
-            'vcodec': 'none' if video_url.lower().endswith('.mp3') else None,
-            'duration': duration,
+            'vcodec': vcodec,
            'uploader': uploader,
-            'upload_date': upload_date,
+            'timestamp': timestamp,
            'title': title,
+            'alt_title': alt_title,
            'thumbnail': thumbnail,
            'description': description,
        }
--- a/youtube_dl/extractor/historicfilms.py
+++ b/youtube_dl/extractor/historicfilms.py
@ -0,0 +1,46 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import parse_duration
+
+
+class HistoricFilmsIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?historicfilms\.com/(?:tapes/|play)(?P<id>\d+)'
+    _TEST = {
+        'url': 'http://www.historicfilms.com/tapes/4728',
+        'md5': 'd4a437aec45d8d796a38a215db064e9a',
+        'info_dict': {
+            'id': '4728',
+            'ext': 'mov',
+            'title': 'Historic Films: GP-7',
+            'description': 'md5:1a86a0f3ac54024e419aba97210d959a',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 2096,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        tape_id = self._search_regex(
+            r'class="tapeId">([^<]+)<', webpage, 'tape id')
+
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)
+        thumbnail = self._html_search_meta(
+            'thumbnailUrl', webpage, 'thumbnails') or self._og_search_thumbnail(webpage)
+        duration = parse_duration(self._html_search_meta(
+            'duration', webpage, 'duration'))
+
+        video_url = 'http://www.historicfilms.com/video/%s_%s_web.mov' % (tape_id, video_id)
+
+        return {
+            'id': video_id,
+            'url': video_url,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+        }
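
Note on the new HistoricFilms extractor above: the numeric id comes from the URL, and the tape id scraped from the page is combined with it to build the media URL. The sketch below shows both steps with only the standard library. `match_id` is a hypothetical stand-in for `InfoExtractor._match_id`, and `'TAPE'` is a placeholder, since the real tape id is only known after downloading the page.

```python
import re

VALID_URL = r'https?://(?:www\.)?historicfilms\.com/(?:tapes/|play)(?P<id>\d+)'


def match_id(url):
    """Return the 'id' group of VALID_URL, or raise if the URL is foreign."""
    m = re.match(VALID_URL, url)
    if m is None:
        raise ValueError('URL not handled: %s' % url)
    return m.group('id')


def build_video_url(tape_id, video_id):
    """Compose the .mov URL the same way the extractor does."""
    return 'http://www.historicfilms.com/video/%s_%s_web.mov' % (
        tape_id, video_id)


video_id = match_id('http://www.historicfilms.com/tapes/4728')
print(video_id)
print(build_video_url('TAPE', video_id))
```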
--- a/youtube_dl/extractor/nerdist.py
+++ b/youtube_dl/extractor/nerdist.py
@ -0,0 +1,80 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+from ..utils import (
+    determine_ext,
+    parse_iso8601,
+    xpath_text,
+)
+
+
+class NerdistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?nerdist\.com/vepisode/(?P<id>[^/?#]+)'
+    _TEST = {
+        'url': 'http://www.nerdist.com/vepisode/exclusive-which-dc-characters-w',
+        'md5': '3698ed582931b90d9e81e02e26e89f23',
+        'info_dict': {
+            'display_id': 'exclusive-which-dc-characters-w',
+            'id': 'RPHpvJyr',
+            'ext': 'mp4',
+            'title': 'Your TEEN TITANS Revealed! Who\'s on the show?',
+            'thumbnail': 're:^https?://.*/thumbs/.*\.jpg$',
+            'description': 'Exclusive: Find out which DC Comics superheroes will star in TEEN TITANS Live-Action TV Show on Nerdist News with Jessica Chobot!',
+            'uploader': 'Eric Diaz',
+            'upload_date': '20150202',
+            'timestamp': 1422892808,
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._search_regex(
+            r'''(?x)<script\s+(?:type="text/javascript"\s+)?
+                src="https?://content\.nerdist\.com/players/([a-zA-Z0-9_]+)-''',
+            webpage, 'video ID')
+        timestamp = parse_iso8601(self._html_search_meta(
+            'shareaholic:article_published_time', webpage, 'upload date'))
+        uploader = self._html_search_meta(
+            'shareaholic:article_author_name', webpage, 'article author')
+
+        doc = self._download_xml(
+            'http://content.nerdist.com/jw6/%s.xml' % video_id, video_id)
+        video_info = doc.find('.//item')
+        title = xpath_text(video_info, './title', fatal=True)
+        description = xpath_text(video_info, './description')
+        thumbnail = xpath_text(
+            video_info, './{http://rss.jwpcdn.com/}image', 'thumbnail')
+
+        formats = []
+        for source in video_info.findall('./{http://rss.jwpcdn.com/}source'):
+            vurl = source.attrib['file']
+            ext = determine_ext(vurl)
+            if ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    vurl, video_id, entry_protocol='m3u8_native', ext='mp4',
+                    preference=0))
+            elif ext == 'smil':
+                formats.extend(self._extract_smil_formats(
+                    vurl, video_id, fatal=False
+                ))
+            else:
+                formats.append({
+                    'format_id': ext,
+                    'url': vurl,
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'formats': formats,
+            'uploader': uploader,
+        }
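The XML handling in the nerdist extractor above can be tried in isolation. This is a minimal sketch using only the standard library; the sample feed and the example.com URLs are invented for illustration, and only the `http://rss.jwpcdn.com/` namespace and the element names come from the diff.

```python
import xml.etree.ElementTree as ET

# Namespace used by the extractor for the JW Player specific elements.
JW_NS = '{http://rss.jwpcdn.com/}'

# Invented sample feed; the real feed layout may differ.
SAMPLE = '''<rss xmlns:jwplayer="http://rss.jwpcdn.com/"><channel><item>
<title>Example</title>
<jwplayer:image>http://example.com/t.jpg</jwplayer:image>
<jwplayer:source file="http://example.com/v.m3u8"/>
<jwplayer:source file="http://example.com/v.mp4"/>
</item></channel></rss>'''

doc = ET.fromstring(SAMPLE)
item = doc.find('.//item')
title = item.find('./title').text
# Namespaced children are addressed as '{namespace-uri}localname'.
thumbnail = item.find('./%simage' % JW_NS).text
source_urls = [s.attrib['file'] for s in item.findall('./%ssource' % JW_NS)]
```

Each source URL would then be dispatched on its extension (m3u8, smil, or a plain file), as the extractor does.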
--- a/youtube_dl/extractor/nfl.py
+++ b/youtube_dl/extractor/nfl.py
@ -46,7 +46,18 @@ class NFLIE(InfoExtractor):
                'timestamp': 1388354455,
                'thumbnail': 're:^https?://.*\.jpg$',
            }
-        }
+        },
+        {
+            'url': 'http://www.nfl.com/news/story/0ap3000000467586/article/patriots-seahawks-involved-in-lategame-skirmish',
+            'info_dict': {
+                'id': '0ap3000000467607',
+                'ext': 'mp4',
+                'title': 'Frustrations flare on the field',
+                'description': 'Emotions ran high at the end of the Super Bowl on both sides of the ball after a dramatic finish.',
+                'timestamp': 1422850320,
+                'upload_date': '20150202',
+            },
+        },
    ]

    @staticmethod
@ -80,7 +91,11 @@ class NFLIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)

        config_url = NFLIE.prepend_host(host, self._search_regex(
-            r'(?:config|configURL)\s*:\s*"([^"]+)"', webpage, 'config URL'))
+            r'(?:config|configURL)\s*:\s*"([^"]+)"', webpage, 'config URL',
+            default='static/content/static/config/video/config.json'))
+        # For articles, the id in the url is not the video id
+        video_id = self._search_regex(
+            r'contentId\s*:\s*"([^"]+)"', webpage, 'video id', default=video_id)
        config = self._download_json(config_url, video_id,
                                     note='Downloading player config')
        url_template = NFLIE.prepend_host(
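The article support in the nfl hunk above relies on a regex search that falls back to a default when nothing matches. A minimal sketch of that behaviour follows, assuming a hypothetical stand-alone helper `search_regex` (not youtube-dl's `_search_regex` itself) and invented page snippets.

```python
import re

_NO_DEFAULT = object()


def search_regex(pattern, text, name, default=_NO_DEFAULT):
    """Hypothetical helper: return group 1 of the first match, else the default."""
    m = re.search(pattern, text)
    if m:
        return m.group(1)
    if default is not _NO_DEFAULT:
        return default
    raise ValueError('Unable to extract %s' % name)


url_id = '0ap3000000467586'  # id taken from the article URL
article_page = 'contentId: "0ap3000000467607", other: 1'  # invented snippet
video_page = '<div>no player variables here</div>'  # invented snippet

pattern = r'contentId\s*:\s*"([^"]+)"'
# On an article page the embedded contentId wins over the id in the URL.
article_video_id = search_regex(pattern, article_page, 'video id', default=url_id)
# On a page without contentId the id from the URL is kept.
plain_video_id = search_regex(pattern, video_page, 'video id', default=url_id)
```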
--- a/youtube_dl/extractor/normalboots.py
+++ b/youtube_dl/extractor/normalboots.py
@ -1,8 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor

 from ..utils import (
@ -11,7 +9,7 @@ from ..utils import (


 class NormalbootsIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<videoid>[0-9a-z-]*)/?$'
+    _VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<id>[0-9a-z-]*)/?$'
    _TEST = {
        'url': 'http://normalboots.com/video/home-alone-games-jontron/',
        'md5': '8bf6de238915dd501105b44ef5f1e0f6',
@ -30,19 +28,22 @@ class NormalbootsIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('videoid')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        video_uploader = self._html_search_regex(r'Posted\sby\s<a\shref="[A-Za-z0-9/]*">(?P<uploader>[A-Za-z]*)\s</a>',
-                                                 webpage, 'uploader')
-        raw_upload_date = self._html_search_regex('<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
-                                                  webpage, 'date')
-        video_upload_date = unified_strdate(raw_upload_date)

-        player_url = self._html_search_regex(r'<iframe\swidth="[0-9]+"\sheight="[0-9]+"\ssrc="(?P<url>[\S]+)"', webpage, 'url')
+        video_uploader = self._html_search_regex(
+            r'Posted\sby\s<a\shref="[A-Za-z0-9/]*">(?P<uploader>[A-Za-z]*)\s</a>',
+            webpage, 'uploader', fatal=False)
+        video_upload_date = unified_strdate(self._html_search_regex(
+            r'<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
+            webpage, 'date', fatal=False))
+
+        player_url = self._html_search_regex(
+            r'<iframe\swidth="[0-9]+"\sheight="[0-9]+"\ssrc="(?P<url>[\S]+)"',
+            webpage, 'player url')
        player_page = self._download_webpage(player_url, video_id)
-        video_url = self._html_search_regex(r"file:\s'(?P<file>[^']+\.mp4)'", player_page, 'file')
+        video_url = self._html_search_regex(
+            r"file:\s'(?P<file>[^']+\.mp4)'", player_page, 'file')

        return {
            'id': video_id,
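The normalboots change above replaces a manual `re.match` with `self._match_id(url)` and renames the capture group to `id`. A rough sketch of what such a helper amounts to is shown below; `match_id` here is a hypothetical free function written for illustration, not the library method.

```python
import re

VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<id>[0-9a-z-]*)/?$'


def match_id(valid_url, url):
    """Hypothetical stand-in: return the named group 'id' from the URL pattern."""
    m = re.match(valid_url, url)
    if m is None:
        raise ValueError('URL does not match: %s' % url)
    return m.group('id')


video_id = match_id(
    VALID_URL, 'http://normalboots.com/video/home-alone-games-jontron/')
```

Naming the group `id` in every extractor is what lets one shared helper replace the per-extractor boilerplate.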
--- a/youtube_dl/extractor/ntvde.py
+++ b/youtube_dl/extractor/ntvde.py
@ -0,0 +1,68 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    parse_duration,
+)
+
+
+class NTVDeIE(InfoExtractor):
+    IE_NAME = 'n-tv.de'
+    _VALID_URL = r'https?://(?:www\.)?n-tv\.de/mediathek/videos/[^/?#]+/[^/?#]+-article(?P<id>.+)\.html'
+
+    _TESTS = [{
+        'url': 'http://www.n-tv.de/mediathek/videos/panorama/Schnee-und-Glaette-fuehren-zu-zahlreichen-Unfaellen-und-Staus-article14438086.html',
+        'md5': '6ef2514d4b1e8e03ca24b49e2f167153',
+        'info_dict': {
+            'id': '14438086',
+            'ext': 'mp4',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'title': 'Schnee und Glätte führen zu zahlreichen Unfällen und Staus',
+            'alt_title': 'Winterchaos auf deutschen Straßen',
+            'description': 'Schnee und Glätte sorgen deutschlandweit für einen chaotischen Start in die Woche: Auf den Straßen kommt es zu kilometerlangen Staus und Dutzenden Glätteunfällen. In Düsseldorf und München wirbelt der Schnee zudem den Flugplan durcheinander. Dutzende Flüge landen zu spät, einige fallen ganz aus.',
+            'duration': 4020,
+            'timestamp': 1422892797,
+            'upload_date': '20150202',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        info = self._parse_json(self._search_regex(
+            r'(?s)ntv\.pageInfo\.article =\s(\{.*?\});', webpage, 'info'),
+            video_id, transform_source=js_to_json)
+        timestamp = int_or_none(info.get('publishedDateAsUnixTimeStamp'))
+        vdata = self._parse_json(self._search_regex(
+            r'(?s)\$\(\s*"\#player"\s*\)\s*\.data\(\s*"player",\s*(\{.*?\})\);',
+            webpage, 'player data'),
+            video_id, transform_source=js_to_json)
+        duration = parse_duration(vdata.get('duration'))
+        formats = [{
+            'format_id': 'flash',
+            'url': 'rtmp://fms.n-tv.de/' + vdata['video'],
+        }, {
+            'format_id': 'mobile',
+            'url': 'http://video.n-tv.de' + vdata['videoMp4'],
+            'tbr': 400,  # estimation
+        }]
+        m3u8_url = 'http://video.n-tv.de' + vdata['videoM3u8']
+        formats.extend(self._extract_m3u8_formats(
+            m3u8_url, video_id, ext='mp4',
+            entry_protocol='m3u8_native', preference=0))
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': info['headline'],
+            'description': info.get('intro'),
+            'alt_title': info.get('kicker'),
+            'timestamp': timestamp,
+            'thumbnail': vdata.get('html5VideoPoster'),
+            'duration': duration,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/ntvru.py
+++ b/youtube_dl/extractor/ntvru.py
@ -1,15 +1,14 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..utils import (
    unescapeHTML
 )


-class NTVIE(InfoExtractor):
+class NTVRuIE(InfoExtractor):
+    IE_NAME = 'ntv.ru'
    _VALID_URL = r'http://(?:www\.)?ntv\.ru/(?P<id>.+)'

    _TESTS = [
@ -92,9 +91,7 @@ class NTVIE(InfoExtractor):
    ]

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        page = self._download_webpage(url, video_id)

        video_id = self._html_search_regex(self._VIDEO_ID_REGEXES, page, 'video id')
--- a/youtube_dl/extractor/vevo.py
+++ b/youtube_dl/extractor/vevo.py
@ -9,6 +9,7 @@ from ..compat import (
 )
 from ..utils import (
    ExtractorError,
+    int_or_none,
 )


@ -192,9 +193,29 @@ class VevoIE(InfoExtractor):
        # Download via HLS API
        formats.extend(self._download_api_formats(video_id))

+        # Download SMIL
+        smil_blocks = sorted((
+            f for f in video_info['videoVersions']
+            if f['sourceType'] == 13),
+            key=lambda f: f['version'])
+        smil_url = '%s/Video/V2/VFILE/%s/%sr.smil' % (
+            self._SMIL_BASE_URL, video_id, video_id.lower())
+        if smil_blocks:
+            smil_url_m = self._search_regex(
+                r'url="([^"]+)"', smil_blocks[-1]['data'], 'SMIL URL',
+                default=None)
+            if smil_url_m is not None:
+                smil_url = smil_url_m
+        if smil_url:
+            smil_xml = self._download_webpage(
+                smil_url, video_id, 'Downloading SMIL info', fatal=False)
+            if smil_xml:
+                formats.extend(self._formats_from_smil(smil_xml))
+
        self._sort_formats(formats)
-        timestamp_ms = int(self._search_regex(
-            r'/Date\((\d+)\)/', video_info['launchDate'], 'launch date'))
+        timestamp_ms = int_or_none(self._search_regex(
+            r'/Date\((\d+)\)/',
+            video_info['launchDate'], 'launch date', fatal=False))

        return {
            'id': video_id,
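The SMIL selection in the vevo hunk above can be read as: take the highest-versioned block with `sourceType` 13, use the URL embedded in its data if there is one, otherwise keep the constructed fallback URL. A minimal sketch under those assumptions, with a hypothetical function name and invented sample data:

```python
import re


def pick_smil_url(video_versions, fallback_url):
    """Hypothetical helper mirroring the selection logic in the diff."""
    smil_blocks = sorted(
        (f for f in video_versions if f['sourceType'] == 13),
        key=lambda f: f['version'])
    if smil_blocks:
        # Only the newest block is inspected.
        m = re.search(r'url="([^"]+)"', smil_blocks[-1]['data'])
        if m:
            return m.group(1)
    return fallback_url


# Invented sample data for illustration.
versions = [
    {'sourceType': 13, 'version': 1,
     'data': '<rendition url="http://example.com/old.smil"/>'},
    {'sourceType': 2, 'version': 9, 'data': 'ignored'},
    {'sourceType': 13, 'version': 3,
     'data': '<rendition url="http://example.com/new.smil"/>'},
]
fallback = 'http://example.com/fallback.smil'
chosen = pick_smil_url(versions, fallback)
chosen_without_blocks = pick_smil_url([], fallback)
```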
--- a/youtube_dl/extractor/wsj.py
+++ b/youtube_dl/extractor/wsj.py
@ -0,0 +1,89 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    unified_strdate,
+)
+
+
+class WSJIE(InfoExtractor):
+    _VALID_URL = r'https?://video-api\.wsj\.com/api-video/player/iframe\.html\?guid=(?P<id>[a-zA-Z0-9-]+)'
+    IE_DESC = 'Wall Street Journal'
+    _TEST = {
+        'url': 'http://video-api.wsj.com/api-video/player/iframe.html?guid=1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
+        'md5': '9747d7a6ebc2f4df64b981e1dde9efa9',
+        'info_dict': {
+            'id': '1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
+            'ext': 'mp4',
+            'upload_date': '20150202',
+            'uploader_id': 'bbright',
+            'creator': 'bbright',
+            'categories': list,  # a long list
+            'duration': 90,
+            'title': 'Bills Coach Rex Ryan Updates His Old Jets Tattoo',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        bitrates = [128, 174, 264, 320, 464, 664, 1264]
+        api_url = (
+            'http://video-api.wsj.com/api-video/find_all_videos.asp?'
+            'type=guid&count=1&query=%s&'
+            'fields=hls,adZone,thumbnailList,guid,state,secondsUntilStartTime,'
+            'author,description,name,linkURL,videoStillURL,duration,videoURL,'
+            'adCategory,catastrophic,linkShortURL,doctypeID,youtubeID,'
+            'titletag,rssURL,wsj-section,wsj-subsection,allthingsd-section,'
+            'allthingsd-subsection,sm-section,sm-subsection,provider,'
+            'formattedCreationDate,keywords,keywordsOmniture,column,editor,'
+            'emailURL,emailPartnerID,showName,omnitureProgramName,'
+            'omnitureVideoFormat,linkRelativeURL,touchCastID,'
+            'omniturePublishDate,%s') % (
+                video_id, ','.join('video%dkMP4Url' % br for br in bitrates))
+        info = self._download_json(api_url, video_id)['items'][0]
+
+        # Thumbnails are conveniently in the correct format already
+        thumbnails = info.get('thumbnailList')
+        creator = info.get('author')
+        uploader_id = info.get('editor')
+        categories = info.get('keywords')
+        duration = int_or_none(info.get('duration'))
+        upload_date = unified_strdate(
+            info.get('formattedCreationDate'), day_first=False)
+        title = info.get('name', info.get('titletag'))
+
+        formats = [{
+            'format_id': 'f4m',
+            'format_note': 'f4m (meta URL)',
+            'url': info['videoURL'],
+        }]
+        if info.get('hls'):
+            formats.extend(self._extract_m3u8_formats(
+                info['hls'], video_id, ext='mp4',
+                preference=0, entry_protocol='m3u8_native'))
+        for br in bitrates:
+            field = 'video%dkMP4Url' % br
+            if info.get(field):
+                formats.append({
+                    'format_id': 'mp4-%d' % br,
+                    'container': 'mp4',
+                    'tbr': br,
+                    'url': info[field],
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'thumbnails': thumbnails,
+            'creator': creator,
+            'uploader_id': uploader_id,
+            'duration': duration,
+            'upload_date': upload_date,
+            'title': title,
+            'formats': formats,
+            'categories': categories,
+        }
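The wsj extractor above derives both the requested API fields and the MP4 formats from one list of bitrates. A small sketch of that idea follows; the helper names and the sample `info` dictionary are invented for illustration.

```python
BITRATES = [128, 174, 264, 320, 464, 664, 1264]


def mp4_field_names(bitrates):
    """Field names requested from the API, one per bitrate."""
    return ['video%dkMP4Url' % br for br in bitrates]


def mp4_formats(info, bitrates):
    """Build a format entry for every bitrate whose URL field is non-empty."""
    formats = []
    for br in bitrates:
        url = info.get('video%dkMP4Url' % br)
        if url:
            formats.append({
                'format_id': 'mp4-%d' % br,
                'container': 'mp4',
                'tbr': br,
                'url': url,
            })
    return formats


# Invented API response fragment.
sample_info = {
    'video128kMP4Url': 'http://example.com/128.mp4',
    'video264kMP4Url': '',
    'video664kMP4Url': 'http://example.com/664.mp4',
}
fields_param = ','.join(mp4_field_names(BITRATES))
formats = mp4_formats(sample_info, BITRATES)
```

Keeping the bitrates in a single list means the query string and the format loop cannot drift apart.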
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@ -698,10 +698,9 @@ def parseOpts(overrideArguments=None):
    postproc.add_option(
        '--fixup',
        metavar='POLICY', dest='fixup', default='detect_or_warn',
-        help='(experimental) Automatically correct known faults of the file. '
+        help='Automatically correct known faults of the file. '
             'One of never (do nothing), warn (only emit a warning), '
-             'detect_or_warn(check whether we can do anything about it, warn '
-             'otherwise')
+             'detect_or_warn (the default; fix file if we can, warn otherwise)')
    postproc.add_option(
        '--prefer-avconv',
        action='store_false', dest='prefer_ffmpeg',
--- a/youtube_dl/postprocessor/ffmpeg.py
+++ b/youtube_dl/postprocessor/ffmpeg.py
@ -511,8 +511,9 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
            metadata['artist'] = info['uploader_id']
        if info.get('description') is not None:
            metadata['description'] = info['description']
+            metadata['comment'] = info['description']
        if info.get('webpage_url') is not None:
-            metadata['comment'] = info['webpage_url']
+            metadata['purl'] = info['webpage_url']

        if not metadata:
            self._downloader.to_screen('[ffmpeg] There isn\'t any metadata to add')
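After the ffmpeg hunk above, the description is written to both the `description` and `comment` tags, and the page URL moves from `comment` to `purl`. A minimal sketch of just that mapping, using a hypothetical function name and an invented `info` dictionary:

```python
def description_and_url_tags(info):
    """Hypothetical helper covering only the two fields touched by the hunk."""
    metadata = {}
    if info.get('description') is not None:
        metadata['description'] = info['description']
        metadata['comment'] = info['description']
    if info.get('webpage_url') is not None:
        metadata['purl'] = info['webpage_url']
    return metadata


tags = description_and_url_tags({
    'description': 'An example clip',
    'webpage_url': 'http://example.com/watch/1',
})
tags_without_url = description_and_url_tags({'description': 'Only text'})
```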
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@ -701,7 +701,7 @@ def unified_strdate(date_str, day_first=True):
    # %z (UTC offset) is only supported in python>=3.2
    date_str = re.sub(r' ?(\+|-)[0-9]{2}:?[0-9]{2}$', '', date_str)
    # Remove AM/PM + timezone
-    date_str = re.sub(r'(?i)\s*(?:AM|PM)\s+[A-Z]+', '', date_str)
+    date_str = re.sub(r'(?i)\s*(?:AM|PM)(?:\s+[A-Z]+)?', '', date_str)

    format_expressions = [
        '%d %B %Y',
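The effect of the AM/PM change in the unified_strdate hunk above can be checked directly with the two patterns from the diff. The sample date strings below are invented; they are not taken from the WSJ API.

```python
import re

OLD_PATTERN = r'(?i)\s*(?:AM|PM)\s+[A-Z]+'
NEW_PATTERN = r'(?i)\s*(?:AM|PM)(?:\s+[A-Z]+)?'

with_tz = '2/2/2015 6:47:23 PM EST'
without_tz = '2/2/2015 6:47:23 PM'

# Both patterns strip the marker when a timezone name follows it.
old_with_tz = re.sub(OLD_PATTERN, '', with_tz)
new_with_tz = re.sub(NEW_PATTERN, '', with_tz)

# Only the new pattern strips a bare AM/PM marker at the end.
old_without_tz = re.sub(OLD_PATTERN, '', without_tz)
new_without_tz = re.sub(NEW_PATTERN, '', without_tz)
```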
@ -1275,7 +1275,10 @@ def parse_duration(s):
            (?P<only_hours>[0-9.]+)\s*(?:hours?)|

            (?:
-                (?:(?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*)?
+                (?:
+                    (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
+                    (?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*
+                )?
                (?P<mins>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*
            )?
            (?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*(?:s|secs?|seconds?)?
@ -1293,6 +1296,8 @@ def parse_duration(s):
        res += int(m.group('mins')) * 60
    if m.group('hours'):
        res += int(m.group('hours')) * 60 * 60
+    if m.group('days'):
+        res += int(m.group('days')) * 24 * 60 * 60
    if m.group('ms'):
        res += float(m.group('ms'))
    return res
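The days support added to parse_duration above can be illustrated with a reduced version of the pattern. This sketch keeps only the branch shown in the hunk (the other alternatives of the real function are left out), so it is an approximation rather than the function itself.

```python
import re


def parse_duration_sketch(s):
    """Reduced duration parser: [[[days:]hours:]mins:]secs with optional units."""
    m = re.match(
        r'''\s*
            (?:
                (?:
                    (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
                    (?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*
                )?
                (?P<mins>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*
            )?
            (?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*(?:s|secs?|seconds?)?
            $''', s, re.I | re.X)
    if not m:
        return None
    res = int(m.group('secs'))
    if m.group('mins'):
        res += int(m.group('mins')) * 60
    if m.group('hours'):
        res += int(m.group('hours')) * 60 * 60
    if m.group('days'):
        res += int(m.group('days')) * 24 * 60 * 60
    if m.group('ms'):
        res += float(m.group('ms'))
    return res
```

Because the days group is nested inside the optional hours group, a days value is only recognised when hours and minutes are present as well.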
@ -1543,7 +1548,7 @@ def js_to_json(code):
    res = re.sub(r'''(?x)
        "(?:[^"\\]*(?:\\\\|\\")?)*"|
        '(?:[^'\\]*(?:\\\\|\\')?)*'|
-        [a-zA-Z_][a-zA-Z_0-9]*
+        [a-zA-Z_][.a-zA-Z_0-9]*
        ''', fix_kv, code)
    res = re.sub(r',(\s*\])', lambda m: m.group(1), res)
    return res
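The identifier pattern change in js_to_json above allows dots after the first character, so a dotted name is treated as one token instead of several. A quick comparison of the two patterns from the diff follows; the reading that this serves dotted names such as `ntv.pageInfo.article` is an assumption based on the n-tv.de extractor in the same release.

```python
import re

OLD_IDENT = r'[a-zA-Z_][a-zA-Z_0-9]*'
NEW_IDENT = r'[a-zA-Z_][.a-zA-Z_0-9]*'

code = 'ntv.pageInfo.article'

# The old pattern splits the dotted name at every dot.
old_tokens = re.findall(OLD_IDENT, code)
# The new pattern captures the dotted name as a single token.
new_tokens = re.findall(NEW_IDENT, code)
```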
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2015.02.02'
+__version__ = '2015.02.03.1'
Author	SHA1	Message	Date
Philipp Hagemeister	cd7342755f	release 2015.02.03.1	2015-02-03 10:59:27 +01:00
Philipp Hagemeister	9bb8e0a3f9	[wsj] Add new extractor (Fixes #4854)	2015-02-03 10:58:28 +01:00
Philipp Hagemeister	1a6373ef39	[sort_formats] Prefer bitrate over video size: 720p @ 1000KB/s looks way better than 1080p @ 500KB/s	2015-02-03 10:53:07 +01:00
Philipp Hagemeister	f6c24009be	[YoutubeDL] Calculate thumbnail IDs automatically	2015-02-03 10:52:22 +01:00
Philipp Hagemeister	d862042301	[aftonbladet] Modernize	2015-02-03 10:18:32 +01:00
Philipp Hagemeister	23d9ded655	[franceculture] Rewrite for new HTML scheme (Fixes #4853)	2015-02-03 10:17:13 +01:00
Philipp Hagemeister	4c1a017e69	release 2015.02.03	2015-02-03 00:22:52 +01:00
Philipp Hagemeister	ee623d9247	[descripts/release] Regenerate auxiliary documentation on build as well	2015-02-03 00:22:17 +01:00
Philipp Hagemeister	330537d08a	[README] typo	2015-02-03 00:20:57 +01:00
Philipp Hagemeister	2cf0ecac7b	[ffmpeg] --add-metadata: Set comment and purl fields (Fixes #4847)	2015-02-03 00:16:45 +01:00
Philipp Hagemeister	d200b11c7e	[Makefile] Simplify clean/cleanall	2015-02-03 00:14:42 +01:00
Philipp Hagemeister	d0eca21021	release 2015.02.02.5	2015-02-02 23:47:19 +01:00
Philipp Hagemeister	c1147c05e1	[brightcove] Fix up more generically invalid XML (Fixes #4849)	2015-02-02 23:47:14 +01:00
Philipp Hagemeister	55898ad2cf	release 2015.02.02.4	2015-02-02 23:39:03 +01:00
Philipp Hagemeister	a465808592	Merge branch 'master' of github.com:rg3/youtube-dl	2015-02-02 23:38:54 +01:00
Philipp Hagemeister	5c4862bad4	[normalboots] Remove unused import	2015-02-02 23:38:45 +01:00
Philipp Hagemeister	995029a142	[nerdist] Add new extractor (Fixes #4851)	2015-02-02 23:38:35 +01:00
Jaime Marquínez Ferrándiz	a57b562cff	[nfl] Add support for articles pages (fixes #4848)	2015-02-02 23:17:00 +01:00
Philipp Hagemeister	531572578e	[normalboots] Modernize	2015-02-02 23:04:39 +01:00
Philipp Hagemeister	3a4cca687f	release 2015.02.02.3	2015-02-02 22:56:15 +01:00
Philipp Hagemeister	7d3d06a16c	[vevo] Restore SMIL support (#3656)	2015-02-02 22:48:12 +01:00
Philipp Hagemeister	c21b1fbeeb	release 2015.02.02.2	2015-02-02 21:58:58 +01:00
Philipp Hagemeister	f920ce295e	[ntvru] Remove unused import	2015-02-02 21:58:17 +01:00
Philipp Hagemeister	7a7bd19c45	[n-tv.de] Use native m3u8 as best format	2015-02-02 21:57:48 +01:00
Philipp Hagemeister	8f4b58d70e	[ntvde] Add new extractor (Fixes #4850)	2015-02-02 21:48:54 +01:00
Philipp Hagemeister	3fd45e03bf	[ntvru] Rename from NTV to clarify the difference between n-tv.de and ntv.ru	2015-02-02 20:43:02 +01:00
Philipp Hagemeister	869b4aeff4	release 2015.02.02.1	2015-02-02 20:35:04 +01:00
Philipp Hagemeister	cc9ca3ba6e	[downloader/external] Simplify source_address: '' might actually be passed in, so check for None.	2015-02-02 20:33:25 +01:00
Philipp Hagemeister	ea71034bd3	Merge remote-tracking branch 'origin/master' (Conflicts: youtube_dl/downloader/external.py)	2015-02-02 20:32:07 +01:00
Philipp Hagemeister	9fffd0469f	[options] Mark --fixup as non-experimental and correct its help	2015-02-02 20:28:18 +01:00
Sergey M․	ae7773942e	[downloader/external] Simplify	2015-02-02 21:51:38 +06:00
Sergey M․	469a64cebf	[downloader/external] Simplify	2015-02-02 21:40:52 +06:00
Sergey M.	aae3fdcfae	Merge pull request #4845 from vijayanandnandam/master: Passing source address option to external downloaders	2015-02-02 21:38:22 +06:00
vijayanand nandam	6a66904f8e	passing source address option to external downloaders	2015-02-02 20:51:40 +05:30
Sergey M․	78271e3319	[drtv] Extract material id (Closes #4814)	2015-02-02 21:11:25 +06:00
Sergey M․	92bf0bcdf8	[historicfilms] Add extractor (Closes #4825)	2015-02-02 20:52:37 +06:00
Philipp Hagemeister	1283204917	[http] PEP8 (#4831)	2015-02-02 12:05:39 +01:00
Philipp Hagemeister	6789defea9	Merge pull request #4831 from light94/master: Handling Connection Reset by Peer Error	2015-02-02 12:03:28 +01:00
light94	e77d2975af	Handling Connection Reset by Peer Error	2015-02-01 00:10:58 +05:30