Compare commits
48 Commits
2015.02.02
...
2015.02.06
Author | SHA1 | Date | |
---|---|---|---|
c831973366 | |||
1a2548d9e9 | |||
3900eec27c | |||
a02d212638 | |||
9c91a8fa70 | |||
41469f335e | |||
67ce4f8820 | |||
bc63d56cca | |||
c893d70805 | |||
3ee6e02564 | |||
e3aaace400 | |||
300753a069 | |||
f13b88c616 | |||
60ca389c64 | |||
1b0f3919c1 | |||
6a348cf7d5 | |||
9e91449c8d | |||
25e5ebf382 | |||
7dfc356625 | |||
58ba6c0160 | |||
f076b63821 | |||
12f0454cd6 | |||
cd7342755f | |||
9bb8e0a3f9 | |||
1a6373ef39 | |||
f6c24009be | |||
d862042301 | |||
23d9ded655 | |||
4c1a017e69 | |||
ee623d9247 | |||
330537d08a | |||
2cf0ecac7b | |||
d200b11c7e | |||
d0eca21021 | |||
c1147c05e1 | |||
55898ad2cf | |||
a465808592 | |||
5c4862bad4 | |||
995029a142 | |||
a57b562cff | |||
531572578e | |||
3a4cca687f | |||
7d3d06a16c | |||
c21b1fbeeb | |||
f920ce295e | |||
7a7bd19c45 | |||
8f4b58d70e | |||
3fd45e03bf |
1
AUTHORS
@ -108,3 +108,4 @@ Enam Mijbah Noor
|
||||
David Luhmer
|
||||
Shaya Goldberg
|
||||
Paul Hartmann
|
||||
Frans de Jonge
|
||||
|
@ -1,4 +1,6 @@
|
||||
Please include the full output of the command when run with `--verbose`. The output (including the first lines) contain important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
|
||||
**Please include the full output of youtube-dl when run with `-v`**.
|
||||
|
||||
The output (including the first lines) contain important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
|
||||
|
||||
Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
|
||||
|
||||
@ -122,7 +124,7 @@ If you want to add support for a new site, you can follow this quick list (assum
|
||||
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
|
||||
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
|
||||
7. Have a look at [`youtube_dl/common/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
|
||||
8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
|
||||
8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
|
||||
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
|
||||
|
||||
$ git add youtube_dl/extractor/__init__.py
|
||||
|
5
Makefile
@ -1,10 +1,7 @@
|
||||
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
|
||||
|
||||
clean:
|
||||
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json CONTRIBUTING.md.tmp
|
||||
|
||||
cleanall: clean
|
||||
rm -f youtube-dl youtube-dl.exe
|
||||
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
|
||||
|
||||
PREFIX ?= /usr/local
|
||||
BINDIR ?= $(PREFIX)/bin
|
||||
|
10
README.md
@ -532,6 +532,14 @@ Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the opt
|
||||
youtube-dl -- -wNyEUrxzFU
|
||||
youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU"
|
||||
|
||||
### Can you add support for this anime video site, or site which shows current movies for free?
|
||||
|
||||
As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion to youtube-dl.
|
||||
|
||||
A note on the service that they don't host the infringing content, but just link to those who do, is evidence that the service should **not** be included into youtube-dl. The same goes for any DMCA note when the whole front page of the service is filled with videos they are not allowed to distribute. A "fair use" note is equally unconvincing if the service shows copyright-protected videos in full without authorization.
|
||||
|
||||
Support requests for services that **do** purchase the rights to distribute their content are perfectly fine though. If in doubt, you can simply include a source that mentions the legitimate purchase of content.
|
||||
|
||||
### How can I detect whether a given URL is supported by youtube-dl?
|
||||
|
||||
For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug.
|
||||
@ -728,7 +736,7 @@ In particular, every site support request issue should only pertain to services
|
||||
|
||||
### Is anyone going to need the feature?
|
||||
|
||||
Only post features that you (or an incapicated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
|
||||
Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
|
||||
|
||||
### Is your question about youtube-dl?
|
||||
|
||||
|
@ -35,7 +35,7 @@ if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dl: $us
|
||||
if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi
|
||||
|
||||
/bin/echo -e "\n### First of all, testing..."
|
||||
make cleanall
|
||||
make clean
|
||||
if $skip_tests ; then
|
||||
echo 'SKIPPING TESTS'
|
||||
else
|
||||
@ -45,9 +45,9 @@ fi
|
||||
/bin/echo -e "\n### Changing version in version.py..."
|
||||
sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
|
||||
|
||||
/bin/echo -e "\n### Committing README.md and youtube_dl/version.py..."
|
||||
make README.md
|
||||
git add README.md youtube_dl/version.py
|
||||
/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
|
||||
make README.md CONTRIBUTING.md supportedsites
|
||||
git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
|
||||
git commit -m "release $version"
|
||||
|
||||
/bin/echo -e "\n### Now tagging, signing and pushing..."
|
||||
|
@ -9,6 +9,7 @@
|
||||
- **8tracks**
|
||||
- **9gag**
|
||||
- **abc.net.au**
|
||||
- **Abc7News**
|
||||
- **AcademicEarth:Course**
|
||||
- **AddAnime**
|
||||
- **AdobeTV**
|
||||
@ -16,9 +17,12 @@
|
||||
- **Aftonbladet**
|
||||
- **AlJazeera**
|
||||
- **Allocine**
|
||||
- **AlphaPorno**
|
||||
- **anitube.se**
|
||||
- **AnySex**
|
||||
- **Aparat**
|
||||
- **AppleDailyAnimationNews**
|
||||
- **AppleDailyRealtimeNews**
|
||||
- **AppleTrailers**
|
||||
- **archive.org**: archive.org videos
|
||||
- **ARD**
|
||||
@ -30,8 +34,10 @@
|
||||
- **arte.tv:ddc**
|
||||
- **arte.tv:embed**
|
||||
- **arte.tv:future**
|
||||
- **AtresPlayer**
|
||||
- **ATTTechChannel**
|
||||
- **audiomack**
|
||||
- **AUEngine**
|
||||
- **audiomack:album**
|
||||
- **Azubu**
|
||||
- **bambuser**
|
||||
- **bambuser:channel**
|
||||
@ -71,8 +77,10 @@
|
||||
- **cmt.com**
|
||||
- **CNET**
|
||||
- **CNN**
|
||||
- **CNNArticle**
|
||||
- **CNNBlogs**
|
||||
- **CollegeHumor**
|
||||
- **CollegeRama**
|
||||
- **ComCarCoff**
|
||||
- **ComedyCentral**
|
||||
- **ComedyCentralShows**: The Daily Show / The Colbert Report
|
||||
@ -82,23 +90,27 @@
|
||||
- **Crunchyroll**
|
||||
- **crunchyroll:playlist**
|
||||
- **CSpan**: C-SPAN
|
||||
- **CtsNews**
|
||||
- **culturebox.francetvinfo.fr**
|
||||
- **dailymotion**
|
||||
- **dailymotion:playlist**
|
||||
- **dailymotion:user**
|
||||
- **daum.net**
|
||||
- **DBTV**
|
||||
- **DctpTv**
|
||||
- **DeezerPlaylist**
|
||||
- **defense.gouv.fr**
|
||||
- **Discovery**
|
||||
- **divxstage**: DivxStage
|
||||
- **Dotsub**
|
||||
- **DRBonanza**
|
||||
- **Dropbox**
|
||||
- **DrTuber**
|
||||
- **DRTV**
|
||||
- **Dump**
|
||||
- **dvtv**: http://video.aktualne.cz/
|
||||
- **EbaumsWorld**
|
||||
- **EchoMsk**
|
||||
- **eHow**
|
||||
- **Einthusan**
|
||||
- **eitb.tv**
|
||||
@ -108,6 +120,7 @@
|
||||
- **EMPFlix**
|
||||
- **Engadget**
|
||||
- **Eporner**
|
||||
- **EroProfile**
|
||||
- **Escapist**
|
||||
- **EveryonesMixtape**
|
||||
- **exfm**: ex.fm
|
||||
@ -143,6 +156,7 @@
|
||||
- **GDCVault**
|
||||
- **generic**: Generic downloader that works on some sites
|
||||
- **GiantBomb**
|
||||
- **Giga**
|
||||
- **Glide**: Glide mobile video messages (glide.me)
|
||||
- **Globo**
|
||||
- **GodTube**
|
||||
@ -153,9 +167,14 @@
|
||||
- **Grooveshark**
|
||||
- **Groupon**
|
||||
- **Hark**
|
||||
- **HearThisAt**
|
||||
- **Heise**
|
||||
- **HellPorno**
|
||||
- **Helsinki**: helsinki.fi
|
||||
- **HentaiStigma**
|
||||
- **HistoricFilms**
|
||||
- **hitbox**
|
||||
- **hitbox:live**
|
||||
- **HornBunny**
|
||||
- **HostingBulk**
|
||||
- **HotNewHipHop**
|
||||
@ -182,6 +201,7 @@
|
||||
- **jpopsuki.tv**
|
||||
- **Jukebox**
|
||||
- **Kankan**
|
||||
- **Karaoketv**
|
||||
- **keek**
|
||||
- **KeezMovies**
|
||||
- **KhanAcademy**
|
||||
@ -195,6 +215,7 @@
|
||||
- **LiveLeak**
|
||||
- **livestream**
|
||||
- **livestream:original**
|
||||
- **LnkGo**
|
||||
- **lrt.lt**
|
||||
- **lynda**: lynda.com videos
|
||||
- **lynda:course**: lynda.com online courses
|
||||
@ -235,6 +256,7 @@
|
||||
- **MySpass**
|
||||
- **myvideo**
|
||||
- **MyVidster**
|
||||
- **n-tv.de**
|
||||
- **Naver**
|
||||
- **NBA**
|
||||
- **NBC**
|
||||
@ -242,11 +264,16 @@
|
||||
- **ndr**: NDR.de - Mediathek
|
||||
- **NDTV**
|
||||
- **NerdCubedFeed**
|
||||
- **Nerdist**
|
||||
- **Netzkino**
|
||||
- **Newgrounds**
|
||||
- **Newstube**
|
||||
- **NextMedia**
|
||||
- **NextMediaActionNews**
|
||||
- **nfb**: National Film Board of Canada
|
||||
- **nfl.com**
|
||||
- **nhl.com**
|
||||
- **nhl.com:news**: NHL news
|
||||
- **nhl.com:videocenter**: NHL videocenter category
|
||||
- **niconico**: ニコニコ動画
|
||||
- **NiconicoPlaylist**
|
||||
@ -257,18 +284,20 @@
|
||||
- **Nowness**
|
||||
- **nowvideo**: NowVideo
|
||||
- **npo.nl**
|
||||
- **npo.nl:live**
|
||||
- **NRK**
|
||||
- **NRKTV**
|
||||
- **NTV**
|
||||
- **ntv.ru**
|
||||
- **Nuvid**
|
||||
- **NYTimes**
|
||||
- **ocw.mit.edu**
|
||||
- **OktoberfestTV**
|
||||
- **on.aol.com**
|
||||
- **Ooyala**
|
||||
- **OpenFilm**
|
||||
- **orf:fm4**: radio FM4
|
||||
- **orf:oe1**: Radio Österreich 1
|
||||
- **orf:tvthek**: ORF TVthek
|
||||
- **ORFFM4**: radio FM4
|
||||
- **parliamentlive.tv**: UK parliament videos
|
||||
- **Patreon**
|
||||
- **PBS**
|
||||
@ -290,6 +319,7 @@
|
||||
- **Pyvideo**
|
||||
- **QuickVid**
|
||||
- **radio.de**
|
||||
- **radiobremen**
|
||||
- **radiofrance**
|
||||
- **Rai**
|
||||
- **RBMARadio**
|
||||
@ -300,6 +330,8 @@
|
||||
- **RottenTomatoes**
|
||||
- **Roxwel**
|
||||
- **RTBF**
|
||||
- **Rte**
|
||||
- **RTL2**
|
||||
- **RTLnow**
|
||||
- **rtlxl.nl**
|
||||
- **RTP**
|
||||
@ -309,6 +341,7 @@
|
||||
- **RUHD**
|
||||
- **rutube**: Rutube videos
|
||||
- **rutube:channel**: Rutube channels
|
||||
- **rutube:embed**: Rutube embedded videos
|
||||
- **rutube:movie**: Rutube movies
|
||||
- **rutube:person**: Rutube person videos
|
||||
- **RUTV**: RUTV.RU
|
||||
@ -351,11 +384,12 @@
|
||||
- **Sport5**
|
||||
- **SportBox**
|
||||
- **SportDeutschland**
|
||||
- **SRMediathek**: Süddeutscher Rundfunk
|
||||
- **SRMediathek**: Saarländischer Rundfunk
|
||||
- **stanfordoc**: Stanford Open ClassRoom
|
||||
- **Steam**
|
||||
- **streamcloud.eu**
|
||||
- **StreamCZ**
|
||||
- **StreetVoice**
|
||||
- **SunPorno**
|
||||
- **SWRMediathek**
|
||||
- **Syfy**
|
||||
@ -375,7 +409,9 @@
|
||||
- **TeleBruxelles**
|
||||
- **telecinco.es**
|
||||
- **TeleMB**
|
||||
- **TeleTask**
|
||||
- **TenPlay**
|
||||
- **TestTube**
|
||||
- **TF1**
|
||||
- **TheOnion**
|
||||
- **ThePlatform**
|
||||
@ -403,8 +439,16 @@
|
||||
- **tv.dfb.de**
|
||||
- **tvigle**: Интернет-телевидение Tvigle.ru
|
||||
- **tvp.pl**
|
||||
- **tvp.pl:Series**
|
||||
- **TVPlay**: TV3Play and related services
|
||||
- **Twitch**
|
||||
- **Tweakers**
|
||||
- **twitch:bookmarks**
|
||||
- **twitch:chapter**
|
||||
- **twitch:past_broadcasts**
|
||||
- **twitch:profile**
|
||||
- **twitch:stream**
|
||||
- **twitch:video**
|
||||
- **twitch:vod**
|
||||
- **Ubu**
|
||||
- **udemy**
|
||||
- **udemy:course**
|
||||
@ -433,6 +477,8 @@
|
||||
- **videoweed**: VideoWeed
|
||||
- **Vidme**
|
||||
- **Vidzi**
|
||||
- **vier**
|
||||
- **vier:videos**
|
||||
- **viki**
|
||||
- **vimeo**
|
||||
- **vimeo:album**
|
||||
@ -460,11 +506,13 @@
|
||||
- **WDR**
|
||||
- **wdr:mobile**
|
||||
- **WDRMaus**: Sendung mit der Maus
|
||||
- **WebOfStories**
|
||||
- **Weibo**
|
||||
- **Wimp**
|
||||
- **Wistia**
|
||||
- **WorldStarHipHop**
|
||||
- **wrzuta.pl**
|
||||
- **WSJ**: Wall Street Journal
|
||||
- **XBef**
|
||||
- **XboxClips**
|
||||
- **XHamster**
|
||||
@ -472,7 +520,9 @@
|
||||
- **XNXX**
|
||||
- **XTube**
|
||||
- **XTubeUser**: XTube user profile
|
||||
- **Xuite**
|
||||
- **XVideos**
|
||||
- **XXXYMovies**
|
||||
- **Yahoo**: Yahoo screen and movies
|
||||
- **YesJapan**
|
||||
- **Ynet**
|
||||
@ -491,7 +541,6 @@
|
||||
- **youtube:search_url**: YouTube.com search URLs
|
||||
- **youtube:show**: YouTube.com (multi-season) shows
|
||||
- **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
|
||||
- **youtube:toplist**: YouTube.com top lists, "yttoplist:{channel}:{list title}" (Example: "yttoplist:music:Top Tracks")
|
||||
- **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
|
||||
- **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
|
||||
- **ZDF**
|
||||
|
@ -103,6 +103,16 @@ def expect_info_dict(self, got_dict, expected_dict):
|
||||
self.assertTrue(
|
||||
match_rex.match(got),
|
||||
'field %s (value: %r) should match %r' % (info_field, got, match_str))
|
||||
elif isinstance(expected, compat_str) and expected.startswith('startswith:'):
|
||||
got = got_dict.get(info_field)
|
||||
start_str = expected[len('startswith:'):]
|
||||
self.assertTrue(
|
||||
isinstance(got, compat_str),
|
||||
'Expected a %s object, but got %s for field %s' % (
|
||||
compat_str.__name__, type(got).__name__, info_field))
|
||||
self.assertTrue(
|
||||
got.startswith(start_str),
|
||||
'field %s (value: %r) should start with %r' % (info_field, got, start_str))
|
||||
elif isinstance(expected, type):
|
||||
got = got_dict.get(info_field)
|
||||
self.assertTrue(isinstance(got, expected),
|
||||
|
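Note on the hunk above: the new `startswith:` marker lets a test pin only the beginning of a field value. The following is a minimal standalone sketch of that matching rule, assuming plain `str` in place of `compat_str`; `check_field` is a hypothetical helper for illustration, not part of the test suite.

```python
def check_field(got, expected):
    """Return True if `got` satisfies `expected` (illustrative helper only)."""
    prefix = 'startswith:'
    if isinstance(expected, str) and expected.startswith(prefix):
        # Only the text after the marker must match the start of the value,
        # and the value itself has to be a string.
        return isinstance(got, str) and got.startswith(expected[len(prefix):])
    # Anything else falls back to a plain equality check in this sketch.
    return got == expected


print(check_field('Avec :Jean-Baptiste et la suite', 'startswith:Avec :Jean'))
print(check_field(None, 'startswith:Avec'))
```

This is useful for fields such as descriptions, where a site may truncate or extend the tail of the text between runs.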
@ -156,6 +156,9 @@ class TestUtil(unittest.TestCase):
|
||||
self.assertEqual(
|
||||
unified_strdate('11/26/2014 11:30:00 AM PST', day_first=False),
|
||||
'20141126')
|
||||
self.assertEqual(
|
||||
unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
|
||||
'20150202')
|
||||
|
||||
def test_find_xpath_attr(self):
|
||||
testxml = '''<root>
|
||||
@ -238,6 +241,8 @@ class TestUtil(unittest.TestCase):
|
||||
self.assertEqual(parse_duration('5 s'), 5)
|
||||
self.assertEqual(parse_duration('3 min'), 180)
|
||||
self.assertEqual(parse_duration('2.5 hours'), 9000)
|
||||
self.assertEqual(parse_duration('02:03:04'), 7384)
|
||||
self.assertEqual(parse_duration('01:02:03:04'), 93784)
|
||||
|
||||
def test_fix_xml_ampersands(self):
|
||||
self.assertEqual(
|
||||
@ -371,6 +376,16 @@ class TestUtil(unittest.TestCase):
|
||||
on = js_to_json('{"abc": true}')
|
||||
self.assertEqual(json.loads(on), {'abc': True})
|
||||
|
||||
# Ignore JavaScript code as well
|
||||
on = js_to_json('''{
|
||||
"x": 1,
|
||||
y: "a",
|
||||
z: some.code
|
||||
}''')
|
||||
d = json.loads(on)
|
||||
self.assertEqual(d['x'], 1)
|
||||
self.assertEqual(d['y'], 'a')
|
||||
|
||||
def test_clean_html(self):
|
||||
self.assertEqual(clean_html('a:\nb'), 'a: b')
|
||||
self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
|
||||
|
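Note on the `parse_duration` tests above: the two new assertions cover colon-separated values, including a leading days field. A rough sketch of that arithmetic follows, assuming the fields are always integers; `colon_duration` is a hypothetical helper and does not reproduce the real `parse_duration`, which also handles forms such as `'3 min'`.

```python
def colon_duration(text):
    """Convert '[DD:]HH:MM:SS' to seconds (hypothetical helper)."""
    units = [1, 60, 3600, 86400]  # seconds, minutes, hours, days
    parts = [int(p) for p in text.split(':')]
    if not 1 <= len(parts) <= len(units):
        raise ValueError('unsupported duration: %r' % text)
    # Pair the rightmost field with seconds, the next one with minutes, etc.
    return sum(value * unit for value, unit in zip(reversed(parts), units))


print(colon_duration('02:03:04'))
print(colon_duration('01:02:03:04'))
```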
@ -964,9 +964,11 @@ class YoutubeDL(object):
|
||||
thumbnails.sort(key=lambda t: (
|
||||
t.get('preference'), t.get('width'), t.get('height'),
|
||||
t.get('id'), t.get('url')))
|
||||
for t in thumbnails:
|
||||
for i, t in enumerate(thumbnails):
|
||||
if 'width' in t and 'height' in t:
|
||||
t['resolution'] = '%dx%d' % (t['width'], t['height'])
|
||||
if t.get('id') is None:
|
||||
t['id'] = '%d' % i
|
||||
|
||||
if thumbnails and 'thumbnail' not in info_dict:
|
||||
info_dict['thumbnail'] = thumbnails[-1]['url']
|
||||
@ -1074,7 +1076,8 @@ class YoutubeDL(object):
|
||||
else self.params['merge_output_format'])
|
||||
selected_format = {
|
||||
'requested_formats': formats_info,
|
||||
'format': rf,
|
||||
'format': '%s+%s' % (formats_info[0].get('format'),
|
||||
formats_info[1].get('format')),
|
||||
'format_id': '%s+%s' % (formats_info[0].get('format_id'),
|
||||
formats_info[1].get('format_id')),
|
||||
'width': formats_info[0].get('width'),
|
||||
|
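Note on the thumbnail hunk above: switching to `enumerate` gives every thumbnail without an explicit `id` its list position as a fallback. A small standalone sketch of that loop, with made-up example URLs and the preceding sort left out:

```python
def assign_thumbnail_ids(thumbnails):
    """Add 'resolution' and a fallback 'id' to each thumbnail dict, in place."""
    for i, t in enumerate(thumbnails):
        if 'width' in t and 'height' in t:
            t['resolution'] = '%dx%d' % (t['width'], t['height'])
        if t.get('id') is None:
            # No id supplied by the extractor: use the list index as a string.
            t['id'] = '%d' % i
    return thumbnails


thumbs = assign_thumbnail_ids([
    {'url': 'http://example.com/a.jpg'},
    {'url': 'http://example.com/b.jpg', 'id': 'large', 'width': 640, 'height': 360},
])
print([t['id'] for t in thumbs])
```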
@ -285,6 +285,7 @@ from .ndr import NDRIE
|
||||
from .ndtv import NDTVIE
|
||||
from .netzkino import NetzkinoIE
|
||||
from .nerdcubed import NerdCubedFeedIE
|
||||
from .nerdist import NerdistIE
|
||||
from .newgrounds import NewgroundsIE
|
||||
from .newstube import NewstubeIE
|
||||
from .nextmedia import (
|
||||
@ -317,7 +318,8 @@ from .nrk import (
|
||||
NRKIE,
|
||||
NRKTVIE,
|
||||
)
|
||||
from .ntv import NTVIE
|
||||
from .ntvde import NTVDeIE
|
||||
from .ntvru import NTVRuIE
|
||||
from .nytimes import NYTimesIE
|
||||
from .nuvid import NuvidIE
|
||||
from .oktoberfesttv import OktoberfestTVIE
|
||||
@ -473,6 +475,7 @@ from .tutv import TutvIE
|
||||
from .tvigle import TvigleIE
|
||||
from .tvp import TvpIE, TvpSeriesIE
|
||||
from .tvplay import TVPlayIE
|
||||
from .tweakers import TweakersIE
|
||||
from .twentyfourvideo import TwentyFourVideoIE
|
||||
from .twitch import (
|
||||
TwitchVideoIE,
|
||||
@ -552,6 +555,7 @@ from .wimp import WimpIE
|
||||
from .wistia import WistiaIE
|
||||
from .worldstarhiphop import WorldStarHipHopIE
|
||||
from .wrzuta import WrzutaIE
|
||||
from .wsj import WSJIE
|
||||
from .xbef import XBefIE
|
||||
from .xboxclips import XboxClipsIE
|
||||
from .xhamster import XHamsterIE
|
||||
|
@ -1,8 +1,6 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
@ -21,9 +19,7 @@ class AftonbladetIE(InfoExtractor):
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.search(self._VALID_URL, url)
|
||||
|
||||
video_id = mobj.group('video_id')
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
# find internal video meta data
|
||||
|
@ -108,7 +108,7 @@ class BrightcoveIE(InfoExtractor):
|
||||
"""
|
||||
|
||||
# Fix up some stupid HTML, see https://github.com/rg3/youtube-dl/issues/1553
|
||||
object_str = re.sub(r'(<param name="[^"]+" value="[^"]+")>',
|
||||
object_str = re.sub(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>',
|
||||
lambda m: m.group(1) + '/>', object_str)
|
||||
# Fix up some stupid XML, see https://github.com/rg3/youtube-dl/issues/1608
|
||||
object_str = object_str.replace('<--', '<!--')
|
||||
|
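Note on the Brightcove hunk above: the old pattern only closed `<param>` tags with exactly a non-empty `name` followed by a non-empty `value`, while the new one accepts any attributes in any order, including empty values. A short comparison on a made-up snippet (the sample markup is an assumption, not taken from a real page):

```python
import re

OLD = re.compile(r'(<param name="[^"]+" value="[^"]+")>')
NEW = re.compile(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>')

# Hypothetical markup: one empty value, one tag with the attributes swapped.
object_str = '<object><param name="flashVars" value=""><param value="x" name="y"></object>'

old_result = OLD.sub(lambda m: m.group(1) + '/>', object_str)
new_result = NEW.sub(lambda m: m.group(1) + '/>', object_str)

print(old_result)  # unchanged: neither tag fits the old pattern
print(new_result)  # both <param> tags are now self-closed
```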
@ -145,6 +145,7 @@ class InfoExtractor(object):
|
||||
thumbnail: Full URL to a video thumbnail image.
|
||||
description: Full video description.
|
||||
uploader: Full name of the video uploader.
|
||||
creator: The main artist who created the video.
|
||||
timestamp: UNIX timestamp of the moment the video became available.
|
||||
upload_date: Video upload date (YYYYMMDD).
|
||||
If not explicitly set, calculated from timestamp.
|
||||
@ -704,11 +705,11 @@ class InfoExtractor(object):
|
||||
preference,
|
||||
f.get('language_preference') if f.get('language_preference') is not None else -1,
|
||||
f.get('quality') if f.get('quality') is not None else -1,
|
||||
f.get('height') if f.get('height') is not None else -1,
|
||||
f.get('width') if f.get('width') is not None else -1,
|
||||
ext_preference,
|
||||
f.get('tbr') if f.get('tbr') is not None else -1,
|
||||
f.get('vbr') if f.get('vbr') is not None else -1,
|
||||
ext_preference,
|
||||
f.get('height') if f.get('height') is not None else -1,
|
||||
f.get('width') if f.get('width') is not None else -1,
|
||||
f.get('abr') if f.get('abr') is not None else -1,
|
||||
audio_ext_preference,
|
||||
f.get('fps') if f.get('fps') is not None else -1,
|
||||
@ -764,7 +765,7 @@ class InfoExtractor(object):
|
||||
self.to_screen(msg)
|
||||
time.sleep(timeout)
|
||||
|
||||
def _extract_f4m_formats(self, manifest_url, video_id):
|
||||
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None):
|
||||
manifest = self._download_xml(
|
||||
manifest_url, video_id, 'Downloading f4m manifest',
|
||||
'Unable to download f4m manifest')
|
||||
@ -777,26 +778,28 @@ class InfoExtractor(object):
|
||||
media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
|
||||
for i, media_el in enumerate(media_nodes):
|
||||
if manifest_version == '2.0':
|
||||
manifest_url = '/'.join(manifest_url.split('/')[:-1]) + '/' + media_el.attrib.get('href')
|
||||
manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/'
|
||||
+ (media_el.attrib.get('href') or media_el.attrib.get('url')))
|
||||
tbr = int_or_none(media_el.attrib.get('bitrate'))
|
||||
format_id = 'f4m-%d' % (i if tbr is None else tbr)
|
||||
formats.append({
|
||||
'format_id': format_id,
|
||||
'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
|
||||
'url': manifest_url,
|
||||
'ext': 'flv',
|
||||
'tbr': tbr,
|
||||
'width': int_or_none(media_el.attrib.get('width')),
|
||||
'height': int_or_none(media_el.attrib.get('height')),
|
||||
'preference': preference,
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return formats
|
||||
|
||||
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
|
||||
entry_protocol='m3u8', preference=None):
|
||||
entry_protocol='m3u8', preference=None,
|
||||
m3u8_id=None):
|
||||
|
||||
formats = [{
|
||||
'format_id': 'm3u8-meta',
|
||||
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-meta'])),
|
||||
'url': m3u8_url,
|
||||
'ext': ext,
|
||||
'protocol': 'm3u8',
|
||||
@ -832,9 +835,8 @@ class InfoExtractor(object):
|
||||
formats.append({'url': format_url(line)})
|
||||
continue
|
||||
tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
|
||||
|
||||
f = {
|
||||
'format_id': 'm3u8-%d' % (tbr if tbr else len(formats)),
|
||||
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-%d' % (tbr if tbr else len(formats))])),
|
||||
'url': format_url(line.strip()),
|
||||
'tbr': tbr,
|
||||
'ext': ext,
|
||||
@ -860,10 +862,13 @@ class InfoExtractor(object):
|
||||
return formats
|
||||
|
||||
# TODO: improve extraction
|
||||
def _extract_smil_formats(self, smil_url, video_id):
|
||||
def _extract_smil_formats(self, smil_url, video_id, fatal=True):
|
||||
smil = self._download_xml(
|
||||
smil_url, video_id, 'Downloading SMIL file',
|
||||
'Unable to download SMIL file')
|
||||
'Unable to download SMIL file', fatal=fatal)
|
||||
if smil is False:
|
||||
assert not fatal
|
||||
return []
|
||||
|
||||
base = smil.find('./head/meta').get('base')
|
||||
|
||||
|
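Note on the `f4m_id` / `m3u8_id` hunks above: the new `format_id` expressions all use the same idiom, joining an optional prefix and the base id with `-` while dropping the prefix when it is `None`. A minimal sketch of that idiom; `build_format_id` and the `'hls'` prefix are hypothetical names used only for illustration.

```python
def build_format_id(prefix, base):
    """Join an optional prefix and a base id with '-', skipping empty parts."""
    # filter(None, ...) drops None and empty strings before joining.
    return '-'.join(filter(None, [prefix, base]))


print(build_format_id(None, 'm3u8-meta'))
print(build_format_id('hls', 'm3u8-800'))
```

An extractor that pulls several manifests for one video can pass a distinct prefix per manifest so the resulting format ids do not collide.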
@ -1,77 +1,69 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
|
||||
class FranceCultureIE(InfoExtractor):
|
||||
_VALID_URL = r'(?P<baseurl>http://(?:www\.)?franceculture\.fr/)player/reecouter\?play=(?P<id>[0-9]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.franceculture.fr/player/reecouter?play=4795174',
|
||||
'info_dict': {
|
||||
'id': '4795174',
|
||||
'ext': 'mp3',
|
||||
'title': 'Rendez-vous au pays des geeks',
|
||||
'alt_title': 'Carnet nomade | 13-14',
|
||||
'vcodec': 'none',
|
||||
'uploader': 'Colette Fellous',
|
||||
'upload_date': '20140301',
|
||||
'duration': 3601,
|
||||
'thumbnail': r're:^http://www\.franceculture\.fr/.*/images/player/Carnet-nomade\.jpg$',
|
||||
'description': 'Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats-Unis dans la S ...',
|
||||
'description': 'startswith:Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats',
|
||||
'timestamp': 1393700400,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
baseurl = mobj.group('baseurl')
|
||||
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
params_code = self._search_regex(
|
||||
r"<param name='movie' value='/sites/all/modules/rf/rf_player/swf/loader.swf\?([^']+)' />",
|
||||
webpage, 'parameter code')
|
||||
params = compat_parse_qs(params_code)
|
||||
video_url = compat_urlparse.urljoin(baseurl, params['urlAOD'][0])
|
||||
|
||||
video_path = self._search_regex(
|
||||
r'<a id="player".*?href="([^"]+)"', webpage, 'video path')
|
||||
video_url = compat_urlparse.urljoin(url, video_path)
|
||||
timestamp = int_or_none(self._search_regex(
|
||||
r'<a id="player".*?data-date="([0-9]+)"',
|
||||
webpage, 'upload date', fatal=False))
|
||||
thumbnail = self._search_regex(
|
||||
r'<a id="player".*?>\s+<img src="([^"]+)"',
|
||||
webpage, 'thumbnail', fatal=False)
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<h1 class="title[^"]+">(.+?)</h1>', webpage, 'title')
|
||||
r'<span class="title-diffusion">(.*?)</span>', webpage, 'title')
|
||||
alt_title = self._html_search_regex(
|
||||
r'<span class="title">(.*?)</span>',
|
||||
webpage, 'alt_title', fatal=False)
|
||||
description = self._html_search_regex(
|
||||
r'<span class="description">(.*?)</span>',
|
||||
webpage, 'description', fatal=False)
|
||||
|
||||
uploader = self._html_search_regex(
|
||||
r'(?s)<div id="emission".*?<span class="author">(.*?)</span>',
|
||||
webpage, 'uploader', fatal=False)
|
||||
thumbnail_part = self._html_search_regex(
|
||||
r'(?s)<div id="emission".*?<img src="([^"]+)"', webpage,
|
||||
'thumbnail', fatal=False)
|
||||
if thumbnail_part is None:
|
||||
thumbnail = None
|
||||
else:
|
||||
thumbnail = compat_urlparse.urljoin(baseurl, thumbnail_part)
|
||||
description = self._html_search_regex(
|
||||
r'(?s)<p class="desc">(.*?)</p>', webpage, 'description')
|
||||
|
||||
info = json.loads(params['infoData'][0])[0]
|
||||
duration = info.get('media_length')
|
||||
upload_date_candidate = info.get('media_section5')
|
||||
upload_date = (
|
||||
upload_date_candidate
|
||||
if (upload_date_candidate is not None and
|
||||
re.match(r'[0-9]{8}$', upload_date_candidate))
|
||||
else None)
|
||||
webpage, 'uploader', default=None)
|
||||
vcodec = 'none' if determine_ext(video_url.lower()) == 'mp3' else None
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'vcodec': 'none' if video_url.lower().endswith('.mp3') else None,
|
||||
'duration': duration,
|
||||
'vcodec': vcodec,
|
||||
'uploader': uploader,
|
||||
'upload_date': upload_date,
|
||||
'timestamp': timestamp,
|
||||
'title': title,
|
||||
'alt_title': alt_title,
|
||||
'thumbnail': thumbnail,
|
||||
'description': description,
|
||||
}
|
||||
|
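Note on the FranceCulture rewrite above: the media URL is now resolved against the page URL, and `vcodec` is derived from the file extension. A standard-library approximation follows, assuming Python 3; `audio_only_vcodec` and the media path are hypothetical, and `os.path.splitext` stands in for `determine_ext`.

```python
import os.path
from urllib.parse import urljoin, urlparse


def audio_only_vcodec(video_url):
    """Return 'none' for .mp3 URLs, else None (stand-in for the extractor logic)."""
    ext = os.path.splitext(urlparse(video_url).path)[1].lstrip('.').lower()
    return 'none' if ext == 'mp3' else None


page_url = 'http://www.franceculture.fr/player/reecouter?play=4795174'
# Hypothetical relative path, as it might appear in the player link's href.
video_url = urljoin(page_url, '/sites/default/files/sons/example.MP3')

print(video_url)
print(audio_only_vcodec(video_url))
```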
@ -140,6 +140,19 @@ class GenericIE(InfoExtractor):
|
||||
},
|
||||
'add_ie': ['Ooyala'],
|
||||
},
|
||||
# multiple ooyala embeds on SBN network websites
|
||||
{
|
||||
'url': 'http://www.sbnation.com/college-football-recruiting/2015/2/3/7970291/national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
|
||||
'info_dict': {
|
||||
'id': 'national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
|
||||
'title': '25 lies you will tell yourself on National Signing Day - SBNation.com',
|
||||
},
|
||||
'playlist_mincount': 3,
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': ['Ooyala'],
|
||||
},
|
||||
# google redirect
|
||||
{
|
||||
'url': 'http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCUQtwIwAA&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcmQHVoWB5FY&ei=F-sNU-LLCaXk4QT52ICQBQ&usg=AFQjCNEw4hL29zgOohLXvpJ-Bdh2bils1Q&bvm=bv.61965928,d.bGE',
|
||||
@ -882,10 +895,19 @@ class GenericIE(InfoExtractor):
|
||||
|
||||
# Look for Ooyala videos
|
||||
mobj = (re.search(r'player\.ooyala\.com/[^"?]+\?[^"]*?(?:embedCode|ec)=(?P<ec>[^"&]+)', webpage) or
|
||||
re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage))
|
||||
re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage) or
|
||||
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage))
|
||||
if mobj is not None:
|
||||
return OoyalaIE._build_url_result(mobj.group('ec'))
|
||||
|
||||
# Look for multiple Ooyala embeds on SBN network websites
|
||||
mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
|
||||
if mobj is not None:
|
||||
embeds = self._parse_json(mobj.group(1), video_id, fatal=False)
|
||||
if embeds:
|
||||
return _playlist_from_matches(
|
||||
embeds, getter=lambda v: OoyalaIE._url_for_embed_code(v['provider_video_id']), ie='Ooyala')
|
||||
|
||||
# Look for Aparat videos
|
||||
mobj = re.search(r'<iframe .*?src="(http://www\.aparat\.com/video/[^"]+)"', webpage)
|
||||
if mobj is not None:
|
||||
|
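Note on the SBN hunk above: the new branch captures the JSON array passed to `SBN.VideoLinkset.entryGroup(...)` and builds one playlist entry per `provider_video_id`. A standalone sketch of the capture and parse steps on a made-up page fragment; the embed codes are placeholders, and `json.loads` stands in for `self._parse_json`.

```python
import json
import re

# Hypothetical page fragment; real embed codes are longer opaque strings.
webpage = ('<script>SBN.VideoLinkset.entryGroup('
           '[{"provider_video_id": "aaa"}, {"provider_video_id": "bbb"}]);</script>')

mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
embeds = json.loads(mobj.group(1)) if mobj else []
codes = [v['provider_video_id'] for v in embeds]

print(codes)
```

The non-greedy `\[.*?\]` stops at the first closing bracket, so this sketch assumes the array holds no nested lists.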
@ -18,7 +18,7 @@ class MixcloudIE(InfoExtractor):
|
||||
_VALID_URL = r'^(?:https?://)?(?:www\.)?mixcloud\.com/([^/]+)/([^/]+)'
|
||||
IE_NAME = 'mixcloud'
|
||||
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
'url': 'http://www.mixcloud.com/dholbach/cryptkeeper/',
|
||||
'info_dict': {
|
||||
'id': 'dholbach-cryptkeeper',
|
||||
@@ -33,7 +33,20 @@ class MixcloudIE(InfoExtractor):
             'view_count': int,
             'like_count': int,
         },
-    }
+    }, {
+        'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
+        'info_dict': {
+            'id': 'gillespeterson-caribou-7-inch-vinyl-mix-chat',
+            'ext': 'm4a',
+            'title': 'Electric Relaxation vol. 3',
+            'description': 'md5:2b8aec6adce69f9d41724647c65875e8',
+            'uploader': 'Daniel Drumz',
+            'uploader_id': 'gillespeterson',
+            'thumbnail': 're:https?://.*\.jpg',
+            'view_count': int,
+            'like_count': int,
+        },
+    }]
 
     def _get_url(self, track_id, template_url):
         server_count = 30
@@ -60,7 +73,7 @@ class MixcloudIE(InfoExtractor):
         webpage = self._download_webpage(url, track_id)
 
         preview_url = self._search_regex(
-            r'\s(?:data-preview-url|m-preview)="(.+?)"', webpage, 'preview url')
+            r'\s(?:data-preview-url|m-preview)="([^"]+)"', webpage, 'preview url')
         song_url = preview_url.replace('/previews/', '/c/originals/')
         template_url = re.sub(r'(stream\d*)', 'stream%d', song_url)
         final_song_url = self._get_url(track_id, template_url)
80
youtube_dl/extractor/nerdist.py
Normal file
@@ -0,0 +1,80 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+from ..utils import (
+    determine_ext,
+    parse_iso8601,
+    xpath_text,
+)
+
+
+class NerdistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?nerdist\.com/vepisode/(?P<id>[^/?#]+)'
+    _TEST = {
+        'url': 'http://www.nerdist.com/vepisode/exclusive-which-dc-characters-w',
+        'md5': '3698ed582931b90d9e81e02e26e89f23',
+        'info_dict': {
+            'display_id': 'exclusive-which-dc-characters-w',
+            'id': 'RPHpvJyr',
+            'ext': 'mp4',
+            'title': 'Your TEEN TITANS Revealed! Who\'s on the show?',
+            'thumbnail': 're:^https?://.*/thumbs/.*\.jpg$',
+            'description': 'Exclusive: Find out which DC Comics superheroes will star in TEEN TITANS Live-Action TV Show on Nerdist News with Jessica Chobot!',
+            'uploader': 'Eric Diaz',
+            'upload_date': '20150202',
+            'timestamp': 1422892808,
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._search_regex(
+            r'''(?x)<script\s+(?:type="text/javascript"\s+)?
+                src="https?://content\.nerdist\.com/players/([a-zA-Z0-9_]+)-''',
+            webpage, 'video ID')
+        timestamp = parse_iso8601(self._html_search_meta(
+            'shareaholic:article_published_time', webpage, 'upload date'))
+        uploader = self._html_search_meta(
+            'shareaholic:article_author_name', webpage, 'article author')
+
+        doc = self._download_xml(
+            'http://content.nerdist.com/jw6/%s.xml' % video_id, video_id)
+        video_info = doc.find('.//item')
+        title = xpath_text(video_info, './title', fatal=True)
+        description = xpath_text(video_info, './description')
+        thumbnail = xpath_text(
+            video_info, './{http://rss.jwpcdn.com/}image', 'thumbnail')
+
+        formats = []
+        for source in video_info.findall('./{http://rss.jwpcdn.com/}source'):
+            vurl = source.attrib['file']
+            ext = determine_ext(vurl)
+            if ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    vurl, video_id, entry_protocol='m3u8_native', ext='mp4',
+                    preference=0))
+            elif ext == 'smil':
+                formats.extend(self._extract_smil_formats(
+                    vurl, video_id, fatal=False
+                ))
+            else:
+                formats.append({
+                    'format_id': ext,
+                    'url': vurl,
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'formats': formats,
+            'uploader': uploader,
+        }
@@ -46,7 +46,18 @@ class NFLIE(InfoExtractor):
                 'timestamp': 1388354455,
                 'thumbnail': 're:^https?://.*\.jpg$',
             }
-        }
+        },
+        {
+            'url': 'http://www.nfl.com/news/story/0ap3000000467586/article/patriots-seahawks-involved-in-lategame-skirmish',
+            'info_dict': {
+                'id': '0ap3000000467607',
+                'ext': 'mp4',
+                'title': 'Frustrations flare on the field',
+                'description': 'Emotions ran high at the end of the Super Bowl on both sides of the ball after a dramatic finish.',
+                'timestamp': 1422850320,
+                'upload_date': '20150202',
+            },
+        },
     ]
 
     @staticmethod
@@ -80,7 +91,11 @@ class NFLIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)
 
         config_url = NFLIE.prepend_host(host, self._search_regex(
-            r'(?:config|configURL)\s*:\s*"([^"]+)"', webpage, 'config URL'))
+            r'(?:config|configURL)\s*:\s*"([^"]+)"', webpage, 'config URL',
+            default='static/content/static/config/video/config.json'))
+        # For articles, the id in the url is not the video id
+        video_id = self._search_regex(
+            r'contentId\s*:\s*"([^"]+)"', webpage, 'video id', default=video_id)
         config = self._download_json(config_url, video_id,
                                      note='Downloading player config')
         url_template = NFLIE.prepend_host(
@@ -1,8 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 
 from ..utils import (
@@ -11,7 +9,7 @@ from ..utils import (
 
 
 class NormalbootsIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<videoid>[0-9a-z-]*)/?$'
+    _VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<id>[0-9a-z-]*)/?$'
     _TEST = {
         'url': 'http://normalboots.com/video/home-alone-games-jontron/',
         'md5': '8bf6de238915dd501105b44ef5f1e0f6',
@@ -30,19 +28,22 @@ class NormalbootsIE(InfoExtractor):
     }
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('videoid')
-
+        video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        video_uploader = self._html_search_regex(r'Posted\sby\s<a\shref="[A-Za-z0-9/]*">(?P<uploader>[A-Za-z]*)\s</a>',
-                                                 webpage, 'uploader')
-        raw_upload_date = self._html_search_regex('<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
-                                                  webpage, 'date')
-        video_upload_date = unified_strdate(raw_upload_date)
 
-        player_url = self._html_search_regex(r'<iframe\swidth="[0-9]+"\sheight="[0-9]+"\ssrc="(?P<url>[\S]+)"', webpage, 'url')
+        video_uploader = self._html_search_regex(
+            r'Posted\sby\s<a\shref="[A-Za-z0-9/]*">(?P<uploader>[A-Za-z]*)\s</a>',
+            webpage, 'uploader', fatal=False)
+        video_upload_date = unified_strdate(self._html_search_regex(
+            r'<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
+            webpage, 'date', fatal=False))
+
+        player_url = self._html_search_regex(
+            r'<iframe\swidth="[0-9]+"\sheight="[0-9]+"\ssrc="(?P<url>[\S]+)"',
+            webpage, 'player url')
         player_page = self._download_webpage(player_url, video_id)
-        video_url = self._html_search_regex(r"file:\s'(?P<file>[^']+\.mp4)'", player_page, 'file')
+        video_url = self._html_search_regex(
+            r"file:\s'(?P<file>[^']+\.mp4)'", player_page, 'file')
 
         return {
             'id': video_id,
@@ -1,6 +1,6 @@
 from __future__ import unicode_literals
 
-from .common import InfoExtractor
+from .subtitles import SubtitlesInfoExtractor
 from ..utils import (
     fix_xml_ampersands,
     parse_duration,
@@ -11,7 +11,7 @@ from ..utils import (
 )
 
 
-class NPOBaseIE(InfoExtractor):
+class NPOBaseIE(SubtitlesInfoExtractor):
     def _get_token(self, video_id):
         token_page = self._download_webpage(
             'http://ida.omroep.nl/npoplayer/i.js',
@@ -161,6 +161,16 @@ class NPOIE(NPOBaseIE):
 
         self._sort_formats(formats)
 
+        subtitles = {}
+        if metadata.get('tt888') == 'ja':
+            subtitles['nl'] = 'http://e.omroep.nl/tt888/%s' % video_id
+
+        if self._downloader.params.get('listsubtitles', False):
+            self._list_available_subtitles(video_id, subtitles)
+            return
+
+        subtitles = self.extract_subtitles(video_id, subtitles)
+
         return {
             'id': video_id,
             'title': metadata['titel'],
@@ -169,6 +179,7 @@ class NPOIE(NPOBaseIE):
             'upload_date': unified_strdate(metadata.get('gidsdatum')),
             'duration': parse_duration(metadata.get('tijdsduur')),
             'formats': formats,
+            'subtitles': subtitles,
         }
 
 
68
youtube_dl/extractor/ntvde.py
Normal file
@@ -0,0 +1,68 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    parse_duration,
+)
+
+
+class NTVDeIE(InfoExtractor):
+    IE_NAME = 'n-tv.de'
+    _VALID_URL = r'https?://(?:www\.)?n-tv\.de/mediathek/videos/[^/?#]+/[^/?#]+-article(?P<id>.+)\.html'
+
+    _TESTS = [{
+        'url': 'http://www.n-tv.de/mediathek/videos/panorama/Schnee-und-Glaette-fuehren-zu-zahlreichen-Unfaellen-und-Staus-article14438086.html',
+        'md5': '6ef2514d4b1e8e03ca24b49e2f167153',
+        'info_dict': {
+            'id': '14438086',
+            'ext': 'mp4',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'title': 'Schnee und Glätte führen zu zahlreichen Unfällen und Staus',
+            'alt_title': 'Winterchaos auf deutschen Straßen',
+            'description': 'Schnee und Glätte sorgen deutschlandweit für einen chaotischen Start in die Woche: Auf den Straßen kommt es zu kilometerlangen Staus und Dutzenden Glätteunfällen. In Düsseldorf und München wirbelt der Schnee zudem den Flugplan durcheinander. Dutzende Flüge landen zu spät, einige fallen ganz aus.',
+            'duration': 4020,
+            'timestamp': 1422892797,
+            'upload_date': '20150202',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        info = self._parse_json(self._search_regex(
+            r'(?s)ntv.pageInfo.article =\s(\{.*?\});', webpage, 'info'),
+            video_id, transform_source=js_to_json)
+        timestamp = int_or_none(info.get('publishedDateAsUnixTimeStamp'))
+        vdata = self._parse_json(self._search_regex(
+            r'(?s)\$\(\s*"\#player"\s*\)\s*\.data\(\s*"player",\s*(\{.*?\})\);',
+            webpage, 'player data'),
+            video_id, transform_source=js_to_json)
+        duration = parse_duration(vdata.get('duration'))
+        formats = [{
+            'format_id': 'flash',
+            'url': 'rtmp://fms.n-tv.de/' + vdata['video'],
+        }, {
+            'format_id': 'mobile',
+            'url': 'http://video.n-tv.de' + vdata['videoMp4'],
+            'tbr': 400,  # estimation
+        }]
+        m3u8_url = 'http://video.n-tv.de' + vdata['videoM3u8']
+        formats.extend(self._extract_m3u8_formats(
+            m3u8_url, video_id, ext='mp4',
+            entry_protocol='m3u8_native', preference=0))
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': info['headline'],
+            'description': info.get('intro'),
+            'alt_title': info.get('kicker'),
+            'timestamp': timestamp,
+            'thumbnail': vdata.get('html5VideoPoster'),
+            'duration': duration,
+            'formats': formats,
+        }
@@ -1,15 +1,14 @@
 # encoding: utf-8
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 from ..utils import (
     unescapeHTML
 )
 
 
-class NTVIE(InfoExtractor):
+class NTVRuIE(InfoExtractor):
+    IE_NAME = 'ntv.ru'
     _VALID_URL = r'http://(?:www\.)?ntv\.ru/(?P<id>.+)'
 
     _TESTS = [
@@ -92,9 +91,7 @@ class NTVIE(InfoExtractor):
     ]
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
         page = self._download_webpage(url, video_id)
 
         video_id = self._html_search_regex(self._VIDEO_ID_REGEXES, page, 'video id')
@@ -49,6 +49,7 @@ class RTPIE(InfoExtractor):
             'ext': ext,
             'vcodec': config.get('type') == 'audio' and 'none' or None,
             'player_url': 'http://programas.rtp.pt/play/player.swf?v3',
+            'rtmp_real_time': True,
         }]
 
         return {
@@ -6,12 +6,14 @@ import re
 from .common import InfoExtractor
 from ..compat import (
     compat_str,
+    compat_urllib_parse_urlparse,
 )
 from ..utils import (
     int_or_none,
     parse_duration,
     parse_iso8601,
     unescapeHTML,
+    xpath_text,
 )
 
 
@@ -159,11 +161,27 @@ class RTSIE(InfoExtractor):
             return int_or_none(self._search_regex(
                 r'-([0-9]+)k\.', url, 'bitrate', default=None))
 
-        formats = [{
-            'format_id': fid,
-            'url': furl,
-            'tbr': extract_bitrate(furl),
-        } for fid, furl in info['streams'].items()]
+        formats = []
+        for format_id, format_url in info['streams'].items():
+            if format_url.endswith('.f4m'):
+                token = self._download_xml(
+                    'http://tp.srgssr.ch/token/akahd.xml?stream=%s/*' % compat_urllib_parse_urlparse(format_url).path,
+                    video_id, 'Downloading %s token' % format_id)
+                auth_params = xpath_text(token, './/authparams', 'auth params')
+                if not auth_params:
+                    continue
+                formats.extend(self._extract_f4m_formats(
+                    '%s?%s&hdcore=3.4.0&plugin=aasp-3.4.0.132.66' % (format_url, auth_params),
+                    video_id, f4m_id=format_id))
+            elif format_url.endswith('.m3u8'):
+                formats.extend(self._extract_m3u8_formats(
+                    format_url, video_id, 'mp4', m3u8_id=format_id))
+            else:
+                formats.append({
+                    'format_id': format_id,
+                    'url': format_url,
+                    'tbr': extract_bitrate(format_url),
+                })
 
         if 'media' in info:
             formats.extend([{
65
youtube_dl/extractor/tweakers.py
Normal file
@@ -0,0 +1,65 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+    int_or_none,
+    float_or_none,
+)
+
+
+class TweakersIE(InfoExtractor):
+    _VALID_URL = r'https?://tweakers\.net/video/(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://tweakers.net/video/9926/new-nintendo-3ds-xl-op-alle-fronten-beter.html',
+        'md5': '1b5afa817403bb5baa08359dca31e6df',
+        'info_dict': {
+            'id': '9926',
+            'ext': 'mp4',
+            'title': 'New Nintendo 3DS XL - Op alle fronten beter',
+            'description': 'md5:f97324cc71e86e11c853f0763820e3ba',
+            'thumbnail': 're:^https?://.*\.jpe?g$',
+            'duration': 386,
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        playlist = self._download_xml(
+            'https://tweakers.net/video/s1playlist/%s/playlist.xspf' % video_id,
+            video_id)
+
+        NS_MAP = {
+            'xspf': 'http://xspf.org/ns/0/',
+            's1': 'http://static.streamone.nl/player/ns/0',
+        }
+
+        track = playlist.find(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP))
+
+        title = xpath_text(
+            track, xpath_with_ns('./xspf:title', NS_MAP), 'title')
+        description = xpath_text(
+            track, xpath_with_ns('./xspf:annotation', NS_MAP), 'description')
+        thumbnail = xpath_text(
+            track, xpath_with_ns('./xspf:image', NS_MAP), 'thumbnail')
+        duration = float_or_none(
+            xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'),
+            1000)
+
+        formats = [{
+            'url': location.text,
+            'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)),
+            'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))),
+            'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))),
+        } for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP))]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }
@@ -9,6 +9,7 @@ from ..compat import (
 )
 from ..utils import (
     ExtractorError,
+    int_or_none,
 )
 
 
@@ -192,9 +193,29 @@ class VevoIE(InfoExtractor):
         # Download via HLS API
         formats.extend(self._download_api_formats(video_id))
 
+        # Download SMIL
+        smil_blocks = sorted((
+            f for f in video_info['videoVersions']
+            if f['sourceType'] == 13),
+            key=lambda f: f['version'])
+        smil_url = '%s/Video/V2/VFILE/%s/%sr.smil' % (
+            self._SMIL_BASE_URL, video_id, video_id.lower())
+        if smil_blocks:
+            smil_url_m = self._search_regex(
+                r'url="([^"]+)"', smil_blocks[-1]['data'], 'SMIL URL',
+                default=None)
+            if smil_url_m is not None:
+                smil_url = smil_url_m
+        if smil_url:
+            smil_xml = self._download_webpage(
+                smil_url, video_id, 'Downloading SMIL info', fatal=False)
+            if smil_xml:
+                formats.extend(self._formats_from_smil(smil_xml))
+
         self._sort_formats(formats)
-        timestamp_ms = int(self._search_regex(
-            r'/Date\((\d+)\)/', video_info['launchDate'], 'launch date'))
+        timestamp_ms = int_or_none(self._search_regex(
+            r'/Date\((\d+)\)/',
+            video_info['launchDate'], 'launch date', fatal=False))
 
         return {
             'id': video_id,
89
youtube_dl/extractor/wsj.py
Normal file
@@ -0,0 +1,89 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    unified_strdate,
+)
+
+
+class WSJIE(InfoExtractor):
+    _VALID_URL = r'https?://video-api\.wsj\.com/api-video/player/iframe\.html\?guid=(?P<id>[a-zA-Z0-9-]+)'
+    IE_DESC = 'Wall Street Journal'
+    _TEST = {
+        'url': 'http://video-api.wsj.com/api-video/player/iframe.html?guid=1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
+        'md5': '9747d7a6ebc2f4df64b981e1dde9efa9',
+        'info_dict': {
+            'id': '1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
+            'ext': 'mp4',
+            'upload_date': '20150202',
+            'uploader_id': 'bbright',
+            'creator': 'bbright',
+            'categories': list,  # a long list
+            'duration': 90,
+            'title': 'Bills Coach Rex Ryan Updates His Old Jets Tattoo',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        bitrates = [128, 174, 264, 320, 464, 664, 1264]
+        api_url = (
+            'http://video-api.wsj.com/api-video/find_all_videos.asp?'
+            'type=guid&count=1&query=%s&'
+            'fields=hls,adZone,thumbnailList,guid,state,secondsUntilStartTime,'
+            'author,description,name,linkURL,videoStillURL,duration,videoURL,'
+            'adCategory,catastrophic,linkShortURL,doctypeID,youtubeID,'
+            'titletag,rssURL,wsj-section,wsj-subsection,allthingsd-section,'
+            'allthingsd-subsection,sm-section,sm-subsection,provider,'
+            'formattedCreationDate,keywords,keywordsOmniture,column,editor,'
+            'emailURL,emailPartnerID,showName,omnitureProgramName,'
+            'omnitureVideoFormat,linkRelativeURL,touchCastID,'
+            'omniturePublishDate,%s') % (
+                video_id, ','.join('video%dkMP4Url' % br for br in bitrates))
+        info = self._download_json(api_url, video_id)['items'][0]
+
+        # Thumbnails are conveniently in the correct format already
+        thumbnails = info.get('thumbnailList')
+        creator = info.get('author')
+        uploader_id = info.get('editor')
+        categories = info.get('keywords')
+        duration = int_or_none(info.get('duration'))
+        upload_date = unified_strdate(
+            info.get('formattedCreationDate'), day_first=False)
+        title = info.get('name', info.get('titletag'))
+
+        formats = [{
+            'format_id': 'f4m',
+            'format_note': 'f4m (meta URL)',
+            'url': info['videoURL'],
+        }]
+        if info.get('hls'):
+            formats.extend(self._extract_m3u8_formats(
+                info['hls'], video_id, ext='mp4',
+                preference=0, entry_protocol='m3u8_native'))
+        for br in bitrates:
+            field = 'video%dkMP4Url' % br
+            if info.get(field):
+                formats.append({
+                    'format_id': 'mp4-%d' % br,
+                    'container': 'mp4',
+                    'tbr': br,
+                    'url': info[field],
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'thumbnails': thumbnails,
+            'creator': creator,
+            'uploader_id': uploader_id,
+            'duration': duration,
+            'upload_date': upload_date,
+            'title': title,
+            'formats': formats,
+            'categories': categories,
+        }
@@ -511,8 +511,9 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
             metadata['artist'] = info['uploader_id']
         if info.get('description') is not None:
             metadata['description'] = info['description']
+            metadata['comment'] = info['description']
         if info.get('webpage_url') is not None:
-            metadata['comment'] = info['webpage_url']
+            metadata['purl'] = info['webpage_url']
 
         if not metadata:
             self._downloader.to_screen('[ffmpeg] There isn\'t any metadata to add')
@@ -701,7 +701,7 @@ def unified_strdate(date_str, day_first=True):
     # %z (UTC offset) is only supported in python>=3.2
     date_str = re.sub(r' ?(\+|-)[0-9]{2}:?[0-9]{2}$', '', date_str)
     # Remove AM/PM + timezone
-    date_str = re.sub(r'(?i)\s*(?:AM|PM)\s+[A-Z]+', '', date_str)
+    date_str = re.sub(r'(?i)\s*(?:AM|PM)(?:\s+[A-Z]+)?', '', date_str)
 
     format_expressions = [
         '%d %B %Y',
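Review note on the hunk above: the new pattern makes the timezone word after AM/PM optional. The sketch below applies only the two patterns from the diff to sample strings; the helper name `strip_meridiem` and both sample dates are made up for illustration and are not part of youtube-dl.

```python
import re

# Patterns copied from the removed and added lines of the hunk.
OLD_PATTERN = r'(?i)\s*(?:AM|PM)\s+[A-Z]+'
NEW_PATTERN = r'(?i)\s*(?:AM|PM)(?:\s+[A-Z]+)?'


def strip_meridiem(date_str, pattern):
    """Hypothetical helper: remove the AM/PM marker matched by `pattern`."""
    return re.sub(pattern, '', date_str)


with_tz = '2/2/2015 6:47:40 PM EST'   # illustrative input, not from the diff
without_tz = '2/2/2015 6:47:40 PM'    # illustrative input, not from the diff

for sample in (with_tz, without_tz):
    print(repr(strip_meridiem(sample, OLD_PATTERN)),
          repr(strip_meridiem(sample, NEW_PATTERN)))
```

With the old pattern a string that ends in a bare `PM` is left untouched, because the pattern insists on a following word; the new pattern strips the marker in both cases.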
@@ -1275,7 +1275,10 @@ def parse_duration(s):
             (?P<only_hours>[0-9.]+)\s*(?:hours?)|
 
             (?:
-                (?:(?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*)?
+                (?:
+                    (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
+                    (?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*
+                )?
                 (?P<mins>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*
             )?
             (?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*(?:s|secs?|seconds?)?
@@ -1293,6 +1296,8 @@ def parse_duration(s):
         res += int(m.group('mins')) * 60
     if m.group('hours'):
         res += int(m.group('hours')) * 60 * 60
+    if m.group('days'):
+        res += int(m.group('days')) * 24 * 60 * 60
     if m.group('ms'):
         res += float(m.group('ms'))
     return res
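Review note on the two `parse_duration` hunks above: together they add an optional days component in front of the hours. The sketch below is a cut-down stand-in rather than the real function. It keeps only the nested days/hours/mins/secs branch shown in the diff and omits the `only_mins`/`only_hours` alternatives and everything else outside the hunk; `DURATION_RE` and `duration_seconds` are hypothetical names.

```python
import re

# Only the branch of the pattern visible in the hunk, with the new days group.
DURATION_RE = re.compile(r'''(?ix)
    (?:
        (?:
            (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
            (?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*
        )?
        (?P<mins>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*
    )?
    (?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*(?:s|secs?|seconds?)?
    $''')


def duration_seconds(s):
    """Hypothetical stand-in: sum the matched components in seconds."""
    m = DURATION_RE.match(s.strip())
    if m is None:
        return None
    res = int(m.group('secs'))
    if m.group('mins'):
        res += int(m.group('mins')) * 60
    if m.group('hours'):
        res += int(m.group('hours')) * 60 * 60
    if m.group('days'):
        res += int(m.group('days')) * 24 * 60 * 60
    if m.group('ms'):
        res += float(m.group('ms'))
    return res


print(duration_seconds('1:02:03:04'))   # four fields: days, hours, mins, secs
print(duration_seconds('02:03:04'))     # three fields: no days component
```

Because the days group sits inside the optional hours group, a days value can only match when hours are present too, so three-field strings such as `02:03:04` still parse as hours, minutes and seconds.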
@@ -1543,7 +1548,7 @@ def js_to_json(code):
     res = re.sub(r'''(?x)
         "(?:[^"\\]*(?:\\\\|\\")?)*"|
         '(?:[^'\\]*(?:\\\\|\\')?)*'|
-        [a-zA-Z_][a-zA-Z_0-9]*
+        [a-zA-Z_][.a-zA-Z_0-9]*
         ''', fix_kv, code)
     res = re.sub(r',(\s*\])', lambda m: m.group(1), res)
     return res
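Review note on the `js_to_json` hunk above: the identifier pattern now accepts dots after the first character, presumably so that a dotted name is quoted as one token instead of being split at each dot (the diff itself does not state the motivation). The sketch below is a simplified, hypothetical stand-in for the real `fix_kv` logic: `quote_bare_identifiers` handles only double-quoted strings and bare identifiers, and the sample object literal is invented.

```python
import json
import re

# Identifier patterns copied from the removed and added lines of the hunk.
OLD_IDENT = r'[a-zA-Z_][a-zA-Z_0-9]*'
NEW_IDENT = r'[a-zA-Z_][.a-zA-Z_0-9]*'


def quote_bare_identifiers(code, ident_pattern):
    """Hypothetical helper: wrap bare identifiers in double quotes."""
    def fix(m):
        v = m.group(0)
        # Leave existing strings and JSON literals untouched.
        if v.startswith('"') or v in ('true', 'false', 'null'):
            return v
        return '"%s"' % v
    return re.sub(r'"(?:[^"\\]|\\.)*"|' + ident_pattern, fix, code)


snippet = '{player: ntv.player, "title": "a b", live: false}'  # invented input

converted = quote_bare_identifiers(snippet, NEW_IDENT)
print(converted)
print(json.loads(converted))

# The old pattern splits at the dot, and the result is not valid JSON.
print(quote_bare_identifiers(snippet, OLD_IDENT))
```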
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2015.02.02.1'
+__version__ = '2015.02.06'