Compare commits
75 Commits
2016.09.04
...
2016.09.11
Author | SHA1 | Date | |
---|---|---|---|
|
0307d6fba6 | ||
|
fc150cba1d | ||
|
d667ab7fad | ||
|
eb87d4545a | ||
|
1c81476cbb | ||
|
bc9186c882 | ||
|
6599c72527 | ||
|
6bb05b32a9 | ||
|
fea74acad8 | ||
|
f01115c933 | ||
|
2cdbc06a1f | ||
|
2cb93afcd8 | ||
|
bfcda07a27 | ||
|
001a5fd3d7 | ||
|
1e35999c1e | ||
|
2512b17493 | ||
|
56c0ead4d3 | ||
|
7324243750 | ||
|
84a18e9b90 | ||
|
b29f842e0e | ||
|
f009fcac0d | ||
|
6c3affcb18 | ||
|
1e19ff2984 | ||
|
c6129feb7f | ||
|
bb5ebd4453 | ||
|
cb9cbd84ed | ||
|
4d5726b0d7 | ||
|
4614ad7b59 | ||
|
b717837190 | ||
|
2abad67e52 | ||
|
ad0e2b3359 | ||
|
37720844f6 | ||
|
6cfcb8ac36 | ||
|
7a979da8cb | ||
|
2fdc7b0e04 | ||
|
010d034fca | ||
|
02e552886f | ||
|
25042f7372 | ||
|
3f612f0767 | ||
|
17bf6e71cc | ||
|
881f35479d | ||
|
89f257d6e5 | ||
|
e78a5428b6 | ||
|
6656a82481 | ||
|
d7e794928d | ||
|
9c27188988 | ||
|
b84d311d53 | ||
|
f87feb4b68 | ||
|
2841bdcebb | ||
|
84b91dd4e3 | ||
|
92c9c2a88b | ||
|
9d54b02bae | ||
|
846d8b76a0 | ||
|
aa3f9fe695 | ||
|
8258f4457c | ||
|
948cd5b72d | ||
|
8d3737cda7 | ||
|
155bc674c4 | ||
|
c33c962adf | ||
|
bdcc046d12 | ||
|
a493f10208 | ||
|
f3eeaacb4e | ||
|
b4d6a85d60 | ||
|
0b36a96212 | ||
|
bc22a79694 | ||
|
340e31ca74 | ||
|
973dee491f | ||
|
1f85029d82 | ||
|
95be19d436 | ||
|
95843da529 | ||
|
abf2c79f95 | ||
|
b49ad71ce1 | ||
|
9127e1533d | ||
|
78e762d23c | ||
|
7be15d4097 |
6
.github/ISSUE_TEMPLATE.md
vendored
6
.github/ISSUE_TEMPLATE.md
vendored
@@ -6,8 +6,8 @@
|
||||
|
||||
---
|
||||
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.04.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.04.1**
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.11.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.11.1**
|
||||
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2016.09.04.1
|
||||
[debug] youtube-dl version 2016.09.11.1
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
2
AUTHORS
2
AUTHORS
@@ -183,3 +183,5 @@ Petr Zvoníček
|
||||
Pratyush Singh
|
||||
Aleksander Nitecki
|
||||
Sebastian Blunt
|
||||
Matěj Cepl
|
||||
Xie Yanbo
|
||||
|
45
ChangeLog
45
ChangeLog
@@ -1,3 +1,48 @@
|
||||
version 2016.09.11.1
|
||||
|
||||
Extractors
|
||||
+ [tube8] Extract categories and tags (#10579)
|
||||
+ [pornhub] Extract categories and tags (#10499)
|
||||
* [openload] Temporary fix (#10408)
|
||||
+ [foxnews] Add support Fox News articles (#10598)
|
||||
* [viafree] Improve video id extraction (#10615)
|
||||
* [iwara] Fix extraction after relaunch (#10462, #3215)
|
||||
+ [tfo] Add extractor for tfo.org
|
||||
* [lrt] Fix audio extraction (#10566)
|
||||
* [9now] Fix extraction (#10561)
|
||||
+ [canalplus] Add support for c8.fr (#10577)
|
||||
* [newgrounds] Fix uploader extraction (#10584)
|
||||
+ [polskieradio:category] Add support for category lists (#10576)
|
||||
+ [ketnet] Add extractor for ketnet.be (#10343)
|
||||
+ [canvas] Add support for een.be (#10605)
|
||||
+ [telequebec] Add extractor for telequebec.tv (#1999)
|
||||
* [parliamentliveuk] Fix extraction (#9137)
|
||||
|
||||
|
||||
version 2016.09.08
|
||||
|
||||
Extractors
|
||||
+ [jwplatform] Extract height from format label
|
||||
+ [yahoo] Extract Brightcove Legacy Studio embeds (#9345)
|
||||
* [videomore] Fix extraction (#10592)
|
||||
* [foxgay] Fix extraction (#10480)
|
||||
+ [rmcdecouverte] Add extractor for rmcdecouverte.bfmtv.com (#9709)
|
||||
* [gamestar] Fix metadata extraction (#10479)
|
||||
* [puls4] Fix extraction (#10583)
|
||||
+ [cctv] Add extractor for CCTV and CNTV (#8153)
|
||||
+ [lci] Add extractor for lci.fr (#10573)
|
||||
+ [wat] Extract DASH formats
|
||||
+ [viafree] Improve video id detection (#10569)
|
||||
+ [trutv] Add extractor for trutv.com (#10519)
|
||||
+ [nick] Add support for nickelodeon.nl (#10559)
|
||||
+ [abcotvs:clips] Add support for clips.abcotvs.com
|
||||
+ [abcotvs] Add support for ABC Owned Television Stations sites (#9551)
|
||||
+ [miaopai] Add extractor for miaopai.com (#10556)
|
||||
* [gamestar] Fix metadata extraction (#10479)
|
||||
+ [bilibili] Add support for episodes (#10190)
|
||||
+ [tvnoe] Add extractor for tvnoe.cz (#10524)
|
||||
|
||||
|
||||
version 2016.09.04.1
|
||||
|
||||
Core
|
||||
|
10
README.md
10
README.md
@@ -851,6 +851,16 @@ will download the complete `PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re` playlist and cre
|
||||
|
||||
youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
|
||||
|
||||
### Should I add `--hls-prefer-native` into my config?
|
||||
|
||||
When youtube-dl detects an HLS video, it can download it either with the built-in downloader or ffmpeg. Since many HLS streams are slightly invalid and ffmpeg/youtube-dl each handle some invalid cases better than the other, there is an option to switch the downloader if needed.
|
||||
|
||||
When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg.
|
||||
|
||||
In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](http://rg3.github.io/youtube-dl/supportedsites.html) cannot mandate one specific downloader.
|
||||
|
||||
If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case.
|
||||
|
||||
### Can you add support for this anime video site, or site which shows current movies for free?
|
||||
|
||||
As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion to youtube-dl.
|
||||
|
@@ -60,6 +60,9 @@ if ! type pandoc >/dev/null 2>/dev/null; then echo 'ERROR: pandoc is missing'; e
|
||||
if ! python3 -c 'import rsa' 2>/dev/null; then echo 'ERROR: python3-rsa is missing'; exit 1; fi
|
||||
if ! python3 -c 'import wheel' 2>/dev/null; then echo 'ERROR: wheel is missing'; exit 1; fi
|
||||
|
||||
read -p "Is ChangeLog up to date? (y/n) " -n 1
|
||||
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
|
||||
|
||||
/bin/echo -e "\n### First of all, testing..."
|
||||
make clean
|
||||
if $skip_tests ; then
|
||||
|
@@ -19,9 +19,10 @@
|
||||
- **9now.com.au**
|
||||
- **abc.net.au**
|
||||
- **abc.net.au:iview**
|
||||
- **Abc7News**
|
||||
- **abcnews**
|
||||
- **abcnews:video**
|
||||
- **abcotvs**: ABC Owned Television Stations
|
||||
- **abcotvs:clips**
|
||||
- **AcademicEarth:Course**
|
||||
- **acast**
|
||||
- **acast:channel**
|
||||
@@ -128,6 +129,7 @@
|
||||
- **CBSNews**: CBS News
|
||||
- **CBSNewsLiveVideo**: CBS News Live Videos
|
||||
- **CBSSports**
|
||||
- **CCTV**
|
||||
- **CDA**
|
||||
- **CeskaTelevize**
|
||||
- **channel9**: Channel 9
|
||||
@@ -245,7 +247,8 @@
|
||||
- **Formula1**
|
||||
- **FOX**
|
||||
- **Foxgay**
|
||||
- **FoxNews**: Fox News and Fox Business Video
|
||||
- **foxnews**: Fox News and Fox Business Video
|
||||
- **foxnews:article**
|
||||
- **foxnews:insider**
|
||||
- **FoxSports**
|
||||
- **france2.fr:generation-quoi**
|
||||
@@ -324,6 +327,7 @@
|
||||
- **ivi**: ivi.ru
|
||||
- **ivi:compilation**: ivi.ru compilations
|
||||
- **ivideon**: Ivideon TV
|
||||
- **Iwara**
|
||||
- **Izlesene**
|
||||
- **JeuxVideo**
|
||||
- **Jove**
|
||||
@@ -337,6 +341,7 @@
|
||||
- **KarriereVideos**
|
||||
- **keek**
|
||||
- **KeezMovies**
|
||||
- **Ketnet**
|
||||
- **KhanAcademy**
|
||||
- **KickStarter**
|
||||
- **KonserthusetPlay**
|
||||
@@ -352,6 +357,7 @@
|
||||
- **kuwo:song**: 酷我音乐
|
||||
- **la7.it**
|
||||
- **Laola1Tv**
|
||||
- **LCI**
|
||||
- **Lcp**
|
||||
- **LcpPlay**
|
||||
- **Le**: 乐视网
|
||||
@@ -390,6 +396,7 @@
|
||||
- **Metacritic**
|
||||
- **Mgoon**
|
||||
- **MGTV**: 芒果TV
|
||||
- **MiaoPai**
|
||||
- **Minhateca**
|
||||
- **MinistryGrid**
|
||||
- **Minoto**
|
||||
@@ -536,6 +543,7 @@
|
||||
- **podomatic**
|
||||
- **Pokemon**
|
||||
- **PolskieRadio**
|
||||
- **PolskieRadioCategory**
|
||||
- **PornCom**
|
||||
- **PornHd**
|
||||
- **PornHub**: PornHub and Thumbzilla
|
||||
@@ -576,6 +584,7 @@
|
||||
- **revision3:embed**
|
||||
- **RICE**
|
||||
- **RingTV**
|
||||
- **RMCDecouverte**
|
||||
- **RockstarGames**
|
||||
- **RoosterTeeth**
|
||||
- **RottenTomatoes**
|
||||
@@ -696,9 +705,11 @@
|
||||
- **Telecinco**: telecinco.es, cuatro.com and mediaset.es
|
||||
- **Telegraaf**
|
||||
- **TeleMB**
|
||||
- **TeleQuebec**
|
||||
- **TeleTask**
|
||||
- **Telewebion**
|
||||
- **TF1**
|
||||
- **TFO**
|
||||
- **TheIntercept**
|
||||
- **ThePlatform**
|
||||
- **ThePlatformFeed**
|
||||
@@ -720,7 +731,7 @@
|
||||
- **ToypicsUser**: Toypics user profile
|
||||
- **TrailerAddict** (Currently broken)
|
||||
- **Trilulilu**
|
||||
- **trollvids**
|
||||
- **TruTV**
|
||||
- **Tube8**
|
||||
- **TubiTv**
|
||||
- **tudou**
|
||||
@@ -742,6 +753,7 @@
|
||||
- **TVCArticle**
|
||||
- **tvigle**: Интернет-телевидение Tvigle.ru
|
||||
- **tvland.com**
|
||||
- **TVNoe**
|
||||
- **tvp**: Telewizja Polska
|
||||
- **tvp:embed**: Telewizja Polska
|
||||
- **tvp:series**
|
||||
|
@@ -100,6 +100,7 @@ class ABCIViewIE(InfoExtractor):
|
||||
IE_NAME = 'abc.net.au:iview'
|
||||
_VALID_URL = r'https?://iview\.abc\.net\.au/programs/[^/]+/(?P<id>[^/?#]+)'
|
||||
|
||||
# ABC iview programs are normally available for 14 days only.
|
||||
_TESTS = [{
|
||||
'url': 'http://iview.abc.net.au/programs/gardening-australia/FA1505V024S00',
|
||||
'md5': '979d10b2939101f0d27a06b79edad536',
|
||||
@@ -112,6 +113,7 @@ class ABCIViewIE(InfoExtractor):
|
||||
'uploader_id': 'abc1',
|
||||
'timestamp': 1471719600,
|
||||
},
|
||||
'skip': 'Video gone',
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
@@ -12,7 +12,7 @@ from ..compat import compat_urlparse
|
||||
|
||||
class AbcNewsVideoIE(AMPIE):
|
||||
IE_NAME = 'abcnews:video'
|
||||
_VALID_URL = 'http://abcnews.go.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://abcnews\.go\.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932',
|
||||
@@ -49,7 +49,7 @@ class AbcNewsVideoIE(AMPIE):
|
||||
|
||||
class AbcNewsIE(InfoExtractor):
|
||||
IE_NAME = 'abcnews'
|
||||
_VALID_URL = 'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
|
||||
|
@@ -1,13 +1,19 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import parse_iso8601
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
)
|
||||
|
||||
|
||||
class Abc7NewsIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://abc7news\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
|
||||
class ABCOTVSIE(InfoExtractor):
|
||||
IE_NAME = 'abcotvs'
|
||||
IE_DESC = 'ABC Owned Television Stations'
|
||||
_VALID_URL = r'https?://(?:abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
|
||||
@@ -15,7 +21,7 @@ class Abc7NewsIE(InfoExtractor):
|
||||
'id': '472581',
|
||||
'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
|
||||
'ext': 'mp4',
|
||||
'title': 'East Bay museum celebrates history of synthesized music',
|
||||
'title': 'East Bay museum celebrates vintage synthesizers',
|
||||
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'timestamp': 1421123075,
|
||||
@@ -41,7 +47,7 @@ class Abc7NewsIE(InfoExtractor):
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
m3u8 = self._html_search_meta(
|
||||
'contentURL', webpage, 'm3u8 url', fatal=True)
|
||||
'contentURL', webpage, 'm3u8 url', fatal=True).split('?')[0]
|
||||
|
||||
formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
|
||||
self._sort_formats(formats)
|
||||
@@ -66,3 +72,41 @@ class Abc7NewsIE(InfoExtractor):
|
||||
'uploader': uploader,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class ABCOTVSClipsIE(InfoExtractor):
|
||||
IE_NAME = 'abcotvs:clips'
|
||||
_VALID_URL = r'https?://clips\.abcotvs\.com/(?:[^/]+/)*video/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'https://clips.abcotvs.com/kabc/video/214814',
|
||||
'info_dict': {
|
||||
'id': '214814',
|
||||
'ext': 'mp4',
|
||||
'title': 'SpaceX launch pad explosion destroys rocket, satellite',
|
||||
'description': 'md5:9f186e5ad8f490f65409965ee9c7be1b',
|
||||
'upload_date': '20160901',
|
||||
'timestamp': 1472756695,
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
video_data = self._download_json('https://clips.abcotvs.com/vogo/video/getByIds?ids=' + video_id, video_id)['results'][0]
|
||||
title = video_data['title']
|
||||
formats = self._extract_m3u8_formats(
|
||||
video_data['videoURL'].split('?')[0], video_id, 'mp4')
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': video_data.get('description'),
|
||||
'thumbnail': video_data.get('thumbnailURL'),
|
||||
'duration': int_or_none(video_data.get('duration')),
|
||||
'timestamp': int_or_none(video_data.get('pubDate')),
|
||||
'formats': formats,
|
||||
}
|
@@ -238,7 +238,7 @@ class ARDMediathekIE(InfoExtractor):
|
||||
|
||||
|
||||
class ARDIE(InfoExtractor):
|
||||
_VALID_URL = '(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
|
||||
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
|
||||
_TEST = {
|
||||
'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
|
||||
'md5': 'd216c3a86493f9322545e045ddc3eb35',
|
||||
|
@@ -10,11 +10,12 @@ from ..utils import (
|
||||
int_or_none,
|
||||
float_or_none,
|
||||
unified_timestamp,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class BiliBiliIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/v/)(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.bilibili.tv/video/av1074402/',
|
||||
@@ -77,6 +78,17 @@ class BiliBiliIE(InfoExtractor):
|
||||
'skip_download': True,
|
||||
},
|
||||
'expected_warnings': ['upload time'],
|
||||
}, {
|
||||
'url': 'http://bangumi.bilibili.com/anime/v/40068',
|
||||
'md5': '08d539a0884f3deb7b698fb13ba69696',
|
||||
'info_dict': {
|
||||
'id': '40068',
|
||||
'ext': 'mp4',
|
||||
'duration': 1402.357,
|
||||
'title': '混沌武士 : 第7集 四面楚歌 A Risky Racket',
|
||||
'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
|
||||
'thumbnail': 're:^http?://.+\.jpg',
|
||||
},
|
||||
}]
|
||||
|
||||
_APP_KEY = '6f90a59ac58a4123'
|
||||
@@ -84,13 +96,19 @@ class BiliBiliIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
cid = compat_parse_qs(self._search_regex(
|
||||
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
|
||||
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
|
||||
webpage, 'player parameters'))['cid'][0]
|
||||
if 'anime/v' not in url:
|
||||
cid = compat_parse_qs(self._search_regex(
|
||||
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
|
||||
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
|
||||
webpage, 'player parameters'))['cid'][0]
|
||||
else:
|
||||
js = self._download_json(
|
||||
'http://bangumi.bilibili.com/web_api/get_source', video_id,
|
||||
data=urlencode_postdata({'episode_id': video_id}),
|
||||
headers={'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8'})
|
||||
cid = js['result']['cid']
|
||||
|
||||
payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
|
||||
sign = hashlib.md5((payload + self._BILIBILI_KEY).encode('utf-8')).hexdigest()
|
||||
@@ -125,6 +143,7 @@ class BiliBiliIE(InfoExtractor):
|
||||
description = self._html_search_meta('description', webpage)
|
||||
timestamp = unified_timestamp(self._html_search_regex(
|
||||
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', fatal=False))
|
||||
thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)
|
||||
|
||||
# TODO 'view_count' requires deobfuscating Javascript
|
||||
info = {
|
||||
@@ -132,7 +151,7 @@ class BiliBiliIE(InfoExtractor):
|
||||
'title': title,
|
||||
'description': description,
|
||||
'timestamp': timestamp,
|
||||
'thumbnail': self._html_search_meta('thumbnailUrl', webpage),
|
||||
'thumbnail': thumbnail,
|
||||
'duration': float_or_none(video_info.get('timelength'), scale=1000),
|
||||
}
|
||||
|
||||
|
@@ -23,6 +23,7 @@ class CanalplusIE(InfoExtractor):
|
||||
(?:(?:www|m)\.)?canalplus\.fr|
|
||||
(?:www\.)?piwiplus\.fr|
|
||||
(?:www\.)?d8\.tv|
|
||||
(?:www\.)?c8\.fr|
|
||||
(?:www\.)?d17\.tv|
|
||||
(?:www\.)?itele\.fr
|
||||
)/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
|
||||
@@ -35,6 +36,7 @@ class CanalplusIE(InfoExtractor):
|
||||
'canalplus': 'cplus',
|
||||
'piwiplus': 'teletoon',
|
||||
'd8': 'd8',
|
||||
'c8': 'd8',
|
||||
'd17': 'd17',
|
||||
'itele': 'itele',
|
||||
}
|
||||
|
@@ -1,11 +1,13 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import float_or_none
|
||||
|
||||
|
||||
class CanvasIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?canvas\.be/video/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<site_id>canvas|een)\.be/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.canvas.be/video/de-afspraak/najaar-2015/de-afspraak-veilt-voor-de-warmste-week',
|
||||
'md5': 'ea838375a547ac787d4064d8c7860a6c',
|
||||
@@ -38,22 +40,42 @@ class CanvasIE(InfoExtractor):
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
}
|
||||
}, {
|
||||
'url': 'https://www.een.be/sorry-voor-alles/herbekijk-sorry-voor-alles',
|
||||
'info_dict': {
|
||||
'id': 'mz-ast-11a587f8-b921-4266-82e2-0bce3e80d07f',
|
||||
'display_id': 'herbekijk-sorry-voor-alles',
|
||||
'ext': 'mp4',
|
||||
'title': 'Herbekijk Sorry voor alles',
|
||||
'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'duration': 3788.06,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
}
|
||||
}, {
|
||||
'url': 'https://www.canvas.be/check-point/najaar-2016/de-politie-uw-vriend',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
site_id, display_id = mobj.group('site_id'), mobj.group('id')
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
title = self._search_regex(
|
||||
title = (self._search_regex(
|
||||
r'<h1[^>]+class="video__body__header__title"[^>]*>(.+?)</h1>',
|
||||
webpage, 'title', default=None) or self._og_search_title(webpage)
|
||||
webpage, 'title', default=None) or self._og_search_title(
|
||||
webpage)).strip()
|
||||
|
||||
video_id = self._html_search_regex(
|
||||
r'data-video=(["\'])(?P<id>.+?)\1', webpage, 'video id', group='id')
|
||||
|
||||
data = self._download_json(
|
||||
'https://mediazone.vrt.be/api/v1/canvas/assets/%s' % video_id, display_id)
|
||||
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
|
||||
% (site_id, video_id), display_id)
|
||||
|
||||
formats = []
|
||||
for target in data['targetUrls']:
|
||||
|
@@ -30,7 +30,7 @@ class CartoonNetworkIE(TurnerBaseIE):
|
||||
return self._extract_cvp_info(
|
||||
'http://www.cartoonnetwork.com/video-seo-svc/episodeservices/getCvpPlaylist?networkName=CN2&' + query, video_id, {
|
||||
'secure': {
|
||||
'media_src': 'http://apple-secure.cdn.turner.com/toon/big',
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/toon/big',
|
||||
'tokenizer_src': 'http://www.cartoonnetwork.com/cntv/mvpd/processors/services/token_ipadAdobe.do',
|
||||
},
|
||||
})
|
||||
|
53
youtube_dl/extractor/cctv.py
Normal file
53
youtube_dl/extractor/cctv.py
Normal file
@@ -0,0 +1,53 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import float_or_none
|
||||
|
||||
|
||||
class CCTVIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)https?://(?:.+?\.)?
|
||||
(?:
|
||||
cctv\.(?:com|cn)|
|
||||
cntv\.cn
|
||||
)/
|
||||
(?:
|
||||
video/[^/]+/(?P<id>[0-9a-f]{32})|
|
||||
\d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
|
||||
)'''
|
||||
_TESTS = [{
|
||||
'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
|
||||
'md5': '819c7b49fc3927d529fb4cd555621823',
|
||||
'info_dict': {
|
||||
'id': '454368eb19ad44a1925bf1eb96140a61',
|
||||
'ext': 'mp4',
|
||||
'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
|
||||
'only_matching': True
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id, display_id = re.match(self._VALID_URL, url).groups()
|
||||
if not video_id:
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_id = self._search_regex(
|
||||
r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
|
||||
webpage, 'video_id')
|
||||
api_data = self._download_json(
|
||||
'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
|
||||
m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': api_data['title'],
|
||||
'formats': self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
|
||||
'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
|
||||
}
|
@@ -394,7 +394,7 @@ class DailymotionUserIE(DailymotionPlaylistIE):
|
||||
|
||||
|
||||
class DailymotionCloudIE(DailymotionBaseInfoExtractor):
|
||||
_VALID_URL_PREFIX = r'http://api\.dmcloud\.net/(?:player/)?embed/'
|
||||
_VALID_URL_PREFIX = r'https?://api\.dmcloud\.net/(?:player/)?embed/'
|
||||
_VALID_URL = r'%s[^/]+/(?P<id>[^/?]+)' % _VALID_URL_PREFIX
|
||||
_VALID_EMBED_URL = r'%s[^/]+/[^\'"]+' % _VALID_URL_PREFIX
|
||||
|
||||
|
@@ -5,11 +5,14 @@ from .abc import (
|
||||
ABCIE,
|
||||
ABCIViewIE,
|
||||
)
|
||||
from .abc7news import Abc7NewsIE
|
||||
from .abcnews import (
|
||||
AbcNewsIE,
|
||||
AbcNewsVideoIE,
|
||||
)
|
||||
from .abcotvs import (
|
||||
ABCOTVSIE,
|
||||
ABCOTVSClipsIE,
|
||||
)
|
||||
from .academicearth import AcademicEarthCourseIE
|
||||
from .acast import (
|
||||
ACastIE,
|
||||
@@ -143,6 +146,7 @@ from .cbsnews import (
|
||||
)
|
||||
from .cbssports import CBSSportsIE
|
||||
from .ccc import CCCIE
|
||||
from .cctv import CCTVIE
|
||||
from .cda import CDAIE
|
||||
from .ceskatelevize import CeskaTelevizeIE
|
||||
from .channel9 import Channel9IE
|
||||
@@ -289,6 +293,7 @@ from .fox import FOXIE
|
||||
from .foxgay import FoxgayIE
|
||||
from .foxnews import (
|
||||
FoxNewsIE,
|
||||
FoxNewsArticleIE,
|
||||
FoxNewsInsiderIE,
|
||||
)
|
||||
from .foxsports import FoxSportsIE
|
||||
@@ -391,6 +396,7 @@ from .ivi import (
|
||||
IviCompilationIE
|
||||
)
|
||||
from .ivideon import IvideonIE
|
||||
from .iwara import IwaraIE
|
||||
from .izlesene import IzleseneIE
|
||||
from .jeuxvideo import JeuxVideoIE
|
||||
from .jove import JoveIE
|
||||
@@ -403,6 +409,7 @@ from .kankan import KankanIE
|
||||
from .karaoketv import KaraoketvIE
|
||||
from .karrierevideos import KarriereVideosIE
|
||||
from .keezmovies import KeezMoviesIE
|
||||
from .ketnet import KetnetIE
|
||||
from .khanacademy import KhanAcademyIE
|
||||
from .kickstarter import KickStarterIE
|
||||
from .keek import KeekIE
|
||||
@@ -421,6 +428,7 @@ from .kuwo import (
|
||||
)
|
||||
from .la7 import LA7IE
|
||||
from .laola1tv import Laola1TvIE
|
||||
from .lci import LCIIE
|
||||
from .lcp import (
|
||||
LcpPlayIE,
|
||||
LcpIE,
|
||||
@@ -471,6 +479,7 @@ from .metacafe import MetacafeIE
|
||||
from .metacritic import MetacriticIE
|
||||
from .mgoon import MgoonIE
|
||||
from .mgtv import MGTVIE
|
||||
from .miaopai import MiaoPaiIE
|
||||
from .microsoftvirtualacademy import (
|
||||
MicrosoftVirtualAcademyIE,
|
||||
MicrosoftVirtualAcademyCourseIE,
|
||||
@@ -664,7 +673,10 @@ from .pluralsight import (
|
||||
)
|
||||
from .podomatic import PodomaticIE
|
||||
from .pokemon import PokemonIE
|
||||
from .polskieradio import PolskieRadioIE
|
||||
from .polskieradio import (
|
||||
PolskieRadioIE,
|
||||
PolskieRadioCategoryIE,
|
||||
)
|
||||
from .porn91 import Porn91IE
|
||||
from .porncom import PornComIE
|
||||
from .pornhd import PornHdIE
|
||||
@@ -718,6 +730,7 @@ from .revision3 import (
|
||||
)
|
||||
from .rice import RICEIE
|
||||
from .ringtv import RingTVIE
|
||||
from .rmcdecouverte import RMCDecouverteIE
|
||||
from .ro220 import Ro220IE
|
||||
from .rockstargames import RockstarGamesIE
|
||||
from .roosterteeth import RoosterTeethIE
|
||||
@@ -854,10 +867,12 @@ from .telebruxelles import TeleBruxellesIE
|
||||
from .telecinco import TelecincoIE
|
||||
from .telegraaf import TelegraafIE
|
||||
from .telemb import TeleMBIE
|
||||
from .telequebec import TeleQuebecIE
|
||||
from .teletask import TeleTaskIE
|
||||
from .telewebion import TelewebionIE
|
||||
from .testurl import TestURLIE
|
||||
from .tf1 import TF1IE
|
||||
from .tfo import TFOIE
|
||||
from .theintercept import TheInterceptIE
|
||||
from .theplatform import (
|
||||
ThePlatformIE,
|
||||
@@ -886,7 +901,7 @@ from .toutv import TouTvIE
|
||||
from .toypics import ToypicsUserIE, ToypicsIE
|
||||
from .traileraddict import TrailerAddictIE
|
||||
from .trilulilu import TriluliluIE
|
||||
from .trollvids import TrollvidsIE
|
||||
from .trutv import TruTVIE
|
||||
from .tube8 import Tube8IE
|
||||
from .tubitv import TubiTvIE
|
||||
from .tudou import (
|
||||
@@ -916,6 +931,7 @@ from .tvc import (
|
||||
)
|
||||
from .tvigle import TvigleIE
|
||||
from .tvland import TVLandIE
|
||||
from .tvnoe import TVNoeIE
|
||||
from .tvp import (
|
||||
TVPEmbedIE,
|
||||
TVPIE,
|
||||
|
@@ -1,18 +1,24 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import itertools
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
get_element_by_id,
|
||||
remove_end,
|
||||
)
|
||||
|
||||
|
||||
class FoxgayIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml'
|
||||
_TEST = {
|
||||
'url': 'http://foxgay.com/videos/fuck-turkish-style-2582.shtml',
|
||||
'md5': '80d72beab5d04e1655a56ad37afe6841',
|
||||
'md5': '344558ccfea74d33b7adbce22e577f54',
|
||||
'info_dict': {
|
||||
'id': '2582',
|
||||
'ext': 'mp4',
|
||||
'title': 'md5:6122f7ae0fc6b21ebdf59c5e083ce25a',
|
||||
'description': 'md5:5e51dc4405f1fd315f7927daed2ce5cf',
|
||||
'title': 'Fuck Turkish-style',
|
||||
'description': 'md5:6ae2d9486921891efe89231ace13ffdf',
|
||||
'age_limit': 18,
|
||||
'thumbnail': 're:https?://.*\.jpg$',
|
||||
},
|
||||
@@ -22,27 +28,35 @@ class FoxgayIE(InfoExtractor):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<title>(?P<title>.*?)</title>',
|
||||
webpage, 'title', fatal=False)
|
||||
description = self._html_search_regex(
|
||||
r'<div class="ico_desc"><h2>(?P<description>.*?)</h2>',
|
||||
webpage, 'description', fatal=False)
|
||||
title = remove_end(self._html_search_regex(
|
||||
r'<title>([^<]+)</title>', webpage, 'title'), ' - Foxgay.com')
|
||||
description = get_element_by_id('inf_tit', webpage)
|
||||
|
||||
# The default user-agent with foxgay cookies leads to pages without videos
|
||||
self._downloader.cookiejar.clear('.foxgay.com')
|
||||
# Find the URL for the iFrame which contains the actual video.
|
||||
iframe_url = self._html_search_regex(
|
||||
r'<iframe[^>]+src=([\'"])(?P<url>[^\'"]+)\1', webpage,
|
||||
'video frame', group='url')
|
||||
iframe = self._download_webpage(
|
||||
self._html_search_regex(r'iframe src="(?P<frame>.*?)"', webpage, 'video frame'),
|
||||
video_id)
|
||||
video_url = self._html_search_regex(
|
||||
r"v_path = '(?P<vid>http://.*?)'", iframe, 'url')
|
||||
thumb_url = self._html_search_regex(
|
||||
r"t_path = '(?P<thumb>http://.*?)'", iframe, 'thumbnail', fatal=False)
|
||||
iframe_url, video_id, headers={'User-Agent': 'curl/7.50.1'},
|
||||
note='Downloading video frame')
|
||||
video_data = self._parse_json(self._search_regex(
|
||||
r'video_data\s*=\s*([^;]+);', iframe, 'video data'), video_id)
|
||||
|
||||
formats = [{
|
||||
'url': source,
|
||||
'height': resolution,
|
||||
} for source, resolution in zip(
|
||||
video_data['sources'], video_data.get('resolutions', itertools.repeat(None)))]
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'url': video_url,
|
||||
'formats': formats,
|
||||
'description': description,
|
||||
'thumbnail': thumb_url,
|
||||
'thumbnail': video_data.get('act_vid', {}).get('thumb'),
|
||||
'age_limit': 18,
|
||||
}
|
||||
|
@@ -7,6 +7,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class FoxNewsIE(AMPIE):
|
||||
IE_NAME = 'foxnews'
|
||||
IE_DESC = 'Fox News and Fox Business Video'
|
||||
_VALID_URL = r'https?://(?P<host>video\.(?:insider\.)?fox(?:news|business)\.com)/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)'
|
||||
_TESTS = [
|
||||
@@ -66,6 +67,35 @@ class FoxNewsIE(AMPIE):
|
||||
return info
|
||||
|
||||
|
||||
class FoxNewsArticleIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?foxnews\.com/(?!v)([^/]+/)+(?P<id>[a-z-]+)'
|
||||
IE_NAME = 'foxnews:article'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.foxnews.com/politics/2016/09/08/buzz-about-bud-clinton-camp-denies-claims-wore-earpiece-at-forum.html',
|
||||
'md5': '62aa5a781b308fdee212ebb6f33ae7ef',
|
||||
'info_dict': {
|
||||
'id': '5116295019001',
|
||||
'ext': 'mp4',
|
||||
'title': 'Trump and Clinton asked to defend positions on Iraq War',
|
||||
'description': 'Veterans react on \'The Kelly File\'',
|
||||
'timestamp': 1473299755,
|
||||
'upload_date': '20160908',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
video_id = self._html_search_regex(
|
||||
r'data-video-id=([\'"])(?P<id>[^\'"]+)\1',
|
||||
webpage, 'video ID', group='id')
|
||||
return self.url_result(
|
||||
'http://video.foxnews.com/v/' + video_id,
|
||||
FoxNewsIE.ie_key())
|
||||
|
||||
|
||||
class FoxNewsInsiderIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://insider\.foxnews\.com/([^/]+/)+(?P<id>[a-z-]+)'
|
||||
IE_NAME = 'foxnews:insider'
|
||||
@@ -83,6 +113,10 @@ class FoxNewsInsiderIE(InfoExtractor):
|
||||
'upload_date': '20160825',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': [FoxNewsIE.ie_key()],
|
||||
}
|
||||
|
||||
|
@@ -1,14 +1,10 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
str_to_int,
|
||||
unified_strdate,
|
||||
remove_end,
|
||||
)
|
||||
|
||||
|
||||
@@ -21,8 +17,9 @@ class GameStarIE(InfoExtractor):
|
||||
'id': '76110',
|
||||
'ext': 'mp4',
|
||||
'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
|
||||
'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den vollständigen Trailer an.',
|
||||
'thumbnail': 'http://images.gamestar.de/images/idgwpgsgp/bdb/2494525/600x.jpg',
|
||||
'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'timestamp': 1406542020,
|
||||
'upload_date': '20140728',
|
||||
'duration': 17
|
||||
}
|
||||
@@ -32,41 +29,27 @@ class GameStarIE(InfoExtractor):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
og_title = self._og_search_title(webpage)
|
||||
title = re.sub(r'\s*- Video (bei|-) GameStar\.de$', '', og_title)
|
||||
|
||||
url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id
|
||||
|
||||
description = self._og_search_description(webpage).strip()
|
||||
|
||||
thumbnail = self._proto_relative_url(
|
||||
self._og_search_thumbnail(webpage), scheme='http:')
|
||||
|
||||
upload_date = unified_strdate(self._html_search_regex(
|
||||
r'<span style="float:left;font-size:11px;">Datum: ([0-9]+\.[0-9]+\.[0-9]+) ',
|
||||
webpage, 'upload_date', fatal=False))
|
||||
|
||||
duration = parse_duration(self._html_search_regex(
|
||||
r' Länge: ([0-9]+:[0-9]+)</span>', webpage, 'duration',
|
||||
fatal=False))
|
||||
|
||||
view_count = str_to_int(self._html_search_regex(
|
||||
r' Zuschauer: ([0-9\.]+) ', webpage,
|
||||
'view_count', fatal=False))
|
||||
# TODO: there are multiple ld+json objects in the webpage,
|
||||
# while _search_json_ld finds only the first one
|
||||
json_ld = self._parse_json(self._search_regex(
|
||||
r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>[^<]+VideoObject[^<]+)</script>',
|
||||
webpage, 'JSON-LD', group='json_ld'), video_id)
|
||||
info_dict = self._json_ld(json_ld, video_id)
|
||||
info_dict['title'] = remove_end(info_dict['title'], ' - GameStar')
|
||||
|
||||
view_count = json_ld.get('interactionCount')
|
||||
comment_count = int_or_none(self._html_search_regex(
|
||||
r'>Kommentieren \(([0-9]+)\)</a>', webpage, 'comment_count',
|
||||
r'([0-9]+) Kommentare</span>', webpage, 'comment_count',
|
||||
fatal=False))
|
||||
|
||||
return {
|
||||
info_dict.update({
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'url': url,
|
||||
'ext': 'mp4',
|
||||
'thumbnail': thumbnail,
|
||||
'description': description,
|
||||
'upload_date': upload_date,
|
||||
'duration': duration,
|
||||
'view_count': view_count,
|
||||
'comment_count': comment_count
|
||||
}
|
||||
})
|
||||
|
||||
return info_dict
|
||||
|
@@ -19,7 +19,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class GloboIE(InfoExtractor):
|
||||
_VALID_URL = '(?:globo:|https?://.+?\.globo\.com/(?:[^/]+/)*(?:v/(?:[^/]+/)?|videos/))(?P<id>\d{7,})'
|
||||
_VALID_URL = r'(?:globo:|https?://.+?\.globo\.com/(?:[^/]+/)*(?:v/(?:[^/]+/)?|videos/))(?P<id>\d{7,})'
|
||||
|
||||
_API_URL_TEMPLATE = 'http://api.globovideos.com/videos/%s/playlist'
|
||||
_SECURITY_URL_TEMPLATE = 'http://security.video.globo.com/videos/%s/hash?player=flash&version=17.0.0.132&resource_id=%s'
|
||||
@@ -396,7 +396,7 @@ class GloboIE(InfoExtractor):
|
||||
|
||||
|
||||
class GloboArticleIE(InfoExtractor):
|
||||
_VALID_URL = 'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/]+)(?:\.html)?'
|
||||
_VALID_URL = r'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/]+)(?:\.html)?'
|
||||
|
||||
_VIDEOID_REGEXES = [
|
||||
r'\bdata-video-id=["\'](\d{7,})',
|
||||
|
77
youtube_dl/extractor/iwara.py
Normal file
77
youtube_dl/extractor/iwara.py
Normal file
@@ -0,0 +1,77 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urllib_parse_urlparse
|
||||
from ..utils import remove_end
|
||||
|
||||
|
||||
class IwaraIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.|ecchi\.)?iwara\.tv/videos/(?P<id>[a-zA-Z0-9]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://iwara.tv/videos/amVwUl1EHpAD9RD',
|
||||
'md5': '1d53866b2c514b23ed69e4352fdc9839',
|
||||
'info_dict': {
|
||||
'id': 'amVwUl1EHpAD9RD',
|
||||
'ext': 'mp4',
|
||||
'title': '【MMD R-18】ガールフレンド carry_me_off',
|
||||
'age_limit': 18,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://ecchi.iwara.tv/videos/Vb4yf2yZspkzkBO',
|
||||
'md5': '7e5f1f359cd51a027ba4a7b7710a50f0',
|
||||
'info_dict': {
|
||||
'id': '0B1LvuHnL-sRFNXB1WHNqbGw4SXc',
|
||||
'ext': 'mp4',
|
||||
'title': '[3D Hentai] Kyonyu Ã\x97 Genkai Ã\x97 Emaki Shinobi Girls.mp4',
|
||||
'age_limit': 18,
|
||||
},
|
||||
'add_ie': ['GoogleDrive'],
|
||||
}, {
|
||||
'url': 'http://www.iwara.tv/videos/nawkaumd6ilezzgq',
|
||||
'md5': '1d85f1e5217d2791626cff5ec83bb189',
|
||||
'info_dict': {
|
||||
'id': '6liAP9s2Ojc',
|
||||
'ext': 'mp4',
|
||||
'age_limit': 0,
|
||||
'title': '[MMD] Do It Again Ver.2 [1080p 60FPS] (Motion,Camera,Wav+DL)',
|
||||
'description': 'md5:590c12c0df1443d833fbebe05da8c47a',
|
||||
'upload_date': '20160910',
|
||||
'uploader': 'aMMDsork',
|
||||
'uploader_id': 'UCVOFyOSCyFkXTYYHITtqB7A',
|
||||
},
|
||||
'add_ie': ['Youtube'],
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage, urlh = self._download_webpage_handle(url, video_id)
|
||||
|
||||
hostname = compat_urllib_parse_urlparse(urlh.geturl()).hostname
|
||||
# ecchi is 'sexy' in Japanese
|
||||
age_limit = 18 if hostname.split('.')[0] == 'ecchi' else 0
|
||||
|
||||
entries = self._parse_html5_media_entries(url, webpage, video_id)
|
||||
|
||||
if not entries:
|
||||
iframe_url = self._html_search_regex(
|
||||
r'<iframe[^>]+src=([\'"])(?P<url>[^\'"]+)\1',
|
||||
webpage, 'iframe URL', group='url')
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': iframe_url,
|
||||
'age_limit': age_limit,
|
||||
}
|
||||
|
||||
title = remove_end(self._html_search_regex(
|
||||
r'<title>([^<]+)</title>', webpage, 'title'), ' | Iwara')
|
||||
|
||||
info_dict = entries[0]
|
||||
info_dict.update({
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'age_limit': age_limit,
|
||||
})
|
||||
|
||||
return info_dict
|
@@ -63,10 +63,17 @@ class JWPlatformBaseIE(InfoExtractor):
|
||||
'ext': ext,
|
||||
})
|
||||
else:
|
||||
height = int_or_none(source.get('height'))
|
||||
if height is None:
|
||||
# Often no height is provided but there is a label in
|
||||
# format like 1080p.
|
||||
height = int_or_none(self._search_regex(
|
||||
r'^(\d{3,})[pP]$', source.get('label') or '',
|
||||
'height', default=None))
|
||||
a_format = {
|
||||
'url': source_url,
|
||||
'width': int_or_none(source.get('width')),
|
||||
'height': int_or_none(source.get('height')),
|
||||
'height': height,
|
||||
'ext': ext,
|
||||
}
|
||||
if source_url.startswith('rtmp'):
|
||||
|
@@ -5,7 +5,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class KaraoketvIE(InfoExtractor):
|
||||
_VALID_URL = r'http://www.karaoketv.co.il/[^/]+/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://www\.karaoketv\.co\.il/[^/]+/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.karaoketv.co.il/%D7%A9%D7%99%D7%A8%D7%99_%D7%A7%D7%A8%D7%99%D7%95%D7%A7%D7%99/58356/%D7%90%D7%99%D7%96%D7%95%D7%9F',
|
||||
'info_dict': {
|
||||
|
52
youtube_dl/extractor/ketnet.py
Normal file
52
youtube_dl/extractor/ketnet.py
Normal file
@@ -0,0 +1,52 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class KetnetIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?ketnet\.be/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.ketnet.be/kijken/zomerse-filmpjes',
|
||||
'md5': 'd907f7b1814ef0fa285c0475d9994ed7',
|
||||
'info_dict': {
|
||||
'id': 'zomerse-filmpjes',
|
||||
'ext': 'mp4',
|
||||
'title': 'Gluur mee op de filmset en op Pennenzakkenrock',
|
||||
'description': 'Gluur mee met Ghost Rockers op de filmset',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
}
|
||||
}, {
|
||||
'url': 'https://www.ketnet.be/kijken/karrewiet/uitzending-8-september-2016',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.ketnet.be/achter-de-schermen/sien-repeteert-voor-stars-for-life',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
config = self._parse_json(
|
||||
self._search_regex(
|
||||
r'(?s)playerConfig\s*=\s*({.+?})\s*;', webpage,
|
||||
'player config'),
|
||||
video_id)
|
||||
|
||||
title = config['title']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
config['source']['hls'], video_id, 'mp4',
|
||||
entry_protocol='m3u8_native', m3u8_id='hls')
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': config.get('description'),
|
||||
'thumbnail': config.get('image'),
|
||||
'series': config.get('program'),
|
||||
'episode': config.get('episode'),
|
||||
'formats': formats,
|
||||
}
|
24
youtube_dl/extractor/lci.py
Normal file
24
youtube_dl/extractor/lci.py
Normal file
@@ -0,0 +1,24 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class LCIIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?lci\.fr/[^/]+/[\w-]+-(?P<id>\d+)\.html'
|
||||
_TEST = {
|
||||
'url': 'http://www.lci.fr/international/etats-unis-a-j-62-hillary-clinton-reste-sans-voix-2001679.html',
|
||||
'md5': '2fdb2538b884d4d695f9bd2bde137e6c',
|
||||
'info_dict': {
|
||||
'id': '13244802',
|
||||
'ext': 'mp4',
|
||||
'title': 'Hillary Clinton et sa quinte de toux, en plein meeting',
|
||||
'description': 'md5:a4363e3a960860132f8124b62f4a01c9',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
wat_id = self._search_regex(r'data-watid=[\'"](\d+)', webpage, 'wat id')
|
||||
return self.url_result('wat:' + wat_id, 'Wat', wat_id)
|
@@ -1,8 +1,11 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
remove_end,
|
||||
@@ -12,8 +15,10 @@ from ..utils import (
|
||||
class LRTIE(InfoExtractor):
|
||||
IE_NAME = 'lrt.lt'
|
||||
_VALID_URL = r'https?://(?:www\.)?lrt\.lt/mediateka/irasas/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
# m3u8 download
|
||||
'url': 'http://www.lrt.lt/mediateka/irasas/54391/',
|
||||
'md5': 'fe44cf7e4ab3198055f2c598fc175cb0',
|
||||
'info_dict': {
|
||||
'id': '54391',
|
||||
'ext': 'mp4',
|
||||
@@ -23,20 +28,45 @@ class LRTIE(InfoExtractor):
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # m3u8 download
|
||||
}, {
|
||||
# direct mp3 download
|
||||
'url': 'http://www.lrt.lt/mediateka/irasas/1013074524/',
|
||||
'md5': '389da8ca3cad0f51d12bed0c844f6a0a',
|
||||
'info_dict': {
|
||||
'id': '1013074524',
|
||||
'ext': 'mp3',
|
||||
'title': 'Kita tema 2016-09-05 15:05',
|
||||
'description': 'md5:1b295a8fc7219ed0d543fc228c931fb5',
|
||||
'duration': 3008,
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
},
|
||||
}
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
title = remove_end(self._og_search_title(webpage), ' - LRT')
|
||||
m3u8_url = self._search_regex(
|
||||
r'file\s*:\s*(["\'])(?P<url>.+?)\1\s*\+\s*location\.hash\.substring\(1\)',
|
||||
webpage, 'm3u8 url', group='url')
|
||||
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
|
||||
|
||||
formats = []
|
||||
for _, file_url in re.findall(
|
||||
r'file\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage):
|
||||
ext = determine_ext(file_url)
|
||||
if ext not in ('m3u8', 'mp3'):
|
||||
continue
|
||||
# mp3 served as m3u8 produces stuttered media file
|
||||
if ext == 'm3u8' and '.mp3' in file_url:
|
||||
continue
|
||||
if ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
file_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
fatal=False))
|
||||
elif ext == 'mp3':
|
||||
formats.append({
|
||||
'url': file_url,
|
||||
'vcodec': 'none',
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
|
40
youtube_dl/extractor/miaopai.py
Normal file
40
youtube_dl/extractor/miaopai.py
Normal file
@@ -0,0 +1,40 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class MiaoPaiIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?miaopai\.com/show/(?P<id>[-A-Za-z0-9~_]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.miaopai.com/show/n~0hO7sfV1nBEw4Y29-Hqg__.htm',
|
||||
'md5': '095ed3f1cd96b821add957bdc29f845b',
|
||||
'info_dict': {
|
||||
'id': 'n~0hO7sfV1nBEw4Y29-Hqg__',
|
||||
'ext': 'mp4',
|
||||
'title': '西游记音乐会的秒拍视频',
|
||||
'thumbnail': 're:^https?://.*/n~0hO7sfV1nBEw4Y29-Hqg___m.jpg',
|
||||
}
|
||||
}
|
||||
|
||||
_USER_AGENT_IPAD = 'Mozilla/5.0 (iPad; CPU OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1'
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
url, video_id, headers={'User-Agent': self._USER_AGENT_IPAD})
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<title>([^<]+)</title>', webpage, 'title')
|
||||
thumbnail = self._html_search_regex(
|
||||
r'<div[^>]+class=(?P<q1>[\'"]).*\bvideo_img\b.*(?P=q1)[^>]+data-url=(?P<q2>[\'"])(?P<url>[^\'"]+)(?P=q2)',
|
||||
webpage, 'thumbnail', fatal=False, group='url')
|
||||
videos = self._parse_html5_media_entries(url, webpage, video_id)
|
||||
info = videos[0]
|
||||
|
||||
info.update({
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'thumbnail': thumbnail,
|
||||
})
|
||||
return info
|
@@ -35,7 +35,8 @@ class MoeVideoIE(InfoExtractor):
|
||||
'height': 360,
|
||||
'duration': 179,
|
||||
'filesize': 17822500,
|
||||
}
|
||||
},
|
||||
'skip': 'Video has been removed',
|
||||
},
|
||||
{
|
||||
'url': 'http://playreplay.net/video/77107.7f325710a627383d40540d8e991a',
|
||||
|
@@ -1,15 +1,12 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class NewgroundsIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?newgrounds\.com/(?:audio/listen|portal/view)/(?P<id>[0-9]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.newgrounds.com/audio/listen/549479',
|
||||
'url': 'https://www.newgrounds.com/audio/listen/549479',
|
||||
'md5': 'fe6033d297591288fa1c1f780386f07a',
|
||||
'info_dict': {
|
||||
'id': '549479',
|
||||
@@ -18,7 +15,7 @@ class NewgroundsIE(InfoExtractor):
|
||||
'uploader': 'Burn7',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.newgrounds.com/portal/view/673111',
|
||||
'url': 'https://www.newgrounds.com/portal/view/673111',
|
||||
'md5': '3394735822aab2478c31b1004fe5e5bc',
|
||||
'info_dict': {
|
||||
'id': '673111',
|
||||
@@ -29,24 +26,20 @@ class NewgroundsIE(InfoExtractor):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
music_id = mobj.group('id')
|
||||
webpage = self._download_webpage(url, music_id)
|
||||
media_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, media_id)
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<title>([^>]+)</title>', webpage, 'title')
|
||||
|
||||
uploader = self._html_search_regex(
|
||||
[r',"artist":"([^"]+)",', r'[\'"]owner[\'"]\s*:\s*[\'"]([^\'"]+)[\'"],'],
|
||||
webpage, 'uploader')
|
||||
r'Author\s*<a[^>]+>([^<]+)', webpage, 'uploader', fatal=False)
|
||||
|
||||
music_url_json_string = self._html_search_regex(
|
||||
r'({"url":"[^"]+"),', webpage, 'music url') + '}'
|
||||
music_url_json = json.loads(music_url_json_string)
|
||||
music_url = music_url_json['url']
|
||||
music_url = self._parse_json(self._search_regex(
|
||||
r'"url":("[^"]+"),', webpage, ''), media_id)
|
||||
|
||||
return {
|
||||
'id': music_id,
|
||||
'id': media_id,
|
||||
'title': title,
|
||||
'url': music_url,
|
||||
'uploader': uploader,
|
||||
|
@@ -69,13 +69,16 @@ class NickIE(MTVServicesInfoExtractor):
|
||||
|
||||
class NickDeIE(MTVServicesInfoExtractor):
|
||||
IE_NAME = 'nick.de'
|
||||
_VALID_URL = r'https?://(?:www\.)?nick\.de/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:nick\.de|nickelodeon\.nl)/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.nick.de/playlist/3773-top-videos/videos/episode/17306-zu-wasser-und-zu-land-rauchende-erdnusse',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.nick.de/shows/342-icarly',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.nickelodeon.nl/shows/474-spongebob/videos/17403-een-kijkje-in-de-keuken-met-sandy-van-binnenuit',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
@@ -44,7 +44,20 @@ class NineNowIE(InfoExtractor):
|
||||
page_data = self._parse_json(self._search_regex(
|
||||
r'window\.__data\s*=\s*({.*?});', webpage,
|
||||
'page data'), display_id)
|
||||
common_data = page_data.get('episode', {}).get('episode') or page_data.get('clip', {}).get('clip')
|
||||
|
||||
for kind in ('episode', 'clip'):
|
||||
current_key = page_data.get(kind, {}).get(
|
||||
'current%sKey' % kind.capitalize())
|
||||
if not current_key:
|
||||
continue
|
||||
cache = page_data.get(kind, {}).get('%sCache' % kind, {})
|
||||
if not cache:
|
||||
continue
|
||||
common_data = (cache.get(current_key) or list(cache.values())[0])[kind]
|
||||
break
|
||||
else:
|
||||
raise ExtractorError('Unable to find video data')
|
||||
|
||||
video_data = common_data['video']
|
||||
|
||||
if video_data.get('drm'):
|
||||
|
@@ -90,7 +90,7 @@ class OnetBaseIE(InfoExtractor):
|
||||
|
||||
|
||||
class OnetIE(OnetBaseIE):
|
||||
_VALID_URL = 'https?://(?:www\.)?onet\.tv/[a-z]/[a-z]+/(?P<display_id>[0-9a-z-]+)/(?P<id>[0-9a-z]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?onet\.tv/[a-z]/[a-z]+/(?P<display_id>[0-9a-z-]+)/(?P<id>[0-9a-z]+)'
|
||||
IE_NAME = 'onet.tv'
|
||||
|
||||
_TEST = {
|
||||
|
@@ -60,7 +60,7 @@ class OpenloadIE(InfoExtractor):
|
||||
if j >= 33 and j <= 126:
|
||||
j = ((j + 14) % 94) + 33
|
||||
if idx == len(enc_data) - 1:
|
||||
j += 1
|
||||
j += 3
|
||||
video_url_chars += compat_chr(j)
|
||||
|
||||
video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
|
||||
|
@@ -1,53 +1,40 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class ParliamentLiveUKIE(InfoExtractor):
|
||||
IE_NAME = 'parliamentlive.tv'
|
||||
IE_DESC = 'UK parliament videos'
|
||||
_VALID_URL = r'https?://www\.parliamentlive\.tv/Main/Player\.aspx\?(?:[^&]+&)*?meetingId=(?P<id>[0-9]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?parliamentlive\.tv/Event/Index/(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.parliamentlive.tv/Main/Player.aspx?meetingId=15121&player=windowsmedia',
|
||||
'url': 'http://parliamentlive.tv/Event/Index/c1e9d44d-fd6c-4263-b50f-97ed26cc998b',
|
||||
'info_dict': {
|
||||
'id': '15121',
|
||||
'ext': 'asf',
|
||||
'title': 'hoc home affairs committee, 18 mar 2014.pm',
|
||||
'description': 'md5:033b3acdf83304cd43946b2d5e5798d1',
|
||||
'id': 'c1e9d44d-fd6c-4263-b50f-97ed26cc998b',
|
||||
'ext': 'mp4',
|
||||
'title': 'Home Affairs Committee',
|
||||
'uploader_id': 'FFMPEG-01',
|
||||
'timestamp': 1422696664,
|
||||
'upload_date': '20150131',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # Requires mplayer (mms)
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
asx_url = self._html_search_regex(
|
||||
r'embed.*?src="([^"]+)" name="MediaPlayer"', webpage,
|
||||
'metadata URL')
|
||||
asx = self._download_xml(asx_url, video_id, 'Downloading ASX metadata')
|
||||
video_url = asx.find('.//REF').attrib['HREF']
|
||||
|
||||
title = self._search_regex(
|
||||
r'''(?x)player\.setClipDetails\(
|
||||
(?:(?:[0-9]+|"[^"]+"),\s*){2}
|
||||
"([^"]+",\s*"[^"]+)"
|
||||
''',
|
||||
webpage, 'title').replace('", "', ', ')
|
||||
description = self._html_search_regex(
|
||||
r'(?s)<span id="MainContentPlaceHolder_CaptionsBlock_WitnessInfo">(.*?)</span>',
|
||||
webpage, 'description')
|
||||
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
'http://vodplayer.parliamentlive.tv/?mid=' + video_id, video_id)
|
||||
widget_config = self._parse_json(self._search_regex(
|
||||
r'kWidgetConfig\s*=\s*({.+});',
|
||||
webpage, 'kaltura widget config'), video_id)
|
||||
kaltura_url = 'kaltura:%s:%s' % (widget_config['wid'][1:], widget_config['entry_id'])
|
||||
event_title = self._download_json(
|
||||
'http://parliamentlive.tv/Event/GetShareVideo/' + video_id, video_id)['event']['title']
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': video_id,
|
||||
'ext': 'asf',
|
||||
'url': video_url,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'title': event_title,
|
||||
'description': '',
|
||||
'url': kaltura_url,
|
||||
'ie_key': 'Kaltura',
|
||||
}
|
||||
|
@@ -1,14 +1,17 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import itertools
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
compat_urllib_parse_unquote,
|
||||
compat_urlparse
|
||||
)
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
int_or_none,
|
||||
strip_or_none,
|
||||
unified_timestamp,
|
||||
@@ -97,3 +100,81 @@ class PolskieRadioIE(InfoExtractor):
|
||||
description = strip_or_none(self._og_search_description(webpage))
|
||||
|
||||
return self.playlist_result(entries, playlist_id, title, description)
|
||||
|
||||
|
||||
class PolskieRadioCategoryIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?polskieradio\.pl/\d+(?:,[^/]+)?/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.polskieradio.pl/7/5102,HISTORIA-ZYWA',
|
||||
'info_dict': {
|
||||
'id': '5102',
|
||||
'title': 'HISTORIA ŻYWA',
|
||||
},
|
||||
'playlist_mincount': 38,
|
||||
}, {
|
||||
'url': 'http://www.polskieradio.pl/7/4807',
|
||||
'info_dict': {
|
||||
'id': '4807',
|
||||
'title': 'Vademecum 1050. rocznicy Chrztu Polski'
|
||||
},
|
||||
'playlist_mincount': 5
|
||||
}, {
|
||||
'url': 'http://www.polskieradio.pl/7/129,Sygnaly-dnia?ref=source',
|
||||
'only_matching': True
|
||||
}, {
|
||||
'url': 'http://www.polskieradio.pl/37,RedakcjaKatolicka/4143,Kierunek-Krakow',
|
||||
'info_dict': {
|
||||
'id': '4143',
|
||||
'title': 'Kierunek Kraków',
|
||||
},
|
||||
'playlist_mincount': 61
|
||||
}, {
|
||||
'url': 'http://www.polskieradio.pl/10,czworka/214,muzyka',
|
||||
'info_dict': {
|
||||
'id': '214',
|
||||
'title': 'Muzyka',
|
||||
},
|
||||
'playlist_mincount': 61
|
||||
}, {
|
||||
'url': 'http://www.polskieradio.pl/7,Jedynka/5102,HISTORIA-ZYWA',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.polskieradio.pl/8,Dwojka/196,Publicystyka',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
return False if PolskieRadioIE.suitable(url) else super(PolskieRadioCategoryIE, cls).suitable(url)
|
||||
|
||||
def _entries(self, url, page, category_id):
|
||||
content = page
|
||||
for page_num in itertools.count(2):
|
||||
for a_entry, entry_id in re.findall(
|
||||
r'(?s)<article[^>]+>.*?(<a[^>]+href=["\']/\d+/\d+/Artykul/(\d+)[^>]+>).*?</article>',
|
||||
content):
|
||||
entry = extract_attributes(a_entry)
|
||||
href = entry.get('href')
|
||||
if not href:
|
||||
continue
|
||||
yield self.url_result(
|
||||
compat_urlparse.urljoin(url, href), PolskieRadioIE.ie_key(),
|
||||
entry_id, entry.get('title'))
|
||||
mobj = re.search(
|
||||
r'<div[^>]+class=["\']next["\'][^>]*>\s*<a[^>]+href=(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
content)
|
||||
if not mobj:
|
||||
break
|
||||
next_url = compat_urlparse.urljoin(url, mobj.group('url'))
|
||||
content = self._download_webpage(
|
||||
next_url, category_id, 'Downloading page %s' % page_num)
|
||||
|
||||
def _real_extract(self, url):
|
||||
category_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, category_id)
|
||||
title = self._html_search_regex(
|
||||
r'<title>([^<]+) - [^<]+ - [^<]+</title>',
|
||||
webpage, 'title', fatal=False)
|
||||
return self.playlist_result(
|
||||
self._entries(url, webpage, category_id),
|
||||
category_id, title)
|
||||
|
@@ -15,6 +15,7 @@ from ..compat import (
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
js_to_json,
|
||||
orderedSet,
|
||||
sanitized_Request,
|
||||
str_to_int,
|
||||
@@ -48,6 +49,8 @@ class PornHubIE(InfoExtractor):
|
||||
'dislike_count': int,
|
||||
'comment_count': int,
|
||||
'age_limit': 18,
|
||||
'tags': list,
|
||||
'categories': list,
|
||||
},
|
||||
}, {
|
||||
# non-ASCII title
|
||||
@@ -63,6 +66,8 @@ class PornHubIE(InfoExtractor):
|
||||
'dislike_count': int,
|
||||
'comment_count': int,
|
||||
'age_limit': 18,
|
||||
'tags': list,
|
||||
'categories': list,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@@ -183,6 +188,15 @@ class PornHubIE(InfoExtractor):
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
page_params = self._parse_json(self._search_regex(
|
||||
r'page_params\.zoneDetails\[([\'"])[^\'"]+\1\]\s*=\s*(?P<data>{[^}]+})',
|
||||
webpage, 'page parameters', group='data', default='{}'),
|
||||
video_id, transform_source=js_to_json, fatal=False)
|
||||
tags = categories = None
|
||||
if page_params:
|
||||
tags = page_params.get('tags', '').split(',')
|
||||
categories = page_params.get('categories', '').split(',')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'uploader': video_uploader,
|
||||
@@ -195,6 +209,8 @@ class PornHubIE(InfoExtractor):
|
||||
'comment_count': comment_count,
|
||||
'formats': formats,
|
||||
'age_limit': 18,
|
||||
'tags': tags,
|
||||
'categories': categories,
|
||||
}
|
||||
|
||||
|
||||
|
@@ -15,7 +15,111 @@ from ..utils import (
|
||||
)
|
||||
|
||||
|
||||
class ProSiebenSat1IE(InfoExtractor):
|
||||
class ProSiebenSat1BaseIE(InfoExtractor):
|
||||
def _extract_video_info(self, url, clip_id):
|
||||
client_location = url
|
||||
|
||||
video = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos',
|
||||
clip_id, 'Downloading videos JSON', query={
|
||||
'access_token': self._TOKEN,
|
||||
'client_location': client_location,
|
||||
'client_name': self._CLIENT_NAME,
|
||||
'ids': clip_id,
|
||||
})[0]
|
||||
|
||||
if video.get('is_protected') is True:
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
|
||||
duration = float_or_none(video.get('duration'))
|
||||
source_ids = [compat_str(source['id']) for source in video['sources']]
|
||||
|
||||
client_id = self._SALT[:2] + sha1(''.join([clip_id, self._SALT, self._TOKEN, client_location, self._SALT, self._CLIENT_NAME]).encode('utf-8')).hexdigest()
|
||||
|
||||
sources = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources' % clip_id,
|
||||
clip_id, 'Downloading sources JSON', query={
|
||||
'access_token': self._TOKEN,
|
||||
'client_id': client_id,
|
||||
'client_location': client_location,
|
||||
'client_name': self._CLIENT_NAME,
|
||||
})
|
||||
server_id = sources['server_id']
|
||||
|
||||
def fix_bitrate(bitrate):
|
||||
bitrate = int_or_none(bitrate)
|
||||
if not bitrate:
|
||||
return None
|
||||
return (bitrate // 1000) if bitrate % 1000 == 0 else bitrate
|
||||
|
||||
formats = []
|
||||
for source_id in source_ids:
|
||||
client_id = self._SALT[:2] + sha1(''.join([self._SALT, clip_id, self._TOKEN, server_id, client_location, source_id, self._SALT, self._CLIENT_NAME]).encode('utf-8')).hexdigest()
|
||||
urls = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources/url' % clip_id,
|
||||
clip_id, 'Downloading urls JSON', fatal=False, query={
|
||||
'access_token': self._TOKEN,
|
||||
'client_id': client_id,
|
||||
'client_location': client_location,
|
||||
'client_name': self._CLIENT_NAME,
|
||||
'server_id': server_id,
|
||||
'source_ids': source_id,
|
||||
})
|
||||
if not urls:
|
||||
continue
|
||||
if urls.get('status_code') != 0:
|
||||
raise ExtractorError('This video is unavailable', expected=True)
|
||||
urls_sources = urls['sources']
|
||||
if isinstance(urls_sources, dict):
|
||||
urls_sources = urls_sources.values()
|
||||
for source in urls_sources:
|
||||
source_url = source.get('url')
|
||||
if not source_url:
|
||||
continue
|
||||
protocol = source.get('protocol')
|
||||
mimetype = source.get('mimetype')
|
||||
if mimetype == 'application/f4m+xml' or 'f4mgenerator' in source_url or determine_ext(source_url) == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
source_url, clip_id, f4m_id='hds', fatal=False))
|
||||
elif mimetype == 'application/x-mpegURL':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
source_url, clip_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
else:
|
||||
tbr = fix_bitrate(source['bitrate'])
|
||||
if protocol in ('rtmp', 'rtmpe'):
|
||||
mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
|
||||
if not mobj:
|
||||
continue
|
||||
path = mobj.group('path')
|
||||
mp4colon_index = path.rfind('mp4:')
|
||||
app = path[:mp4colon_index]
|
||||
play_path = path[mp4colon_index:]
|
||||
formats.append({
|
||||
'url': '%s/%s' % (mobj.group('url'), app),
|
||||
'app': app,
|
||||
'play_path': play_path,
|
||||
'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
|
||||
'page_url': 'http://www.prosieben.de',
|
||||
'tbr': tbr,
|
||||
'ext': 'flv',
|
||||
'format_id': 'rtmp%s' % ('-%d' % tbr if tbr else ''),
|
||||
})
|
||||
else:
|
||||
formats.append({
|
||||
'url': source_url,
|
||||
'tbr': tbr,
|
||||
'format_id': 'http%s' % ('-%d' % tbr if tbr else ''),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
IE_NAME = 'prosiebensat1'
|
||||
IE_DESC = 'ProSiebenSat.1 Digital'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:(?:prosieben|prosiebenmaxx|sixx|sat1|kabeleins|the-voice-of-germany|7tv)\.(?:de|at|ch)|ran\.de|fem\.com)/(?P<id>.+)'
|
||||
@@ -188,6 +292,9 @@ class ProSiebenSat1IE(InfoExtractor):
|
||||
},
|
||||
]
|
||||
|
||||
_TOKEN = 'prosieben'
|
||||
_SALT = '01!8d8F_)r9]4s[qeuXfP%'
|
||||
_CLIENT_NAME = 'kolibri-2.0.19-splec4'
|
||||
_CLIPID_REGEXES = [
|
||||
r'"clip_id"\s*:\s+"(\d+)"',
|
||||
r'clipid: "(\d+)"',
|
||||
@@ -234,123 +341,22 @@ class ProSiebenSat1IE(InfoExtractor):
|
||||
def _extract_clip(self, url, webpage):
|
||||
clip_id = self._html_search_regex(
|
||||
self._CLIPID_REGEXES, webpage, 'clip id')
|
||||
|
||||
access_token = 'prosieben'
|
||||
client_name = 'kolibri-2.0.19-splec4'
|
||||
client_location = url
|
||||
|
||||
video = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos',
|
||||
clip_id, 'Downloading videos JSON', query={
|
||||
'access_token': access_token,
|
||||
'client_location': client_location,
|
||||
'client_name': client_name,
|
||||
'ids': clip_id,
|
||||
})[0]
|
||||
|
||||
if video.get('is_protected') is True:
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
|
||||
duration = float_or_none(video.get('duration'))
|
||||
source_ids = [compat_str(source['id']) for source in video['sources']]
|
||||
|
||||
g = '01!8d8F_)r9]4s[qeuXfP%'
|
||||
client_id = g[:2] + sha1(''.join([clip_id, g, access_token, client_location, g, client_name]).encode('utf-8')).hexdigest()
|
||||
|
||||
sources = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources' % clip_id,
|
||||
clip_id, 'Downloading sources JSON', query={
|
||||
'access_token': access_token,
|
||||
'client_id': client_id,
|
||||
'client_location': client_location,
|
||||
'client_name': client_name,
|
||||
})
|
||||
server_id = sources['server_id']
|
||||
|
||||
title = self._html_search_regex(self._TITLE_REGEXES, webpage, 'title')
|
||||
|
||||
def fix_bitrate(bitrate):
|
||||
bitrate = int_or_none(bitrate)
|
||||
if not bitrate:
|
||||
return None
|
||||
return (bitrate // 1000) if bitrate % 1000 == 0 else bitrate
|
||||
|
||||
formats = []
|
||||
for source_id in source_ids:
|
||||
client_id = g[:2] + sha1(''.join([g, clip_id, access_token, server_id, client_location, source_id, g, client_name]).encode('utf-8')).hexdigest()
|
||||
urls = self._download_json(
|
||||
'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources/url' % clip_id,
|
||||
clip_id, 'Downloading urls JSON', fatal=False, query={
|
||||
'access_token': access_token,
|
||||
'client_id': client_id,
|
||||
'client_location': client_location,
|
||||
'client_name': client_name,
|
||||
'server_id': server_id,
|
||||
'source_ids': source_id,
|
||||
})
|
||||
if not urls:
|
||||
continue
|
||||
if urls.get('status_code') != 0:
|
||||
raise ExtractorError('This video is unavailable', expected=True)
|
||||
urls_sources = urls['sources']
|
||||
if isinstance(urls_sources, dict):
|
||||
urls_sources = urls_sources.values()
|
||||
for source in urls_sources:
|
||||
source_url = source.get('url')
|
||||
if not source_url:
|
||||
continue
|
||||
protocol = source.get('protocol')
|
||||
mimetype = source.get('mimetype')
|
||||
if mimetype == 'application/f4m+xml' or 'f4mgenerator' in source_url or determine_ext(source_url) == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
source_url, clip_id, f4m_id='hds', fatal=False))
|
||||
elif mimetype == 'application/x-mpegURL':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
source_url, clip_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
else:
|
||||
tbr = fix_bitrate(source['bitrate'])
|
||||
if protocol in ('rtmp', 'rtmpe'):
|
||||
mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
|
||||
if not mobj:
|
||||
continue
|
||||
path = mobj.group('path')
|
||||
mp4colon_index = path.rfind('mp4:')
|
||||
app = path[:mp4colon_index]
|
||||
play_path = path[mp4colon_index:]
|
||||
formats.append({
|
||||
'url': '%s/%s' % (mobj.group('url'), app),
|
||||
'app': app,
|
||||
'play_path': play_path,
|
||||
'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
|
||||
'page_url': 'http://www.prosieben.de',
|
||||
'tbr': tbr,
|
||||
'ext': 'flv',
|
||||
'format_id': 'rtmp%s' % ('-%d' % tbr if tbr else ''),
|
||||
})
|
||||
else:
|
||||
formats.append({
|
||||
'url': source_url,
|
||||
'tbr': tbr,
|
||||
'format_id': 'http%s' % ('-%d' % tbr if tbr else ''),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
info = self._extract_video_info(url, clip_id)
|
||||
description = self._html_search_regex(
|
||||
self._DESCRIPTION_REGEXES, webpage, 'description', fatal=False)
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
upload_date = unified_strdate(self._html_search_regex(
|
||||
self._UPLOAD_DATE_REGEXES, webpage, 'upload date', default=None))
|
||||
|
||||
return {
|
||||
info.update({
|
||||
'id': clip_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'upload_date': upload_date,
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
}
|
||||
})
|
||||
return info
|
||||
|
||||
def _extract_playlist(self, url, webpage):
|
||||
playlist_id = self._html_search_regex(
|
||||
|
@@ -1,88 +1,51 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .prosiebensat1 import ProSiebenSat1BaseIE
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
unified_strdate,
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
compat_str,
|
||||
)
|
||||
|
||||
|
||||
class Puls4IE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?puls4\.com/video/[^/]+/play/(?P<id>[0-9]+)'
|
||||
class Puls4IE(ProSiebenSat1BaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?puls4\.com/(?P<id>(?:[^/]+/)*?videos/[^?#]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.puls4.com/video/pro-und-contra/play/2716816',
|
||||
'md5': '49f6a6629747eeec43cef6a46b5df81d',
|
||||
'url': 'http://www.puls4.com/2-minuten-2-millionen/staffel-3/videos/2min2miotalk/Tobias-Homberger-von-myclubs-im-2min2miotalk-118118',
|
||||
'md5': 'fd3c6b0903ac72c9d004f04bc6bb3e03',
|
||||
'info_dict': {
|
||||
'id': '2716816',
|
||||
'ext': 'mp4',
|
||||
'title': 'Pro und Contra vom 23.02.2015',
|
||||
'description': 'md5:293e44634d9477a67122489994675db6',
|
||||
'duration': 2989,
|
||||
'upload_date': '20150224',
|
||||
'id': '118118',
|
||||
'ext': 'flv',
|
||||
'title': 'Tobias Homberger von myclubs im #2min2miotalk',
|
||||
'description': 'md5:f9def7c5e8745d6026d8885487d91955',
|
||||
'upload_date': '20160830',
|
||||
'uploader': 'PULS_4',
|
||||
},
|
||||
'skip': 'Only works from Germany',
|
||||
}, {
|
||||
'url': 'http://www.puls4.com/video/kult-spielfilme/play/1298106',
|
||||
'md5': '6a48316c8903ece8dab9b9a7bf7a59ec',
|
||||
'info_dict': {
|
||||
'id': '1298106',
|
||||
'ext': 'mp4',
|
||||
'title': 'Lucky Fritz',
|
||||
},
|
||||
'skip': 'Only works from Germany',
|
||||
}]
|
||||
_TOKEN = 'puls4'
|
||||
_SALT = '01!kaNgaiNgah1Ie4AeSha'
|
||||
_CLIENT_NAME = ''
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
error_message = self._html_search_regex(
|
||||
r'<div[^>]+class="message-error"[^>]*>(.+?)</div>',
|
||||
webpage, 'error message', default=None)
|
||||
if error_message:
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
|
||||
|
||||
real_url = self._html_search_regex(
|
||||
r'\"fsk-button\".+?href=\"([^"]+)',
|
||||
webpage, 'fsk_button', default=None)
|
||||
if real_url:
|
||||
webpage = self._download_webpage(real_url, video_id)
|
||||
|
||||
player = self._search_regex(
|
||||
r'p4_video_player(?:_iframe)?\("video_\d+_container"\s*,(.+?)\);\s*\}',
|
||||
webpage, 'player')
|
||||
|
||||
player_json = self._parse_json(
|
||||
'[%s]' % player, video_id,
|
||||
transform_source=lambda s: s.replace('undefined,', ''))
|
||||
|
||||
formats = None
|
||||
result = None
|
||||
|
||||
for v in player_json:
|
||||
if isinstance(v, list) and not formats:
|
||||
formats = [{
|
||||
'url': f['url'],
|
||||
'format': 'hd' if f.get('hd') else 'sd',
|
||||
'width': int_or_none(f.get('size_x')),
|
||||
'height': int_or_none(f.get('size_y')),
|
||||
'tbr': int_or_none(f.get('bitrate')),
|
||||
} for f in v]
|
||||
self._sort_formats(formats)
|
||||
elif isinstance(v, dict) and not result:
|
||||
result = {
|
||||
'id': video_id,
|
||||
'title': v['videopartname'].strip(),
|
||||
'description': v.get('videotitle'),
|
||||
'duration': int_or_none(v.get('videoduration') or v.get('episodeduration')),
|
||||
'upload_date': unified_strdate(v.get('clipreleasetime')),
|
||||
'uploader': v.get('channel'),
|
||||
}
|
||||
|
||||
result['formats'] = formats
|
||||
|
||||
return result
|
||||
path = self._match_id(url)
|
||||
content_path = self._download_json(
|
||||
'http://www.puls4.com/api/json-fe/page/' + path, path)['content'][0]['url']
|
||||
media = self._download_json(
|
||||
'http://www.puls4.com' + content_path,
|
||||
content_path)['mediaCurrent']
|
||||
player_content = media['playerContent']
|
||||
info = self._extract_video_info(url, player_content['id'])
|
||||
info.update({
|
||||
'id': compat_str(media['objectId']),
|
||||
'title': player_content['title'],
|
||||
'description': media.get('description'),
|
||||
'thumbnail': media.get('previewLink'),
|
||||
'upload_date': unified_strdate(media.get('date')),
|
||||
'duration': parse_duration(player_content.get('duration')),
|
||||
'episode': player_content.get('episodePartName'),
|
||||
'show': media.get('channel'),
|
||||
'season_id': player_content.get('seasonId'),
|
||||
'uploader': player_content.get('sourceCompany'),
|
||||
})
|
||||
return info
|
||||
|
39
youtube_dl/extractor/rmcdecouverte.py
Normal file
39
youtube_dl/extractor/rmcdecouverte.py
Normal file
@@ -0,0 +1,39 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .brightcove import BrightcoveLegacyIE
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
|
||||
|
||||
class RMCDecouverteIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/mediaplayer-replay.*?\bid=(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://rmcdecouverte.bfmtv.com/mediaplayer-replay/?id=1430&title=LES%20HEROS%20DU%2088e%20ETAGE',
|
||||
'info_dict': {
|
||||
'id': '5111223049001',
|
||||
'ext': 'mp4',
|
||||
'title': ': LES HEROS DU 88e ETAGE',
|
||||
'description': 'Découvrez comment la bravoure de deux hommes dans la Tour Nord du World Trade Center a sauvé la vie d\'innombrables personnes le 11 septembre 2001.',
|
||||
'uploader_id': '1969646226001',
|
||||
'upload_date': '20160904',
|
||||
'timestamp': 1472951103,
|
||||
},
|
||||
'params': {
|
||||
# rtmp download
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'Only works from France',
|
||||
}
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1969646226001/default_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(brightcove_legacy_url).query)['@videoPlayer'][0]
|
||||
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
@@ -88,7 +88,7 @@ class RutubeIE(InfoExtractor):
|
||||
class RutubeEmbedIE(InfoExtractor):
|
||||
IE_NAME = 'rutube:embed'
|
||||
IE_DESC = 'Rutube embedded videos'
|
||||
_VALID_URL = 'https?://rutube\.ru/(?:video|play)/embed/(?P<id>[0-9]+)'
|
||||
_VALID_URL = r'https?://rutube\.ru/(?:video|play)/embed/(?P<id>[0-9]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://rutube.ru/video/embed/6722881?vk_puid37=&vk_puid38=',
|
||||
|
@@ -103,7 +103,7 @@ class SpiegelIE(InfoExtractor):
|
||||
|
||||
|
||||
class SpiegelArticleIE(InfoExtractor):
|
||||
_VALID_URL = 'https?://www\.spiegel\.de/(?!video/)[^?#]*?-(?P<id>[0-9]+)\.html'
|
||||
_VALID_URL = r'https?://www\.spiegel\.de/(?!video/)[^?#]*?-(?P<id>[0-9]+)\.html'
|
||||
IE_NAME = 'Spiegel:Article'
|
||||
IE_DESC = 'Articles on spiegel.de'
|
||||
_TESTS = [{
|
||||
|
@@ -53,7 +53,7 @@ class TBSIE(TurnerBaseIE):
|
||||
'media_src': 'http://ht.cdn.turner.com/%s/big' % site,
|
||||
},
|
||||
'secure': {
|
||||
'media_src': 'http://apple-secure.cdn.turner.com/%s/big' % site,
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/%s/big' % site,
|
||||
'tokenizer_src': 'http://www.%s.com/video/processors/services/token_ipadAdobe.do' % domain,
|
||||
},
|
||||
})
|
||||
|
36
youtube_dl/extractor/telequebec.py
Normal file
36
youtube_dl/extractor/telequebec.py
Normal file
@@ -0,0 +1,36 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import int_or_none
|
||||
|
||||
|
||||
class TeleQuebecIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://zonevideo\.telequebec\.tv/media/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://zonevideo.telequebec.tv/media/20984/le-couronnement-de-new-york/couronnement-de-new-york',
|
||||
'md5': 'fe95a0957e5707b1b01f5013e725c90f',
|
||||
'info_dict': {
|
||||
'id': '20984',
|
||||
'ext': 'mp4',
|
||||
'title': 'Le couronnement de New York',
|
||||
'description': 'md5:f5b3d27a689ec6c1486132b2d687d432',
|
||||
'upload_date': '20160220',
|
||||
'timestamp': 1455965438,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
media_id = self._match_id(url)
|
||||
media_data = self._download_json(
|
||||
'https://mnmedias.api.telequebec.tv/api/v2/media/' + media_id,
|
||||
media_id)['media']
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': media_id,
|
||||
'url': 'limelight:media:' + media_data['streamInfo']['sourceId'],
|
||||
'title': media_data['title'],
|
||||
'description': media_data.get('descriptions', [{'text': None}])[0].get('text'),
|
||||
'duration': int_or_none(media_data.get('durationInMilliseconds'), 1000),
|
||||
'ie_key': 'LimelightMedia',
|
||||
}
|
53
youtube_dl/extractor/tfo.py
Normal file
53
youtube_dl/extractor/tfo.py
Normal file
@@ -0,0 +1,53 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
HEADRequest,
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
|
||||
class TFOIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?tfo\.org/(?:en|fr)/(?:[^/]+/){2}(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.tfo.org/en/universe/tfo-247/100463871/video-game-hackathon',
|
||||
'md5': '47c987d0515561114cf03d1226a9d4c7',
|
||||
'info_dict': {
|
||||
'id': '100463871',
|
||||
'ext': 'mp4',
|
||||
'title': 'Video Game Hackathon',
|
||||
'description': 'md5:558afeba217c6c8d96c60e5421795c07',
|
||||
'upload_date': '20160212',
|
||||
'timestamp': 1455310233,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
self._request_webpage(HEADRequest('http://www.tfo.org/'), video_id)
|
||||
infos = self._download_json(
|
||||
'http://www.tfo.org/api/web/video/get_infos', video_id, data=json.dumps({
|
||||
'product_id': video_id,
|
||||
}).encode(), headers={
|
||||
'X-tfo-session': self._get_cookies('http://www.tfo.org/')['tfo-session'].value,
|
||||
})
|
||||
if infos.get('success') == 0:
|
||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, infos['msg']), expected=True)
|
||||
video_data = infos['data']
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': video_id,
|
||||
'url': 'limelight:media:' + video_data['llid'],
|
||||
'title': video_data['title'],
|
||||
'description': video_data.get('description'),
|
||||
'series': video_data.get('collection'),
|
||||
'season_number': int_or_none(video_data.get('season')),
|
||||
'episode_number': int_or_none(video_data.get('episode')),
|
||||
'duration': int_or_none(video_data.get('duration')),
|
||||
'ie_key': 'LimelightMedia',
|
||||
}
|
@@ -1,10 +1,14 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .brightcove import BrightcoveLegacyIE
|
||||
from ..compat import compat_parse_qs
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
|
||||
|
||||
class TlcDeIE(InfoExtractor):
|
||||
@@ -35,5 +39,5 @@ class TlcDeIE(InfoExtractor):
|
||||
title = mobj.group('title')
|
||||
webpage = self._download_webpage(url, title)
|
||||
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
brightcove_id = compat_parse_qs(brightcove_legacy_url)['@videoPlayer'][0]
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(brightcove_legacy_url).query)['@videoPlayer'][0]
|
||||
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
||||
|
@@ -1,36 +0,0 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .nuevo import NuevoBaseIE
|
||||
|
||||
|
||||
class TrollvidsIE(NuevoBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?trollvids\.com/video/(?P<id>\d+)/(?P<display_id>[^/?#&]+)'
|
||||
IE_NAME = 'trollvids'
|
||||
_TEST = {
|
||||
'url': 'http://trollvids.com/video/2349002/%E3%80%90MMD-R-18%E3%80%91%E3%82%AC%E3%83%BC%E3%83%AB%E3%83%95%E3%83%AC%E3%83%B3%E3%83%89-carrymeoff',
|
||||
'md5': '1d53866b2c514b23ed69e4352fdc9839',
|
||||
'info_dict': {
|
||||
'id': '2349002',
|
||||
'ext': 'mp4',
|
||||
'title': '【MMD R-18】ガールフレンド carry_me_off',
|
||||
'age_limit': 18,
|
||||
'duration': 216.78,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
display_id = mobj.group('display_id')
|
||||
|
||||
info = self._extract_nuevo(
|
||||
'http://trollvids.com/nuevo/player/config.php?v=%s' % video_id,
|
||||
video_id)
|
||||
info.update({
|
||||
'display_id': display_id,
|
||||
'age_limit': 18
|
||||
})
|
||||
return info
|
35
youtube_dl/extractor/trutv.py
Normal file
35
youtube_dl/extractor/trutv.py
Normal file
@@ -0,0 +1,35 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .turner import TurnerBaseIE
|
||||
|
||||
|
||||
class TruTVIE(TurnerBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?trutv\.com(?:(?P<path>/shows/[^/]+/videos/[^/?#]+?)\.html|/full-episodes/[^/]+/(?P<id>\d+))'
|
||||
_TEST = {
|
||||
'url': 'http://www.trutv.com/shows/10-things/videos/you-wont-believe-these-sports-bets.html',
|
||||
'md5': '2cdc844f317579fed1a7251b087ff417',
|
||||
'info_dict': {
|
||||
'id': '/shows/10-things/videos/you-wont-believe-these-sports-bets',
|
||||
'ext': 'mp4',
|
||||
'title': 'You Won\'t Believe These Sports Bets',
|
||||
'description': 'Jamie Lee sits down with a bookie to discuss the bizarre world of illegal sports betting.',
|
||||
'upload_date': '20130305',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
path, video_id = re.match(self._VALID_URL, url).groups()
|
||||
if path:
|
||||
data_src = 'http://www.trutv.com/video/cvp/v2/xml/content.xml?id=%s.xml' % path
|
||||
else:
|
||||
data_src = 'http://www.trutv.com/tveverywhere/services/cvpXML.do?titleId=' + video_id
|
||||
return self._extract_cvp_info(
|
||||
data_src, path, {
|
||||
'secure': {
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/trutv/big',
|
||||
'tokenizer_src': 'http://www.trutv.com/tveverywhere/processors/services/token_ipadAdobe.do',
|
||||
},
|
||||
})
|
@@ -1,5 +1,7 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
str_to_int,
|
||||
@@ -21,7 +23,13 @@ class Tube8IE(KeezMoviesIE):
|
||||
'title': 'Kasia music video',
|
||||
'age_limit': 18,
|
||||
'duration': 230,
|
||||
'categories': ['Teen'],
|
||||
'tags': ['dancing'],
|
||||
},
|
||||
'params': {
|
||||
'proxy': '127.0.0.1:8118',
|
||||
}
|
||||
|
||||
}, {
|
||||
'url': 'http://www.tube8.com/shemale/teen/blonde-cd-gets-kidnapped-by-two-blacks-and-punished-for-being-a-slutty-girl/19569151/',
|
||||
'only_matching': True,
|
||||
@@ -51,6 +59,17 @@ class Tube8IE(KeezMoviesIE):
|
||||
r'<span id="allCommentsCount">(\d+)</span>',
|
||||
webpage, 'comment count', fatal=False))
|
||||
|
||||
category = self._search_regex(
|
||||
r'Category:\s*</strong>\s*<a[^>]+href=[^>]+>([^<]+)',
|
||||
webpage, 'category', fatal=False)
|
||||
categories = [category] if category else None
|
||||
|
||||
tags_str = self._search_regex(
|
||||
r'(?s)Tags:\s*</strong>(.+?)</(?!a)',
|
||||
webpage, 'tags', fatal=False)
|
||||
tags = [t for t in re.findall(
|
||||
r'<a[^>]+href=[^>]+>([^<]+)', tags_str)] if tags_str else None
|
||||
|
||||
info.update({
|
||||
'description': description,
|
||||
'uploader': uploader,
|
||||
@@ -58,6 +77,8 @@ class Tube8IE(KeezMoviesIE):
|
||||
'like_count': like_count,
|
||||
'dislike_count': dislike_count,
|
||||
'comment_count': comment_count,
|
||||
'categories': categories,
|
||||
'tags': tags,
|
||||
})
|
||||
|
||||
return info
|
||||
|
@@ -12,7 +12,7 @@ from ..utils import (
|
||||
parse_duration,
|
||||
xpath_attr,
|
||||
update_url_query,
|
||||
compat_urlparse,
|
||||
ExtractorError,
|
||||
)
|
||||
|
||||
|
||||
@@ -24,6 +24,7 @@ class TurnerBaseIE(InfoExtractor):
|
||||
video_data = self._download_xml(data_src, video_id)
|
||||
video_id = video_data.attrib['id']
|
||||
title = xpath_text(video_data, 'headline', fatal=True)
|
||||
content_id = xpath_text(video_data, 'contentId') or video_id
|
||||
# rtmp_src = xpath_text(video_data, 'akamai/src')
|
||||
# if rtmp_src:
|
||||
# splited_rtmp_src = rtmp_src.split(',')
|
||||
@@ -54,7 +55,7 @@ class TurnerBaseIE(InfoExtractor):
|
||||
# auth = self._download_webpage(
|
||||
# protected_path_data['tokenizer_src'], query={
|
||||
# 'path': protected_path,
|
||||
# 'videoId': video_id,
|
||||
# 'videoId': content_id,
|
||||
# 'aifp': aifp,
|
||||
# })
|
||||
# token = xpath_text(auth, 'token')
|
||||
@@ -72,8 +73,11 @@ class TurnerBaseIE(InfoExtractor):
|
||||
auth = self._download_xml(
|
||||
secure_path_data['tokenizer_src'], video_id, query={
|
||||
'path': secure_path,
|
||||
'videoId': video_id,
|
||||
'videoId': content_id,
|
||||
})
|
||||
error_msg = xpath_text(auth, 'error/msg')
|
||||
if error_msg:
|
||||
raise ExtractorError(error_msg, expected=True)
|
||||
token = xpath_text(auth, 'token')
|
||||
if not token:
|
||||
continue
|
||||
@@ -93,19 +97,9 @@ class TurnerBaseIE(InfoExtractor):
|
||||
formats.extend(self._extract_smil_formats(
|
||||
video_url, video_id, fatal=False))
|
||||
elif ext == 'm3u8':
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
video_url, video_id, 'mp4', m3u8_id=format_id or 'hls',
|
||||
fatal=False)
|
||||
if m3u8_formats:
|
||||
# Sometimes final URLs inside m3u8 are unsigned, let's fix this
|
||||
# ourselves
|
||||
qs = compat_urlparse.urlparse(video_url).query
|
||||
if qs:
|
||||
query = compat_urlparse.parse_qs(qs)
|
||||
for m3u8_format in m3u8_formats:
|
||||
m3u8_format['url'] = update_url_query(m3u8_format['url'], query)
|
||||
m3u8_format['extra_param_to_segment_url'] = qs
|
||||
formats.extend(m3u8_formats)
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
video_url, video_id, 'mp4',
|
||||
m3u8_id=format_id or 'hls', fatal=False))
|
||||
elif ext == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
update_url_query(video_url, {'hdcore': '3.7.0'}),
|
||||
|
49
youtube_dl/extractor/tvnoe.py
Normal file
49
youtube_dl/extractor/tvnoe.py
Normal file
@@ -0,0 +1,49 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .jwplatform import JWPlatformBaseIE
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
get_element_by_class,
|
||||
js_to_json,
|
||||
)
|
||||
|
||||
|
||||
class TVNoeIE(JWPlatformBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?tvnoe\.cz/video/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.tvnoe.cz/video/10362',
|
||||
'md5': 'aee983f279aab96ec45ab6e2abb3c2ca',
|
||||
'info_dict': {
|
||||
'id': '10362',
|
||||
'ext': 'mp4',
|
||||
'series': 'Noční univerzita',
|
||||
'title': 'prof. Tomáš Halík, Th.D. - Návrat náboženství a střet civilizací',
|
||||
'description': 'md5:f337bae384e1a531a52c55ebc50fff41',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
iframe_url = self._search_regex(
|
||||
r'<iframe[^>]+src="([^"]+)"', webpage, 'iframe URL')
|
||||
|
||||
ifs_page = self._download_webpage(iframe_url, video_id)
|
||||
jwplayer_data = self._parse_json(
|
||||
self._find_jwplayer_data(ifs_page),
|
||||
video_id, transform_source=js_to_json)
|
||||
info_dict = self._parse_jwplayer_data(
|
||||
jwplayer_data, video_id, require_title=False, base_url=iframe_url)
|
||||
|
||||
info_dict.update({
|
||||
'id': video_id,
|
||||
'title': clean_html(get_element_by_class(
|
||||
'field-name-field-podnazev', webpage)),
|
||||
'description': clean_html(get_element_by_class(
|
||||
'field-name-body', webpage)),
|
||||
'series': clean_html(get_element_by_class('title', webpage))
|
||||
})
|
||||
|
||||
return info_dict
|
@@ -348,6 +348,29 @@ class ViafreeIE(InfoExtractor):
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': [TVPlayIE.ie_key()],
|
||||
}, {
|
||||
# with relatedClips
|
||||
'url': 'http://www.viafree.se/program/reality/sommaren-med-youtube-stjarnorna/sasong-1/avsnitt-1',
|
||||
'info_dict': {
|
||||
'id': '758770',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sommaren med YouTube-stjärnorna S01E01',
|
||||
'description': 'md5:2bc69dce2c4bb48391e858539bbb0e3f',
|
||||
'series': 'Sommaren med YouTube-stjärnorna',
|
||||
'season': 'Säsong 1',
|
||||
'season_number': 1,
|
||||
'duration': 1326,
|
||||
'timestamp': 1470905572,
|
||||
'upload_date': '20160811',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': [TVPlayIE.ie_key()],
|
||||
}, {
|
||||
# Different og:image URL schema
|
||||
'url': 'www.viafree.se/program/reality/sommaren-med-youtube-stjarnorna/sasong-1/avsnitt-2',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.viafree.no/programmer/underholdning/det-beste-vorspielet/sesong-2/episode-1',
|
||||
'only_matching': True,
|
||||
@@ -365,8 +388,38 @@ class ViafreeIE(InfoExtractor):
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
video_id = self._search_regex(
|
||||
r'currentVideo["\']\s*:\s*.+?["\']id["\']\s*:\s*["\'](?P<id>\d{6,})',
|
||||
webpage, 'video id')
|
||||
data = self._parse_json(
|
||||
self._search_regex(
|
||||
r'(?s)window\.App\s*=\s*({.+?})\s*;\s*</script',
|
||||
webpage, 'data', default='{}'),
|
||||
video_id, transform_source=lambda x: re.sub(
|
||||
r'(?s)function\s+[a-zA-Z_][\da-zA-Z_]*\s*\([^)]*\)\s*{[^}]*}\s*',
|
||||
'null', x), fatal=False)
|
||||
|
||||
video_id = None
|
||||
|
||||
if data:
|
||||
video_id = try_get(
|
||||
data, lambda x: x['context']['dispatcher']['stores'][
|
||||
'ContentPageProgramStore']['currentVideo']['id'],
|
||||
compat_str)
|
||||
|
||||
# Fallback #1 (extract from og:image URL schema)
|
||||
if not video_id:
|
||||
thumbnail = self._og_search_thumbnail(webpage, default=None)
|
||||
if thumbnail:
|
||||
video_id = self._search_regex(
|
||||
# Patterns seen:
|
||||
# http://cdn.playapi.mtgx.tv/imagecache/600x315/cloud/content-images/inbox/765166/a2e95e5f1d735bab9f309fa345cc3f25.jpg
|
||||
# http://cdn.playapi.mtgx.tv/imagecache/600x315/cloud/content-images/seasons/15204/758770/4a5ba509ca8bc043e1ebd1a76131cdf2.jpg
|
||||
r'https?://[^/]+/imagecache/(?:[^/]+/)+(\d{6,})/',
|
||||
thumbnail, 'video id', default=None)
|
||||
|
||||
# Fallback #2. Extract from raw JSON string.
|
||||
# May extract wrong video id if relatedClips is present.
|
||||
if not video_id:
|
||||
video_id = self._search_regex(
|
||||
r'currentVideo["\']\s*:\s*.+?["\']id["\']\s*:\s*["\'](\d{6,})',
|
||||
webpage, 'video id')
|
||||
|
||||
return self.url_result('mtg:%s' % video_id, TVPlayIE.ie_key())
|
||||
|
@@ -342,7 +342,7 @@ class TwitterIE(InfoExtractor):
|
||||
|
||||
class TwitterAmplifyIE(TwitterBaseIE):
|
||||
IE_NAME = 'twitter:amplify'
|
||||
_VALID_URL = 'https?://amp\.twimg\.com/v/(?P<id>[0-9a-f\-]{36})'
|
||||
_VALID_URL = r'https?://amp\.twimg\.com/v/(?P<id>[0-9a-f\-]{36})'
|
||||
|
||||
_TEST = {
|
||||
'url': 'https://amp.twimg.com/v/0ba0c3c7-0af3-4c0a-bed5-7efd1ffa2951',
|
||||
|
@@ -6,8 +6,7 @@ import re
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_age_limit,
|
||||
parse_iso8601,
|
||||
xpath_element,
|
||||
xpath_text,
|
||||
)
|
||||
|
||||
@@ -17,38 +16,32 @@ class VideomoreIE(InfoExtractor):
|
||||
_VALID_URL = r'videomore:(?P<sid>\d+)$|https?://videomore\.ru/(?:(?:embed|[^/]+/[^/]+)/|[^/]+\?.*\btrack_id=)(?P<id>\d+)(?:[/?#&]|\.(?:xml|json)|$)'
|
||||
_TESTS = [{
|
||||
'url': 'http://videomore.ru/kino_v_detalayah/5_sezon/367617',
|
||||
'md5': '70875fbf57a1cd004709920381587185',
|
||||
'md5': '44455a346edc0d509ac5b5a5b531dc35',
|
||||
'info_dict': {
|
||||
'id': '367617',
|
||||
'ext': 'flv',
|
||||
'title': 'В гостях Алексей Чумаков и Юлия Ковальчук',
|
||||
'description': 'В гостях – лучшие романтические комедии года, «Выживший» Иньярриту и «Стив Джобс» Дэнни Бойла.',
|
||||
'title': 'Кино в деталях 5 сезон В гостях Алексей Чумаков и Юлия Ковальчук',
|
||||
'series': 'Кино в деталях',
|
||||
'episode': 'В гостях Алексей Чумаков и Юлия Ковальчук',
|
||||
'episode_number': None,
|
||||
'season': 'Сезон 2015',
|
||||
'season_number': 5,
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 2910,
|
||||
'age_limit': 16,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'age_limit': 16,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://videomore.ru/embed/259974',
|
||||
'info_dict': {
|
||||
'id': '259974',
|
||||
'ext': 'flv',
|
||||
'title': '80 серия',
|
||||
'description': '«Медведей» ждет решающий матч. Макеев выясняет отношения со Стрельцовым. Парни узнают подробности прошлого Макеева.',
|
||||
'title': 'Молодежка 2 сезон 40 серия',
|
||||
'series': 'Молодежка',
|
||||
'episode': '80 серия',
|
||||
'episode_number': 40,
|
||||
'season': '2 сезон',
|
||||
'season_number': 2,
|
||||
'episode': '40 серия',
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 2809,
|
||||
'age_limit': 16,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'age_limit': 16,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@@ -58,13 +51,8 @@ class VideomoreIE(InfoExtractor):
|
||||
'info_dict': {
|
||||
'id': '341073',
|
||||
'ext': 'flv',
|
||||
'title': 'Команда проиграла из-за Бакина?',
|
||||
'description': 'Молодежка 3 сезон скоро',
|
||||
'series': 'Молодежка',
|
||||
'title': 'Промо Команда проиграла из-за Бакина?',
|
||||
'episode': 'Команда проиграла из-за Бакина?',
|
||||
'episode_number': None,
|
||||
'season': 'Промо',
|
||||
'season_number': 99,
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'duration': 29,
|
||||
'age_limit': 16,
|
||||
@@ -109,43 +97,33 @@ class VideomoreIE(InfoExtractor):
|
||||
'http://videomore.ru/video/tracks/%s.xml' % video_id,
|
||||
video_id, 'Downloading video XML')
|
||||
|
||||
video_url = xpath_text(video, './/video_url', 'video url', fatal=True)
|
||||
item = xpath_element(video, './/playlist/item', fatal=True)
|
||||
|
||||
title = xpath_text(
|
||||
item, ('./title', './episode_name'), 'title', fatal=True)
|
||||
|
||||
video_url = xpath_text(item, './video_url', 'video url', fatal=True)
|
||||
formats = self._extract_f4m_formats(video_url, video_id, f4m_id='hds')
|
||||
self._sort_formats(formats)
|
||||
|
||||
data = self._download_json(
|
||||
'http://videomore.ru/video/tracks/%s.json' % video_id,
|
||||
video_id, 'Downloading video JSON')
|
||||
thumbnail = xpath_text(item, './thumbnail_url')
|
||||
duration = int_or_none(xpath_text(item, './duration'))
|
||||
view_count = int_or_none(xpath_text(item, './views'))
|
||||
comment_count = int_or_none(xpath_text(item, './count_comments'))
|
||||
age_limit = int_or_none(xpath_text(item, './min_age'))
|
||||
|
||||
title = data.get('title') or data['project_title']
|
||||
description = data.get('description') or data.get('description_raw')
|
||||
timestamp = parse_iso8601(data.get('published_at'))
|
||||
duration = int_or_none(data.get('duration'))
|
||||
view_count = int_or_none(data.get('views'))
|
||||
age_limit = parse_age_limit(data.get('min_age'))
|
||||
thumbnails = [{
|
||||
'url': thumbnail,
|
||||
} for thumbnail in data.get('big_thumbnail_urls', [])]
|
||||
|
||||
series = data.get('project_title')
|
||||
episode = data.get('title')
|
||||
episode_number = int_or_none(data.get('episode_of_season') or None)
|
||||
season = data.get('season_title')
|
||||
season_number = int_or_none(data.get('season_pos') or None)
|
||||
series = xpath_text(item, './project_name')
|
||||
episode = xpath_text(item, './episode_name')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'series': series,
|
||||
'episode': episode,
|
||||
'episode_number': episode_number,
|
||||
'season': season,
|
||||
'season_number': season_number,
|
||||
'thumbnails': thumbnails,
|
||||
'timestamp': timestamp,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'view_count': view_count,
|
||||
'comment_count': comment_count,
|
||||
'age_limit': age_limit,
|
||||
'formats': formats,
|
||||
}
|
||||
|
@@ -86,38 +86,50 @@ class WatIE(InfoExtractor):
|
||||
|
||||
def extract_url(path_template, url_type):
|
||||
req_url = 'http://www.wat.tv/get/%s' % (path_template % video_id)
|
||||
head = self._request_webpage(HEADRequest(req_url), video_id, 'Extracting %s url' % url_type)
|
||||
red_url = head.geturl()
|
||||
if req_url == red_url:
|
||||
raise ExtractorError(
|
||||
'%s said: Sorry, this video is not available from your country.' % self.IE_NAME,
|
||||
expected=True)
|
||||
return red_url
|
||||
head = self._request_webpage(HEADRequest(req_url), video_id, 'Extracting %s url' % url_type, fatal=False)
|
||||
if head:
|
||||
red_url = head.geturl()
|
||||
if req_url != red_url:
|
||||
return red_url
|
||||
return None
|
||||
|
||||
def remove_bitrate_limit(manifest_url):
|
||||
return re.sub(r'(?:max|min)_bitrate=\d+&?', '', manifest_url)
|
||||
|
||||
formats = []
|
||||
try:
|
||||
http_url = extract_url('android5/%s.mp4', 'http')
|
||||
m3u8_url = extract_url('ipad/%s.m3u8', 'm3u8')
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
|
||||
formats.extend(m3u8_formats)
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
m3u8_url.replace('ios.', 'web.').replace('.m3u8', '.f4m'),
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
for m3u8_format in m3u8_formats:
|
||||
vbr, abr = m3u8_format.get('vbr'), m3u8_format.get('abr')
|
||||
if not vbr or not abr:
|
||||
continue
|
||||
format_id = m3u8_format['format_id'].replace('hls', 'http')
|
||||
fmt_url = re.sub(r'%s-\d+00-\d+' % video_id, '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)), http_url)
|
||||
if self._is_valid_url(fmt_url, video_id, format_id):
|
||||
f = m3u8_format.copy()
|
||||
f.update({
|
||||
'url': fmt_url,
|
||||
'format_id': format_id,
|
||||
'protocol': 'http',
|
||||
})
|
||||
formats.append(f)
|
||||
manifest_urls = self._download_json(
|
||||
'http://www.wat.tv/get/webhtml/' + video_id, video_id)
|
||||
m3u8_url = manifest_urls.get('hls')
|
||||
if m3u8_url:
|
||||
m3u8_url = remove_bitrate_limit(m3u8_url)
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
|
||||
if m3u8_formats:
|
||||
formats.extend(m3u8_formats)
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
m3u8_url.replace('ios', 'web').replace('.m3u8', '.f4m'),
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
http_url = extract_url('android5/%s.mp4', 'http')
|
||||
if http_url:
|
||||
for m3u8_format in m3u8_formats:
|
||||
vbr, abr = m3u8_format.get('vbr'), m3u8_format.get('abr')
|
||||
if not vbr or not abr:
|
||||
continue
|
||||
format_id = m3u8_format['format_id'].replace('hls', 'http')
|
||||
fmt_url = re.sub(r'%s-\d+00-\d+' % video_id, '%s-%d00-%d' % (video_id, round(vbr / 100), round(abr)), http_url)
|
||||
if self._is_valid_url(fmt_url, video_id, format_id):
|
||||
f = m3u8_format.copy()
|
||||
f.update({
|
||||
'url': fmt_url,
|
||||
'format_id': format_id,
|
||||
'protocol': 'http',
|
||||
})
|
||||
formats.append(f)
|
||||
mpd_url = manifest_urls.get('mpd')
|
||||
if mpd_url:
|
||||
formats.extend(self._extract_mpd_formats(remove_bitrate_limit(
|
||||
mpd_url), video_id, mpd_id='dash', fatal=False))
|
||||
self._sort_formats(formats)
|
||||
except ExtractorError:
|
||||
abr = 64
|
||||
|
@@ -19,7 +19,10 @@ from ..utils import (
|
||||
determine_ext,
|
||||
)
|
||||
|
||||
from .brightcove import BrightcoveNewIE
|
||||
from .brightcove import (
|
||||
BrightcoveLegacyIE,
|
||||
BrightcoveNewIE,
|
||||
)
|
||||
from .nbc import NBCSportsVPlayerIE
|
||||
|
||||
|
||||
@@ -223,6 +226,11 @@ class YahooIE(InfoExtractor):
|
||||
if nbc_sports_url:
|
||||
return self.url_result(nbc_sports_url, NBCSportsVPlayerIE.ie_key())
|
||||
|
||||
# Look for Brightcove Legacy Studio embeds
|
||||
bc_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
if bc_url:
|
||||
return self.url_result(bc_url, BrightcoveLegacyIE.ie_key())
|
||||
|
||||
# Look for Brightcove New Studio embeds
|
||||
bc_url = BrightcoveNewIE._extract_url(webpage)
|
||||
if bc_url:
|
||||
|
@@ -2417,7 +2417,7 @@ class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
|
||||
|
||||
class YoutubeHistoryIE(YoutubeFeedsInfoExtractor):
|
||||
IE_DESC = 'Youtube watch history, ":ythistory" for short (requires authentication)'
|
||||
_VALID_URL = 'https?://www\.youtube\.com/feed/history|:ythistory'
|
||||
_VALID_URL = r'https?://www\.youtube\.com/feed/history|:ythistory'
|
||||
_FEED_NAME = 'history'
|
||||
_PLAYLIST_TITLE = 'Youtube History'
|
||||
|
||||
|
@@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2016.09.04.1'
|
||||
__version__ = '2016.09.11.1'
|
||||
|
Reference in New Issue
Block a user