Youtube link to embed - What is going on in the code? - ruby-on-rails

Could someone break down what Ruby is doing here?
I would like to understand the code so I can adapt it in the future.
One of the things I want to do is adapt it for Vimeo, but I need to understand what is going on first so I can learn more.
module ApplicationHelper
  def youtube_embed(youtube_url)
    if youtube_url[/youtu\.be\/([^\?]*)/]
      youtube_id = $1
    else
      # Regex from http://stackoverflow.com/questions/3452546/javascript-regex-how-to-get-youtube-video-id-from-url/4811367#4811367
      youtube_url[/^.*((v\/)|(embed\/)|(watch\?))\??v?=?([^\&\?]*).*/]
      youtube_id = $5
    end
    %Q{<iframe title="YouTube video player" src="https://www.youtube.com/embed/#{ youtube_id }?rel=0&enablejsapi=1" frameborder="0" allowfullscreen></iframe>}
  end
end

This code builds the HTML for an embedded YouTube player from any link to a YouTube video.
The line if youtube_url[/youtu\.be\/([^\?]*)/] checks whether the given URL has the short "youtu.be/..." format and, if so, extracts the video ID ($1 holds the first capture group of the regular expression between the slashes).
If the URL is not a youtu.be link, the code falls back to a trickier regular expression (explained at the Stack Overflow link in the comment) and reads the video ID from the fifth capture group, $5.
It then returns a string containing the HTML of an embedded YouTube player for that video, interpolating the video ID into the player URL with #{youtube_id}.
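Since you mention Vimeo: the same pattern carries over. Below is a hedged sketch of a Vimeo version of the helper. It assumes plain links like https://vimeo.com/16430948, where the video ID is the trailing digits, and uses the player URL from Vimeo's standard embed code:
module ApplicationHelper
  # A sketch, not part of the original helper: extract the numeric ID
  # from a vimeo.com/<digits> link and build the iframe player markup.
  def vimeo_embed(vimeo_url)
    vimeo_id = vimeo_url[/vimeo\.com\/(\d+)/, 1]
    %Q{<iframe src="https://player.vimeo.com/video/#{vimeo_id}" width="400" height="225" frameborder="0" allowfullscreen></iframe>}
  end
end
Here str[/regex/, 1] matches and returns the first capture group in one step, which avoids the $1/$5 globals used above.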

Related

How do some sites download YouTube captions?

This is somewhat of a duplicate of Does YouTube API forbid to download video captions if you are not its owner? and Get YouTube captions, which both basically say it's not possible to download captions via the YouTube API unless you are the owner or third-party contributions are enabled. My question, however, is how sites like http://downsub.com/ or http://www.lilsubs.com/ have access to all captions.
In other words, when I access the YouTube API myself (even with the youtubepartner and youtube.force-ssl scopes), I can only download the captions of some videos. When I try the videos that failed for me (with 403: The permissions associated with the request are not sufficient to download the caption track. The request might not be properly authorized, or the video owner might not have enabled third-party contributions for this caption.) on these other sites, it works fine. I'm assuming they are using the YouTube API to access the captions, but what special sauce are they using? Some special partner key? A different API version? Or are they just scraping the videos themselves?
Send a GET request to:
http://video.google.com/timedtext?lang={LANG}&v={VIDEOID}
Example for your video in comment: http://video.google.com/timedtext?lang=ko&v=0db1_qWZjRA
Let's look at another of your examples, https://www.youtube.com/watch?v=7068mw-6lmI (and I agree about the differentiation part in your comment).
There are multiple subtitle tracks available for the video:
English
Korean
Spanish
Korean (auto-generated), also called asr (automatic speech recognition)
These names go into the name parameter (e.g., name=English).
lang is the language code.
In your example: https://www.youtube.com/api/timedtext?lang=es-MX&v=7068mw-6lmI&name=Spanish
If a subtitle track is available, it is possible to request a translation of it, using the tlang parameter.
https://www.youtube.com/api/timedtext?lang=en&v=7068mw-6lmI&name=English&tlang=lv
https://www.youtube.com/api/timedtext?lang=ko&v=7068mw-6lmI&name=Korean&tlang=lv
This would be my bet for what these sites are using, i.e. translation of an available subtitle track (you can confirm this by feeding a video without any subtitle track to one of their sites).
As for asr tracks, a signature seems to always be needed, but as long as one of the regular subtitle tracks is available, you can use that one for translation. E.g. for the example in your OP comment:
https://www.youtube.com/api/timedtext?lang=en&v=vx6NCUyg1NE&tlang=lv
It looks like the last example is special in that both of its subtitle tracks are asr (checked with Chrome -> Inspect -> Network), so you need to omit the name parameter. Unfortunately this difference is not visible in the YouTube video's settings wheel.
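If you want to script these requests rather than paste URLs into a browser, here is a minimal Ruby sketch of the timedtext fetch. The endpoint is unofficial and can change without notice; name and tlang are the optional parameters discussed above:
require "net/http"
require "uri"

# Build and fetch an unofficial timedtext URL.
def fetch_captions(video_id, lang, name: nil, tlang: nil)
  params = { "v" => video_id, "lang" => lang, "name" => name, "tlang" => tlang }.compact
  uri = URI("https://www.youtube.com/api/timedtext")
  uri.query = URI.encode_www_form(params)
  Net::HTTP.get(uri) # caption XML; an empty body means no such track
end

puts fetch_captions("7068mw-6lmI", "en", name: "English", tlang: "lv")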
A 2022 answer:
Option 1: Send a curl request for the watch page, curl -L "https://youtu.be/YbJOTdZBX1g", and search for timedtext in the result; you will find a URL. Replace every \u0026 with & and you have the link to the subtitles.
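Option 1 can be scripted as well. Here is a rough Ruby equivalent of that curl-and-search step, assuming the watch page still embeds the timedtext URL with \u0026-escaped ampersands:
require "net/http"
require "uri"

# Fetch the watch page (the full URL avoids the youtu.be redirect,
# which Net::HTTP.get does not follow) and pull out the first
# timedtext URL, unescaping \u0026 back into "&".
html = Net::HTTP.get(URI("https://www.youtube.com/watch?v=YbJOTdZBX1g"))
if (timedtext_url = html[/https:[^"]*timedtext[^"]*/])
  puts timedtext_url.gsub('\u0026', '&')
end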
Option 2: Use the yt-dlp package:
# For installing see: https://github.com/yt-dlp/yt-dlp#with-pip
from yt_dlp import YoutubeDL

ydl_opts = {
    "skip_download": True,
    "writesubtitles": True,
    "subtitleslangs": ["all", "-live_chat"],
    # Looks like formats available are vtt, ttml, srv3, srv2, srv1, json3
    "subtitlesformat": "json3",
    # You can skip the following option
    "sleep_interval_subtitles": 1,
}

with YoutubeDL(ydl_opts) as ydl:
    ydl.download(["YbJOTdZBX1g"])
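The same options have command-line equivalents if you don't need Python (flag names as of current yt-dlp; run yt-dlp --help to confirm):
yt-dlp --skip-download --write-subs --sub-langs "all,-live_chat" --sub-format json3 --sleep-subtitles 1 "YbJOTdZBX1g"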
There is an unofficial API used by YouTube:
https://www.youtube.com/api/timedtext?lang={LANG}&v={VIDEO_ID}
LANG here is the ISO 639-1 two-letter language code. For your example it would be:
https://www.youtube.com/api/timedtext?lang=ko&v=0db1_qWZjRA
You can check it in the network tab while toggling the closed-caption button.
I have used youtube-transcript-api successfully to retrieve transcripts. Below is a demo that dumps a transcript to HTML with links back to the timestamps in the video:
import sys
from youtube_transcript_api import YouTubeTranscriptApi

video_id = sys.argv[1]
# Retrieve the available transcripts
transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
# Just use the first transcript; let it raise an exception if none exist.
transcript = next(iter(transcript_list))
print("<html><body>")
for line_map in transcript.fetch():
    # 'start' is in seconds; split it into minutes and leftover seconds
    st_min = int(line_map['start'] / 60)
    st_sec = int(line_map['start'] - st_min * 60)
    # link back to the video at the enclosing whole minute
    link_to_tstmp = f"https://youtu.be/{video_id}?t={st_min * 60}"
    tstmp_str = ("%2d:%-2d" % (st_min, st_sec)).replace(" ", "&nbsp;")
    print("""<a href="%s">%s</a> %s<br/>""" % (link_to_tstmp, tstmp_str, line_map['text']))
print("</body></html>")
If there are multiple transcripts, the library provides an API to search by language, etc.
You can further tweak the logic to merge text so you only get one link every so many minutes. I got good results for a lecture by linking every 1 minute and formatting the lines into an HTML table.

Downloading a YouTube video through Wget

I am trying to download YouTube videos through Wget. The first thing necessary is to capture the URL of the actual video resource. Suppose I want to download this video: video. Opening up the page in the Firebug console reveals something like this:
The link which I have circled looks like the link to the resource, for there we see only the video: http://www.youtube.com/v/r-KBncrOggI?version=3&autohide=1. However, when I try to download this resource with Wget, a 4 KB file named r-KBncrOggI#version=3&autohide=1 gets stored on my hard drive, and nothing else. What should I do to get the actual video?
And secondly, is there a way to capture different resources for videos of different resolutions, like 360p, 480p, etc.?
Here is one VERY simplified yet functional version of the youtube-download utility I cited in my other answer:
#!/usr/bin/env perl
use strict;
use warnings;

# CPAN modules we depend on
use JSON::XS;
use LWP::UserAgent;
use URI::Escape;

# Initialize the User Agent
# YouTube servers are weird, so *don't* parse headers!
my $ua = LWP::UserAgent->new(parse_head => 0);

# fetch video page or abort
my $res = $ua->get($ARGV[0]);
die "bad HTTP response" unless $res->is_success;

# scrape video metadata
if ($res->content =~ /\byt\.playerConfig\s*=\s*({.+?});/sx) {
    # parse as JSON or abort
    my $json = eval { decode_json $1 };
    die "bad JSON: $1" if $@;

    # inside the JSON 'args' property, there's an encoded
    # url_encoded_fmt_stream_map property which points
    # to stream URLs and signatures
    while ($json->{args}{url_encoded_fmt_stream_map} =~ /\burl=(http.+?)&sig=([0-9A-F\.]+)/gx) {
        # decode URL and attach signature
        my $url = uri_unescape($1) . "&signature=$2";
        print $url, "\n";
    }
}
Usage example (it returns several URLs to streams with different encoding/quality):
$ perl youtube.pl http://www.youtube.com/watch?v=r-KBncrOggI | head -n 1
http://r19---sn-bg07sner.c.youtube.com/videoplayback?fexp=923014%2C916623%2C920704%2C912806%2C922403%2C922405%2C929901%2C913605%2C925710%2C929104%2C929110%2C908493%2C920201%2C913302%2C919009%2C911116%2C926403%2C910221%2C901451&ms=au&mv=m&mt=1357996514&cp=U0hUTVBNUF9FUUNONF9IR1RCOk01RjRyaG4wTHdQ&id=afe2819dcace8202&ratebypass=yes&key=yt1&newshard=yes&expire=1358022107&ip=201.52.68.216&ipbits=8&upn=m-kyX9-4Tgc&sparams=cp%2Cid%2Cip%2Cipbits%2Citag%2Cratebypass%2Csource%2Cupn%2Cexpire&itag=44&sver=3&source=youtube,quality=large&signature=A1E7E91DD087067ED59101EF2AE421A3503C7FED.87CBE6AE7FB8D9E2B67FEFA9449D0FA769AEA739
I'm afraid it's not that easy to get the right link for the video resource.
The link you got, http://www.youtube.com/v/r-KBncrOggI?version=3&autohide=1, points to the player rather than the video itself. There is a Perl utility, youtube-download, which is well maintained and does the trick. This is how to get the HQ version (magic fmt=18) of that video:
stas@Stanislaws-MacBook-Pro:~$ youtube-download -o "{title}.{suffix}" --fmt 18 r-KBncrOggI
--> Working on r-KBncrOggI
Downloading `Sourav Ganguly in Farhan Akhtar's Show - Oye! It's Friday!.mp4`
75161060/75161060 (100.00%)
Download successful!
stas@Stanislaws-MacBook-Pro:~$
There might be better command-line YouTube Downloaders around. But sorry, one doesn't simply download a video using Firebug and wget any more :(
The only way I know to capture that URL manually is by watching the browser's active downloads (e.g. in the developer tools' network panel).
The largest data chunks are the video data, so you can copy their URL:
http://s.youtube.com/s?lact=111116&uga=m30&volume=4.513679238953965&sd=BBE62AA4AHH1357937949850490&rendering=accelerated&fs=0&decoding=software&nsivbblmax=679542.000&hcbt=105.345&sendtmp=1&fmt=35&w=640&vtmp=1&referrer=None&hl=en_US&nsivbblmin=486355.000&nsivbblmean=603805.166&md=1&plid=AATTCZEEeM825vCx&ns=yt&ptk=youtube_none&csipt=watch7&rt=110.904&tsphab=1&nsiabblmax=129097.000&tspne=0&tpmt=110&nsiabblmin=123113.000&tspfdt=436&hbd=30900552&et=110.146&hbt=30.770&st=70.213&cfps=25&cr=BR&h=480&screenw=1440&nsiabblmean=125949.872&cpn=JlqV9j_oE1jzk7Zc&nsivbblc=343&nsiabblc=343&docid=r-KBncrOggI&len=1302.676&screenh=900&abd=1&pixel_ratio=1&bc=26131333&playerw=854&idpj=0&hcbd=25408143&playerh=510&ldpj=0&fexp=920704,919009,922403,916709,912806,929110,928008,920201,901451,909708,913605,925710,916623,929104,913302,910221,911116,914093,922405,929901&scoville=1&el=detailpage&bd=6676317&nsidf=1&vid=Yfg8gnutZoTD4G5SVKCxpsPvirbqG7pvR&bt=40.333&mos=0&vq=auto
However, for a large video this will only return part of the stream, unless you figure out which URL query parameter controls the range to be downloaded and adjust it.
A bonus: everything changes periodically, as YouTube is constantly evolving. So don't do this manually unless you crave pain.

Youtube "end=" embed tag not working?

I am trying to embed a YouTube video on my site with specific start and end times. Sites such as snipsnip.it and splicd.com use the start= and end= parameters in the iframe src, like so:
<iframe src='http://www.youtube.com/embed/OwjfE2ylbWU?start=5&end=10' width='640' height='360'>
</iframe>
However, this does not work on my web page. The video starts at the right time but then just plays to the end of the video. The YouTube API documentation states that there is no end parameter, yet these sites all use it successfully.
Any idea on how to get embedded Youtube videos to end at a specific point?
splicd.com doesn't actually depend on YouTube to stop the video. They poll the player with the following JavaScript, using the YouTube Player API:
function checkYouTubePlayHead()
{
    current = player.getCurrentTime();
    if ((current >= end) && splice) {
        player.seekTo(start, true);
        player.pauseVideo();
    }
    if (current > start)
        played = true;
}
It was working before without any JS to implement; YouTube just switched to the googleapis video repository two days ago, and that broke the end parameter. Hopefully it will be fixed soon; no, you're not the only one who needs a solution for this. So far, this worked fine:
http://www.youtube.com/v/81hChAAt3So&start=107&end=115&autoplay=0&autohide=0&theme=dark&color=white&rel=0&modestbranding=1&showinfo=0
For now most of the parameters are not passed through. Be patient :)

How to check if a URL contains a video?

Hi,
I am creating a Ruby on Rails script in which a user shares a link, and if the link contains a video, the embed code of the video is extracted. In other words, I am trying to implement a "Facebook post link" style feature. Can someone please guide me on how this can be achieved?
The only way I can think of to do this would be to manually check each post for a link to the video, for example:
YOUTUBE_EMBED = '<iframe title="YouTube video player" width="640" height="390" src="http://www.youtube.com/embed/VIDEO_ID" frameborder="0" allowfullscreen></iframe>'

if comment =~ /.*http:\/\/(\w+\.)?youtube\.com\/watch\?v=(\w+).*/
  return YOUTUBE_EMBED.gsub(/VIDEO_ID/, $2)
end
Then repeat this process for each video site. I am wrestling with a similar concept so if you figure out a better way to do it let me know!
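For instance, repeating the process for Vimeo might look like the following (a sketch: the regex assumes plain vimeo.com/<digits> links, and the markup mirrors Vimeo's standard iframe embed):
VIMEO_EMBED = '<iframe src="http://player.vimeo.com/video/VIDEO_ID" width="400" height="225" frameborder="0"></iframe>'

if comment =~ /http:\/\/(\w+\.)?vimeo\.com\/(\d+)/
  return VIMEO_EMBED.gsub(/VIDEO_ID/, $2)
end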
You can analyze the HTTP headers for that link:
require "net/http"
Net::HTTP.start('www.jhepple.com', 80) do |http|
http.head('/support/SampleMovies/MPEG-1.mpg')
end['content-type']
Outputs "video/mpeg"
Now that you know it's really a video, do what you want with it
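Wrapped into a helper, that check might look like this (a sketch; video? simply tests whether the reported Content-Type starts with video/):
require "net/http"
require "uri"

# True if the server reports a video/* Content-Type for the URL.
def video?(url)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.head(uri.path) }
  response["content-type"].to_s.start_with?("video/")
end

video?("http://www.jhepple.com/support/SampleMovies/MPEG-1.mpg") # => true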
You could also use a utility like file and fork a system process so that it executes a command like file -b downloaded_file.mpg.
So your code would look something like this:
stdout = IO.popen("file -b /path/to/video.mpg") { |io| io.read }
if stdout =~ /MPEG/
  puts "MPEG Detected"
end
Flash videos usually have the extension .flv, so you just need to look for files that have it.
html_source.scan(%r|href\s*=\s*"[^"]+\.flv"|)
If you need other file formats, just change the regexp.
You can use an external service like embed.ly.
Here is a gem that may help you: https://github.com/judofyr/ruby-oembed
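With ruby-oembed the lookup is only a couple of calls (a sketch based on the gem's README; register_all loads its built-in provider list, which includes YouTube and Vimeo):
require "oembed"

OEmbed::Providers.register_all

resource = OEmbed::Providers.get("http://vimeo.com/16430948")
puts resource.html # the provider's own embed markup (an iframe for Vimeo)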

How to parse embedded videos from youtube, vimeo, etc

I'm working with Ruby On Rails 2.3.8 and I'm using TinyMCE with image and video upload functionalities.
I've figured out that when I insert a Vimeo video, it won't work, because it needs its own iframe, like the following:
<iframe src="http://player.vimeo.com/video/16430948" width="400" height="225" frameborder="0"></iframe><p>YOU! - Heart from KUSKUS on Vimeo.</p>
I'm now wondering how to show YouTube (which works just fine), Vimeo, and other kinds of embedded videos.
Update:
Searching the internet, I found the following code in the file /plugins/media/media.js, within the getType function:
// Vimeo
if (v.match(/^http:\/\/(?:www\.){0,1}vimeo\.com\/(\d+)$/)) {
    f.width.value = '400';
    f.height.value = '321';
    f.src.value = 'http://vimeo.com/moogaloop.swf?clip_id=' + v.match(/^http:\/\/(?:www\.){0,1}vimeo\.com\/(\d+)$/)[1];
    return 'flash';
}
But it's not working for me. At best, all I see is that it treats the video as if it were a common Flash video, instead of inserting an iframe into the HTML to play it (as happens when you click the "Embed" button at vimeo.com).
The iframe tag usually gets removed (cleaned up) unless you specify otherwise.
Add this to your TinyMCE configuration to keep iframes inside the editor:
extended_valid_elements:"iframe[id|class|title|style|align|frameborder|height|longdesc|marginheight|marginwidth|name|scrolling|src|width]",
This thread might be of help too.
