youtube-dl, download files which have subtitles - youtube

I am trying to download the videos of a channel that have subtitles. There are more than a thousand files, but only a few have subtitles.
youtube-dl --all-subs -ciw -o "./tmp/%(playlist_index)s - %(title)s.%(ext)s" 'https://www.youtube.com/watch?v=eVW0Xz85qSA&list=PLElG6fwk_0UmBgC02jKJePx'
If I could run a command after each URL is downloaded, that would be good enough: I would check whether any subtitle file exists and decide whether to keep or remove the video.
Maybe --exec is the right option, but it did not work for me as I expected.
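Something along these lines is what I have in mind (just a sketch, assuming subtitles land next to the video as NAME.LANG.vtt and that {} expands to the downloaded file's path):
# sketch only: after each download, delete the video again if no .vtt subtitle was written for it
youtube-dl --all-subs -ciw \
  -o "./tmp/%(playlist_index)s - %(title)s.%(ext)s" \
  --exec 'f={}; ls "${f%.*}".*.vtt >/dev/null 2>&1 || rm -- "$f"' \
  'https://www.youtube.com/watch?v=eVW0Xz85qSA&list=PLElG6fwk_0UmBgC02jKJePx'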

This could probably be done in a more elegant way, but it works for me.
First extract the URLs that have subtitles (replace "playlist_url" with the actual URL of a playlist, of course):
youtube-dl --write-sub -ij playlist_url | jq -r ".subtitles" \
| grep -Eo "v=[^&]+" | sort -u > urls.txt \
&& sed -i 's|^|https://www.youtube.com/watch?|' urls.txt
and then download those files with a batch input
youtube-dl -cia urls.txt
* notice that the proper playlist_url for the channel address you provided is "https://www.youtube.com/user/SonechkoProject/videos"

Related

using grep command to get a specific word [LINUX]

I have a test.txt file with links, for example:
google.com?test=
google.com?hello=
and this code
xargs -0 -n1 -a FUZZvul.txt -d '\n' -P 20 -I % curl -ks1L '%/?=DarkLotus' | grep -a 'DarkLotus'
When I type a specific word, such as DarkLotus, in the terminal, it checks the links in the file and shows me where that word is reflected by the links I provided in the test file.
That part works fine; the problem is that I have many links, and when a result appears in the terminal, I do not know which site reflected the DarkLotus word.
How can I do that?
Try the -n option. It shows the line number in the file for each matched line.
Best Regards,
Haridas.
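For example, run directly against the file from the question:
grep -n 'DarkLotus' FUZZvul.txt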
I'm not sure what you are up to there, but can you invert it? grep by default prints matching lines. The problem here is that you are piping input into grep from the stdout of the previous commands, and grep lacks context there. Since you have a file to work with:
$ grep 'DarkLotus' FUZZvul.txt
If your intention is to also follow the link then it might be easier to write a bash script:
#!/bin/bash
for line in `grep 'DarkLotus' FUZZvul.txt`
do
# each matching line in the file is assumed to be the link itself
link="${line}"
echo "${link}"
curl -ks1L "${link}"
done
Then you could make your script accept user input:
#!/bin/bash
word="${1}"
for line in `grep "${word}" FUZZvul.txt`
...
and then
$ my_link_getter "DarkLotus"
https://google?somearg=DarkLotus
...
And then you could make the txt file a parameter.
etc.
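Pulling those pieces together, a rough sketch of what the finished script could look like (the default file name is just an example):
#!/bin/bash
# usage: ./my_link_getter WORD [FILE]   (FILE defaults to FUZZvul.txt)
word="${1}"
file="${2:-FUZZvul.txt}"
grep "${word}" "${file}" | while read -r link; do
    echo "${link}"
    curl -ks1L "${link}"
done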

Tool for editing lvm.conf file

Is there any lvm.conf editor?
I'm trying to set global_filter, use_lvmetad and some other options, currently using sed:
sed -i /etc/lvm/lvm.conf \
-e "s/use_lvmetad = 1/use_lvmetad = 0/" \
-e "/^ *[^#] *global_filter/d" \
-e "/^devices {/a\ global_filter = [ \"r|/dev/drbd.*|\", \"r|/dev/dm-.*|\", \"r|/dev/zd.*|\" ]"
but I don't like this too much; is there any better way?
I found only the lvmconfig tool, but it can only display certain configuration sections, and can't edit them.
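For example, it can show what a setting is currently set to, but that's it:
lvmconfig global/use_lvmetad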
If you are using an Ubuntu variant then you can use the LVM GUI to configure and manage LVM. Refer to this link.
It seems that augtool is exactly what I was looking for.
These two packages should be enough to properly process the lvm.conf file:
apt install augeas-tools augeas-lenses
Example usage:
augtool print /files/etc/lvm/lvm.conf
And you should get the whole parse tree on stdout.
If the parser fails you won’t get any output; print the error message using:
augtool print /files/etc/lvm/lvm.conf/error
The augtool equivalent for the sed command from the original question:
augtool -s <<EOT
set /files/etc/lvm/lvm.conf/global/dict/use_lvmetad/int "0"
rm /files/etc/lvm/lvm.conf/devices/dict/global_filter
set /files/etc/lvm/lvm.conf/devices/dict/global_filter/list/0/str "r|^/dev/drbd.*|"
set /files/etc/lvm/lvm.conf/devices/dict/global_filter/list/1/str "r|/dev/dm-.*|"
set /files/etc/lvm/lvm.conf/devices/dict/global_filter/list/2/str "r|/dev/zd.*|"
EOT
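To double-check the result, print just that subtree with the same print command as above:
augtool print /files/etc/lvm/lvm.conf/devices/dict/global_filter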

Youtube-DL AUDIO-ONLY Playlist

I want to download an AUDIO-ONLY playlist using youtube-dl. I've got the basics down. My version of youtube-dl is up to date, and the command I'm using is:
youtube-dl -x --extract-audio --yes-playlist --playlist-start 1 --playlist-end 18 \
https://www.youtube.com/watch?v=NRHoSXxTpgI&index=1&list=OLAK5uy_lowZyOuAZVs41vEtzV6e0KU8Iue1YQlzg
But it keeps getting stuck on
Deleting original file [filename] (pass -k to keep)
Github doesn't seem to be of any help: https://github.com/rg3/youtube-dl/issues/12570
Any ideas?
The ampersand (&) is a special character in most shells, advising the shell to run the command so far in the background. This should be plainly visible in the first lines after the command execution, which will likely say something like
[1] 17156
[2] 17157
This is your shell telling you the process IDs of the new background processes.
You must escape ampersands with a backslash or quotes, like this:
youtube-dl -x --yes-playlist --playlist-start 1 --playlist-end 18 \
'https://www.youtube.com/watch?v=NRHoSXxTpgI&index=1&list=OLAK5uy_lowZyOuAZVs41vEtzV6e0KU8Iue1YQlzg'
--extract-audio is the same as -x, and thus can be deleted.
For more information, see the youtube-dl FAQ.
-x = extract audio only
-i = ignore errors (skip unavailable files)
use only the 'list=...' part, delete the other parameters from the copied URL
the default is to download from the first to the last item
youtube-dl -ix https://www.youtube.com/watch?list=...
for individual tracks add e.g. --playlist-items 4,7-9
see also: github/ytdl-org: output template examples
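Putting those options together (the playlist ID stays elided here, as above):
youtube-dl -ix --playlist-items 4,7-9 'https://www.youtube.com/watch?list=...'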

tar pre-run to evaluate expected size or number of files

The problem:
I have a back-end process that at some point collects and builds a big tar file.
This tar receives a few directories and an exclude-files list.
The process can take up to a few minutes, and I want to report the progress of the tarring process in my front-end (GUI). (This is a big issue for a user who presses the download button and then it seems like nothing is happening...)
I know I can use -v -R in the tar command and count files and size progress, but I am looking for some kind of tar pre-run / dry-run mode to help me evaluate either the expected number of files or the expected tar size.
The command I am using: tar -jcf 'FILE.tgz' 'exclude_files' 'include_dirs_and_files'
Thanks to everyone who is willing to assist.
You can pipe the output to the wc tool instead of actually making a file.
With file listing (verbose):
[git@server]$ tar czvf - ./test-dir | wc -c
./test-dir/
./test-dir/test.pdf
./test-dir/test2.pdf
2734080
Without:
[git@server]$ tar czf - ./test-dir | wc -c
2734080
Why don't you run a
DIRS=("./test-dir" "./other-dir-to-test")
find "${DIRS[@]}" -type f | wc -l
beforehand. This gets all the files (-type f), one per line, and counts the number of files. DIRS is a bash array, so you can store the folders in a variable.
If you want to know the size of all the stored files, you can use du
DIRS=("./test-dir" "./other-dir-to-test")
du -c -d 0 "${DIRS[@]}" | tail -1 | awk -F ' ' '{print $1}'
This prints the disk usage with du, calculates a grand total (-c flag), gets the last line (e.g. 4378921 total), and uses just the first column with awk.
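If the goal is a live progress indicator rather than just a pre-run estimate, one option is to feed that du total to pv (a sketch, assuming pv is installed; the percentage is only approximate since du measures the files, not the archive stream):
DIRS=("./test-dir" "./other-dir-to-test")
SIZE=$(du -sb -c "${DIRS[@]}" | tail -1 | awk '{print $1}')    # total size in bytes
tar -cf - "${DIRS[@]}" | pv -s "$SIZE" | bzip2 > FILE.tar.bz2  # pv reports bytes passed and percentage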

How do I get the data stream link for any video of youtube?

I'm not sure if this is the right place to post this question. I googled a lot about this, but nothing turned up. For a link of the form
http://www.youtube.com/watch?v=[video_id]
how do I get the link for the data stream?
The following bash script will retrieve youtube streaming URLs. I know this is outdated, but maybe it will help someone.
#!/bin/bash
[ -z "$1" ] && printf "usage: `basename $0` <youtube_url>\n" && exit 1
ID="$(echo "$1" | grep -o "v=[^\&]*" | sed 's|v=||')"
URL="$(curl -s "http://www.youtube.com/get_video_info?&video_id=$ID" | sed -e 's|%253A|:|g' -e 's|%252F|/|g' -e 's|%253F|?|g' -e 's|%253D|=|g' -e 's|%2525|%|g' -e 's|%2526|\&|g' -e 's|\%26|\&|g' -e 's|%3D|=|g' -e 's|type=video/[^/]*&sig|signature|g' | grep -o "http://o-o---preferred[^:]*signature=[^\&]*" | head -1)"
[ -z "$URL" ] && printf "Nothing was found...\n" && exit 2
echo "$URL"
Here's a quick lesson in reverse-engineering the YouTube page to extract the stream data.
In the HTML you'll find a <script> tag which defines a variable "swfHTML" - it looks like this: "var swfHTML = (isIE) ? "...
The text in the quotes that follows that snippet is the HTML that displays the Flash object. Note, this text is a set of broken-up strings that get concatenated, so you'll need to clean it up (i.e. strip instances of '" + "' and escaped backslashes in order to get the HTML string).
Once clean you'll need to find the <param> tag with name="flashvars", the value of this tag is an &-delimited URL. Do a split on the & and you'll get your key-value pairs for all the data relating to this video.
The main key you're looking for is "fmt_url_map" and it's a URL-encoded string of comma-separated values starting with "35|" or "34|" or other. (These are defined in another key, "fmt_list", to be files of resolution 854x480 for 35, 640x360 for 34, etc.)
Each channel provides RSS data, which is not updated immediately.
Here is a generator for Youtube RSS files. You should be able to deduce the location of video files based on the RSS information. The flv files should be streamable, but other formats are also provided.
EDIT:
http://www.referd.info/ is no longer available. It basically was a service where you provided the youtube link and it dereferenced it, finding all possible download sources for that video. I am sure those services are still out there... this one isn't anymore.
You need to open a link like this:
http://www.youtube.com/get_video_info?&video_id=OjEG808MfF4
and find your stream URL in the returned data.
