How do I use configuration variables in Tesseract? - ios

I went through this tutorial successfully: Tesseract OCR Tutorial for iOS. It uses the Tesseract OCR iOS framework.
The app works well with the sample image provided by the tutorial, but none of my own images that I test work (the output is gibberish).
To troubleshoot, the docs recommend toggling a configuration variable tessedit_write_images to true (or using configfile get.images) to view the image file to be processed. But I don't see where to set the boolean value and I'm not sure where to place or how to use a configfile.
Searching for "tessedit_write_images" in the project files in Xcode doesn't return anything.

You can set configuration variables either with a command-line option or in a configuration file.
Using Command Line Option
$ tesseract input.jpg output --oem 2 -l eng -c tessedit_write_images=1
-c configvar=value
Set value for control parameter. Multiple -c arguments are allowed.
Using Config File (myConfig) Option
$ tesseract Lord_Saraswathi.jpg text --oem 2 -l eng myConfig
$ cat myConfig
tessedit_write_images 1

Related

Why my rsync scripts do no work on the new mac? [duplicate]

I am making an NW.js app on macOS, and want to run the app in dev mode
by double-clicking on an icon.
In the first step, I'm trying to make my shell script work.
Using VS Code on Windows (I wanted to gain time), I have created a run-nw file at the root of my project, containing this:
#!/bin/bash
cd "src"
npm install
cd ..
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &
but I get this output:
$ sh ./run-nw
: command not found
: No such file or directory
: command not found
: No such file or directory
Usage: npm <command>
where <command> is one of: (snip commands list)
(snip npm help)
npm@3.10.3 /usr/local/lib/node_modules/npm
: command not found
: No such file or directory
: command not found
Some things I don't understand.
It seems that it takes empty lines as commands.
In my editor (VS Code) I have tried to replace \r\n with \n
(in case the \r creates problems) but it changes nothing.
It seems that it doesn't find the folders
(with or without the dirname instruction),
or maybe it doesn't know about the cd command ?
It seems that it doesn't understand the install argument to npm.
The part that really weirds me out, is that it still runs the app
(if I did an npm install manually)...
Not able to make it work properly, and suspecting something weird with
the file itself, I created a new one directly on the Mac, using vim this time.
I entered the exact same instructions, and... now it works without any
issues.
A diff on the two files reveals exactly zero difference.
What can be the difference? What can make the first script not work? How can I find out?
Update
Following the accepted answer's recommendations, after the wrong line
endings came back, I checked multiple things.
It turns out that since I copied my ~/.gitconfig from my Windows
machine, I had autocrlf=true, so every time I modified the bash
file under Windows, it re-set the line endings to \r\n.
So, in addition to running dos2unix (which you will have to
install using Homebrew on a Mac), if you're using Git, check your
.gitconfig file.
Yes. Bash scripts are sensitive to line-endings, both in the script itself and in data it processes. They should have Unix-style line-endings, i.e., each line is terminated with a Line Feed character (decimal 10, hex 0A in ASCII).
DOS/Windows line endings in the script
With Windows or DOS-style line endings, each line is terminated with a Carriage Return followed by a Line Feed character. You can see the otherwise invisible Carriage Return in the output of cat -v yourfile:
$ cat -v yourfile
#!/bin/bash^M
^M
cd "src"^M
npm install^M
^M
cd ..^M
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
In this case, the carriage return (^M in caret notation or \r in C escape notation) is not treated as whitespace. Bash interprets the first line after the shebang (consisting of a single carriage return character) as the name of a command/program to run.
Since there is no command named ^M, it prints : command not found
Since there is no directory named "src"^M (or src^M), it prints : No such file or directory
It passes install^M instead of install as an argument to npm which causes npm to complain.
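A quick way to check for this before running a script is to search the file for carriage returns. The sketch below is only an illustration: the has_cr helper name and the temporary demo file are made up for this example and are not part of any tool mentioned here.

```shell
#!/bin/sh
# has_cr FILE: succeed if FILE contains at least one carriage return.
# (has_cr is a hypothetical helper name used only in this sketch.)
has_cr() {
    cr=$(printf '\r')   # a literal CR; works in shells without $'...' quoting
    grep -q "$cr" "$1"
}

# Build a small demo script with CRLF line endings, like the one above.
demo=$(mktemp)
printf '#!/bin/bash\r\ncd "src"\r\nnpm install\r\n' > "$demo"

if has_cr "$demo"; then
    echo "carriage returns found: convert the file before running it"
else
    echo "no carriage returns found"
fi
rm -f "$demo"
```

Because the check relies only on printf and grep, it should behave the same under bash and a plain /bin/sh.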
DOS/Windows line endings in input data
Like above, if you have an input file with carriage returns:
hello^M
world^M
then it will look completely normal in editors and when writing it to screen, but tools may produce strange results. For example, grep will fail to find lines that are obviously there:
$ grep 'hello$' file.txt || grep -x "hello" file.txt
(no match because the line actually ends in ^M)
Appended text will instead overwrite the line because the carriage return moves the cursor to the start of the line:
$ sed -e 's/$/!/' file.txt
!ello
!orld
String comparison will seem to fail, even though strings appear to be the same when writing to screen:
$ a="hello"; read b < file.txt
$ if [[ "$a" = "$b" ]]
then echo "Variables are equal."
else echo "Sorry, $a is not equal to $b"
fi
Sorry, hello is not equal to hello
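If converting the input file is not an option, the stray carriage return can also be removed from the variable itself. Here is a minimal sketch using POSIX parameter expansion; the strip_trailing_cr name is hypothetical, and the DOS-format input is simulated with printf rather than read from file.txt.

```shell
#!/bin/sh
cr=$(printf '\r')

# strip_trailing_cr STRING: print STRING without one trailing carriage return.
# (hypothetical helper name used only in this sketch)
strip_trailing_cr() {
    printf '%s' "${1%"$cr"}"
}

a="hello"
b=$(printf 'hello\r')          # simulates a line read from a DOS-format file
b=$(strip_trailing_cr "$b")

if [ "$a" = "$b" ]; then
    echo "Variables are equal."
else
    echo "Variables still differ."
fi
```

The ${var%pattern} expansion removes the pattern only when it appears at the very end of the value, so strings without a carriage return pass through unchanged.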
Solutions
The solution is to convert the file to use Unix-style line endings. There are a number of ways this can be accomplished:
This can be done using the dos2unix program:
dos2unix filename
Open the file in a capable text editor (Sublime, Notepad++, not Notepad) and configure it to save files with Unix line endings, e.g., with Vim, run the following command before (re)saving:
:set fileformat=unix
If you have a version of the sed utility that supports the -i or --in-place option, e.g., GNU sed, you could run the following command to strip trailing carriage returns:
sed -i 's/\r$//' filename
With other versions of sed, you could use output redirection to write to a new file. Be sure to use a different filename for the redirection target (it can be renamed later).
sed 's/\r$//' filename > filename.unix
Similarly, the tr translation filter can be used to delete unwanted characters from its input:
tr -d '\r' <filename >filename.unix
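The redirect-to-a-new-file approach can be wrapped so that the original file ends up converted in place. This is a sketch under assumptions, not a replacement for dos2unix: strip_cr_inplace is a hypothetical helper name, and because it uses tr it deletes every carriage return in the file, not only those at line ends.

```shell
#!/bin/sh
# strip_cr_inplace FILE: remove every carriage return from FILE.
# (hypothetical helper name used only in this sketch)
strip_cr_inplace() {
    tmp=$(mktemp) || return 1
    if tr -d '\r' < "$1" > "$tmp"; then
        # Write back with cat rather than mv so the original file keeps
        # its permissions (for example the executable bit).
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway file with CRLF endings.
demo=$(mktemp)
printf 'hello\r\nworld\r\n' > "$demo"
strip_cr_inplace "$demo"
od -c "$demo"    # inspect the bytes; no \r should remain
rm -f "$demo"
```

Writing to a temporary file first avoids the classic mistake of redirecting output onto the same file that is being read, which would truncate it.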
Cygwin Bash
With the Bash port for Cygwin, there’s a custom igncr option that can be set to ignore the Carriage Return in line endings (presumably because many of its users use native Windows programs to edit their text files).
This can be enabled for the current shell by running set -o igncr.
Setting this option applies only to the current shell process so it can be useful when sourcing files with extraneous carriage returns. If you regularly encounter shell scripts with DOS line endings and want this option to be set permanently, you could set an environment variable called SHELLOPTS (all capital letters) to include igncr. This environment variable is used by Bash to set shell options when it starts (before reading any startup files).
Useful utilities
The file utility is useful for quickly seeing which line endings are used in a text file. Here’s what it prints for each file type:
Unix line endings: Bourne-Again shell script, ASCII text executable
Mac line endings: Bourne-Again shell script, ASCII text executable, with CR line terminators
DOS line endings: Bourne-Again shell script, ASCII text executable, with CRLF line terminators
The GNU version of the cat utility has a -v, --show-nonprinting option that displays non-printing characters.
The dos2unix utility is specifically written for converting text files between Unix, Mac and DOS line endings.
Useful links
Wikipedia has an excellent article covering the many different ways of marking the end of a line of text, the history of such encodings and how newlines are treated in different operating systems, programming languages and Internet protocols (e.g., FTP).
Files with classic Mac OS line endings
With Classic Mac OS (pre-OS X), each line was terminated with a Carriage Return (decimal 13, hex 0D in ASCII). If a script file was saved with such line endings, Bash would only see one long line like so:
#!/bin/bash^M^Mcd "src"^Mnpm install^M^Mcd ..^M./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
Since this single long line begins with an octothorpe (#), Bash treats the line (and the whole file) as a single comment.
Note: In 2001, Apple launched Mac OS X which was based on the BSD-derived NeXTSTEP operating system. As a result, OS X also uses Unix-style LF-only line endings and since then, text files terminated with a CR have become extremely rare. Nevertheless, I think it’s worthwhile to show how Bash would attempt to interpret such files.
On JetBrains products (PyCharm, PHPStorm, IDEA, etc.), you'll need to click on CRLF/LF to toggle between the two types of line separators (\r\n and \n).
I was trying to startup my docker container from Windows and got this:
/bin/bash^M: bad interpreter: No such file or directory
I was using Git Bash and the problem was in the Git config. The steps below fixed it; the first one configures Git to not convert line endings on checkout:
git config --global core.autocrlf input
delete your local repository
clone it again.
Many thanks to Jason Harmon in this link:
https://forums.docker.com/t/error-while-running-docker-code-in-powershell/34059/6
Before that, I tried the following, none of which worked:
dos2unix scriptname.sh
sed -i -e 's/\r$//' scriptname.sh
sed -i -e 's/^M$//' scriptname.sh
If you're using the read command to read from a file (or pipe) that is (or might be) in DOS/Windows format, you can take advantage of the fact that read will trim whitespace from the beginning and ends of lines. If you tell it that carriage returns are whitespace (by adding them to the IFS variable), it'll trim them from the ends of lines.
In bash (or zsh or ksh), that means you'd replace this standard idiom:
IFS= read -r somevar # This will not trim CR
with this:
IFS=$'\r' read -r somevar # This *will* trim CR
(Note: the -r option isn't related to this, it's just usually a good idea to avoid mangling backslashes.)
If you're not using the IFS= prefix (e.g. because you want to split the data into fields), then you'd replace this:
read -r field1 field2 ... # This will not trim CR
with this:
IFS=$' \t\n\r' read -r field1 field2 ... # This *will* trim CR
If you're using a shell that doesn't support the $'...' quoting mode (e.g. dash, the default /bin/sh on some Linux distros), or your script even might be run with such a shell, then you need to get a little more complex:
cr="$(printf '\r')"
IFS="$cr" read -r somevar # Read trimming *only* CR
IFS="$IFS$cr" read -r field1 field2 ... # Read trimming CR and whitespace, and splitting fields
Note that normally, when you change IFS, you should put it back to normal as soon as possible to avoid weird side effects; but in all these cases, it's a prefix to the read command, so it only affects that one command and doesn't have to be reset afterward.
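Put together in a loop, the portable variant might look like the sketch below. The print_lines_trimmed name and the demo file are hypothetical, and the example only covers the simple case of a single carriage return at the end of each line.

```shell
#!/bin/sh
cr=$(printf '\r')

# print_lines_trimmed FILE: print each line of FILE in brackets,
# letting read drop the trailing carriage return via IFS.
# (hypothetical helper name used only in this sketch)
print_lines_trimmed() {
    while IFS="$cr" read -r line; do
        printf '[%s]\n' "$line"
    done < "$1"
}

# Demo input in DOS/Windows format.
demo=$(mktemp)
printf 'hello\r\nworld\r\n' > "$demo"
print_lines_trimmed "$demo"
rm -f "$demo"
```

The brackets make it easy to see whether a carriage return survived: if it had, the closing bracket would be printed at the start of the line on a terminal.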
Coming from a duplicate, if the problem is that you have files whose names contain ^M at the end, you can rename them with
for f in *$'\r'; do
mv "$f" "${f%$'\r'}"
done
You probably want to fix whatever caused these files to have broken names in the first place (probably a script which created them should be dos2unixed and then rerun?) but sometimes this is not feasible.
The $'\r' syntax is Bash-specific; if you have a different shell, maybe you need to use some other notation. Perhaps see also Difference between sh and bash
Since VS Code is being used, we can see CRLF or LF in the bottom right of the window depending on what's being used, and clicking on it lets us switch between them.
We can also use the "Change End of Line Sequence" command from the Command Palette. Whatever's easier to remember, since they're functionally the same.
One more way to get rid of the unwanted CR ('\r') character is to run the tr command, for example:
$ tr -d '\r' < dosScript.py > nixScript.py
I ran into this issue when I use git with WSL.
Git has a feature that changes the line endings of files according to the OS you are using. On Windows it makes sure the line endings are \r\n, which is not compatible with Linux, which uses only \n.
You can resolve this problem by adding a file named .gitattributes to your Git root directory with lines like the following:
config/* text eol=lf
run.sh text eol=lf
In this example, all files inside the config directory, as well as run.sh, will have line-feed-only line endings.
For Notepad++ users, this can be solved via Edit > EOL Conversion > Unix (LF).
The simplest way on Mac / Linux: create a file using the touch command, open it with vi or Vim, paste your code and save. This would automatically remove the Windows characters.
If you are using a text editor like BBEdit you can do it at the status bar. There is a selection where you can switch.
For IntelliJ users, here is the solution for writing Linux scripts:
Set the line separator to "LF - Unix and macOS (\n)".
Scripts may call each other.
An even better magic solution is to convert all scripts in the folder/subfolders:
find . -name "*.sh" -exec sed -i -e 's/\r$//' {} +
You can use dos2unix too but many servers do not have it installed by default.
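Before rewriting files in bulk, it can help to list which scripts actually contain carriage returns. A minimal sketch, assuming only find and grep are available; list_cr_scripts is a hypothetical helper name and the demo directory is created just for the example.

```shell
#!/bin/sh
# list_cr_scripts DIR: print the *.sh files under DIR that contain a carriage return.
# (hypothetical helper name used only in this sketch)
list_cr_scripts() {
    cr=$(printf '\r')
    find "$1" -name '*.sh' -type f -exec grep -l "$cr" {} +
}

# Demo tree: one script with CRLF endings, one with LF endings.
demo=$(mktemp -d)
printf 'echo a\r\n' > "$demo/dos.sh"
printf 'echo b\n' > "$demo/unix.sh"
list_cr_scripts "$demo"    # only the path of dos.sh should be listed
rm -rf "$demo"
```

Running the listing first gives a chance to review the affected files before an in-place edit touches them.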
For the sake of completeness, I'll point out another solution which can solve this problem permanently without the need to run dos2unix all the time:
sudo ln -s /bin/bash `printf 'bash\r'`

Google Cloud Platform - Viewing downloaded files after wget

I am completing this tutorial and am at the part where you download the code for the tutorial. The request we send to Github is:
wget https://github.com/GoogleCloudPlatform/cloudml-samples/archive/master.zip
I understand that this downloads archive to GCP, and I can see the files in the Cloud shell, but is there a way to see the files through the Google Console GUI? I would like to browse the files I have downloaded to understand their structure better.
By clicking on the pencil icon in the top right corner, the Cloud Shell code editor will open.
Quoting the documentation:
"The built-in code editor is based on Orion. You can use the code
editor to browse file directories as well as view and edit files, with
continued access to the Cloud Shell. The code editor is available by
default with every Cloud Shell instance."
You can find more info here: https://cloud.google.com/shell/docs/features#code_editor
If you prefer to use the command line to view files, you can install the tree Unix CLI command and run it in Cloud Shell to list the contents of directories in a tree-like format.
install tree => $ sudo apt-get install tree
run it => $ tree ./ -h --filelimit 4
-h will show the size of files/directories in human-readable form
and --filelimit tells tree not to descend into directories that contain more than the given number of entries.
Use $ man tree to see the available parameters for the command, or check the man online documentation here: https://linux.die.net/man/1/tree

Error using the 'find' command to generate a collection file on opencv

I am facing a problem generating a collection file of the positive images to train the Haar Cascade in OpenCV to detect a car. Every tutorial I found on the internet uses the same command, but I am unable to execute it.
I am using Command Prompt and Windows PowerShell to execute this command:
find ./positive_images/ -iname '.*pgm' > positives.txt
I am running this command from the root of my project directory. The positive images are stored in the positive_images folder.
OUTPUT:
File not found - '*pgm'
However, the positive_images directory contains 550 files with .pgm extension.
Error File not found - '*pgm'
I am using Command Prompt and Windows Power Shell to execute this command:
find ./positive_images/ -iname '.*pgm' > positives.txt
The above command is using the syntax of a Unix version of find, but you are running it under Windows. PowerShell does not have a built in find command so you are running C:\Windows\System32\find.exe.
Notes:
Unix find is used to search for files.
Windows find is used to search for strings in files.
As you are running on Windows you need to use dir instead of find:
dir /b /s positive_images\*.pgm > positives.txt
Further Reading
An A-Z Index of the Windows CMD command line - An excellent reference for all things Windows cmd line related.
dir - Display a list of files and subfolders.
find - Search for a text string in a file & display all the lines where it is found.

how to set up default download location in youtube-dl [closed]

How can I set a default download location in youtube-dl so that everything that I download with youtube-dl goes into that default directory?
You need to use the -o switch with the Configuration file
Output on youtube-dl is handled with the --output or -o switch; pass it as an option, followed by the destination you want to save your downloads to:
youtube-dl -o "%USERPROFILE%\Desktop\%(title)s-%(id)s.%(ext)s" www.youtube.com/link/to/video
Note that -o has a dual function in that it also sets a template for how your output files will be named, using variables. In this example, it will output the title of the original video, then its ID, followed by the file extension, which is my personal preference. For all of the variables that can be used in a filename, have a look at the youtube-dl documentation.
youtube-dl also allows use of a configuration file - a file that can be used to configure the switches you most frequently use so the program can pull them from there instead, saving you from having to explicitly call them each time you run it. This is what you'll need for the default download location that you're looking for. The configuration file can be used to set a default output destination so that you never have to explicitly set an output again.
To set up a configuration file for youtube-dl, assuming you have Windows:
In %APPDATA% (typically C:\Users\<user name>\AppData\Roaming), create a youtube-dl folder if one doesn't already exist.
Inside that folder, create a plain text file named config.txt.
Place youtube-dl options in the file as you'd normally use them on the command line with youtube-dl, placing each one on a new line. For example, for the output switch, you'd use: -o %USERPROFILE%\Desktop\%(title)s-%(id)s.%(ext)s. For more on the configuration file, read the documentation on it.
Overriding the Configuration file
Even when an option is configured in a configuration file, it can be overridden by calling it explicitly from the command line. So, if you have -o set in a configuration file to be the default location for downloads, but want to save downloads to somewhere else for a current job, simply calling -o on the command line will override the configuration file for the current run of the program only.
I found a way to download files directly into the Downloads folder after searching for hours. I copied my entire function so you can see the surrounding context. Here is my code; maybe it will be helpful for someone:
import os
import youtube_dl

def download_audio(request):
    # Save into the current user's Downloads folder.
    SAVE_PATH = os.path.join(os.path.expanduser('~'), 'Downloads')
    ydl_opts = {
        'format': 'bestaudio/best',
        # Convert the downloaded audio to MP3 (requires FFmpeg).
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'mp3',
            'preferredquality': '192',
        }],
        # The output template sets both the folder and the file name.
        'outtmpl': SAVE_PATH + '/%(title)s.%(ext)s',
    }
    # video_url is expected to hold the video ID from the query string.
    link = request.GET.get('video_url')
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download(["https://www.youtube.com/watch?v=" + link])
Tell me if there is a problem.
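As an alternative to slicing the working directory, the Downloads path could be derived from the user's home directory. This is a minimal sketch, assuming the folder has the conventional name Downloads; build_outtmpl and its home_dir parameter are hypothetical and exist only for illustration.

```python
from pathlib import Path


def build_outtmpl(home_dir=None):
    """Return an output template pointing at <home>/Downloads.

    home_dir is a hypothetical parameter that makes the function easy to test;
    by default the current user's home directory is used.
    """
    home = Path(home_dir) if home_dir is not None else Path.home()
    downloads = home / "Downloads"  # assumes the conventional folder name
    # as_posix() keeps forward slashes in the resulting template.
    return downloads.as_posix() + "/%(title)s.%(ext)s"


print(build_outtmpl("/home/alice"))
```

The returned string could then be used as the 'outtmpl' value in ydl_opts.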
According to the configuration documentation, you can configure youtube-dl with a global or user-specific configuration file:
You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and macOS, the system wide configuration file is located at /etc/youtube-dl.conf and the user wide configuration file at ~/.config/youtube-dl/config. On Windows, the user wide configuration file locations are %APPDATA%\youtube-dl\config.txt or C:\Users\<user name>\youtube-dl.conf. Note that by default configuration file may not exist so you may need to create it yourself.
On Linux, this would be your user config file:
# Save all my videos to the Videos directory:
-o ~/Videos/%(title)s.%(ext)s
Depending on your needs, I think moving the file afterwards would be just as useful:
--exec CMD    Execute a command on the file after
              downloading, similar to find's -exec
              syntax. Example: --exec 'adb push {}
              /sdcard/Music/ && rm {}'
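When youtube-dl is driven from Python rather than the shell, the same move-afterwards idea can be done with the standard library instead of --exec. This is a minimal sketch; move_to_dir is a hypothetical helper and the paths in the example comment are made up.

```python
import shutil
from pathlib import Path


def move_to_dir(file_path, dest_dir):
    """Move file_path into dest_dir (created if needed) and return the new path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / Path(file_path).name  # keep the original filename
    shutil.move(str(file_path), str(target))
    return target


# Example (hypothetical paths):
# move_to_dir("Example video-abc123.mp4", "/home/user/Videos")
```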
By creating a function which will move the file
Here is the complete solution I use:
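from youtube_dl import YoutubeDL

ydl_opts = {
    'format': 'best',
    # Replace DIR-PATH-HERE with the target directory, including the trailing slash.
    'outtmpl': 'DIR-PATH-HERE%(title)s' + '.mp4',
    'noplaylist': True,
    # Note: 'extract-audio' is a command-line flag, not an option key for YoutubeDL,
    # so it was removed here; audio extraction is configured through 'postprocessors'.
}
video = "https://www.youtube.com/watch?v=SlPhMPnQ58k"
with YoutubeDL(ydl_opts) as ydl:
    # Download the video and collect its metadata.
    info_dict = ydl.extract_info(video, download=True)
    video_url = info_dict.get("url", None)
    video_title = info_dict.get('title', None)
    video_length = info_dict.get('duration')
    # print(video_title)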
On the command line or in a bash file, wrap the output template in double quotes, like this:
"%userprofile%/Desktop/DL/%(title)s-%(id)s.%(ext)s"
My bash command:
youtube-dl -c -i -f "mp4" -o "/home/Youtube_Downloads/%(title)s-%(id)s.%(ext)s" -a youtube_list
where youtube_list is a plain text file containing YouTube links, one per line.
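A list file like this can also be read from Python before handing the links to a downloader. This is a minimal sketch; read_url_list is a hypothetical helper, and treating lines that start with '#', ';' or ']' as comments is an assumption based on the convention described for the -a option.

```python
def read_url_list(path):
    """Return the URLs in a list file, skipping blank lines and comments."""
    urls = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Assumption: lines starting with '#', ';' or ']' are comments.
            if not line or line.startswith(("#", ";", "]")):
                continue
            urls.append(line)
    return urls


# Example (hypothetical file name):
# for url in read_url_list("youtube_list"):
#     print(url)
```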
I found an official comment by the authors on that specific question.
Here's what the manual (man youtube-dl) says:
How do I put downloads into a specific folder?
Use the -o to specify an output template, for example -o "/home/user/videos/%(title)s-%(id)s.%(ext)s". If you want this for all of your downloads, put the option into your configuration file.
That filename pattern is the default, according to the man page as well:
The current default template is %(title)s-%(id)s.%(ext)s.
I agree it would be nice to have the output folder decoupled from the default template in case the default changes one day, but I'm guessing the authors must have had a reason to have it this way.
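In a wrapper script, the folder and the filename pattern can be kept apart and joined only when the -o value is built. This is a minimal sketch, assuming plain Python %-formatting as a stand-in for how the template is filled; template_in_folder is a hypothetical helper, the metadata is made up, and the real program does more (for example, sanitizing titles).

```python
DEFAULT_TEMPLATE = "%(title)s-%(id)s.%(ext)s"  # the default quoted above


def template_in_folder(folder, template=DEFAULT_TEMPLATE):
    """Prefix a filename template with a folder (the folder must not contain '%')."""
    return folder.rstrip("/") + "/" + template


info = {"title": "Example video", "id": "abc123", "ext": "mp4"}  # made-up metadata
print(template_in_folder("/home/user/videos") % info)
# prints: /home/user/videos/Example video-abc123.mp4
```

If the default pattern ever changes, only DEFAULT_TEMPLATE needs updating while the folder stays where it is.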
This is another useful method to download your video into a desired directory while keeping the native filename of the download.
Decide where you want to create a configuration file.
Create a file named "youtube-dl.conf". You can create a youtube-dl.txt first if that's easier, but the file must end up named "youtube-dl.conf".
Here is a basic sample of a config file that sets where you want your downloads to go. This is all you have to put into the file: -o is the flag, %userprofile%/Desktop/DL/ is where I want the download to go, and %(title)s-%(id)s.%(ext)s is the template that keeps the native filename. This is your config file:
-o %userprofile%/Desktop/DL/%(title)s-%(id)s.%(ext)s
Options found here
Config here
The command parameters: "%program%" -f %option% "%youtubelink%" %MYCONFIG% "%MYPATH%"
Batch File setup:
:: Variables (the set "name=value" form keeps the quotes out of the stored value):
Set "program=%USERPROFILE%\Desktop\YOUTUBE-DL\v20201209\youtube-dl.exe"
Set "option=best"
SET "MYPATH=%USERPROFILE%\Desktop\YOUTUBE-DL\v20201209\config"
SET "MYCONFIG=--config-location"
SET "MYDLDIR=%USERPROFILE%\Desktop\DL"
SET "INSTR=%%(title)s-%%(id)s.%%(ext)s"
IF NOT EXIST "%MYDLDIR%" MKDIR "%MYDLDIR%"
:: Ask user for input.
Set /P youtubelink=[Paste Link]:
:: Use the config file for the default download location.
"%program%" -f %option% "%youtubelink%" %MYCONFIG% "%MYPATH%"
:: There are many ways to accomplish this:
:: In a batch file, NOTE the extra (%) character needed.
:: "%program%" -f %option% --merge-output-format mp4 -o "%MYDLDIR%\%%(title)s-%%(id)s.%%(ext)s" "%youtubelink%"
:: or use the variable instead:
:: "%program%" -f %option% --merge-output-format mp4 -o "%MYDLDIR%\%INSTR%" "%youtubelink%"
NOTE: Use "quotes" around any value that may contain spaces.
Final Message:
Create the config file and put it in a folder (directory) of your choice. Then run youtube-dl.exe with the parameters listed above, from CMD or from a batch file. Done. (contribution, and being kind)
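The same call can be assembled from Python, where passing the arguments as a list avoids the quoting concerns that paths with spaces cause in a batch file. This is a minimal sketch; build_command is a hypothetical helper, the paths and link are placeholders, and only the -f and --config-location options used above are assumed.

```python
def build_command(program, url, config_path, fmt="best"):
    """Return the argument list for one download that uses a custom config file."""
    return [program, "-f", fmt, "--config-location", config_path, url]


cmd = build_command(
    r"C:\Users\me\Desktop\YOUTUBE-DL\v20201209\youtube-dl.exe",  # placeholder path
    "https://www.youtube.com/watch?v=VIDEO_ID",                   # placeholder link
    r"C:\Users\me\Desktop\YOUTUBE-DL\v20201209\config",           # placeholder path
)
print(cmd)

# To actually run it (commented out here):
# import subprocess
# subprocess.run(cmd, check=True)
```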

Resources