What do the last two entries of PATH tell you? - path

I'm still learning how PATH works. What exactly is echo $PATH's output meant to be? What do the last two entries tell us?

What exactly is echo $PATH's output meant to be?
The current path.
What do the last two entries tell you?
The last two directories on your path.
In other words, it's unclear what you're asking. Do you not know what a path is? Are you asking what a particular sequence of special characters mean in the context of a path? What is the explicit path you're asking about?

The PATH variable is simply a list of directories that your shell searches, in order, whenever you run a command in, say, your bash terminal.
PATH is a colon-separated (:) list on Unix-like systems and a semicolon-separated (;) list on Windows.
Whenever you run a command like ls -lrt, your shell looks for an executable named ls in each of these directories, from first to last, and runs the first one it finds. With that definition of PATH established, we can answer your two questions:
what exactly is echo $PATH's output meant to be
echo $PATH prints that list of directories.
What do the last two entries of PATH tell you?
PATH is the list of all the directories where your system will look for a command. So the last 2 entries are the last 2 directories in that list, which are searched only if the command was not found in any earlier entry.
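To make this concrete, here is a small sketch for a POSIX shell on a Unix-like system. The helper name path_entries is made up for this illustration; it only splits the colon-separated string into lines so the search order becomes visible.

```shell
#!/bin/sh
# path_entries is a hypothetical helper name, not a standard command.
# It splits a PATH-style string into one directory per line, in search order.
path_entries() {
    printf '%s\n' "$1" | tr ':' '\n'
}

path_entries "$PATH"                # every directory the shell will search
path_entries "$PATH" | tail -n 2    # the last two entries, searched last
command -v ls                       # the location the shell resolved ls to
```

The last two lines printed by tail are simply the lowest-priority directories: a command found there is only used when no earlier directory provides one with the same name.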

Related

Using regex in Docker COPY for digits

I'm trying to copy a file in Docker whose name has the format below:
database-20.2.1.zip
I have tried something like the following, but it does not seem to work:
COPY databse-+([0-9]).+([0-9]).+([0-9]).zip /docker-entrypoint-initdb.d/database.zip
Is this something that can be done with Docker's COPY?
Dockerfile COPY uses shell globs, not regular expressions. The actual implementation uses the Go filepath.Match syntax. That syntax doesn't allow some of the combinations that regular expressions do: you can match any single digit [0-9], or any number of characters *, but not any number of digits.
Depending on how strict you want to be about what files you'll accept and how consistent the filename format is, any of the following will work:
COPY database-[0-9][0-9].[0-9].[0-9].zip ./database.zip
COPY database-*.*.*.zip ./database.zip
COPY database-*.zip ./database.zip
In all cases note that the pattern can match multiple files in the build context. If the right-hand side of COPY is a single file name (not ending with /) but the glob matches multiple files you will get a build error. In this case that's probably what you want.
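If you want a rough feel for which file names each of these patterns accepts before running a build, the sketch below tries them in a POSIX shell. This is an approximation, not Docker's own matcher: it uses the shell's case pattern matching, which agrees with filepath.Match for these simple patterns but differs in details (for example, the shell's * also matches /). The function name matches and the two extra sample file names are made up for the illustration.

```shell
#!/bin/sh
# matches PATTERN NAME: exit status 0 if NAME matches the glob PATTERN.
# Hypothetical helper; uses shell glob rules as a stand-in for filepath.Match.
matches() {
    case $2 in
        $1) return 0 ;;   # unquoted $1 is interpreted as a glob pattern
        *)  return 1 ;;
    esac
}

for name in database-20.2.1.zip database-2.10.3.zip database-latest.zip; do
    for pat in 'database-[0-9][0-9].[0-9].[0-9].zip' \
               'database-*.*.*.zip' \
               'database-*.zip'; do
        if matches "$pat" "$name"; then result="match"; else result="no match"; fi
        printf '%-22s %-40s %s\n' "$name" "$pat" "$result"
    done
done
```

The strictest pattern only accepts a two-digit major version followed by single-digit minor and patch numbers, so a name like database-2.10.3.zip would need one of the looser patterns.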

How to grep for files using 'and' operator, words might not be on the same line

I have a directory /dir that contains several text files. These files may or may not contain the words 'rock' and 'stone': some contain only 'rock', some only 'stone', some both, and some neither.
How can I list all files in this directory that contain both 'rock' and 'stone'? These words might not be on the same line so I don't think piping through grep twice would work.
Appreciate any help, I was not able to find a stackoverflow post with this problem so I figured I'd ask.
To search files that match the given two (or more) words at any line anywhere in the file, you may want to try ugrep:
ugrep -F --files -e 'rock' --and -e 'stone' dir
This only matches files that have both rock and stone in them. By default it prints the lines that contain rock or stone; use option -l to list only the file names. The -F option searches for fixed strings (like grep -F and fgrep), and --files applies the --and file-wide, which is what you want here instead of applying the --and per line. Note that we have more than one pattern in this case, so option -e should be used (grep requires this as well).
A shorter form with --bool:
ugrep -F --files --bool 'rock stone' dir
where --bool formulates a Boolean query with space as AND (or use AND).
If you want to search directory dir recursively in subdirectories, use option -r.
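If ugrep is not available, the same file-wide AND can be approximated with plain grep. This is a different technique from the ugrep options above, not an equivalent of them: it runs grep -q once per word on each file and prints the file name only when both succeed. The function name files_with_both and the sample files are made up for the illustration.

```shell
#!/bin/sh
# files_with_both DIR WORD1 WORD2
# Hypothetical helper: lists regular files directly inside DIR that contain
# both fixed strings, on any lines.
files_with_both() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if grep -qF -- "$2" "$f" && grep -qF -- "$3" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Small demonstration in a scratch directory.
demo=$(mktemp -d)
printf 'a rock\nand a stone\n' > "$demo/both.txt"
printf 'just a rock\n'         > "$demo/rock.txt"
printf 'just a stone\n'        > "$demo/stone.txt"
files_with_both "$demo" rock stone
```

Because -F matches substrings, rock also matches rocket; adding -w restricts the match to whole words. The loop does not descend into subdirectories.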

Bash - grep command inconsistent with man page

I am trying to understand and read the man page. Yet every day I find more inconsistent syntax, and I would like some clarification as to whether I am misunderstanding something.
Within the man page, it specifies the syntax for grep is grep [OPTIONS] [-e PATTERN]... [-f FILE]... [FILE...]
I got a working example that recursively searches all files within a directory for a keyword.
grep -rnw . -e 'memes'
Now this example works, but I find it very inconsistent with the man page. According to the synopsis, the directory (which the man page writes as [FILE...], while describing elsewhere what happens when a file is a directory) should come last. Yet in this example it comes after [OPTIONS] and before [-e PATTERN].... Why is this allowed, it does not follow the specified regex rule of using this command?
Why is this allowed, it does not follow the specified regex rule of using this command?
The lines in the SYNOPSIS section of a manpage are not to be understood as strict regular expressions, but as a brief description of the syntax of a utility's arguments.
Depending on the particular application, the parser might be more or less flexible in how it accepts its options. After all, each program can implement whatever grammar it likes for its arguments. Therefore, some might allow options at the beginning, at the end, or even in between files (typically with ways to handle any ambiguity that may arise, e.g. reading from the standard input with -, filenames starting with -...).
Now, of course, there are some ways to do it that are common. For instance, POSIX.1-2017 12.1 Utility Argument Syntax says:
This section describes the argument syntax of the standard utilities and introduces terminology used throughout POSIX.1-2017 for describing the arguments processed by the utilities.
In your particular case, your implementation of grep (probably GNU grep) allows options to be interleaved with the file list, as you have discovered.
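A quick way to convince yourself of this is to run both orderings and compare the output. The sketch below assumes GNU grep (or another grep whose option parser reorders arguments); the scratch directory and file contents are made up for the illustration.

```shell
#!/bin/sh
# Assumes a grep that accepts options after file operands, such as GNU grep.
demo=$(mktemp -d)
printf 'dank memes here\nnothing to see\n' > "$demo/a.txt"

synopsis_order=$(grep -rnw -e 'memes' "$demo")   # order shown in the SYNOPSIS
question_order=$(grep -rnw "$demo" -e 'memes')   # order used in the question

if [ "$synopsis_order" = "$question_order" ]; then
    echo "both orderings produce the same output"
fi
```

With GNU tools this works because the GNU getopt functions reorder the argument list so that options are processed first. Setting the environment variable POSIXLY_CORRECT switches that reordering off, and option parsing then stops at the first non-option argument.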
For more information, see:
https://unix.stackexchange.com/questions/17833/understand-synopsis-in-manpage
Are there standards for Linux command line switches and arguments?
https://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html
You can also leverage .
grep 'string' * -lR

dash equivalent to bash's curly bracket syntax?

In bash, php/{composer,sismo} expands to php/composer php/sismo. Is there any way to do this with /bin/sh (which I believe is dash), the system shell ? I'm writing git hooks and would like to stay away from bash as long as I can.
You can use printf.
% printf 'str1%s\t' 'str2' 'str3' 'str4'
str1str2 str1str3 str1str4
There doesn't seem to be a way. You will have to use loops to generate these names, perhaps in a function. Or use variables to substitute common parts, maybe with "set -u" to prevent typos.
I see that you prefer dash for performance reasons, however you don't seem to provide any numbers to substantiate your decision. I'd suggest you measure actual performance difference and reevaluate. You might be falling for premature optimization, as well. Consider how much implementation and debugging time you'll save by using Bash vs. possible performance drop.
I really like the printf solution provided by @mikeserv, but I thought I'd provide an example using a loop.
The below would probably be most useful if you wish to execute one command for each expanded string, rather than provide both strings as args to the same command.
for X in composer sismo; do
    echo "php/$X" # replace 'echo' with your command
done
You could, however, rewrite it as
ARGS="$(for X in composer sismo; do echo "php/$X"; done)"
echo $ARGS # replace 'echo' with your command
Note that $ARGS is unquoted in the above command, which means that its content undergoes word splitting (i.e. if any of your original strings contain spaces, it will break).
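One way to sidestep the word-splitting problem in plain sh is to build the list in the positional parameters instead of a string variable. This is only a sketch of that idea; the item name "my sismo" is made up to show that a space survives.

```shell
#!/bin/sh
# Build the expanded names as positional parameters so that each name stays
# one argument, even if it contains spaces.
set --                                   # clear the positional parameters
for x in composer "my sismo"; do         # "my sismo" is a made-up example item
    set -- "$@" "php/$x"                 # append one argument per item
done

printf '<%s>\n' "$@"                     # replace 'printf' with your command
```

Passing "$@" (quoted) hands each generated name to the command as a separate argument, which is what the brace expansion does in bash.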

What's a "canonical path"?

So, an absolute path is a way to get to a certain file or location by describing the full route to it, and it's OS dependent (the absolute paths for Windows and Linux, for example, are different). A relative path, on the other hand, is a route to a file or location described from the current location, with .. (two dots) indicating the parent level in the directory tree. That has been clear to me for several years now.
When searching I've even seen that there are canonicalized files too!
All I know is that CANONICAL means something like "according to the rules" or something.
Can somebody enlighten me in terms of theory about canonical stuff?
The whole point of making anything "canonical" is so that you can compare two things. For example, both ../../here/bar/x and ./test/../../bar/x may refer to the same location, but you can't do a textual comparison on the two paths. However, if you turn them into their canonical representation, they both become ../bar/x, and we see that they actually refer to the same thing.
In short, it is often the case that you have many ways of referring to one thing, and in that case you may be able to define a canonical representation which is unique and which allows you to get a handle on collections of such things.
(If you're looking for more examples, all of mathematics is full of "canonical" constructions for all sorts of objects, and very much with the same purpose in mind. Maybe this Wikipedia article can provide some additional directions.)
A good way to define a canonical path will be: the shortest absolute path (short, in the meaning of string-length).
This is an example of the difference between an absolute path and a canonical path:
absolute path: C:\abc\..\abc\file.txt
canonical path: C:\abc\file.txt
Canonicalization is a type of normalization which allows an object to be identified in a unique way. A relative path cannot do it, by definition.
For more info:
https://en.wikipedia.org/wiki/Canonicalization
https://en.wikipedia.org/wiki/Canonical_form
What a canonical path is (or its difference from an absolute path) is system dependent.
Typically, if a (full) path contains aliases, shortcuts or symbolic links, the canonical path resolves all of these into the actual directories they refer to.
Example: if /bin/a is a symbolic link, that is the path you get whenever you request an absolute path, e.g. from java.io.File#getAbsolutePath, while the real file (i.e. the actual target of the link), e.g. /usr/local/bin/a, would be returned as the canonical path, e.g. from java.io.File#getCanonicalPath.
A good definition of a canonical path is given in the documentation of readlink in GNU Coreutils. It is specified that 'Canonicalize mode' returns an equivalent path that doesn't have any of these things:
hard links to self (.) and parent (..) directories
repeated separators (/)
symbolic links
The string length is irrelevant, as is demonstrated in the following example.
You can experiment with readlink -f (canonicalize mode) or its preferred equivalent, realpath, to see the difference between an 'absolute path' and a 'canonical absolute path' for some programs on your system, provided you are running Linux or otherwise using GNU Coreutils.
I can get the path of 'java' on my system using which
$ which java
/usr/bin/java
This path, however, is actually a symbolic link to another symbolic link. This symbolic link chain can be displayed using namei.
$ namei $(which java)
f: /usr/bin/java
 d /
 d usr
 d bin
 l java -> /etc/alternatives/java
   d /
   d etc
   d alternatives
   l java -> /usr/lib/jvm/java-17-openjdk-amd64/bin/java
     d /
     d usr
     d lib
     d jvm
     d java-17-openjdk-amd64
     d bin
     - java
The canonical path can be found using the previously mentioned realpath command.
$ realpath $(which java)
/usr/lib/jvm/java-17-openjdk-amd64/bin/java
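If you do not have a chain of links like this at hand, you can build a small one in a scratch directory. The sketch below assumes realpath from GNU Coreutils is installed; the directory and file names are made up. It creates a path that is absolute but contains a symbolic link, a .. component, a repeated separator and a . component, then asks realpath for the canonical form.

```shell
#!/bin/sh
# Assumes realpath (GNU Coreutils). All names below are made up for the demo.
tmp=$(mktemp -d)
base=$(realpath "$tmp")          # canonical form of the scratch directory itself
mkdir "$tmp/real"
touch "$tmp/real/file.txt"
ln -s real "$tmp/link"           # link -> real

messy="$tmp/link/../link//./file.txt"   # absolute, but not canonical
canonical=$(realpath "$messy")

printf 'absolute:  %s\ncanonical: %s\n' "$messy" "$canonical"
rm -rf "$tmp"
```

Both strings name the same file, but only the second one is free of symbolic links, . and .. components, and repeated separators, so only it can be compared textually with another canonical path.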
Most issues with canonical paths occur when you pass the name of a directory rather than a file. For a file, an absolute path that contains no symbolic links and no "." or ".." components is already the canonical path. For a directory, it also means omitting the trailing "/". For example, "/var/tmp/foo" is a canonical path while "/var/tmp/foo/" is not.

Resources