TAR with --to-command

I'd like to calculate MD5 for all files in a tar archive. I tried tar with --to-command.
tar -xf abc.tar --to-command='md5sum'
It produces output like this:
cb6bf052c851c1c30801ef27c9af1968 -
f509549ab4eeaa84774a4af0231cccae -
Then I wanted to replace '-' with the file name:
tar -xf abc.tar --to-command='md5sum | sed "s#-#$TAR_FILENAME#"'
but it reports an error:
md5sum: |: No such file or directory
md5sum: sed: No such file or directory
md5sum: s#-#./bin/busybox#: No such file or directory
tar: 23255: Child returned status 1

The command is not run through a shell, so this won't work (you can see that the | reaches md5sum as an argument). One way is to invoke the shell yourself, though the nested quotes make it a bit of a hassle:
tar xf some.tar --to-command 'sh -c "md5sum | sed \"s|-|\$TAR_FILENAME|\""'
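To sidestep the nested quoting, the logic can also live in a small function or script. This is a minimal sketch, assuming GNU tar (which sets TAR_FILENAME for the child process) and md5sum from coreutils. The function name md5_with_name and the script path in the comment are made up for illustration:

```shell
# Read file contents on stdin and print "<md5>  <name>", the layout
# md5sum uses for named files. TAR_FILENAME is set by tar --to-command.
md5_with_name() {
    sum=$(md5sum) || return    # pass md5sum's failure on to the caller
    # md5sum prints "<digest>  -" for stdin; keep only the digest
    printf '%s  %s\n' "${sum%% *}" "$TAR_FILENAME"
}

# Hypothetical usage: put the function plus a final `md5_with_name` call
# into an executable script, e.g. ./md5-entry.sh, then run
#   tar xf some.tar --to-command=./md5-entry.sh
```

Because the file name is passed to printf as data and never to sed, special characters in it need no escaping.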

First, it's better to avoid sed here, not only because it's slow, but because $TAR_FILENAME can contain characters that sed treats specially (you already noticed that, having to use # instead of / as the substitution delimiter, didn't you?). Use a foolproof approach instead, such as head followed by printing the actual file name.
Then, as Patrick mentions in his answer, you can't use complex commands without wrapping them in a shell. For convenience I suggest using the shell's built-in escaping facility; in bash that's printf '%q' "something", so the final command looks like:
tar xf some.tar \
--to-command="sh -c $(printf '%q' 'md5sum | head -c 34 && printf "%s\n" "$TAR_FILENAME"')"
"34" is number of bytes before file name in md5sum output format; && instead of ; to allow md5sum's error code (if any) reach tar; printf instead of echo used because filenames with leading "-" may be interpreted by echo as options.

Related

JQ adds single quotes while saving in environment variables

OK, this might be a silly question. I've got the test.json file:
{
"timestamp": 1234567890,
"report": "AgeReport"
}
What I want to do is to extract timestamp and report values and store them in some env variables:
export $(cat test.json | jq -r '@sh "TIMESTAMP=\(.timestamp) REPORT=\(.report)"')
and the result is:
echo $TIMESTAMP $REPORT
1234567890 'AgeReport'
The problem is that those single quotes break other commands.
How can I get rid of those single quotes?
NOTE: I'm gonna leave the accepted answer as is, but see @Inian's answer for a better solution.
Why make it convoluted by using eval and end up with a quoting mess? Instead, simply emit the values separated by NUL (\u0000) and read them back into the shell environment:
{
IFS= read -r -d '' TIMESTAMP
IFS= read -r -d '' REPORT
} < <(jq -r '(.timestamp|tostring) + "\u0000" + .report + "\u0000"' test.json)
This makes the parsing more robust, because the fields are delimited by NUL, which cannot appear in a shell variable's value.
From the jq man page, the @sh format converts its input to be escaped suitable for use in a command-line for a POSIX shell.
So, rather than attempting to splice the output of jq into the shell's export command, which would require carefully removing some quoting, you can generate the entire command line inside jq and then execute it with eval:
eval "$(
cat test.json |\
jq -r '@sh "export TIMESTAMP=\(.timestamp) REPORT=\(.report)"'
)"
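The quotes survive in the question's attempt because the shell does not re-parse quote characters that come out of an expansion, while eval forces a second parse. Here is a small sketch without jq, where the string in `assignment` stands in for the shell-quoted text jq would emit (the variable and function names are only for illustration):

```shell
# Stand-in for jq's shell-quoted output: a quoted assignment.
assignment="REPORT='AgeReport'"

report_without_eval() (
    # Quotes produced by an expansion stay literal characters.
    export $assignment
    printf '%s\n' "$REPORT"
)

report_with_eval() (
    # eval parses the string as shell code, so the quotes are removed.
    eval "export $assignment"
    printf '%s\n' "$REPORT"
)
```

The first function prints 'AgeReport' with the quotes, the second prints AgeReport. Both run in a subshell, so the caller's environment is left alone.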

Combine grep -v with grep -r?

I want to remove an entire line of text from all files in a given directory. I know I can use grep -v foo filename to do this one file at a time. And I know I can use grep -r foo to search recursively through a directory. How do I combine these commands to remove a given line of text from all files in a directory?
The UNIX command to find files is named find, not grep. Forget you ever heard of grep -r; it's just a bad idea. Here's the right way to find files and perform some action on them:
find . -type f -print | xargs sed -i '/badline/d'
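Since xargs splits its input on whitespace by default, file names containing spaces would break the pipeline above. Here is a sketch of a variant that lets find pass the names to sed directly. It assumes GNU sed (BSD sed needs an explicit backup suffix after -i), the function name is hypothetical, and the pattern is placed between slashes, so it must not contain an unescaped /:

```shell
# Delete every line matching the pattern from all regular files under dir.
delete_line_everywhere() {
    dir=$1 pattern=$2
    # "{} +" hands many file names to each sed invocation, without
    # any word splitting in between.
    find "$dir" -type f -exec sed -i "/$pattern/d" {} +
}
```

For example, `delete_line_everywhere . badline` does the same job as the one-liner, but is safe for names with spaces.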
Try something like:
grep -vlre 'foo' . | xargs sed -i 's/pattern/replacement/g'
Broken down:
grep:
-v 'Invert the match'
-l 'Print only the names of matching files'
-r 'Search recursively'
-e 'The next argument is the pattern'
xargs: run the command on each entry
sed -i: edit the file in place
I think this would work:
grep -ilre 'Foo' . | xargs sed -i 'extension' '/Foo/d'
Where 'extension' is the suffix appended to the file name of the backup. sed keeps a copy of the original file under that extended name, and the modified file keeps the original file name. I added -i to grep in case you need the search to be case insensitive.
modified file1 becomes "file1"
original file1 becomes "file1extension"
See also: invalid command code ., despite escaping periods, using sed
One of the responses there explains that sed's -i option on OS X is slightly different, so you need to supply an extension. Without it, the file name is interpreted as a command, which is why you see that error.
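Where the same script has to run with both GNU and BSD sed, one way around the differing -i syntax is to avoid -i altogether. Here is a sketch, with a hypothetical function name, that writes the result to a temporary file and copies it back:

```shell
# Delete lines matching the pattern from one file without using sed -i.
delete_matching_lines() {
    pattern=$1 file=$2
    scratch=$(mktemp) || return
    # Copy back with cat rather than mv so the original file keeps
    # its permissions and inode.
    sed "/$pattern/d" "$file" > "$scratch" && cat "$scratch" > "$file"
    rc=$?
    rm -f "$scratch"
    return "$rc"
}
```

No backup copy is kept here; if one is wanted, a cp of the original before the call does the job.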

Ignoring directories from a file

I am in the process of creating a script that lists all files opened via lsof output. I would like to checksum specific files and ignore directories from that output but am at a loss to do so EFFECTIVELY. For example: (I'm using FreeBSD btw)
lsof | awk '/\//{print $9}' | sort -u | head -n 5
prints:
/
/bin/sleep
/dev/bpf
What I'd like to do is: FROM that output, ignore any directories and perform an md5 on FILES (not directories).
Any pointers?
Give the following perl command a try:
lsof | perl -MDigest::MD5 -ane '
$f = $F[ $#F ];
next unless -f $f and open my $fh, "<", $f;
printf qq|%s %s\n|, $f, Digest::MD5->new->addfile( $fh )->hexdigest
'
It filters the lsof output down to plain files (-f) and hashes the contents of each one. Take a look at perlfunc for the other file tests if you want to include different kinds of files.
It prints each file and its md5 separated by a space. An example on my system looks like:
/usr/lib/libm-2.17.so a2d3b2de9a1f59fb99427714fefb49ca
/usr/lib/libdl-2.17.so d74d8ac16c2d13128964353d4be7061a
/usr/lib/libnsl-2.17.so 34b6909ec60c337c21b044642b9baa3d
/usr/lib/ld-2.17.so 3d0e7b5b5c4e59c5c4b6a858cc79fcf1
/usr/sbin/lsof b9b8fbc8f296e47969713f6369d97c0d
/usr/lib/locale/locale-archive 3ea56273193198a718b9a5de33d553db
/usr/lib/libc-2.17.so ba51eeb4025b7f5d7f400f1968f4b5f9
/usr/lib/ld-2.17.so 3d0e7b5b5c4e59c5c4b6a858cc79fcf1
...
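For comparison, here is a plain-shell sketch of the same filter: it reads one path per line and checksums only regular files. It assumes md5sum is available (FreeBSD ships md5 instead, with a different output layout), and the function name is made up:

```shell
# Read paths on stdin, one per line, and checksum the regular files.
md5_regular_files() {
    while IFS= read -r entry; do
        # -f is true only for regular files, so directories and
        # device nodes are skipped
        [ -f "$entry" ] && md5sum "$entry"
    done
    return 0
}

# Possible usage with the pipeline from the question:
#   lsof | awk '/\//{print $9}' | sort -u | md5_regular_files
```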

tar pre-run to evaluate expected size or amount of files

The problem:
I have a back-end process that at some point collects files and builds a big tar file.
The tar command receives a few directories and a list of files to exclude.
The process can take up to a few minutes, and I want my front-end process (GUI) to report the progress of the archiving (this is a big issue for a user who presses the download button and it seems like nothing is happening...).
I know I can use -v -R with the tar command and count files and size as it progresses, but I am looking for some kind of tar pre-run mode / dry run to help me estimate either the expected number of files or the expected tar size.
The command I am using: tar -jcf 'FILE.tgz' 'exclude_files' 'include_dirs_and_files'
Thanks to everyone who is willing to assist.
You can pipe the output to the wc tool instead of actually making a file.
With file listing (verbose):
[git@server]$ tar czvf - ./test-dir | wc -c
./test-dir/
./test-dir/test.pdf
./test-dir/test2.pdf
2734080
Without:
[git@server]$ tar czf - ./test-dir | wc -c
2734080
Why don't you run
DIRS=("./test-dir" "./other-dir-to-test")
find "${DIRS[@]}" -type f | wc -l
beforehand? This lists all the files (-type f), one per line, and counts them. DIRS is a bash array, so you can store the folders in a variable.
If you want to know the total size of the stored files, you can use du:
DIRS=("./test-dir" "./other-dir-to-test")
du -c -d 0 "${DIRS[@]}" | tail -1 | awk -F ' ' '{print $1}'
This prints the disk usage with du, calculates a grand total (-c flag), takes the last line (for example 4378921 total), and keeps just the first column with awk.
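The counting idea can be put to work for a progress report: count the entries first, then turn tar's verbose listing into a percentage. This is only a sketch with hypothetical names. It assumes no file name contains a newline, and it ignores exclude patterns, which would have to be applied to both the find and the tar command for the totals to agree:

```shell
# Create an archive and print a running percentage, one number per entry.
tar_with_progress() {
    archive=$1 dir=$2
    total=$(find "$dir" | wc -l)    # files and directories, one per line
    done_count=0
    # With -f pointing at a file, -v lists each archived entry on stdout.
    tar -cvf "$archive" "$dir" | while IFS= read -r entry; do
        done_count=$((done_count + 1))
        printf '%d\n' $((done_count * 100 / total))
    done
}
```

A front end could read these numbers from the back-end process and update a progress bar. Note that the percentage is by entry count, not by bytes, so a few large files will make it advance unevenly.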

How can I have grep not print out 'No such file or directory' errors?

I'm grepping through a large pile of code managed by git, and whenever I do a grep, I see piles and piles of messages of the form:
> grep pattern * -R -n
whatever/.git/svn: No such file or directory
Is there any way I can make those lines go away?
You can use the -s or --no-messages flag to suppress errors.
-s, --no-messages suppress error messages
grep pattern * -s -R -n
If you are grepping through a git repository, I'd recommend you use git grep. You don't need to pass in -R or the path.
git grep pattern
That will show all matches from your current directory down.
Errors like that are usually sent to the "standard error" stream, which you can redirect to a file or simply discard with most commands:
grep pattern * -R -n 2>/dev/null
I have seen that happen several times with broken links (symlinks that point to files that do not exist): grep tries to search the target file, which does not exist (hence the correct and accurate error message).
I normally don't bother when doing sysadmin tasks at the console, but in scripts I look for text files with "find" and then grep each one:
find /etc -type f -exec grep -nHi -e "widehat" {} \;
Instead of:
grep -nRHi -e "widehat" /etc
I usually don't let grep do the recursion itself. There are usually a few directories you want to skip (.git, .svn...).
You can build clever aliases from constructs like this one:
find . \( -name .svn -o -name .git \) -prune -o -type f -exec grep -Hn pattern {} \;
It may seem overkill at first glance, but when you need to filter out some patterns it is quite handy.
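Here is the same pruning idea wrapped in a reusable function, as a sketch (the name grep_skip_vcs is hypothetical). Ending -exec with + instead of \; hands many files to each grep invocation rather than starting one process per file:

```shell
# Search for a pattern under dir (default: current directory),
# skipping .git and .svn directories entirely.
grep_skip_vcs() {
    pattern=$1 dir=${2:-.}
    find "$dir" \( -name .svn -o -name .git \) -prune -o \
        -type f -exec grep -Hn -e "$pattern" {} +
}
```

Because find only passes regular files, grep never sees dangling symlinks, so no 'No such file or directory' messages are produced for them.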
Have you tried the -0 option in xargs? It expects NUL-separated input, so pair it with find -print0. Something like this:
find . -type f -print0 | xargs -0 grep 'some text'
Use -I in grep.
Example: grep SEARCH_ME -Irs ~/logs.
I redirect stderr to stdout and then use grep's invert-match (-v) to exclude the warning/error string that I want to hide:
grep -r <pattern> * 2>&1 | grep -v "No such file or directory"
I was getting lots of these errors running "M-x rgrep" from Emacs on Windows with /Git/usr/bin in my PATH. Apparently in that case, M-x rgrep uses "NUL" (the Windows null device) rather than "/dev/null". I fixed the issue by adding this to .emacs:
;; Prevent issues with the Windows null device (NUL)
;; when using cygwin find with rgrep.
(defadvice grep-compute-defaults (around grep-compute-defaults-advice-null-device)
"Use cygwin's /dev/null as the null-device."
(let ((null-device "/dev/null"))
ad-do-it))
(ad-activate 'grep-compute-defaults)
One easy way to make grep return zero status all the time is to use || true
→ echo "Hello" | grep "This won't be found" || true
→ echo $?
0
As you can see, the exit status here is 0 (success).
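Note that || true also hides real failures. grep exits with 0 on a match, 1 when nothing matched, and a value greater than 1 on errors, so a script can choose to accept only status 1. A sketch, with a made-up function name:

```shell
# Run grep, treating "no match" as success but keeping real errors.
grep_allow_no_match() {
    grep "$@"
    rc=$?
    # 1 means "no lines selected"; anything above 1 is a genuine error.
    [ "$rc" -le 1 ] && return 0
    return "$rc"
}
```

This is handy under set -e, where a plain grep that finds nothing would otherwise abort the script.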
