tar: This does not look like a tar archive

I split a huge folder:
tar cvpf - somedir | split -b 50000m
I then transferred the split files to another server and merged them:
cat x* > somedir.tar.gz
But when I tried to extract the file, it showed errors:
tar xvf somedir.tar.gz
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Error exit delayed from previous errors
How can I fix this problem?

It is not guaranteed that x* expands in the same order in which the files were split. Assuming the file was split into three chunks, the first chunk carries the tar(1) header, so you have to reassemble them in exactly the same order.
Use ls(1) with the -t option (reversed with -r so the oldest chunk comes first) to list the files in the order they were created, and concatenate them in that order.
Hope that helps.
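For example, a minimal sketch (assuming split's default xaa, xab, … chunk names and that no other x* files are present):
# reassemble oldest-first, the order in which split wrote the chunks
cat $(ls -tr x*) > somedir.tar
# the archive was created without compression (no z flag), so extract it plainly
tar xvf somedir.tar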

Related

Unix: parse a file of full paths to SHA256 checksum files and run a command on each path/file

I have a file file.txt listing filenames ending in .sha256, each with its full path. This is a toy example:
file.txt:
/path/a/9b/x3.sha256
/path/7c/7j/y2.vcf.gz.sha256
/path/e/g/7z.sha256
Each line has a different path/file. The *.sha256 files contain checksums.
I want to run "sha256sum -c" on each of these *.sha256 files and write the output to an output_file.txt. However, this command only works when given the bare name of the .sha256 file from within its own directory, not the name including its full path. I have tried the following:
while read in; do
    sha256sum -c "$in" >> output_file.txt
done < file.txt
but I get:
"sha256sum: WARNING: 1 listed file could not be read"
which is due to the path included in the command.
Any suggestion is welcome
#!/bin/bash
while read -r in
do
    thedir=$(dirname "$in")
    thefile=$(basename "$in")
    # run in a subshell so each cd starts from the same place,
    # and keep appending to one output file in the starting directory
    (cd "$thedir" && sha256sum -c "$thefile") >> output_file.txt
done < file.txt
Modify your code to extract the directory and file parts of your $in variable: the .sha256 file refers to its target by bare filename, so sha256sum -c has to run from the checksum file's own directory.
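If the list in file.txt simply mirrors what is on disk, an alternative sketch without the explicit cd is GNU find's -execdir, which runs the command from each matched file's directory (here /path and the output location are placeholders):
find /path -name '*.sha256' -execdir sha256sum -c {} \; >> /tmp/output_file.txt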

extract a file from a tar.xz file

I have a huge file file.tar.xz containing many smaller text files with a similar structure. I want to quickly examine one file from the archive to get a glimpse of the files' content structure. I don't have any information about the names of the files within the archive. Is there any way to extract a single file in this scenario?
Thank you.
EDIT: I don't want to extract the whole archive with tar -xvf file.tar.xz.
Based on the discussion in the comments, I tried the following, which worked for me. It might not be the most optimal solution and the regex might need some improvement, but you'll get the idea.
I first created a demo archive:
cd /tmp
mkdir demo
for i in {1..100}; do echo $i > "demo/$i.txt"; done
cd demo && tar cfJ ../demo.tar.xz * && cd ..
demo.tar.xz now contains 100 txt files.
The following lists the contents of the archive, selects the first file and stores the path within the archive into the variable firstfile:
firstfile=$(tar -tvf demo.tar.xz | grep -Po -m1 "(?<=:[0-9]{2} ).*$")
echo $firstfile will output 1.txt.
You can now extract this single file from the archive:
tar xf demo.tar.xz $firstfile
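A simpler sketch that avoids the regex, using the name-only listing and GNU tar's -O (--to-stdout) to peek at the file without writing it to disk (assuming the first entry is a regular file, not a directory):
firstfile=$(tar -tf demo.tar.xz | head -n 1)
tar -xf demo.tar.xz -O "$firstfile" | head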

This does not look like a tar archive

[root@c0002242 lfeng]# tar -zxvf /opt/test/ALLscripts.tar.gz -C /opt/test1
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
Could you please help me on this ?
Run the command
$ file ALLscripts.tar.gz
and compare the output. If it reports gzip data (as shown below), the file really is a gzipped archive and you can decompress it with gunzip and then untar it; if it reports anything else, the file is not what its name suggests:
ALLscripts.tar.gz: gzip compressed data, from Unix
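For example, a minimal sketch assuming file(1) did report gzip data:
gunzip ALLscripts.tar.gz                 # produces ALLscripts.tar
tar -xvf ALLscripts.tar -C /opt/test1    # then extract it as a plain tar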
I was facing this error because my file had not finished downloading yet and I was trying to extract it :).

Unpack tar.gz folder to part of filename

I have a file dagens_130325.tar.gz containing the folder dagens. In one folder I have hundreds of these daily archives. I would like to unpack dagens_130325.tar.gz/dagens to 130325 with all the files inside, then 130326, and so on.
Is there a way to do it?
Not sure this is the right Stack for this kind of question; however, try
tar -zxvf dagens_130325.tar.gz -C /tmp/130325 dagens
This way, the folder dagens from the archive dagens_130325.tar.gz is extracted into /tmp/130325. Note, however, that the target folder must already exist, otherwise the command will fail.
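For example:
mkdir -p /tmp/130325
tar -zxvf dagens_130325.tar.gz -C /tmp/130325 dagens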
So, supposing you have 4 archives named dagens_1.tar.gz, dagens_2.tar.gz, ..., you can write an extract.sh file containing
#!/bin/bash
for i in {1..4}
do
    mkdir -p /tmp/$i            # -p: do not fail if the folder already exists
    FILE="dagens_$i.tar.gz"
    tar -zxvf "$FILE" -C /tmp/$i dagens
done
Give this file the execute permission, place it in the same folder as your archives, and running it should produce the result you asked for.
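For example:
chmod +x extract.sh
./extract.sh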
This was the solution I came up with in the end
#!/bin/bash
search_dir=/yourdir/with/tar.gz
for entry in "$search_dir"/*.tar.gz
do
    substring=$(basename "$entry")
    echo "$substring"
    sub2=${substring:7:6}    # dagens_130325.tar.gz -> 130325
    tar -xvzf "$entry"       # use the full path, not just the basename
    rm -rf "$sub2"
    mv dagens "$sub2"
done
Use
#!/bin/bash
for file in dagens_*.tar.gz
do
    from=${file%_*}   # removes chars after _   -> dagens
    to=${file#*_}     # removes chars before _  -> 130325.tar.gz
    to=${to%.t*}      # removes chars after .t  -> 130325
    tar -zxf "$file" --show-transformed-names --transform "s/$from/$to/"
done
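To preview what the transform will do without extracting anything, a quick sketch:
tar -tzf dagens_130325.tar.gz --show-transformed-names --transform "s/dagens/130325/"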

tar pre-run to evaluate expected size or number of files

The problem:
I have a back-end process that at some point collects files and builds a big tar archive.
This tar command receives a few directories and an exclude file.
The process can take up to a few minutes, and I want my front-end process (GUI) to report the progress of the tarring (this is a big issue for a user who presses the download button and sees nothing happening...).
I know I can use -v -R in the tar command and count files and size as it runs, but I am looking for some kind of tar pre-run / dry-run mode to help me evaluate either the expected number of files or the expected tar size.
The command I am using: tar -jcf 'FILE.tgz' 'exclude_files' 'include_dirs_and_files'
Thanks to everyone who is willing to assist.
You can pipe the output to the wc tool instead of actually making a file.
With file listing (verbose):
[git@server]$ tar czvf - ./test-dir | wc -c
./test-dir/
./test-dir/test.pdf
./test-dir/test2.pdf
2734080
Without:
[git@server]$ tar czf - ./test-dir | wc -c
2734080
Why don't you run
DIRS=("./test-dir" "./other-dir-to-test")
find "${DIRS[@]}" -type f | wc -l
beforehand? This lists all the files (-type f), one per line, and counts them. DIRS is a bash array, so you can store several folders in one variable.
If you want to know the size of all the stored files, you can use du
DIRS=("./test-dir" "./other-dir-to-test")
du -c -d 0 "${DIRS[@]}" | tail -1 | awk -F ' ' '{print $1}'
This prints the disk usage with du, calculates a grand total (-c flag), takes the last line (for example, 4378921 total), and keeps just the first column with awk.
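Putting the pieces together, a sketch of live progress reporting (assuming GNU du for byte counts and that pv(1) is installed; the du total only approximates the bytes tar will read, so treat the bar as an estimate):
DIRS=("./test-dir" "./other-dir-to-test")
total=$(du -cb -d 0 "${DIRS[@]}" | tail -1 | awk '{print $1}')   # grand total in bytes
tar cf - "${DIRS[@]}" | pv -s "$total" | bzip2 > FILE.tar.bz2    # pv draws the progress bar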
