tarring and untarring between two remote hosts - tar

I have two systems that I'm splitting processing between, and I'm trying to find the most efficient way to move the data between the two. I've figured out how to tar and gzip to an archive on the first server ("serverA") and then use rsync to copy to the remote host ("serverB"). However, when I untar/unzip the data there, the extracted files include the full path name from the original server. So if on serverA my data is in:
/serverA/directory/a/lot/of/subdirs/myData/*
and, using this command:
tar -zcvf /serverA/directory/a/lot/of/subdirs/myData-archive.tar.gz /serverA/directory/a/lot/of/subdirs/myData/
Everything in .../myData is successfully tarred and zipped in myData-archive.tar.gz
However, after copying the archive, when I try to untar/unzip on the second host (I manually log in here to finish the processing, the first step of which is to untar/unzip) using this command:
tar -zxvf /serverB/current/directory/myData-archive.tar.gz
It untars everything into my current directory (/serverB/current/directory/); however, it looks like this:
/serverB/current/directory/serverA/directory/a/lot/of/subdirs/myData/Data*ext
How should I formulate both the tar commands so that my data ends up in a directory called
/serverB/current/directory/dataHERE/
?
I know I'll need the -C flag to untar into a different directory (in my case, /serverB/current/directory/dataHERE), but I still can't figure out how to keep the entire path from being included when the archive gets untarred. I've seen similar posts, but none that I saw discussed how to do this when moving between two different hosts.
UPDATE: per one of the answers in this question, I changed my commands to:
tar/zip on serverA:
tar -zcvf /serverA/directory/a/lot/of/subdirs/myData-archive.tar.gz serverA/directory/a/lot/of/subdirs/myData/ -C /serverA/directory/a/lot/of/subdirs/ myData
and, untar/unzip:
tar -zxvf /serverB/current/directory/myData-archive.tar.gz -C /serverB/current/directory/dataHERE
And now, not only does it untar/unzip the data to:
/serverB/current/directory/dataHERE/
like I wanted, but it also puts another copy of the data here:
/serverB/current/directory/serverA/directory/a/lot/of/subdirs/myData/
which I don't want. How do I need to fix my commands so that it only puts data in the first place?

On serverA do
( cd /serverA/directory/a/lot/of/subdirs; tar -zcvf myData-archive.tar.gz myData; )

After some more messing around, I figured out how to achieve what I wanted:
To tar on serverA:
tar -zcvf /serverA/directory/a/lot/of/subdirs/myData-archive.tar.gz -C /serverA/directory/a/lot/of/subdirs/ myData
Then to untar on serverB:
tar -zxvf /serverB/current/directory/myData-archive.tar.gz -C /serverB/current/directory/dataHERE
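A minimal local rehearsal of those two commands, with temporary directories standing in for the server paths (the paths here are stand-ins, not the real ones):

```shell
# Stand-ins for the real server paths; any writable directories work.
src=$(mktemp -d)   # plays /serverA/directory/a/lot/of/subdirs
dst=$(mktemp -d)   # plays /serverB/current/directory

mkdir -p "$src/myData"
echo sample > "$src/myData/file.txt"

# -C changes into the parent first, so the archive stores "myData/..."
# rather than the absolute path from serverA.
tar -zcf "$src/myData-archive.tar.gz" -C "$src" myData

# -C on extraction drops the files under dataHERE.
mkdir -p "$dst/dataHERE"
tar -zxf "$src/myData-archive.tar.gz" -C "$dst/dataHERE"

ls "$dst/dataHERE/myData/file.txt"
```

The key point is that -C affects where tar *reads or writes* files, while the member names stored in the archive are whatever paths follow it on the create command line.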

Related

Exclude a directory from `podman/docker export` stream and save to a file

I have a container that I want to export as a .tar file. I have been using podman run with a tar --exclude=/dir1 --exclude=/dir2 … command that outputs to a file on a bind-mounted host dir. But recently this has been giving me tar: .: file changed as we read it errors, which podman/docker export would avoid. Besides, I suppose export is more efficient. So I'm trying to migrate to export, but the major obstacle is that I can't seem to find a way to exclude paths from the tar stream.
If possible, I'd like to avoid modifying a tar archive already saved on disk, and instead modify the stream before it gets saved to a file.
I've been banging my head for multiple hours, trying useless advice from ChatGPT, looking at cpio, and attempting to pipe the podman export output into a tar --exclude … command. With the last I had some small success at one point, but couldn't make tar save the result to a particular filename.
Any suggestions?
(note: I don't make a distinction between docker and podman here, as their export commands are identical, and mentioning both helps searchability)
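One approach that may fit: GNU tar's --delete can read an archive on standard input and write the filtered archive to standard output, so the export stream can be cleaned before it ever touches disk. I haven't verified this against podman itself; the container name and excluded paths below are placeholders, and the local tar-to-tar pipeline just demonstrates the filter.

```shell
# Hypothetical container name and paths -- adjust to your setup:
#   podman export mycontainer | tar --delete -f - dir1 dir2 > image.tar

# The same filter demonstrated locally, without podman:
workdir=$(mktemp -d)
mkdir -p "$workdir/src/keep" "$workdir/src/dir1"
echo data > "$workdir/src/keep/file.txt"
echo junk > "$workdir/src/dir1/junk.bin"

# GNU tar reads the archive from stdin (-f -), deletes the named
# members (a directory name matches everything under it), and
# writes the resulting archive to stdout.
tar -C "$workdir/src" -cf - . \
    | tar --delete -f - ./dir1 > "$workdir/filtered.tar"

tar -tf "$workdir/filtered.tar"
```

Note that --delete only works on uncompressed streams, which is exactly what podman/docker export emits.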

how is 1 tar split into several archives extracted as 1 full archive

I've never done this before with the actual tar command. On Windows it's easy: select all three parts and "extract here"; Linux probably has something similar. But I'm using a system that can't do that, so I need the command for extracting multiple split tars as one. I tried tar -xvf file.tar file.tar file.tar but it didn't work.
answer: $ cat filenamewithoutpartnumberorextension* > file.tar
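A runnable sketch of that recipe (file names are made up): create an archive, split it into fixed-size pieces, then concatenate the pieces back into a single archive before extracting. tar itself never needs to know the archive was ever split.

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir data
head -c 100000 /dev/urandom > data/blob.bin
tar -cf archive.tar data

# Split into 40 KB pieces: archive.tar.aa, archive.tar.ab, ...
split -b 40k archive.tar archive.tar.

# Plain concatenation restores a byte-identical archive.
cat archive.tar.* > rejoined.tar
cmp archive.tar rejoined.tar && echo "identical"

tar -tf rejoined.tar
```

The shell glob expands the parts in lexical order, which is the order split created them in, so the bytes line up exactly.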

Why is copy slower than move?

I have a big file that I'm moving around. The normal protocol in the lab is to copy it somewhere and then delete it.
I decided to change it to mv.
My question is, why is mv so much faster than cp?
To test it out I generated a file 2.7 GB in size.
time cp test.txt copy.txt
Took real 0m20.113s
time mv test.txt copy.txt
Took real 0m12.403s.
TL;DR mv was almost twice as fast as copy. Any explanations? Is this an expected result?
EDIT-
I decided to move/copy the folder to a destination other than the current folder.
time cp test.txt ../copy.txt
and
time mv test.txt ../copy.txt
This time cp took 9.238s and mv took only 0.297s. So not what some of the answers were suggesting.
UPDATE
The answers are right. When I tried to mv the file to a different disk on the same system, mv and cp took almost the same time.
When you mv a file on the same filesystem, the system just has to change directory entries to reflect your renaming. Data in the file is not even read.
(same filesystem means: same directory or same directory tree/same drive, provided that source and destination directories do not traverse symlinks leading to another filesystem of course!)
When you mv a file across file systems, it has the same effect as cp + rm: no speed gain (apart from the fact that you only run one command, and consistency is guaranteed: you don't have to check if cp succeeded to perform the rm)
(older versions of mv refused to move directories across filesystems, because they only did the renaming)
Be careful, the two are not fully equivalent: both cp and mv silently replace an existing destination file by default, but mv will refuse to rename a directory over an existing non-empty directory, while cp -r would merge into it.
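The rename-only behavior is easy to observe: the inode number (the on-disk identity of the file's data) survives a same-filesystem mv. A small sketch:

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo hello > original.txt

before=$(ls -i original.txt | awk '{print $1}')   # inode number

mv original.txt renamed.txt   # same directory => same filesystem: pure rename
after=$(ls -i renamed.txt | awk '{print $1}')

[ "$before" = "$after" ] && echo "same inode: the data blocks were never touched"
```

A cp to a new name in the same directory would instead allocate a fresh inode and copy every data block, which is where the time goes.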

How do I extract a TAR to a different destination directory

On server A, I created a tar file (backup.tar.gz) of the entire website /www. The tar file includes the top-level directory www
On server B, I want to put those files into /public_html but not include the top level directory www
Of course, tar -xzf backup.tar.gz places everything into /public_html/www
How do I do this?
Thanks!
You can use the --transform option to change the beginning of the archived file names to something else. As an example, in my case I had installed owncloud in a directory named sscloud instead of owncloud. This caused problems when upgrading from the *.tar file. So I used the transform option like so:
tar xvf owncloud-10.3.2.tar.bz2 --transform='s/owncloud/sscloud/' --overwrite
The transform option takes sed-like commands. The above will replace the first occurrence of owncloud with sscloud.
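A self-contained rerun of the same idea, using gzip instead of bzip2 and made-up directory names:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p owncloud/config
echo cfg > owncloud/config/config.php
tar -czf owncloud.tar.gz owncloud

# GNU tar applies the sed-style rule to every member name on
# extraction, so the top-level directory comes out renamed.
tar -xzf owncloud.tar.gz --transform='s/^owncloud/sscloud/'

ls sscloud/config/config.php
```

Anchoring the pattern with ^ is a good habit; an unanchored s/owncloud/sscloud/ would also rewrite a deeper path component that happened to contain the string.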
Answer is:
tar --strip-components 1 -xvf backup.tar.gz
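Combined with -C this answers the question directly; a small rehearsal with throwaway paths standing in for /www and /public_html:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p www/css
echo 'body{}' > www/css/site.css
tar -czf backup.tar.gz www

mkdir public_html
# Drop the leading "www/" from every member while extracting:
tar --strip-components 1 -xzf backup.tar.gz -C public_html

ls public_html/css/site.css
```

--strip-components removes that many leading path elements from each member name, so the contents of www land directly in the target directory.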

Tar Error [can not open: not a directory]

I made some archive files with the GNOME archive GUI on Ubuntu, but when I try to extract them with
tar zxvf archive_name
I get the following error:
Cannot open: Not a directory
What is the problem?
Try extracting the archive in an empty directory; any existing files/directories in the extract target usually cause problems if names overlap.
I encountered the same issue (for each file within an archive) and I solved it by appending ".tar.gz" to the archive filename as I'd managed to download a PECL package without a file extension:
mv pecl_http pecl_http.tar.gz
I was then able to issue the following command to extract the contents of the archive:
tar -xzf pecl_http.tar.gz
You probably already have a file with the same name as a directory that the tar is trying to extract.
Try extracting in a different location:
tar zxvf tar_name.tgz --one-top-level=new_directory_name
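--one-top-level needs a reasonably recent GNU tar (1.28 or later, if I recall correctly); it wraps everything under a single directory even when the archive has no common prefix, which sidesteps name collisions in the current directory:

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo a > a.txt
echo b > b.txt
tar -czf loose.tgz a.txt b.txt   # a "tarbomb": no top-level directory

# Everything lands under unpacked/ instead of scattering into the cwd.
tar -xzf loose.tgz --one-top-level=unpacked

ls unpacked
```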
Try using tar -zxvf archive_name instead. Some tar implementations are stricter about old-style bundled options than others; if yours mis-parses zxvf, it may try to treat it as a file name. (GNU tar still accepts the old style, so if -zxvf behaves the same way, the cause is more likely a name collision in the extraction directory, as the other answers suggest.)
