Wednesday, April 19, 2006

UNIX: ssh + tar + gzip -q = goodness

To retrieve a hierarchy of files from a remote server (or to copy it back to a remote server), I often do something like:
ssh servername "tar cvzf - dirname" | tar xvfz -
However, I usually get the following error message:
gzip: stdin: decompression OK, trailing garbage ignored
tar: Child returned status 2
tar: Error exit delayed from previous errors
Strangely enough, as I write this, I get the error message copying something from one FreeBSD system to another FreeBSD system, but I don't get it when copying something from one FreeBSD system to my Ubuntu system. Weird.

I put up with this problem for years. However, I recently needed to use it in a Makefile. Having an error like that is fine when you're a human, but a non-zero return code is a deal-breaker in a Makefile. I needed to clean up my act.

One easy way to make the problem go away is to drop the "z" flag from both invocations of tar. This is somewhat icky, because it really would be nice to have the content gzipped in transit. Otherwise, the transfer could take too long.

Finally, I found the real solution on the gzip Web site. Instead of passing the "z" flag to tar when untarring, use gunzip separately and pass the "q" flag to tell it to be quiet:
ssh server "tar cvzf - dirname" | gunzip -q | tar xvf -
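For use in a Makefile or a script, the pipeline above can be wrapped in a small shell function. This is only a sketch: the function name fetch_tree and the SSH override variable are made up for illustration, and it assumes the remote directory name contains no single quotes. The "v" flags are dropped to keep build output quiet.

```shell
# fetch_tree HOST DIR
# Copy DIR from HOST into the current directory.
# SSH is a hypothetical override for the ssh command (handy for testing);
# it defaults to plain "ssh".
fetch_tree() {
    # The remote tar writes a gzipped archive to stdout; locally,
    # gunzip -q decompresses it without printing warnings, and the
    # final tar unpacks the stream.
    ${SSH:-ssh} "$1" "tar czf - '$2'" | gunzip -q | tar xf -
}
```

Called as "fetch_tree servername dirname". Unless the shell has "set -o pipefail" turned on, the exit status of the whole pipeline is that of the final tar, which is the status make sees.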
By the way, as is standard in UNIX, there are plenty of other variations of this tar + ssh idiom. For instance, consider:
ssh server "cd myapplication/share/locale && 
find . -name '*.po' -o -name '*.mo' |
xargs tar cvzf -" |
gunzip -q | tar xvf -
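If the file names might contain spaces, a null-delimited variant of the same idea is safer. The following is a sketch under a few assumptions: the remote find and xargs support -print0 and -0 (GNU and BSD versions do), and the function name fetch_matching and the SSH override variable are hypothetical. Note the parentheses around the two -name tests: without them, an explicit -print0 would apply only to the '*.mo' test.

```shell
# fetch_matching HOST DIR
# Copy the *.po and *.mo files found under DIR on HOST into the
# current directory, preserving their relative paths.
# SSH is a hypothetical override for the ssh command; defaults to "ssh".
fetch_matching() {
    # -print0 / -0 pass file names separated by NUL bytes, so names
    # containing spaces or newlines reach tar intact.
    ${SSH:-ssh} "$1" "cd '$2' &&
        find . \( -name '*.po' -o -name '*.mo' \) -print0 |
        xargs -0 tar czf -" |
    gunzip -q | tar xf -
}
```

Called as "fetch_matching server myapplication/share/locale". One caveat: with a very long file list, xargs may run tar more than once, so this is best suited to modest numbers of files.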

10 comments:

Anonymous said...

From Brandon Golm:

do you have anything against cpio, and letting ssh handle the compression with '-C'?

well... your way is good too.

Shannon -jj Behrens said...

I can never remember the arguments for cpio ;)

I didn't know about ssh's "-C" argument. That's *even better* (less typing!), although I sure am glad to know the right way to work around my other way.

Shannon -jj Behrens said...

ssh -C servername "tar cvf - dirname" | tar xvf -

Yep, worked just fine!

Leon Atkinson said...

I wonder if compression is all that relevant, anyway. Most of the time, I have a fast, unsaturated network between my local machine and any remote machine.

Leon Atkinson said...

OK--I couldn't resist testing. It's relevant for my 768K DSL connection. My crude test showed 1 minute versus 6 minutes.

I wonder if I shouldn't add "Compression yes" to /etc/ssh/ssh_config.

Pistahh said...

Take a look at rsync. :)

Anonymous said...

I don't find cpio all that difficult:
cpio -i means data is coming from stdin,
cpio -o means data is going to stdout.

Also
find directory (-some-complicated-query) | cpio -o
find directory (-some-complicated-query) -print0 | cpio -0o
is easier than
find directory (-some-complicated-query) | xargs -d'\n' tar cf -
find directory (-some-complicated-query) -print0 | xargs -0 tar cf -
(for safety with working with filenames with spaces and other interesting characters)

Shannon -jj Behrens said...

> I don't find cpio all that difficult:

+1 for usefulness. I stand corrected!

Mark said...

your problem here was that you had the "v" option on the sending side of the tar, instead of just on the untar.
ssh user@host "tar czf - dirname" | tar zxvf -
should work fine.

Shannon -jj Behrens said...

> your problem here was that you had the "v" option on the sending side of the tar, instead of just on the untar.

Wow, that makes perfect sense! Thanks!