DEV Community

Adam La Rosa

Playing With Files

Everyone has to deal with file management at some point. But archives? Compression? Splits & concatenation? This is where the fun begins!

Got a bunch of files? Want to put them together in one file for transfer or some such? Tar to the rescue! From the man page...

The tar command creates, adds files to, or extracts files from an archive
file in "tar" format.  A tar archive is often stored on a magnetic tape,
but can be stored equally well on a floppy, CD-ROM, or in a regular disk
file.

Tar, short for "tape archive", got its name from the days when data was stored on magnetic tape. A command such as:

tar -cvf name.of.file.to.create.tar /home/user/*

would (c) create a new archive file, (v) provide "verbose" output to view progress, and (f) write the archive to the file named next. The arguments that follow are the name of the archive you're creating and the files to put in it.
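Creating an archive is only half the story, so here is a minimal sketch of the round trip: create, list with (t), and extract with (x). The scratch directory and the file names a.txt, b.txt and demo.tar are made up for illustration, and -C simply tells tar to change into a directory before it reads or writes files.

```shell
# Hypothetical scratch directory and file names, used only for illustration.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/out"
echo "hello" > "$workdir/src/a.txt"
echo "world" > "$workdir/src/b.txt"

# (c) create the archive; -C changes into src first so stored paths are relative.
tar -cf "$workdir/demo.tar" -C "$workdir/src" a.txt b.txt

# (t) list the contents without extracting anything.
tar -tf "$workdir/demo.tar"

# (x) extract into a different directory, then look at one of the files.
tar -xf "$workdir/demo.tar" -C "$workdir/out"
cat "$workdir/out/a.txt"
```

Listing before extracting is a handy habit, since it shows exactly which paths an archive is about to write.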

While creating archives is great and all, sometimes the amount of data is too big for comfort and compression is needed. Luckily tar has an option to use gzip, a common compression utility. From the gzip man page...

The gzip utility reduces the size of the named files using adaptive
Lempel-Ziv coding, in deflate mode.

There are different compression methods to choose from, but if you're not particular, adding a "z" flag to the above tar command will compress the archive with gzip.

tar -zcvf name.of.file.to.create.tar.gz /home/user/*
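To see what the "z" flag buys you, here is a small sketch that archives the same file with and without gzip and prints both sizes. The file names are hypothetical, and the test data is deliberately repetitive so that it compresses well; real-world savings depend entirely on what you are archiving.

```shell
# Hypothetical scratch directory and file names, used only for illustration.
workdir=$(mktemp -d)
mkdir "$workdir/src"

# Highly repetitive text compresses well, which makes the difference easy to see.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "the same line over and over" }' \
    > "$workdir/src/big.txt"

# Same input, once as a plain archive and once through gzip.
tar -cf  "$workdir/plain.tar"     -C "$workdir/src" big.txt
tar -zcf "$workdir/packed.tar.gz" -C "$workdir/src" big.txt

# Print both sizes in bytes; the second should be much smaller.
wc -c < "$workdir/plain.tar"
wc -c < "$workdir/packed.tar.gz"

# The z flag works the same way when listing or extracting.
tar -ztf "$workdir/packed.tar.gz"
```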

Now say you've got your archive, all zipped up, and it's still too big. It might make sense to split that file into more reasonably sized chunks. Well, there's a "split" command for that!

Split is fairly straightforward. There's one particular flag, -b, which specifies how big you'd like each portion to be. For example:

split -b 100m superhugefile.tar.gz "prefix-of-split-"

This tells split to take "superhugefile.tar.gz", cut it into 100 megabyte pieces, and begin each filename with "prefix-of-split-". The result is a set of files named like so:

prefix-of-split-aa
prefix-of-split-ab
prefix-of-split-ac
prefix-of-split-ad

...etc.
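If you'd like to try this without a huge file on hand, here is a scaled-down sketch. It assumes a made-up 2500 byte file called big.bin and uses 1k chunks in place of 100m, so it should produce two full 1024 byte pieces plus one smaller leftover piece.

```shell
# Hypothetical scratch directory and file names, used only for illustration.
workdir=$(mktemp -d)

# 2500 bytes of random data stand in for the huge archive.
head -c 2500 /dev/urandom > "$workdir/big.bin"

# 1k chunks instead of 100m so the demo runs instantly.
split -b 1k "$workdir/big.bin" "$workdir/part-"

# Show the pieces split created next to the original.
ls "$workdir"
```

The last piece is simply whatever is left over, so it is usually smaller than the rest.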

But what now? You've got all these files & it's time to put them back together. Cat to the rescue!

For as long as I can remember the "cat" command was my way of viewing the contents of a file. By itself a command such as:

cat readme.txt

will display the contents of readme.txt on the screen. Little did I know that, with the help of an output redirection operator, cat can also join files together! From the cat man page:

The cat utility reads files sequentially, writing them to the standard
output.  The file operands are processed in command-line order.  If file
is a single dash (`-') or absent, cat reads from the standard input.

So a simple command like

cat prefix-of-split-* > superhugefile.tar.gz

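will reassemble your file! This works because the shell expands the wildcard in alphabetical order, which is the same order split used when naming the pieces.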
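It's worth checking that the reassembled file really matches the original. Here is a minimal sketch of the full split and rejoin round trip, again with made-up file names and a tiny test file, using cmp to compare the two byte for byte.

```shell
# Hypothetical scratch directory and file names, used only for illustration.
workdir=$(mktemp -d)
head -c 2500 /dev/urandom > "$workdir/original.bin"

# Break the file into 1k pieces, then glue them back together.
split -b 1k "$workdir/original.bin" "$workdir/part-"
cat "$workdir"/part-* > "$workdir/rejoined.bin"

# cmp exits 0 and prints nothing when the files are byte-for-byte identical.
if cmp "$workdir/original.bin" "$workdir/rejoined.bin"; then
    echo "files match"
fi
```

One thing to watch for: write the rejoined file under a name the wildcard doesn't match, otherwise cat would try to read its own output.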

Command line tools like these are why I've come to love Unix and Unix-like operating systems. Whether it's macOS, Linux, or a BSD, having these familiar tools available provides a sense of comfort.
