"split" and "cat" - Split and Join Files

This section provides a tutorial example on how to use the "split" command to split a large file into chunks and the "cat" command to join the chunks back into a single file.

If you have a large file generated by an archive tool, a video generator, a memory dump, or a database backup, you may have trouble opening or copying it.

One workaround is to split it into chunks with the "split" command and keep those chunks in a sub-directory. You can join those chunks back into the original file with the "cat" command.

Here is what I did to create a large compressed archive file and split it into chunks.

1. Create the compressed archive file with the "tar -c -z" command:

herong$ cd /var/lib/mysql

herong$ tar -c -z -f /tmp/database-backup.tar.gz data
... wait for the "tar" command to finish

herong$ cd /tmp

herong$ ls -l *.tar.gz
-rwx------. 1 herong herong 9482463744 Oct 31 01:52 database-backup.tar.gz

2. Split the large file into chunks in a sub-directory. Option "-d" says to use numeric suffixes instead of alphabetic, like "chunk-00" and "chunk-01". Option "-b 1000000000" says to split it with 1000000000 bytes per chunk.

herong$ mkdir database-backup
herong$ cd database-backup

herong$ split -d -b 1000000000 ../database-backup.tar.gz chunk-
... wait for the 'split' command to finish

herong$ ls -l chunk*
-rwx------. 1 herong herong 1000000000 Oct 31 02:04 chunk-00
-rwx------. 1 herong herong 1000000000 Oct 31 02:05 chunk-01
-rwx------. 1 herong herong 1000000000 Oct 31 02:05 chunk-02
-rwx------. 1 herong herong 1000000000 Oct 31 02:06 chunk-03
-rwx------. 1 herong herong 1000000000 Oct 31 02:06 chunk-04
-rwx------. 1 herong herong 1000000000 Oct 31 02:06 chunk-05
-rwx------. 1 herong herong 1000000000 Oct 31 02:07 chunk-06
-rwx------. 1 herong herong 1000000000 Oct 31 02:07 chunk-07
-rwx------. 1 herong herong 1000000000 Oct 31 02:07 chunk-08
-rwx------. 1 herong herong  482463744 Oct 31 02:08 chunk-09
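
The same step can be tried on a small scale before running it on a multi-gigabyte file. The following is only a sketch: the file name "sample.bin", the 2500-byte size and the 1000-byte chunk size are made up for illustration, and it assumes a "split" that supports the "-d" option (as GNU coreutils does).

```shell
#!/bin/sh
# Small-scale sketch of step 2. All names and sizes are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Create a 2500-byte sample file standing in for the large archive.
head -c 2500 /dev/zero > sample.bin

size=2500
chunk=1000

# Expected number of chunks: size divided by chunk size, rounded up.
expected_count=$(( (size + chunk - 1) / chunk ))

mkdir sample
cd sample

# Split into 1000-byte chunks with numeric suffixes: chunk-00, chunk-01, ...
split -d -b "$chunk" ../sample.bin chunk-

ls -l chunk-*

# Count the chunks and measure the last one.
chunk_count=$(ls chunk-* | wc -l)
last_size=$(wc -c < chunk-02)
echo "expected=$expected_count actual=$chunk_count last=$last_size"
```

Applying the same rounding-up formula to the 9482463744-byte archive with 1000000000-byte chunks gives 10 chunks, the last one holding the remaining 482463744 bytes, which matches the listing above.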

3. Save the original file name in the sub-directory as an empty placeholder file, and delete the original file.

herong$ touch database-backup.tar.gz

herong$ rm ../database-backup.tar.gz

4. Copying the chunks to other devices is much easier now.

5. Use the "cat" command to join the chunks back whenever needed:

herong$ cd database-backup

herong$ cat chunk* > database-backup.tar.gz
... wait for the "cat" command to finish

herong$ ls -l *.tar.gz
-rwx------. 1 herong herong 9482463744 Oct 31 02:52 database-backup.tar.gz
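
Matching file sizes do not prove that the joined file is identical to the original. One option, shown here only as a sketch, is to record a checksum next to the chunks before deleting the original in step 3, and to check it after joining. The names "sample.tar.gz" and "original.sha256" are made up for this example, and it assumes the "sha256sum" command from GNU coreutils is available.

```shell
#!/bin/sh
# Sketch: record a checksum before deleting the original, verify after joining.
# All file names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the large archive (random bytes, not a real tar.gz).
head -c 2500 /dev/urandom > sample.tar.gz

mkdir sample
cd sample
split -d -b 1000 ../sample.tar.gz chunk-

# Record the checksum of the original next to the chunks, then delete it.
(cd .. && sha256sum sample.tar.gz) > original.sha256
rm ../sample.tar.gz

# Later: join the chunks and verify the result against the recorded checksum.
cat chunk-* > sample.tar.gz
sha256sum -c original.sha256
verify_status=$?
```

"sha256sum -c" exits with a non-zero status if the joined file differs from the original, so a damaged or missing chunk is detected before the archive is used.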

6. If joining chunks and creating the original large file is a problem, you can pipe the "cat" command output to the "tar -x -z" command directly:

herong$ cat chunk* | tar -x -v -z -f -
... wait until all files are extracted
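
The streaming approach can be checked on a small archive first. This is a minimal sketch with made-up names ("data", "chunks", "restored") and tiny sizes; "-f -" tells "tar" to read the archive from standard input, and "-C" selects the directory to extract into.

```shell
#!/bin/sh
# Sketch: extract a split tar.gz archive without re-creating the joined file.
# Directory and file names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a small archive standing in for the database backup.
mkdir data
printf 'hello\n' > data/a.txt
head -c 5000 /dev/urandom > data/b.bin
tar -c -z -f sample.tar.gz data

# Split the archive into 1000-byte chunks in a separate directory.
mkdir chunks restored
split -d -b 1000 sample.tar.gz chunks/chunk-

# Stream the chunks, in suffix order, straight into tar.
cat chunks/chunk-* | tar -x -z -f - -C restored

restored_text=$(cat restored/data/a.txt)
echo "$restored_text"
```

Only the chunks and the extracted files need disk space in this case; the joined archive exists only as a stream between the two commands.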

Note that the "tar" and "gzip" combination gives you this nice feature of managing a large archive as a stream of sequential chunks, because both formats can be read sequentially from start to end.

If you split a large ZIP file into chunks, you will not be able to pipe the "cat" command output to "unzip". This is because the table of contents (the central directory) is stored at the end of the ZIP file, which is in the last chunk.

Someone on the Internet said that the "jar -xv" command is able to unzip ZIP files as a stream of sequential chunks. Note that "jar" is a ZIP tool provided in the JDK (Java Development Kit) package. If you want to try it, here is the command:

herong$ cd database-backup

herong$ cat chunk* | jar -xv
