Abstract: In this article, you will learn the most common flags used with the tar command, how to create and extract tar archives, and how to create and extract gzip compressed tar archives.
This article is shared from the HUAWEI cloud community: "Linux: Compress and Extract Files", author: Tiamo_T.
How does the Linux tar command work?
The tar command is used to create .tar, .tar.gz, .tgz, or .tar.bz2 archives, usually called "tarballs". The extensions .tar.gz and .tgz identify archives compressed with gzip to reduce their size. Archives with the extension .tar.bz2 are compressed with bzip2.
The tar binary shipped with Linux distributions can usually create gzip-compressed archives out of the box. As we will see in this article, this may not be the case for other types of compression.
Let's start with three examples of the tar command to get familiar with the most common flags.
Create an archive containing two files
This is a basic example of the tar command; in this case we don't use compression:
tar -cf archive.tar testfile1 testfile2
This command creates an archive file named archive.tar that contains two files: testfile1 and testfile2.
This is the meaning of the two flags:
- -c (same as --create): create a new archive
- -f (same as --file): specify the archive file (in this case archive.tar)
The file command confirms that archive.tar is an archive:
[myuser@localhost]$ file archive.tar
archive.tar: POSIX tar archive (GNU)
Another useful flag is -v, which prints verbose output listing the files that tar processes.
Let's see how the output changes if we also pass the -v flag when creating the archive:
[myuser@localhost]$ tar -cfv archive.tar testfile1 testfile2
tar: archive.tar: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
Strange, for some reason we got an error...
This is because tar takes the archive name from whatever follows the -f flag, and in this case what follows -f is v.
The result is an archive named v, as you can see from the ls output below:
[myuser@localhost]$ ls -al
total 20
drwxrwxr-x. 2 myuser mygroup 4096 Jul 17 09:42 .
drwxrwxrwt. 6 root root 4096 Jul 17 09:38 ..
-rw-rw-r--. 1 myuser mygroup 0 Jul 17 09:38 testfile1
-rw-rw-r--. 1 myuser mygroup 0 Jul 17 09:38 testfile2
-rw-rw-r--. 1 myuser mygroup 10240 Jul 17 09:42 v
[myuser@localhost]$ file v
v: POSIX tar archive (GNU)
The "No such file or directory" error occurs because tar is trying to create an archive named v that contains three files: archive.tar, testfile1, and testfile2.
But archive.tar does not exist, so tar reports an error.
This shows how important the order of tar's flags is: -f must come last in the group, immediately before the archive name.
Let's swap the -f and -v flags in the tar command and try again:
[myuser@localhost]$ tar -cvf archive.tar testfile1 testfile2
testfile1
testfile2
This time everything went well, and the verbose output shows the names of the two files added to the archive we are creating.
Make sense?
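One way to sidestep this ordering pitfall is to use the long option --file=NAME, which ties the archive name to the option explicitly. The sketch below assumes GNU tar and works in a throwaway directory created with mktemp; the name archive-long.tar is made up for this illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch testfile1 testfile2

# -f is the last letter of the group, so the next argument is the archive name.
tar -cvf archive.tar testfile1 testfile2

# Long form: --file=NAME makes the pairing explicit, so --verbose can follow it.
tar --create --file=archive-long.tar --verbose testfile1 testfile2

# Both archives should list the same two members.
short_listing=$(tar -tf archive.tar)
long_listing=$(tar -tf archive-long.tar)
echo "$short_listing"
```

With the long form, the flags can be written in any order without the archive name being misread.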
List all files in the tar archive in detail
To list all the files in the tar archive without extracting their contents, we will introduce a fourth flag:
- -t (same as --list): list the contents of the archive
We can now put together the three flags: -t, -v, and -f to view the files in the archive we created earlier:
[myuser@localhost]$ tar -tvf archive.tar
-rw-rw-r-- myuser/mygroup 0 2020-07-17 09:38 testfile1
-rw-rw-r-- myuser/mygroup 0 2020-07-17 09:38 testfile2
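Without -v, the -t flag prints just one member name per line, which is handy in scripts. As a minimal sketch, assuming the same two test files, this is how we could check whether a given file is in an archive without extracting anything:

```shell
# Scratch directory with the same two test files used above.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch testfile1 testfile2
tar -cf archive.tar testfile1 testfile2

# Count the members; tr strips the padding some wc implementations add.
member_count=$(tar -tf archive.tar | wc -l | tr -d ' ')

# grep -x matches whole lines only, so "testfile1" would not match "testfile10".
if tar -tf archive.tar | grep -qx 'testfile1'; then
    found=yes
else
    found=no
fi
echo "members: $member_count, testfile1 present: $found"
```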
Should I use a dash with tar?
You may have noticed that the flags are sometimes written with a leading dash and sometimes without.
So, let's see if passing the dash makes any difference.
First, let's try to run the same command without the dash in front of the flag:
[myuser@localhost]$ tar tvf archive.tar
-rw-rw-r-- myuser/mygroup 0 2020-07-17 09:38 testfile1
-rw-rw-r-- myuser/mygroup 0 2020-07-17 09:38 testfile2
The output is the same, which means the dash is optional when the flags are grouped together as the first argument.
Just to give you an idea, all of the following commands produce the same output:
tar -t -v -f archive.tar
tar -tvf archive.tar
tar tvf archive.tar
tar --list --verbose --file archive.tar
The last command uses the long option style.
As you can see, the short version of the flags is much quicker to type.
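If you want to convince yourself that these styles are equivalent, a small script can capture the output of each one and compare them. This is only a sketch, assuming GNU tar and a scratch directory; the variable names are made up for the example.

```shell
# Scratch directory with a small test archive.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch testfile1 testfile2
tar -cf archive.tar testfile1 testfile2

# Capture the listing produced by each option style.
separate=$(tar -t -v -f archive.tar)
grouped=$(tar -tvf archive.tar)
no_dash=$(tar tvf archive.tar)
long_form=$(tar --list --verbose --file archive.tar)

# Compare the four captured listings pairwise.
if [ "$separate" = "$grouped" ] && [ "$grouped" = "$no_dash" ] && [ "$no_dash" = "$long_form" ]; then
    styles_match=yes
else
    styles_match=no
fi
echo "all four styles match: $styles_match"
```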
Extract all files from the archive
Let's introduce an additional flag that extracts the contents of a tar archive: the -x flag.
To extract the contents of the archive we created earlier, we can use the following command:
tar -xvf archive.tar
(the two lines below are the output of the command in the shell)
testfile1
testfile2
ls -al
total 20
drwxrwxr-x 2 myuser mygroup 59 Feb 10 21:21 .
drwxr-xr-x 3 myuser mygroup 55 Feb 10 21:21 ..
-rw-rw-r-- 1 myuser mygroup 10240 Feb 10 21:17 archive.tar
-rw-rw-r-- 1 myuser mygroup 54 Feb 10 21:17 testfile1
-rw-rw-r-- 1 myuser mygroup 78 Feb 10 21:17 testfile2
As you can see, we use the -x flag to extract the contents of the archive, the -v flag for verbose output, and the -f flag to name the archive file (archive.tar).
Note: As mentioned before, a single dash before the whole group of flags is enough. We can also put a dash before each flag, and the output will be the same:
tar -x -v -f archive.tar
There is also a way to extract individual files from the archive.
With only two files in our archive this doesn't make much difference. However, if you have an archive of thousands of files and you only need one of them, it makes a huge difference.
This is common if, for example, a backup script archives the log files for the past 30 days and you only want to view the log file for a specific date.
To extract only testfile1 from archive.tar, you can use the following general syntax:
tar -xvf {archive_file} {path_to_file_to_extract}
In our specific case:
tar -xvf archive.tar testfile1
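To make the backup scenario more concrete, here is a minimal sketch. The directory logs, the archive logs-backup.tar, and the app-DATE.log file names are all hypothetical, invented only for this illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical daily log files standing in for a real backup.
mkdir logs
for day in 2020-07-15 2020-07-16 2020-07-17; do
    echo "log entries for $day" > "logs/app-$day.log"
done
tar -cf logs-backup.tar logs
rm -r logs

# Extract only the log for one date; the path must match the name stored in the archive.
tar -xvf logs-backup.tar logs/app-2020-07-16.log

extracted=$(ls logs)
content=$(cat logs/app-2020-07-16.log)
echo "extracted: $extracted"
```

Only the requested log file is written back to disk; the other members stay inside the archive.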
Let's see what happens if I create a tar archive containing two directories:
[myuser@localhost]$ ls -ltr
total 8
drwxrwxr-x. 2 myuser mygroup 4096 Jul 17 10:34 dir1
drwxrwxr-x. 2 myuser mygroup 4096 Jul 17 10:34 dir2
[myuser@localhost]$ tar -cvf archive.tar dir*
dir1/
dir1/testfile1
dir2/
dir2/testfile2
Note: I used the wildcard * to include in the archive every file or directory whose name starts with "dir".
If I just want to extract testfile1, the command will be:
tar -xvf archive.tar dir1/testfile1
After extraction, the original directory structure is retained, so I get testfile1 inside dir1:
[myuser@localhost]$ ls -al dir1/
total 8
drwxrwxr-x. 2 myuser mygroup 4096 Jul 17 10:36 .
drwxrwxr-x. 3 myuser mygroup 4096 Jul 17 10:36 ..
-rw-rw-r--. 1 myuser mygroup 0 Jul 17 10:34 testfile1
Is everything clear?
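One detail worth checking yourself: the path passed to tar has to match the member name exactly as it is stored in the archive. The sketch below (assuming GNU tar and a scratch directory) compares the exit status of an extraction by bare file name with one by full path.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1 dir2
touch dir1/testfile1 dir2/testfile2
tar -cf archive.tar dir1 dir2
rm -r dir1 dir2

# The bare file name is not a member of the archive, so this extraction fails.
tar -xf archive.tar testfile1 2>/dev/null
bare_status=$?

# The full stored path works and recreates dir1/ on the way.
tar -xf archive.tar dir1/testfile1
full_status=$?

echo "bare name exit status: $bare_status, full path exit status: $full_status"
```

When in doubt, list the archive with tar -tf first and copy the path from the listing.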
Reduce the size of the tar file
Gzip and bzip2 compression can be used to reduce the size of tar archives.
The tar flags that enable compression are:
- -z for gzip compression: the long flag is --gzip
- -j for bzip2 compression: the long flag is --bzip2
To create a gzipped tar archive named archive.tar.gz with verbose output, we will use the following command (which is also one of the most commonly used commands when creating tar archives):
tar -czvf archive.tar.gz testfile1 testfile2
To extract its contents, we will use:
tar -xzvf archive.tar.gz
We can also use the .tgz extension instead of .tar.gz, and the result is the same.
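To see the effect of gzip on archive size, we can compare an uncompressed and a compressed archive of the same file. This is a sketch only: sample.txt is a made-up, highly repetitive file chosen so that the difference is easy to see, and real-world savings depend on the data.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A highly repetitive sample file, so that compression has a visible effect.
i=0
while [ "$i" -lt 2000 ]; do
    echo "the same line of text, repeated many times"
    i=$((i + 1))
done > sample.txt

tar -cf archive.tar sample.txt      # no compression
tar -czf archive.tar.gz sample.txt  # gzip compression
tar -czf archive.tgz sample.txt     # same format, shorter extension

# wc -c counts bytes; tr strips the padding some wc implementations add.
plain_size=$(wc -c < archive.tar | tr -d ' ')
gzip_size=$(wc -c < archive.tar.gz | tr -d ' ')
tgz_listing=$(tar -tzf archive.tgz)
echo "uncompressed: $plain_size bytes, gzip: $gzip_size bytes"
```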
Now, let's create an archive compressed with bzip2:
[myuser@localhost]$ tar -cvjf archive.tar.bz2 testfile*
testfile1
testfile2
/bin/sh: bzip2: command not found
tar: Child returned status 127
tar: Error is not recoverable: exiting now
The error "bzip2: command not found" indicates that tar is trying to compress using the bzip2 command, but that command cannot be found on our Linux system.
The solution is to install bzip2. The process depends on the Linux distribution you are using; in my case it is CentOS, with yum as the package manager.
Let's install bzip2 using the following yum command (run it as root or with sudo):
yum install bzip2
I can use the which command to confirm that the bzip2 binary is now available:
[myuser@localhost]$ which bzip2
/usr/bin/bzip2
Now, if I run the tar command again with bzip2 compression:
[myuser@localhost]$ tar -cvjf archive.tar.bz2 testfile*
testfile1
testfile2
[myuser@localhost]$ ls -al
total 16
drwxrwxr-x. 2 myuser mygroup 4096 Jul 17 10:45 .
drwxrwxrwt. 6 root root 4096 Jul 17 10:53 ..
-rw-rw-r--. 1 myuser mygroup 136 Jul 17 10:54 archive.tar.bz2
-rw-rw-r--. 1 myuser mygroup 128 Jul 17 10:45 archive.tar.gz
-rw-rw-r--. 1 myuser mygroup 0 Jul 17 10:44 testfile1
-rw-rw-r--. 1 myuser mygroup 0 Jul 17 10:44 testfile2
Everything works!
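In a script, the same kind of failure can be avoided by checking for bzip2 before calling tar with -j. The sketch below uses command -v for the check; falling back to gzip when bzip2 is missing is just a choice made for this example, not something tar does on its own.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch testfile1 testfile2

# command -v succeeds only if bzip2 can be found on the PATH.
if command -v bzip2 >/dev/null 2>&1; then
    archive=archive.tar.bz2
    tar -cjf "$archive" testfile1 testfile2
else
    # Fallback chosen for this sketch: use gzip when bzip2 is missing.
    archive=archive.tar.gz
    tar -czf "$archive" testfile1 testfile2
fi
echo "created $archive"
```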
Out of curiosity, let's compare the two archives (.tar.gz and .tar.bz2) with the Linux file command:
[myuser@localhost]$ file archive.tar.gz
archive.tar.gz: gzip compressed data, last modified: Fri Jul 17 10:45:04 2020, from Unix, original size 10240
[myuser@localhost]$ file archive.tar.bz2
archive.tar.bz2: bzip2 compressed data, block size = 900k
As you can see, the file command can distinguish archives produced with the two compression algorithms.
Conclusion
In this article, you learned about the most common flags used with the tar command, how to create and extract tar archives, and how to create and extract gzip compressed tar archives.
Let's review all the flags again:
• -c: create a new archive
• -f: allows you to specify the file name of the archive
• -t: list the contents of the archive
• -v: print verbose output listing the processed files
• -x: extract files from the archive
• -z: use gzip compression
• -j: use bzip2 compression
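To tie the flags together, here is a minimal round-trip sketch: create a gzipped archive, list it, extract it into a separate directory, and check that the extracted files match the originals. The directory name restore is hypothetical, and the script assumes a tar that can use gzip.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "first file" > testfile1
echo "second file" > testfile2

# -c create, -z gzip, -f archive name
tar -czf archive.tar.gz testfile1 testfile2

# -t list the members (with -z for gzip)
listing=$(tar -tzf archive.tar.gz)

# -x extract; run inside a separate directory (hypothetical name "restore")
# so the originals are not overwritten.
mkdir restore
(cd restore && tar -xzf ../archive.tar.gz)

# cmp -s is silent and succeeds only when the two files are byte-for-byte identical.
if cmp -s testfile1 restore/testfile1 && cmp -s testfile2 restore/testfile2; then
    round_trip=ok
else
    round_trip=failed
fi
echo "round trip: $round_trip"
```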