The tar command that Linux users rely on every day is one of those foundational tools that separates casual users from folks who actually understand how Unix systems work. Whether you’re backing up your homelab, packaging source code for distribution, or transferring files between servers, tar is the tool that makes it happen.
I’ve been using tar since my Ubuntu 8.04 days, and I’ll be honest – I’ve made every mistake in the book with it. But those mistakes taught me how to use tar properly, and that’s exactly what I want to share with you today.
In this guide, I’ll walk you through everything from basic archive creation to advanced techniques like incremental backups and SSH piping. By the end, you’ll understand not just how to use tar, but why it works the way it does.
What is the tar Command in Linux?
The Simple Definition: Tape Archive for File Bundling
The name “tar” stands for Tape Archive. Back in the early days of Unix, administrators needed a reliable way to back up files to magnetic tape drives. Tar was born from that need, and while tape drives are now mostly museum pieces, tar remains the standard for bundling files on Linux systems.
At its core, tar does one thing exceptionally well: it takes multiple files and directories and bundles them into a single archive file. Think of it like putting everything into a cardboard box before shipping. The box itself doesn’t compress anything – it just keeps everything together.
Why tar Matters for Every Linux User
Here’s what makes tar indispensable:
- Preserves everything: File permissions, timestamps, ownership, symbolic links – tar keeps all the Unix metadata intact
- Directory structure: Your folder hierarchy comes along for the ride
- Compression flexibility: Pair tar with gzip, bzip2, or xz for different compression needs
- Universal compatibility: Nearly every Linux system has tar installed by default
Before creating large archives, I always recommend checking available disk space first. Nothing ruins your day like running out of disk mid-archive.
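Here’s a rough sketch of that pre-flight check. The two directories are throwaway placeholders created with mktemp, standing in for your real source and backup locations, and the comparison is only approximate since an archive adds a little overhead per file:

```shell
#!/bin/sh
set -eu

src=$(mktemp -d)      # hypothetical source directory
dest=$(mktemp -d)     # hypothetical destination for the archive
echo "sample data" > "$src/file.txt"

# Size of the source tree in KiB
need_kb=$(du -sk "$src" | awk '{print $1}')
# Free space on the destination filesystem in KiB (-P keeps each entry on one line)
free_kb=$(df -Pk "$dest" | awk 'NR==2 {print $4}')

if [ "$free_kb" -gt "$need_kb" ]; then
    tar -cf "$dest/backup.tar" -C "$src" .
    echo "archived ${need_kb} KiB (${free_kb} KiB free)"
else
    echo "not enough space: need ${need_kb} KiB, have ${free_kb} KiB" >&2
    exit 1
fi
```

For a compressed archive the space you need is usually smaller, so treat the uncompressed size as a safe upper bound.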
My First tar Disaster (And What It Taught Me About Backups)
Let me share a story from my early sysadmin days. I was migrating some critical config files from one server to another. Created a beautiful tar archive, felt pretty smart about myself, and promptly deleted the originals to free up space.
You can probably guess where this is going.
The archive was corrupted. I still don’t know exactly what went wrong – maybe disk errors, maybe I interrupted it mid-write. But I learned the most important tar lesson that day: always verify your archive before deleting the source files.
Now I religiously run tar -tvf to list contents before I touch the originals. That single habit has saved me countless headaches.
How tar Actually Works Behind the Scenes
Understanding tar vs Compression (Two Separate Steps)
This is where newcomers often get confused. Tar itself doesn’t compress anything. It creates an archive – a single file containing multiple files – and that archive is roughly the same size as all the original files combined (slightly larger, in fact, because each file gets a header and some padding).
Compression is a separate step. When you see a .tar.gz file, that’s actually two things happening:
- tar bundles the files into a .tar archive
- gzip compresses that archive into a smaller .tar.gz file
Modern tar handles both steps with a single command, but understanding this distinction helps when things go wrong.
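To make the two steps concrete, here’s a small sketch that runs them separately and then as one command. The demo directory and its contents are made up for illustration, and everything happens in a temporary directory:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir demo
# Generate repetitive (highly compressible) sample data
i=0
while [ "$i" -lt 5000 ]; do
    echo "line $i of sample data"
    i=$((i + 1))
done > demo/sample.txt

# Step 1: bundle only; the .tar is not compressed
tar -cf demo.tar demo
# Step 2: compress the bundle (-c writes to stdout, leaving demo.tar in place)
gzip -c demo.tar > demo.tar.gz

# Both steps in one command
tar -czf onestep.tar.gz demo

# Compare the sizes of the three files
ls -l demo.tar demo.tar.gz onestep.tar.gz
```

The listing should show the plain .tar as the largest file, with the two gzip-compressed versions much smaller.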
Why tar Preserves Unix Permissions (And Why That Matters)
Unlike zip files (which were designed for DOS/Windows), tar was built for Unix systems. It preserves:
- File permissions: Execute bits, read/write permissions
- Ownership: User and group IDs
- Timestamps: Modification times
- Symbolic links: Stored as links, not as file copies
This makes tar archives essential for system backups. If you’re backing up /etc or moving a web server to a new machine, you need those permissions intact. Zip tools handle Unix metadata inconsistently, so permissions and ownership often don’t survive the round trip.
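You can see this for yourself with a quick experiment. This is a sketch using made-up file names (run.sh, latest) in a temporary directory; the -p flag asks tar to restore the recorded permissions rather than filtering them through your umask:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir -p src out
printf '#!/bin/sh\necho hi\n' > src/run.sh
chmod 750 src/run.sh
ln -s run.sh src/latest      # symbolic link pointing at run.sh

tar -cf perms.tar src
# The verbose listing shows the stored mode bits and the link target
tar -tvf perms.tar

# Extract with -p so recorded permissions are restored as stored
tar -xpf perms.tar -C out

ls -l out/src
```

After extraction, run.sh should still be executable and latest should still be a symlink rather than a copy. Restoring ownership to other users generally requires extracting as root.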
Basic tar Command Syntax and Essential Options
The Three Core Operations: Create, Extract, List
Most everyday tar commands do one of three things:
Create an archive (-c): Bundle files into a tar file
tar -cvf archive.tar /path/to/files
Extract an archive (-x): Pull files out of a tar file
tar -xvf archive.tar
List contents (-t): See what’s inside without extracting
tar -tvf archive.tar
Understanding tar Flags (Why -cvf vs -xvf Matters)
The most common flags you’ll use:
- -c: Create a new archive
- -x: Extract files from an archive
- -t: List/test archive contents
- -v: Verbose mode (shows files being processed)
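- -f: Specify the archive filename (you’ll almost always want this)
A common beginner mistake: forgetting the -f flag. Without it, tar falls back to a default (historically a tape device, on many systems standard input/output) instead of the file you had in mind. The flags can be combined – -cvf means “create, verbose, file” – but keep f last in the group, because the next argument is taken as the archive name.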
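If the bundled letters feel cryptic, it can help to see them next to their long-option spellings. This sketch assumes GNU tar (the long names below are from its manual) and uses a made-up notes directory:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir notes
echo "todo" > notes/todo.txt

# Bundled short flags: f comes last so "short.tar" is read as its argument
tar -cvf short.tar notes

# The same operation spelled out with long options
tar --create --verbose --file=long.tar notes

# Listing works the same way in either style
tar -tf short.tar
tar --list --file=long.tar
```

Both archives end up with identical member lists; the long form is just easier to read in scripts you’ll revisit months later.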
Creating tar Archives: Practical Examples
Create Uncompressed Archive
The simplest tar command creates an uncompressed archive:
tar -cvf backup.tar /home/username/documents
This creates backup.tar containing everything in the documents folder. The -v flag shows you each file as it’s added.
Archive Multiple Files and Directories
You can archive multiple items in one command. I use this when preparing project releases:
tar -cvf project.tar docs/ src/ config/ README.md
Pro tip: Use date-stamped filenames for backups to avoid overwriting previous archives:
tar -cvf backup-$(date +%Y%m%d).tar /var/www/html
This creates files like backup-20250115.tar – much easier to manage than generic names.
Using Relative vs Absolute Paths (Critical Best Practice)
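This is important. When you create a tar archive using absolute paths like /home/user/data, GNU tar strips the leading slash and stores the rest of the path (home/user/data). Extract that archive and the whole directory chain is recreated under wherever you happen to be standing, and if absolute names were forced with -P, files go straight back to their original location – potentially overwriting newer versions.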
Better approach: Change to the parent directory and use relative paths:
# Instead of this (absolute path - risky):
tar -cvf backup.tar /home/user/project
# Do this (relative path - safe):
cd /home/user
tar -cvf backup.tar project/
Tar will warn you about leading slashes (“Removing leading / from member names”) – that’s actually tar protecting you. But it’s better to use relative paths from the start and list files first to confirm what you’re archiving.
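If you’d rather not cd around in scripts, tar’s -C option gives the same result in one command. A minimal sketch, with a temporary directory standing in for /home/user:

```shell
#!/bin/sh
set -eu

base=$(mktemp -d)             # stands in for /home/user
mkdir -p "$base/project"
echo "data" > "$base/project/file.txt"

# -C switches into $base before reading "project", so only the relative
# path is stored, wherever the command is run from
tar -cf "$base/backup.tar" -C "$base" project

# Confirm the stored member names before relying on the archive
tar -tf "$base/backup.tar"
```

The listing should show names beginning with project/ and nothing that starts with a slash.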
Compressing tar Archives: gzip, bzip2, and xz
Creating .tar.gz Files (The Most Common Format)
The .tar.gz format (also called “tarball”) is what you’ll encounter most often. It’s the standard for source code distribution, backups, and file transfers:
tar -czf archive.tar.gz directory/
The -z flag tells tar to compress with gzip. Many of the source packages you download when installing software from source come in this format.
When to Use bzip2 vs xz vs gzip
Each compression algorithm has tradeoffs. Here’s my practical guide:
- gzip (-z): Fast compression, moderate file size. Use for daily backups where speed matters.
- bzip2 (-j): Slower but smaller files. Good balance for archiving older data.
- xz (-J): Best compression ratio, but slowest. Use for long-term archives where size matters more than time.
# gzip (fastest)
tar -czf archive.tar.gz data/
# bzip2 (balanced)
tar -cjf archive.tar.bz2 data/
# xz (smallest output)
tar -cJf archive.tar.xz data/
Compression Ratio vs Speed Tradeoffs
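In commonly cited benchmarks, xz produces archives up to around 30% smaller than gzip but takes roughly 3-5x longer to compress. Treat those figures as ballpark numbers: real results depend heavily on the data and on the compression level you choose.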
My rule of thumb: For automated scripts running via cron, use gzip. For one-time archives you’ll store for years, xz is worth the wait.
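The best way to pick is to measure on your own data. Here’s a sketch that builds the same archive with each compressor and prints the sizes. The log-style sample data is synthetic, so the ratios you see won’t match real workloads, and bzip2 and xz are skipped if they aren’t installed:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir data
# Synthetic, repetitive log lines as sample input
i=0
while [ "$i" -lt 20000 ]; do
    echo "2025-01-15 12:00:$((i % 60)) GET /index.html 200 entry=$i"
    i=$((i + 1))
done > data/access.log

tar -czf data.tar.gz data
# bzip2 and xz are separate programs that may not be installed everywhere
if command -v bzip2 >/dev/null 2>&1; then tar -cjf data.tar.bz2 data; fi
if command -v xz >/dev/null 2>&1; then tar -cJf data.tar.xz data; fi

# Compare resulting sizes in bytes
wc -c data.tar.*
```

Prefix each tar command with time if you also want to compare how long each compressor takes.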
Extracting tar Archives the Right Way
Extract to Current Directory
The basic extraction command:
tar -xzf archive.tar.gz
Modern tar auto-detects the compression format when extracting or listing, so you can usually just use -xf without specifying -z, -j, or -J. Tar figures it out.
Extract to Specific Location with -C
Need to extract somewhere other than your current directory? Use the -C flag:
tar -xzf archive.tar.gz -C /opt/applications/
This extracts the archive contents directly into /opt/applications/ (the directory has to exist already). If you download tar files with wget, pairing the download with tar -C makes for a clean one-liner install.
List Contents Before Extracting (Smart Safety Check)
Before extracting anything, especially archives from the internet, list the contents first:
tar -tzf archive.tar.gz
This shows you exactly what files exist and where they’ll extract to. You’re looking for:
- Absolute paths: Files starting with / could overwrite system files
- Unexpected locations: Archives that extract to ../ (parent directories)
- Suspicious files: Anything unexpected before running on production systems
This isn’t paranoia – it’s basic security hygiene.
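The first two checks are easy to script. This is a sketch, not a complete security scanner: it only inspects member names, and sample.tar.gz is a locally built stand-in for an archive you downloaded:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir pkg
echo "ok" > pkg/readme.txt
tar -czf sample.tar.gz pkg      # stand-in for a downloaded archive

# Flag member names that are absolute or contain a ".." path component;
# any matches are printed so you can see what triggered the warning
if tar -tzf sample.tar.gz | grep -E '^/|(^|/)\.\.(/|$)'; then
    echo "suspicious paths found; not extracting" >&2
    status=unsafe
else
    echo "no absolute or parent-directory paths found"
    status=safe
fi
```

You’d still want to read through the listing yourself for anything that simply looks out of place.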
Advanced tar Techniques Every Sysadmin Should Know
Excluding Files and Directories
Real-world archives rarely need everything. Use --exclude to skip files:
# Skip log files
tar --exclude='*.log' -czf backup.tar.gz /var/www/
# Skip multiple patterns
tar --exclude='*.log' --exclude='*.tmp' --exclude='cache/*' -czf backup.tar.gz /var/www/
# Skip version control directories
tar --exclude-vcs -czf source.tar.gz project/
The --exclude-vcs flag automatically skips .git, .svn, and other version control directories. I use this constantly when creating source distribution packages.
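It’s worth confirming that your patterns actually excluded what you expected. Here’s a sketch with a made-up site directory, assuming GNU tar (which is where --exclude-vcs comes from):

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir -p site/cache site/.git
echo "<h1>hi</h1>" > site/index.html
echo "noise"       > site/debug.log
echo "tmp"         > site/cache/page.tmp
echo "ref"         > site/.git/HEAD

# The patterns are quoted so the shell passes them to tar unexpanded,
# and they come before the directory they should apply to
tar --exclude='*.log' --exclude='cache/*' --exclude-vcs -czf site.tar.gz site

# List what actually made it into the archive
tar -tzf site.tar.gz
```

The listing should include index.html but not the log file, the cached page, or anything under .git.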
Creating Incremental Backups with --newer
For ongoing backups, you don’t need to re-archive everything daily. The --newer flag only includes files modified after a specific date:
tar -czf incremental-$(date +%Y%m%d).tar.gz --newer='2025-01-01' /data/
For more sophisticated incremental backups, check the GNU tar documentation for level-based backup strategies.
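As a taste of what that looks like, here’s a minimal sketch using GNU tar’s --listed-incremental option, which tracks state in a snapshot file instead of relying on a date you supply. The file names (backup.snar, full.tar.gz, incr.tar.gz) are my own placeholders:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir data
echo "first" > data/a.txt
sleep 1   # demo only: make sure a.txt is clearly older than the full backup

# Level 0: full backup; tar records file state in the snapshot file
tar -czf full.tar.gz --listed-incremental=backup.snar data

# Something changes after the full backup
echo "second" > data/b.txt

# Level 1: only what changed since the snapshot was last updated
tar -czf incr.tar.gz --listed-incremental=backup.snar data

tar -tzf incr.tar.gz
```

The second archive should contain b.txt but not a.txt. To restore, you extract the full archive first and then each incremental in order.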
Verifying Archive Integrity
After creating critical archives, verify them:
# List contents to confirm structure
tar -tvf backup.tar.gz
# Create a checksum for later verification
sha256sum backup.tar.gz > backup.tar.gz.sha256
# Verify checksum later
sha256sum -c backup.tar.gz.sha256
Test extraction in a temporary directory before deleting originals. Seriously. Learn from my mistakes.
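That test extraction can be a few lines of shell. A sketch, using a made-up original directory and a scratch directory from mktemp:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
mkdir -p original/conf
echo "port=8080" > original/conf/app.conf

tar -czf backup.tar.gz original

# Extract into a scratch directory and compare against the source tree
check=$(mktemp -d)
tar -xzf backup.tar.gz -C "$check"

if diff -r original "$check/original" >/dev/null; then
    verified=yes
    echo "archive matches source; safe to remove originals"
else
    verified=no
    echo "archive differs from source; keep the originals" >&2
fi
rm -rf "$check"
```

Note that diff -r compares file contents, not permissions or ownership, so for system backups it’s worth eyeballing the tar -tvf listing as well.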
Common tar Errors and How to Fix Them
Permission Denied Errors
Trying to archive system directories without proper permissions:
tar: /etc/shadow: Cannot open: Permission denied
Solution: Use sudo for system files:
sudo tar -czf etc-backup.tar.gz /etc
File Changed While Reading
This warning appears when files change during archiving:
tar: /var/log/syslog: file changed as we read it
For active log files, this is usually harmless – the archive simply captures the file as it was when tar read it, possibly mid-write. For cleaner backups, exclude actively-written files or stop services temporarily.
Removing Leading Slash Warnings
You’ll see this message with absolute paths:
tar: Removing leading '/' from member names
This is tar protecting you, not an error. It converts absolute paths to relative ones so extraction doesn’t overwrite system locations. You can disable this with -P (preserve paths), but I don’t recommend it.
“One of the key rules for working as a system administrator is always to make a backup – you never know when you might need it.”
– Red Hat System Administrator Guide
tar vs zip: When to Use Each
This question comes up constantly. Here’s my practical take after years of working with both:
Use tar.gz when:
- Transferring between Linux/Unix systems
- Preserving file permissions and ownership matters
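- Maximum compression is important (tar.gz usually comes out smaller than zip, often by 5-15%, because it compresses the whole archive as one stream instead of file by file)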
- Creating backups or system archives
Use zip when:
- Sharing with Windows users
- Users need to access individual files without extracting everything
- Cross-platform compatibility is critical
For a deeper dive on the zip side, check my guide on zip and unzip commands.
Real-World tar Use Cases and Workflows
Automating Backups with Cron
One of my most-used tar patterns is automated daily backups. Here’s a practical cron entry:
# Daily backup at 2 AM
0 2 * * * tar -czf /backups/www-$(date +\%Y\%m\%d).tar.gz /var/www/html
Note the escaped percent signs (\%) – cron interprets % specially, so you need backslashes. Learn more about setting this up in my guide on how to schedule automated backups with cron.
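One way to sidestep the escaping entirely is to put the logic in a script and have cron call that instead. Here’s a sketch of what such a script could look like. The backup_dir function is my own hypothetical helper, and the demo runs against throwaway directories rather than /var/www/html:

```shell
#!/bin/sh
set -eu

# backup_dir SRC DEST: write DEST/<name>-YYYYMMDD.tar.gz and read it back
backup_dir() {
    src=$1
    dest=$2
    stamp=$(date +%Y%m%d)     # plain % is fine here; only crontab lines need \%
    archive="$dest/$(basename "$src")-$stamp.tar.gz"
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
    tar -tzf "$archive" >/dev/null   # fails if the archive is unreadable
    echo "$archive"
}

# Demo on throwaway directories; a real cron job would pass paths such as
# /var/www/html and /backups
site=$(mktemp -d)
backups=$(mktemp -d)
echo "<h1>hi</h1>" > "$site/index.html"
result=$(backup_dir "$site" "$backups")
echo "wrote $result"
```

Because the script exits non-zero when the read-back fails, cron’s usual mail-on-error behavior can alert you to a bad backup.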
Transferring Archives Over SSH
One of tar’s most powerful features is piping directly over SSH without creating a local file:
# Send directory to remote server
tar -czf - /local/data | ssh user@server 'cat > /remote/backup.tar.gz'
# Pull from remote server
ssh user@server 'tar -czf - /remote/data' > local-backup.tar.gz
The - tells tar to write to stdout instead of a file. Combined with SSH, you transfer compressed data without intermediate files. Make sure you have secure SSH connections set up first.
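You can try the same stdin/stdout plumbing locally before involving a remote machine. In this sketch the second tar plays the role of the remote side, and both directories are temporary placeholders:

```shell
#!/bin/sh
set -eu

src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/data"
echo "payload" > "$src/data/file.txt"

# "-f -" writes the archive to stdout on the left and reads it from stdin on
# the right; with SSH, the right-hand side would run on the remote host
tar -czf - -C "$src" data | tar -xzf - -C "$dst"

ls -R "$dst"
```

Over SSH, the right-hand side would become something like ssh user@server 'tar -xzf - -C /remote/dir', which unpacks the files on arrival instead of storing a .tar.gz.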
For larger ongoing transfers, consider rsync for efficient file transfers – it only sends changed portions of files.
Creating Release Packages
When distributing source code or application packages:
tar -czf myapp-v1.2.0.tar.gz --exclude-vcs --exclude='*.pyc' --exclude='__pycache__' myapp/
This creates a clean distribution tarball without version control files or Python bytecode. It’s a standard format for source releases on GitHub and in package repositories.
Quick Reference: Essential tar Commands
# Create compressed archive
tar -czf archive.tar.gz directory/
# Extract archive
tar -xzf archive.tar.gz
# List contents
tar -tzf archive.tar.gz
# Extract to specific directory
tar -xzf archive.tar.gz -C /destination/
# Create with exclusions
tar --exclude='*.log' -czf backup.tar.gz /data/
For the complete list of options and advanced flags, see the official tar manual page.
Wrapping Up
The tar command that Linux administrators have relied on for decades isn’t going anywhere. It’s simple, it’s reliable, and once you understand the fundamentals, it becomes second nature.
My biggest piece of advice: Always verify your archives before deleting source files. I learned that lesson the hard way during my early sysadmin days, and it’s saved me countless times since.
Start with the basics – create, extract, list. Add compression when needed. Build up to the advanced techniques like SSH piping and incremental backups as your workflows demand them.
Want to level up your command line skills further? Check out my other Linux guides:
- Automating tasks with cron – perfect for scheduled tar backups
- Using rsync for file synchronization – when tar isn’t the right tool
- Working with zip files – for when you need Windows compatibility
Happy archiving!




