How to Use tar Command in Linux: The Complete Archiving Guide

The tar command is one of those foundational tools that Linux users rely on every day, and it separates casual users from folks who actually understand how Unix systems work. Whether you’re backing up your homelab, packaging source code for distribution, or transferring files between servers, tar is the tool that makes it happen.

I’ve been using tar since my Ubuntu 8.04 days, and I’ll be honest – I’ve made every mistake in the book with it. But those mistakes taught me how to use tar properly, and that’s exactly what I want to share with you today.

In this guide, I’ll walk you through everything from basic archive creation to advanced techniques like incremental backups and SSH piping. By the end, you’ll understand not just how to use tar, but why it works the way it does.

What is the tar Command in Linux?

The Simple Definition: Tape Archive for File Bundling

The name “tar” stands for Tape Archive. Back in the early days of Unix, administrators needed a reliable way to back up files to magnetic tape drives. Tar was born from that need, and while tape drives are now mostly museum pieces, tar remains the standard for bundling files on Linux systems.

At its core, tar does one thing exceptionally well: it takes multiple files and directories and bundles them into a single archive file. Think of it like putting everything into a cardboard box before shipping. The box itself doesn’t compress anything – it just keeps everything together.

Why tar Matters for Every Linux User

Here’s what makes tar indispensable:

  • Preserves everything: File permissions, timestamps, ownership, symbolic links – tar keeps all the Unix metadata intact
  • Directory structure: Your folder hierarchy comes along for the ride
  • Compression flexibility: Pair tar with gzip, bzip2, or xz for different compression needs
  • Universal compatibility: Virtually every Linux distribution ships tar by default (stripped-down container images are the occasional exception)

Before creating large archives, I always recommend checking available disk space first. Nothing ruins your day like running out of disk mid-archive.
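
Here’s a minimal sketch of that pre-flight check. The two scratch directories are stand-ins (my assumption) for your real source and backup locations. Comparing against the uncompressed size is deliberately conservative, since compression usually shrinks the result.

```shell
# Scratch directories stand in for the real source and destination
src=$(mktemp -d)
dest=$(mktemp -d)
printf 'sample data\n' > "$src/file.txt"

# Size of the source tree in KiB
need_kb=$(du -sk "$src" | awk '{print $1}')

# Free space on the destination filesystem in KiB (-P keeps each entry on one line)
free_kb=$(df -Pk "$dest" | awk 'NR==2 {print $4}')

if [ "$free_kb" -gt "$need_kb" ]; then
    echo "OK: need ${need_kb} KiB, ${free_kb} KiB free"
else
    echo "Not enough space: need ${need_kb} KiB, only ${free_kb} KiB free" >&2
fi
```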

My First tar Disaster (And What It Taught Me About Backups)

Let me share a story from my early sysadmin days. I was migrating some critical config files from one server to another. Created a beautiful tar archive, felt pretty smart about myself, and promptly deleted the originals to free up space.

You can probably guess where this is going.

The archive was corrupted. I still don’t know exactly what went wrong – maybe disk errors, maybe I interrupted it mid-write. But I learned the most important tar lesson that day: always verify your archive before deleting the source files.

Now I religiously run tar -tvf to list contents before I touch the originals. That single habit has saved me countless headaches.

How tar Actually Works Behind the Scenes

Understanding tar vs Compression (Two Separate Steps)

This is where newcomers often get confused. Tar itself doesn’t compress anything. It creates an archive – a single file containing multiple files – but that archive is roughly the same size as all the original files combined (slightly larger, in fact, because of per-file headers and block padding).

Compression is a separate step. When you see a .tar.gz file, that’s actually two things happening:

  1. tar bundles the files into a .tar archive
  2. gzip compresses that archive into a smaller .tar.gz file

Modern tar handles both steps with a single command, but understanding this distinction helps when things go wrong.
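
To make the two steps concrete, here’s a small sketch that builds the same archive both ways in a scratch directory. The file and directory names are made up for illustration.

```shell
work=$(mktemp -d)
cd "$work"
mkdir data
printf 'hello\n' > data/a.txt
printf 'world\n' > data/b.txt

# Two steps: bundle first, then compress the bundle
tar -cf two-step.tar data/
gzip -c two-step.tar > two-step.tar.gz

# One step: tar runs gzip for you
tar -czf one-step.tar.gz data/

# Both archives list the same members
tar -tzf two-step.tar.gz
tar -tzf one-step.tar.gz
```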

Why tar Preserves Unix Permissions (And Why That Matters)

Unlike zip files (which were designed for DOS/Windows), tar was built for Unix systems. It preserves:

  • File permissions: Execute bits, read/write permissions
  • Ownership: User and group IDs
  • Timestamps: Modification times
  • Symbolic links: Stored as links, not as file copies

This makes tar archives essential for system backups. If you’re backing up /etc or moving a web server to a new machine, you need those permissions intact. Zip tools handle Unix metadata less consistently, so critical details can get lost along the way.
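
A quick round trip shows the metadata surviving. This is only a sketch with made-up file names. The -p flag on extraction asks tar to restore the stored permissions exactly rather than filtering them through your umask.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p src out

# A script with specific permissions, plus a symlink pointing at it
printf '#!/bin/sh\necho hi\n' > src/run.sh
chmod 750 src/run.sh
ln -s run.sh src/latest

tar -cf perms.tar src/
tar -xpf perms.tar -C out

# The mode and the link should match the originals
ls -l out/src
```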

Basic tar Command Syntax and Essential Options

The Three Core Operations: Create, Extract, List

Most tar commands you’ll run do one of three things:

Create an archive (-c): Bundle files into a tar file

tar -cvf archive.tar /path/to/files

Extract an archive (-x): Pull files out of a tar file

tar -xvf archive.tar

List contents (-t): See what’s inside without extracting

tar -tvf archive.tar

Understanding tar Flags (Why -cvf vs -xvf Matters)

The most common flags you’ll use:

  • -c: Create a new archive
  • -x: Extract files from an archive
  • -t: List/test archive contents
  • -v: Verbose mode (shows files being processed)
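  • -f: Specify the archive filename (you’ll want this every time)

A common beginner mistake: forgetting the -f flag. Without it, tar falls back to a default archive, which for GNU tar is the TAPE environment variable or a compiled-in default (often standard input/output), and that is almost never what you want. The flags can be combined – -cvf means “create, verbose, file.” Because -f takes the filename as its argument, keep f last in the bundle.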

Creating tar Archives: Practical Examples

Create Uncompressed Archive

The simplest tar command creates an uncompressed archive:

tar -cvf backup.tar /home/username/documents

This creates backup.tar containing everything in the documents folder. The -v flag shows you each file as it’s added.

Archive Multiple Files and Directories

You can archive multiple items in one command. I use this when preparing project releases:

tar -cvf project.tar docs/ src/ config/ README.md

Pro tip: Use date-stamped filenames for backups to avoid overwriting previous archives:

tar -cvf backup-$(date +%Y%m%d).tar /var/www/html

This creates files like backup-20250115.tar – much easier to manage than generic names.

Using Relative vs Absolute Paths (Critical Best Practice)

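This is important. When you create a tar archive using absolute paths like /home/user/data, the whole directory chain ends up in the member names. GNU tar strips the leading slash by default, so extraction recreates home/user/data under whatever directory you’re in; with -P, or with a tar that keeps the slash, files go back to that exact location – potentially overwriting newer versions.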

Better approach: Change to the parent directory and use relative paths:

# Instead of this (absolute path - risky):
tar -cvf backup.tar /home/user/project

# Do this (relative path - safe):
cd /home/user
tar -cvf backup.tar project/

Tar will warn you about leading slashes (“Removing leading / from member names”) – that’s actually tar protecting you. But it’s better to use relative paths from the start and list files first to confirm what you’re archiving.
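
If you’d rather not cd first, the -C option does the same job at creation time. Here’s a sketch where a scratch directory stands in for /home/user (an assumption so the example is safe to run anywhere).

```shell
base=$(mktemp -d)                 # stands in for /home/user
mkdir -p "$base/project"
printf 'notes\n' > "$base/project/README.md"

# -C switches directory before members are added, so stored names stay relative
tar -cf "$base/backup.tar" -C "$base" project/

# Member names start with project/, not with a slash
tar -tf "$base/backup.tar"
```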

Compressing tar Archives: gzip, bzip2, and xz

Creating .tar.gz Files (The Most Common Format)

The .tar.gz format (also called “tarball”) is what you’ll encounter most often. It’s the standard for source code distribution, backups, and file transfers:

tar -czf archive.tar.gz directory/

The -z flag tells tar to compress with gzip. Many packages you install from source come in this format.

When to Use bzip2 vs xz vs gzip

Each compression algorithm has tradeoffs. Here’s my practical guide:

  • gzip (-z): Fast compression, moderate file size. Use for daily backups where speed matters.
  • bzip2 (-j): Slower but smaller files. Good balance for archiving older data.
  • xz (-J): Best compression ratio, but slowest. Use for long-term archives where size matters more than time.

# gzip (fastest)
tar -czf archive.tar.gz data/

# bzip2 (balanced)
tar -cjf archive.tar.bz2 data/

# xz (smallest output)
tar -cJf archive.tar.xz data/

Compression Ratio vs Speed Tradeoffs

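According to compression performance benchmarks, xz can produce archives up to about 30% smaller than gzip, but takes roughly 3-5x longer to compress. Exact figures depend heavily on the data and the compression level, so treat these as ballpark numbers.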

My rule of thumb: For automated scripts running via cron, use gzip. For one-time archives you’ll store for years, xz is worth the wait.
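
Benchmarks only go so far, so it’s worth measuring on your own data. The sketch below generates a repetitive sample file (an assumption; real data will behave differently) and builds an archive with each compressor that happens to be installed.

```shell
work=$(mktemp -d)
cd "$work"
mkdir data

# Repetitive sample text; swap in a copy of your real data for meaningful numbers
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > data/sample.txt

# Uncompressed baseline
tar -cf plain.tar data/

# tool:flag:extension triples; skip any compressor that is not installed
for pair in gzip:z:gz bzip2:j:bz2 xz:J:xz; do
    tool=${pair%%:*}
    rest=${pair#*:}
    flag=${rest%%:*}
    ext=${rest#*:}
    if command -v "$tool" >/dev/null 2>&1; then
        tar -c"$flag"f "archive.tar.$ext" data/
    fi
done

# Compare the sizes side by side
ls -l plain.tar archive.tar.*
```

To compare speed as well, run each tar command under your shell’s time facility.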

Extracting tar Archives the Right Way

Extract to Current Directory

The basic extraction command:

tar -xzf archive.tar.gz

Modern tar auto-detects the compression format, so you can often just use -xf without specifying -z, -j, or -J. Tar figures it out.
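
Here’s a tiny sketch of auto-detection in action, using made-up names in a scratch directory: the archive is created with gzip but extracted without -z.

```shell
work=$(mktemp -d)
cd "$work"
mkdir data out
printf 'payload\n' > data/file.txt
tar -czf archive.tar.gz data/

# No -z here: tar inspects the file and picks the decompressor itself
tar -xf archive.tar.gz -C out
cat out/data/file.txt
```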

Extract to Specific Location with -C

Need to extract somewhere other than your current directory? Use the -C flag:

tar -xzf archive.tar.gz -C /opt/applications/

This extracts the archive contents directly into /opt/applications/. When you download tar files with wget, combining wget and tar -C makes for a clean one-liner install.
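
One way that one-liner might look, wrapped in a small function. The function name and the example URL are hypothetical. I keep the explicit -z because the GNU tar manual notes that auto-detection isn’t available when the archive arrives through a pipe.

```shell
# install_tarball URL DEST: stream a .tar.gz from URL straight into DEST
install_tarball() {
    mkdir -p "$2" &&
        wget -qO- "$1" | tar -xzf - -C "$2"
}

# Usage (hypothetical URL, needs network access):
#   install_tarball https://example.com/app.tar.gz /opt/applications
```

Nothing touches the disk until tar writes the extracted files, and a failed download surfaces as a tar error because the stream ends early.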

List Contents Before Extracting (Smart Safety Check)

Before extracting anything, especially archives from the internet, list the contents first:

tar -tzf archive.tar.gz

This shows you exactly what files exist and where they’ll extract to. You’re looking for:

  • Absolute paths: Files starting with / could overwrite system files
  • Unexpected locations: Archives that extract to ../ (parent directories)
  • Suspicious files: Anything unexpected before running on production systems

This isn’t paranoia – it’s basic security hygiene.
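
If you inspect archives often, the first two checks can be scripted. This is a rough sketch, not a complete security scanner; the function names are my own.

```shell
# flag_risky_names: read member names on stdin, print any that are
# absolute or that contain a ".." path component
flag_risky_names() {
    grep -E '^/|(^|/)\.\.(/|$)'
}

# check_archive FILE: return 0 if clean, 1 if risky names exist, 2 if unreadable
check_archive() {
    listing=$(tar -tf "$1") || return 2
    if printf '%s\n' "$listing" | flag_risky_names; then
        echo "Suspicious paths found in $1" >&2
        return 1
    fi
    echo "No absolute or parent-directory paths in $1"
}
```

Call it as check_archive archive.tar.gz and extract only when it returns success. You still need to eyeball the listing for files you simply didn’t expect.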

Advanced tar Techniques Every Sysadmin Should Know

Excluding Files and Directories

Real-world archives rarely need everything. Use --exclude to skip files:

# Skip log files
tar --exclude='*.log' -czf backup.tar.gz /var/www/

# Skip multiple patterns
tar --exclude='*.log' --exclude='*.tmp' --exclude='cache/*' -czf backup.tar.gz /var/www/

# Skip version control directories
tar --exclude-vcs -czf source.tar.gz project/

The --exclude-vcs flag automatically skips .git, .svn, and other version control directories. I use this constantly when creating source distribution packages.
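
Exclude patterns are easy to get subtly wrong, so I like to confirm them with a listing. Here’s a sketch on a throwaway directory tree (all names invented), assuming GNU tar for --exclude-vcs.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p site/cache site/.git
printf 'page\n'  > site/index.html
printf 'noise\n' > site/debug.log
printf 'tmp\n'   > site/cache/blob
printf 'ref\n'   > site/.git/HEAD

# Quote the patterns so the shell passes them to tar untouched
tar --exclude='*.log' --exclude='cache/*' --exclude-vcs -czf site.tar.gz site/

# Confirm what actually made it in
tar -tzf site.tar.gz
```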

Creating Incremental Backups with --newer

For ongoing backups, you don’t need to re-archive everything daily. The --newer flag only includes files modified after a specific date:

tar -czf incremental-$(date +%Y%m%d).tar.gz --newer='2025-01-01' /data/

For more sophisticated incremental backups, check the GNU tar documentation for level-based backup strategies.
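
As a taste of the level-based approach, here’s a minimal sketch using GNU tar’s --listed-incremental option. The snapshot file name (backup.snar) and the data are placeholders of mine, and the sleep is only there so the second file is reliably newer on filesystems with coarse timestamps.

```shell
work=$(mktemp -d)
cd "$work"
mkdir data
printf 'one\n' > data/first.txt

# Level 0: full backup; tar records what it saw in the snapshot file
tar --listed-incremental=backup.snar -czf full.tar.gz data/

sleep 1
printf 'two\n' > data/second.txt

# Level 1: only what changed since the snapshot was written
tar --listed-incremental=backup.snar -czf incr1.tar.gz data/

# The incremental archive should list second.txt but not first.txt
tar -tzf incr1.tar.gz
```

Each run updates the snapshot file, so keep a copy of the level 0 snapshot if you want later incrementals to be relative to the full backup. To restore, extract the full archive first and then each incremental in order.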

Verifying Archive Integrity

After creating critical archives, verify them:

# List contents to confirm structure
tar -tvf backup.tar.gz

# Create a checksum for later verification
sha256sum backup.tar.gz > backup.tar.gz.sha256

# Verify checksum later
sha256sum -c backup.tar.gz.sha256

Test extraction in a temporary directory before deleting originals. Seriously. Learn from my mistakes.
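
Here’s what that test extraction can look like in practice. It’s a sketch with a made-up project directory; on real data, substitute your own paths.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p project/conf
printf 'port=8080\n' > project/conf/app.conf

tar -czf backup.tar.gz project/

# 1. Can the whole archive be read end to end? A non-zero exit means trouble
tar -tzf backup.tar.gz > /dev/null && echo "archive is readable"

# 2. Extract into a scratch directory and compare with the source
check=$(mktemp -d)
tar -xzf backup.tar.gz -C "$check"
if diff -r project "$check/project"; then
    echo "verified: contents match the originals"
fi
```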

Common tar Errors and How to Fix Them

Permission Denied Errors

Trying to archive system directories without proper permissions:

tar: /etc/shadow: Cannot open: Permission denied

Solution: Use sudo for system files:

sudo tar -czf etc-backup.tar.gz /etc

File Changed While Reading

This warning appears when files change during archiving:

tar: /var/log/syslog: file changed as we read it

For active log files, this is usually harmless, though the archived copy may mix older and newer content rather than being a clean snapshot. For cleaner backups, exclude actively-written files or stop services temporarily.
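
In scripts, this warning matters because GNU tar signals it through its exit status: 1 means some files changed while being archived, and 2 means a fatal error. A wrapper along these lines (the function name is hypothetical) keeps a backup job from failing on the harmless case.

```shell
# run_backup ARCHIVE DIR: treat exit status 1 as a warning, anything higher as failure
run_backup() {
    tar -czf "$1" "$2"
    status=$?
    case "$status" in
        0) echo "backup complete" ;;
        1) echo "backup finished, but some files changed while being read" >&2 ;;
        *) echo "backup failed with status $status" >&2
           return "$status" ;;
    esac
}
```

This mapping assumes GNU tar; other tar implementations use different exit codes.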

Removing Leading Slash Warnings

You’ll see this message with absolute paths:

tar: Removing leading '/' from member names

This is tar protecting you, not an error. It converts absolute paths to relative ones so extraction doesn’t overwrite system locations. You can disable this with -P (preserve paths), but I don’t recommend it.

“One of the key rules for working as a system administrator is always to make a backup – you never know when you might need it.”

– Red Hat System Administrator Guide

tar vs zip: When to Use Each

This question comes up constantly. Here’s my practical take after years of working with both:

Use tar.gz when:

  • Transferring between Linux/Unix systems
  • Preserving file permissions and ownership matters
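  • Smaller archives matter (tar.gz compresses the whole bundle as one stream, so it often beats zip, which compresses each file separately, particularly with many small files)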
  • Creating backups or system archives

Use zip when:

  • Sharing with Windows users
  • Users need to access individual files without extracting everything
  • Cross-platform compatibility is critical

For a deeper dive on the zip side, check my guide on zip and unzip commands.

Real-World tar Use Cases and Workflows

Automating Backups with Cron

One of my most-used tar patterns is automated daily backups. Here’s a practical cron entry:

# Daily backup at 2 AM
0 2 * * * tar -czf /backups/www-$(date +\%Y\%m\%d).tar.gz /var/www/html

Note the escaped percent signs (\%) – cron interprets % specially, so you need backslashes. Learn more about setting this up in my guide on how to schedule automated backups with cron.
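
Another way to sidestep the escaping is to move the logic into a script and have cron call that. Here’s a sketch: the scratch directories stand in for /var/www/html and /backups, and the 14-day retention is an assumption of mine, not a recommendation.

```shell
# Stand-ins for the real source and backup directories
src=$(mktemp -d)
dest=$(mktemp -d)
printf '<h1>hi</h1>\n' > "$src/index.html"

stamp=$(date +%Y%m%d)      # plain % is fine here; only crontab lines need \%
archive="$dest/www-$stamp.tar.gz"

# -C keeps member names relative to the parent of the source directory
tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"

# Prune archives older than 14 days
find "$dest" -name 'www-*.tar.gz' -mtime +14 -exec rm -f {} +

ls -l "$dest"
```

Saved as, say, /usr/local/bin/www-backup.sh (a hypothetical path), the crontab entry becomes 0 2 * * * /usr/local/bin/www-backup.sh with no percent signs to escape.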

Transferring Archives Over SSH

One of tar’s most powerful features is piping directly over SSH without creating a local file:

# Send directory to remote server
tar -czf - /local/data | ssh user@server 'cat > /remote/backup.tar.gz'

# Pull from remote server
ssh user@server 'tar -czf - /remote/data' > local-backup.tar.gz

The - tells tar to write to stdout instead of a file. Combined with SSH, you transfer compressed data without intermediate files. Make sure you have secure SSH connections set up first.
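
The same trick works for unpacking on the far side, so no archive file is ever written on either machine. A sketch, with a hypothetical function name and placeholder host; the remote directory must already exist.

```shell
# push_dir LOCAL_DIR USER@HOST REMOTE_DIR: copy a directory tree over SSH
push_dir() {
    # Archive to stdout on this side, extract from stdin on the remote side
    tar -czf - -C "$(dirname "$1")" "$(basename "$1")" |
        ssh "$2" "tar -xzf - -C '$3'"
}

# Usage (placeholder host and paths):
#   push_dir /local/data user@server /remote
```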

For larger ongoing transfers, consider rsync for efficient file transfers – it only sends changed portions of files.

Creating Release Packages

When distributing source code or application packages:

tar -czf myapp-v1.2.0.tar.gz --exclude-vcs --exclude='*.pyc' --exclude='__pycache__' myapp/

This creates a clean distribution tarball without version control files or Python bytecode. It’s the standard format for source releases on GitHub and package repositories.

Quick Reference: Essential tar Commands

# Create compressed archive

tar -czf archive.tar.gz directory/

# Extract archive

tar -xzf archive.tar.gz

# List contents

tar -tzf archive.tar.gz

# Extract to specific directory

tar -xzf archive.tar.gz -C /destination/

# Create with exclusions

tar --exclude='*.log' -czf backup.tar.gz /data/

For the complete list of options and advanced flags, see the official tar manual page.

Wrapping Up

The tar command that Linux administrators have relied on for decades isn’t going anywhere. It’s simple, it’s reliable, and once you understand the fundamentals, it becomes second nature.

My biggest piece of advice: Always verify your archives before deleting source files. I learned that lesson the hard way during my early sysadmin days, and it’s saved me countless times since.

Start with the basics – create, extract, list. Add compression when needed. Build up to the advanced techniques like SSH piping and incremental backups as your workflows demand them.

Happy archiving!

Alexa Velinxs
I'm Alexa Velinxs, a cryptocurrency trading expert passionate about demystifying digital assets for both beginners and seasoned investors. Through my writing, I share actionable strategies, market insights, and practical tips to help you navigate the crypto landscape with confidence. Let's explore the future of finance together.