Why are tar.xz files created with Python's tarfile 15 times smaller than those created with macOS tar?

Translator's note: This is not an ordinary translation, because it is based not on a separate article but on a recent question on Stack Exchange, which became the site's biggest hit this month. Its author asks a question whose answer turned out to be a real revelation for some of the site's visitors.

While compressing directories of about 1.3 GB each, every one containing 1440 JSON files, I found a 15x difference between the size of archives created with the tar utility on macOS or Raspbian 10 (Buster) and archives created with Python's built-in tarfile library.

Minimal working example

This script compares both methods:

#!/usr/bin/env python3

from pathlib import Path 
from subprocess import call 
import tarfile

fullpath = Path("/Users/user/Desktop/temp/tar/2021-03-11") 
zsh_out = Path(fullpath.parent, "zsh-archive.tar.xz") 
py_out = Path(fullpath.parent, "py-archive.tar.xz")

# tar using terminal 
# tar cJf zsh-archive.tar.xz folderpath
call(["tar", "cJf", zsh_out, fullpath])

# tar using tarfile library 
with tarfile.open(py_out, "w:xz") as tar:
    tar.add(fullpath, arcname=fullpath.stem)

# Print filesizes 
print(f"zsh tar filesize: {round(Path(zsh_out).stat().st_size/(1024*1024), 2)} MB") 
print(f"py tar filesize: {round(Path(py_out).stat().st_size/(1024*1024), 2)} MB")

The result is this:

zsh tar filesize: 23.7 MB
py tar filesize: 1.49 MB

The following versions were used:

  • tar on macOS: bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6

  • tar on Raspbian 10: xz (XZ Utils) 5.2.4 liblzma 5.2.4

  • tarfile in Python: 0.9.0


After unpacking both archives, a recursive comparison of their contents reports no differences:

diff -r py-archive-expanded zsh-archive-expanded


Comparing the archives themselves, however, shows that they differ:

➜ diff zsh-archive.tar.xz py-archive.tar.xz
Binary files zsh-archive.tar.xz and py-archive.tar.xz differ

Looking inside the archives with Quicklook (and Betterzip) shows that the files are stored in a different order:

On the left is zsh-archive.tar.xz, on the right is py-archive.tar.xz.
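The same check can be done without a GUI viewer. The following is a minimal sketch, assuming small in-memory archives with made-up file names (`a.json`, `b.json`, `c.json`) in place of the real ones; it only compares the order in which members are stored.

```python
import io
import tarfile


def build_archive(names):
    """Build a small in-memory .tar.xz holding empty JSON files in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name in names:
            payload = b"{}"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def member_names(archive_bytes):
    """Return member names in the order they are stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return tar.getnames()


# Hypothetical stand-ins for the two archives: same files, different order.
first = member_names(build_archive(["b.json", "a.json", "c.json"]))
second = member_names(build_archive(["a.json", "b.json", "c.json"]))

same_members = sorted(first) == sorted(second)  # same set of files?
same_order = first == second                    # stored in the same order?
print(same_members, same_order)
```

For archives on disk, `tarfile.open(path).getnames()` returns the same kind of list, so the two real archives could be compared in the same way.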


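In the zsh archive the files are in no particular order, while in the Python archive they are sorted. Apart from that, the contents are the same.

What is going on here? Why is the Python archive so much smaller? Where does the 15x difference in favor of Python come from?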

Answer: the sort order matters

Python's tarfile adds files to the archive in sorted order; BSD tar does not.



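As it turns out, without additional options both BSD tar and GNU tar add files to the archive in whatever order the file system returns them.

GNU tar has a --sort option that controls the order in which directory entries are archived. Among its values are:

  • none: no sorting is performed (this is the default);

  • name: directory entries are sorted by name.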

GNU tar

After installing GNU tar via Homebrew:

brew install gnu-tar

and running tar on the same directory, but with the --sort option:


gtar --sort='name' -cJf zsh-archive-sorted.tar.xz /Users/user/Desktop/temp/tar/2021-03-11


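the result is a 1.5 MB archive, the same size as the one created by the Python library.

The same result can be obtained by first concatenating the JSON files in sorted order (the file names are unixtime timestamps) and then archiving the combined file with BSD tar: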


cat *.json > all.txt
tar cJf zsh-cat-archive.tar.xz all.txt


This archive is also 1.5 MB.

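Python's tarfile library

Judging by the source of TarFile.add, Python sorts directory entries before adding them, so tarfile always archives a directory in name order.

The Python documentation says:

Directories are added recursively by default. This can be avoided by setting recursive to False. Recursion adds entries in sorted order.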
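The documented behavior is easy to observe directly. Here is a minimal sketch, assuming Python 3.7 or later (where the sorted recursion is documented) and three hypothetical unixtime-named files created in a deliberately unsorted order in a temporary directory.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def archived_file_names(directory):
    """Archive `directory` with TarFile.add and return file names in stored order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(directory, arcname=Path(directory).name)
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        # Skip the directory entry itself; keep regular files only.
        return [member.name for member in tar.getmembers() if member.isfile()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp, "2021-03-11")
    root.mkdir()
    # Hypothetical file names, written in an unsorted order on purpose.
    for stamp in (1615464000, 1615420800, 1615442400):
        (root / f"{stamp}.json").write_text("{}")
    names = archived_file_names(root)

print(names)
```

Whatever order the file system reports the entries in, the names come back sorted, which is what `gtar --sort='name'` has to be asked for explicitly.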

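So, in the end, this is what happens:

The JSON files are very similar to their neighbors in time. When they are archived in sorted order, similar data ends up close together.

In that case the compressor can exploit the repetition. When the order is arbitrary, similar data is scattered across the archive and compresses far worse.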
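The effect of ordering can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not a reconstruction of the original data set: the payloads are random bytes in which each "file" overlaps heavily with its neighbor, the file names are hypothetical unixtime stamps, and the LZMA2 dictionary is deliberately shrunk to 64 KiB (via `dict_size`) so that the window is small relative to the input, as it is for a 1.3 GB directory with default settings.

```python
import io
import lzma
import random
import tarfile


def make_files(count=200, size=16 * 1024, step=1024, seed=0):
    """Create `count` payloads; each one overlaps its neighbor by size - step bytes."""
    rng = random.Random(seed)
    total = size + step * count
    blob = rng.getrandbits(8 * total).to_bytes(total, "little")
    return {
        f"{1615420800 + 60 * i}.json": blob[i * step : i * step + size]
        for i in range(count)
    }


def tar_xz_size(files, names, dict_size=64 * 1024):
    """Tar the payloads in the given order, compress with xz, return the size."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            tar.addfile(info, io.BytesIO(files[name]))
    # Assumption: a small dictionary stands in for "input much larger than window".
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(raw.getvalue(), format=lzma.FORMAT_XZ, filters=filters))


files = make_files()
sorted_names = sorted(files)
shuffled_names = sorted_names[:]
random.Random(1).shuffle(shuffled_names)

sorted_size = tar_xz_size(files, sorted_names)
shuffled_size = tar_xz_size(files, shuffled_names)
print(f"sorted:   {sorted_size / 1024:.0f} KiB")
print(f"shuffled: {shuffled_size / 1024:.0f} KiB")
```

In sorted order most of each payload repeats data that is still inside the compressor's window; after shuffling, the matching data is usually too far back to be referenced, so the archive comes out several times larger.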


UPD: thanks to @iliazeus for the additional explanation about XZ/LZMA!

