backup saga: rsync saves the day

it hurts. every new backup tool which promises simplicity only delivers the opposite. bugs, dependencies, incompatibility…

time to return to the ancient ways

i serendipitously found this post from 17 years ago, written by none other than Jamie Zawinski

he recommended an alluringly simple solution:

sudo rsync -vax --delete --ignore-errors / /Volumes/Backup/

0 5 * * * rsync -vax --delete --ignore-errors / /Volumes/Backup/

sudo crontab -u root that-file

very KISS.
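one thing a nightly cron job like this doesn't check is whether the drive is actually plugged in. a minimal sketch of a guard, assuming a marker file named .backup-target at the top of the drive (the marker name and both function names are hypothetical, not part of the original post):

```shell
#!/bin/sh
# hypothetical guard: only back up when a marker file is present on the target.
# the marker name ".backup-target" is an assumption; create it once by hand.

target_is_ready() {
    # ${1%/} drops a single trailing slash, so /Volumes/Backup and
    # /Volumes/Backup/ are both accepted
    [ -f "${1%/}/.backup-target" ]
}

run_backup() {
    target=${1%/}
    if ! target_is_ready "$target"; then
        echo "skipping backup: no marker file in $target" >&2
        return 1
    fi
    # same flags as the one-liner above
    rsync -vax --delete --ignore-errors / "$target/"
}
```

the cron entry would then call a script that ends with run_backup /Volumes/Backup instead of calling rsync directly.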

Mac users: for the backup drive to be bootable, you need to do two things:

  • When you first format the drive, set the partition type to “GUID”, not “Apple Partition Map”;

  • Before doing your first backup, Get Info on the drive and un-check “Ignore ownership on this drive” under “Ownership and permissions.”

You can test whether it’s bootable by holding down Option while booting and selecting the external drive.

12 years later Jamie returned to add:

…2. Time Machine exists and is pretty good. It uses rsync underneath.

From comments:

Forget classifying data. It’s too error prone.

You will forget to include something important, some buried directory, and then it will be gone. You will skip stuff that you don’t realise is worth backing up, like your apps’ settings. It takes work to compile all these inclusion lists. Making backups must be a brain-dead easy activity for two reasons:

  • You won’t do it if it’s something that you think of as something you have to do. It must be so easy that thinking “I should make a backup” is indistinguishable in effort from making the friggin’ backup.

  • The single biggest threat to your data is yourself. Minimising operator involvement at all stages of the process minimises the potential for operator error.

…which then goes on to complain about DVDs and tapes lmao

the setup

i tried getting my bearings

rsync --dry-run -vax --delete --ignore-errors --exclude="CloudStorage" / /Volumes/G$/backups/rjwls/rsync/

sent 92,424,712 bytes  received 8,616,150 bytes  1,413,158.91 bytes/sec
total size is 435,220,886,984  speedup is 4,307.38 (DRY RUN)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1358) [sender=3.3.0]

whoops i forgot sudo

sent 93,199,517 bytes  received 8,672,902 bytes  1,487,188.60 bytes/sec
total size is 440,866,811,315  speedup is 4,327.64 (DRY RUN)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1358) [sender=3.3.0]

that looks alright
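code 23 is rsync's "partial transfer due to error" exit status, which is not surprising when the source is / and some files can't be read. a small sketch of how a script might tell that apart from a real failure; the function name and the three labels are made up for illustration:

```shell
#!/bin/sh
# map an rsync exit status to a rough verdict.
# 0, 23 and 24 are documented in the rsync man page; lumping everything
# else together as "failed" is a simplification.
classify_rsync_status() {
    case $1 in
        0)  echo ok ;;
        23) echo partial ;;   # partial transfer due to error
        24) echo partial ;;   # source files vanished during the transfer
        *)  echo failed ;;
    esac
}
```

calling classify_rsync_status "$?" right after the rsync command prints the verdict for that run.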

sudo rsync -vax --delete --ignore-errors --exclude="CloudStorage" / /Volumes/koichi/backups/rjwls/rsync | less

real slow, but probably bc of the usb 3 drive, and it got stuck in /System for a long time

left it running overnight, but it had barely made progress by morning, probably because the macOS setting to put hard disks to sleep was enabled
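on macOS, one way to keep the machine and disk from idling during a long run is to wrap the command in caffeinate. a sketch, assuming the -i (no idle sleep) and -m (no disk idle sleep) flags behave as the caffeinate man page describes; keep_awake is a made-up helper name:

```shell
#!/bin/sh
# run a command under caffeinate on macOS, or directly anywhere else.
keep_awake() {
    if [ "$(uname)" = "Darwin" ] && command -v caffeinate >/dev/null 2>&1; then
        # -i: prevent idle sleep, -m: prevent the disk from idle sleeping
        caffeinate -im "$@"
    else
        "$@"
    fi
}
```

usage would look like keep_awake sudo rsync … with the usual arguments. whether this stops an external usb drive from spinning down on its own is untested here.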

time machine blunders

while this was running I decided to look into time machine. i created a .sparsebundle image in Disk Utility on the drive and tried getting TM to recognize it

  969  2024-12-14 15:27  sudo tmutil setdestination /Volumes/koichi/backups/rjwls/rjwls-time-machine.dmg.sparsebundle
  
<error>

  970  2024-12-14 15:28  sudo tmutil setdestination /Volumes/koichi/backups/rjwls/rjwls-time-machine.sparsebundle

<error>

  971  2024-12-14 15:29  sudo tmutil setdestination /Volumes/rjwls-time-machine/

<error>

  972  2024-12-14 15:29  sudo tmutil setdestination /Volumes/rjwls-time-machine

<error>

this was pointless.

back to rsync for more suffering

  974  2024-12-14 15:59  sudo rsync -vax --delete --ignore-errors <exclude patterns> / /Volumes/koichi/backups/rjwls/rsync >> ~/.log/rsync-backup.log
  975  2024-12-14 16:00  less ~/.log/rsync-backup.log
  976  2024-12-14 16:28  gomatrix

i let that run until the next day, when i discovered that backing up my 180 GB of data from my macbook had completely filled the 1600 GB of free space on the external drive.

dissecting the files in the target directory revealed that .app bundles had ballooned in size when copied by rsync; for example, Pages grew 50x, from 655 MB to 33 GB. my hunch was that, even though -a includes -l (copy symlinks as symlinks), links were being followed to their destinations, so a system file referenced by multiple apps got duplicated once per app. i went with the quick fix of excluding *.app for now
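one cheap way to test part of that hunch: -a does not include -H, so a file with several hard links gets copied once per name. a sketch that counts such files under a directory (count_hardlinked is a made-up name). it says nothing about APFS clones or sparse files, which could also inflate a copy:

```shell
#!/bin/sh
# print how many regular files under $1 have more than one hard link.
# -xdev keeps find on one filesystem, similar to rsync's -x.
# file names containing newlines would be miscounted.
count_hardlinked() {
    find "$1" -xdev -type f -links +1 | wc -l | tr -d ' '
}
```

running it on a single bundle, e.g. count_hardlinked /Applications/Pages.app, and getting 0 would rule hard links out as the cause for that app.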

i also noticed that rsync was deleting files from the previous backup that it shouldn’t have, since --delete should only remove files that no longer exist on the source

  984  2024-12-15 13:25  rm ~/.log/rsync-backup.log && sudo rsync -vaux --delete --ignore-errors <exclude patterns> / /Volumes/koichi/backups/rjwls/rsync >> ~/.log/rsync-backup.log

after iterating on this command for a while and deleting previous files using Finder (!), i started getting errors that the disk was full. some investigating in GrandPerspective:

deleting from an external drive using Finder just moves the files to the .Trashes folder inside that drive, silly mistake.

  996  2024-12-15 20:10  sudo rm -rf /Volumes/koichi/.Trashes/*

this took another entire day to complete, especially since i continued running rsync commands concurrently…

i begin taking poison damage

i found this post which revealed that the -a flag includes a bunch of posix-specific stuff (permissions, owner, group) that causes confusion on an exFAT drive, and recommended replacing it with -rltD. not sure if this fixes the .app expansion issue since it still has the -l flag, but it seemed to address the deletion issue. it also said --modify-window=1 is needed to allow for the weird timestamping in windows/ms-dos/exFAT.
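the --modify-window part is easy to picture: per the rsync man page, two timestamps count as equal when they differ by no more than the window. a sketch of that comparison on plain epoch seconds (mtimes_match is a made-up helper, not anything rsync exposes):

```shell
#!/bin/sh
# succeed if two epoch timestamps ($1, $2) differ by no more than $3 seconds.
mtimes_match() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))   # absolute value
    fi
    [ "$diff" -le "$3" ]
}
```

so with a window of 1, mtimes_match 1000 1001 1 succeeds and mtimes_match 1000 1002 1 fails. for reference, -a is shorthand for -rlptgoD, so -rltD is the same thing minus -p, -g and -o.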

i also decided to add flags to handle interruptions and partial updates based on this thread

after a few iterations i ended up with

sudo rsync -hvrltD --modify-window=1 \
--partial --inplace --append --delete-excluded \
--info=progress2 --ignore-errors \
--exclude-from=./exclude \
/ /Volumes/koichi/backups/rjwls/rsync \
> ./out.log 2> ./err.log

and an exclude file

CloudStorage
.Trash
.Trashes
Library/Developer
*.app
.vscode/extensions
nvim/mason/packages
.rustup/toolchains
Library/Application Support/Google
Library/Caches
Library/System
pyenv/versions
/System

i tried putting ._* in exclude but these get recreated every time so it’ll just slow down the process

after doing this with --dry-run, it returned

sent 61.08M bytes  received 141.47M bytes  93.06K bytes/sec
total size is 454.77G  speedup is 2,245.23 (DRY RUN)

…god damn

checked the logs and:

Volumes/koichi/backups/rjwls/rsync/System/Library/Assistant/FlowDelegatePlugins/EmergencyFlowPlugin.bundle/Contents/Resources/Templates/dialog/emergencyGeneralEmergency.catfamily/siriSupportWebsiteLink.cat/zh-cn.cat.bin

the backup was syncing itself. of course. the new command no longer had -x, so nothing stopped rsync from crossing into /Volumes.

added /Volumes to the exclude file, ran again, and i got it down to 140 GB
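a guard against this kind of self-inclusion could be a pre-flight check that the backup target sits under one of the anchored patterns in the exclude file. a rough sketch; target_is_excluded is a made-up helper, and it only understands literal patterns that start with /, not wildcards or unanchored names:

```shell
#!/bin/sh
# succeed if the target path ($2) lies at or below an anchored literal
# pattern (a line starting with "/") in the exclude file ($1).
target_is_excluded() {
    exclude_file=$1
    target=${2%/}
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        case $pattern in
            /*) ;;            # anchored pattern: keep checking
            *)  continue ;;   # unanchored or empty line: ignored here
        esac
        pattern=${pattern%/}
        [ -n "$pattern" ] || continue
        case $target/ in
            "$pattern"/*) return 0 ;;
        esac
    done < "$exclude_file"
    return 1
}
```

a wrapper script could refuse to start unless target_is_excluded ./exclude /Volumes/koichi/backups/rjwls/rsync succeeds.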

FINALLY

after excluding a few more things to pare it down, i ended up with a final backup size of about 103 GB which i’m happy with. time for the real thing

while that runs, time to find a reason to redo the whole system from scratch

bootable backups

this will work great in theory, but most of the files being copied are garbage like caches and logs and states and executables that are barely usable unless they’re on a bootable drive, since I’m not going to dissect the entire backup directory to copy each com.whatever.etcetera to the new drive.

i’ve given up on time machine; i’m sure it’s great when it works, but when it doesn’t i have zero way to adjust its behavior in a granular way, and i can’t have a backup solution that’s unfixable.

so how do i get a bootable drive maintained with rsync?

this article from 2011 says to make a Mac OS Extended (HFS+) partition and rsync to that, but that’s from before APFS existed

i plowed through two dozen other articles and threads, finding nothing particularly useful. most of it is outdated, “just use time machine”, “just use proprietary paid software X”

it seems apple has deeply nerfed bootability from external drives. it’s possible but with little documentation on what criteria a bootable drive has to satisfy other than being APFS or HFS+

to be continued…