backup saga: rsync saves the day
it hurts. every new backup tool which promises simplicity only delivers the opposite. bugs, dependencies, incompatibility…
time to return to the ancient ways
i serendipitously found this post from 17 years ago, written by none other than Jamie Zawinski
he recommended an alluringly simple solution:
sudo rsync -vax --delete --ignore-errors / /Volumes/Backup/
to run it every morning at 5 am, put this line in a file:
0 5 * * * rsync -vax --delete --ignore-errors / /Volumes/Backup/
and install it as root's crontab:
sudo crontab -u root that-file
very KISS.
Mac users: for the backup drive to be bootable, you need to do two things:
When you first format the drive, set the partition type to “GUID”, not “Apple Partition Map”;
Before doing your first backup, Get Info on the drive and un-check “Ignore ownership on this drive” under “Ownership and permissions.”
You can test whether it’s bootable by holding down Option while booting and selecting the external drive.
12 years later Jamie returned to add:
…2. Time Machine exists and is pretty good. It uses rsync underneath.
From comments:
Forget classifying data. It’s too error prone.
You will forget to include something important, some buried directory, and then it will be gone. You will skip stuff that you don’t realise is worth backing up, like your apps’ settings. It takes work to compile all these inclusion lists. Making backups must be a brain-dead easy activity for two reasons:
You won’t do it if it’s something that you think of as something you have to do. It must be so easy that thinking “I should make a backup” is indistinguishable in effort from making the friggin’ backup. The single biggest threat to your data is yourself. Minimising operator involvement at all stages of the process minimises the potential for operator error.
…which then goes on to complain about DVDs and tapes lmao
the setup
i tried getting my bearings
rsync --dry-run -vax --delete --ignore-errors --exclude="CloudStorage" / /Volumes/G$/backups/rjwls/rsync/
sent 92,424,712 bytes received 8,616,150 bytes 1,413,158.91 bytes/sec
total size is 435,220,886,984 speedup is 4,307.38 (DRY RUN)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1358) [sender=3.3.0]
whoops i forgot sudo
sent 93,199,517 bytes received 8,672,902 bytes 1,487,188.60 bytes/sec
total size is 440,866,811,315 speedup is 4,327.64 (DRY RUN)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1358) [sender=3.3.0]
that looks alright
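for what it's worth, code 23 is rsync's "partial transfer due to error" exit status, and 24 is the same thing for source files that vanished mid-run. on a live system both are pretty much expected, so a wrapper script could treat them as warnings instead of failures. a minimal sketch (classify_rsync_exit is a made-up helper name, not part of rsync):

```shell
# classify_rsync_exit is a hypothetical helper, not part of rsync.
# it maps an rsync exit status to a short human-readable verdict.
classify_rsync_exit() {
  case "$1" in
    0)  echo "ok" ;;
    23) echo "partial: some files/attrs were not transferred" ;;
    24) echo "partial: some source files vanished mid-transfer" ;;
    *)  echo "failed with code $1" ;;
  esac
}

# usage: run rsync, then classify its status
# sudo rsync --dry-run -vax --delete --ignore-errors / /Volumes/Backup/
# classify_rsync_exit "$?"
```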
sudo rsync -vax --delete --ignore-errors --exclude="CloudStorage" / /Volumes/koichi/backups/rjwls/rsync | less
real slow, but probably bc of the usb 3 drive, and it got stuck in /System for a long time
left it running overnight, but it had barely made progress by morning, probably because the macOS setting to sleep hard drives was enabled
time machine blunders
while this was running I decided to look into time machine. i created a .sparsebundle image in Disk Utility on the drive and tried getting TM to recognize it
969 2024-12-14 15:27 sudo tmutil setdestination /Volumes/koichi/backups/rjwls/rjwls-time-machine.dmg.sparsebundle
<error>
970 2024-12-14 15:28 sudo tmutil setdestination /Volumes/koichi/backups/rjwls/rjwls-time-machine.sparsebundle
<error>
971 2024-12-14 15:29 sudo tmutil setdestination /Volumes/rjwls-time-machine/
<error>
972 2024-12-14 15:29 sudo tmutil setdestination /Volumes/rjwls-time-machine
<error>
this was pointless.
back to rsync for more suffering
974 2024-12-14 15:59 sudo rsync -vax --delete --ignore-errors <exclude patterns> / /Volumes/koichi/backups/rjwls/rsync >> ~/.log/rsync-backup.log
975 2024-12-14 16:00 less ~/.log/rsync-backup.log
976 2024-12-14 16:28 gomatrix
i let that run until the next day, when i discovered that backing up my 180 GB of data from my macbook had completely filled the 1600 GB of free space on the external drive.
…
dissecting the files in the target directory revealed that .app files had ballooned in size via rsync; for example, Pages grew 50x from 655 MB to 33 GB. my hunch was that despite the -a flag including the -l option, links were being followed to their destinations, and multiple apps referencing the same system file resulted in that system file being duplicated for each app. i went with the quick fix of excluding *.app for now
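one way to poke at that hunch is to compare the same bundle on both sides: if the copy has far more regular files and zero symlinks, the links really were expanded; if the counts match, the bloat is coming from somewhere else. a rough sketch, where summarize is a hypothetical helper and the Pages paths in the usage comment are guesses at where things live:

```shell
# summarize is a hypothetical helper: prints disk usage (KiB),
# regular-file count, and symlink count for one directory.
summarize() {
  dir="$1"
  kib="$(du -sk "$dir" | awk '{print $1}')"
  files="$(find "$dir" -type f | wc -l | tr -d ' ')"
  links="$(find "$dir" -type l | wc -l | tr -d ' ')"
  printf '%s\t%s KiB\t%s files\t%s symlinks\n' "$dir" "$kib" "$files" "$links"
}

# usage (paths are assumptions):
# summarize /Applications/Pages.app
# summarize /Volumes/koichi/backups/rjwls/rsync/Applications/Pages.app
```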
i also noticed that rsync was deleting files from the previous backup which it shouldn't be, since with the --delete flag only files not found on the source should be deleted
984 2024-12-15 13:25 rm ~/.log/rsync-backup.log && sudo rsync -vaux --delete --ignore-errors <exclude patterns> / /Volumes/koichi/backups/rjwls/rsync >> ~/.log/rsync-backup.log
after iterating on this command for a while and deleting previous files using Finder (!), i started getting errors that the disk was full. some investigating in GrandPerspective turned up the cause:
deleting from an external drive using Finder just moves the files to the .Trashes folder inside that drive. silly mistake.
996 2024-12-15 20:10 sudo rm -rf /Volumes/koichi/.Trashes/*
this took another entire day to complete, especially since i continued running rsync commands concurrently…
i begin taking poison damage
i found this post which revealed that the -a flag includes a bunch of posix-specific stuff which will cause confusion on an exFAT drive, and to replace it with -rltD instead. not sure if this fixes the .app expansion issue since it still has the -l flag, but it seemed to address the deletion issue. also that --modify-window=1 is needed to allow for the weird timestamping in windows/ms-dos/exFAT.
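per the rsync man page, -a is shorthand for -rlptgoD. exFAT has no concept of unix permissions, groups, or owners, so the replacement is just -a with p, g, and o taken out. spelled out as a throwaway snippet (the variable names are only for illustration):

```shell
# -a (archive) expands to these single-letter options
archive_flags="rlptgoD"
# drop p (perms), g (group), o (owner), which exFAT cannot store
exfat_flags="$(printf '%s' "$archive_flags" | tr -d 'pgo')"
echo "-$exfat_flags"   # prints -rltD
```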
i also decided to add flags to handle interruptions and partial updates based on this thread
after a few iterations i ended up with
sudo rsync -hvrltD --modify-window=1 \
--partial --inplace --append --delete-excluded \
--info=progress2 --ignore-errors \
--exclude-from=./exclude \
/ /Volumes/koichi/backups/rjwls/rsync \
> ./out.log 2> ./err.log
and an exclude file
CloudStorage
.Trash
.Trashes
Library/Developer
*.app
.vscode/extensions
nvim/mason/packages
.rustup/toolchains
Library/Application Support/Google
Library/Caches
Library/System
pyenv/versions
/System
i tried putting ._* in exclude but these get recreated every time so it’ll just slow down the process
after doing this with --dry-run, it returned
sent 61.08M bytes received 141.47M bytes 93.06K bytes/sec
total size is 454.77G speedup is 2,245.23 (DRY RUN)
…god damn
checked the logs and:
Volumes/koichi/backups/rjwls/rsync/System/Library/Assistant/FlowDelegatePlugins/EmergencyFlowPlugin.bundle/Contents/Resources/Templates/dialog/emergencyGeneralEmergency.catfamily/siriSupportWebsiteLink.cat/zh-cn.cat.bin
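the backup was syncing itself. of course. swapping -vax for -hvrltD had dropped -x (--one-file-system), so nothing stopped rsync from wandering into /Volumes.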
added /Volumes to the exclude file, ran again, and i got it down to 140 GB
FINALLY
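a cheap guard against this kind of self-sync is to check, before running anything, whether the destination sits underneath the source. a sketch with a hypothetical dest_inside_src helper; it only compares path strings, so it won't see through symlinks or mount tricks:

```shell
# dest_inside_src is a hypothetical helper.
# returns 0 if dest lives underneath src, i.e. a sync of src
# would walk into its own destination unless that path is excluded.
dest_inside_src() {
  src="${1%/}/"    # normalise both paths to end in exactly one slash
  dest="${2%/}/"
  case "$dest" in
    "$src"*) return 0 ;;
    *)       return 1 ;;
  esac
}

if dest_inside_src / /Volumes/koichi/backups/rjwls/rsync; then
  echo "destination is inside the source: exclude it first"
fi
```

with / as the source every destination is inside it, so the check always fires; the point is just to make that impossible to forget.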
after excluding a few more things to pare it down, i ended up with a final backup size of about 103 GB which i’m happy with. time for the real thing
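for hunting down what else to exclude, sorting the top-level directories of a tree by size is usually enough. a small sketch (biggest_children is a made-up name; the glob skips dotfiles, so hidden directories need a separate look):

```shell
# biggest_children is a hypothetical helper.
# lists the n largest immediate children of a directory, sizes in KiB.
biggest_children() {
  dir="${1%/}"
  n="${2:-10}"
  # du -sk prints "<KiB><tab><path>"; sort numerically, largest first
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# usage:
# biggest_children /Volumes/koichi/backups/rjwls/rsync 15
```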
while that runs, time to find a reason to redo the whole system from scratch
bootable backups
this will work great in theory, but most of the files being copied are garbage like caches and logs and states and executables that are barely usable unless they’re on a bootable drive, since I’m not going to dissect the entire backup directory to copy each com.whatever.etcetera to the new drive.
i’ve given up on time machine; i’m sure it’s great when it works, but when it doesn’t i have zero way to adjust its behavior in a granular way, and i can’t have a backup solution that’s unfixable.
so how do i get a bootable drive maintained with rsync?
this article from 2011 says to make an OS X Extended partition and rsync to that, but i think this is from before APFS existed
i plowed through two dozen other articles and threads, finding nothing particularly useful. most of it is outdated, “just use time machine”, “just use proprietary paid software X”
it seems apple has deeply nerfed bootability from external drives. it’s possible but with little documentation on what criteria a bootable drive has to satisfy other than being APFS or HFS+
to be continued…