As part of the ongoing migration of the queer.group media files from local storage to S3, I’ve realised that I should do a cleanup of the orphaned files before copying them to the S3 bucket. Fortunately, Mastodon offers an easy solution with its `tootctl` command.
tootctl media remove-orphans
Very soon, files started getting deleted.
# removed for brevity
Found and removed orphan: cache/accounts/headers/109/384/607/004/860/065/original/039a300fdeba0dc0.jpg
Found and removed orphan: cache/accounts/headers/109/427/244/149/514/100/original/b464439464ab98ff.jpeg
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/04dda84036e651b6.png
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/2777348e3afd634d.png
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/2aa59f9af035ba23.png
During the removal process, the command crashed after about 200,000 scanned files, probably due to a temporary database error. Luckily, it has a `--start-after` parameter for resuming, though its explanation is very unhelpful, I would say.
The Paperclip attachment key where the loop will start. Use this option if the command was interrupted before.
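What the heck is a “Paperclip attachment key”? Where do I find it? Turns out, it’s the file path printed in the log output, meaning everything after “Found and removed orphan:”. To resume, take the one from the last line that was printed before the crash: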
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/2aa59f9af035ba23.png
Here’s the command I used to resume from that key:
RAILS_ENV=production bin/tootctl media remove-orphans --start-after=cache/accounts/headers/109/440/021/041/888/306/original/2aa59f9af035ba23.png
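If you’d rather not copy the key by hand, here is a minimal sketch for pulling it out of a saved log. It assumes you captured the output of the first run in a file (the name `remove-orphans.log` and the helper `last_orphan_key` are my own inventions, not part of Mastodon), and that the log lines look like the ones quoted above.

```shell
# Hypothetical helper, not part of tootctl: print the key from the last
# "Found and removed orphan:" line of a saved log file.
last_orphan_key() {
  # Keep only the orphan lines, take the last one, drop any carriage
  # returns, then strip everything up to and including the prefix.
  grep 'Found and removed orphan: ' "$1" \
    | tail -n 1 \
    | tr -d '\r' \
    | sed 's/^.*Found and removed orphan: //'
}

# Assumed usage, with the first run's output saved to remove-orphans.log:
#   RAILS_ENV=production bin/tootctl media remove-orphans \
#     --start-after="$(last_orphan_key remove-orphans.log)"
```

Check the printed key against the log before resuming, since progress output in the log may not look exactly like the lines shown here.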