jasmins little space

Removing orphans from the Mastodon cache

As part of the ongoing migration of the queer.group media files from local storage to S3, I realised that I should clean up the orphaned files before copying everything to the S3 bucket. Fortunately, Mastodon offers an easy solution with its `tootctl` command.

tootctl media remove-orphans

Very soon, files started getting deleted.

# removed for brevity
Found and removed orphan: cache/accounts/headers/109/384/607/004/860/065/original/039a300fdeba0dc0.jpg
Found and removed orphan: cache/accounts/headers/109/427/244/149/514/100/original/b464439464ab98ff.jpeg
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/04dda84036e651b6.png
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/2777348e3afd634d.png
Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/2aa59f9af035ba23.png

The command crashed after about 200,000 scanned files, probably due to a temporary database error. It does have a --start-after parameter for picking up where it left off, but with a very unhelpful explanation, I would say:

The Paperclip attachment key where the loop will start. Use this option if the command was interrupted before.

What the heck is a “Paperclip attachment key”? Where do I find it? Turns out, it’s the file path that the command prints after “Found and removed orphan:”, like the one in this line:

Found and removed orphan: cache/accounts/headers/109/440/021/041/888/306/original/2aa59f9af035ba23.png

Here’s the command I used to resume the cleanup:

RAILS_ENV=production bin/tootctl media remove-orphans --start-after=cache/accounts/headers/109/440/021/041/888/306/original/2aa59f9af035ba23.png
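If the output of the first run was saved to a file, the key can be pulled out of it instead of being copied by hand. This is only a rough sketch: the log file name `orphans.log` and the helper `last_orphan_key` are made up for this example, and it assumes the log contains the same “Found and removed orphan:” lines shown above.

```shell
# Hypothetical helper: print the key from the last
# "Found and removed orphan:" line in a saved log file.
last_orphan_key() {
  grep 'Found and removed orphan: ' "$1" \
    | tail -n 1 \
    | sed 's/^Found and removed orphan: //'
}

# Assumed usage, if the first run was captured with something like
# `... | tee orphans.log` (the file name is an assumption):
# RAILS_ENV=production bin/tootctl media remove-orphans --start-after="$(last_orphan_key orphans.log)"
```

The last removed orphan is not necessarily the last file that was scanned before the crash, so resuming from it may re-check a few files that were already looked at. That should be harmless, it just costs a bit of time.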