First step is to characterize the delta in the timestamps. That will give you an idea of what time-between-follows (TBF) should be to ensure reliable detection.
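Rough sketch of what that characterization could look like, assuming you fire follows at a fixed interval and then read back the timestamps the platform recorded. The function names, the sample numbers, and the "3x worst error" rule of thumb are all made up for illustration, not anything Twitter documents.

```python
from statistics import mean, pstdev

def characterize_deltas(observed, nominal):
    """Summarize how far the observed gaps drift from the intended gap (seconds)."""
    gaps = [b - a for a, b in zip(observed, observed[1:])]
    errors = [g - nominal for g in gaps]
    return {
        "mean_error": mean(errors),                      # systematic drift
        "jitter": pstdev(errors),                        # spread around that drift
        "worst_error": max(abs(e) for e in errors),      # largest single miss
    }

def suggest_tbf(stats, safety=3.0):
    """Hypothetical rule of thumb: keep TBF several times the worst error seen."""
    return safety * stats["worst_error"]

# Invented sample: follows sent every 10 s, timestamps as they showed up.
observed = [0.0, 10.4, 19.8, 30.1, 40.0]
stats = characterize_deltas(observed, nominal=10.0)
print(stats)
print("suggested minimum TBF:", suggest_tbf(stats))
```

You'd want a lot more than five samples, and probably at different times of day, before trusting the number.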
You could probably make a client for this. Build comms on top of twitter protocol.
Haven't used twitter in over 10 years, but this kinda makes me want to again.
Side note: reminds me of why Gmail scans drafts. Criminals used a single account and created drafts to communicate back and forth. This prevented detection because intel agencies scan email traffic, not drafts.
EDIT: why stop with twitter. what makes this powerful is that it leverages a feature every platform has. Youtube, Twitch, you name it. Let's go my dude.
You need to represent 2 variables: the dot/dash that make up the symbols, and the gap, the timed silence in between them.
If you want to do this you need some sort of 2nd action, like a comment or a Twitter like. That way you could represent the dot/dash with either action, and then space them out to time the silence.
Or you can use "Morse tap code" where two close knocks represent a dot, and if the gap between them is slightly longer it's a dash. That takes two knocks for one dot/dash but it works and is (was?) used in practice.
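A minimal sketch of the two-knock idea, treating each knock as one event timestamp (a follow, a like, whatever). The gap lengths here are assumptions picked for readability, not measured values, and it only handles a flat run of dots and dashes with no letter or word spacing.

```python
SHORT_GAP = 1.0    # assumed seconds between the two knocks of a dot
LONG_GAP = 2.0     # assumed seconds between the two knocks of a dash
SYMBOL_GAP = 5.0   # assumed pause before the next pair of knocks

def encode(symbols, start=0.0):
    """Turn a string of '.' and '-' into a flat list of knock times."""
    times = []
    t = start
    for s in symbols:
        if s not in ".-":
            raise ValueError(f"unsupported symbol: {s!r}")
        gap = SHORT_GAP if s == "." else LONG_GAP
        times.extend([t, t + gap])   # two knocks per symbol
        t += gap + SYMBOL_GAP
    return times

def decode(times):
    """Pair up knocks and classify each pair by the gap inside it."""
    threshold = (SHORT_GAP + LONG_GAP) / 2
    pairs = zip(times[0::2], times[1::2])
    return "".join("." if b - a < threshold else "-" for a, b in pairs)

print(encode(".-"))            # knock times for dot then dash
print(decode(encode("..-.")))  # round trip
```

The midpoint threshold only works if timestamp jitter stays well under half the difference between the two gaps, so in practice you'd size the gaps from whatever jitter you measure.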
More seriously, the events end up in a list, so a recipient would more likely "decode" them by comparing timestamps instead of "live" watching the actual sequence.
But I suppose if you timed your follows/unfollows really carefully, you could get certain patterns in the timestamps to appear which could be used to distinguish "long" and "short" pulses.
E.g. 10 toggles in the first 10 seconds, then in the next 10 seconds a 5-second gap followed by 5 toggles. That would be dash, quiet, dot.
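Decoding from the list of timestamps could look something like this: split the events into bursts wherever there's a long enough quiet stretch, then call a burst a dash or a dot by how many toggles it has. The `quiet` and `dash_min` thresholds are guesses that fit the example above, not tuned values.

```python
def split_bursts(timestamps, quiet=3.0):
    """Group event times into bursts separated by at least `quiet` seconds."""
    bursts = []
    current = []
    for t in sorted(timestamps):
        if current and t - current[-1] >= quiet:
            bursts.append(current)   # quiet stretch ends the current burst
            current = []
        current.append(t)
    if current:
        bursts.append(current)
    return bursts

def decode(timestamps, quiet=3.0, dash_min=8):
    """Bursts with at least `dash_min` toggles read as '-', shorter ones as '.'."""
    bursts = split_bursts(timestamps, quiet)
    return "".join("-" if len(b) >= dash_min else "." for b in bursts)

# 10 toggles at seconds 0..9, a gap, then 5 toggles at seconds 15..19.
events = list(range(10)) + list(range(15, 20))
print(decode(events))
```

This works off the recorded timestamps alone, so the recipient doesn't have to be watching while the toggles happen.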
.. ..-. / -.-- --- ..- / -.-. .- -. / .-. . .- -.. / - .... .. ... --..-- / - --- ..- -.-. .... / --. .-. .- ... ...