The whole selling point of fly is lightweight and fast VMs that can be "off" when not needed and start on-request. For this, I would:
Set up a "performance" instance with auto-start on and auto-restart-on-exit _off_, running a simple web service that accepts an incoming request, does the processing and upload, and then exits. All you need is the fly config, a Dockerfile, and the service code (e.g. Python). A simple API app like that, which only exists to ffmpeg-process something, can start very fast (ms). Something that needs to load a bigger model such as whisper still works, but will be a bit slower. fly takes care of automatically starting stopped instances on an incoming request for you.
(In my use case: app where people upload audio, to have it transcribed with whisper. I would send a ping from the frontend to the "whisper" service even before the file finished uploading, saying "hey wake up, there's audio coming soon", and it was started by the time the audio was actually available. Worked great.)
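A rough sketch of the "accept one request, do the work, exit" shape, in Ruby to match the script below rather than the Python mentioned above. `process_job`, `serve_one_request`, and the `PORT` default are names I made up for illustration, and whether the machine actually stays stopped after the process exits depends on the restart settings in your fly config, as described above.

```ruby
require "socket"

# Hypothetical stand-in for the real work (ffmpeg, transcription, upload).
def process_job(body)
  "processed #{body.bytesize} bytes"
end

# Accept exactly one HTTP request on `server`, answer it, and return the result.
def serve_one_request(server)
  client = server.accept
  client.gets # request line, e.g. "POST /process HTTP/1.1" (ignored here)

  # Read headers until the blank line that ends them.
  headers = {}
  while (line = client.gets) && line != "\r\n"
    name, value = line.split(":", 2)
    headers[name.downcase] = value.to_s.strip
  end

  body = client.read(headers.fetch("content-length", "0").to_i).to_s
  result = process_job(body)

  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{result.bytesize}\r\n" \
               "Connection: close\r\n\r\n" \
               "#{result}")
  client.close
  result
end

# Entry point for the container (left as a comment so loading this file
# does not block waiting for a connection):
#
#   server = TCPServer.new("0.0.0.0", Integer(ENV.fetch("PORT", "8080")))
#   serve_one_request(server)
#   exit 0 # clean exit after one job
```

The early "wake up" ping from the frontend would just be one more request to a service like this, handled before the job request, so a real version would loop until the actual job has been processed instead of exiting after the first connection.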
yanked from my script:
cmd = [
  "fly", "machine", "run", latest_image,
  "--app", APP_NAME,
  "--region", options[:region],
  '--vm-size', 'performance-1x',
  '--memory', options[:memory] || '2048m',
  "--entrypoint", "/rails/bin/docker-entrypoint bundle exec rake #{rake_task}",
  "--rm"
]
# system needs the arguments passed separately, so splat the array
system(*cmd)
or a 1-1 transliteration to their API. You can of course run many of these at once.
> When I started building the Call Kent feature, I could have designed a proper job queue with a dedicated worker pool. But that would have been solving a scalability problem I did not yet have. "Start simple and iterate when reality tells you to" is still how I think about this. Reality finally told me.
And then we find ourselves at
> The signature uses HMAC-SHA256 with a shared secret, verified with a timing-safe comparison to avoid leaking information through response time.
Yikes, the complexity!
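For reference, the check that quote describes looks roughly like the sketch below. It assumes a hex-encoded signature; `sign`, `secure_compare`, and `valid_signature?` are illustrative names, not anything from the article. Recent versions of Ruby's openssl gem also ship a built-in fixed-length comparison, which is probably preferable to the hand-rolled one here.

```ruby
require "openssl"

# Hex-encoded HMAC-SHA256 of the payload under the shared secret.
def sign(secret, payload)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, payload)
end

# Compare two strings without stopping at the first differing byte,
# so the time taken does not reveal how much of the signature matched.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# True when `received_hex` is the signature the shared secret produces.
def valid_signature?(secret, payload, received_hex)
  secure_compare(sign(secret, payload), received_hex.to_s)
end
```

The sender calls `sign` and puts the result in a header; the receiver recomputes it over the raw request body and calls `valid_signature?`.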
It took me thirty seconds to find the fly.io guidance to use BullMQ with Node+Redis.
https://fly.io/docs/blueprints/work-queues/
The recommendation is three lines of code. Instead, we have a queue, a queue worker, and an ffmpeg container running on two different vendors with two new internal API calls between services.
But also, this could all have been done in the browser. I did this ten years ago! https://pinecoder.pinecast.com
Start simple is fine, but this solution really seems like it overshot.
nvidia-smi dmon
is the Linux command (for NVIDIA): the "dec" column tells you whether hardware decode is happening. This works for both the browser (YouTube) and a video player (mpv etc.). I had to make active changes in both to get decoding to actually hit the GPU. Don't assume you've got hardware acceleration just because playback is smooth.
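If you want to check this from a script rather than by eye, something like the following works on captured output. It is a sketch that assumes the usual dmon layout, where header lines start with `#` and the first one names the columns; `dec_utilization` and `hw_decode_active?` are made-up helper names. It looks the column up by name, since the set of columns differs between driver versions.

```ruby
# Return the "dec" column (decoder utilization, %) for each sample row
# in the text printed by `nvidia-smi dmon`.
def dec_utilization(dmon_output)
  columns = nil
  samples = []

  dmon_output.each_line do |line|
    if line.start_with?("#")
      # The first header line names the columns; the second holds units.
      names = line.sub(/\A#\s*/, "").split
      columns ||= names if names.include?("dec")
      next
    end

    fields = line.split
    next if fields.empty? || columns.nil?

    # Unsupported metrics are printed as "-", which is skipped here.
    value = Integer(fields[columns.index("dec")], exception: false)
    samples << value if value
  end

  samples
end

# True when any sample shows the hardware decoder doing work.
def hw_decode_active?(dmon_output)
  dec_utilization(dmon_output).any?(&:positive?)
end
```

Capture a few samples while the video is playing and pass the text in; a run of zeros in "dec" during playback means decoding is happening on the CPU.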
Any $5 VPS could do this number crunching, no?
On the debate about Cloudflare vs Fly vs spot instances – for bursty, infrequent workloads that already use R2 for storage, Containers are a natural fit.
The container accesses R2 over Cloudflare's internal network, so files never hit the public internet.
Check out our implementation - https://videotobe.com/blog/how-we-process-video-with-cloudfl...
(The main difference is we don't use Cloudflare Queues in conjunction with Cloudflare Containers. You can set max_instances to your desired settings to process parallel requests.)
Leave it on his primary server and use cpulimit to constrain CPU usage of ffmpeg to, say, 30% - this addresses the throttling problem:
cpulimit -l 30 -- ffmpeg -i input.mp4 -c:v libx264 -preset veryfast output.mp4
Then use the simplest of queueing mechanisms, which is just to move files between directories - a long-proven and reliable way to build a processing queue:
bash code here:
#!/usr/bin/env bash
set -u
INCOMING="./incoming"
PROCESSING="./processing"
DONE="./done"
FAILED="./failed"
CPU_LIMIT=30
SLEEP_SECONDS=2
mkdir -p "$INCOMING" "$PROCESSING" "$DONE" "$FAILED"
# recover files left in processing after a crash
for path in "$PROCESSING"/*; do
  [ -e "$path" ] || continue
  name="$(basename "$path")"
  mv "$path" "$INCOMING/$name"
done
while true; do
  FILE=""
  # find first file in incoming
  for path in "$INCOMING"/*; do
    [ -f "$path" ] || continue
    FILE="$path"
    break
  done
  # if nothing available wait
  if [ -z "$FILE" ]; then
    sleep "$SLEEP_SECONDS"
    continue
  fi
  BASENAME="$(basename "$FILE")"
  STEM="${BASENAME%.*}"
  WORK="$PROCESSING/$BASENAME"
  OUTPUT="$DONE/$STEM.mp4"
  # move file into processing
  mv "$FILE" "$WORK"
  echo "Processing $WORK"
  # run ffmpeg under cpu limit (blocking)
  cpulimit -l "$CPU_LIMIT" -- ffmpeg -y -threads 1 -i "$WORK" -c:v libx264 -preset veryfast -c:a aac "$OUTPUT"
  STATUS=$?
  if [ "$STATUS" -eq 0 ]; then
    rm "$WORK"
    echo "Finished $OUTPUT"
  else
    mv "$WORK" "$FAILED/$BASENAME"
    echo "Failed $FAILED/$BASENAME"
  fi
done
I use them to generate thumbnails when uploading large photo files for my personal flickr clone, not even noticeable in billing.
A very heavy-handed solution, but super simple. A single one-liner. Just thought to share a weird trick I found.