I built AIDictation.com, a voice-to-text app written in Swift. It sends audio to my own backend, runs it through a Whisper-based pipeline, and returns a transcription you can send straight into an AI chat like ChatGPT or Claude.
I’ve been building full‑stack apps for ~20 years, but this is my first Swift application. I leaned heavily on AI coding tools to get from zero Swift to a working app and backend in a couple of weeks.
What it does
The app records audio and sends it to my server. The backend runs a pipeline using Whisper V3 Turbo + OpenAI GPT OSS 120B.
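For a rough idea, a stripped-down version of the two stages looks like this. The endpoint paths follow Groq's OpenAI-compatible API; the model IDs and the cleanup prompt here are simplified placeholders, not the exact production code:

```javascript
// Simplified two-stage pipeline: Whisper transcribes the audio,
// then a text model cleans the transcript up.
// Endpoint paths follow Groq's OpenAI-compatible API; the prompt
// wording is illustrative, not the shipped prompt.
const GROQ_BASE = "https://api.groq.com/openai/v1";

// Build the messages for the cleanup pass (pure, easy to test).
function buildCleanupMessages(rawTranscript) {
  return [
    {
      role: "system",
      content:
        "Clean up this dictated text: fix punctuation and obvious mis-hearings, keep the original wording.",
    },
    { role: "user", content: rawTranscript },
  ];
}

// Stage 1: audio -> raw transcript.
async function transcribe(audioBlob, apiKey) {
  const form = new FormData();
  form.append("file", audioBlob, "audio.m4a");
  form.append("model", "whisper-large-v3-turbo");
  const res = await fetch(`${GROQ_BASE}/audio/transcriptions`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  const { text } = await res.json();
  return text;
}

// Stage 2: raw transcript -> cleaned-up text.
async function cleanUp(rawTranscript, apiKey) {
  const res = await fetch(`${GROQ_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "openai/gpt-oss-120b",
      messages: buildCleanupMessages(rawTranscript),
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Keeping the two stages separate is what makes the per-use-case behavior (below) possible: the cleanup prompt can change without touching the transcription step.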
I intentionally went with a cloud pipeline instead of on‑device models so I can:
- Parallelize work on the backend and tune the pipeline.
- Mix and match providers and models.
- Improve latency without shipping new app versions.
After transcription, there’s a “share to AI chat” flow so you can send it with one tap to ChatGPT, Claude, etc.
Context rules
One feature I missed in Whisper Flow was configurable context rules (similar to the Super Whisper Modes). AIDictation lets you define how transcription should behave depending on what you’re doing.
For example:
- Meetings: keep speaker names and timestamps.
- Coding: preserve technical terms and code formatting.
- Journaling: be more forgiving, add punctuation, make the text more readable.

You can configure different presets and switch between them.
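Under the hood a preset mostly boils down to a different set of instructions handed to the cleanup model. A minimal sketch (the preset names mirror the examples above; the rule wording and function names are illustrative, not the shipped prompts):

```javascript
// Context presets: each one maps to instructions for the cleanup model.
// Wording here is illustrative, not the production prompts.
const PRESETS = {
  meetings: "Keep speaker names and timestamps exactly as spoken.",
  coding: "Preserve technical terms, identifiers, and code formatting.",
  journaling: "Be forgiving: add punctuation and make the text readable.",
};

// Resolve the active preset into a system prompt, falling back to a
// neutral default when the preset is unknown.
function systemPromptFor(presetName) {
  const rules =
    PRESETS[presetName] ??
    "Transcribe faithfully with standard punctuation.";
  return `You clean up dictated text. ${rules}`;
}
```

Switching presets in the app then just changes which prompt the backend uses for the cleanup pass.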
Why cloud instead of on‑device
A lot of apps focus on running models locally. I chose the opposite trade‑off:
- Provider flexibility: right now I’m using the Groq API because, in my tests, it had the best end‑to‑end latency (700-800ms), but the backend is built to swap providers and models.
- Privacy: this does mean audio leaves the device, so I tried to be very explicit about data handling.
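The "swap providers" part is a thin abstraction on the backend: every provider exposes the same shape, so changing vendors is a config change rather than a rewrite. Roughly (the registry entries, env var name, and option shape are illustrative, not the actual code):

```javascript
// Each provider exposes the same transcribe() shape, so swapping
// Groq for another vendor is a config change, not a rewrite.
// Registry entries and the env var name are illustrative.
const providers = {
  groq: {
    model: "whisper-large-v3-turbo",
    transcribe: async (audio, opts) => {
      /* call Groq's transcription endpoint here */
    },
  },
  // e.g. a future alternative vendor would slot in alongside:
  // otherVendor: { model: "...", transcribe: async (audio, opts) => { ... } },
};

// Pick the provider from config, defaulting to Groq.
function getProvider(name = process.env.TRANSCRIBE_PROVIDER ?? "groq") {
  const provider = providers[name];
  if (!provider) throw new Error(`Unknown transcription provider: ${name}`);
  return provider;
}
```

This is also what lets me tune latency server-side without shipping a new app version: the client only ever talks to my backend, never to a provider directly.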
No registration needed. You get about 2,000 words per month for free without creating an account or giving an email.
Tech stack
- Client: Swift (first real Swift/iOS app I’ve shipped).
- Backend: NodeJS on Vercel.
- Models: Whisper V3 Turbo + OpenAI GPT OSS 120B.
- Provider: Groq API at the moment, mainly for latency reasons.
I’ve been using AIDictation daily for the past couple of weeks, and I’m happy with it so far, but I’d really like candid feedback from HN—both on the product and on the implementation.