Building CallSpark (browser-based VoIP): what I learned and what caused pain
I’ve been building CallSpark, a browser-based calling platform. The core VoIP flow through Twilio’s browser SDK worked as advertised. The hard parts were elsewhere.

The easy part

Twilio’s SDK was straightforward. Initialize a Device, pass in a token, call connect. WebRTC and audio quality held up without much tuning.

The real problems

1. Call state inside the browser

The SDK emits events at the right times, but React state doesn’t always reflect what the call is actually doing. I wanted the call to stay connected while the user navigates to other pages in the app. As UI components mount, unmount, or re-render, their state drifts: a call may still be active while the UI thinks it’s disconnected, or vice versa.

I ended up keeping a global reference to the active call and broadcasting custom events so every component could stay aligned. It’s not elegant, but it’s the only approach that consistently kept the UI accurate.
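A minimal sketch of that idea, not the actual CallSpark code: one module-level store holds the call state outside any component and notifies subscribers on every change. All names here (CallStore, CallSnapshot, the status values) are hypothetical, and wiring the store to the Twilio SDK's call events is left out. The timestamp it records is a client-side estimate only.

```typescript
// Hypothetical names throughout; nothing here is part of Twilio's SDK.
type CallStatus = "idle" | "connecting" | "active" | "disconnected";

interface CallSnapshot {
  status: CallStatus;
  startedAtMs: number | null; // client-side estimate, not suitable for billing
}

type Listener = () => void;

class CallStore {
  private snapshot: CallSnapshot = { status: "idle", startedAtMs: null };
  private listeners = new Set<Listener>();

  // Returns the same object until the state changes, so callers can
  // compare snapshots by reference.
  getSnapshot = (): CallSnapshot => this.snapshot;

  // Registers a listener and returns a function that removes it.
  subscribe = (listener: Listener): (() => void) => {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  };

  // Called from the SDK event handlers (assumed wiring, not shown).
  setStatus(status: CallStatus, nowMs: number = Date.now()): void {
    if (status === this.snapshot.status) return; // ignore duplicate events
    let startedAtMs = this.snapshot.startedAtMs;
    if (status === "active") startedAtMs = nowMs;
    if (status === "idle") startedAtMs = null;
    this.snapshot = { status, startedAtMs };
    this.listeners.forEach((listener) => listener());
  }
}

// One instance for the whole app; it outlives any component that reads it.
const callStore = new CallStore();
```

Because subscribe and getSnapshot have the shape React's useSyncExternalStore expects, a component should be able to read the store with that hook instead of copying call state into its own useState. It is still a global object, just one with a single write path.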

Twilio also doesn’t provide a reliable “call started” timestamp, so I track it myself, but I obviously can’t use that value for billing.

2. Real-time billing

I wanted per-second (or per-minute) billing while the call is active. Twilio’s post-call webhook isn’t useful for real-time checks because it arrives only after the call finishes, and it does not always contain the call cost.

My backend runs a periodic worker that finds active calls, computes how much time has passed since the last charge, deducts credits, and ends the call if the user runs out of balance. This part works, but you quickly run into edge cases.

If the worker misses a cycle, you have to reconcile without double-charging. The browser UI might vanish, but billing and call state must continue accurately.
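One way to make a missed cycle harmless, sketched under assumptions: charge for the time elapsed since the last successful charge rather than for a fixed interval. The types, field names, and the flat integer per-second rate below are all hypothetical, and the in-memory objects stand in for whatever database rows the real worker would update in a transaction.

```typescript
// Hypothetical shapes; a real worker would load and update these atomically.
interface ActiveCall {
  id: string;
  ratePerSecondCents: number; // assumed flat integer rate
  lastChargedAtMs: number;    // end of the last period already billed
  chargedCents: number;       // running total debited for this call
}

interface Account {
  balanceCents: number;
}

interface TickResult {
  chargedCents: number;
  shouldHangUp: boolean;
}

// Bills the whole seconds that passed since the last charge. If a worker
// cycle was skipped, the next tick simply covers a longer period, so no
// period is billed twice and none is skipped.
function chargeElapsed(call: ActiveCall, account: Account, nowMs: number): TickResult {
  const elapsedSeconds = Math.floor((nowMs - call.lastChargedAtMs) / 1000);
  if (elapsedSeconds <= 0) {
    return { chargedCents: 0, shouldHangUp: false };
  }
  const dueCents = elapsedSeconds * call.ratePerSecondCents;
  const chargedCents = Math.min(dueCents, account.balanceCents);
  account.balanceCents -= chargedCents;
  call.chargedCents += chargedCents;
  // Advance by the seconds considered, not to nowMs, so leftover
  // milliseconds carry into the next tick.
  call.lastChargedAtMs += elapsedSeconds * 1000;
  const outOfCredit = chargedCents < dueCents || account.balanceCents === 0;
  return { chargedCents, shouldHangUp: outOfCredit };
}
```

With a rate of 2 cents per second, a tick at 10 s charges 20 cents; if the next tick only runs at 30.5 s, it charges 40 cents for the 20 whole seconds since the last charge and carries the remaining half second forward.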

The solution I found was partial billing (roast me on this if you want). A worker runs every ten seconds and looks at all active calls. If the user has enough balance to cover the next ten seconds, the call continues. If not, it ends. After each interval, the worker reserves the cost for the previous ten seconds. When the call ends, Twilio notifies us via webhook, and I reconcile the amount already debited from the user’s credits against the amount that should actually be debited.
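The final reconciliation step comes down to simple arithmetic, sketched here under assumptions: the amount reserved during the call is compared with the cost of the final duration, and the difference is charged or refunded. The function name, the flat per-second rate, and rounding the duration up to whole seconds are illustrative choices, not the actual CallSpark rules.

```typescript
interface Reconciliation {
  actualCents: number;     // what the call should cost in total
  adjustmentCents: number; // positive: debit more, negative: refund
}

// reservedCents: total already debited by the periodic worker.
// finalDurationSeconds: duration reported when the call ends (assumed input).
// ratePerSecondCents: hypothetical flat integer rate.
function reconcile(
  reservedCents: number,
  finalDurationSeconds: number,
  ratePerSecondCents: number
): Reconciliation {
  // Assumption: partial seconds are billed as a full second.
  const billableSeconds = Math.ceil(finalDurationSeconds);
  const actualCents = billableSeconds * ratePerSecondCents;
  return { actualCents, adjustmentCents: actualCents - reservedCents };
}

// Three 10-second intervals were reserved at 2 cents per second (60 cents),
// but the call actually lasted 25 seconds (50 cents), so 10 cents go back.
const example = reconcile(60, 25, 2);
console.log(example.adjustmentCents); // -10
```

Keeping the adjustment as one signed number means a single ledger entry per call can settle the difference, which also makes a repeated webhook delivery easier to detect and ignore.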

What I’m still solving

A better pattern for keeping client-side call state aligned without falling back to global objects.

A clean way to handle situations where the browser UI disappears mid-call but the call itself continues normally on the backend.

Questions for the community

> If you’ve built real-time billing for VoIP, how did you structure it?
> How did you maintain accurate browser UI state for long-lived calls?
> Any lessons from working with Twilio’s browser SDK?
