Httpz – Zero-Allocation HTTP/1.1 Parser for OxCaml
84 points
4 days ago
| 6 comments
| github.com
| HN
avsm
1 day ago
[-]
(author here) I'm just adding data-race free parallelism support to this right now to switch my website over to using it! For those familiar with OCaml syntax, the OxCaml parse function is fun:

    val parse : buffer -> len:int -> #(status * request * header list) @ local
This takes in a buffer and returns an unboxed tuple on the stack, so there's no GC activity involved beyond stack management for each HTTP request.

https://github.com/avsm/httpz/blob/main/lib/httpz.mli#L154

reply
cdaringe
1 day ago
[-]
Interesting parser, fun to read.
reply
henearkr
1 day ago
[-]
Oh I got the joke! (I'm pretty sure it was intended)

Yes a parser is a fun to read ;)

reply
ptrwis
1 day ago
[-]
Doesn't (honest question) the operating system kernel prevent data races in memory accesses at the level of system calls like brk? I wonder at what level the operating system handles such things?
reply
ptrwis
1 day ago
[-]
I mean, aren't system calls thread-safe?
reply
spooneybarger
20 hours ago
[-]
As a general rule, not all system calls are thread safe.
reply
nospice
1 day ago
[-]
"Zero heap allocations: Parser results are stack-allocated using OxCaml unboxed records and local lists" - honest question, why?

On almost any platform on which you want to run a HTTP server - including bare metal - it usually doesn't matter if you keep state near the stack pointer or not. What matters is that you use it well, making it play well with CPU caches, etc. Or is there something specifically horrible about OxCaml's heap allocator?

reply
avsm
1 day ago
[-]
In a conventional GCed language, you need to minimise heap allocations to avoid putting too much pressure on the garbage collector. The OxCaml extensions allows values to be passed 'locally' (that is, on the callstack) as an alternative to heap allocation. When the function returns, the values are automatically deallocated (and the type system guarantees safety).

This means that I can pass in a buffer, parse it, do my business logic, and then return, without ever allocating anything into the global heap. However, if I do need to allocate into it (for example, a complex structure), then it's still available.

It's kind of Rust in reverse: OxCaml has a GC by default, but you can write very high-performance code that effectively never uses a GC. There's also emerging support for data-race-free parallelisation as well.

The webserver I'm putting together also uses io_uring, which provides zero-copy buffers from kernel to userspace. The support for one-shot effect handlers in OCaml allows me to directly resume a blocked fiber straight from the io_uring loop, and then this httpz parser operates directly on that buffer. Shared memory all the way with almost no syscalls!

reply
noelwelsh
22 hours ago
[-]
I think there are several advantages of stack allocation:

* freeing stack allocated memory is O(1) with a small constant factor: simply set the stack pointer to a new location. In a generational garbage collector, like OCaml, minor garbage collection is O(amount of retained memory) with a larger constant factor.

* judiciously stack allocating memory can improve data locality.

* unboxed data takes up less space, again improving locality.

Overall, I think this about improving constant factors---which makes a big difference in practice!

reply
ori_b
23 hours ago
[-]
Unboxed records are fine, but stack-allocated lists make me nervous. What happens when someone gives you 8 megs of headers, and you run out of stack?

This code seems to put a 32k limit on it, but it's a manual check and error return. What about code that forgets to manually add that limit, or sets it too high? How do you decide when to bump that limit, since 32k is an artificial constraint?

reply
outpost_mystic2
23 hours ago
[-]
By default in oxcaml, "stack" / local allocations happen in a separate stack on the heap (which the runtime allocates for you). If you allocate enough to exceed that capacity, it will resize it dynamically for you.
reply
naasking
20 hours ago
[-]
So stack-local arena. Neat.
reply
messe
23 hours ago
[-]
> Local lists (@ local): Header list grows on the stack, not heap

Does this mean unbounded stack growth? I'd much rather heap allocation if that's the case, as at least that can be recovered from in the case of allocation failure (assuming your OS, language, and stdlib allow for it).

reply
avsm
23 hours ago
[-]
This particular iteration is unbounded, but the next step is to pass in a GADT argument to specify which headers the application wants, so only those are parsed into a heterogenous tuple.
reply
messe
22 hours ago
[-]
That sounds like a rather elegant solution to it.
reply
spooneybarger
20 hours ago
[-]
The OxCaml work is great. I don't use OCaml much but I have been following along with OxCaml as they are doing fascinating work that leverages a lot of research that interests me.
reply
gethly
1 day ago
[-]
ocaml looks more like a spec than actual code.
reply
beckford
1 day ago
[-]
If you were looking at the parse link in the author's comment, you were looking at a spec (called a module interface in OCaml/OxCaml, similar to an interface in Java). The parse implementation is at https://github.com/avsm/httpz/blob/240051dd5f00281b09984a14a...

That said, I would be happy if all I needed to type in was a spec.

reply
infamouscow
1 day ago
[-]
I'm excited to see what comes of OxCaml the next few years.
reply