Show HN: jsoon, a streaming JSON parser and query engine in C
Hi HN. I’ve been working on jsoon, a JSON query engine in one C file. The target use case is large JSON where I only need one or two fields out of the document. It doesn’t build a DOM. It scans for structure, skips subtrees when it can, and stops as soon as it finds the result. The CPU path uses AVX2/PCLMUL, and there’s also an optional CUDA path that does structural indexing on the GPU.
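
To make the "skip subtrees and stop early" idea concrete, here is a minimal scalar sketch. It is not jsoon's code and uses no SIMD; the names `skip_ws`, `skip_string`, `skip_value`, and `find_field` are hypothetical and exist only for this illustration. It assumes a NUL-terminated buffer, looks up a single top-level key, compares keys as raw bytes without decoding escapes, and does no validation beyond what it needs in order to skip.

```c
#include <stddef.h>
#include <string.h>

/* Advance past JSON whitespace. */
static const char *skip_ws(const char *p) {
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') p++;
    return p;
}

/* p points at an opening quote. Return a pointer just past the closing
 * quote, or NULL if the string is unterminated. */
static const char *skip_string(const char *p) {
    for (p++; *p; p++) {
        if (*p == '\\') {
            if (!p[1]) return NULL;
            p++;                      /* step over the escaped character */
        } else if (*p == '"') {
            return p + 1;
        }
    }
    return NULL;
}

/* Skip one value without building anything. Containers are skipped by
 * counting nesting depth; brackets inside strings are ignored because
 * strings are skipped as a unit. Bracket kinds are not cross-checked. */
static const char *skip_value(const char *p) {
    p = skip_ws(p);
    if (*p == '"') return skip_string(p);
    if (*p == '{' || *p == '[') {
        int depth = 0;
        while (*p) {
            if (*p == '"') {
                p = skip_string(p);
                if (!p) return NULL;
                continue;
            }
            if (*p == '{' || *p == '[') depth++;
            else if ((*p == '}' || *p == ']') && --depth == 0) return p + 1;
            p++;
        }
        return NULL;                  /* ran out of input inside a container */
    }
    /* Scalar (number, true, false, null): run to the next delimiter. */
    const char *start = p;
    while (*p && *p != ',' && *p != '}' && *p != ']' &&
           *p != ' ' && *p != '\t' && *p != '\n' && *p != '\r') p++;
    return p > start ? p : NULL;
}

/* Find a top-level key in an object. On success, return a pointer to the
 * raw value text and store its length in *len. Returns as soon as the key
 * matches, so the rest of the document is never read. */
const char *find_field(const char *json, const char *key, size_t *len) {
    size_t klen = strlen(key);
    const char *p = skip_ws(json);
    if (*p != '{') return NULL;
    p = skip_ws(p + 1);
    while (*p == '"') {
        const char *kend = skip_string(p);
        if (!kend) return NULL;
        int match = (size_t)(kend - p - 2) == klen &&
                    memcmp(p + 1, key, klen) == 0;
        p = skip_ws(kend);
        if (*p != ':') return NULL;
        const char *val = skip_ws(p + 1);
        const char *vend = skip_value(val);
        if (!vend) return NULL;
        if (match) {                  /* early exit: stop scanning here */
            *len = (size_t)(vend - val);
            return val;
        }
        p = skip_ws(vend);
        if (*p != ',') return NULL;   /* end of object, or malformed input */
        p = skip_ws(p + 1);
    }
    return NULL;
}
```

Calling `find_field(doc, "b", &n)` on `{"a": {"x": [1, "}"]}, "b": 42, ...}` steps over the whole `"a"` subtree without interpreting it, returns a pointer to `42`, and never looks at anything after it. That last point is also why a lookup like this can succeed on a document whose tail is malformed, which a fully validating parser would reject.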

It has plenty of rough edges. The query language is limited, portability is limited, docs are still thin, and I would not treat it as a general-purpose replacement for simdjson, yyjson, or RapidJSON.

The benchmark numbers also need caveats. On selective queries jsoon can bail out early, while those libraries are doing full parsing and validation. So the large speedups are mostly about workload and architecture, not a claim that this is just a better JSON parser.

Posting because I think the implementation is interesting and I’d rather get criticism now than after spending more time on it. I’d especially like feedback on correctness, the SIMD/CUDA approach, and whether the benchmarks are framed in a fair way.
