Gorilla: A fast, scalable, in-memory time series database (2016)

21 points

by xnorswap

4 days ago

| 2 comments

| blog.acolyer.org

x-yl

6 hours ago

The simplicity of Gorilla is attractive, but for better compression ratios without too much extra compute I'd instead recommend Sprintz: https://github.com/dblalock/sprintz

The downsides are that (a) Sprintz requires the data to be quantised to fixed-point integers, which is usually fine if the data is coming out of a sensor of some sort, and (b) the Huffman coding step of Sprintz requires dynamic memory allocation, whilst Gorilla is almost trivially implemented without it.

Also see Chimp, which proposes some small tweaks to Gorilla to improve its performance: https://dl.acm.org/doi/abs/10.14778/3551793.3551852
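
For anyone unfamiliar with why Gorilla is so simple to implement: its value compression XORs each float with the previous one and stores only the non-zero window of bits. A rough sketch of that first step follows. It only computes the window sizes and does not emit Gorilla's actual bitstream, and the function names are made up for illustration.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_windows(values):
    """For each value after the first, XOR it with its predecessor and
    return (leading_zeros, trailing_zeros, meaningful_bits).

    Hypothetical helper: a real encoder would go on to write control
    bits plus the meaningful bits; that step is omitted here.
    """
    out = []
    prev = float_to_bits(values[0])
    for v in values[1:]:
        cur = float_to_bits(v)
        x = prev ^ cur
        if x == 0:
            # Identical value: nothing meaningful to store.
            out.append((64, 0, 0))
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1  # index of lowest set bit
            out.append((leading, trailing, 64 - leading - trailing))
        prev = cur
    return out


print(xor_windows([12.0, 12.0, 24.0]))
```

Slowly changing series tend to produce XORs with long runs of zeros on both sides, which is where the savings come from, and nothing here needs a heap allocation.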

mgaunard

4 hours ago

How does it compare to DuckDB?

phrotoma

3 hours ago

It doesn't really, except I suppose that both are called "databases". DuckDB is intended for OLAP, while Gorilla is specifically designed for time series data. You would never do something like "INSERT INTO users ..." with Gorilla.

tosh

50 minutes ago

DuckDB also has as-of joins

https://duckdb.org/docs/current/guides/sql_features/asof_joi...

Are there workloads time series databases can do where DuckDB would be a bad fit?
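
For context, an as-of join pairs each row with the most recent row from another table at or before its timestamp. The semantics can be sketched in plain Python as below. This is not how DuckDB implements it; the function name, the tuple layout, and the choice to keep unmatched rows (a left-join flavour) are assumptions for illustration.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Attach to each (ts, value) in `left` the latest value from `right`
    whose timestamp is <= ts, or None if there is no such row.

    Assumes `right` is sorted by timestamp in ascending order.
    """
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, lval in left:
        # Number of right rows with timestamp <= ts.
        i = bisect_right(right_ts, ts)
        rval = right[i - 1][1] if i > 0 else None
        joined.append((ts, lval, rval))
    return joined


trades = [(5, "a"), (10, "b"), (1, "c")]
prices = [(2, 100), (7, 101), (10, 102)]
print(asof_join(trades, prices))
```

The trade at timestamp 1 predates every price, so it comes back with None rather than being dropped.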

tosh

1 hour ago

afaiu DuckDB doesn't do delta of delta for timestamps

but it can do delta and bitpacking which is also kinda neat
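
To make the difference concrete, here is a small sketch comparing plain delta with delta of delta on near-regular timestamps. The bit-width estimate uses a simple frame-of-reference scheme (subtract the minimum, then pack at a fixed width). That is an assumption chosen for illustration, not DuckDB's or Gorilla's actual codec, and the sample timestamps are made up.

```python
def deltas(xs):
    """Differences between consecutive elements."""
    return [b - a for a, b in zip(xs, xs[1:])]


def packed_width(values):
    """Bits per value if each value is stored as an offset from the
    minimum (frame of reference) at a fixed width.
    Hypothetical helper for illustration only.
    """
    return (max(values) - min(values)).bit_length()


# Samples roughly every 60 seconds, with a little jitter.
timestamps = [1000, 1060, 1120, 1181, 1240]

d = deltas(timestamps)  # first-order deltas
dod = deltas(d)         # delta of delta

print(d, packed_width(d))
print(dod, packed_width(dod))
```

With a frame of reference, plain deltas already pack tightly on regular data. Delta of delta pays off mainly with variable-length codes like Gorilla's, where the common case of an exactly zero second-order delta can be stored in a single bit.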