I found another blog post from Turso where they say they offer three options on conflict: drop it, rebase it (and hope for no conflict?), or "handle it yourself".
Writing an offline sync isn't hard. Dealing with conflicts is a PITA.
https://turso.tech/blog/introducing-offline-writes-for-turso...
How you reconcile many copies of the same record could depend on time of action, server location, authority level of the user, causality between certain business events, enabled account features, prior phase of the moon, etc.
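To make that concrete, here is a minimal Python sketch of two reconciliation policies: naive last-write-wins versus a domain rule where a user's authority level trumps recency. The field names and authority scheme are illustrative assumptions, not any particular product's model.

```python
from dataclasses import dataclass

@dataclass
class Version:
    value: str
    timestamp: float   # wall-clock time of the write
    authority: int     # e.g. 0 = customer, 1 = support, 2 = admin

def last_write_wins(a: Version, b: Version) -> Version:
    """Naive reconciliation: keep the most recent write."""
    return a if a.timestamp >= b.timestamp else b

def authority_wins(a: Version, b: Version) -> Version:
    """Domain-specific reconciliation: a higher-authority user's write
    beats a more recent but lower-authority one."""
    if a.authority != b.authority:
        return a if a.authority > b.authority else b
    return last_write_wins(a, b)
```

The point is that the second policy cannot be derived from timestamps alone; it encodes a business rule, which is exactly why "handle it yourself" ends up being the only general answer.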
Whether or not offline sync can even work is very much a domain-specific concern. You need to talk to the business about the pros & cons first. For example, they might not like the semantics regarding merchant terminals and offline processing. I can already hear the "what if the terminal never comes back online?" afternoon meeting arising out of that one.
But living inside of those rules (and sometimes just understanding those rules) can be a big ask in some situations, so you have to know what you are doing.
> Local-first architectures allow for fast and responsive applications that are resilient to network failures
So are we talking about apps that can work for days and weeks offline and then sync a lot of data at once, or are we talking about apps that can survive a few seconds glitch in network connectivity? I think that what is promised is the former, but what will make sense in practice is the latter.
In my experience, it can affect the architecture and performance in a significant way. If a client can go offline for an arbitrary period of time, doing a delta sync when it comes back online is trickier, since we need to sync a specific range of operation history (adjusted for the specific scope/permissions the client has access to). If you scale a system up to thousands or millions of clients, having them all run arbitrary range queries doesn't scale well. For this reason I've seen sync engines simply force a client to do a complete re-sync if it "falls behind" with deltas for too long (e.g. more than a day or so). Short of that, maintaining an operation log that is set up and indexed for querying arbitrary ranges of operations (for a specific scope of data) works well.
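A minimal sketch of that operation-log approach, assuming a single global sequence number, a per-scope index, and a fall-behind threshold. The table name, scope column, and `MAX_LAG` value are made up for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE op_log (
    seq   INTEGER PRIMARY KEY AUTOINCREMENT,  -- global ordering of ops
    scope TEXT NOT NULL,                      -- tenant/user the op belongs to
    op    TEXT NOT NULL                       -- serialized operation
);
-- Composite index so (scope, seq-range) delta queries stay cheap.
CREATE INDEX op_log_scope_seq ON op_log (scope, seq);
""")

MAX_LAG = 10_000  # ops; beyond this, force a full re-sync instead

def delta_sync(client_scope: str, last_seen_seq: int):
    latest = db.execute("SELECT COALESCE(MAX(seq), 0) FROM op_log").fetchone()[0]
    if latest - last_seen_seq > MAX_LAG:
        return ("full_resync", None)  # client fell too far behind
    rows = db.execute(
        "SELECT seq, op FROM op_log WHERE scope = ? AND seq > ? ORDER BY seq",
        (client_scope, last_seen_seq),
    ).fetchall()
    return ("delta", rows)
```

The trade-off is visible in `MAX_LAG`: raising it means keeping (and indexing) more history per scope; lowering it pushes more clients onto the expensive full re-sync path.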
In general this seems to work only if there's a single offline client that accepts the writes.
With limitations to the data schema (e.g. distinct tables per client), it might work with multiple clients. However, those limitations would need to be documented, and I couldn't see anything about them in this blog post.
If I build an offline-first app using Turso, will my client directly exchange data with the database, without a layer of backend APIs to guarantee data integrity and security? For example, certain DB writes are only permitted for certain users, but when the DB API is exposed, will that cause problems? A concrete example would be a forum where only moderators can remove users and posts. If I build an offline-first forum, can a hacker tamper with the database on the filesystem and use the syncing feature to propagate the tampered data to the server?
In Replicache, we addressed this by making your application server responsible for writes:
https://doc.replicache.dev/concepts/how-it-works
By doing this, your server can implement any validation it wants. It can also interact with external systems, do notifications, etc. Anything you can do with a traditional API.
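A rough sketch of that server-authoritative pattern: the client pushes named mutations, and the server replays each one through its own validation before touching the data. The `handle_push` function, mutator registry, and role check below are illustrative assumptions, not Replicache's actual API:

```python
# In-memory stand-in for server state.
posts = {1: {"author": "alice", "text": "hi"}}

def delete_post(user: dict, post_id: int) -> None:
    # Server-side validation: the client's claim to be allowed is ignored.
    if user["role"] != "moderator":
        raise PermissionError("only moderators can delete posts")
    posts.pop(post_id, None)

MUTATORS = {"delete_post": delete_post}

def handle_push(user: dict, mutations: list) -> None:
    """Replay client-submitted mutations under server authority."""
    for m in mutations:
        MUTATORS[m["name"]](user, *m["args"])
```

Because every write funnels through `handle_push`, the server can also trigger notifications, call external systems, or reject the whole batch, just like a traditional API endpoint.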
In our new sync engine, Zero (https://zerosync.dev), we're adding this same ability soon (like this week) under the name custom mutators:
https://bugs.rocicorp.dev/issue/3045
This has been a hard project, but it is really critical for using sync engines for anything serious.
Ex. if I'm doing a document-based app, users can have at it, corrupt their own data all they want.
I honestly cannot wrap my mind around discussions re: SQLite x web dev, perhaps because I've been in mobile dev. I don't even know what it would mean to have an "offline-first forum" that syncs state: a forum is a global object with shared state rendered on the client.
Setting aside the hack scenario, a simpler question emerges: how would my clients sync the whole forum back to the cloud? Generally, my inclination is to handwave about users being able to make posts and have it "just work"; after all, can't Turso help with simple scenarios like a posts table that has a date column? That makes it virtually conflict-free... but my experience is that "virtually" bites you, hard.
1. At the database level: using something like RLS in Postgres.
2. At the backend level: the sync engine processes write operations via the backend API, where custom validation and authorization logic can be applied.
3. At the sync engine level: if the sync engine processes the write operations, there can be an authorization layer similar to RLS, enforced by the sync engine on the backend.
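The third option can be sketched as a rule table the sync engine consults before applying a replicated write. The rule shapes and user fields here are assumptions for illustration, not any particular engine's API:

```python
# RLS-style write policies enforced by the sync engine, not the database:
# table name -> predicate(user, row) that must hold for the write to apply.
RULES = {
    "posts": lambda user, row: (
        row["author_id"] == user["id"] or user["is_moderator"]
    ),
}

def authorize_write(user: dict, table: str, row: dict) -> bool:
    """Deny by default: tables without a rule accept no replicated writes."""
    rule = RULES.get(table)
    return bool(rule and rule(user, row))
```

Unlike database-level RLS, these predicates can see application-level context (session, feature flags, external services), at the cost of having to keep them in sync with whatever the database itself would allow.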
With Turso, you would want a model that has, for example, a DB per user and one per group. With the Turso model, think of it as something closer to sharding by hand to get secure writes per user or group.
I could be wrong on this though. That's just my rough understanding.
What I would really love is a sync engine library that is agnostic of your database.
Haven't really seen one yet.
Most apps have user data that needs to be (partially or fully) shielded from other users. Yet, most local-first libs neglect to explain how to implement this with their libraries, or sometimes it's an obscure page or footnote somewhere in their docs, as if this is just an afterthought...
Perhaps they have RLS type policies that are only modifiable on the server.
Meaning the user has effectively direct access to the underlying local database. Which, if blindly and totally synced, gives the user effectively direct access to the central database.
I'd have thought that in this day and age every developer would know the importance of not trusting frontend validation in a web application? Your doubt has given me some pause.
> we're working hard on the following features:
>
> Automatic and manual conflict resolution
As such everything within this thread is conjecture unless otherwise informed by their work.
https://benwilber.github.io/programming/2025/03/31/sqlite-is...
I think a lot of people just don't realize how few resources Postgres or MySQL use, and how fast they are. You can run Apache and MySQL and a scripting language on a tiny little 512 MB memory instance, and serve some decent traffic. It works great.
Wanting to use SQLite and deal with replication is a nightmare. I don't get it. (And I love using SQLite in apps and scripts. But not websites!)
Vanilla SQLite improved write concurrency a long time ago when it introduced WAL in 2010, letting readers proceed alongside a single writer (it still allows only one writer at a time). Does Turso not support this?
Is the issue with foreign keys that they're not enabled by default?
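For reference, both behaviors are per-connection PRAGMAs in stock SQLite; a quick Python sketch (note that WAL allows many readers alongside a single writer, not multiple concurrent writers):

```python
import sqlite3

db = sqlite3.connect(":memory:")

# WAL mode (SQLite 3.7.0, 2010): readers no longer block the writer.
# A file-backed database reports "wal" here; an in-memory one reports
# "memory" because WAL requires a file.
db.execute("PRAGMA journal_mode=WAL")

# Foreign key enforcement is OFF by default for backwards compatibility
# and must be enabled on each connection.
assert db.execute("PRAGMA foreign_keys").fetchone()[0] == 0
db.execute("PRAGMA foreign_keys=ON")
assert db.execute("PRAGMA foreign_keys").fetchone()[0] == 1
```

So yes: foreign keys are opt-in per connection, which client libraries sometimes do (or don't do) on your behalf.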
In the example, it shows syncing returning a promise. Is there no way to track the progress of the sync?
libSQL does not support realtime/reactivity features. There have been attempts (by both our team and the community), but we encountered difficulties and ultimately abandoned the idea. I can check with the team for the exact reasons.
However, Limbo (our SQLite rewrite in Rust) will include this feature, though there is no ETA yet. It's currently being discussed here: https://github.com/tursodatabase/limbo/discussions/1149. Please share your thoughts and use cases.
To me it’s more general than that though. The expectation of reactive behavior comes up everywhere, from chat apps to collaborative tools to even games. Polling and refreshing feels very 90s and ”web page centric”, but of course in some cases (say long-form blogging or similar) it’s not a concern.
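Until reactivity lands, the common stopgap is exactly that polling. A minimal sketch against plain SQLite, using `PRAGMA data_version`, which changes on a connection whenever some *other* connection commits to the database (the watcher shape here is an assumption, not a library API):

```python
import sqlite3

def make_watcher(conn: sqlite3.Connection):
    """Return a zero-argument poll function reporting whether another
    connection has committed to the database since the last check."""
    state = {"version": conn.execute("PRAGMA data_version").fetchone()[0]}

    def changed() -> bool:
        current = conn.execute("PRAGMA data_version").fetchone()[0]
        if current != state["version"]:
            state["version"] = current
            return True
        return False

    return changed
```

A UI would call `changed()` on a timer and re-run its queries when it returns True, which is cheap but still fundamentally polling, not push.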
Mentioning this as an example use case is simply untrue, right?
libSQL is our fork of SQLite. It adds new features on top of SQLite, such as vector search, replication, server mode, and more. libSQL is open source (MIT license).
The Turso SaaS platform, which provides hosted databases, is not open source.
Limbo (which will be renamed to Turso in the future) is our Rust rewrite of SQLite. It is also open source (MIT license) - https://github.com/tursodatabase/limbo
Again, opinion. It's core infra, and in my opinion at least, that should never depend on others; if the others inevitably screw you over for a few $, you need to be able to move without potentially bankrupting your company.
In fairness, though, their `libsql::Builder::new_remote_replica` works with the bare `sqld`.
What makes you think that? Limbo is MIT licensed and I don't see any CLA on [1].
[1]: https://github.com/tursodatabase/limbo/blob/main/CONTRIBUTIN...
I know it's super lame to regurgitate LLM output in comments, but I found it interesting that Grok was able to infer [2] what I was talking about when I made this accusation on X some months ago, and it absolutely nailed basically everything I had in mind:
---
Q: Why might Adam think Limbo is an example of "EEE" so "brazen" as to resemble Microsoft in the 90s?
A: Adam Rezich's comment suggests that Pekka Enberg's project, Limbo, could be seen as an example of "Embrace, Extend, Extinguish" (EEE) tactics, which were notably used by Microsoft in the 1990s. Here's a breakdown of why Adam might think this:
1. Embrace: Limbo starts by embracing SQLite, a widely used, open-source database engine. By aiming for full compatibility with SQLite, Limbo acknowledges and adopts the existing technology, similar to how Microsoft would integrate or support existing standards or technologies in their products.
2. Extend: Limbo extends SQLite by rewriting it in Rust, adding new features like built-in vector search and asynchronous I/O, which are not part of the original SQLite. This extension phase is where Microsoft historically added proprietary features or extensions to standards, making their versions more appealing or functional in certain ways.
3. Extinguish: Although Limbo hasn't reached this phase yet, the potential for extinguishing could arise if Limbo becomes superior enough to make SQLite less relevant or if it leads to a market shift where SQLite users migrate to Limbo due to its enhanced capabilities. Microsoft in the 90s would often extend standards in such a way that their versions became the de facto standard, overshadowing or making competitors' versions obsolete.
The "brazen" aspect comes from the transparency and public nature of the project. Unlike Microsoft's more secretive and strategic approach in the past, Limbo's development is open, with Enberg sharing his plans and progress publicly. This openness, combined with the ambitious goal of completely rewriting a foundational piece of software like SQLite, might seem bold or even audacious, reminiscent of Microsoft's aggressive market strategies but done in a more transparent and community-driven manner.
---
The only thing Grok didn't mention was the overt emphasis on the “community” aspect of this project, which is being promoted as being strictly, even obviously better than sqlite3's way of doing things. For me personally, sqlite3 not being a “community”-focused project is actually a huge advantage—modern programming “communities”, while being good for building hype and allegiance for one's project, generally result in very unpleasant nth-order social effects which have nothing to do with the quality of software.
[0] https://github.com/tursodatabase/libsql/blob/main/CODE_OF_CO...
Turso Offline Sync is an active-active replication system for distributed SQLite/Turso databases. IIUC, this release adds the capability to sync local writes back to the cloud [3].
[1] https://electric-sql.com
[2] https://electric-sql.com/blog/2024/11/21/local-first-with-yo...
[3] https://turso.tech/blog/turso-offline-sync-public-beta#what-...