https://github.com/dbos-inc/dbos-transact-java
Essentially, DBOS helps you write long-lived, reliable code that can survive failures, restarts, and crashes without losing state or duplicating work. As your workflows run, it checkpoints each step they take in a Postgres database. When a process stops (fails, restarts, or crashes), your program can recover from those checkpoints to restore its exact state and continue from where it left off, as if nothing happened.
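To make the checkpoint-and-recover idea concrete, here is a minimal sketch in plain Java. It does not use the DBOS API: an in-memory map stands in for the Postgres checkpoint table, and the names `CheckpointSketch`, `step`, and `workflow` are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustration only: an in-memory map stands in for the Postgres checkpoint table.
final class CheckpointSketch {
    // Key "<workflowId>/<stepNumber>" maps to the recorded output of that step.
    static final Map<String, String> checkpoints = new LinkedHashMap<>();
    static int executions = 0; // counts how many step bodies actually ran

    // Run a step once; on replay, return the recorded output instead of re-executing.
    static String step(String workflowId, int stepNumber, Supplier<String> body) {
        String key = workflowId + "/" + stepNumber;
        String saved = checkpoints.get(key);
        if (saved != null) {
            return saved; // already checkpointed: skip the work
        }
        String result = body.get();
        executions++;
        checkpoints.put(key, result);
        return result;
    }

    // A two-step workflow; crashAfterFirstStep simulates a process failure.
    static String workflow(String workflowId, boolean crashAfterFirstStep) {
        String a = step(workflowId, 1, () -> "reserved");
        if (crashAfterFirstStep) {
            throw new IllegalStateException("simulated crash");
        }
        String b = step(workflowId, 2, () -> "charged");
        return a + "," + b;
    }

    public static void main(String[] args) {
        try {
            workflow("wf-1", true); // first attempt dies after step 1
        } catch (IllegalStateException e) {
            // the process "crashed"; checkpoints survive
        }
        System.out.println(workflow("wf-1", false)); // prints "reserved,charged"
        System.out.println(executions);              // prints 2: step 1 was not re-run
    }
}
```

The second call replays the workflow function from the top, but step 1 returns its recorded result, so only step 2 does real work.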
In practice, this makes it easier to build reliable systems for use cases like AI agents, payments, data synchronization, or anything that takes hours, days, or weeks to complete. Rather than bolting on ad-hoc retry logic and database checkpoints, durable workflows give you one consistent model for ensuring your programs can recover from any failure from exactly where they left off.
This library contains all you need to add durable workflows to your program: there's no separate service or orchestrator or any external dependencies except Postgres. Because it's just a library, you can incrementally add it to your projects, and it works out of the box with frameworks like Spring. And because it's built on Postgres, it natively supports all the tooling you're familiar with (backups, GUIs, CLI tools) and works with any Postgres provider.
If you want to try it out, check out the quickstart:
https://docs.dbos.dev/quickstart?language=java
We'd love to hear what you think! We’ll be in the comments for the rest of the day to answer any questions.
I've been reading the DBOS Java documentation and have some questions, if you don't mind:
- Versioning; from the looks of it, it's either automatically derived from the source code (presumably bytecode?), or explicitly set for the entire application? This would be too coarse. I don't see auto-configuration working, as a small bug fix would invalidate the version. (Could be less of a problem if you're looking only at method signatures... perhaps add to the documentation?) Similarly, for the explicit setting, a change to the version number would invalidate all existing workflows, which would be cumbersome.
Have you considered relying on serialVersionUID? Or at least allowing explicit configuration on a per-workflow basis? Or a fallback method to be invoked if the class signatures change in a backward-incompatible way?
Overall, it looks like DBOS is fairly easy to pick up, but having a good story for workflow evolution is going to be essential for adoption. For example, if I have long-running workflows... do I have to keep the old code running until all old instances complete? Is that the idea?
- Transactional use; would it be possible to create a new workflow instance transactionally? If I am using the same database for DBOS and my application, I'd expect my app to do some work and create some jobs for later, reusing my transaction. Similarly, maybe when the jobs are running, I'd perhaps want to use the same transaction? As in, the work is done and then DBOS commits?
I know using the same transaction for both purposes could be tricky. I have, in fact, in the past, used two for job handling. Some guidance in the documentation would be very useful to have.
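To illustrate the versioning concern, here is a rough sketch of what a hash-derived version and a version-gated recovery could look like. Everything here is an assumption for illustration (hashing source text, the `recoverable` policy, all names); it is not a statement of how DBOS Java actually computes or uses versions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the DBOS API.
final class VersionGateSketch {
    static final class PendingWorkflow {
        final String id;
        final String version; // version of the code that started this workflow
        PendingWorkflow(String id, String version) {
            this.id = id;
            this.version = version;
        }
    }

    // Assumption: the version is a SHA-256 hash over the workflow sources.
    static String versionOf(String... workflowSources) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String source : workflowSources) {
                md.update(source.getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    // Assumed policy: a process resumes only workflows recorded under its own version.
    static List<String> recoverable(List<PendingWorkflow> pending, String currentVersion) {
        List<String> ids = new ArrayList<>();
        for (PendingWorkflow w : pending) {
            if (w.version.equals(currentVersion)) {
                ids.add(w.id);
            }
        }
        return ids;
    }
}
```

Under this policy, any edit to the hashed source (even a small bug fix) produces a new version, so workflows started by the old code would need an old process to finish them, which is exactly the coarseness being asked about.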
Thank you.
I was trying to think of a use case for this and I was reminded of a Sun Microsystems demo (yes, I'm that old) I saw. They paused a JVM and "slept" it, then kicked it back up on another machine almost instantly, all over the network. It was a pretty cool party trick, but then they did it when an HTTP request came in (serverless was invented a LONG time ago!). I kinda wonder if this could be used for that?
I wonder if my Solaris 7 cert is still good...
What is your input on these two topics, i.e. pull vs. push and working well with serverless workflows?
You can view your workflows and queues, search/filter them by any number of criteria, visualize graphs of workflow steps, cancel workflows, resume workflows, restart workflows from a specific step--everything you'd want.
Currently, this is available as a managed offering (Conductor - https://docs.dbos.dev/production/self-hosting/conductor), but we're also releasing a self-hostable version of it soon.
What's the best way to hear about it when it does? Maybe a newsletter I can sign up for or something.
Also we have a backronym now: Durable Backends, Observable and Simple.
one of the rough edges i've noticed w/DBOS is for workflows that span multiple services. all of the examples are contained in a single application and thus use a single dbos 'system db' instance. if you have multiple services (as you often do in the real world) that need to participate in a workflow, you really can't. you need to break it into multiple workflows and enqueue them in each service by creating an instance of the dbos client pointed at the other service's system db. aside from the obvious overhead of fragmenting a workflow into several (and that you have to push to the service instead of a worker pulling the step), that means that every service needs to be aware of, and have access to, every other service's system db. also worth noting that sharing a single system db between services was not advised when i asked.
(docs for the above: https://docs.dbos.dev/architecture#using-dbos-in-a-distribut...)
So if you were for example running a website and wanted to have a "cancellation" flow, you'd have the cancellation service with the workflow inside of it, which would have all the steps defined, like
1) disable user account
2) mark user data as archived
3) cancel recurring payments
And then each step would call the service that actually does that work, using an idempotency key. Each service might have its own internal workflows to accomplish each task. In this case step 1 would call the accounts service, step two would call the storage service, and step three would call the payment service.
But then you have a clean reusable interface in each service, as well as a single service responsible for the workflow.
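A minimal sketch of that cancellation flow, in plain Java rather than the DBOS API. The `FakeService` class, the three service instances, and the key format `<workflowId>-step-<n>` are all hypothetical; the point is only that deriving each idempotency key from the workflow ID and step number makes a retried step harmless.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: stand-ins for the accounts, storage, and payment services.
final class CancellationSketch {
    // A hypothetical downstream service that applies each idempotency key at most once.
    static final class FakeService {
        final Set<String> seenKeys = new HashSet<>();
        final List<String> effects = new ArrayList<>();

        void call(String idempotencyKey, String action) {
            if (!seenKeys.add(idempotencyKey)) {
                return; // duplicate delivery: already applied, do nothing
            }
            effects.add(action);
        }
    }

    static final FakeService accounts = new FakeService();
    static final FakeService storage = new FakeService();
    static final FakeService payments = new FakeService();

    // The workflow lives in the cancellation service; each step calls another service.
    static void cancelUser(String workflowId, String userId) {
        accounts.call(workflowId + "-step-1", "disable account " + userId);
        storage.call(workflowId + "-step-2", "archive data " + userId);
        payments.call(workflowId + "-step-3", "cancel payments " + userId);
    }

    public static void main(String[] args) {
        cancelUser("cancel-42", "user-7");
        cancelUser("cancel-42", "user-7"); // simulated retry after a crash
        System.out.println(accounts.effects); // prints "[disable account user-7]"
        System.out.println(payments.effects); // prints "[cancel payments user-7]"
    }
}
```

Each downstream service stays a plain, reusable interface; it only has to honor the idempotency key.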
How could we make this experience better while keeping DBOS a simple library? One improvement that comes to mind is to add an "application name" field to the workflows table so that multiple applications could share a system database. Then one application could directly enqueue a workflow to another application by specifying its name, and workflow observability tooling would work cross-application.
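A rough sketch of the "application name" idea, with a list standing in for the workflows table. The column name, `enqueue`, and `dequeueFor` are hypothetical and only illustrate the shape of the proposal, not an existing DBOS feature.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustration only: one shared "workflows table" serving several applications.
final class SharedQueueSketch {
    static final class Row {
        final String applicationName; // the proposed new field
        final String workflowName;
        final String workflowId;

        Row(String applicationName, String workflowName, String workflowId) {
            this.applicationName = applicationName;
            this.workflowName = workflowName;
            this.workflowId = workflowId;
        }
    }

    static final List<Row> workflowsTable = new ArrayList<>();

    // Any application can enqueue work for another one by naming it.
    static void enqueue(String targetApplication, String workflowName, String workflowId) {
        workflowsTable.add(new Row(targetApplication, workflowName, workflowId));
    }

    // Each application claims only the rows addressed to it.
    static List<String> dequeueFor(String applicationName) {
        List<String> claimed = new ArrayList<>();
        for (Iterator<Row> it = workflowsTable.iterator(); it.hasNext();) {
            Row row = it.next();
            if (row.applicationName.equals(applicationName)) {
                claimed.add(row.workflowId);
                it.remove();
            }
        }
        return claimed;
    }
}
```

With this shape, the cancellation service could enqueue a workflow for the payments application without a client pointed at a second system database, and both applications' rows stay visible in one place.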
Being a library is a pretty interesting feature! Correct, Durable Functions allows you to write task-parallel orchestrations of task-parallel 'activities' (which are stateless functions), and these orchestrations are fully persistent and resilient, like DBOS executions. It also has the concept of 'Entities', which are named objects (of a type you define) that "live forever", and serialize all method invocations, which are the only way to change their private state. These are also persistent. The Netherite paper [1], section 2, describes this model well.
So, there seems to be a pretty close correspondence between DBOS steps and DF activities, and between workflows and orchestrations. I don't know what corresponds to DF entities in the DBOS model.
[1] https://www.microsoft.com/en-us/research/wp-content/uploads/...
There's no technical reason why this couldn't be done with another database, and we may add support for more in the future (DBOS Python already supports SQLite), but we're not working on it right now.
But DBOS doesn't have an open-source release of its web UI, which is the most critical part of a workflow management tool.
Since competitors have open-sourced everything, I don't think DBOS will have a chance.
That said, while DBOS requires Postgres for its own checkpoints, it can be (and often is) used alongside other databases like MySQL for application data.
This reminds me of a product called Restate. I talked to some people at that company a while ago. Their solution is built in Rust, I think, but they have clients for all sorts of platforms, including Kotlin. Cool company, and distributed workflow engines and agentic / long-running workflows feel like a good match.
There are lots of other solutions in this space. I believe an ex-Red Hat person is working on rebooting a workflow engine called Kogito, based on something that originally lived under their umbrella.
There's a long history of very enterprisey business process management stuff here. Lots of potential for overengineered solutions.
I once got sucked into a Spring Batch-centric project and it was hopelessly overengineered for the requirements. Gave me a proper headache. Nothing was simple. Everything was littered with magic annotations causing all sorts of weird side effects. That's why I prefer declarative approaches with simple functions, which is what the Kotlin syntax enables relative to Java. You can technically do the same in Java, but it quickly becomes an unreadable mess of function chaining.
Have you run into any issues using DBOS Python with gevent? Please let us know!