Cap'n Web: a new RPC system for browsers and web servers
633 points
2 days ago
| 61 comments
| blog.cloudflare.com
mtrovo
2 days ago
[-]
The section on how they solved arrays is fascinating and terrifying at the same time https://blog.cloudflare.com/capnweb-javascript-rpc-library/#....

> .map() is special. It does not send JavaScript code to the server, but it does send something like "code", restricted to a domain-specific, non-Turing-complete language. The "code" is a list of instructions that the server should carry out for each member of the array.

> But the application code just specified a JavaScript method. How on Earth could we convert this into the narrow DSL? The answer is record-replay: On the client side, we execute the callback once, passing in a special placeholder value. The parameter behaves like an RPC promise. However, the callback is required to be synchronous, so it cannot actually await this promise. The only thing it can do is use promise pipelining to make pipelined calls. These calls are intercepted by the implementation and recorded as instructions, which can then be sent to the server, where they can be replayed as needed.

reply
mdasen
2 days ago
[-]
In C#, there are expression trees, which handle things like this; it's how Entity Framework is able to convert the lambdas it's given into SQL. This means that you can pass around code that can be inspected or transformed instead of being executed. Take this Entity Framework snippet:

    db.People.Where(p => p.Name == "Joe")
`Where` takes an `Expression<Func<T, bool>> predicate`. It isn't taking the `Func` itself, but an `Expression` of it so that it can look at the code rather than execute it. It can see that it's trying to match the `Name` field to the value "Joe" and translate that into a SQL WHERE clause.

Since JS doesn't have this, they have to pass in a special placeholder value and try to record what the code is doing to that value.
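
A rough sketch of that record-replay idea (my own illustration, not Cap'n Web's actual implementation): hand the callback a Proxy that records every property access, run it once, and keep the recorded paths as the "program" to replay:

    // Placeholder that records property accesses instead of holding real values.
    type Op = { path: string[] };

    function makeRecorder(ops: Op[], path: string[] = []): any {
      return new Proxy({}, {
        get(_target, prop) {
          if (typeof prop === "symbol") return undefined;
          const next = [...path, prop];
          ops.push({ path: next });
          return makeRecorder(ops, next);
        },
      });
    }

    // Run the callback once against the placeholder to capture "instructions":
    const ops: Op[] = [];
    const callback = (p: any) => ({ name: p.name, city: p.address.city });
    callback(makeRecorder(ops));
    console.log(ops.map(o => o.path.join(".")));
    // ["name", "address", "address.city"] -- a tiny, non-Turing-complete
    // program a server could replay for each element of an array.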

reply
squirrellous
1 day ago
[-]
Is there anything C# _doesn’t_ have? :-)

It feels like C# has an answer to every problem I’ve ever had with other languages - dynamic loading, ADTs with pattern matching, functional programming, whatever this expression tree is, reflection, etc etc. Yet somehow it’s still a niche language that isn't widely used (outside of particular ecosystems).

reply
rjbwork
1 day ago
[-]
It's one of the most widely used languages out there actually. But it's primarily used at buttoned-up and boring SMBs/enterprise back offices. We're not out here touting our new framework of the month to kafloogle the whatzit. We're just building systems with a good language and ecosystem that's getting better every year.

I've worked only at startups/small businesses since I graduated university and it's all been in C#.

reply
brainzap
1 day ago
[-]
Getting better? Many packages we've been using did a license swap on us xD

fucking nice ecosystem

reply
rjbwork
1 day ago
[-]
Fork it. At the end of the day, some guys decided they wanted to make money and the corporations profiting off their labor weren't paying up. These things don't happen in a vacuum. Does your company have a multi-thousand-dollar-a-year budget to make sure your dependencies are sustainable?
reply
garblegarble
1 day ago
[-]
>Is there anything C# _doesn’t_ have?

You were maybe already getting at it, but as a kitchen sink language the answer is "simplicity". All these diverse language features increase cognitive load when reading code, so it's a complexity/utility tradeoff

reply
dominicrose
1 day ago
[-]
As someone who dislikes clutter, in my experience it's just easier to read and write with these languages: Perl, PHP, Ruby, Python, Javascript, Smalltalk.

If you dare leave the safety of a compiler you'll find that Sublime Merge can still save you when rewriting a whole part of an app. That and manual testing (because automatic testing is also clutter).

If you think it's more professional to have a compiler I'd like to agree but then why did I run into a PHP job when looking for a Typescript one? Not an uncommon unfolding of events.

reply
FungalRaincloud
1 day ago
[-]
I'm a bit surprised that you put PHP in that list. My current workload is in it, and a relatively modern version of it, so maybe that surprise will turn around soon, but I've always felt that PHP was more obnoxious than even C to read and write.

Granted, I started out on LISP. My version of "easy to read and write" might be slightly masochistic. But I love Perl, and Python and Javascript are definitely "you can jump in and get shit done if you have worked in most languages. It might not be idiomatic, but it'll work"...

reply
dominicrose
6 hours ago
[-]
PHP is easy to get into because of the simple (and tolerant) syntax and extremely simple static typing system. The weak typing also means it's easier for beginners.

It does take roughly twice as many lines of PHP to match an equivalent Ruby or Python program (more if you add phpdoc and static types), so it is easier to read/write Ruby or Python, but only after learning the details of the language. Ruby's syntax is very expressive but very complex if you don't know it by heart.

reply
gwbas1c
1 day ago
[-]
Good abstractions around units (Apologies if there is a specific terminology that I should use.)

Specifically, I'd like to be able to have "inches" as a generic type, where it could be an int, long, float, double. Then I'd also like to have "length" as a generic type where it could be inches as a double, millimeters as a long, etc., etc.

I know they added generic math to the language in .NET 7 (C# 11), so maybe there is a way to do it?

reply
evntdrvn
1 day ago
[-]
Check out F# "units of measure" ;)
reply
vaylian
1 day ago
[-]
It's funny how C# started out as a Java clone and then added a ton of features while Java stayed very conservative with new language features. And both languages are fine.
reply
uzerfcwn
1 day ago
[-]
> Is there anything C# _doesn’t_ have?

Pi types, existential types and built-in macros to name a few.

reply
moomin
1 day ago
[-]
Sum types are the ones I really miss. The others would be nice but processing heterogeneous streams is my biggest practical issue.
reply
drysart
1 day ago
[-]
There are inherent limitations with the "execute it once and see what happens" approach; namely that any conditional logic that might be in the mapping function is going to silently get ignored. For example, `db.people.map(p => p.IsPerson ? (p.FirstName + ' ' + p.LastName) : p.EntityName)` would either be seen as reading `(IsPerson, FirstName, LastName)` or `(IsPerson, EntityName)` depending on the specific behavior of the placeholder value ... and neither of those sets is fully correct.

I wonder why they don't just do `.toString()` on the mapping function and then parse the resulting Javascript into an AST and figure out property accesses from that. At the very least, that'd allow the code to properly throw an error in the event the callback contains any forbidden or unsupported constructs.

reply
kentonv
1 day ago
[-]
The placeholder value is an RpcPromise. Which means that all its properties are also RpcPromises. So `p.IsPerson` is an RpcPromise. I guess that's truthy, so the expression will always evaluate to `(p.FirstName + ' ' + p.LastName)`. But that's going to evaluate to '[object Object] [object Object]'. So your mapper function will end up not doing anything with the input at all, and you'll get back an array full of '[object Object] [object Object]'.

Unfortunately, "every object is truthy" and "every object can be coerced to a string even if it doesn't have a meaningful stringifier" are just how JavaScript works and there's not much we can do about it. If not for these deficiencies in JS itself, then your code would be flagged by the TypeScript compiler as having multiple type errors.

reply
drysart
1 day ago
[-]
Yeah I'll definitely chalk this up to my not having more than a very very passing idea of the API surface of your library based on a quick read over just the blog post.

On a little less trivial skim over it, it looks like the intention here isn't to map property-level subsets of returned data (e.g., only getting the `FirstName` and `LastName` properties of a larger object) so much as to do joins; and it's not data entities being provided to the mapping function but RpcPromises, so individual property values aren't even available anyway.

So I guess I might argue that map() isn't a good name for the function, because it immediately made me think it's for doing a mapping transformation rather than basically just specifying a join (since you can't really transform the data), and a mapping transformation is what map() does everywhere else in Javascript. But for all I know that's more clear when you're actually using the library, so take what I think with a heaping grain of salt. ;)

reply
Aeolun
1 day ago
[-]
Couldn’t you make this safer by passing the map something that’s not a plain JS function? I confess to that being the only thing that had me questioning the logic. If I can express everything, then everything should work. If it’s not going to work, I don’t want to be able to express it.
reply
kentonv
1 day ago
[-]
I think any other syntax would likely be cumbersome. What we actually want to express here is function-shaped: you have a parameter, and then you want to substitute it into one or more RPC calls, and then compute a result. If you're going to represent that with a bunch of data structures, you end up with a DSL-in-JSON type of thing and it's going to be unwieldy.
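
For comparison, here's roughly what the same thing looks like function-shaped versus as a hypothetical DSL-in-JSON (the latter is invented purely to show the shape it would take; `friendsPromise` and `api` are the names from the blog's example):

    // Function-shaped (what .map() accepts today), as in the blog post:
    let friendsWithPhotos = friendsPromise.map(friend => ({
      friend,
      photo: api.getUserPhoto(friend.id),
    }));

    // Hypothetical DSL-in-JSON equivalent -- harder to read, write, and type-check:
    let friendsWithPhotosDsl = friendsPromise.mapDsl({
      friend: { $input: [] },
      photo: { $call: "getUserPhoto", args: [{ $input: ["id"] }] },
    });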
reply
pcthrowaway
1 day ago
[-]
I suspect there is prior work to draw from that could make this feasible for you... Have a look at how something like MongoDB handles conditional logic for example.
reply
kentonv
1 day ago
[-]
Yeah that's what I mean by DSL-in-JSON. I think it's pretty clunky. It's also (at least in Mongo's formulation, at least when I last used it ~10 years ago) very vulnerable to query injection.
reply
skybrian
1 day ago
[-]
Another way to screw this up would be to have an index counter and do something different based on the index. I think the answer is "don't do that."
reply
kentonv
1 day ago
[-]
Hmm, but I should make it so the map callback can take the index as the second parameter probably. Of course, it would actually be a promise for the index, so you couldn't compute on it, but there might be other uses...
reply
kentonv
1 day ago
[-]
> I wonder why they don't just do `.toString()` on the mapping function and then parse the resulting Javascript into an AST and figure out property accesses from that.

That sounds incredibly complicated, and not something we could do in a <10kB library!

reply
actionfromafar
1 day ago
[-]
Maybe Fabrice Bellard could spare an afternoon.
reply
sonthonax
1 day ago
[-]
On the contrary, a simple expression language is one of those things that can easily be done in that size.
reply
kentonv
1 day ago
[-]
But the suggestion wasn't to design a simple expression language.

The suggestion was to parse _JavaScript_. (That's what `.toString()` on a function does... gives you back the JavaScript.)

reply
notpushkin
1 day ago
[-]
PonyORM does something similar in Python:

    select(c for c in Customer if sum(c.orders.total_price) > 1000)
I love the hackiness of it.
reply
porridgeraisin
1 day ago
[-]
PonyORM is my favourite python ORM.

Along with https://pypi.org/project/pony-stubs/, you get decent static typing as well. It's really quite something.

reply
rafaelgoncalves
1 day ago
[-]
whoa, didn't know PonyORM, looks really neat! thanks for showing
reply
keyle
1 day ago
[-]
C#, Swift, Dart, Rust... Python. Many languages take lambda/predicate/closure as filter/where.

It generally unrolls as a `for loop` underneath, or in this case LINQ/SQL.

C# was innovative for doing it first in the scope of SQL. I remember the arrival of LINQ... Good times.

reply
sobani
1 day ago
[-]
How many of those languages can take an expression instead of a lambda?

Func<..> is a lambda that can only be invoked.

Expression<Func<..>> is an AST of a lambda that can be transformed by your code/library.

reply
Tyr42
1 day ago
[-]
R lets you do that, and it gets used by the tidyverse libraries to do things like change the scope that variables in functions are looked up in.
reply
javier2
1 day ago
[-]
I don't think C# looks at the code? I suspect it can track that you called p.Name, then generate SQL with this information?
reply
ziml77
1 day ago
[-]
The C# compiler is looking at that code. It sees that the lambda is being passed into a function which accepts an Expression as a parameter, so it compiles the lambda as an expression tree rather than behaving like it's a normal delegate.
reply
adzm
1 day ago
[-]
The lambda is converted into an Expression, basically a syntax tree, which is then analyzed to see what is accessed.
reply
javier2
10 hours ago
[-]
Ok, so it's a step in the compilation that rewrites and analyzes it?
reply
samwillis
1 day ago
[-]
This record and replay trick is very similar to what I recently used to implement the query DSL for Tanstack DB (https://tanstack.com/db/latest/docs/guides/live-queries). We pass a RefProxy object into the where/select/join callbacks and use it to trace all the props and expressions that are performed. As others have noted, you can't use JS operators to perform actions, so we built a set of small functions that we could trace (eq, gt, not etc.). These callbacks are run once to trace the calls and build an IR of the query.

One thing we were surprisingly able to do is trace the js spread operation as that is a rare case of something you can intercept in JS.
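
For anyone curious how that works: spread copies own enumerable properties, so it flows through a Proxy's traps. A standalone illustration (not our actual code):

    const accessed: (string | symbol)[] = [];
    const traced = new Proxy({ a: 1, b: 2 }, {
      ownKeys(target) {
        accessed.push("[ownKeys]");
        return Reflect.ownKeys(target);
      },
      get(target, prop, receiver) {
        accessed.push(prop);
        return Reflect.get(target, prop, receiver);
      },
    });

    const copy = { ...traced };   // triggers ownKeys, then get for each key
    console.log(accessed);        // ["[ownKeys]", "a", "b"]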

Kenton, if you are reading this, could you add a series of fake operators (eq, gt, in etc) to provide the capability to trace and perform them remotely?

reply
kentonv
1 day ago
[-]
Yes, in principle, any sort of remote compute we want to support, we could accomplish by having a custom function you have to call for it. Then the calls can be captured into the record.

But also, apps can already do this themselves. Since the record/replay mechanism already intercepts any RPC calls, the server can simply provide a library of operations as part of its RPC API. And now the mapper callback can take advantage of those.

I think this is the approach I prefer: leave it up to servers to provide these ops if they want to. Don't extend the protocol with a built-in library of ops.
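
A rough sketch of what that could look like (the `listFriends`/`pickPhotoUrl` method names are invented for illustration):

    // Server side: expose the "operator" as an ordinary RPC method.
    class MyApiServer extends RpcTarget {
      listFriends() {
        return [{ name: "Alice", photoId: "a1" }, { name: "Bob", photoId: null }];
      }

      // The conditional runs on the server, so the client's mapper callback
      // stays a pure sequence of pipelined calls.
      pickPhotoUrl(photoId: string | null, fallback: string): string {
        return photoId ? `/photos/${photoId}` : fallback;
      }
    }

    // Client side: pipeline the server-provided op inside the map callback.
    let friendsWithPhotos = api.listFriends().map(friend => ({
      friend,
      photoUrl: api.pickPhotoUrl(friend.photoId, "/img/default.png"),
    }));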

reply
samwillis
1 day ago
[-]
Ah, yes, obviously. This is all very cool!
reply
svieira
1 day ago
[-]
Just a side note - reading https://tanstack.com/db/latest/docs/guides/live-queries#reus... I see:

    const isHighValueCustomer = (row: { user: User; order: Order }) => 
     row.user.active && row.order.amount > 1000
But if I'm understanding the docs correctly on this point, doesn't this have to be:

    const isHighValueCustomer = (row: { user: User; order: Order }) => 
      and(row.user.active, gt(row.order.amount, 1000))
reply
samwillis
1 day ago
[-]
Yep, that's an error in the docs..
reply
IshKebab
1 day ago
[-]
As I understand it, it's basically how PyTorch works. A clever trick, but also super confusing because while it seems like normal code, as soon as you try to do something that you could totally do in normal code, it doesn't work:

  let friendsWithPhotos = friendsPromise.map(friend => {
    return {friend, photo: friend.has_photo ? api.getUserPhoto(friend.id) : default_photo};
  });
Looks totally reasonable, but it's not going to work properly. You might not even realise until it's deployed.
reply
mdavidn
1 day ago
[-]
Reminds me of Temporal.io "workflows," which are functions that replace the JSON workflow definitions of AWS Step Functions. If a workflow function's execution is interrupted, Temporal.io expects to be able to deterministically replay the workflow function from the beginning, with it yielding the same sequence of decisions in the form of callbacks.
reply
miki123211
1 day ago
[-]
I kind of want to see an ORM try something like this.

You can't do this in most languages because of if statements, which cannot be analyzed in that way and break the abstraction. You'd either need macro-based function definitions (Lisp, Elixir), bytecode inspection (like in e.g. Pytorch compile), or maybe built-in laziness (Haskell).

Edit: Or full object orientation like in Smalltalk, where if statements are just calls to .ifTrue and .ifFalse on a true/false object, and hence can be simulated.

reply
spankalee
2 days ago
[-]
I presume conditionals are banned - sort of like the rules of hooks - but how?
reply
kentonv
2 days ago
[-]
The input to the map function (when it is called in "record" mode on the client) is an RpcPromise for the eventual value. That means you can't actually inspect the value, you can only queue pipelined calls on it. Since you can't inspect the value, you can't do any computation or branching on it. So any computation and branching you do perform must necessarily have the same result every time the function runs, and so can simply be recorded and replayed.

The only catch is your function needs to have no side effects (other than calling RPC methods). There are a lot of systems out there that have similar restrictions.

reply
btown
1 day ago
[-]
> your function needs to have no side effects

I'm trying to understand how well this no-side-effects footgun is defended against.

https://github.com/cloudflare/capnweb/blob/main/src/map.ts#L... seems to indicate that if the special pre-results "record mode" call of the callback raises an error, the library silently bails out (but keeps anything already recorded, if this was a nested loop).

That catches a huge number of things like conditionals on `item.foo` in the map, but (a) it's quite conservative and will fail quite often with things like those conditionals, and (b) if I had `count += 1` in my callback, where count was defined outside the scope, now that's been incremented one extra time, and it didn't raise an error.

React Hooks had a similar problem, with a constraint that hooks couldn't be called conditionally. But they solved their DX by having a convention where every hook would start with `use`, so they could then build linters that would enforce their constraint. And if I recall, their rules-of-hooks eslint plugin was available within days of their announcement.

The problem with `map` is that there are millions of codebases that already use a method called `map`. I'd really, really love to see Cap'n Web use a different method name - perhaps something like `smartMap` or `quickMap` or `rpcMap` - that is more linter-friendly. A method name that doesn't require the linter to have access to strong typing information, to understand that you're mapping over the special RpcPromise rather than a low-level array.

Honestly, it's a really cool engineering solve, with the constraint of not having access to the AST like one has in Python. I do think that with wider adoption, people will find footguns, and I'd like this software to get a reputation for being resilient to those!

reply
da25
1 day ago
[-]
This! Using the same name, i.e. `.map()`, is a footgun that devs would eventually stumble over. `rpcMap()` sounds good. cc: @kentonv
reply
fizx
1 day ago
[-]
Also, your function needs to be very careful about closures. Date.toLocaleString and many other JS functions will behave differently on client and server, which will also cause silent corruption.
reply
kentonv
1 day ago
[-]
If you invoke `Date.toLocaleString()` in a map callback, it will consistently always run on the client.
reply
fizx
23 hours ago
[-]
I don't see how this very contrived example pipelines:

    client.getAll({userIds}).map((user) => user.updatedAt == new Date().toLocaleString() ? client.photosFor(user.id) : {})
or without the conditional,

    client.getAll({userIds}).map((user) => client.photos({userId: user.id, since: new Date(user.updatedAt).toLocaleString()}))
Like it has to call toLocaleString on the server, no?
reply
kentonv
18 hours ago
[-]
Neither of these will type check.

You can't perform computation on a promise. The only thing you can do is pipeline on it.

`user.updatedAt == date` is trying to compare a promise against a date. It won't type check.

`new Date(user.updatedAt)` is passing a promise to the Date constructor. It won't type check.

reply
tonyg
2 days ago
[-]
Is .map special-cased or do user functions accepting callbacks work the same way? Because if so, you could do the Scott-Mogensen thing of #ifTrue:ifFalse:, dualizing the control-flow decision making, offering a menu of choices/continuations.
reply
kentonv
2 days ago
[-]
.map() is totally special-cased.

For any other function accepting a callback, the function on the server will receive an RPC stub, which, when called, makes an RPC back to the caller, calling the original version of the function.

This is usually what you want, and the semantics are entirely normal.

But for .map(), this would defeat the purpose, as it'd require an additional network round-trip to call the callback.
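
To make the contrast concrete, here's a sketch of that default behavior (names like `Document`, `doc`, and `render` are invented for illustration):

    // Server side: the callback arrives as a stub; invoking it is an RPC
    // back to the client that passed it.
    class Document extends RpcTarget {
      private listeners: Array<(text: string) => Promise<void>> = [];

      subscribe(onChange: (text: string) => Promise<void>) {
        this.listeners.push(onChange);   // keep the stub to call later
      }

      async edit(text: string) {
        for (const cb of this.listeners) {
          await cb(text);                // network round-trip per call
        }
      }
    }

    // Client side: just pass an ordinary function.
    await doc.subscribe(async (text) => render(text));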

reply
qcnguy
1 day ago
[-]
What about filter? Seems useful also.
reply
kentonv
1 day ago
[-]
I don't think you could make filter() work with the same approach, because it seems like you'd actually have to do computation on the result.

map() works for cases where you don't need to compute anything in the callback, you just want to pipeline the elements into another RPC, which is actually a common case with map().

If you want to filter server-side, you could still accomplish it by having the server explicitly expose a method that takes an array as input, and performs the desired filter. The server would have to know in advance exactly what filter predicates are needed.

reply
svieira
1 day ago
[-]
But you might want to compose various methods on the server in order to filter, just like you might want to compose various methods on the server in order to transform. Why is `collection.map(server.lookupByInternalizedId)` a special case that doesn't require `server.lookupCollectionByInternalizedId(collection)`, but `collection.filter(server.isOperationSensibleForATuesday)` is a bridge too far and for that you need `server.areOperationsSensibleForATuesday(collection)`?
reply
kentonv
1 day ago
[-]
I agree that, in the abstract, it's inconsistent.

But in the concrete:

* Looking up some additional data for each array element is a particularly common thing to want to do.

* We can support it nicely without having to create a library of operations baked into the protocol.

I really don't want to extend the protocol with a library of operations that you're allowed to perform. It seems like that library would just keep growing and add a lot of bloat and possibly security concerns.

(But note that apps can actually do so themselves. See: https://news.ycombinator.com/item?id=45339577 )

reply
svieira
1 day ago
[-]
Doesn't this apply for _all_ the combinators on `Array.prototype` though? Why special-case `.map` only?
reply
kentonv
1 day ago
[-]
reply
spankalee
2 days ago
[-]
But you also can't close over anything right?

I did a spiritually similar thing in JS and Dart before where we read the text of the function and re-parsed (or used mirrors in Dart) to ensure that it doesn't access any external values.

reply
kentonv
2 days ago
[-]
You actually CAN close over RPC stubs. The library will capture any RPC calls made during the mapper callback, even on stubs other than the input. Those stubs are then sent along to the server with the replay instructions.
reply
tobyhinloopen
1 day ago
[-]
That's neat, tnx for pointing it out
reply
__alexs
1 day ago
[-]
So now I have yet another colour of function? Fun.
reply
davexunit
2 days ago
[-]
This has some similarities to, and significant differences from, OCapN [0]. Capability transfer and promise pipelining are part of both, and both are schemaless. Cap'n Web lacks out-of-band capabilities, which OCapN has in the form of URIs known as sturdyrefs. I suppose this difference is why the examples show API key authentication since anyone can connect to the Cap'n Web endpoint. This is not necessary in OCapN because a sturdyref is an unguessable token so by possessing it you have the authority to send messages to the endpoint it designates. Cap'n Web also seems to lack the ability for Alice to introduce Bob to Carol, a feature in OCapN called third-party handoffs. Handoffs are needed for distributed applications. So I guess Cap'n Web is more for traditional client-server SaaS but now with a dash of ocaps.

[0] https://ocapn.org/

reply
kentonv
2 days ago
[-]
I’d love to add 3PH support in the future, but it wasn’t a priority for an initial release as this is very focused on enabling browser<->web server communications specifically.

SturdyRefs are tricky. My feeling is that they don’t really belong in the RPC protocol itself, because the mechanism by which you restore a SturdyRef is very dependent on the platform in which you're running. Cloudflare Workers, for example, may soon support storing capabilities into Durable Object storage. But the way this will work is very tied to the Cloudflare Workers platform. Sandstorm, similarly, had a persistent capability mechanism, but it only made sense inside Sandstorm – which is why I removed the whole notion of persistent capabilities from Cap’n Proto itself.

The closest thing to a web standard for SturdyRefs is OAuth. I could imagine defining a mechanism for SturdyRefs based on OAuth refresh tokens, which would be pretty cool, but it probably wouldn’t actually be what you want inside a specific platform like Sandstorm or Workers.

reply
prngl
1 day ago
[-]
This is cool.

There's an interesting parallel with ML compilation libraries (TensorFlow 1, JAX jit, PyTorch compile) where a tracing approach is taken to build up a graph of operations that are then essentially compiled (or otherwise lowered and executed by a specialized VM). We're often nowadays working in dynamic languages, so they become essentially the frontend to new DSLs, and instead of defining new syntax, we embed the AST construction into the scripting language.

For ML, we're delaying the execution of GPU/linalg kernels so that we can fuse them. For RPC, we're delaying the execution of network requests so that we can fuse them.

Of course, compiled languages themselves delay the execution of ops (add/mul/load/store/etc) so that we can fuse them, i.e. skip over the round-trip of the interpreter/VM loop.

The power of code as data in various guises.

Another angle on this is the importance of separating control plane (i.e. instructions) from data plane in distributed systems, which is any system where you can observe a "delay". When you zoom into a single CPU, it acknowledges its nature as a distributed system with memory far away by separating out the instruction pipeline and instruction cache from the data. In Cap'n Web, we've got the instructions as the RPC graph being built up.

I just thought these were some interesting patterns. I'm not sure I yet see all the way down to the bottom though. Feels like we go in circles, or rather, the stack is replicated (compiler built on interpreter built on compiler built on interpreter ...). In some respect this is the typical Lispy code is data, data is code, but I dunno, feels like there's something here to cut through...

reply
ryanrasti
1 day ago
[-]
Agree -- I think that's a powerful generalization you're making.

> We're often nowadays working in dynamic languages, so they become essentially the frontend to new DSLs, and instead of defining new syntax, we embed the AST construction into the scripting language.

And I'd say that TypeScript is the real game-changer here. You get the flexibility of the JavaScript runtime (e.g., how Cap'n Web cleverly uses `Proxy`s) while still being able to provide static types for the embedded DSL you're creating. It’s the best of both worlds.

I've been spending all of my time in the ORM-analog here. Most ORMs are severely lacking on composability because they're fundamentally imperative and eager. A call like `db.orders.findAll()` executes immediately and you're stuck without a way to add operations before it hits the database.

A truly composable ORM should act like the compilers you mentioned: use TypeScript to define a fully typed DSL over the entirety of SQL, build an AST from the query, and then only at the end compile the graph into the final SQL query. That's the core idea I'm working on with my project, Typegres.

If you find the pattern interesting: https://typegres.com/play/

reply
prngl
1 day ago
[-]
I do find the pattern interesting and powerful.

But at the same time, something feels off about it (just conceptually, not trying to knock your money-making endeavor, godspeed). Some of the issues that all of these hit is:

- No printf debugging. Sometimes you want things to be eager so you can immediately see what's happening. If you print and what you see is <RPCResultTracingObject> that's not very helpful. But that's what you'll get when you're in a "tracing" context, i.e. you're treating the code as data at that point, so you just see the code as data. One way of getting around this is to make the tracing completely lazy, so no tracing context at all, but instead you just chain as you go, and something like `print(thing)` or `thing.execute()` actually then ships everything off. This seems like how much of Cap'n Web works except for the part where they embed the DSL, and then you're in a fundamentally different context.

- No "natural" control flow in the DSL/tracing context. You have to use special if/while/for/etc so that the object/context "sees" them. Though that's only the case if the control flow is data-dependent; if it's based on config values that's fine, as long as the context builder is aware.

- No side effects in the DSL/tracing context because that's not a real "running" context, it's only run once to build the AST and then never run again.

Of the various flavors of this I've seen, it's the ML usage I think that's pushed it the furthest out of necessity (for example, jax.jit https://docs.jax.dev/en/latest/_autosummary/jax.jit.html, note the "static*" arguments).

Is this all just necessary complexity? Or is it because we're missing something, not quite seeing it right?

reply
ryanrasti
1 day ago
[-]
Really nice summary of the core challenges with this DSL/code-as-data pattern.

I've spent a lot of time thinking about this in the database context:

> No printf debugging

Yeah, spot on. The solution here would be something like a `toSQL` that lets you inspect the compiled output at any step in the AST construction.

Also, if the backend supports it, you could compile a `printf` function all the way to the backend (this isn't supported in SQL though)

> No "natural" control flow in the DSL/tracing context

Agreed -- that can be a source of confusion and subtle bugs.

You could have a build rule that actually compiles `if`/`while`/`for` into your AST (instead of evaluating them in the frontend DSL). Or you could have custom lint rules to forbid them in the DSL.

At the same time -- part of what makes query builders so powerful is the ability to dynamically construct queries. Runtime conditionals is what makes that possible.

> No side effects in the DSL/tracing context because that's not a real "running" context

Agreed -- similar to the above: this is something that needs to be forbidden (e.g., by a lint rule) or clearly understood before using it.

> Is this all just necessary complexity? Or is it because we're missing something, not quite seeing it right?

My take is that, at least in the SQL case: 100% the complexity is justified.

Big reasons why:

1. A *huge* impediment to productive engineering is context switching. A DSL in the same language as your app (i.e., an ORM) makes the bridge to your application code seamless. (This is similar to the argument for having your entire stack be a single language.)

2. The additional layer of indirection (building an AST) allows you to dynamically construct expressions in a way that isn't possible in SQL. This is effectively adding a (very useful) macro system on top of SQL.

3. In the case of TypeScript, because its type system is so flexible, you can have stronger typing on your DSL than the backend target.

tl;dr is these DSLs can enable better ergonomics in practice and the indirection can unlock powerful new primitives

reply
porridgeraisin
1 day ago
[-]
I think this kind of tracing-caused complexity only arises when the language doesn't let you easily represent and manipulate code as data, or when the language doesn't have static type information.

Python does let you mess around with the AST, however, there is no static typing, and let's just say that the ML ecosystem will <witty example of extreme act> before they adopt static typing. So it's not possible to build these graphs without doing this kind of hacky nonsense.

For another example, torch.compile() works at the Python bytecode level. It basically monkey-patches the PyEval_EvalFrame function evaluator of CPython for all torch.compile-decorated functions. Inside that, it will check for any operators, e.g. BINARY_MULTIPLY, involving torch tensors, and it records that. Any if conditions in the path get translated to guards in the resulting graph. Later, when said guard fails, it recomputes the subgraph with the complementary condition (and any additional conditions) and stores this as an alternative JIT path, and muxes these in the future depending on the two guards in place now.

Jax works by making the function arguments proxies and recording the operations like you mentioned. However, you cannot use a normal `if`; you use lax.cond(), lax.while_loop(), etc. As a result, it doesn't recompute the graph when different branches are encountered; it only computes the graph once.

In a language such as C#, Rust, or a statically typed lisp, you wouldn't need to do any of this monkey business. There's probably already a way in the rust toolchain to interject at the MIR stage and have your own backend convert these to some Tensor IR.

reply
prngl
1 day ago
[-]
Yes being able to have compilers as libraries inline in the same code and same language. That feels like what all these call for. Which really is the Lisp core I suppose. But with static types and heterogenous backends. MLIR I think hoped (hopes?) to be something like this but while C++ may be pragmatic it’s not elegant.

Maybe totally off but would dependent types be needed here? The runtime value of one “language” dictates the code of another. So you have some runtime compilation. Seems like dependent types may be the language of jit-compiled code.

Anyways, heady thoughts spurred by a most pragmatic of libraries. Cloudflare wants to sell more schlock to the javascripters and we continue our descent into madness. Einsteins building AI connected SaaS refrigerators. And yet there is beauty still within.

reply
ignoramous
1 day ago
[-]
> I just thought these were some interesting patterns.

Reading this from TFA ...

  Alice and Bob each maintain some state about the connection. In particular, each maintains an "export table", describing all the pass-by-reference objects they have exposed to the other side, and an "import table", describing the references they have received. 

  Alice's exports correspond to Bob's imports, and vice versa. Each entry in the export table has a signed integer ID, which is used to reference it. You can think of these IDs like file descriptors in a POSIX system. Unlike file descriptors, though, IDs can be negative, and an ID is never reused over the lifetime of a connection.

  At the start of the connection, Alice and Bob each populate their export tables with a single entry, numbered zero, representing their "main" interfaces.

  Typically, when one side is acting as the "server", they will export their main public RPC interface as ID zero, whereas the "client" will export an empty interface. However, this is up to the application: either side can export whatever they want.
... sounds very similar to how Binder IPC (and soon RPC) works on Android.
reply
losvedir
1 day ago
[-]
Babe get in here, a new kentonv library just dropped!

I'm surprised how little code is actually involved here, just looking at the linked GitHub repo. Is that really all there is to it? In theory, it shouldn't be too hard to port the server side to another language, right? I'm interested in using it in an Elixir server for a JS/TS frontend.

For that matter, the language porting seems like a pretty good LLM task. Did you use much LLM-generated code for this repo? I seem to recall kentonv doing an entirely AI-generated (though human-reviewed, of course) proof of concept a few months ago.

reply
kentonv
1 day ago
[-]
Some of the tests are LLM-generated, but none of the library itself is.

I don't think LLMs would be capable of writing this library (at least at present). The pieces fit together like a very intricate puzzle. I spent a lot more time thinking about how to do it right, than actually coding.

Very different from my workers-oauth-provider library, where it was just implementing a well-known spec with a novel (yet straightforward) API.

The code might port nicely to another dynamic language, like Python, but I think you'd have a hard time porting it to a statically-typed language. There's a whole lot of iterating over arbitrary objects without knowing their types.

reply
krosaen
1 day ago
[-]
Hammock driven development :)

https://www.youtube.com/watch?v=f84n5oFoZBc

reply
naasking
1 day ago
[-]
> There's a whole lot of iterating over arbitrary objects without knowing their types.

That's just parametric polymorphism.

reply
nl
1 day ago
[-]
> just parametric polymorphism

Those three words are doing a lot of work there.

reply
thethimble
1 day ago
[-]
I'm curious about two things:

1. What's the best way to do app deploys that update the RPC semantics? In other words how do you ensure that the client and server are speaking the same version of the RPC? This is a challenge that protos/grpc/avro explicitly sought to solve.

2. Relatedly, what's the best way to handle flaky connections? It seems that the export/import table is attached directly to a stateful WS connection such that if the connection breaks you'd lose the state. In principle there should be nothing preventing a client/server caching this state and reinstantiating it on reconnect. That said, given these tables can contain closures, they're not exactly serializable so you could run into memory issues. Curious if the team has thought about this.

Absolutely mind blowing work!

reply
kentonv
1 day ago
[-]
1. Think of it like updating a JavaScript API without breaking existing callers. The rules are essentially the same over RPC as they would be for local calls. So, you can add new methods and new optional arguments, etc.

2. After losing the connection, you'll have to reconnect and reconstruct the objects from scratch. The way I've structured this in an actual React app is, I pass the main RPC stub as an argument to the top-level component. It calls methods to get sub-objects and passes them down to various child components. When the connection is lost, I recreate it, and then pass the new stub into the top-level component, causing it to "rerender" just like any other state change. All the children will fetch the sub-objects they need again.

If you have an object that represents some sort of subscription with a callback, you'll need to design the API so that when initiating the subscription, the caller can specify the last message they saw on the subscription, so that it can pick up where they left off without missing anything.
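
One way to shape such an API (purely illustrative; `room`, `render`, and the sequence-number scheme are placeholders, not anything the library dictates):

    interface ChatRoom {
      // Caller passes the last sequence number it saw; the server replays
      // anything newer before streaming live messages to the callback stub.
      subscribe(lastSeenSeq: number, onMessage: (seq: number, text: string) => void): Promise<void>;
    }

    // On reconnect, pick up where the old session left off:
    let lastSeenSeq = 0;
    await room.subscribe(lastSeenSeq, (seq, text) => {
      lastSeenSeq = seq;
      render(text);
    });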

Hmm, I suppose we'll need to do a blog post of design patterns at some point...

reply
thethimble
1 day ago
[-]
A blog post of design patterns would be really great. Again - amazing work!
reply
mpweiher
1 day ago
[-]
Big fan of Cap'n'Proto and this looks really interesting, if RPC is the thing that works for your use-case.

However, stumbled over this:

The fact is, RPC fits the programming model we're used to. Every programmer is trained to think in terms of APIs composed of function calls, not in terms of byte stream protocols nor even REST. Using RPC frees you from the need to constantly translate between mental models, allowing you to move faster.

The fact that this is, in fact, true is what I refer to as "The gentle tyranny of Call/Return"

We're used to it; doing something more appropriate to the problem space is too unfamiliar, and so more or less arbitrary additional complexity is... Just Fine™.

https://www.hpi.uni-potsdam.de/hirschfeld/publications/media...

Maybe it shouldn't actually be true. Maybe we should start to expand our vocabulary and toolchest beyond just "composed function calls"? So composed function calls are one tool in our toolchest, to be used when they are the best tool, not used because we have no reasonable alternative.

https://blog.metaobject.com/2019/02/why-architecture-oriente...

reply
KuSpa
1 day ago
[-]
Ah, the good old Squeak/Smalltalk days. A few years back I worked on signals (or rather a static analyser for the editor to support signals) in Squeak/Smalltalk. The kind of signals those indie frameworks like Angular and Svelte now adopt, trying to solve the problem of change propagation you outline in your paper.

What I'm getting at is: for the places where other tools are better (like the UI example), we already have other tools (signals, observables, effects, runes, ...). And for places like client/server communication: this is kind of where "call/return" usually shines.

reply
mpweiher
4 hours ago
[-]
> And for the places like client/server-communication: This is kind of where "call/return" usually shines.

The WWW would like a quick word with you. CORBA as well, if it could get a word in.

> we already have other tools (signals, observables, effects, runes,...)

We can build them. We can't express them. We can also build everything out of Turing Machines, or Lambda Calculus or NAND gates.

reply
kentonv
1 day ago
[-]
Eh, I think composed function calls have legitimately won _because_ they compose well and are easy to understand, not just because we haven't tried other things.
reply
mpweiher
4 hours ago
[-]
1. The WWW would like a quick word with you regarding function calls having won.

2. You are making an invalid assumption, which is that we only get to have one tool in our toolbox, and therefore one tool has to "win". Even if function calls were the best tool, they would still not always be the right one.

With the benefit of hindsight, it’s clear that these properties of structured programs, although helpful, do not go to the heart of the matter. The most important difference between structured and unstructured programs is that structured programs are designed in a modular way. Modular design brings with it great productivity improvements. First of all, small modules can be coded quickly and easily. Second, general-purpose modules can be reused, leading to faster development of subsequent programs. Third, the modules of a program can be tested independently, helping to reduce the time spent debugging.

However, there is a very important point that is often missed. When writing a modular program to solve a problem, one first divides the problem into subproblems, then solves the subproblems, and finally combines the solutions.

The ways in which one can divide up the original problem depend directly on the ways in which one can glue solutions together. Therefore, to increase one’s ability to modularize a problem conceptually, one must provide new kinds of glue in the programming language.

-- John Hughes, Why Functional Programming Matters

https://www.cse.chalmers.se/~rjmh/Papers/whyfp.pdf

via

https://blog.metaobject.com/2019/02/why-architecture-oriente...

3. Procedure calls are not particularly composable

See CORBA vs. REST.

reply
spankalee
2 days ago
[-]
Looking at this quickly, it does seem to require (or strongly encourage?) a stateful server to hold on to the import and export tables and the state of objects in each.

One thing about a traditional RPC system where every call is top-level and you pass keys and such on every call is that multiple calls in a sequence can usually land on different servers and work fine.

Is there a way to serialize and store the import/export tables to a database so you can do the same here, or do you really need something like server affinity or Durable Objects?

reply
kentonv
2 days ago
[-]
The state only lives for a single RPC session.

When using WebSockets, that's the lifetime of the WebSocket.

But when using the HTTP batch transport, a session is a single HTTP request, that performs a batch of calls all at once.

So there's actually no need to hold state across multiple HTTP requests or connections, at least as far as Cap'n Web is concerned.

This does imply that you shouldn't design a protocol where it would be catastrophic if the session suddenly disconnected in the middle and you lost all your capabilities. It should be possible to reconnect and reconstruct them.

reply
crabmusket
2 days ago
[-]
Hmm so websocket reconnects could break state? Important to know when building on this, to e.g. re-establish the object graph when beginning a reconnected session? Or when using the http protocol is a possibility - to e.g. always include the "get object" as the first call in the batch.
reply
kentonv
2 days ago
[-]
Yes, when you reconnect the WebSocket, the client will need to call methods to get new instances of any objects it needs. The old stubs are permanently broken.

FWIW the way I've handled this in a React app is, the root stub gets passed in as a prop to the root component, and children call the appropriate methods to get whatever objects they need from it. When the connection is lost, a new one is created, and the new root stub passed into the root component, which causes everything downstream to re-run exactly as you'd want. Seems to work well.
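
Roughly like this (a sketch, using the RpcStub/newWebSocketRpcSession API shown elsewhere in the thread; `MyApi`, `App`, and the reconnect trigger are placeholders):

    import { useState } from "react";
    // RpcStub / newWebSocketRpcSession imported from the Cap'n Web package.

    function Root() {
      const connect = (): RpcStub<MyApi> =>
        newWebSocketRpcSession("wss://example.com/api");

      const [api, setApi] = useState(connect);

      // Call this when the app notices the connection dropped (e.g. a
      // rejected RPC). Children get the new stub as a prop and re-fetch
      // their sub-objects, just like any other state change.
      const reconnect = () => setApi(() => connect());

      return <App api={api} onRpcError={reconnect} />;
    }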

reply
doctorpangloss
2 days ago
[-]
^ maybe the most knowledgeable person in the world about these gritty details

RPC SDKs should have session management, otherwise you end up in this situation:

"Any sufficiently complicated gRPC or Cap'n'Proto program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Akka"

reply
vereis
2 days ago
[-]
I think the original quote is "Any sufficiently complicated concurrent program in another language contains an ad hoc informally-specified bug-ridden slow implementation of half of Erlang" :-) but your point still stands
reply
coryrc
2 days ago
[-]
reply
paradox460
1 day ago
[-]
Isn't Akka just a poor man's Erlang?
reply
cogman10
2 days ago
[-]
That's what I noticed reading through this.

It looks like the server affinity is accomplished by using websockets. The http batching simply sends all the requests at once and then waits for the response.

I don't love this because it makes load balancing hard. If a bunch of chatty clients get a socket to the same server, now that server is burdened and potentially overloadable.

Further, it makes scaling servers in/out really annoying. Persistent, long-lived connections are beasts to deal with because now you have to handle "what do I do if multiple requests are in flight?"

One more thing I don't really love about this, it requires a timely client. This seems like it might be trivial to DDOS as a client can simply send a stream of push events and never pull. The server would then be burdened to keep those responses around so long as the client remains connected. That seems bad.

reply
fleventynine
2 days ago
[-]
Yeah, I think to make a system using this really scale you'd have to add support for this protocol in your load balancer / DDOS defenses.
reply
dgl
1 day ago
[-]
This isn't really that different from GWT, which Google has been scaling for a long time. My knowledge is a little outdated, but more complex applications had a "UI" server component which talked to multiple "API" backend components, doing internal load balancing between them.

Architecturally I don't think it makes sense to support this in a load balancer, you instead want to pass back a "cost" or outright decisions to your load balancing layer.

Also note the "batch-pipelining" example is just a node.js client; this already supports not just browsers as clients, so you could always add another layer of abstraction (the "fundamental theorem of software engineering").

reply
afiori
2 days ago
[-]
My limited knowledge of Cap'n Proto (I read the docs once, years ago) is that servers and clients can pass stubs around as peers, so that if server C receives a stub originating from server A via client B, then C can try to call A directly.
reply
jensneuse
1 day ago
[-]
I'm with WunderGraph, a vendor providing enterprise tooling for GraphQL.

First, I absolutely love Cap'n Proto and the idea of chaining calls on objects. It's amazing to see what's possible with Cap'n Web.

However, one of the examples compares it to GraphQL, which I think falls a bit short of how enterprises use the Query language in real life.

First, like others mentioned, you'll have N+1 problems for nested lists. That is, if we call comments() on each post and author() on each comment, we absolutely don't want to have one individual call per nested object. In GraphQL, with the data loader pattern, this is just 3 calls.

Second, there's also an element of security. Advanced GraphQL gateways like WunderGraph's are capable of implementing fine-grained rate limiting that prevents a client from asking for too much data. With this RPC object-calling style, we don't have a notion of "query plans", so we cannot statically analyze a combination of API calls and estimate the cost before executing them.

Lastly, GraphQL these days is mostly used with Federation. That means a single client talks to a Gateway (e.g. WunderGraph's Cosmo Router) and the Router distributes the calls efficiently across many sub-services (Subgraphs) with a query planner that finds the optimal way to load information from multiple services. While Cap'n Web looks amazing, the reality is that a client would have to talk to many services.

Which brings me to my last point. Instead of going the Cap'n Web vs GraphQL route, I'd think more about how the two can work together. What if a client could use Cap'n Web to talk to a Federation Router that allows it to interact with entities, the object definitions in a GraphQL Federation system?

I think this is really worth exploring. Not going against other API styles but trying to combine the strengths.

- https://wundergraph.com/

reply
rhaps0dy
1 day ago
[-]
> First, like others mentioned, you'll have N+1 problems for nested lists. That is, if we call comments() on each post and author() on each comment, we absolutely don't want to have one individual call per nested object. In GraphQL, with the data loader pattern, this is just 3 calls.

Why is that a problem? As far as I can tell, those calls are all done on the server, where they're cheap normal function calls, and the results are all sent back with 1 roundtrip; because of the pipelining.

reply
robmccoll
1 day ago
[-]
Because they each result in round-trips and individual queries to the database rather than a more efficient single round-trip with a join. Note: I don't know the details of GraphQL, but I'm assuming it does the smarter thing.

In this paradigm, the places where you are calling map() could probably be replaced with explicit getComments() or getCommentsWithAuthors() or two methods that do just one query each.

reply
kentonv
1 day ago
[-]
Well, if the server is a Cloudflare Durable Object running sqlite, then the round-trip to the database is free.

https://www.sqlite.org/np1queryprob.html

But you are right that this won't work great with traditional databases without significantly more magic on the server side, and in that sense the comparison with GraphQL is... aggressive :)

It is still much better than making all the calls client-side, of course. And there are many use cases where you're not querying a database.

And maybe there can be some fusion between GraphQL server infrastructure and this RPC-oriented syntax that gives people the best of both worlds?

reply
jensneuse
1 day ago
[-]
Side note, a lot of people these days build agents on top of APIs. GraphQL has selection sets, which allow you to select subsets of objects. This is quite handy when it comes to agents because of context window limitations.
reply
benpacker
2 days ago
[-]
This seems great and I'm really excited to try it in place of trpc/orpc.

Although it seems to solve one of the problems that GraphQL solved and trpc doesn't (the ability to request nested information from items in a list, or properties of an object, without changes to server-side code), there is no included solution for the server-side problem this creates, which the data loader pattern was intended to solve: a naive GraphQL server implementation makes a database query per item in a list.

Until the server side tooling for this matures and has equivalents for the dataloader pattern, persisted/allowlist queries, etc., I'll probably only use this for server <-> server (worker <-> worker) or client <-> iframe communication and keep my client <-> server communication alongside more pre-defined boundaries.

reply
kentonv
2 days ago
[-]
I generally agree that the .map() trick doesn't actually replace GraphQL without some sort of server-side optimizations to avoid turning this into N+1 selects.

However, if your database is sqlite in a Cloudflare Durable Object, and the RPC protocol is talking directly to it, then N+1 selects are actually just fine.

https://www.sqlite.org/np1queryprob.html

reply
ryanrasti
1 day ago
[-]
Agree, and to add, from what I see, the main issue is that server-side data frameworks (e.g., ORMs) aren't generally built for the combination of security & composability that make them naturally synergize with Cap'n Web. Another way to put it: promise pipelining is a killer feature but if your ORM doesn't support pipelining, then you have to build a complex bridge to support them both.

I've been working on this issue from the other side. Specifically, a TS ORM that has the level of composability to make promise pipelining a killer feature out of the box. And analogously to Cap'n Web's use of classes, it even models tables as classes with methods that return composable SQL expressions.

If curious: https://typegres.com/play/

reply
benpacker
1 day ago
[-]
This seems really cool and I'd be happy to help (I'm currently a pgtyped + Kysely user and community contributor), and I see how this solves n+1 from promise pipelining when fetching "nested" data with a similar approach as Cap'n Web, but I don't think we've solved the map problem.

If I run, in client-side Cap'n Web land (from the post):

    let friendsWithPhotos = friendsPromise.map(friend => {
      return {friend, photo: api.getUserPhoto(friend.id)};
    });

And if I implement my server class naively, the server-side implementation will still call `getUserPhoto` on a materialized friend returned from the database (with a query actually being run) instead of on an intermediate query builder.

@kentonv, I'm tempted to say that in order for a query builder like typegres to do a good job optimizing these RPC calls, the RpcTarget might need to expose the pass by reference control flow so the query builder can decide to never actually run "select id from friends" without the join to the user_photos table, or whatever.

reply
ryanrasti
1 day ago
[-]
> but I don't we've solved the map problem.

Agreed! If we use `map` directly, Cap'n Web is still constrained by the ORM.

The solution would be what you're getting at -- something that directly composes the query builder primitives. In Typegres, that would look like this:

    // `photo()` is a scalar subquery -- it could also be a join
    let friendsWithPhotos = friendsPromise.select((f) => ({...f, photo: f.photo()}));

i.e., use promise pipelining to build up the query on the server.

The idea is that Cap'n Web would allow you to pipeline the Typegres query builder operations. Note this should be possible in other fluent-based query builders (e.g., Kysely/Drizzle). But where Typegres really synergizes with Cap'n Web is that everything is already expressed as methods on classes, so the architecture is capability-ready.

P.S. Thanks for your generous offer to help! My contact info is in my HN profile. Would love to connect.

reply
kentonv
1 day ago
[-]
That is actually pretty interesting!

Have you considered making a sqlite version that works in Durable Objects? :)

reply
ryanrasti
1 day ago
[-]
Thanks, Kenton! Really encouraging to hear you find the idea interesting.

Right now I'm focused on Postgres (biggest market-share for full-stack apps). A sqlite version is definitely possible conceptually.

You're right about the bigger picture, though: Cap'n Web + Typegres (or a "Typesqlite" :) could enable the dream dev stack: a SQL layer in the client that is both sandboxed (via capabilities) and fully-featured (via SQL composability).

reply
qcnguy
1 day ago
[-]
I wonder if there's a way to process the RPC encoding inside a stored procedure, using the various JS-in-DB features out there.
reply
krosaen
2 days ago
[-]
This looks pretty awesome, and I'm excited it's not only a Cloudflare product (Cap'n Web exists alongside Cloudflare Workers). Reading this section [1], can you say more about:

> as of this writing, the feature set is not exactly the same between the two. We aim to fix this over time, by adding missing features to both sides until they match.

Do you think that once the two reach parity, parity will remain, or is it more likely that Cap'n Web will trail Cloudflare Workers, and if so, by how much?

[1] https://github.com/cloudflare/capnweb/tree/main?tab=readme-o...

reply
kentonv
2 days ago
[-]
I think we'll likely keep them pretty close to in-sync, at least when it comes to features that make sense in both.

If anything I'd expect Cap'n Web to run ahead of Workers RPC (as it is already doing, with the new pipeline features) because Cap'n Web's implementation is actually much simpler than Workers'. Cap'n Web will probably be the place where we experiment with new features.

reply
dimal
2 days ago
[-]
Looks very cool, especially passing functions back and forth. But then I wonder, what would I actually use that for?

You mention that it’s schemaless as if that’s a good thing. Having a well defined schema is one of the things I like about tRPC and zod. Is there some way that you get the benefits of a schema with less work?

reply
kentonv
2 days ago
[-]
You can use TypeScript to define your API, and get all the benefits of schemas.

Well, except you don't get runtime type checking with TypeScript, which might be something you really want over RPC. For now I actually suggest using zod for type checks, but my dream is to auto-generate type checks based on the TypeScript types...

reply
ngrilly
2 days ago
[-]
Then that means we have to define the interface twice: once in TypeScript and again in Zod?
reply
kentonv
2 days ago
[-]
No. Zod gives you TypeScript types corresponding to the schema. So you would only need to write the schema in Zod.

(I do wish it could be the other way, though: Write only TypeScript, get runtime checks automatically.)
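
For example, roughly (standard Zod usage; the names are just illustrative):

    import { z } from "zod";

    const HelloParams = z.object({ name: z.string() });

    // The TypeScript type is derived from the Zod schema, so the schema
    // is the single source of truth:
    type HelloParams = z.infer<typeof HelloParams>;  // { name: string }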

reply
sebws
1 day ago
[-]
There are ways if you're ok with a build step, e.g. https://typia.io/ or https://github.com/GoogleFeud/ts-runtime-checks

Although perhaps that's not what you mean.

I found these through this https://github.com/moltar/typescript-runtime-type-benchmarks

reply
ngrilly
2 days ago
[-]
I'm familiar with zod.infer but I'm not sure how to use it to produce an interface that would be compatible with RpcStub and RpcTarget, like MyApi in the example in your post:

  // Shared interface declaration:
  interface MyApi {
    hello(name: string): Promise<string>;
  }

  // On the client:
  let api: RpcStub<MyApi> = newWebSocketRpcSession("wss://example.com/api");

  // On the server:
  class MyApiServer extends RpcTarget implements MyApi {
    hello(name) {
      return `Hello, ${name}!`
    }
  }
reply
kentonv
2 days ago
[-]
I'll be honest and say I haven't tried it myself.

But my expectation is you'd use Zod to define all your parameter types. Then you'd define your RpcTarget in plain TypeScript, but for the parameters on each method, reference the Zod-derived types.
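
Roughly, a sketch of that idea (untested; this assumes the `capnweb` package exports `RpcTarget` as in the example above, and the method name is just illustrative):

    import { z } from "zod";
    import { RpcTarget } from "capnweb";  // package name assumed

    const Name = z.string().min(1);

    class MyApiServer extends RpcTarget {
      hello(name: z.infer<typeof Name>): string {
        Name.parse(name);  // runtime check at the RPC boundary
        return `Hello, ${name}!`;
      }
    }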

reply
oulipo2
2 days ago
[-]
I guess it's if some data is quite heavy and is located either client-side or server-side, and you want to do calls without having to transfer the data
reply
beckford
2 days ago
[-]
Since Cap'n Web is a simplification of Cap'n Proto RPC, it would be amazing if eventually the simplification traveled back to all the languages that Cap'n Proto RPC supports (C++, etc.). Or at least could be made to be binary compatible. Regardless, this is great.
reply
kentonv
2 days ago
[-]
Yeah I now want to go back and redesign the Cap'n Proto RPC protocol to be based on this new design, as it accomplishes all the same features with a lot less complexity!

But it may be tough to justify when we already have working Cap'n Proto implementations speaking the existing protocol, that took a lot of work to build. Yes, the new implementations will be less work than the original, but it's still a lot of work that is essentially running-in-place.

OTOH, it might make it easier for Cap'n Proto RPC to be implemented in more languages, which might be worth it... idk.

reply
beckford
2 days ago
[-]
Disclaimer: I took over maintenance of the Cap'n Proto C bindings a couple years ago.

That makes sense. There is some opportunity here, though, since Cap'n Proto has always lacked a JavaScript RPC implementation. For example, I had always been planning on using the Cap'n Proto OCaml implementation (which has full RPC) and using one of the two mature OCaml->JavaScript frameworks to get a JavaScript implementation. Long story short: not now, but I'd be interested in seeing if Cap'n Web can be ported to OCaml. I suspect other language communities may be interested. Promise chaining is a killer feature and was (previously) difficult to implement. Aside: promise chaining is quite undersold in your blog post; it is co-equal to capabilities in my estimation.

reply
josephg
1 day ago
[-]
I tried using the C library recently but was turned off by the lack of bounds checking. I’m not sure how anyone could reasonably accept packets over the wire which allow arbitrary memory access. Am I misunderstanding? Any hope this can be fixed?
reply
CobrastanJorji
2 days ago
[-]
You mean redesign Cap'n Proto to not have a schema? Or did you mean the API, not the protocol?
reply
kentonv
2 days ago
[-]
Here is the Cap'n Proto RPC protocol:

https://github.com/capnproto/capnproto/blob/v2/c%2B%2B/src/c...

That's just the RPC state machine -- the serialization is specified elsewhere, and the state machine is actually schema-agnostic. (Schemas are applied at the edges, when messages are actually received from the app or delivered to it.)

This is the Cap'n Web protocol, including serialization details:

https://github.com/cloudflare/capnweb/blob/main/protocol.md

Now, to be fair, Cap'n Proto has a lot of features that Cap'n Web doesn't have yet. But Cap'n Web's high-level design is actually a lot simpler.

Among other things, I merged the concepts of call-return and promise-resolve. (Which, admittedly, CapTP was doing it that way before I even designed Cap'n Proto. It was a complete mistake on my part to turn them into two separate concepts in Cap'n Proto, but it seemed to make sense at the time.)

What I'd like to do is go back and revise the Cap'n Proto protocol to use a similar design under the hood. This would make no visible difference to applications (they'd still use schemas), but the state machine would be much simpler, and easier to port to more languages.

reply
izzylan
2 days ago
[-]
I was trying to port Cap'n Proto to modern C# as a side project when I was unemployed, since the current implementation is years old and new C# features have been released that would make it much nicer to use.

I love the no-copy serialization and object capabilities, but wow, the RPC protocol is incredibly complex, it took me a while to wrap my head around it, and I often had to refer to the C++ implementation to really get it.

reply
coreload
1 day ago
[-]
Language-specific RPC. At least Cap'n Proto is language-agnostic. ConnectRPC is language-agnostic, web-compatible, and an extended subset of gRPC. I would have difficulty adopting a language-specific RPC implementation.
reply
kentonv
1 day ago
[-]
The protocol itself is really only language-specific to a similar extent that JSON is language-specific. Which you can totally argue it is, but also people have figured out how to use it in lots of languages other than JavaScript.

I think Cap'n Web could work pretty well in Python and other dynamically-typed languages. Statically-typed would be a bit trickier (again, in the same sense that they are harder to use JSON in), but the answer there might just be to bridge to Cap'n Proto...

reply
coreload
1 day ago
[-]
Retrofitting to some but not all languages is not nearly the same as an intentionally language-agnostic protocol. To the point that calling this "web" is at best misleading.
reply
kentonv
1 day ago
[-]
Which part of the protocol do you think is actually specific to JavaScript?
reply
divan
2 days ago
[-]
> RPC is often accused of committing many of the fallacies of distributed computing.

> But this reputation is outdated. When RPC was first invented some 40 years ago, async programming barely existed. We did not have Promises, much less async and await.

I'm confused. How is this a "protocol" if its core premises rely on very specific implementation of concurrency in a very specific language?

reply
closeparen
2 days ago
[-]
"RPC" originally referred to a programming paradigm where remote calls looked just like any other method calls, and it might not even be any of the programmer's business whether they're implemented in-process or on another machine. This obviously required wire protocols, client and server libraries, etc. to implement.

There's been a renaissance in the tools, but now we mainly use them like "REST" endpoints with the type signatures of functions. Programming language features like Future and Optional make it easier to clearly delineate properties like "this might take a while" or "this might fail" whereas earlier in RPC, these properties were kind of hidden.

reply
kiitos
1 day ago
[-]
mm, i think you're describing corba, not rpc in general
reply
closeparen
1 day ago
[-]
CORBA is trippier than that. A client’s request could include elements not normally serializable, like callbacks. A server could provide an object in response to your query and then continue mutating it, with the mutations reflected (effectively) in your address space, without your knowledge or participation.
reply
kiitos
29 minutes ago
[-]
I am not really sure what you're talking about

RPC is "remote procedure call", emphasis on "remote", meaning you always necessarily gonna be serializing/deserializing the information over some kind of wire, between discrete/different nodes, with discrete/distinct address spaces

a client request by definition can't include anything that can't be serialized, serialization is the ground truth requirement for any kind of RPC...

a server doesn't provide "an object" in response to a query, it provides "a response payload", which is at most a snapshot of some state it had at the time of the request, it's not as if there is any expectation that this serialized state is gonna be consistent between nodes

reply
kentonv
1 day ago
[-]
That's exactly what Cap'n Web does...
reply
kentonv
2 days ago
[-]
What do you mean? Async programming exists in tons of languages. Just off the top of my head, I've used async/await in JavaScript, C++, Python, Rust, C#, ...

Anyway, the point here is that early RPC systems worked by blocking the calling thread while performing the network request, which was obviously a terrible idea.

reply
chao-
2 days ago
[-]
Reminds me of the old "MongoDB is Web Scale" series of comedy videos:

https://youtu.be/bzkRVzciAZg

Some friends and I still jokingly troll each other in the vein of these, interjecting with "When async programming was discovered in 2008...", or "When memory safe compiled languages were invented in 2012..." and so forth.

reply
afiori
2 days ago
[-]
Async/await became ergonomic and widespread only recently. I am sure there were async systems in the '80s, but, for example, Node.js's focus on non-blocking I/O changed how a lot of people thought about servers and concurrency (whether Node was first is almost irrelevant).

Often, the moment something is discovered or invented is far less influential[1] than the moment it jumps on the hype train.

[1] the discovery is very important for historical and epistemological reasons of course, rewriting the past is bad

reply
frollogaston
1 day ago
[-]
It's not a programming paradigm shift, more of a change to how runtimes work. We want to avoid the overhead of kernel threads in servers, and async/await on top of an event loop is a convenient way to do that, like in JS, Rust, and now Python.

Meanwhile Go doesn't have async/await and never will because it doesn't need it; it does greenthreading instead. Java has that too now.

Either way, your code waits on IO like before and does other work while it waits. But instead of the kernel doing the context switching, your runtime does something analogous at a higher layer.

reply
kentonv
1 day ago
[-]
I disagree that async/await is purely about avoiding overhead of kernel threads. Kernel threads are actually not that expensive these days. You can have a server with 10,000 threads, no problem.

The problem is synchronization becomes extremely hard to reason about. With event loop concurrency, each continuation (callback) becomes effectively a transaction, in which you don't need to worry about anything else modifying your state out from under you. That legitimately makes a lot of things easier.

The Cloudflare Workers runtime actually does both: There's a separate thread for each connection, but within each thread there's an event loop to handle all the concurrent stuff relating to that one connection. This works well because connections rarely need to interact with each other's state, but they need to mess with their own state constantly.

(Actually we have now gone further and stacked a custom green-threading implementation on top of this, but that's really a separate story and only a small incremental optimization.)

reply
catern
1 day ago
[-]
I totally agree with your framing of the value of async/await, but could you elaborate more on why you think that this behavior (which I would call "cooperative concurrency") is important for (ocap?) RPC systems? It seems to me that preemptive concurrency also suffices to make RPC viable. Unless you just feel that preemptive concurrency is too hard, and therefore not workable for RPC systems?
reply
kentonv
1 day ago
[-]
Almost all ocap systems seem to use event loops -- and many of the biggest ocap nerds I know are also the biggest event loop nerds I know. I'm not actually sure if this is a coincidence or if there's something inherent that makes it necessary to pair them.

But one thing I can't figure out: What would be the syntax for promise pipelining, if you aren't using promises to start with?

reply
catern
1 day ago
[-]
>What would be the syntax for promise pipelining, if you aren't using promises to start with?

Oh, great point! That does seem really hard, maybe even intractable. That's definitely a reason to like cooperative concurrency, huh...

Just to tangent even further, but some ideas:

- Do it the ugly way: add an artificial layer of promises in an otherwise pre-emptive, direct-style language. That's just, unfortunately, quite ugly...

- Use a lazy language. Then everything's a promise! Some Haskell optimizations feel kind of like promise pipelining. But I don't really like laziness...

- Use iterator APIs; that's a slightly less artificial way to add layers of promises on top of things, but still weird...

- Punt to the language: build an RPC protocol into the language, and promise pipelining as a guaranteed optimization. Pretty inflexible, and E already tried this...

- Something with choreographic programming and modal-types-for-mobile-code? Such languages explicitly track the "location" of values, and that might be the most natural way to represent ocap promises: a promise is a remote value at some specific location. Unfortunately these languages are all still research projects...

reply
frollogaston
1 day ago
[-]
It's true that JS await is kinda like releasing a lock, but otherwise, you'd just use a mutex whenever you access shared state. Which is rare as you said, and also easy to enforce in various langs nowadays.
reply
kentonv
1 day ago
[-]
I said that shared state between connections is rare, but shared state within a connection is extremely common. And there are still multiple concurrent things going on within that connection context, requiring some concurrency mechanism. Locking mutexes everywhere sounds like a nightmare to me.
reply
frollogaston
1 day ago
[-]
Ah I see. Well that is typically just fan-out-fan-in like "run these 4 SQL queries and RPCs in parallel and collect responses," nothing too complicated since the shared resources like the DB handle are usually thread-safe. It works out fine in Go and Java, even though I have unrelated reasons to avoid Go.
reply
branko_d
1 day ago
[-]
“Running 4 SQL queries in parallel” is not thread-safe if done in separate transactions, and on data that is not read-only.

If some other transaction commits at just the wrong time, it could change the result of some of these queries but not all. The results would not be consistent with each other.

reply
frollogaston
23 hours ago
[-]
Thread-safe just means that the threading by itself doesn't break anything. The race condition you're describing is outside this scope and would happen the same in a single-threaded event loop.

Btw if you really want consistent multi reads, some DBMSes support setting a read timestamp, but the common ones don't.

reply
branko_d
11 hours ago
[-]
> would happen the same in a single-threaded event loop

Well...if you implemented a relational DBMS server without using threads. To my knowledge, no such DBMS exists, so the distinction seems rather academic.

> Btw if you really want consistent multi reads, some DBMSes support setting a read timestamp, but the common ones don't.

Could you elaborate? I can't say I heard of that mechanism. Perhaps you are referring to something like Oracle flashback queries or SQL Server temporal tables?

Normally, I'd use MVCC-based "snapshot" transaction isolation for consistency between multiple queries, though they would need to be executed serially.

reply
kentonv
1 day ago
[-]
The Cloudflare Workers runtime is 1000x more complicated than your average web application. :)
reply
divan
1 day ago
[-]
I actually hate the async/await approach to concurrency and avoid it as much as I can.

My mental model is that it's the caller who decides how a call should be executed (synchronously or asynchronously). A synchronous call is when the caller waits for completion/error; an asynchronous call is when the caller puts the call in the background (whatever that means in that language/context) and handles the results later. The CSP concurrency model [1] is the closest fit here.

It's not the function's place to decide how the caller should deal with it. This frustration was partly described in the viral article "What color is your function?" [2], but my main rant about this concurrency approach is that it doesn't match well how we think and reason about concurrent processes, and it requires cognitive gymnastics to reason about relatively simple code.

Seeing "async/await/Promises/Futures" being a justification of a "protocol" makes little sense to me. I can totally get that they reimagined how to do RPC with first-class async/await primitives, but that doesn't make it a network "protocol".

[1] https://en.wikipedia.org/wiki/Communicating_sequential_proce...

[2] https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...

reply
josephg
1 day ago
[-]
I love this about sel4. Sel4 defines a capability based API between processes, and the invoking functions have both synchronous and asynchronous variants. (Ie, send, sendAsync, recv, recvAsync, etc). How you want to use any remote function is up to you!
reply
pests
1 day ago
[-]
Can’t you just write everything default-async and then if you want sync behavior just await immediately?
reply
afiori
1 day ago
[-]
that is terrible for performance and some operations have external requirements to be sync
reply
bern4444
1 day ago
[-]
This looks awesome, I had two questions:

Is there a structured concurrency library being used to manage the chained promise calls and the lazy evaluation (i.e. when the final promise result is actually awaited) of the chained functions?

If an await call is never added, would function calls continue to build up, taking up more and more memory? I imagine the system would return an error and clear out the stack of calls before it became overwhelmed; what would these errors look like, if they do indeed exist?

reply
kentonv
1 day ago
[-]
> Is there a structured concurrency library being used to manage the chained promise calls

Cap'n Web has no dependencies at all. All the chaining is implemented internally. Arguably, this is the main thing the library does; without promise chaining you could cut out more than half the code.

> If an await call is never added, would function calls continue to build up taking up more and more memory

Yes. I recommend implementing rate limits and/or per-session limits on expensive operations. This isn't something the library can do automatically since it has no real idea how expensive each thing is. Note you can detect when the client has released things by putting disposers on your return values, so you can keep count of the resources the client is holding.
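
A minimal sketch of the disposer idea (class and method names here are made up; `RpcTarget` is the library's marker class as elsewhere in this thread):

    let liveSessions = 0;

    class ApiServer extends RpcTarget {
      openSession() {
        if (liveSessions >= 100) throw new Error("too many open sessions");
        liveSessions++;
        return new Session();
      }
    }

    class Session extends RpcTarget {
      // Called automatically when the client disposes the stub or the
      // RPC session is torn down, so the count stays accurate.
      [Symbol.dispose]() { liveSessions--; }
    }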

reply
matlin
2 days ago
[-]
More niche use case but this would be awesome for communicating between contexts in web app (e.g. web worker, iframe, etc)
reply
kentonv
2 days ago
[-]
That's a use case I'm actually already using it for. :)

That's why the MessagePort transport is included.

reply
vmg12
2 days ago
[-]
I see that it supports websockets for the transport layer, is there any support for two way communication?

edit: was skimming the github repo https://github.com/cloudflare/capnweb/tree/main?tab=readme-o...

and saw this which answers my question:

> Supports passing functions by reference: If you pass a function over RPC, the recipient receives a "stub". When they call the stub, they actually make an RPC back to you, invoking the function where it was created. This is how bidirectional calling happens: the client passes a callback to the server, and then the server can call it later.

> Similarly, supports passing objects by reference: If a class extends the special marker type RpcTarget, then instances of that class are passed by reference, with method calls calling back to the location where the object was created.

Gonna skim some more to see if i can find some example code.
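
In the meantime, my rough guess at the bidirectional shape, based on the readme quote above (method names are made up):

    // Client: pass a plain function over RPC; the server receives a stub.
    await api.subscribe((event: string) => console.log("got", event));

    // Server: calling the stored stub later makes an RPC back to the client.
    class ApiServer extends RpcTarget {
      private listeners: Array<(event: string) => void> = [];
      subscribe(cb: (event: string) => void) { this.listeners.push(cb); }
      broadcast(event: string) { for (const cb of this.listeners) cb(event); }
    }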

reply
pests
1 day ago
[-]
It seems peers are equal and there is no “server” or “client” role, just what you import/export from each.
reply
jimmyl02
2 days ago
[-]
This seems like a more feature-complete / polished version of JSON RPC?

The part that's most exciting to me is actually the bidirectional calling. Having set this up before via JSON RPC / a custom protocol, I found the experience super "messy", and I'm looking forward to a framework making it all better.

Can't wait to try it out!

reply
kentonv
2 days ago
[-]
Yeah, JSON RPC doesn't support the pass-by-reference and lifecycle management stuff. You just have a static list of top-level functions you can call. This makes a pretty big difference in what kinds of APIs you can express.

OTOH, JSON RPC is extremely simple. Cap'n Web is a relatively complicated and subtle underlying protocol.

reply
crabmusket
2 days ago
[-]
> You just have a static list of top-level functions you can call.

Actually the author of JSON RPC suggested that method names could be dynamic, there's nothing in the spec preventing that.

https://groups.google.com/g/json-rpc/c/vOFAhPs_Caw/m/QYdeSp0...

So you could definitely build a cursed object/reference system by packing stuff into method names if you wanted. I doubt any implementations would allow this.

But yes, JSON RPC is very minimal and doesn't really offer much.

reply
kentonv
2 days ago
[-]
Right, your methods in JSON RPC could be dynamic. JSON RPC really doesn't specify anything, so you can do anything with it. But you need conventions around that, like how does a client find out that the server exported new methods, and how does the client indicate that it is done with them? That's exactly what Cap'n Web is all about -- defining those conventions in a usable way.
reply
jzig
1 day ago
[-]
Hey kentonv, this looks really neat. How would you go about writing API documentation for something like this? I really like writing up OpenAPI YAML documents for consumers of APIs I write so that any time someone asks a question, "How do I get XYZ?" I can just point them to e.g. the SwaggerUI. But I'm struggling to understand how that would work here.
reply
kentonv
1 day ago
[-]
For public-facing APIs, I would strongly recommend writing the TypeScript interfaces in a separate file from any implementation, and using JSDoc comments. There are various documentation generators that can turn that into a web page, though personally as a user I tend to prefer to just look at the actual TS file.
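
For example, something like this (reusing the MyApi example from the post):

    // api.ts -- interface only, no implementation; doubles as the API docs.

    /** Public API exposed over Cap'n Web. */
    export interface MyApi {
      /**
       * Greets a user.
       * @param name - display name of the caller
       * @returns a greeting string
       */
      hello(name: string): Promise<string>;
    }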
reply
deepsun
1 day ago
[-]
The main problem was always the same -- all the RPC libraries are designed to hide where the round-trip happens, but in real world you always want to know where and how the round-trip happens.

Just read about Cap'n Web array .map() [1] -- it's hard to understand where the round-trip is. And that is not a feature, that's a bug -- in reality you want to easily tell what the code does, not hide it.

[1] https://blog.cloudflare.com/capnweb-javascript-rpc-library/#...

reply
kentonv
1 day ago
[-]
The round trip happens when you `await` the result.

You can tell that promise pipelining isn't adding any round trips because you set it all up in a series of statements without any `await`s. At the end you do one `await`. That's your round trip.
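
For example (method names here are made up):

    // None of these lines is a round trip; each one just records a
    // pipelined call on a promise.
    let user = api.authenticate(token);
    let friends = user.listFriends();
    let names = friends.map(f => f.name);

    // The single round trip happens here:
    console.log(await names);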

reply
degamad
1 day ago
[-]
You say "round trip", but you mean "return trip", right?

Because if I understand correctly, you don't queue the requests and then perform a single request/response cycle (a "round trip"), you send a bunch of requests as they happen with no response expected, then when an await happens, you send a message saying "okay, that's all, please send me the result" and get a response.

reply
kentonv
1 day ago
[-]
In HTTP batch mode, they're all sent as a batch.

In WebSocket mode, yes, you are sending messages with each call. But you're not waiting for anything before sending the next message. It's not a round trip until you await something. As far as round trips are concerned, there is really no difference between sending multiple messages vs. a single batch message, if you are ultimately only waiting for one reply at the end.

reply
crabmusket
1 day ago
[-]
No, the requests are queued and sent as a batch.
reply
degamad
1 day ago
[-]
Ah, okay, that's much better then... In principle, that then allows the server to aggregate or optimise the operations rather than performing them as written. While that might not be relevant for version 1, it's useful to have for later.
reply
kentonv
1 day ago
[-]
What does server-side aggregation for optimization have to do with round trips, though?
reply
youngbum
18 hours ago
[-]
Yesterday, I migrated my web worker code from Comlink to CapnWeb. I had extensive experience with Cloudflare Worker bindings, and as mentioned in the original post, they were quite similar.

Everything appears to be functioning smoothly, but I do miss the ‘transfer’ feature in Comlink. Although it wasn’t a critical feature, it was a nice one.

The best aspect of CapnWeb is that we can reuse most of the code related to clients, servers, and web workers (including Cloudflare Workers).

reply
jauntywundrkind
2 days ago
[-]
I really dig the flexibility of transport. Having something that works over postMessage is totally clutch!!

> Similarly, supports passing objects by reference: If a class extends the special marker type RpcTarget, then instances of that class are passed by reference, with method calls calling back to the location where the object was created.

Can this be relaxed? Having to design the object model ahead of time for RpcTarget is constraining. If we could just attach ThingClass.prototype[Symbol.for('RpcTarget')] = true, there would be a lot more flexibility: less need to design explicitly for RpcTarget, and the ability to use RpcTarget with the objects/classes of 3rd-party libraries.

reply
kentonv
2 days ago
[-]
The fear here is that if a class wasn't explicitly designed to be an RPC interface, then it could very easily offer functionality that isn't safe to expose over a security boundary with RPC. Normally, JavaScript classes do not expect their APIs to be security boundaries.

With that said, I do think we ought to support `new RpcStub(myObject)` to explicitly create a stub around an arbitrary class, even if it doesn't extend `RpcTarget`. It would be up to the person writing the `new RpcStub` invocation to verify it's safe.

reply
ianbicking
2 days ago
[-]
Couple random thoughts:

I'm trying to see if there's something specifically for streaming/generators. I don't think so? Of course you can use callbacks, but you have to implement your own sentinel to mark the end, and other little corner cases. It seems like you can create a callback to an anonymous function, but then the garbage collector probably can't collect that function?

---

I don't see anything about exceptions (though Error objects can be passed through).

---

Looking at array mapping: https://blog.cloudflare.com/capnweb-javascript-rpc-library/#...

I get how it works: remotePromise.map(callback) will invoke the callback to see how it behaves, then make it behave similarly on the server. But it seems awfully fragile... I am assuming something like this would fail (in this case probably silently losing the conditional):

    friendsPromise.map(friend => ({friend, lastStatus: friend.isBestFriend ? api.getStatus(friend.id) : null}))
---

The array escape is clever and compact: https://blog.cloudflare.com/capnweb-javascript-rpc-library/#...

---

I think the biggest question I have is: how would I apply this to my boring stateless-HTTP server? I can imagine something where there's a worker that's fairly simple and neutral that the browser connects to, and proxies to my server. But then my server can also get callbacks that it can use to connect back to the browser, and put those callbacks (capability?) into a database or something. Then it can connect to a worker (maybe?) and do server-initiated communication. But that's only good for a session. It has to be rebuilt when the browser network connection is interrupted, or if the browser page is reloaded.

I can imagine building that on top of Cap'n Web, but it feels very complicated and I can equally imagine lots of headaches.

reply
kentonv
2 days ago
[-]
You can define a stream like:

    interface Stream extends RpcTarget {
      write(chunk): void;
      end(): void;
      [Symbol.dispose](): void;
    }
Note that the dispose method will be called automatically when the caller disposes the stub or when they disconnect the RPC session. The `end()` method is still useful as a way to distinguish a clean end vs. an abort.

In any case, you implement this interface, and pass it over the RPC connection. The other side can now call it back to write chunks. Voila, streaming.

That said, getting flow control right is a little tricky here: if you await every `write()`, you won't fully utilize the connection, but if you don't await, you might buffer excessively. You end up wanting to count the number of bytes that aren't acknowledged yet and hold off on further writes if it goes over some threshold. Cap'n Proto actually has built-in features for this, but Cap'n Web does not (yet).
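
Here's a minimal sketch of that kind of app-level flow control (this assumes `stream` is a stub for the Stream interface above, so each write() returns a promise over RPC; the window size is arbitrary):

    async function pump(stream: any, chunks: Uint8Array[], windowBytes = 64 * 1024) {
      const inFlight: { bytes: number; done: Promise<unknown> }[] = [];
      let inFlightBytes = 0;
      for (const chunk of chunks) {
        // Too many unacknowledged bytes? Wait for the oldest write to land.
        while (inFlightBytes >= windowBytes) {
          const oldest = inFlight.shift()!;
          await oldest.done;
          inFlightBytes -= oldest.bytes;
        }
        inFlight.push({ bytes: chunk.byteLength, done: Promise.resolve(stream.write(chunk)) });
        inFlightBytes += chunk.byteLength;
      }
      for (const w of inFlight) await w.done;  // drain remaining writes
      stream.end();
    }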

Workers RPC actually supports sending `ReadableStream` and `WritableStream` (JavaScript types) over RPC. I'd like to support that in Cap'n Web, too, but haven't gotten around to it yet. It'd basically work exactly like above, but you get to use the standard types.

---------------------

Exceptions work exactly like you'd expect. If the callee throws an exception, it is serialized, passed back to the caller, and used to reject the promise. The error also propagates to all pipelined calls that derive from the call that threw.

---------------------

The mapper function receives, as its parameter, an `RpcPromise`. So you cannot actually inspect the value, you can only pipeline on it. `friend.isBestFriend ?` won't work, because `friend.isBestFriend` will resolve as another RpcPromise (for the future property). I suppose that'll be considered truthy by JavaScript, so the branch will always evaluate true. But if you're using TypeScript, note that the type system is fully aware that `friend` is type `RpcPromise<Friend>`, so hopefully that helps steer you away from doing any computation on it.

reply
mythmon_
1 day ago
[-]
I'll definitely be watching out for more built-in streaming support. Being able to throw the standard types directly over the wire and trust that the library will handle optimally utilizing the connection would make this the RPC library that I've been looking for all year.
reply
ianbicking
1 day ago
[-]
Re: RpcPromise, I'm pretty sure all logical operations will result in unexpected results. TypeScript isn't going to complain about using RpcPromise as a boolean.

Maybe the best solution is just an eslint plugin. Like this plugin basically warns for the same thing on another type: https://github.com/bensaufley/eslint-plugin-preact-signals

Overloading .map() does feel a bit too clever here, as it has this major difference from Array.map. I'd rather see it as .mapRemote() or something that immediately sticks out.

I can imagine a RpcPromise.filterRemote(func: (p: RPCPromise) => RPCPromise) that only allows filtering on the truthiness of properties; in that case the types really would save someone from confusion.

I guess if the output type of map was something like:

    type MapOutput = RpcPromise | MapOutput[] | Record<string, MapOutput>;
    map(func: (p: RpcPromise) => MapOutput)
... then you'd catch most cases, because there's no good reason to have any constant/literal value in the return value. Almost every case where there's a non-RpcPromise value is likely some case where a value was calculated in a way that won't work.

Though another case occurs to me that might not be caught by any of this:

    result = aPromise.map(friend => ({...friend, nickname: getNickname(friend.id, userId)}))
The spread operator is a pretty natural thing to use in this case, and it probably doesn't work on an RpcPromise?
reply
unshavedyak
2 days ago
[-]
What would this look like for other language backends to support? Eg would be neat if Rust (my webservers) could support this on the backend

edit: Downvoted, is this a bad question? The title is generically "web servers", obviously the content of the post focuses primarily on TypeScript, but i'm trying to determine if there's something unique about this that means it cannot be implemented in other languages. The serverside DSL execution could be difficult to impl, but as it's not strictly JavaScript i imagine it's not impossible?

reply
kentonv
2 days ago
[-]
I'm hoping the answer will be:

* Use Cap'n Proto in your Rust backend. This is what you want in a type-safe language like Rust: generated code based on a well-defined schema.

* We'll build some sort of proxy that, given a Cap'n Proto schema, converts between Cap'n Web and Cap'n Proto. So your frontend can speak Cap'n Web.

But this proxy is just an idea for now. No idea if or when it'll exist.

reply
indolering
1 day ago
[-]
Do you feel like we are reaching the final form for RPC protocols yet?
reply
bryanlarsen
2 days ago
[-]
Now you've likely been downvoted because you're complaining about downvotes. Which is too bad because you've elicited a great answer from the author.

It's usually best to ignore downvotes. Downvoted comments are noticeably grey. If people feel that's unfair, that'll attract upvotes in my experience.

reply
unshavedyak
2 days ago
[-]
> Now you've likely been downvoted because you're complaining about downvotes.

Fwiw i think it was only once, and i was upvoted after mentioning it. You're right i could have worded it as something more ambiguous, aka "it seems this is unpopular" or w/e, but my edit was in reply to someones feedback (the downvote), so i usually mention it.

No complaint, just a form of wordless-feedback that i was attempting to respond to. Despite such actions being against HN will heh.

reply
fitzn
2 days ago
[-]
Just making sure I understand the "one round trip" point. If the client has chained 3 calls together, that still requires 3 messages sent from the client to the server. Correct?

That is, the client is not packaging up all its logic and sending a single blob that describes the fully-chained logic to the server on its initial request. Right?

When I first read it, I was thinking it meant 1 client message and 1 server response. But I think "one round trip" more or less means "1 server message in response to potentially many client messages". That's a fair use of "1 RTT", but it took me a moment to understand.

Just to make that distinction clear from a different angle, suppose the client were _really_ _really_ slow and it did not send the second promise message to the server until AFTER the server had computed the result for promise1. Would the server have already responded to the client with the result? That would be a way to incur multiple RTTs, albeit the application wouldn't care since it's bottlenecked by the client CPU, not the network in this case.

I realize this is unlikely. I'm just using it to elucidate the system-level guarantee for my understanding.

As always, thanks for sharing this, Kenton!

reply
kentonv
2 days ago
[-]
To chain three calls, the client will send three messages, yes. (At least when using the WebSocket transport. With the HTTP batch transport, the entire batch is concatenated into one HTTP request body.)

But the client can send all three messages back-to-back without waiting for any replies from the server. In terms of network communications, it's effectively the same as sending one message.

reply
fitzn
2 days ago
[-]
Yep - agreed. Thanks!
reply
Elucalidavah
2 days ago
[-]
> the client is not packaging up all its logic and sending a single blob that describes the fully-chained logic to the server on its initial request. Right

See "But how do we solve arrays" part:

> > .map() is special. It does not send JavaScript code to the server, but it does send something like "code", restricted to a domain-specific, non-Turing-complete language. The "code" is a list of instructions that the server should carry out for each member of the array

reply
benpacker
2 days ago
[-]
My understanding is that your first read is right and your current understanding is wrong.

The client sends the 3 separate calls over in one message (or one message describing some computation: run this function with the result of this function), and the server responds with one payload.

reply
random3
2 days ago
[-]
It's inspired by, and created by a co-author of, [Cap'n Proto](https://capnproto.org), which is also what the OCapN name (referenced in a separate comment) refers to.

Cap'n Proto is inspired by ProtoBuf, protobuf has gRPC and gRPC web.

At my last startup, we used ProtoBuf/gRPC/gRPC-web both in the backends and for public endpoints powering React / TS UIs. It worked great, particularly with the GCP Kubernetes infrastructure; basically, both the API and operational aspects were non-problems. However, navigating the dumpster fire around protobuf, gRPC, and gRPC-web, with the lack of community leadership from Google, was a clusterfuck.

This said, I'm a bit at a loss with the meaning of schemaless. You can have different approaches wrt schemas (see Avro vs ProtoBuf), but otherwise you can't fundamentally eschew schemas/types. It's purely information tied to a communication channel that needs to live somewhere, whether that's explicit, implicit, handled by the RPC layer, passed to the type system, or, worse, pushed all the way to the user/dev. Moreover, schemas tend to evolve, and any protocol needs to take that into account.

Historically, ProtoBuf has done a good job managing the various tradeoffs here, but I have no experience using Cap'n Proto. I've seen mostly good stuff about it, so perhaps I'm just missing something.

reply
kentonv
2 days ago
[-]
Of course, all programming language APIs even in dynamic languages have some implied type (aka schema). You can't write code against an API without knowing what methods it provides, what their inputs and outputs are, etc. -- and that's a schema, whether or not it's actually written out as such.

But Cap'n Web itself does not need to know about any of that. Cap'n Web just accepts whatever method call you make, sends it to the other end of the connection, and attempts to deliver it. The protocol itself has no idea if your invocation is valid or not. That's what I mean by "schemaless" -- you don't need to tell Cap'n Web about any schemas.

With that said, I strongly recommend using TypeScript with Cap'n Web. As always, TypeScript schemas are used for build-time type checking, but are then erased before runtime. So Cap'n Web at runtime doesn't know anything about your TypeScript types.

reply
random3
2 days ago
[-]
Thank you. So indeed it's, as correctly described, schemaless, i.e. schema-agnostic, which falls into "schema responsibility being passed to the user/dev" (I should have picked up on what that means when writing my comment).

So it's basically Stubby/gRPC.

From a strictly RPC perspective this makes sense (I guess to the same degree that gRPC is agnostic to the protobuf serialization scheme, which IIRC is the case; I'm also thinking Stubby was called that for the same reason).

However, that would mean there's:

1. A ton of responsibility on the user/dev, i.e. the same amount that prompted protobuf to exist, after all.

You basically have the (independent) problem of clients, servers, and data (in flight, or even persisted) that get different versions of the schema.

2. A missed implicit compression opportunity? IDK to what extent this actually happens on the fly or not.

reply
kentonv
1 day ago
[-]
> So it's basically Stubby/gRPC.

Stubby / gRPC do not support object capabilities, though. I know that's not what you meant but I have to call it out because this is a huuuuuuuge difference between Cap'n Proto/Web vs. Stubby/gRPC.

> a ton of responsibility on the user/dev —i.e. the same amount that prompted protobuf to exist, afterall.

In practice, people should use TypeScript to specify their Cap'n Web APIs. For people working in TypeScript to start with, this is much nicer than having to learn a separate schema format. And the protocol evolution / compatibility problem becomes the same as evolving a JavaScript library API with source compatibility, which is well-understood.

> a missied implicit compression opportunity? IDK to what extent this actually happens on the fly or not.

Don't get me wrong, I love binary protocols for their efficiency.

But there are a bunch of benefits to just using JSON under the hood, especially in a browser.

Note that WebSocket in most browsers will automatically negotiate compression, where the compression context is preserved over the whole connection (not just one message at a time), so if you are sending the same property names a lot, they will be compressed out.

reply
Degorath
1 day ago
[-]
Not the person you were discussing with, but I have to add that to me the main benefit of using Stubby et al. was exactly the schema that was so nicely searchable.

I currently work in a place where the server-to-server API clients are generated based on TypeScript API method return types, and it's... not great. In reality, the types quickly devolve into "extends" chains over a lot of internal types that are often difficult to reason about.

I know that it's possible for the ProtoBuf types to also push their tendrils quite deep into business code, but my personal experience has been a lot less frustrating with that than the TypeScript return type being generated into an API client.

reply
dannyobrien
2 days ago
[-]
I think rather than related to each other, Cap'n and OCapN are both references to object capabilities, aka ocaps. (Insert joke about unforgeable references here)
reply
chrisweekly
2 days ago
[-]
100% agreed (as anyone sane who's tried to use it will): grpc-web is a trainwreck.
reply
electric_muse
2 days ago
[-]
This is such a useful pattern.

I’ve ended up building similar things over and over again. For example, simplifying the worker-page connection in a browser or between chrome extension “background” scripts and content scripts.

There’s a reason many prefer “npm install” on some simple sdk that just wraps an API.

This also reminds me a lot of MCP, especially the bi-directional nature and capability focus.

reply
evansd
2 days ago
[-]
I need time to try this out for real, but the simplicity/power ratio here looks like it could be pretty extraordinary. Very exciting!

Tiny remark for @kentonv if you're reading: it looks like you've got the wrong code sample immediately following the text "Putting it together, a code sequence like this".

reply
kentonv
2 days ago
[-]
Ugh, looks like a copy-pasto when moving the blog into the CMS, will get that fixed, thanks.

The code was supposed to be:

    let namePromise = api.getMyName();
    let result = await api.hello(namePromise);

    console.log(result);
reply
sgarland
1 day ago
[-]
Tangential from a discussion in TFA about GraphQL:

> One benefit of GraphQL was to solve the “waterfall” problem of traditional REST APIs by allowing clients to ask for multiple pieces of data in one query. For example, instead of making three sequential HTTP calls:

    GET /user
    GET /user/friends
    GET /user/friends/photos
…you can write one GraphQL query to fetch it all at once.

Or you could have designed a schema to allow easy tree traversal. Or you could use a recursive CTE.

reply
pests
1 day ago
[-]
> a scheme to allow easy tree traversal

Huh that sounds a lot like graphql

reply
sgarland
1 day ago
[-]
GraphQL isn’t changing your schema, it’s issuing queries to satisfy your request. The queries are in no way guaranteed to be optimal, or well-suited for your schema or indices.
reply
JensRantil
1 day ago
[-]
I struggle to see why it would be appealing to use an RPC framework which is only targeted at TypeScript (and, I guess, JavaScript). The point, for me, of an RPC framework is that it should be platform-agnostic to allow for reuse and modularity.
reply
crabmusket
2 days ago
[-]
Really nice to have something I could potentially use across the whole app. I've been looking into things I can use over HTTP, websockets, and also over message channels to web workers. I've usually ended up implementing something that rounds to JSON-RPC (i.e. just use an `id` per request and response to tie them together). But this looks much sturdier.

Building an operation description from the callback inside the `map` is wild. Does that add much in the way of restrictions programmers need to be careful of? I could imagine branching inside that closure, for example, could make things awkward. Reminiscent of the React hook rules.

reply
kentonv
2 days ago
[-]
The .map() callback receives as its input an RpcPromise, not the actual value. You can't do any computation (including branching) on an RpcPromise, the only thing you can do is pipeline on it. Since the map callback must be synchronous, you can't await the promise either.

So it turns out it's actually not easy to mess up in a map callback. The main thing you have to avoid is side effects that modify stuff outside the callback. If you do that, the effect you'll see is those modifications only get applied once, rather than N times. And any stubs you exfiltrate from the callback simply won't work if called later.
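
Concretely (method names are illustrative):

    // Fine: the callback only pipelines on the placeholder promise.
    let statuses = friendsPromise.map(friend => api.getStatus(friend.id));

    // Not fine: `friend.isBestFriend` is itself an RpcPromise, which is
    // always truthy, so the branch is fixed at record time instead of
    // being evaluated per element.
    let broken = friendsPromise.map(friend =>
        friend.isBestFriend ? api.getStatus(friend.id) : null);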

reply
crabmusket
2 days ago
[-]
Yeah that's what I meant, reading/writing variables captured into the callback. But that sounds like the kind of code that would be easy to sniff out in a code review, or write lints for.
reply
osigurdson
1 day ago
[-]
>> If a class extends the special marker type RpcTarget, then instances of that class are passed by reference, with method calls calling back to the location where the object was created.

This is like .NET Remoting. Suggest resisting the temptation to use this kind of stuff. It gets very hard to reason about what is going on.

reply
spullara
1 day ago
[-]
Which was directly based on Java RMI. Generally, not thought of as great ideas.
reply
gwbas1c
1 day ago
[-]
I've built similar things in the past. (And apologies for merely skimming the article.)

In general, I worry that frameworks like this could be horribly complex; breaking bugs (in the framework) might not show up until late in your development cycle. This could mean that you end up having to rewrite your product sooner than you would like, or otherwise "the framework gets in the way" and cripples product development.

Some things that worry me:

1: The way that callbacks are passed through RPC. This requires a lot of complexity in the framework to implement.

2: Callbacks passed through RPC imply server-side state. I didn't read in detail how this is implemented, but server-side state always introduces a lot of complexity in code and hosting.

---

Personally, if that much server-side state is involved, I think it makes more sense to operate more like a dumb terminal and do more HTML rendering on the server. I'm a big fan of how server-side Blazor does this, but that does require drinking the C# Kool-Aid. On the other hand, server-side Blazor is very mature, has major backing, and is built into two IDEs.

reply
gwbas1c
1 day ago
[-]
I should have added: One of the advantages of server-side Blazor (in C#) is that the server essentially returns partial bits of HTML for incremental page updates. This allows you to write UI code without needing to jump through all the hoops of creating a full API between your UI and server.

(I.e., you can write quick-and-dirty pages where the UI directly queries the database. Useful for one-offs, prototypes, internal admin pages, "KISS" applications, etc., etc. I.e., any situation where it's okay for the browser UI to be tightly coupled to your data model.)

reply
osigurdson
1 day ago
[-]
>> let namePromise = batch.getMyName(); let result = await batch.hello(namePromise);

This is quite interesting. However the abysmal pattern I have seen a number of times is:

    list = getList(...)
    for item in list:
        getItemDetails(item)

Sometimes this is quite hard to undo.

reply
kentonv
1 day ago
[-]
Keep reading, the blog post addresses this.
reply
cbarrick
2 days ago
[-]
What's going on under the hood with that authentication example?

Is the server holding onto some state in memory that this specific client has already authenticated? Or is the API key somehow stored in the new AuthenticatedSession stub on the client side and included in subsequent requests? Or is it something else entirely?

reply
kentonv
2 days ago
[-]
The server constructs a new AuthenticatedSession implementation each time authenticate() is called, and can store the key (or just the authenticated user info) in the server-side object.

This does mean the server is holding onto state, but remember the state only lasts for the lifetime of the particular connection. (In HTTP batch mode, it's only for the one batch. In WebSocket mode, it's for the lifetime of the WebSocket.)
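
A rough sketch of that shape (the helper and field names here are made up):

    class PublicApi extends RpcTarget {
      authenticate(apiKey: string) {
        const user = verifyApiKey(apiKey);      // throws if the key is bad
        return new AuthenticatedSession(user);  // per-connection state lives here
      }
    }

    class AuthenticatedSession extends RpcTarget {
      constructor(private user: User) { super(); }
      whoami() { return this.user.name; }
    }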

reply
cbarrick
2 days ago
[-]
Ah, the bit about it only lasting for the lifetime of the connection was the part I missed. That makes a lot of sense. As does the bit about the state staying on the server side.

Thanks for the explanation!

reply
meindnoch
1 day ago
[-]
What happens if I pass a recursive function to map()?

  let traverse = (dir) => ({
    name: dir.name,
    files: api.ls(dir).map(traverse) // api.ls() returns [] for files
  });
  let tree = api.ls("/").map(traverse);
reply
kentonv
1 day ago
[-]
I believe this will stack-overflow on the client side. The callback is invoked in recording mode synchronously when you call `.map()`. Nested maps are allowed, but this case ends up being infinitely nested, so eventually you're going to hit a stack overflow while trying to do the recording.
reply
comex
1 day ago
[-]
What prevents an attacker from using nested maps to make the server spend exponential amounts of CPU and memory on the response? Is there some kind of limit on the total number of response items?
reply
kentonv
1 day ago
[-]
The application should track resource use and implement limits as needed.

I know that sounds like a cop-out, but this is really true of any protocol, and the RPC protocol itself has no real knowledge of the cost of each operation or how much memory is held, so can't really enforce limits automatically.

reply
meindnoch
1 day ago
[-]
But you could detect such recursion and stop descending on the client side. Then the server could mirror the same recursion on their end.
reply
kentonv
1 day ago
[-]
Yes, perhaps. Particularly if it's the exact same function (by identity). It hadn't occurred to me.
reply
garaetjjte
2 days ago
[-]
I think Cap'n Proto plays with the web platform pretty nicely too... okay, some might say that my webapp, which is mostly written in C++, compiled with Emscripten, and talks to the server with capnp RPC-over-WebSocket, is in fact not playing nice with the web.
reply
mdasen
2 days ago
[-]
This looks really nice. Are there plans to bring support to languages other than JS/TS?
reply
HDThoreaun
2 days ago
[-]
Is this a reference to Cap'n Jazz?
reply
kentonv
2 days ago
[-]
No, in fact this is the first I've heard of Cap'n Jazz.

The name "Cap'n Proto" came from "capabilities and protobuf". The first, never-released version was based on Protobuf serialization. The first public release (way back on April 1, 2013) had its own, all-new serialization.

There's also a pun with it being a "cerealization protocol" (Cap'n Crunch is a well-known brand of cereal).

reply
chrisweekly
2 days ago
[-]
aha, the Cap'n Crunch "cerealization" pun is solid
reply
benburton
1 day ago
[-]
My first thought too. Glad there are other indie rock folks poking around!
reply
Yoric
2 days ago
[-]
That's more or less a dynamically-typed version of what we had with Opalang ~15 years ago, and it worked great. Happy to see that someone else has picked up the idea of sending capabilities, including server->client calls!
reply
kentonv
2 days ago
[-]
FWIW, Cap'n Proto is a statically-typed version of this that has been around since 2013, and is actually used heavily in the implementation of Cloudflare Workers.
reply
Yoric
1 day ago
[-]
Ah, thanks, I wasn't aware of that.

That won't prevent me from bragging, as we released Opalang 1.0 in 2011 :)

reply
kentonv
1 day ago
[-]
I think Mark S Miller beat us both by like 10-20 years with CapTP. ;)
reply
Yoric
1 day ago
[-]
Not on multi-tier web applications, though :)

(but yeah, I was passingly familiar with E! when I designed the capability system of Opalang, so I definitely don't get full bragging rights)

reply
mrbluecoat
2 days ago
[-]
> "Cap'n" is short for "capabilities and"

Learn something new every day

reply
srameshc
1 day ago
[-]
If you have ever had to use gRPC on the web, then you will know how painful it is to make Protobuf work there. I love the simplicity of Cap'n Web https://capnproto.org/language.html ; hopefully this will lead us to better and easier RPC.

Update: Unlike Cap'n Proto, Cap'n Web has no schemas. In fact, it has almost no boilerplate whatsoever. This means it works more like the JavaScript-native RPC system in Cloudflare Workers. https://github.com/cloudflare/capnweb

reply
yayitswei
1 day ago
[-]
One of the authors is the Kenton who built this awesome lan party house: https://lanparty.house/
reply
pzo
2 days ago
[-]
Any idea how it compares to tRPC and oRPC? I wish all such projects had a 'Why' section explaining why the project was needed and what it solves that other projects didn't.
reply
kentonv
2 days ago
[-]
I haven't actually used tRPC nor oRPC.

But my understanding is that neither of them support object-capabilities nor promise pipelining. These are the killer features of Cap'n Web (and Cap'n Proto), which the blog post describes at length.

reply
tanepiper
2 days ago
[-]
Reminds me of dnode (2015) - https://www.npmjs.com/package/dnode
reply
nly
1 day ago
[-]
Last time I used capnProto for RPC I found it an incredibly chatty protocol with tonnes of small memory allocations.
reply
benmmurphy
2 days ago
[-]
Are there security issues with no schemas + callback stubs + a loosely typed language on the server? For example, in the `hello(name)` example the server expects a string, but could the client pass a callback object that is string-like and then use it to try to trick the server into doing something bad?
reply
kentonv
2 days ago
[-]
The protocol explicitly blocks overriding `toString()` (and all other Object.prototype members), as well as `toJSON()`, to prevent the obvious ways that you might accidentally invoke a callback when you weren't expecting to. How else might you invoke a callback by accident?

That said, type checking is called out both in the blog post (in the section on TypeScript) and in the readme (under "Security Considerations"). You probably should use some runtime type checking library, just like you should with traditional JSON inputs.

In the future I'm hoping someone comes up with a way to auto-generate type checks based on TypeScript types.
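
For now, a minimal hand-rolled sketch of that with zod (the schema and the method are illustrative, and I'm assuming the RpcTarget import path here):

    import { z } from "zod";
    import { RpcTarget } from "capnweb";

    class MyApi extends RpcTarget {
      hello(name: unknown) {
        // Validate the untrusted RPC input before touching it.
        const checked = z.string().parse(name); // throws if it isn't a plain string
        return `Hello, ${checked}!`;
      }
    }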

reply
benmmurphy
2 days ago
[-]
I was thinking that if the server does some string operations like `indexOf` on it, maybe that could be an issue.
reply
isaiahballah
1 day ago
[-]
This looks awesome! Does anyone know if this also works in React Native?
reply
kentonv
1 day ago
[-]
I haven't tested it, but I'd be surprised if it doesn't work.

I'd happily accept a PR adding a CI job to make sure it works and we don't break it.

reply
thethimble
1 day ago
[-]
Should work anywhere that supports WS/HTTP + JS so yes
reply
pjmlp
1 day ago
[-]
I am going to be that guy: the way REST is used in 99% of project deployments, it is yet another form of RPC, mostly JSON-RPC with a little help from HTTP verbs to avoid yet another field for the actual message purpose, all nicely wrapped in language-specific SDKs that look like method/function calls.
reply
indolering
1 day ago
[-]
I find the choice of TypeScript to be disappointing. One of the reasons that Cap'n Proto has struggled for market share is the lack of implementations.

Is the overhead for calling into WASM too high for a Rust implementation to be feasible?

A Haxe or Dafny implementation would have let us generate libraries in multiple languages from the same source.

reply
josephg
1 day ago
[-]
There’s no way you could implement this in 10kb of rust. It takes massive advantage of javascript’s dynamism to work. Also, I’m pretty sure ts / js are vastly more popular languages than rust. I suspect this will get a lot more use because it’s typescript.
reply
kentonv
1 day ago
[-]
Yes, pretty much. Just interfacing between the JS app and the Wasm RPC implementation would take more code than the entire RPC implementation in TypeScript.

Also, the audience of this library is very specifically TypeScript developers. If your app is Rust, you'd probably be happier with Cap'n Proto.

reply
ryanrasti
1 day ago
[-]
> I find the choice of TypeScript to be disappointing.

Genuinely curious, is the disappointment because it's limited to the JS/TS ecosystem?

My take is that by going all-in on TypeScript, they get a huge advantage: they can skip a separate schema language and use pure TS interfaces as the source of truth for the API.

The moment they need to support multiple languages, they need to introduce a new complex layer (like Protobuf), which forces design into a "lowest common denominator" and loses the advanced TypeScript features that make the approach so powerful in the first place.

reply
indolering
1 day ago
[-]
You can generate TypeScript schema for Haxe JS output. I'm honestly a bit surprised that TS isn't a supported target!

That could change with some investments. Haxe is a great toolkit to develop libraries in because it reduces the overhead for each implementation. It would be nice to see some commercial entity invest in Haxe or Dafny (which can also enable verification of the reference implementation).

> The moment they need to support multiple languages, they need to introduce a new complex layer (like Protobuf),

So this just won't be used outside of Node servers then?

reply
kentonv
1 day ago
[-]
> So this just won't be used outside of Node servers then?

Well... I imagine / hope it will be used a lot on Cloudflare Workers, which is not Node-based; it has its own custom runtime.

(I'm the author of Cap'n Web and also the lead developer for Cloudflare Workers.)

reply
spullara
1 day ago
[-]
The spiritual descendant of Java RMI.
reply
myflash13
1 day ago
[-]
Brilliantly engineered, but this is solving all the wrong problems. The author implies that this is supposed to be a better GraphQL/REST, but the industry is already moving towards a better solution for that[1]: data sync like ElectricSQL/Turso/LiteFS/RxDB. If you want to collapse the API boundary between server and client so that it "feels like" the server and client are the same, then sync the relevant data so it actually IS the same. Otherwise DON'T pretend it is the same, because you will have badly leaking abstractions. This looks like it breaks all of the assumptions that programmers have about locally running code. Now every time I do a function call() I have to think about how to handle network failures and latency?

What this could've been is a better way to consume external APIs to avoid the SDK boilerplate generation dance. But the primary problems here are access control, potentially malicious clients, and multi-language support, none of which are solved by this system.

In short, if you're working over a network boundary, better keep that explicit. If you want to pretend the network boundary doesn't exist, then let a data sync engine handle the network parts and only write local code. But why would you write code that pretends to be local but is actually over a network boundary? I can't think of a single case where I would want to do that, I'd rather explicitly deal with the network issues so I can clearly see where the boundary is.

[1] https://bytemash.net/posts/i-went-down-the-linear-rabbit-hol...

reply
porridgeraisin
1 day ago
[-]
I really like the idea. Especially the idea where you return different RpcTargets based on the "context", that's really quite nice. Not just for authentication and such, but for representing various things like "completely structurally different output for GET /thingies for admin and users".

The promise-passing lazily evaluated approach is also nice -- any debugging woes are solved by just awaiting and logging before the await -- and it solves composability at the server - client layer. The hackiness of `map()` is unfortunate, but that's just how JS is.

However, I don't see this being too useful without there also being composability at the server - database layer. This is notoriously difficult in most databases. I wonder what the authors / others here think about this.

For an example of what I mean

  const user = rpc.getUser(id)
  const friends = await rpc.getFriends(user)
Sure beats

  GET /user/id
  GET /graph?outbound=id
But, in the end, both cases run two separate SQL queries. Most of the time when we fuse operations in APIs we do it all the way down to the SQL layer (with a join).

  GET /user/id?include=friends
Which does a join and gets the data in a single query.

So while it's a nice programming model for sure, I think in practice we'll end up having a `rpc.getUserAndFriends()` anyways.
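
By `rpc.getUserAndFriends()` I mean something like this on the server (a sketch; the table names and the `db.query` driver are made up), where the fusion happens in SQL rather than across two pipelined calls:

    declare const db: { query(sql: string, params: unknown[]): Promise<any[]> };

    async function getUserAndFriends(id: number) {
      // One round trip to the database: the user row joined to its friendships.
      const rows = await db.query(
        `SELECT u.id, u.name, f.friend_id
           FROM users u
           LEFT JOIN friendships f ON f.user_id = u.id
          WHERE u.id = ?`,
        [id],
      );
      return {
        user: rows[0] ? { id: rows[0].id, name: rows[0].name } : null,
        friendIds: rows.map(r => r.friend_id).filter(x => x != null),
      };
    }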

I'm not that experienced, so I don't know in how many projects composability at just one layer would actually be enough to solve most composability issues. If it's a majority, then great, but if not, then I don't think this is doing much.

One situation where this actually works that comes to mind is SQLite apps, where multiple queries are more or less OK due to lack of network round trip. Or if your DB is colocated with your app in one of the new fancy datacenters where you get DB RAM to app RAM transfer through some crazy fast network interconnect fabric (RDMA) that's quicker than even local disk sometimes.

reply
quatonion
1 day ago
[-]
Really not a big fan of batteries included opinionated protocols.

Even Cap'n Proto and Protobuf are too much for me.

My particular favorite is this. But then I'm biased coz I wrote it haha.

https://github.com/Foundation42/libtuple

No, but seriously, it has some really nice properties. You can embed JSON-like maps, arrays and S-expressions recursively. It doesn't care.

You can stream it incrementally or use it in a message-framed form.

And the nicest thing is that the encoding is lexicographically sortable.

reply
elijahcarrel
1 day ago
[-]
That link is dead
reply
gethly
1 day ago
[-]
This looks like something you tinker with during a weekend when you are bored out of your mind.
reply
stavros
1 day ago
[-]
I was really excited when I saw the headline, but I was kind of disappointed to see it doesn't support schemas natively. I know you can specify them via zod, but I'm really not looking forward to any more untyped APIs, given that the lack of strict API typing has been by far the #1 reason for the bugs I've had to deal with in my career.
reply
kentonv
1 day ago
[-]
I really want to bake in some sort of support for generating type checks based on TypeScript types... so then your schemas are just TypeScript. Not sure why this doesn't seem to be a common practice TBH, might be missing something.
reply
crabmusket
1 day ago
[-]
We built our own in-house RPC interface definition pipeline where the schemas are just typescript types, looking very similar to the example in your post

    interface MyService {
      method(): ReturnType;
    }
We parse the typescript to JSON Schema, then use that to generate runtime validation across both JS implementations and other languages.

Typescript is a really nice IDL. I didn't want to hitch our wagon to something else like typespec.io even if that would have given us more things out of the box.
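
For a flavor of the runtime half (the schema shape and validator choice here are illustrative, not our exact pipeline): the emitted JSON Schema gets compiled into a checker that runs at the RPC boundary.

    import Ajv from "ajv";

    // Illustrative JSON Schema, i.e. what a TS argument type might become.
    const helloArgsSchema = {
      type: "object",
      properties: { name: { type: "string" } },
      required: ["name"],
      additionalProperties: false,
    } as const;

    const ajv = new Ajv();
    const validateHelloArgs = ajv.compile(helloArgsSchema);

    declare const input: unknown; // whatever arrived over the wire
    if (!validateHelloArgs(input)) {
      throw new Error("invalid RPC arguments");
    }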

reply
ryanrasti
1 day ago
[-]
> Not sure why this doesn't seem to be a common practice TBH, might be missing something.

Yeah... I've been deep in this problem space myself. The two big points of friction are:

1. Requiring a build-step to generate runtime code from the TS types

2. TS doesn't officially support compiler transforms that do it

That said, the two most promising approaches I've found so far:

1. https://github.com/GoogleFeud/ts-runtime-checks -- it does exactly what you describe

2. https://arktype.io/ -- a very interesting take on the Zod model, but feels like writing native Typescript

Congrats on the launch, really exciting to see a way to get capabilities into the JS ecosystem!

reply
stavros
1 day ago
[-]
That would be great. Would these checks run at deserialization time? They'd probably need to, as you wouldn't want to assume that the stuff coming over the network is of a specific type.
reply
kentonv
1 day ago
[-]
I'm thinking the ideal would be if I could feed in a TypeScript interface, and have some tool generate a wrapper around that interface which type-checks all inputs.

This tool could actually be totally independent from the RPC implementation.

I don't think it's necessary to bake type checks into the deserialization itself, since the RPC system already doesn't make any assumptions about the payloads it is moving around (other than that they are composed of only the types that the deserialization supports, which is a fixed list).
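
Hand-written, the wrapper such a tool would generate might look like this (a sketch; the interface and the zod check are illustrative):

    import { z } from "zod";

    interface MyApi {
      hello(name: string): string;
    }

    // What the codegen would emit for MyApi: a pass-through wrapper that
    // validates every input before delegating to the real implementation.
    function withTypeChecks(target: MyApi): MyApi {
      return {
        hello(name) {
          z.string().parse(name); // reject anything that isn't a string
          return target.hello(name);
        },
      };
    }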

reply
stavros
1 day ago
[-]
I understand your reasoning, but this is one of those things where people will use the default, and your choice as the designer is between a default that makes it harder to get started but reduces bugs down the line, and one that's easier to get started with but much more buggy.

History has shown that, if you want things to be popular, you should choose the latter, but I think the tide has turned enough that the former could be the right choice now. That's also the reason why we use Typescript instead of JS, so mandatory static typing would definitely fit with the zeitgeist.

Honestly, the boundary is the one place where I wouldn't want to be without types.

reply
paradox460
1 day ago
[-]
Feels a little like how erlang calls things across different nodes
reply
3cats-in-a-coat
1 day ago
[-]
On the path they've started, with pipelined calls parameterized on the results of previous calls, they'll quickly realize that eventually they'll need basic transforms on an output before passing it as an input, i.e. support for a standard runtime with a standard library of features.

I.e. imagine you need this:

    promise1 = foo.call1();
    promise2 = foo.call2();
    promise3 = foo.call2(promise1 + promise2);
Can't implement that "+" there unless...

    promise1 = foo.call1();
    promise2 = foo.call2();
    promise3 = foo.add(promise1, promise2)
    promise4 = foo.call2(promise3);
You can also make some kind of RpcNumber object, using their Proxy machinery, so you can write promise1.add(promise2), but ultimately you don't want to write such classes (or the corresponding server-side functions) on the spot every time.
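
By "such classes" I mean something like this on the server (a sketch; I'm assuming the library's RpcTarget base class, everything else is made up):

    import { RpcTarget } from "capnweb";

    class RpcNumber extends RpcTarget {
      constructor(private value: number) { super(); }
      // A client could then pipeline a.add(b), where b is itself a promised number.
      add(other: number) { return new RpcNumber(this.value + other); }
      get() { return this.value; }
    }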

The problem is that even that won't give you control flow (loops, branches) that runs on the server; server execution stays blocked on the client.

Once you realize THAT, you realize it's most optimal for both sides to exchange command buffers in general: batched instructions covering remote and local calls, plus a standardized expression syntax and library.

What they did with array.map() is cute, but it's not obvious what you can and can't do with it, and most developers will end up tripping over it every time they use it, both overusing and underusing the feature, unaware of what it maps, how, when and where.

For example, this record-replay can't do any (again...) arithmetic, logic, branching and so on. It can only record method calls on the Proxy and replay them on the other side, inside simple containers like an object literal.
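
Concretely, as I understand it from the post (hypothetical names: `friends` is an RPC promise for an array, `api.getProfile` is a made-up method):

    // Capturable: the callback only does pipelined calls / property reads and
    // builds an object literal, so it can be recorded as instructions.
    const profiles = friends.map(f => ({ id: f.id, profile: api.getProfile(f.id) }));

    // Not capturable: branching on f.age needs the real value on the client,
    // which the placeholder never provides.
    const adults = friends.map(f => f.age >= 18 ? api.getProfile(f.id) : null);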

This is where GraphQL is better, because it's an explicit buffer send and an explicit buffer return. The number of round trips, and what maps to what, is not hidden.

GraphQL has its own mess of poorly considered features, but I don't think Cap'n Web survives prolonged contact with reality because of how implicit and magical everything is.

When you make an abstraction like this, it needs to work ALL THE TIME, so you don't have to think about it. If it only works in demo examples written by developers who know exactly when the abstraction breaks, real devs won't touch it.

reply
truth_seeker
1 day ago
[-]
JSON Serialization, seriously ???

Why not "application/octet-stream" header and sending ArrayBuffer over the network ?

reply
kentonv
1 day ago
[-]
Because JSON serialization is built into the browser.

I'm obviously a huge fan of binary serialization; I wrote Cap'n Proto and Protobuf v2 after all.

But when you're working with pure JS, it's hard to be much faster than the built-in JSON implementation, and even if you can beat it, you're only going to get there with a lot of code, and in a browser code footprint often matters more than runtime speed.

reply
cyberax
2 days ago
[-]
ETOOMAGIC for me.
reply
davexunit
2 days ago
[-]
What's so magical about it?
reply
cyberax
1 day ago
[-]
Magically tracing dependencies. Imagine debugging it when it breaks because of some subtle differences between environments.
reply
davexunit
1 day ago
[-]
Not sure what you mean by "tracing dependencies". Can you elaborate?
reply
waynenilsen
1 day ago
[-]
Now do the same for ffi
reply