Node's programming model seems highly deceptive and farcical
13 points
1 year ago
| 11 comments
| HN
The more I think about it, the more I'm convinced of it.

The biggest selling point of Node folks has been the "single threaded, event driven" model, right? Unlike JavaScript, other languages work on a "blocking" basis, i.e. you run a statement or command and the program "waits" until the I/O is complete. For example, you issue `open('xyz.txt', 'rb').read()` in Python and the program waits or blocks until the underlying driver has read the whole file (which could take a long time if the file is very large).

But with the Nodejs equivalent, you just issue the statement and then pass the "event handler" so that your program is never in the "waiting state". The whole premise of Node/JS event-callback is that "you don't call us, we will call you".

This is all nice in theory but if this were indeed true then Nodejs scripts should be blazing fast compared to Python and even Java considering that most programs we write are I/O heavy and 99% of time, they're just waiting for an input from a File/URI/User? If this event callback model indeed worked as effectively as claimed, Node would have been the numero one and only language being used today?

I think I'm starting to understand why that isn't the case. This whole "single threaded, event driven" thing is just a farce. You can also replicate the same thing that Node.js is doing in your Java or Python too by applying multi-threading (i.e. one thread just "waits" for the I/O in the background while the other keeps doing its job). All you've done here is just handed or delegated that complexity of multi-threading to Node.js?

Realistically, it's impossible to wait or block an I/O request while at the same time also letting the other part of the code engage in other tasks, that's the very definition of multi-threading. Doing "async" is impossible without multiple threads in that sense. Node must have a thread pool of sorts where one of them is engaged in the wait/block while another is running your JS code further. When the wait is over, the control is then passed to the "event handler" function it was bound to in that other thread.

What Node is selling as "single threaded" applies to application or business logic we are writing, node itself can't be single threaded. I feel it's better to just implement multi-threading in your own code (as needed) instead of using something convoluted and confusing like Node.js. What say you?

biorach
1 year ago
> feel it's better to just implement multi-threading in your own code (as needed) instead of using something convoluted and confusing like Node.js. What say you?

I could give a long and detailed answer, but I think the best way is to simply try to implement this in Java or Python (but without using Python's async, obviously, which would be just Node but worse) and then report back about which is more complicated and convoluted.

Oh, and to ensure you're providing the same guarantees as Node, you should have some test code that a) modifies a variable both in the main and worker threads, and b) causes exceptions in the worker thread which can be handled in the main thread.

For a) I'd suggest using the code here as a starting point: https://realpython.com/intro-to-python-threading/#race-condi...

After you've done that try the same in idiomatic node
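
For reference, a minimal Python sketch of the two test cases above, using only the standard library. The names `increment_many`, `failing_task` and `run_demo` are made up for illustration, and the lock is one possible way to avoid the race, not the only one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0
counter_lock = threading.Lock()


def increment_many(n):
    """Increment the shared counter n times, guarding each update with a lock."""
    global counter
    for _ in range(n):
        # Without the lock, the read-modify-write below can interleave
        # across threads and some updates may be lost.
        with counter_lock:
            counter += 1


def failing_task():
    """Hypothetical worker that fails, to show how the error reaches the caller."""
    raise ValueError("boom in worker thread")


def run_demo():
    global counter
    counter = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        # a) four worker threads and the main thread all modify the counter
        futures = [pool.submit(increment_many, 10_000) for _ in range(4)]
        increment_many(10_000)
        for future in futures:
            future.result()

        # b) an exception raised in a worker is re-raised by result()
        #    in the main thread, where it can be handled normally
        try:
            pool.submit(failing_task).result()
            caught = None
        except ValueError as exc:
            caught = str(exc)
    return counter, caught


if __name__ == "__main__":
    print(run_demo())
```

Even this small version needs an explicit lock and an explicit `result()` call to get correct counts and to see the worker's exception, which is the kind of bookkeeping the comparison is about.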

duped
1 year ago
> The biggest selling point of Node folks has been the "single threaded, event driven" model, right?

No, it's "JS on the server."

> it's impossible to wait or block an I/O request while at the same time also letting the other part of the code engage in other tasks

No, it's absolutely possible. You can ask the network stack, "do you have any new packets" and the stack can say "no" and you can go off and do work before coming back to ask "do you have any new packets" and the network stack says "yes" before your (still single-threaded) application reads those packets and does something with them.

Concurrency is not parallelism.
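
A minimal sketch of that polling pattern in Python, on one thread. It assumes a local `socket.socketpair()` as a stand-in for the network stack so the example is self-contained; `poll_once` and `run_demo` are names invented here.

```python
import socket


def poll_once(sock):
    """Ask the socket for data without blocking; return None if nothing is there yet."""
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None


def run_demo():
    # A connected socket pair stands in for "the network stack" (assumption).
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    try:
        first = poll_once(reader)        # "any new packets?" -> no
        work = sum(range(1000))          # go off and do other work
        writer.sendall(b"packet")        # a packet eventually shows up

        # Come back and ask again until the answer is yes.
        data = poll_once(reader)
        while data is None:
            data = poll_once(reader)
    finally:
        reader.close()
        writer.close()
    return first, work, data


if __name__ == "__main__":
    print(run_demo())
```

Nothing here starts a second thread: the same thread asks, works, and asks again.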

karmakaze
1 year ago
The thing this post doesn't distinguish is a 'thread' vs a fiber/coroutine/goroutine/etc.

For example if you write something in Go without using async and spawning a bunch of goroutines, they could execute on one thread (if you set GOMAXPROCS=1) or many (the default is the number of cores/hyperthreads of your processors). On virtually any library function call, the goroutine could yield and switch to executing a different goroutine. This is all transparent and IMO preferable to write and read than explicitly passing callbacks, or having the compiler rewrite any await as 'rest of program' passed as a callback.

For small numbers of goroutines, actual OS threads could instead be used effectively, but as that number gets large, there's overhead of thread context switches and scheduler work that becomes significant.

toast0
1 year ago
I'm not a fan of Node, but I think you've misunderstood how it works. Ignoring web workers, Node is a single-threaded, event-loop-based system. When you make an I/O call with a callback, Node can run the I/O call immediately as non-blocking: if it succeeds or fails with a final error, the callback can be called immediately (or queued for later, whatevs); if it returns EAGAIN, that fd is added to the event loop's select equivalent, and the next pending event is processed or the select is polled again. When that fd is eventually ready, your callback is called. This doesn't require multithreading.
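
The flow described above can be sketched in Python with the standard `selectors` module. This illustrates the general pattern only, not Node's or libuv's actual code; the socket pair and the names `run_event_loop` and `on_readable` are assumptions made for the example.

```python
import selectors
import socket


def run_event_loop():
    """Single-threaded loop: try the I/O, register the fd on EAGAIN, dispatch when ready."""
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    received = []

    def on_readable(sock):
        # Called only after the selector reports the fd as ready.
        received.append(sock.recv(1024))
        sel.unregister(sock)

    # Try the non-blocking read right away. Nothing has been sent yet, so the
    # OS answers EAGAIN/EWOULDBLOCK, which Python raises as BlockingIOError.
    try:
        received.append(reader.recv(1024))
    except BlockingIOError:
        # Hand the fd and its callback to the "select equivalent".
        sel.register(reader, selectors.EVENT_READ, on_readable)

    writer.sendall(b"hello")  # the peer eventually sends something

    # The event loop: wait for ready fds and call the callback stored with each.
    while sel.get_map():
        for key, _events in sel.select(timeout=1):
            key.data(key.fileobj)

    sel.close()
    reader.close()
    writer.close()
    return received


if __name__ == "__main__":
    print(run_event_loop())
```

The waiting happens inside `sel.select()`, where the OS tracks readiness for all registered fds at once, so no thread is parked per pending request.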

Traditional JavaScript is missing the basic concepts that allow for reasonable shared memory multithreading, if such a thing exists. WebWorkers allow for multithreading without shared memory. There's some ability to pass messages between workers and the main script, but I haven't explored this in Node.

austin-cheney
1 year ago
JavaScript executes 50x or more faster than Python. This has nothing to do with the event model. JavaScript compiles to a low level byte code in a VM at run time whereas Python is an interpreted language.

* https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

Someone
1 year ago
CPython compiles to a byte code too, and runs that (https://devguide.python.org/internals/interpreter/)

Any speed difference between the two is because of a combination of

- developers spending less effort on making it fast

- differences in the virtual machines' instruction sets

- differences in the languages that make one harder to run fast

I think/guess the first is the major reason, but the instruction set may matter, too. The last one has at most a minor impact.

A reason developers spend less time making the CPython interpreter faster is that you can relatively easily make ‘python’ even faster by making it call C code.

pyeri
1 year ago
By "developers" you mean the core CPython developers here?

Because CPython itself is very slow. Java runs circles around it on most benchmarks you'll find, such as this one[1] from Debian. If you observe the computation time taken, the order of magnitude of difference between Java and Python programs is just preposterous!

[1]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

chalsprhebaodu
1 year ago
Node builds on top of V8, Chromium’s JS engine, which has JIT, which allows for some optimizations that aren’t as easy or obvious in a simple bytecode interpreter.

Someone
1 year ago
That’s exactly what I argue: the CPython developers chose not to spend time on a JIT (quite possibly because they didn’t have the resources to build a good one).

PyPy shows Python could have had a JIT (https://www.pypy.org/)

ffgjgf1
1 year ago
> You can also replicate the same thing that Node.js is doing in your Java or Python too by applying multi-threading (i.e. one thread just "waits" for the I/O in the background while the other keeps doing its job).

Why would you need “multi-threading”? Isn’t what you’ve described exactly what async/coroutines do in Python? I’m not sure there is anything particularly unique about Node here; there are plenty of Python frameworks based on async etc.
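
A small `asyncio` sketch of that point: two simulated I/O waits overlap on one thread. `asyncio.sleep` stands in for real I/O here, and `fake_io` and `run_demo` are names invented for the example.

```python
import asyncio
import threading


async def fake_io(name, delay, log):
    """Pretend to wait on I/O; awaiting yields to the event loop while waiting."""
    log.append((name, "start", threading.get_ident()))
    await asyncio.sleep(delay)
    log.append((name, "done", threading.get_ident()))


async def main():
    log = []
    # Both waits are in flight at the same time; total time is about the
    # longer delay, not the sum of the two.
    await asyncio.gather(fake_io("slow", 0.2, log), fake_io("fast", 0.05, log))
    return log


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    for entry in run_demo():
        print(entry)
```

Every log entry carries the same thread id, and "fast" finishes while "slow" is still waiting.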

thehucklecat
1 year ago
I was a non-believer for a while too. My first impression of node was definitely that it was a shell game.

But ease of use matters and I have to admit that a lot of node apps really scream and scale for developers of a lot of different skill levels.

I am really excited about Java's Project Loom virtual threads. This feels like the most have-your-cake-and-eat-it-too thing I've seen in a while.

kypro
1 year ago
> Realistically, it's impossible to wait or block an I/O request while at the same time also letting the other part of the code engage in other tasks, that's the very definition of multi-threading. Doing "async" is impossible without multiple threads in that sense. Node must have a thread pool of sorts where one of them is engaged in the wait/block while another is running your JS code further. When the wait is over, the control is then passed to the "event handler" function it was bound to in that other thread.

> What Node is selling as "single threaded" applies to application or business logic we are writing, node itself can't be single threaded.

I've questioned the same thing in the past and have been confused about it too. Node itself runs in a single thread, but my understanding is that it calls on external APIs to do IO, which then run in their own thread because this would typically be managed by the OS. While Node is waiting on those calls to finish and invoke the callback, it can go execute another event that was waiting in the meantime. Then, when that's complete and assuming the original IO call has now responded, it will continue by running its callback.

I could continue trying to explain, poorly, why you're wrong, but honestly I haven't read into this in years and my understanding even then was just good enough to satisfy my own curiosity. I think it might be worth reading more about how Node works.

> I feel it's better to just implement multi-threading in your own code (as needed) instead of using something convoluted and confusing like Node.js.

Skipping over the fact Node isn't multi-threaded, the advantage as I understand it is that Node is faster at handling requests since there is no overhead in spinning up and managing threads for each incoming request.

I also remember reading that Node tends to perform better when handling very large numbers of concurrent requests. The reason for this is that if you have a sudden spike in incoming requests, you'll need a server which can spin up hundreds of threads a second and then attempt to process those requests simultaneously. Node, on the other hand, doesn't really care because it will just continue to handle each request as fast as it can in a single thread.

compressedgas
1 year ago
You are right. Node uses libuv's thread pool for calls to blocking system calls or high latency libc functions such as getaddrinfo.
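
Python's `asyncio` has a comparable split, which makes for a rough analogy (this is not Node's or libuv's code): the event loop stays on one thread and blocking calls are handed to a thread pool through `loop.run_in_executor`. In this sketch `blocking_lookup` is a hypothetical stand-in for a call like `getaddrinfo`.

```python
import asyncio
import threading
import time


def blocking_lookup():
    """Stand-in for a blocking call such as getaddrinfo (hypothetical helper)."""
    time.sleep(0.2)
    return threading.get_ident()


async def main():
    loop = asyncio.get_running_loop()
    # Hand the blocking call to the default thread pool executor.
    future = loop.run_in_executor(None, blocking_lookup)

    # Meanwhile the event loop thread keeps running other work.
    ticks = 0
    while not future.done():
        ticks += 1
        await asyncio.sleep(0.01)

    worker_thread = await future
    return threading.get_ident(), worker_thread, ticks


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

The blocking call reports a different thread id from the event loop's, while the loop keeps ticking during the wait.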

valand
1 year ago
See libuv and event loop.

"Single threaded" is not a selling point of JS, but a statement of its limitation, and it refers to how the JS event loop is executed.

Implementing multithreading doesn't make sense for some people unless they can attain the same "ease of use" of JS.

mrkeen
1 year ago
> This is all nice in theory but if this were indeed true then Nodejs scripts should be blazing fast compared to Python and even Java considering that most programs we write are I/O heavy and 99% of time, they're just waiting for an input from a File/URI/User?

* Node is blazingly fast.

* I don't think it's true that most programs are just waiting for input, and if they were, then the language you choose would matter less because you can't fix a slow user.

> blazing fast compared to Python and even Java

My day job is Java. I typically find that JS/TS test suites run a lot faster than Java test suites. And by that I mean wall-clock time taken to spin up, do 100-300 test cases, and shut down.

> Node would have been the numero one and only language being used today?

Node is a runtime, its language JS/TS is numero uno being used today. But (surprise, surprise) it turns out there are other criteria to judge languages/runtimes on. It's not just one checkbox that reads "non-blocking IO".

> You can also replicate the same thing that Node.js is doing in your Java or Python too by applying multi-threading (i.e. one thread just "waits" for the I/O in the background while the other keeps doing its job).

1) If you could just 'replicate the same thing', you'd see that it's not a farce.

2) You're grossly underestimating the work required to retrofit single-threaded/event-driven onto an existing JVM/library/user-code: You can't naively schedule some method foo() onto either an IO or non-IO thread, because some part will be IO, and some part will be non-IO (fractally - because some parts of its IO work will be non-IO also, all the way down).

3) Other languages are aware of this "farce" and are trying to do it also:

* My favourite language Haskell also has a runtime which does non-blocking IO.

* Java started introducing non-blocking IO libs for this reason a long time ago. [1]

* Some Java frameworks offer assistance to users trying to program in this way [2]

* More recently, Java is trying to also embrace this "farce" wholeheartedly into the language, with Project Loom [3]. It's now mature enough that it's either been released in the latest Java, or is about to be.

> I feel it's better to just implement multi-threading in your own code (as needed)...

How do you feel about garbage collection or synchronised blocks? Is it better to implement those yourself rather than relying on the runtime? If not, what's the difference? JVM GC is too confusing for me to understand, but I don't want to go back to malloc().

> ...instead of using something convoluted and confusing like Node.js.

You showed the blocking python code `open('xyz.txt', 'rb').read()`. Show the Node code as well, and then we judge if it's more convoluted or confusing.

For all that criticism, you missed what's actually wrong with that model, which is that you can't do true simultaneous computations (and make use of those cores in your CPU). At least you used to not be able to. Web workers came along, offering true parallelism, but I don't live enough in the JS/TS world to know if people are actually using them.

[1] https://www.baeldung.com/java-io-vs-nio

[2] https://reactivex.io/RxJava/3.x/javadoc/io/reactivex/rxjava3...

[3] https://stackoverflow.com/questions/70174468/project-loom-wh...

igouy
1 year ago
> and make use of those cores in your CPU

For example, see the difference between elapsed time and cpu time measurements here:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
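
A rough sketch of how the two measurements differ, using Python's `time.perf_counter` (elapsed time) and `time.process_time` (CPU time of the current process). The helpers `measure`, `waiting` and `computing` are invented for the example, and it runs on one thread, so it only shows the waiting case and the single-core case; in a program that keeps several cores busy, CPU time can exceed elapsed time.

```python
import time


def measure(fn):
    """Return (elapsed_seconds, cpu_seconds) for one call to fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    elapsed = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return elapsed, cpu


def waiting():
    # Stands in for blocking on I/O: elapsed time passes, little CPU is used.
    time.sleep(0.3)


def computing():
    # CPU-bound work on one thread: CPU time is close to elapsed time.
    sum(i * i for i in range(2_000_000))


if __name__ == "__main__":
    for fn in (waiting, computing):
        elapsed, cpu = measure(fn)
        print(f"{fn.__name__}: elapsed={elapsed:.3f}s cpu={cpu:.3f}s")
```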
