Speeding up Ruby by rewriting C in Ruby
297 points | 21 days ago | 19 comments | jpcamara.com | HN
Someone
21 days ago
[-]
FTA: The loop example iterates 1 billion times, utilizing a nested loop:

  u = ARGV[0].to_i
  r = rand(10_000)
  a = Array.new(10_000, 0)

  (0...10_000).each do |i|
    (0...100_000).each do |j|
      a[i] += j % u
    end
    a[i] += r
  end

  puts a[r]
Weird benchmark. Hand-optimized, I'd guess this would spend over 99% of its time in the first two lines.

If you do liveness analysis on array elements you’ll discover that it is possible to remove the entire outer loop, turning the program into:

  u = ARGV[0].to_i
  r = rand(10_000)
  a = 0

  (0...100_000).each do |j|
    a += j % u
  end
  a += r

  puts a
Are there compilers that do this kind of analysis?

Even though u isn’t known at compile time, that inner loop can be replaced by a few instructions, too, but that’s a more standard optimization that, I suspect, the likes of clang may be close to doing.

reply
IshKebab
21 days ago
[-]
Compilers don't do liveness analysis on individual array elements. It's too much data to keep track of and would probably only be useful in incorrect code like this.

I used to work on an AI compiler where liveness analysis of individual tensor elements actually would have been useful. We still didn't do it because the compilation time/memory requirements would be insane.

reply
Asmod4n
20 days ago
[-]
TruffleRuby could replace this with an O(1) operation, even when it’s part of a C extension.
reply
IshKebab
20 days ago
[-]
I think most compilers could do that. That's a separate much easier optimisation.
reply
hatthew
21 days ago
[-]
Closed form that works for most cases:

    result = ((u * (u - 1)) / 2 * (100000 / u)) + (100000 % u * (100000 % u - 1) / 2) + r
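
A quick sanity check of that formula (a sketch, not the article's code; assumes u is a positive integer, as in the benchmark):

    # Compare the brute-force inner-loop sum with the closed form above.
    # The trailing "+ r" is dropped since it's identical on both sides.
    def brute(u)
      (0...100_000).sum { |j| j % u }
    end

    def closed(u)
      rem = 100_000 % u
      (u * (u - 1)) / 2 * (100_000 / u) + rem * (rem - 1) / 2
    end

    [1, 7, 40, 99_999, 100_000].each do |u|
      raise "mismatch for u=#{u}" unless brute(u) == closed(u)
    end
    puts "closed form matches"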
reply
kristianp
21 days ago
[-]
The article refers to upcoming versions of Ruby. For the curious, looks[1] like ruby 3.4.0 will be released this Christmas, and ruby 3.5.0 next Christmas.

Also, I'm wondering what effect Python's upcoming minimal JIT [2] will have on this type of loop. Python 3.13 needs to be built with the JIT enabled, so it would be interesting if someone who has built it could run the benchmarks.

[1] https://www.ruby-lang.org/en/downloads/releases/

[2] https://drew.silcock.dev/blog/everything-you-need-to-know-ab...

reply
riffraff
21 days ago
[-]
Ruby is always released on Christmas, it's a predictable and cute schedule.

But perf improvements can and do drop in point releases too, afair.

reply
Lammy
21 days ago
[-]
> There was a PR to improve the performance of `Integer#succ` in early 2024, which helped me understand why anyone would ever use it: “We use `Integer#succ` when we rewrite loop methods in Ruby (e.g. `Integer#times` and `Array#each`) because `opt_succ (i = i.succ)` is faster to dispatch on the interpreter than `putobject 1; opt_plus (i += 1)`.”

I find myself using `#succ` most often for readability reasons, not just for performance. Here's an example where I use it twice in my UUID library's `#bytes` method to keep my brain in “bit slicing mode” when reading the code. I need to loop 16 times (`0xF.succ`) and then within that loop divide things by 256 (`0xFF.succ`): https://github.com/okeeblow/DistorteD/blob/ba48d10/Globe%20G...
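
Roughly, the pattern looks like this (a hypothetical sketch in the spirit of what's described above, not the actual library code):

  # Integer#succ keeps the byte count (0xF.succ == 16) and the divisor
  # (0xFF.succ == 256) in "bit slicing mode" while unpacking a 128-bit value.
  value = 0xFFEEDDCC_BBAA9988_77665544_33221100
  bytes = Array.new(0xF.succ) do
    value, byte = value.divmod(0xFF.succ)  # divide by 256, keep the remainder
    byte
  end
  p bytes.length  # => 16, least-significant byte first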

reply
e12e
21 days ago
[-]
Why do you find 0xF.succ better than 0x10 in this case?
reply
Lammy
21 days ago
[-]
Because of how I'm used to thinking of the internal 128-bit UUID/GUID value as a whole:

  irb> 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFF.bit_length
  => 128
reply
fsckboy
21 days ago
[-]
... 0 to 127 < 128
reply
block_dagger
21 days ago
[-]
After all these years, I still love Ruby. Thank you Matz!
reply
Imustaskforhelp
21 days ago
[-]
Super interesting. I am actually also a contributor to https://github.com/bddicken/languages, and after I tried to create a Lua approach I started thinking about TruffleRuby, since it was mentioned somewhere. But unfortunately, when I ran main.rb there was virtually no significant difference between TruffleRuby and plain Ruby (sometimes normal Ruby was faster than TruffleRuby).

I am not sure if the benchmarks you provided showing the speed of TruffleRuby were made after the changes you describe.

I would really appreciate it if I could verify the benchmark,

and maybe try to add it to the main https://github.com/bddicken/languages as a commit as well, because the TruffleRuby implementation is actually faster than Node.js and comes close to Bun or even Go, which is nuts.

This was a fun post to skim through, definitely bookmarking it.

reply
cutler
20 days ago
[-]
With TruffleRuby you'll need to account for startup time and time to maximum performance, which vary between the native and JVM runtime configurations. See https://github.com/oracle/truffleruby
reply
Alifatisk
21 days ago
[-]
Woah, Ruby has become fast, like really fast. What's even more impressive is TruffleRuby, damn!
reply
knowitnone
21 days ago
[-]
It's Oracle https://github.com/oracle/truffleruby Double Damn!
reply
Twirrim
21 days ago
[-]
It's open source under Eclipse Public License version 2.0, GNU General Public License version 2, or GNU Lesser General Public License version 2.1.

Making it easily fork-able should Oracle choose to do something users dislike.

reply
ksec
20 days ago
[-]
Holy! I know TruffleRuby is Open Source but I somehow always thought Graal ( Which TruffleRuby is based on ) wasn't open sourced.
reply
tiffanyh
21 days ago
[-]
Note that Rails doesn't work on TruffleRuby and, from what I understand, won't anytime soon.

Which is disappointing since it has the highest likelihood of making the biggest impact on Ruby perf.

reply
uamgeoalsk
21 days ago
[-]
Huh, what exactly doesn't work? Their own readme says "TruffleRuby runs Rails and is compatible with many gems, including C extensions." (https://github.com/oracle/truffleruby)
reply
tiffanyh
21 days ago
[-]
Truffle:

  TruffleRuby is not 100% compatible with MRI 3.2 yet
Rails:

  Rails 8 will require Ruby 3.2.0 or newer
https://github.com/oracle/truffleruby

https://rubyonrails.org/2024/9/27/this-week-in-rails

reply
lmm
21 days ago
[-]
That doesn't mean Rails won't run on TruffleRuby. TruffleRuby may not implement 100% of MRI 3.2, but that doesn't mean it doesn't implement all the parts that Rails needs.
reply
hotpocket777
21 days ago
[-]
Is it possible that those two statements taken together means truffleruby can run rails 8?
reply
jeremy_k
21 days ago
[-]
Super interesting. I didn't know that YJIT was written in Rust.
reply
riffraff
21 days ago
[-]
It was initially written in C and then ported to Rust[0], which seems like it was a good idea. The downside is that it may not be enabled at build time if you don't have the right toolchain/platform, but that seems like a good trade-off.

0: https://shopify.engineering/porting-yjit-ruby-compiler-to-ru...

reply
tgmatt
21 days ago
[-]
Another language comparison repo that's been going for longer, with more languages: https://github.com/niklas-heer/speed-comparison
reply
igouy
20 days ago
[-]
Another language comparison repo with hard-to-read presentation.

The chart axis labels and bar labels overlap each other, and there are no vertical grid lines.

Oh for a simple HTML table!

reply
resonious
21 days ago
[-]
> Python was the slowest language in the benchmark, and yet at the same time it’s the most used language on Github as of October 2024.

Interesting that there seems to be a correlation between a language being slow and it being popular.

reply
_kb
21 days ago
[-]
Now do it again, but include compile time and amortise across the number of executions expected for that specific build.

I say this as a pretty deep rust fanatic. All languages (and runtimes, interpreters, and compilers) are tools. Different problems and approaches to solving them benefit from having a good set at your disposal.

If you're building something that may only run a handful of times (which includes a lot of Python, R, et al. programs), slow execution doesn't matter.

reply
VeejayRampay
21 days ago
[-]
it's like food, people like it way more when you put sugar on top

by and large, Ruby is slow, but damn is it nice to code with, which is more appealing for newcomers

reply
Alifatisk
20 days ago
[-]
I think, for an interpreted language, Ruby is quite fast now.
reply
pjmlp
20 days ago
[-]
Because now a JIT is part of the picture, as it should be in any dynamic language that isn't only meant for basic scripting tasks.
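
For example, in CRuby the JIT is opt-in; a minimal sketch for checking whether it's actually active (assuming a YJIT-enabled build):

  # Run with `ruby --yjit` (or RUBY_YJIT_ENABLE=1) to turn the JIT on.
  if defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
    puts "YJIT is on"
  else
    puts "running on the plain interpreter"
  end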
reply
KerrAvon
20 days ago
[-]
Ruby was always faster than people gave it credit for.
reply
igouy
20 days ago
[-]
Not really.

Work has been done to make faster Ruby language implementations.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

reply
VeejayRampay
20 days ago
[-]
of course sorry, you're right, what I meant is that it's rather slow _in the grand scheme of things_
reply
jb1991
21 days ago
[-]
Slower languages are higher level and thus easier to use.
reply
igouy
20 days ago
[-]
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

Program performance is associated with the specific programming language implementation and the specific program implementation.

reply
jb1991
20 days ago
[-]
No one takes that site too seriously when judging real world programming, for a variety of reasons.
reply
igouy
20 days ago
[-]
Do you really not understand that the exact same Java programs are likely to be 10x slower without JIT?

Different language implementation, so different performance.

> too seriously

Take those measurements just seriously enough.

reply
remedan
21 days ago
[-]
Does that correlation hold if you look at let's say the top 20 popular languages?
reply
sigzero
20 days ago
[-]
No, because Java is #2, C++ is #4, C# is #7. People just really like Python for what it brings to the table.
reply
smileson2
21 days ago
[-]
Game changing for my Advent of Code solutions, which look surprisingly similar.
reply
knowitnone
21 days ago
[-]
I'm a little surprised that Node is beating Deno. Interesting that Java would be faster than Kotlin, since both run on the JVM.
reply
wiseowise
21 days ago
[-]
“Faster”.

> Ran each three times and used the lowest timing for each. Timings taken on an M3 Macbook pro with 16 gb RAM using the /usr/bin/time command. Input value of 40 given to each.

Not even using JMH. I highly doubt the accuracy of the “benchmark”.
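
For comparison, a warmup-aware measurement on the Ruby side would look something like this (a sketch using the benchmark-ips gem; assumes it's installed):

  # benchmark-ips handles warmup and iteration counting, roughly what JMH
  # does on the JVM. Numbers are illustrative only.
  require "benchmark/ips"

  u = 7
  Benchmark.ips do |x|
    x.config(warmup: 2, time: 5)  # seconds of warmup, seconds of measurement
    x.report("j % u sum") do
      a = 0
      (0...100_000).each { |j| a += j % u }
    end
  end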

reply
pjmlp
21 days ago
[-]
That is one of the differences between a platform's systems language and guest languages.

You only have to check the additional bytecode that gets generated to work around the features not natively supported.

reply
wiseowise
21 days ago
[-]
Which difference? It is literally the same code; it doesn’t even use any Kotlin stdlib goodies.
reply
pjmlp
21 days ago
[-]
Yet, they don't generate the same bytecode, and that matters.

https://godbolt.org/z/h4dofq3Wq

reply
entropicdrifter
21 days ago
[-]
I mean, the JVM's been optimized specifically for Java since the Bronze Ages at this point; it's not that surprising
reply
coliveira
21 days ago
[-]
Although slow, Python has a saving grace: it doesn't have a huge virtual machine like Java, so it can in many situations provide a better experience.
reply
igouy
21 days ago
[-]
Does JavaME have a "huge virtual machine" ?

https://www.oracle.com/java/technologies/javameoverview.html

Do you mean CPython or PyPy or MicroPython or ?

reply
coliveira
21 days ago
[-]
> Does JavaME have a "huge virtual machine"

Yes, compared to Python.

> Do you mean CPython or PyPy

Python's standard virtual machine is called CPython; just look at the official web page.

reply
igouy
21 days ago
[-]
I imagine we need a nuts and bolts definition of "virtual machine" before we can make a comparison.
reply
ksec
21 days ago
[-]
>This got me thinking that it would be interesting to see a kind of “YJIT standard library” emerge, where core ruby functionality run in C could be swapped out for Ruby implementations for use by people using YJIT.

This actually makes me feel sad because it reminds me of Chris Seaton. The idea isn't new; Chris promoted it during his time working on TruffleRuby. I think the idea goes back even further, to Rubinius.

It is also nice to see TruffleRuby being very fast, and YJIT still has lots of headroom to grow. I remember one obstacle to it running Rails was memory usage. I wonder if that is still the case.
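
For a sense of what that could look like, here is a minimal sketch of a pure-Ruby stand-in for a C-implemented loop primitive (a hypothetical example, not the article's code):

  # A pure-Ruby loop primitive: the kind of C-implemented core method a
  # "YJIT standard library" might swap out so the JIT can see through it.
  module PureCore
    def self.each_upto(n)
      i = 0
      while i < n
        yield i
        i = i.succ  # the opt_succ dispatch mentioned elsewhere in the thread
      end
    end
  end

  PureCore.each_upto(3) { |i| puts i }  # prints 0, 1, 2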

reply
Asmod4n
21 days ago
[-]
One of the amazing things TruffleRuby does is handle C extensions like Ruby code, meaning the C is interpreted rather than compiled in the traditional sense.

This makes way for JITting C code so it ends up faster than the author wrote it.

reply
pantulis
21 days ago
[-]
Amazing indeed!
reply
0x457
21 days ago
[-]
Yup, Rubinius was probably the most widely known implementation of Ruby's standard library in Ruby. Too bad it was slower than MRI.
reply
Lio
21 days ago
[-]
I think JRuby takes a similar approach.

It’s possible to write gems which will use underlying C on MRI or Java when running on JRuby.

It would be interesting to know if a “pure Ruby” approach would also help JRuby too.

reply
e12e
21 days ago
[-]
I thought maybe mruby had a mostly-Ruby stdlib, but I guess it's C ported over from MRI?
reply
jerf
21 days ago
[-]
"In most ways, these types of benchmarks are meaningless. Python was the slowest language in the benchmark, and yet at the same time it’s the most used language on Github as of October 2024."

First, this indicates some sort of deep confusion about the purpose of benchmarks in the first place. Benchmarks are performance tests, not popularity tests. And I don't think I'm just jumping on a bit of bad wording, because I see this idea in its various forms a lot poking out in a lot of conversations. Python is popular because there are many aspects to it, among which is the fact that yes, it really is a rather slow language, but the positives outweigh it for many purposes. They don't cancel it. Python's other positive aspects do not speed it up; indeed, they're actually critically tied to why it is slow in the first place. If they were not, Python would not be slow. It has had a lot of work done on it over the years, after all.

Secondly, I think people sort of chant "microbenchmarks are useless", but they aren't useless. I find that microbenchmark actually gives a fairly realistic picture of the relative performance of those various languages. What they are not is totally determinative. You can't divide one language's microbenchmark on this test by another to get a "Python is 160x slower than C". This is, in fact, not an accurate assessment; if you want a single unified number, 40-50 is much closer. But "useless" is way too strong. No language is so wonderful on all other dimensions that it can have something as basic as a function call be dozens of times slower than some other language and yet keep up with that other language in general. (Assuming both languages have had production-quality optimizations applied to them and one of them isn't some very very young language.) It is a real fact about these languages, not a huge outlier, and it is a problem I've encountered in real codebases before when I needed to literally optimize out function calls in a dynamic scripting language to speed up certain code to acceptable levels, because function calls in dynamic scripting languages really are expensive in a way that really can matter. It shouldn't be overestimated and used to derive silly "x times faster/slower" values, but at the same time, if you're dismissing these sorts of things, you're throwing away real data. There are no languages that are just as fast as C, except gee golly they just happen to have this one thing where function calls are 1000 times slower for no reason even though everything else is C-speed. These performance differences are reasonably correlated.
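
As an illustration of that last point, a rough sketch of measuring per-call overhead in Ruby with the built-in Benchmark module (exact numbers will vary by machine and Ruby version):

  # Compare a tight loop with the body inlined against the same loop paying
  # for a method call on every iteration.
  require "benchmark"

  def add_mod(a, j, u)
    a + (j % u)
  end

  u = 7
  Benchmark.bm(14) do |x|
    x.report("inlined:")     { a = 0; 10_000_000.times { |j| a += j % u } }
    x.report("method call:") { a = 0; 10_000_000.times { |j| a = add_mod(a, j, u) } }
  end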

reply
mlyle
21 days ago
[-]
> First, this indicates some sort of deep confusion about the purpose of benchmarks in the first place. Benchmarks are performance tests, not popularity tests.

I don't think it indicates a deep confusion. I think it leaves a simple point unsaid because it's so strongly implied (related to what you say):

Python may be very low in benchmarks, but clearly it has acceptable performance for a very large subset of applications. As a result, a whole lot of us can ignore the benchmarks.

Even in domains where one would have shuddered at this before. My students are launching a satellite into low earth orbit that has its primary flight computer running Python. Yes, sometimes this does waste a few hundred milliseconds and it wastes several milliwatts on average. But even in the constrained environment of a tiny microcontroller in low earth orbit, language performance doesn't really matter to us.

We wouldn't pay any kind of cost (financial or giving up any features) to make it 10x better.

reply
jerf
21 days ago
[-]
I wouldn't jump on it except for the number of times I've been discussing this online and people completely seriously counter "Python is a fairly slow language" with "But it's popular!"

Fuzzy one-dimensional thinking that classifies languages on a "good" and "bad" axis is quite endemic in this industry. And for those people, you can counter "X is slow" with "X has good library support", and disprove "X lacks good tooling" with "But X has a good type system", because all they hear is that you said something is "good" but they have a reason why it's "bad", or vice versa.

Keep an eye out for it.

reply
ModernMech
21 days ago
[-]
"My students" - so there's really nothing on the line except a grade then, yeah? That's why you wouldn't pay any cost to make it 10x better, because there's no catastrophic consequence if it fails. But sometimes wasting a few milliwatts on average is the difference between success and failure.

I've built an autonomous drone using Matlab. It worked but it was a research project, so when it came down to making the thing real and putting our reputation on the line, we couldn't keep going down that route -- we couldn't afford the interpreter overhead, the GC pauses, and all the other nonsense. That aircraft was designed to be as efficient as possible, so we could literally measure the inefficiency from the choice of language in terms of how much it cost in extra battery weight and therefore decreased range.

If you can afford that, great, you have the freedom to run your satellite in whatever language. If not, then yeah you're going to choose a different language if it means extra performance, more runtime, greater range, etc.

reply
mlyle
21 days ago
[-]
> "My students" - so there's really nothing on the line except a grade then, yeah? That's why you wouldn't pay any cost to make it 10x better, because there's no catastrophic consequence if it fails. But sometimes wasting a few milliwatts on average is the difference between success and failure.

Years of effort from a large team is worth something, as is the tens of thousands of dollars we're spending. We expect a return on that investment of data and mission success. We're spending a lot of money to improve odds of success.

But even in this power constrained application, a few milliwatts is nothing. (Nearly half the time, it's literally nothing, because we'd have to use power to run heaters anyways. Most of the rest of the time, we're in the sun, so there's a lot of power around, too). The marginal benefit to saving a milliwatt is zero, so unless the marginal cost is also zero we're not doing it.

> That aircraft was designed to be as efficient as possible, so we could literally measure the inefficiency from the choice of language in terms of how much it cost in extra battery weight and therefore decreased range

If this is a rotorcraft of some sort, that seems silly. It's hard to waste enough power to be more than rounding error compared to what large brushless motors take.

reply
ModernMech
21 days ago
[-]
If you have enough power from the sun and enough compute, are you really that resource constrained?

Let me ask you, why do you think most real-time mission critical projects are not typically done in Python?

> If this is a rotorcraft of some sort, that seems silly. It's hard to waste enough power to be more than rounding error compared to what large brushless motors take.

It was a glider trying to fly as long as possible, so no motors, no solar power either. It got to the point that we could not even execute the motion planner fast enough in Matlab given the performance demands of the craft; we had to resort to MEX, and at that point we might as well have been writing in C. Which we did.

reply
igouy
21 days ago
[-]
On the one hand, when performance doesn't matter, it doesn't matter.

On the other hand, when the title is "Speeding up Ruby" we are kind of presuming it matters.

reply
grumpyprole
21 days ago
[-]
> My students are launching a satellite into low earth orbit that has its primary flight computer running python. Yes, sometimes this does waste a few hundred milliseconds

Never mind performance, would it not be good to at least machine check some static properties? A dynamic language is not a good choice for anything mission critical IMHO.

reply
wiseowise
21 days ago
[-]
Python has had Mypy and Pyright since forever.
reply
grumpyprole
21 days ago
[-]
Even with those retrofits, it's still a language designed for maximum flexibility and maximum ease of use. This has trade-offs with regard to reasoning about correctness.
reply
wiseowise
21 days ago
[-]
What’s your point? That their type checks are incomplete?
reply
grumpyprole
20 days ago
[-]
That Python makes the wrong trade-offs for mission critical software. This goes beyond just lacking static types.
reply
wiseowise
20 days ago
[-]
Which trade-offs do you make when you opt in to full static typing? Other than performance.
reply
pjmlp
21 days ago
[-]
It does help that the Python ecosystem sees C and Fortran as being "Python".
reply
igouy
21 days ago
[-]
> people sort of chant "microbenchmarks are useless", but they aren't useless.

They might be !

(They aren't necessarily useless. It depends. It depends what one is looking for. It depends etc etc)

> You can't divide one language's microbenchmark on this test by another to get a "Python is 160x slower than C".

Sure you can !

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

— and —

Table 4, page 139

https://dl.acm.org/doi/pdf/10.1145/3687997.3695638

— and then one has — "[A] Python is 160x slower than C" not "[THE] Python is 160x slower than C".

Something multiple and tentative not something singular and definitive.

reply
lmm
21 days ago
[-]
> Benchmarks are performance tests, not popularity tests.

But presumably they're meant to test something that matters. And the popularity suggests that what's being tested in this case doesn't.

> But "useless" is way too strong. No language is so wonderful on all other dimensions that it can have something as basic as a function call be dozens of times slower than some other language and yet keep up with that other language in general.

And yet Python does keep up with C in general. You might object that when a Python-based system outperforms a C-based system it's not running the same algorithm, or it's not really Python, and that would be technically true, but seemingly not in a way that matters.

> if you're dismissing these sorts of things, you're throwing away real data

Everything is data. The most important part of programming is often ignoring the things that aren't important.

reply
chikere232
21 days ago
[-]
very true.

Also, for a lot of the areas where languages like Python or Ruby aren't great choices because of performance, they would also not be great choices because of the cost of maintaining untyped code, or in Python's case the cost of maintaining code in a language that keeps making breaking changes in minor versions.

Script with scripting languages, build other things in other languages

reply
ribadeo
21 days ago
[-]
It seems odd to willfully ignore the Crystal language when discussing Ruby and speeding it up. Granted, macro semantics mean something else, more like C macros, but the general syntax and flow of Crystal is basically Ruby. https://crystal-lang.org/

Amber and Lucky are two mature frameworks that give Rails a run for its money, and Kemal is your Sinatra.

https://docs.amberframework.org/amber

https://luckyframework.org/

https://kemalcr.com/

reply
hamandcheese
20 days ago
[-]
Crystal is not Ruby. Full stop. It is not useful to anyone with an existing Ruby code base.

Mentioning Crystal would be odd since it has nothing to do with the article.

reply
Alifatisk
20 days ago
[-]
Will these Crystal frameworks allow me to share a single standalone binary with peers that allows them to run the web application locally?
reply
norman784
20 days ago
[-]
As per this article[0], it seems that Crystal produces statically linked binaries, so I think the answer is yes.

[0] https://crystal-lang.org/2020/02/02/alpine-based-docker-imag...

reply
Alifatisk
20 days ago
[-]
Woah, what a luxury
reply
davidw
21 days ago
[-]
It seems like it's been a while since I've seen one of these language benchmark things.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/... seems like the latest iteration of what used to be a pretty popular one, now with fewer languages and more self-deprecation.

reply
igouy
21 days ago
[-]
> fewer languages

Maybe you've only noticed the dozen in-your-face on the home page?

The charts have shown ~27 for a decade or so.

There's another half-dozen more in the site map.

reply
igouy
21 days ago
[-]
> a fun visualization of each language’s performance

The effect is similar to dragging a string past a cat: complete distraction — unable to avoid focusing on the movement — unable to extract any information from the movement.

To understand the measurements, cover the "fun visualization" and read the numbers in the single column data table.

(Unfortunately we aren't able to scan down the column of numbers, because the language implementation name is shown first.)

Previously: <blink>

https://developer.mozilla.org/en-US/docs/Glossary/blink_elem...

reply
chikere232
21 days ago
[-]
It does visualise how big the difference is though
reply
igouy
21 days ago
[-]
Cover up the single column of lang/secs and then try to read how big the difference is between java and php from the moving circles.

You would have no problem doing that with a bar chart [typo: I originally said histogram].

reply
MeetingsBrowser
21 days ago
[-]
Cover the labels on the histogram and try to read how big the difference is between java and php....
reply
igouy
21 days ago
[-]
We can read the relative difference from the length of the bars because the bars are stable.
reply
MeetingsBrowser
21 days ago
[-]
I can see the relative difference in speed between the two balls.
reply
igouy
21 days ago
[-]
"The first principle is that you must not fool yourself and you are the easiest person to fool."

:-)

reply
chikere232
21 days ago
[-]
PHP looks much slower
reply
igouy
21 days ago
[-]
The question is: How much slower?

We could try to count how many times the java circle crosses left-to-right and right-to-left, in the time it takes for the PHP circle to cross left-to-right once.

That's error prone but should be approximately correct after a couple of attempts.

That's work we're forced to do because the "fun visualization" is uninformative.

reply
chikere232
20 days ago
[-]
That might be your question, but then you can look at the numbers. No chart will be as exact.
reply
igouy
20 days ago
[-]
If only we could look at the numbers without the uninformative distraction.
reply
chikere232
19 days ago
[-]
I found the animation informative
reply
igouy
19 days ago
[-]
> I found the animation informative

Java was so fast it glowed orange!

I wonder if the distraction of the animation actually makes people slower at reading the information that is in the text column.

The animation serves its purpose -- it grabs attention.

reply
tiffanyh
21 days ago
[-]
Dart - I see it mentioned (and perf looks impressive), but is it widely adopted?

Also, I would have loved to see LuaJIT (interpreted lang) & Crystal (static Ruby-like language) included just for comparison's sake.

reply
suby
21 days ago
[-]
It looks like a more complete breakdown is here. Crystal ranks just below Dart at 0.5413 (Dart was 0.5295). LuaJIT was 0.8056. I'm surprised LuaJIT does worse than Dart. Actually I am surprised Dart is beating out languages like C# too.

http://benjdd.com/languages2

reply
saurik
21 days ago
[-]
Dart's VM was designed by the team (I think not just the one guy, but maybe I'm wrong on that and it really is just Lars Bak) that designed most of the truly notable VMs that have ever existed: Self, the Strongtalk Smalltalk, Java's HotSpot, and JavaScript's V8. It also features an ahead-of-time compiler mode in addition to a world-class JIT and interpreter, allowing for hot reload during development.

https://en.m.wikipedia.org/wiki/Lars_Bak_(computer_programme...

It was stuck with a bad rep for being the language that was never going to replace JavaScript in the browser, and then was merely a transpiler no one was going to use, before it found a new life as the language for Flutter, which has driven a lot of its syntax and semantics improvements since, with built-in VM support for extremely efficient object templating (used by the reactive UI framework).

reply
igouy
21 days ago
[-]
Maybe that dozen lines of code isn't sufficient to characterize performance differences?

Nearly 25 years ago, nested loops and fibs.

https://web.archive.org/web/20010424150558/http://www.bagley...

https://web.archive.org/web/20010124092800/http://www.bagley...

It's been a long time since the benchmarks game showed those.

reply
neonsunset
21 days ago
[-]
This nested-loops microbenchmark only measures in-loop integer division optimizations on ARM64 - there are ARM64-specific division fault handling differences that introduce significant variance between compilers of comparable capability.

On x86_64 I expect the numbers would have been much closer and within measurement error. The top half is within 0.5-0.59s - there really isn't much you can do inside such a loop, almost nothing happens there.

As Isaac pointed out in a sibling comment - it's best to pick specific microbenchmarks, a selection of languages and implementations that interest you and dissect those - it will tell you much more.

reply
lern_too_spel
21 days ago
[-]
Runtime startup isn't amortized.
reply
igouy
21 days ago
[-]
How do you know?
reply
lern_too_spel
21 days ago
[-]
The methodology is documented in the link of the comment I responded to.
reply
igouy
21 days ago
[-]
Perhaps you mean that "the methodology" does not include an explicit step intended to amortize runtime startup.

Perhaps the tiny tiny programs none-the-less took enough time that startup was amortized.

reply
ModernMech
21 days ago
[-]
I wonder why C++ isn't in that list but a bunch of languages no one uses are.
reply
Alifatisk
21 days ago
[-]
Been using pure Dart since last year; it's a lovely language that has its quirks. I like it.

It's fast and flexible.

reply
contagiousflow
21 days ago
[-]
Have you used it for anything other than Flutter? I recently did a Flutter project and I'm interested in using Dart more now.
reply
Alifatisk
21 days ago
[-]
Yes, that's what I meant by pure Dart. I've created CLIs with it and a little API-only server.
reply
coliveira
21 days ago
[-]
This kind of benchmark doesn't make sense for Python because it is measuring the speed of pure code written in the language. However, and here is the important point, most Python code relies on compiled libraries to run fast. The heavy lifting in ML code is done in C, and Python is used only as a glue language. Even for web development this is also the case; Python is only calling a bunch of libraries, many of those written in C.
reply
chucke
21 days ago
[-]
That's not true. Sure, many hot-path functions dealing with tensor calculations are done in NumPy, but ETL and args/results are Python objects and functions. And most web development libs are pure Python (Flask, Django, etc.).
reply
coliveira
21 days ago
[-]
For performance, hot paths are the only ones that matter.
reply
IshKebab
21 days ago
[-]
Sure, but only a small subset of problems have a hot path. You can easily offload huge tensor operations to C. That's the best possible case. More usually the "hot path" is fairly evenly distributed through your entire codebase. If you offload the hot path to C you'll end up rewriting the whole thing in C.
reply
coliveira
21 days ago
[-]
> "hot path" is fairly evenly distributed

No, hot paths are seldom fairly evenly distributed, even on non-numeric applications. In most cases they will be in a small number of locations.

reply
IshKebab
21 days ago
[-]
Not in my experience.
reply
dragonwriter
21 days ago
[-]
Yeah, this is a benchmark of recursion and tight loops doing integer math on array members. Nontrivial recursion is nonidiomatic in Python, and tight loops doing integer math on array members will probably be done via one of the many libraries that do one or more of optimizing, jitting, or move those to GPU (Numpy, Taichi, Numba, etc.)
reply
igouy
21 days ago
[-]
aka Python is as fast as C when it is C.
reply
int_19h
21 days ago
[-]
Any language with FFI (which is like all of them, these days) has the same exact issue, the only difference being how common it is to drop into C or other fast compiled language for parts of the code.

And this kind of benchmark is the one that tells you why this is different across different languages.

reply
knowitnone
21 days ago
[-]
I don't know. Ruby is able to call C too so it's a wash?
reply
kreetx
21 days ago
[-]
Yet this particular blog post shows how Ruby-written-in-Ruby is faster than Ruby-written-in-C because it's more optimizable.
reply
ModernMech
21 days ago
[-]
Yes, if you pull out all the optimization tricks for Python, it will be faster than vanilla Python. And yet it's still 6x slower (by my measurement) than naive code written in a compiled language like Rust without any libraries.
reply