Converting an Integer to a Decimal String in Under Two Nanoseconds

50 points

by mpweiher

4 days ago

| 6 comments

| onlinelibrary.wiley.com

Nokinside

2 hours ago

Sounds familiar. Is one of the authors Lemire? Of course.

SIMD-accelerated integer-to-string conversion https://lemire.me/blog/2026/05/18/simd-accelerated-integer-t...

Other speedy things:

On-Demand JSON: A Better Way to Parse Documents? https://lemire.me/en/publication/arxiv231217149/

Parsing Millions of URLs per Second https://lemire.me/en/publication/arxiv231110533/

Transcoding Unicode Characters with AVX-512 Instructions https://lemire.me/en/publication/arxiv221205098/

childintime

1 hour ago

What will be the lifetime of AVX512? There have been many similar extensions before it. So it's a great result, but heavily marked by the target platform. I have the hope that RISC-V vector extensions will prove to be the more durable substrate to develop on, and a result there would be much more relevant for the future.

simonask

51 minutes ago

It will be literal decades before RISC-V becomes mainstream. Not because it’s not a perfectly fine ISA, but because business incentive structures are nowhere near supporting it.

Literal man-millennia have been poured into writing software for both x86 and ARM, and nobody seems close to designing a competitive RISC-V chip.

xlii

3 hours ago

I wonder if this can be categorized as a galactic algorithm. I can't imagine systems where the bulk of processing goes into integer-to-decimal-string conversion, but maybe there are some.

https://en.wikipedia.org/wiki/Galactic_algorithm

oersted

2 hours ago

My understanding of a galactic algorithm is that it scales better with input size or complexity, but its overhead is such that it will not actually be faster unless you use it on impractically large inputs.

I don’t think it has much to do with the case of an algorithm that offers a faster solution to a problem that is rarely a bottleneck (not sure if that’s true in this case anyway).

Tuna-Fish

2 hours ago

It takes a substantial amount of time when emitting lots of numbers in JSON, which happens very commonly.

And this algorithm has low constant costs, and does not take dramatically more icache than the simple versions. There is no reason not to use this if your compile target can handle AVX-512.
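For readers unfamiliar with what a "simple version" looks like, here is a minimal sketch of the common scalar baseline that peels off two digits per division using a lookup table. It is not the AVX-512 method from the linked paper; the names `DIGIT_PAIRS` and `u64_to_decimal` are hypothetical, and Python is used only to show the logic, not the performance.

```python
# Illustrative scalar baseline (assumption: not the paper's AVX-512 algorithm).
# Table of every two-digit pair "00".."99", 200 ASCII bytes in total.
DIGIT_PAIRS = "".join(f"{i:02d}" for i in range(100)).encode("ascii")


def u64_to_decimal(n: int) -> str:
    """Convert an unsigned 64-bit value to its decimal string."""
    if not 0 <= n < 2**64:
        raise ValueError("expected an unsigned 64-bit value")
    buf = bytearray(20)  # 2**64 - 1 has 20 decimal digits
    pos = len(buf)
    # Fill the buffer from the right, two digits per division by 100.
    while n >= 100:
        n, rem = divmod(n, 100)
        pos -= 2
        buf[pos:pos + 2] = DIGIT_PAIRS[2 * rem:2 * rem + 2]
    # One or two leading digits remain.
    if n >= 10:
        pos -= 2
        buf[pos:pos + 2] = DIGIT_PAIRS[2 * n:2 * n + 2]
    else:
        pos -= 1
        buf[pos] = ord("0") + n
    return buf[pos:].decode("ascii")
```

The loop runs once per pair of digits, so short numbers finish after very few iterations, which is relevant to the point about single-digit inputs raised elsewhere in the thread.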

superjan

33 minutes ago

It’s faster for 3 digits and more. 3 digits is not galactic scale. Otoh, if over half of your numbers are single digits, it will lose to other implementations. I think that is more often the case than we’d like it to be.

adrian_b

2 hours ago

I always use binary interchange formats between programs so I am not familiar with the overhead caused by format conversions. Even when displaying numbers for reading them, in the case of floating-point numbers that are displayed in the "scientific" format, i.e. with exponents, I prefer to have only the exponent as a decimal number, but the significand as a hexadecimal number. So I do not need fast algorithms for number conversions.

Nonetheless, there are plenty of people who advocate the use of JSON, XML and similar formats, in which case I assume that number conversions can take a non-negligible time, which might be decreased by such fast algorithms.
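One notation that fits the description of a hexadecimal significand with a decimal exponent is the hexadecimal floating-point format (C99 `%a`, Python `float.hex`), where the exponent is a power of two written in decimal. Whether this is the exact format the commenter uses is an assumption; the sketch only shows that the notation round-trips exactly, with no decimal conversion involved.

```python
x = 0.1

# Hexadecimal significand, power-of-two exponent written in decimal.
text = x.hex()

# Parsing the same notation back recovers the identical double.
restored = float.fromhex(text)
exact_round_trip = (restored == x)

print(text)              # 0x1.999999999999ap-4
print(exact_round_trip)  # True
```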

superjan

23 minutes ago

You know, if I could change the code at both ends of the pipeline without overhead, using the language & library of my choice, I’d do this too. For many of us this isn’t always the case.

Cold_Miserable

3 hours ago

This is just a worse copy of the original IFMA method. Sneller is even better for max throughput.

IshKebab

3 hours ago

Very impressive! But yeah AVX-512 is an awkward requirement.

adrian_b

2 hours ago

There already exists a large installed base of AMD Zen 4 and Zen 5 CPUs.

Next year, these AVX-512 supporting CPUs will be joined by AMD Zen 6 and Intel Nova Lake. Starting with Intel Nova Lake, all future Intel CPUs will support AVX-512.

IshKebab

54 minutes ago

Sure, it's not just the support though. As I understand it, it also has serious power and frequency implications. Also, if your process uses AVX-512 you suddenly have an extra 2 kB of data to save/restore on context switches. Maybe not super significant, but I really doubt this will ever make it into standard libraries.
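The 2 kB figure can be checked from the architectural register sizes in 64-bit mode. This is a rough sketch of the arithmetic only; the actual save-area layout and what an operating system saves on a given switch are not modeled here, and the variable names are illustrative.

```python
# Architectural AVX-512 state in 64-bit mode.
ZMM_REGISTERS, ZMM_BITS = 32, 512    # zmm0..zmm31
MASK_REGISTERS, MASK_BITS = 8, 64    # k0..k7
YMM_REGISTERS, YMM_BITS = 16, 256    # AVX2 state, already saved without AVX-512

zmm_bytes = ZMM_REGISTERS * ZMM_BITS // 8      # full vector register file
mask_bytes = MASK_REGISTERS * MASK_BITS // 8   # opmask registers
ymm_bytes = YMM_REGISTERS * YMM_BITS // 8      # portion shared with AVX2

# State added on top of what an AVX2-only process already carries.
extra_over_avx2 = zmm_bytes + mask_bytes - ymm_bytes

print(zmm_bytes, mask_bytes, extra_over_avx2)  # 2048 64 1600
```

So the vector register file alone is 2048 bytes, and the increment over an AVX2-only process is about 1.6 kB.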

jqpabc123

4 days ago

"Our design exploits the AVX-512 instruction set"

AVX-512 is being discontinued in newer Intel consumer CPUs, particularly with the Alder Lake series, where it has been completely disabled through BIOS updates.

adrian_b

3 hours ago

Your comment is obsolete.

AVX-512 had been discontinued in the CPU generations from Alder Lake until the Panther Lake, Wildcat Lake and Clearwater Forest CPUs introduced during the first half of 2026, but Intel has committed that all future Intel CPUs will implement the complete 512-bit variant of the AVX-512 a.k.a. AVX10 ISA, starting with the Nova Lake desktop and laptop CPUs, to be launched by the end of this year.

Obviously, the competition from the AMD Zen 4, Zen 5 and Zen 6 CPUs, all of which implement AVX-512 and easily beat any Intel CPU in any workload that has been updated to use the AVX-512 ISA, has forced Intel to reconsider its previous decision.

anematode

4 hours ago

To the contrary, Nova Lake, coming out this year, will have it.

yvdriess

3 hours ago

And that's a shame, but the relevant workloads typically run on server class CPUs.

adrian_b

3 hours ago

Of all the workloads that I execute on my laptops or desktops, there is only one where the speed matters yet it is not significantly affected by the use of the AVX-512 ISA: the compilation of big software projects.

All the other things that I do and which can take a noticeable CPU time (i.e. not time used for waiting on SSDs or other peripherals) can be accelerated by AVX-512. This includes things like computing file hashes, data compression and encryption algorithms, graphics/audio/video algorithms and also EDA/CAD applications.