FilterHN

orliesaurus

4 hours ago

[-]

The most surprising part of uv's success to me isn't Rust at all, it's how much speed we "unlocked" just by finally treating Python packaging as a well-specified systems problem instead of a pile of historical accidents. If uv had been written in Go or even highly optimized CPython, but with the same design decisions (PEP 517/518/621/658 focus, HTTP range tricks, aggressive wheel-first strategy, ignoring obviously defensive upper bounds, etc.), I strongly suspect we'd be debating a 1.3× vs 1.5× speedup instead of a 10× headline — but the conversation here keeps collapsing back to "Rust rewrite good/bad." That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?

Nextgrid

3 hours ago

[-]

It's not just greenfield-ness but the fact it's a commercial endeavor (even if the code is open-source).

Building a commercial product means you pay money (or something they equally value) to people to do your bidding. You don't have to worry about politics, licensing, and all the usual FOSS-related drama. You pay them to set their opinions aside and build what you want, not what they want (and if that doesn't work, it just means you need to offer more money).

In this case it's a company that believes they can make a "good" package manager they can sell/monetize somehow and so built that "good" package manager. Turns out it's at least good enough that other people now like it too.

This would never work in a FOSS world because the project will be stuck in endless planning as everyone will have an opinion on how it should be done and nothing will actually get done.

Similar story with systemd - all the bitching you hear about it (to this day!) is the stuff that would've happened during its development phase had it been developed as a typical FOSS project and ultimately made it go nowhere - but instead it's one guy that just did what he wanted and shared it with the world, and enough other people liked it and started building upon it.

3 hours ago

[-]

> You don't have to worry about politics, licensing, and all the usual FOSS-related drama. You pay them to set their opinions aside and build what you want, not what they want (and if that doesn't work, it just means you need to offer more money).

Money is indeed a great lubricator.

However, it's not black-and-white: office politics is a long standing term for a reason.

Nextgrid

3 hours ago

[-]

Office politics happen when people determine they can get more money by engaging in politics instead of working. This is just an indicator people aren't being paid enough money (since people politicking around is detrimental to the company, it is better off paying them whatever it takes for them not to engage in such behavior). "You get what you pay for" applies yet again.

dpark

2 hours ago

[-]

Politicking is just group dynamics. In large companies people engage in politics because it becomes necessary to accomplish large things.

Of course a group can also have bad actors but that’s not really an issue with politics specifically. Politics are neither good nor bad.

xvector

2 hours ago

[-]

Hard disagree, most of my coworkers make well north of $1M and office politics is at an all time high.

I believe office politics happens when there are simply too many people at a company or org.

optionalsquid

2 hours ago

[-]

Office politics happen when the number of people at an office exceeds 2

Nextgrid

2 hours ago

[-]

I think too many people happens because a company would rather hire 10 "market rate" people than 3 well-compensated ones. Headcount inflation dilutes responsibility and rewards, so even if one of the "market rate" guys does the best work possible they won't get rewarded proportionally... so if hard work isn't going to get them adequate comp, maybe politics will.

ngcc_hk

1 hour ago

[-]

I believe incompetence is the key. When someone cannot compete (or the office does not use yardstick that can be measurable) politics is the only way to get you up.

Switch to what Nobel prize to man instead of the woman who do the work … sometimes. Take the credit and get the promotion.

another-account

1 hour ago

[-]

Sounds like you’re really down on FOSS and think FOSS projects don’t get stuff done and have no success? You might want to think about that a bit more.

insane_dreamer

1 hour ago

[-]

numpy would like a word

baby_souffle

4 hours ago

[-]

I largely agree but don't want to entirely discount the effect that using a compiled language had.

At least in my limited experience, the selling point with the most traction is that you don't already need a working python install to get UV. And once you have UV, you can just go!

If I had a dollar for every time I've helped somebody untangle the mess of python environment libraries created by an undocumented mix of python delivered through the distributions package management versus native pip versus manually installed...

At least on paper, both poetry and UV have a pretty similar feature set. You do however need a working python environment to install and use poetry though.

mkoubaa

3 hours ago

[-]

1000% this. uv is trivially installable and is completely unrelated to installations of python.

3 hours ago

[-]

I wonder how much Rust's default to statically link almost everything helped here? That should make deployment of uv even easier?

collinmanderson

3 hours ago

[-]

> the conversation here keeps collapsing back to "Rust rewrite good/bad." That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?

I think there's a few things going on here:

- If you're going have a project that's obsessed with speed, you might as well use rust/c/c++/zig/etc to develop the project, otherwise you're always going to have python and the python ecosystem as a speed bottleneck. rust/c/c++/zig ecosystems generally care a lot about speed, so you can use a library and know that it's probably going to be fast.

- For example, the entire python ecosystem generally does not put much emphasis on startup time. I know there's been some recent work here on the interpreter itself, but even modules in the standard library will pre-compile regular expressions at import time, even if they're never used, like the "email" module.

- Because the python ecosystem doesn't generally optimize for speed (especially startup), the slowdowns end up being contagious. If you import a library that doesn't care about startup time, why should your library care about startup time? The same could maybe be said for memory usage.

- The bootstrapping problem is also mostly solved by using a complied language like c/rust/go. If the package manager is written in python (or even node/javascript), you first have to have python+dependencies installed before you can install python and your dependencies. With uv, you copy/install a single binary file which can then install python + dependencies and automatically do the right thing.

- I think it's possible to write a pretty fast implementation using python, but you'd need to "greenfield" it by rewriting all of the dependencies yourself so you can optimize startup time and bootstrapping.

- Also, as the article mentions there are _some_ improvements that have happened in the standards/PEPs that should eventually make they're way into pip, though it probably won't be quite the gamechanger that uv is.

edoceo

4 hours ago

[-]

Consensus building and figuring out what was actually needed?

Someone on this site said most tech problems are people problems - this feels like one.

Greenfield mostly solves the problem because it's all new people.

morshu9001

2 hours ago

[-]

I can't find the quote for this, but I remember Python maintainers wanted package installing and management to be separate things. uv did the opposite, and instead it's more like npm.

MBCook

19 minutes ago

[-]

Do you remember the reason? I spend most of my time in the Java and JS ecosystems where one tool does both jobs.

In my mind they’re pretty heavily linked. But that may be based on not experiencing the opposite. At least not as far as I can remember.

3 hours ago

[-]

> That feels like cargo-culting the toolchain [...]

Pun intended?

Jokes aside, what you describe is a common pattern. It's also why Google internally they used to get decent speedups from rewriting some old C++ project in Go for a while: the magic was mostly in the rewrite-with-hindsight.

If you put effort into it, you can also get there via an incremental refactoring of an existing system. But the rewrite is probably easier to find motivation for, I guess.

ChadNauseam

1 hour ago

[-]

I don't know the problem space and I'm sure that the language-agnostic algorithmic improvements are massive. But to me, there's just something about rust that promotes fast code. It's easy to avoid copies and pointer-chasing, for example. In python, you never have any idea when you're copying, when you're chasing a pointer, when you're allocating, and so on. (Or maybe you do, but I certainly don't.) You're so far from hardware that you start thinking more abstractly and not worrying about performance. For some things, that's probably perfect. But for writing fast code, it's not the right mindset.

JasonSage

3 hours ago

[-]

I suspect that the non-Rust improvements are vastly more important than you’re giving credit for. I think the go version would be 5x or 8x compared to the 10x, maybe closer. It’s not that the Rust parts are insignificant but the algorithmic changes eliminate huge bottlenecks.

3 hours ago

[-]

Though Rust probably helps getting the design right, instead of fighting it.

From having sum-types to also having a reasonable packaging system itself.

benreesman

3 hours ago

[-]

I have been a big Astral and uv booster for a long time. But specifications like this one: https://gist.github.com/b7r6/47fea3c139e901cd512e15f42355f26... have me re-evaluating everything.

That's TensorRT-LLM in it's entirety at 1.2.0rc6 locked to run on Ubuntu or NixOS with full MPI and `nvshmem`, the DGX container Jensen's Desk edition (I know because I also rip apart and `autopatchelf` NGC containers for repackaging on Grace/SBSA).

It's... arduous. And the benefit is what exactly? A very mixed collection of maintainers have asserted that software behavior is monotonic along a single axis most of which they can't see and we ran a solver over those guesses?

I think the future is collections of wheels that have been through a process the consumer regards as credible.

jeeeb

2 hours ago

[-]

> That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?

This feels like a very unfair take to me. Uv didn’t happen in isolation, and wasn’t the first alternative to pip. It’s built on a lot of hard work by the community to put the standards in place, through the PEP process, that make it possible.

What uv did was to bring it all together.

moab

1 hour ago

[-]

The point stands that it's less about the language than doing said hard work in any reasonable programming language.

7 hours ago

[-]

I think this post does a really good job of covering how multi-pronged performance is: it certainly doesn't hurt uv to be written in Rust, but it benefits immensely from a decade of thoughtful standardization efforts in Python that lifted the ecosystem away from needing `setup.py` on the hot path for most packages.

glaslong

6 hours ago

[-]

Someone once told me a benefit of staffing a project for Haskell was it made it easy to select for the types of programmers that went out of their way to become experts in Haskell.

Tapping the Rust community is a decent reason to do a project in Rust.

bri3d

6 hours ago

[-]

It's an interesting debate. The flip side of this coin is getting hires who are more interested in the language or approach than the problem space and tend to either burn out, actively dislike the work at hand, or create problems that don't exist in order to use the language to solve them.

With that said, Rust was a good language for this in my experience. Like any "interesting" thing, there was a moderate bit of language-nerd side quest thrown in, but overall, a good selection metric. I do think it's one of the best Rewrite it in X languages available today due to the availability of good developers with Rewrite in Rust project experience.

The Haskell commentary is curious to me. I've used Haskell professionally but never tried to hire for it. With that said, the other FP-heavy languages that were popular ~2010-2015 were absolutely horrible for this in my experience. I generally subscribe to a vague notion that "skill in a more esoteric programming language will usually indicate a combination of ability to learn/plasticity and interest in the trade," however, using this concept, I had really bad experiences hiring both Scala and Clojure engineers; there was _way_ too much academic interest in language concepts and way too little practical interest in doing work. YMMV :)

tikhonj

3 hours ago

[-]

If you're doing something forgettable, what makes you think the workaday Java or Python programmer would find it innately motivating?

Alternately, if you have the sort of work or culture that taps into people's intrinsic motivation, why would that work worse with Haskell or Clojure programmers than anybody else?

People are interested in different things along different dimensions. The way somebody is motivated by what they're doing and the way somebody is motivated by how they're doing it really don't seem all that correlated to me.

mannycalavera42

5 hours ago

[-]

Clojure engineers not interested in doing work? That's surprising

lll-o-lll

4 hours ago

[-]

When people say things like:

> there was way too much academic interest in language concepts and way too little practical interest in doing work.

They are communicating something real, but perhaps misattributing the root cause.

The purely abstract ‘ideal’ form of software development is unconstrained by business requirements. In this abstraction, perfect software would be created to purely express an idea. Academia allows for this, and to a lesser extent some open source projects.

In the real world, the creation of software must always be subordinate to the goals of the business. The goals are the purpose, and the software is the means.

Languages that are academically interesting, unsurprisingly, attract a greater preponderance of academically minded individuals. Of these, only a percentage have the desire or ability to let go of the pure abstract, and instead focus on the business domain. So it inevitably creates a management challenge; not an insurmountable one, but a challenge.

Hence the simplified ‘these people won’t do the work!’.

bri3d

1 hour ago

[-]

Yes, exactly this. I don’t feel that I misattributed anything, but if I had to expound on the idea this is exactly how I would explain it.

Calavar

6 hours ago

[-]

Paul Graham said the same thing about Python 20 years ago [1], and back then it was true. But once a programming langauge hits mainstream, this ceases to be a good filter.

[1] https://paulgraham.com/pypar.html

jghn

6 hours ago

[-]

This is important. The benefit here isn't the language itself. It's the fact that you're pulling from an esoteric language. People should not overfit and feel that whichever language is achieving that effect today is special in this regard.

mkoubaa

3 hours ago

[-]

He was right. Python programmers are still the most likely to prioritize getting things done quickly.

tyre

30 minutes ago

[-]

This is a pretty broad generalization!

The fastest iterating people engineers I’ve worked with often have a deep user focus rather than a language affiliation.

rgoulter

16 minutes ago

[-]

Eh.

I think the cultural context has changed.

In "python paradox", 'knows python' is an indication that the developer is interested in something technically interesting but otherwise impractical. Hence, it's a 'paradox' that you end up practically better off by selecting for something impractical.

These days, Python is surely a practical choice, so doesn't really resemble the "interested in something technically interesting but impractical".

discreteevent

4 hours ago

[-]

That was bullshit then and it's bullshit now but it sells very well to people who know a few programming languages (a lot of the people on this site)

steve_adams_86

6 hours ago

[-]

I'm my experience this is definitely where rust shined. The language wasn't really what made the project succeed so much as having relatively curious, meticulous, detail-oriented people on hand who were interested in solving hard problems.

Sometimes I thought our teams would be a terrible fit for more cookie-cutter applications where rapid development and deployment was the primary objective. We got into the weeds all the time (sometimes because of rust itself), but it happened to be important to do so.

Had we built those projects with JavaScript or Python I suspect the outcomes would have been worse for reasons apart from the language choice.

IgorPartola

5 hours ago

[-]

Rust is also a systems language. I am still wrapping my mind around why it is so popular for so many end projects when its main use case and goals were basically writing a browser a maybe OS drivers.

But that’s precisely why it is good for developer tools. And it turns out people who write systems code are really damn good at writing tools code.

As someone who cut my teeth on C and low level systems stuff I really ought to learn Rust one of these days but Python is just so damn nice for high level stuff and all my embedded projects still seem to require C so here I am, rustless.

aaronblohowiak

2 hours ago

[-]

If python's painpoints don't bother you enough (or you are already comfortable with all the workarounds,) then I'm not sure Rust will do much for you.

What I like about Rust is ADTs, pattern matching, execution speed. The things that really give me confidence are error handling (right balance between "you can't accidentally ignore errors" of checked exceptions with easy escape hatches for when you want to YOLO,) and the rarity of "looks right, but is subtly wrong in dangerous ways" that I ran into a lot in dynamic languages and more footgun languages.

Compile times suck.

IgorPartola

2 hours ago

[-]

I rarely if ever encounter bugs that type checking would have fixed. Most common types of bugs for me are things like forgetting that two different code paths access a specific type of database record and when they do both need to do something special to keep data cohesive. Or things like concurrency. Or worst of all things like fragile subprocesses (ffmpeg does not like being controlled by a supervisor process). I think all in all I have encountered about a dozen bugs in Python that were due to wrong types over the past 17 years of writing code in this language. Maybe slightly more than that in JS. The reason I would switch is performance.

webstrand

4 hours ago

[-]

I write scripts in rust as a replacement for bash. Its really quite good at it. Aside from perl, its the only scripting language that can directly make syscalls. Its got great libraries for: parsing, configuration management, and declarative CLIs built right into it.

Sure its a little more verbose than bash one-liners, but if you need any kind of error handling and recovery, its way more effective than bash and doesn't break when you switch platforms (i.e. mac/bsd utility incompatibilities with gnu utilities).

My only complaint would be that dealing with OsString is more difficult than necessary. Way to much of the stdlib encourages programmers to just do "non-utf8 paths don't exist" and panic/ignore when encountering one. (Not a malady exclusive to rust, but I wish they'd gotten it right)

Example I had handy: <https://gist.github.com/webstrand/945c738c5d60ffd7657845a654...>

6 hours ago

[-]

> having relatively curious, meticulous, detail-oriented people on hand who were interested in solving hard problems.... Had we built those projects with JavaScript or Python I suspect the outcomes would have been worse for reasons apart from the language choice.

I genuinely can't understand why you suppose that has to do with the implementation language at all.

tikhonj

3 hours ago

[-]

Different programming languages come with different schools of thought about programming and different communities of practice around programming.

If you take a group of people who are squarely in the enterprise Java school of thought and have them write Rust, the language won't make much of a difference. They will eventually be influenced by the broader Rust community and the Rust philosophy towards programming, but, unless they're already interested in changed approaches, this will be a small, gradual difference. So you'll end up with Enterprise Java™ code, just in Rust.

But if you hire from the Rust community, you will get people who have a fundamentally different set of practices and expectations around programming. They will not only have a stronger grasp of Rust and Rust idioms but will also have explicit knowledge based on Rust (eg Rust-flavored design patterns and programming techniques) and, crucially, tacit knowledge based on Rust (Rust-flavored ways of programming that don't break down into easy-to-explain rules). And, roughly speaking, the same is going to be true for whatever other language you substitute for "Rust".

(I say roughly because there doesn't have to be a 1:1 relationship between programming languages, schools of thought and communities of practice. A single language can have totally different communities—just compare web Python vs data scientist Python—and some communities/schools can span multiple languages. But, as an over-simplified model, seeing a language as a community is not the worst starting point.)

KPGv2

5 hours ago

[-]

> I genuinely can't understand why you suppose that has to do with the implementation language at all.

Languages that attract novice programmers (JS is an obvious one; PHP was one 20 years ago) have a higher noise to signal ratio than one that attracts intermediate and above programmers.

If you grabbed an average Assembly programmer today, and an average JavaScript programmer today, who do you think is more careful about programming? The one who needs to learn arcane shit to do basic things and then has to compile it in order to test it out, or the one who can open up Chrome's console and console.log("i love boobies")

How many embedded systems programmers suck vs full stack devs? I'm not saying full stack devs are inferior. I'm saying that more inferior coders are attracted to the latter because the barriers to entry are SO much easier to bypass.

5 hours ago

[-]

Sure, but that kind of incompetence is already filtered out (in the https://www.lesswrong.com/w/screening-off-evidence sense) by the task of creating a package installer.

IgorPartola

5 hours ago

[-]

You would think so, yet here I am sitting with a node_modules full of crud placed there by npm, waiting for the next supply chain attack.

tacticus

4 hours ago

[-]

npm isn't the issue there it's the ts\js community and their desire to use a library for everything. in communities that do not consider dependencies to be a risk you will find this showing up in time.

The node supply chain attacks are also not unique to node community. you see them happening on crates.io and many other places. In fact the build time scripts that cause issues on node modules are probably worse off with the flexibility of crate build scripts and that they're going to be harder to work around than in npm.

4 hours ago

[-]

I don't see how that follows.

uv doesn't exactly stop python package supply chain attacks...

4 hours ago

[-]

That argument is FUD. The people who created the NPM package manager are not the people who wrote your dependencies. Further, supply chain attacks occur for reasons that are entirely outside NPM's control. Fundamentally they're a matter of trust in the ecosystem — in the very idea of installing the packages in the first place.

7 hours ago

[-]

I think a lot of rust rewrites have this benefit; if you start with hindsight you can do better more easily. Of course, rust is also often beneficial for its own sake, so it's a one-two punch:)

7 hours ago

[-]

Succinctly, perhaps with some loss of detail:

"Rewrite" is important as "Rust".

pixelpoet

6 hours ago

[-]

as important as

3 hours ago

[-]

whoops. That's right

Levitating

7 hours ago

[-]

> I think a lot of rust rewrites have this benefit

I think Rust itself has this benefit

7 hours ago

[-]

Completely agreed!

s_ting765

7 hours ago

[-]

Rust rewrites are known for breaking (compatibility with) working software. That's all there is to them.

6 hours ago

[-]

In Python's case, as the article describes quite clearly, the issue is that the design of "working software" (particularly setup.py) was bad to the point of insane (in much the same way as the NPM characteristics that enabled the recent Shai Hulud supply chain attacks, but even worse). At some point, compatibility with insanity has got to go.

Helpfully, though, uv retains compatibility with newer (but still well-established) standards in the Python community that don't share this insanity!

s_ting765

6 hours ago

[-]

My gripe is with Rust rewrites. Not uv. Though I very much think uv is overhyped.

eduction

5 hours ago

[-]

Actually uv retains compatibility with the setup.py “insanity,” according to the article:

> uv parses TOML and wheel metadata natively, only spawning Python when it hits a setup.py-only package that has no other option

The article implies that pip also prefers toml and wheel metadata, but has to shell out to parse those, unlike uv.

3 hours ago

[-]

Ugh. Thank you for the correction. :(

eduction

2 hours ago

[-]

I mean, you’re on the right track in that they did cut out other insanity. But unclear how much of the speed up is necessarily tied to breaking backward compat (are there a lot of “.egg” files in the wild?)

Lammy

5 hours ago

[-]

I would say the downside of them is that they're known for replacing GPL software with MIT software

psyclobe

4 hours ago

[-]

Got it so, because it is rust it is good.. 10-4!!

epage

7 hours ago

[-]

> uv is fast because of what it doesn’t do, not because of what language it’s written in. The standards work of PEP 518, 517, 621, and 658 made fast package management possible. Dropping eggs, pip.conf, and permissive parsing made it achievable. Rust makes it a bit faster still.

Isn't assigning out what all made things fast presumptive without benchmarks? Yes, I imagine a lot is gained by the work of those PEPs. I'm more questioning how much weight is put on dropping of compatibility compared to the other items. There is also no coverage for decisions influenced by language choice which likely influences "Optimizations that don’t need Rust".

This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.

6 hours ago

[-]

> Isn't assigning out what all made things fast presumptive without benchmarks?

We also have the benchmark of "pip now vs. pip years ago". That has to be controlled for pip version and Python version, but the former hasn't seen a lot of changes that are relevant for most cases, as far as I can tell.

> This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.

This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML. A package installer really only needs to look at it at all when building from source, and part of what these standards have done for us is improve wheel coverage. (Other relevant PEPs here include 600 and its predecessors.) Although that has also largely been driven by education within the community, things like e.g. https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... and https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pure-... .

kibwen

55 minutes ago

[-]

> This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML.

I don't know the details of Python's resolution algorithm, but for Cargo (which is where epage is coming from) a lockfile (which is encoded in TOML) can be somewhat large-ish, maybe pushing 100 kilobytes (to the point where I'm curious if epage has benchmarked to see if lockfile parsing is noticeable in the flamegraph).

[0] https://www.youtube.com/watch?v=gSKTfG1GXYQ

krick

3 hours ago

[-]

To be fair, the whole post isn't very good IMO, regardless of ChatGPT involvement, and it's weird how some people seem to treat it like some kind of revelation.

I mean, of course it wasn't specifically Rust that made it fast, it's really a banal statement: you need only very moderate serious programming experience to know, that rewriting legacy system from scratch can make it faster even if you rewrite it in a "slower" language. There have been C++ systems that became faster when rewritten in Python, for god's sake. That's what makes system a "legacy" system: it does a ton of things and nobody really knows what it does anymore.

But when listing things that made uv faster it really mentions some silly things, among others. Like, it doesn't parse pip.conf. Right, sure, the secret of uv's speed lies in not-parsing other package manager's config files. Great.

So all in all, yes, no doubt that hundreds of little things contributed into making uv faster, but listing a few dozens of them (surely a non-exhaustive lists) doesn't really enable you to make any conclusions about the relative importance of different improvements whatsoever. I suppose the mentioned talk[0] (even though it's more than a year old now) would serve as a better technical report.

pecheny

7 hours ago

[-]

The content is nice and insightful! But God I wish people stopped using LLMs to 'improve' their prose... Ironically, some day we might employ LLMs to re-humanize texts that had been already massacred.

captn3m0

6 hours ago

[-]

The author’ blog was on HN a few days ago as well for an article on SBOMs and Lockfiles. They’ve done a lot of work in the supply-chain security side and are clearly knowledgeable, and yet the blog post got similarly “fuzzified” by the LLM.

NewsaHackO

4 hours ago

[-]

To me, unless it is egregious, I would be very sensitive to avoid false positives before saying something is LLM aided. If it is clearly just slop, then okay, but I definitely think there is going to be a point where people claim well-written, straightforward posts as LLM aided. (Or even the opposite, which already happens, where people purposely put errors in prose to seem genuine).

[1]: https://github.com/andrew/nesbitt.io/commit/0664881a524feac4...

DrawTR

2 hours ago

[-]

Editing the post to switch five "it's X not Y"s[1] is pretty disappointing. I wish people were more clear with their disclosure of LLM editing.

laidoffamazon

7 hours ago

[-]

Interestingly I didn’t catch this, I liked it for not looking LLM written!

yunohn

7 hours ago

[-]

“Why this matters” being the final section is a guaranteed give away, among innumerable others.

rick_dalton

5 hours ago

[-]

I realized once I was in the "optimizations that dont need rust" section. Specifically "This is concurrency, not language magic."

dkmar

4 hours ago

[-]

Yup. The author has now swapped that part out for “Any language can do this.”

Just commenting to preempt any comments telling you that the article doesn’t say this.

yunohn

7 hours ago

[-]

I have reached a point where any AI smell (of which this articles has many) makes me want to exit immediately. It feels tortuous to my reading sensibilities.

I blame fixed AI system prompts - they forcibly collapse all inputs into the same output space. Truly disappointing that OpenAI et all have no desire to change this before everything on the internet sounds the same forever.

fleebee

7 hours ago

[-]

You're probably right about the latter point, but I do wonder how hard it'd be to mask the default "marketing copywriter" tone of the LLM by asking it to assume some other tone in your prompt.

As you said, reading this stuff is taxing. What's more, this is a daily occurrence by now. If there's a silver lining, it's that the LLM smells are so obvious at the moment; I can close the tab as soon as I notice one.

SatvikBeri

6 hours ago

[-]

> do wonder how hard it'd be to mask the default "marketing copywriter" tone of the LLM by asking it to assume some other tone in your prompt.

Fairly easy, in my wife's experience. She repeatedly got accused of using chatgpt in her original writing (she's not a native english speaker, and was taught to use many of the same idioms that LLMs use) until she started actually using chatgpt with about two pages of instructions for tone to "humanize" her writing. The irony is staggering.

mattkevan

5 hours ago

[-]

It’s pretty easy. I’ve written a fairly detailed guide to help Claude write in my tone of voice. It also coaxes it to avoid the obvious AI tells such as ‘It’s not X it’s Y’ sentences, American English and overuse of emojis and em dashes.

It’s really useful for taking my first drafts and cleaning them up ready for a final polish.

efilife

5 hours ago

[-]

I also don't read AI slop. It's disrespectful to any reader.

ethin

7 hours ago

[-]

> Zero-copy deserialization. uv uses rkyv to deserialize cached data without copying it. The data format is the in-memory format. This is a Rust-specific technique.

This (zero-copy deserialization) is not a rust-specific technique, so I'm not entirely sure why the author describes it as one. Any good low level language (C/C++ included) can do this from my experience.

nemothekid

7 hours ago

[-]

Given the context of the article, I think "Rust specific" here means that "it couldn't be done in python".

For example "No interpreter startup" is not specific to Rust either.

7 hours ago

[-]

I think the framing in the post is that it's specific to Rust, relative to what Python packaging tools are otherwise written in (Python). It's not very easy to do zero-copy deserialization in pure Python, from experience.

(But also, I think Rust can fairly claim that it's made zero-copy deserialization a lot easier and safer.)

5 hours ago

[-]

I can't even imagine what "safety" issue you have in mind. Given that "zero-copy" apparently means "in-memory" (a deserialized version of the data necessarily cannot be the same object as the original data), that's not even difficult to do with the Python standard library. For example, `zipfile.ZipFile` has a convenience method to write to file, but writing to in-memory data is as easy as

  with zipfile.ZipFile(archive_name) as a:
      with a.open(file_name) as f, io.BytesIO() as b:
          b.write(f.read())
          return b.getvalue()

(That does, of course, copy data around within memory, but.)

[1]: https://rkyv.org/zero-copy-deserialization.html

5 hours ago

[-]

> Given that "zero-copy" apparently means "in-memory" (a deserialized version of the data necessarily cannot be the same object as the original data), that's not even difficult to do with the Python standard library

This is not what zero-copy means. Here's a working definition[1].

Specifically, it's not just about keeping things in memory; copying in memory is normal. The goal is to not make copies (or more precisely, what Rust would call "clones"), but to instead convey the original representation/views of that representation through the program's lifecycle where feasible.

> a deserialized version of the data necessarily cannot be the same object as the original data

rust-asn1 would be an example of a Rust library that doesn't make any copies of data unless you explicitly ask it to. When you load e.g. a Utf8String[2] in rust-asn1, you get a view into the original input buffer, not an intermediate owning object created from that buffer.

> (That does, of course, copy data around within memory, but.)

Yes, that's what makes it not zero-copy.

[2]: https://docs.rs/asn1/latest/asn1/struct.Utf8String.html

5 hours ago

[-]

> Yes, that's what makes it not zero-copy.

Yeah, so you'd have to pass around the `BytesIO` instead.

I know that zero-copy doesn't ordinarily mean what I described, but that seemed to be how TFA was using it, based on the logic in the rest of the sentence.

4 hours ago

[-]

> Yeah, so you'd have to pass around the `BytesIO` instead.

That wouldn’t be zero-copy either: BytesIO is an I/O abstraction over a buffer, so it intentionally masks the “lifetime” of the original buffer. In effect, reading from the BytesIO creates new copies of the underlying data by design, in new `bytes` objects.

(This is actually a great capsule example of why zero-copy design is difficult in Python: the Pythonic thing to do is to make lots of bytes/string/rich objects as you parse, each of which owns its data, which in turn means copies everywhere.)

4 hours ago

[-]

Fair. (You can `.getbuffer` but you still have to keep the underlying BytesIO object "open" somehow.)

I'm not convinced this is going to bottleneck things, though.

(On the flip side, I guess the OS is likely to cache any disk write in memory anyway.)

SpaceNugget

3 hours ago

[-]

As a quick and kind of oversimplified example of what zero copy means, imagine you read the following json string from a file/the network/whatever:

    json = '{"user":"nugget"}' // from somewhere

A simple way to extract json["user"] to a new variable would be to copy the bytes. In pythony/c pseudo code

    let user = allocate_string(6 characters)
    for i in range(0, 6)
      user[i] = json["user"][i]
    // user is now the string "nugget"

instead, a zero copy strategy would be to create a string pointer to the address of json offset by 9, and with a length of 6.

    {"user":"nugget"}
             ^     ]end

The reason this can be tricky in C is that when you call free(json), since user is a pointer to the same string that was json, you have effectively done free(user) as well.

So if you use user after calling free(json), You have written a classic _memory safety_ bug called a "use after free" or UAF. Search around a bit for the insane number of use after free bugs there have been in popular software and the havoc they have wreaked.

In rust, when you create a variable referencing the memory of another (user pointing into json) it keeps track of that (as a "borrow", so that's what the borrow checker does if you have read about that) and won't compile if json is freed while you still have access to user. That's the main memory safety issue involved with zero-copy deserialization techniques.

stefan_

6 hours ago

[-]

I suppose it can fairly claim that now every other library and blog post invokes "zero-copy" this and that, even in the most nonsensical scenarios. It's a technique for when you can literally not afford the memory bandwidth, because you are trying to saturate a 100Gbps NIC or handling 8k 60Hz video, not for compromising your data serialization schemes portability for marketing purposes while all applications hit the network first, disk second and memory bandwidth never.

vlovich123

6 hours ago

[-]

You’ve got this backward. The vast majority of time due to spatial and temporal locality, in practice for any application you’re actually usually doing CPU registers first, cache second, memory third, disk fourth, network cache fifth, and network origin sixth. So this stuff does actually matter for performance.

Also, aside from memory bandwidth, there’s a latency cost inherent in traversing object graphs - 0 copy techniques ensure you traverse that graph minimally, just what’s needed to actually be accessed which is huge when you scale up. There’s a difference between one network request and fetching 1 MB vs making 100 requests to fetch 10kib and this difference also appears in memory access patterns unless they’re absorbed by your cache (not guaranteed for object graph traversal that a package manager would be doing).

6 hours ago

[-]

Many of the hot paths in uv involve an entirely locally cached set of distributions that need to be loaded into memory, very lightly touched/filtered, and then sunk to disk somewhere else. In those contexts, there are measurable benefits to not transforming your representation.

(I'm agnostic on whether zero-copy "matters" in every single context. If there's no complexity cost, which is what Rust's abstractions often provide, then it doesn't really hurt.)

5 hours ago

[-]

The point is that the packaging tool can analyze files from within the archives it downloads, without writing them to disk.

kbd

7 hours ago

[-]

It's Rust vs Python in this case.

landr0id

5 hours ago

[-]

They speak about “technique” but rkyv is a Rust-specific format. Could be an editing error or maybe they’re suggesting it’s more difficult in python.

ofek

6 hours ago

[-]

pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.

For example, my employer (Datadog) allowed me and two other engineers to improve various aspects of Python packaging for nearly an entire quarter. One of the items was to satisfy a few long-standing pip feature requests. I discovered that the cross-platform resolution feature I considered most important is basically incompatible [1] with the current code base. Maintainers would have to decide which path they prefer.

[1]: https://github.com/pypa/pip/issues/13111

5 hours ago

[-]

> pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.

Backwards compatibility is the one thing that prevents the code in an older project from being replaced with a better approach in situ. It cannot be more difficult than a rewrite, except that rewrites (arguably including my project) may hold themselves free to skip hard legacy cases, at least initially (they might not be relevant by the time other code is ready).

(I would be interested in hearing from you about UX designs for cross-platform resolution, though. Are you just imagining passing command-line flags that describe the desired target environment? What's the use case exactly — just making a .pylock file? It's hard to imagine cross-platform installation....)

punnerud

5 hours ago

[-]

My favorite speed up trick: “ HTTP range requests for metadata. Wheel files are zip archives, and zip archives put their file listing at the end. uv tries PEP 658 metadata first, falls back to HTTP range requests for the zip central directory, then full wheel download, then building from source. Each step is slower and riskier. The design makes the fast path cover 99% of cases. None of this requires Rust.”

dmarwicke

2 minutes ago

[-]

wait, zero-copy deserialization isn't rust-specific. you can mmap structs in C. done it before, works fine

blintz

7 hours ago

[-]

> PEP 658 went live on PyPI in May 2023. uv launched in February 2024. The timing isn’t coincidental. uv could be fast because the ecosystem finally had the infrastructure to support it. A tool like uv couldn’t have shipped in 2020. The standards weren’t there yet.

How/why did the package maintainers start using all these improvements? Some of them sound like a bunch of work, and getting a package ecosystem to move is hard. Was there motivation to speed up installs across the ecosystem? If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?

6 hours ago

[-]

> If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?

It wasn't working okay for many people, and many others haven't started using pyproject.toml.

For what I consider the most egregious example: Requests is one of the most popular libraries, under the PSF's official umbrella, which uses only Python code and thus doesn't even need to be "built" in a meaningful sense. It has a pyproject.toml file as of the last release. But that file isn't specifying the build setup following PEP 517/518/621 standards. That's supposed to appear in the next minor release, but they've only done patch releases this year and the relevant code is not at the head of the repo, even though it already caused problems for them this year. It's been more than a year and a half since the last minor release.

7 hours ago

[-]

Because static declaration was clearly safer and more performant? My question is why pip isn't fully taking advantage

eesmith

7 hours ago

[-]

Because pip contains decades of built-up code and lacks the people willing to work on updating it.

w10-1

6 hours ago

[-]

I like the implication that we can have an alternative to uv speed-wise, but I think reliability and understandability are more important in this context (so this comment is a bit off-topic).

What I want from a package manager is that it just works.

That's what I mostly like about uv.

Many of the changes that made speed possible were to reduce the complexity and thus the likelihood of things not working.

What I don't like about uv (or pip or many other package managers), is that the programmer isn't given a clear mental model of what's happening and thus how to fix the inevitable problems. Better (pubhub) error messages are good, but it's rare that they can provide specific fixes. So even if you get 99% speed, you end up with 1% perplexity and diagnostic black boxes.

To me the time that matters most is time to fix problems that arise.

5 hours ago

[-]

> the programmer isn't given a clear mental model of what's happening and thus how to fix the inevitable problems.

This is a priority for PAPER; it's built on a lower-level API so that programmers can work within a clear mental model, and I will be trying my best to communicate well in error messages.

andy99

7 hours ago

[-]

I remain baffled about these posts getting excited about uv’s speed. I’d like to see a real poll but I personally can’t imagine people listing speed as one of the their top ten concerns about python package managers. What are the common use cases where the delay due to package installation is at all material?

Edit to add: I use python daily

techbruv

6 hours ago

[-]

At a previous job, I recall updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem. Was not an enjoyable experience.

uv has been a delight to use

6 hours ago

[-]

> updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem

I'd characterize that as unusable, for sure.

thraxil

7 hours ago

[-]

Working heavily in Python for the last 20 years, it absolutely was a big deal. `pip install` has been a significant percentage of the deploy time on pretty much every app I've ever deployed and I've spent countless hours setting up various caching techniques trying to speed it up.

stavros

7 hours ago

[-]

I can run `uvx sometool` without fear because I know that it'll take a few seconds to create a venv, download all the dependencies, and run the tool. uv's speed has literally changed how I work with Python.

quectophoton

5 hours ago

[-]

I wouldn't say without fear, since you're one typo away from executing a typo-squatted malicious package.

I do use it on CI/CD pipelines, but I wouldn't dare type uvx commands myself on a daily basis.

stavros

5 hours ago

[-]

uvx isn't more risky than `pip install`, which is what I used before.

gordonhart

7 hours ago

[-]

`poetry install` on my dayjob’s monolith took about 2 minutes, `uv sync` takes a few seconds. Getting 2 minutes back on every CI job adds up to a lot of time saved

rsyring

7 hours ago

[-]

As a multi decade Python user, uv's speed is "life changing". It's a huge devx improvement. We lived with what came before, but now that I have it, I would never want to go back and it's really annoying to work on projects now that aren't using it.

recov

7 hours ago

[-]

Docker builds are a big one, at least at my company. Any tool that reduces wait time is worth using, and uv is an amazing tool that removes that wait time. I take it you might not use python much as it solves almost every pain point, and is fast which feels rare.

5 hours ago

[-]

It's not just about delays being "material"; waiting on the order of seconds for a venv creation (and knowing that this is because of pip bootstrapping itself, when it should just be able to install cross-environment instead of having to wait until 2022 for an ugly, limited hack to support that) is annoying.

But small efficiencies do matter; see e.g. https://danluu.com/productivity-velocity/.

morshu9001

2 hours ago

[-]

One weird case where this mattered to me, I wanted pip to backtrack to find compatible versions of a set of deps, and it wasn't done after waiting a whole hour. uv did the same thing in 5 minutes. This might be kinda common because of how many Python repos out there don't have pinned versions in dependencies.txt.

VorpalWay

6 hours ago

[-]

CI: I changed a pipeline at work from pip and pipx to uv, it saved 3 minutes on a 7 minute pipeline. Given how oversubscribed our runners are, anything saving time is a big help.

It is also really nice when working interactivly to have snappy tools that don't take you out of the flow more than absolutely more than necessary. But then I'm quite sensitive to this, I'm one of those people who turn off all GUI animations because they waste my time and make the system feel slow.

optionalsquid

3 hours ago

[-]

Speed is one of the main reasons why I keep recommending uv to people I work with, and why I initially adopted it: Setting up a venv and installing requirements became so much faster. Replacing pipx and `uv run` for single-file scripts with external dependencies, were additional reasons. With nox adding uv support, it also became much easier and much faster to test across multiple versions of Python

SatvikBeri

6 hours ago

[-]

Setting up a new dev instance took 2+ hours with pip at my work. Switching to uv dropped the Python portion down to <1 minute, and the overall setup to 20 minutes.

A similar, but less drastic speedup applied to docker images.

pseudosavant

6 hours ago

[-]

I avoided Python for years, especially because of package and environment management. Python is now my go to for projects since discovering uv, PEP 723 metadata, and LLMs’ ability to write Python.

patrick91

6 hours ago

[-]

for me it's being able to do `uv run whatever` and always know I have the correct dependencies

(also switching python version is so fast)

pants2

7 hours ago

[-]

The biggest benefit is in CI environments and Docker images and the like where all packages can get reinstalled on every run.

toenail

7 hours ago

[-]

The speed is nice, but I switched because uv supports "pip compile" from pip-tools, and it is better at resolving dependencies. Also pip-tools uses (used?) internal pip methods and breaks frequently because of that, uv doesn't.

blibble

3 hours ago

[-]

conda can take an hour to tell you your desired packages are unsatisifiable

saying that, other than the solver, most of what uv does is always going to be IO bound

ExoticPearTree

5 hours ago

[-]

Build jobs where you have a lot of dependencies. Those GHA minutes go brrrr.

IshKebab

6 hours ago

[-]

Do you still remain baffled after the many replies that people actually do like their tooling to be not dog slow like pip is?

adammarples

6 hours ago

[-]

It's annoying. Do you use poetry? Pipenv? It's annoying.

didibus

6 hours ago

[-]

There's an interesting psychology at play here as well, if you are a programmer that chooses a "fast language" it's indicative of your priorities already, it's often not much the language, but that the programmer has decided to optimize for performance from the get go.

bastawhiz

6 hours ago

[-]

> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive.

This is kind of fascinating. I've never considered runtime upper bound requirements. I can think of compelling reasons for lower bounds (dropping version support) or exact runtime version requirements (each version works for exact, specific CPython versions). But now that I think about it, it seems like upper bounds solve a hypothetical problem that you'd never run into in practice.

If PSF announced v4 and declared a set of specific changes, I think this would be reasonable. In the 2/3 era it was definitely reasonable (even necessary). Today though, it doesn't actually save you any trouble.

wging

6 hours ago

[-]

I think the article is being careful not to say uv ignores _all_ upper bound checks, but specifically 4.0 upper bound checks. If a package says it requires python < 3.0, that's still super relevant, and I'd hope for uv to still notice and prevent you from trying to import code that won't work on python 3. Not sure what it actually does.

breischl

6 hours ago

[-]

I read the article as saying it ignores all upper-bounds, and 4.0 is just an example. I could be wrong though - it seems ambiguous to me.

But if we accept that it currently ignores any upper-bounds checks greater than v3, that's interesting. Does that imply that once Python 4 is available, uv will slow down due to needing to actually run those checks?

VorpalWay

5 hours ago

[-]

Are there any plans to actually make a 4.0 ever? I remember hearing a few years ago that after the transition to 3.0, the core devs kind of didn't want to repeat that mess ever again.

That said, even if it does happen, I highly doubt that is the main part of the speed up compared to pip.

unethical_ban

1 hour ago

[-]

The problem: The specification is binary. Are you compatible or not?

That is unanswerable now, whether a python package will be compatible with a version that is not released.

Having an ENUM like [compatible, incompatible, untested] at the least would fix this.

8 hours ago

[-]

> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install. You can opt in if you want it.

Are we losing out on performance of the actual installed thing, then? (I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?)

8 hours ago

[-]

No, because Python itself will generate bytecode for packages once you actually import them. uv just defers that to first-import time, but the cost is amortized in any setting where imports are performed over multiple executions.

7 hours ago

[-]

That sounds like yes? Instead of doing it once at install time, it's done once at first use. It's only once so it's not persistently slower, but that is a perf hit.

My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off if any significant percentage of the files being compiled might not be used ever so the overall cost is lower if you defer to run time.

VorpalWay

5 hours ago

[-]

I think they are making the bet that most modules won't be imported. For example if I install scipy, numpy, Pillow or such: what are the chances that I use a subset of the modules vs literally all of them?

I would bet on a subset for pretty much any non-trivial package (i.e. larger than one or two user facing modules). And for those trivial packages? Well they are usually small, so the cost is small as well. I'm sure there are exceptions: maybe a single gargantuan module thst consists of autogenerated FFI bindings for some C library or such, but that is likely the minority.

7 hours ago

[-]

> It's only once so it's not persistently slower, but that is a perf hit.

Sure, but you pay that hit either way. Real-world performance is always usage based: the assumption that uv makes is that people run (i.e. import) packages more often than they install them, so amortizing at the point of the import machinery is better for the mean user.

(This assumption is not universal, naturally!)

dddgghhbbfblk

7 hours ago

[-]

Ummm, your comment is backwards, right?

7 hours ago

[-]

Which part? The assumption is that when you `$TOOL install $PACKAGE`, you run (i.e. import) `$PACKAGE` more than you re-install it. So there's no point in slowing down (relatively less common) installation events when you can pay the cost once on import.

(The key part being that 'less common' doesn't mean a non-trivial amount of time.)

dddgghhbbfblk

5 hours ago

[-]

Why would you want to slow down the more common thing instead of the less common thing? I'm not following that at all. That's why I asked if that's backwards.

13 minutes ago

[-]

Because you only slow down the more common thing once, and the less common thing is slower in absolute terms.

beacon294

7 hours ago

[-]

Probably for any case where an actual human is doing it. On an image you obviously want to do it at bake time, so I feel default off with a flag would have been a better design decision for pip.

I just read the thread and use Python, I can't comment on the % speedup attributed to uv that comes from this optimization.

Epa095

7 hours ago

[-]

Images are a good example where doing it at install-time is probably the best yeah, since every run of the image starts 'fresh', losing the compilation which happened last time the image got started.

If it was a optional toggle it would probably become best practice to activate compilation in dockerfiles.

tedivm

6 hours ago

[-]

You can change it to compile the bytecode on install with a simple environment variable (which you should do when building docker containers if you want to sacrifice some disk space to decrease initial startup time for your app).

saidnooneever

7 hours ago

[-]

you are right. it depends on how often this first start is, if its bad or not..most usecases id guess (total guess, have limited exp with python projects professionally) its not an issue.

zmgsabst

1 hour ago

[-]

That’s actually a negative:

My Docker build generating the byte code saves it to the image, sharing the cost at build time across all image deployments — whereas, building at first execution means that each deployed image instance has to generate its own bytecode!

That’s a massive amplification, on the order of 10-100x.

“Well just tell it to generate bytecode!”

Sure — but when is the default supposed to be better?

Because this sounds like a massive footgun for a system where requests >> deploys >> builds. That is, every service I’ve written in Python for the last decade.

hauntsaninja

6 hours ago

[-]

Yes, uv skipping this step is a one time significant hit to start up time. E.g. if you're building a Dockerfile I'd recommend setting `--compile-bytecode` / `UV_COMPILE_BYTECODE`

salviati

7 hours ago

[-]

Historically the practice of producing pyc files on install started with system wide installed packages, I believe, when the user running the program might lack privileges to write them. If the installer can write the .oy files it can also write the .pyc, while the user running them might not in that location.

thundergolfer

6 hours ago

[-]

This optimization hits serverless Python the worst. At Modal we ensure users of uv are setting UV_COMPILE_BYTECODE to avoid the cold start penalty. For large projects .pyc compilation can take hundreds of milliseconds.

5 hours ago

[-]

> I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?

They do.

> Are we losing out on performance of the actual installed thing, then?

When you consciously precompile Python source files, you can parallelize that process. When you `import` from a `.py` file, you only get that benefit if you somehow coincidentally were already set up for `multiprocessing` and happened to have your workers trying to `import` different files at the same time.

plorkyeran

7 hours ago

[-]

If you have a dependency graph large enough for this to be relevant, it almost certainly includes a large number of files which are never actually imported. At worst the hit to startup time will be equal to the install time saved, and in most cases it'll be a lot smaller.

est

2 hours ago

[-]

> Virtual environments required

This bothers me more than once when building a base docker image. Why would I want a venv inside a docker with root?

pornel

1 hour ago

[-]

The old package managers messing up the global state by default is the reason why Docker exists. It's the venv for C.

forrestthewoods

2 hours ago

[-]

Because a single docker image can run multiple programs that have mutually exclusive dependencies?

Personally I never want program to ever touch global shared libraries ever. Yuck.

https://docs.docker.com/engine/containers/multi-service_cont...

est

1 hour ago

[-]

> a single docker image can run multiple programs

You absolutely can. But it's not best practice.

forrestthewoods

27 minutes ago

[-]

God I hate docker so much. Running computers does not have to be so bloody complicated.

PrettyPastry

41 minutes ago

[-]

I wish this were enough to get the flake8 devs to accept pyproject support PRs.

eviks

6 hours ago

[-]

> Every code path you don’t have is a code path you don’t wait for.

No, every code path you don't execute is that. Like

> No .egg support.

How does that explain anything if the egg format is obsolete and not used?

Similar with spec strictness fallback logic - it's only slow if the packages you're installing are malformed, otherwise the logic will not run and not slow you down.

And in general, instead of a list of irrelevant and potentially relevant things would be great to understand some actual time savings per item (at least those that deliver the most speedup)!

But otherwise great and seemingly comprehensive list!

5 hours ago

[-]

> No, every code path you don't execute is that.

Even in compiled languages, binaries have to get loaded into memory. For Python it's much worse. On my machine:

  $ time python -c 'pass'

  real 0m0.019s
  user 0m0.013s
  sys 0m0.006s

  $ time pip --version > /dev/null

  real 0m0.202s
  user 0m0.182s
  sys 0m0.021s

Almost all of that extra time is either the module import process or garbage collection at the end. Even with cached bytecode, the former requires finding and reading from literally hundreds of files, deserializing via `marshal.loads` and then running top-level code, which includes creating objects to represent the functions and classes.

It used to be even worse than this; in recent versions, imports related to Requests are deferred to the first time that an HTTPS request is needed.

dangoodmanUT

3 hours ago

[-]

> Zero-copy deserialization

Just a nit on this section: zero-copy deserialization is not Rust specific (see flatbuffers). rkyv as a crate for doing so in Rust is though

simonw

5 hours ago

[-]

This post is excellent. I really like reading deep dives like this that take a complex system like uv and highlight the unique design decisions that make it work so well.

I also appreciate how much credit this gives the many previous years of Python standards processes that enabled it.

Update: I blogged more about it here, including Python recreations of the HTTP range header trick it uses and the version comparison via u64 integers: https://simonwillison.net/2025/Dec/26/how-uv-got-so-fast/

shevy-java

1 hour ago

[-]

Soon uv will deliver results without you even thinking about them beforehand!

ggm

6 hours ago

[-]

Some of these speed ups looked viable to backport into pip including parallel download, delayed .pyc, ignore egg, version checks.

Not that I'd bother since uv does venv so well. But, "it's not all rust runtime speed" implies pip could be faster too.

https://plotly.com/blog/uv-python-package-manager-quirks/

robertclaus

6 hours ago

[-]

At Plotly we did a decent amount of benchmarking to see how much the different defaults `uv` uses lead to its performance. This was necessary so we could advise our enterprise customers on the transition. We found you lost almost all of the speed gains if you configured uv behave as much like pip as you could. A trivial example is the precompile flag, which can easily be 50% of pips install time for a typical data science venv.

5 hours ago

[-]

The precompilation thing was brought up to the uv team several months ago IIRC. It doesn't make as much of a difference for uv as for pip, because when uv is told to pre-compile it can parallelize that process. This is easily done in Python (the standard library even provides rudimentary support, which Python's own Makefile uses); it just isn't in pip yet (I understand it will be soon).

VerifiedReports

7 hours ago

[-]

So... will uv make Python a viable cross-platform utility solution?

I was going to learn Python for just that (file-conversion utilities and the like), but everybody was so down on the messy ecosystem that I never bothered.

pseudosavant

6 hours ago

[-]

I write all of my scripts in Python with PEP 723 metadata and run them with `uv run`. Works great on Windows and Linux for me.

6 hours ago

[-]

It has been viable for a long time, and the kinds of projects you describe are likely well served by the standard library.

IshKebab

6 hours ago

[-]

Yes, uv basically solves the terrible Python tooling situation.

In my view that was by far the biggest issue with Python - a complete deal-breaker really. But uv solves it pretty well.

The remaining big issues are a) performance, and b) the import system. uv doesn't do anything about those.

Performance may not be an issue in some cases, and the import system is ... tolerable if you're writing "a python project". If you're writing some other project and considering using Python for its scripting system, e.g. to wrangle multiple build systems or whatever than the import mess is a bigger issue and I would thing long and hard before picking it over Deno.

VerifiedReports

12 minutes ago

[-]

Thanks! I don't really think about importing stuff (which maybe I should), because I assume I'll have to write any specialized logic myself. So... your outlook is encouraging.

6 hours ago

[-]

I've talked about this many times on HN this year but got beaten to the punch on blogging it seems. Curses.

... Okay, after a brief look, there's still lots of room for me to comment. In particular:

> pip’s slowness isn’t a failure of implementation. For years, Python packaging required executing code to find out what a package needed.

This is largely refuted by the fact that pip is still slow, even when installing from wheels (and getting PEP 600 metadata for them). Pip is actually still slow even when doing nothing. (And when you create a venv and allow pip to be bootstrapped in it, that bootstrap process takes in the high 90s percent of the total time used.)

hk1337

5 hours ago

[-]

It’s fast because it sucks the life force from bad developers to make them into something good.

Jokes aside…

I really like uv but also really like mise and I cannot seem to get them to work well together.

Onavo

3 hours ago

[-]

Why? They are pretty compatible. Just set the venv in the project's mise.toml are you are good to go. Mise will activate it automatically when you change into the project directory.

ec109685

7 hours ago

[-]

The article info is great, but why do people put up with LLM ticks and slop in their writing? These sentences add no value and treats the reader as stupid.

> This is concurrency, not language magic.

> This is filesystem ops, not language-dependent.

Duh, you literally told me that the previous sentence and 50 million other times.

aurumque

7 hours ago

[-]

This kind of writing goes deeper than LLM's, and reflects a decline in both reading ability, patience, and attention. Without passing judgement, there are just more people now who benefit from repetition and summarization embedded directly in the article. The reader isn't 'stupid', just burdened.

twoodfin

6 hours ago

[-]

Indeed, I am coming around in the past few weeks to realization and acceptance that the LLM editorial voice is a benefit to an order of magnitude more hn readers than those (like us) for whom it is ice pick in the nostril stuff.

Oh well, all I can do is flag.

BiteCode_dev

5 hours ago

[-]

Other design decisions that made uv fast:

- uncompressing packages while they are still being downloaded, in memory, so that you only have to write to disk once

- design of its own locking format for speed

But yes, rust is actually making it faster because:

- real threads, no need for multi-processing

- no python VM startup overhead

- the dep resolution algo is exactly the type of workload that is faster in a compiled language

Source, this interview with Charlie Marsh: https://www.bitecode.dev/p/charlie-marsh-on-astral-uv-and-th...

The guy has a lot of interesting things to say.

zzzeek

4 hours ago

[-]

> real threads, no need for multi-processing

parallel downloads don't need multi-processing since this is an IO bound usecase. asyncio or GIL-threads (which unblock on IO) would be perfectly fine. native threads will eventually be the default also.

nurettin

7 hours ago

[-]

> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower.

I will bring popcorn on python 4 release date.

7 hours ago

[-]

If it's really not doing any upper bound checks, I could see it blowing up under more mundane conditions; Python includes breaking changes on .x releases, so I've had eg. packages require (say) Python 3.10 when 3.11/12 was current.

dev_l1x_be

7 hours ago

[-]

I always bring popcorn on major version changes for any programming language. I hope Rust's never 2.0 stance holds.

6 hours ago

[-]

It would be popcorn-worthy regardless, given the rhetoric surrounding the idea in the community.

pwdisswordfishy

7 hours ago

[-]

> Some of uv’s speed comes from Rust. But not as much as you’d think. Several key optimizations could be implemented in pip today: […] Python-free resolution

Umm…

looneysquash

7 hours ago

[-]

I don't have any real disagreement with any of the details the author said.

But still, I'm skeptical.

If it is doable, the best way to prove it is to actually do it.

If no one implements it, was it ever really doable?

Even if there is no technical reason, perhaps there is a social one?

5 hours ago

[-]

I guess you mean doing the things in Python that are supposedly doable from Python.

Yeah, to a zeroth approximation that's my current main project (https://github.com/zahlman/paper). Of course, I'm just some rando with apparently serious issues convincing myself to put in regular unpaid work on it, but I can see in broad strokes how everything is going to work. (I'm not sure I would have thought about, for example, hard-linking files when installing them from cache, without uv existing.)

stevemk14ebr

7 hours ago

[-]

What are you talking about, this all exists

agumonkey

7 hours ago

[-]

very nice article, always good to get a review of what a "simple" looking tool does behind the scense

about rust though

some say a nicer language helps finding the right architecture (heard that about cpp veteran dropping it for ocaml, any attempted idea would take weeks in cpp, was a few days in ocaml, they could explore more)

also the parallelism might be a benefit the language orientation

enough semi fanboyism

didip

5 hours ago

[-]

If UV team has a spare time, they should rewrite Python in Rust without any of the legacy baggage.

hallvard

7 hours ago

[-]

Great post, but the blatant chatgpt-esque feel hits hard… Don’t get me wrong, I love astral! and the content, but…

hallvard

7 hours ago

[-]

Reading the other replies here makes it really obvious that this is some LLM’s writing. Maybe even all of it…

zzzeek

5 hours ago

[-]

> pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn’t, largely because backwards compatibility with fifteen years of edge cases takes precedence. But it means pip will always be slower than a tool that starts fresh with modern assumptions.

what does backwards compatibility have to do with parallel downloads? or global caching? The metadata-only resolution is the only backwards compatible issue in there and pip can run without a setup.py file being present if pyproject.toml is there.

Short answer is most, or at least a whole lot, of the improvements in uv could be integrated into pip as well (especially parallelizing downloads). But they're not, because there is uv instead, which is also maintained by a for-profit startup. so pip is the loser

IshKebab

6 hours ago

[-]

Mmm I don't buy it. Not many projects use setup.py now anyway and pip is still super slow.

> Plenty of tools are written in Rust without being notably fast.

This also hasn't been my experience. Most tools written in Rust are notably fast.

5 hours ago

[-]

> Not many projects use setup.py now anyway and pip is still super slow.

Yes, but that's still largely not because of being written in Python. The architecture is really just that bad. Any run of pip that touches the network will end up importing more than 500 modules and a lot of that code will simply not be used.

For example, one of the major dependencies is Rich, which includes things like a 3600-entry mapping of string names to emoji; Rich in turn depends on Pygments which normally includes a bunch of rules for syntax highlighting in dozens of programming languages (but this year they've finished trimming those parts of the vendored Pygments).

Another thing is that pip's cache is an HTTP cache. It literally doesn't know how to access its own package download cache without hitting the network, and it does that access through wrappers that rely on cachecontrol and Requests.

scottlamb

5 hours ago

[-]

Mine either. Choosing Rust by no means guarantees your tool will be fast—you can of course still screw it up with poor algorithms. But I think most people who choose Rust do so in part because they aspire for their tool to be "blazing fast". Memory safety is a big factor of course, but if you didn't care about performance, you might have gotten that via a GCed (and likely also interpreted or JITed or at least non-LLVM-backend) language.

skywhopper

7 hours ago

[-]

This is great to read because it validates my impression that Python packaging has always been a tremendous overengineered mess. Glad to see someone finally realized you just need a simple standard metadata file per package.

5 hours ago

[-]

It has been realized in the Python community for a very long time. But there have been years of debate over the contents and formatting, and years of trying to figure out how to convince authors and maintainers to do the necessary work on their end, and years of trying to make sure the ecosystem doesn't explode from trying to remove legacy support.

There are still separate forms of metadata for source packages and pre-compiled distributions. This is necessary because of all the weird idiosyncratic conditional logic that might be necessary in the metadata for platform-specific dependencies. Some projects are reduced to figuring out the final metadata at build time, while building on the user's machine, because that's the only way to find out enough about the user's machine to make everything work.

It really isn't as straightforward as you'd expect, largely because Python code commonly interfaces to compiled code in several different languages, and end users expect this to "just work", including on Windows where they don't have a compiler and might not know what that is.

See https://pypackaging-native.github.io/ for the general flavour of it.

aswegs8

7 hours ago

[-]

uv seems to be a pet peeve of HN. I always thought pipenv was good but yeah, seems like I was being ignorant

aw1621107

7 hours ago

[-]

> uv seems to be a pet peeve of HN.

Unless I've been seeing very different submissions than you, "pet peeve" seems like the exact opposite of what is actually the case?

VerifiedReports

7 hours ago

[-]

Indeed; I don't think he knows what "peeve" means...

glaucon

7 hours ago

[-]

I too use pipenv unless there's a reason not to. I hope people use whatever works best for them.

I feel that sometimes there's a desire on the part of those who use tool X that everyone should use tool X. For some types of technology (car seat belts, antibiotics...) that might be reasonable but otherwise it seems more like a desire for validation of the advocate's own choice.

EdwardDiego

4 hours ago

[-]

My biggest complaint with pipenv is/was(?) that it's lockfile format only kept the platform identifiers of the platform you locked it on - so if you created it on Mac, then tried to install from the lockfile on a Linux box, you're building from source because it's only locked in wheels for MacOS.

Poetry and uv avoid this issue.

pkaodev

5 hours ago

[-]

AI slop

rvz

5 hours ago

[-]

TLDR: Because Rust.

This entire AI generated article with lots of text just to just say the obvious.

5 hours ago

[-]

That conclusion is largely false, and is not what the article says.