Sensible type-annotated Python code could be so much faster if it didn't have to assume everything could change at any time. Most things don't change, and if they do they change on startup (e.g. ORM bindings).
Definitely, but then it wouldn't be Python. One of the core principles of Python's design is to be extremely dynamic, and that anything can change at any time.
There are many other, pretty good, strictly dynamically typed languages which work just as well if not better than Python, for many purposes.
And when Python is a mainstream language on top of which large, globally known websites, AI tools, core system utilities, etc are built, we should give up the purity angle and be practical.
Even the new performance push in Python land is a reflection of this. A long time ago some optimizations were refused in order to not complicate the default Python implementation.
class SomeClass:
    def __init__(self):
        self.x = 0
    def SomeMethod(self):
        q = self.x
        # do stuff with q, because otherwise you're dereferencing self.x all the damn time
There might be common scenarios where this has a real, significant performance impact, e.g. use cases where it's such a bottleneck in the interpreter that it measurably affects warm-up time. Also, string manipulation seems like the kind of thing you see in small scripts that end before a JIT even kicks in but that are also called very often (although I don't know how many people would reach for Java in that case).
EDIT: also, if you're a commercial entity trying to get people to use your programming language, it's probably a good idea to make the language perform less bad with the most common terrible code. And accidentally quadratic or worse string manipulation involving excessive calls to trim() seems like a very likely scenario in that context.
If these analyses don't apply and the callee could do anything, then of course the compiler can't keep the value hoisted. But a function call has to occur anyway, so the hoisted value will be pushed/popped from the stack and you might as well reload it from the object's field anyway, rather than waste a stack slot.
i don't understand what you think is nuts about this. it's an interpreted language and the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want). so there's no way for the interpreter/compiler/runtime to know you're accessing a field of the class itself (let alone that that field isn't a computed property or something like that).
lots of hot takes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and of programming languages in general <shrugs>.
There is no such thing as 'successive references to the same member value' here. It's not that you look up the same object and it can change, it's that you are not referring to the same object at all.
self.x is actually type(self).__getattribute__(self, 'x') (with __getattr__ as the fallback when normal lookup fails), which can in fact return a different thing each time. `self.x` IS a string lookup, and that is not an implementation detail but a major design goal. This dynamism is one of the selling points of Python: it allows you to change and modify interfaces to reflect state. It's nice for some things and it is what makes Python Python. If you don't want that, use another language.
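A minimal sketch of the dynamism described above. The class names `Ticker` and `Lazy` are hypothetical, chosen only for illustration. Reading the same attribute twice runs code each time, so the interpreter cannot simply assume the two reads agree:

```python
import itertools


class Ticker:
    """Hypothetical class: `x` is a property, so every read executes code."""

    def __init__(self):
        self._ticks = itertools.count()

    @property
    def x(self):
        # A property is a descriptor; each `obj.x` calls this function.
        return next(self._ticks)


class Lazy:
    """Hypothetical class: __getattr__ is called only when normal lookup fails."""

    def __getattr__(self, name):
        return "computed:" + name


ticker = Ticker()
first_read = ticker.x
second_read = ticker.x          # same expression, different result
lazy_value = Lazy().anything    # attribute that was never assigned
```

Caching the value in a local (`q = self.x`) is therefore a change in meaning, not just a speed-up, whenever the attribute is backed by something like this.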
And getting rid of descriptors would be a _fundamental change to the language_. An immense one. Loads of features are built off of descriptors or descriptor-like things.
And what you're complaining about is also not true in Javascript world either... I believe you can build descriptor-like things in JS now as well.
_But_ if you want that you can use stuff like mypyc + annotations to get that for you. There are tools that let you get to where you want. Just not out of the box because Python isn't that language.
Remember, this is a scripting language, not a compiled language. Every optimization for things you talk about would be paid on program load (you have pyc stuff but still..)
Gotta show up with proof that what you're saying is verifiable and works well. Up until ~6 or 7 years ago CPython had a concept of being easy to onboard onto. Dataflow analyses make the codebase harder to deal with.
Having said all of that.... would be nice to just inline RPython-y code and have it all work nicely. I don't need it on everything and proving safety is probably non-trivial but I feel like we've got to be closer to doing this than in the past.
I ... think in theory the JIT can solve for that too. In theory
The language supports multiple threads and doesn’t have private fields (https://docs.python.org/3/tutorial/classes.html#private-vari...), so the runtime cannot rule out that the value gets changed in-between.
And yes, it often is obvious to humans that’s not intended to happen, and almost never what happens, but proving that is often hard or even impossible.
For example, numbers and strings are immutable objects in Python. If self.x is a number and its numeric value is changed by a method call, self.x will be a different object after that. I'd dare say people expect this to work.
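A small sketch of that point, assuming a hypothetical `Box` class: because integers are immutable, `self.x += 1` rebinds the attribute to a new object, and a value cached before the call no longer matches.

```python
class Box:
    """Hypothetical holder for a single integer attribute."""

    def __init__(self):
        self.x = 10**6

    def bump(self):
        # ints are immutable: this creates a new int and rebinds self.x
        self.x += 1


box = Box()
cached = box.x      # the "hoisted" value
box.bump()
current = box.x     # a different object with a different value
```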
now, functional languages don't have this problem at all.
This is a Python-specific problem: everything is boxed by default, and the interpreter does not even know what's in the box until it dereferences it, which extends to the "self" object. In contrast, in C++ the compiler knows everything there is to know about the type of `this`, which avoids the issue.
#include <cstdio>
struct S { const int x; int f() const; };
int S::f() const {
    int a = x;
    printf("hello\n");
    int b = x;
    return a-b;
}
The compiler can't reuse 'x' unless it's able to prove that it definitely couldn't have changed during the `printf()` call, and it's unable to prove that. The member is loaded twice. C++ compilers can usually only prove it for trivial code with completely inlined functions that don't mutate any external state, or that mutate it in a definitely-not-aliasing way (strict aliasing). (And the `const`s make no difference here at all.)
In Python the difference is that it can basically never prove it at all.
That's not the whole story of what is going on. Every attribute access is a call to __getattribute__ (falling back to __getattr__), which can return whatever object it wants.
bar.foo(...) is actually type(bar).__getattribute__(bar, 'foo')(...)
This dynamism is what makes Python Python and it allows you to wrap domain state in interface structure.
I would say it is part Python being highly dynamic and part C++ being full of undefined behavior.
A C++ compiler will only optimize member access if it can prove that the member isn't overwritten in the same thread. Compatible pointers, opaque method calls... the list of reasons why that optimization can fail is nearly endless. C even added the restrict keyword because just having write access to two pointers of compatible types can force the compiler to reload values constantly. In Python, anything is a function call to some unknown code, and any function could get access to any variable on the stack (manipulating Python stack frames is fun).
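To make the remark about stack frames concrete, here is a hedged sketch that only reads a caller's local variable. It relies on `inspect.currentframe()`, which is a CPython-oriented facility and may return `None` on other implementations; the function names are hypothetical.

```python
import inspect


def peek_caller_local(name):
    """Return the caller's local variable `name`, or None if it is absent."""
    frame = inspect.currentframe()
    try:
        caller = frame.f_back            # the frame that called us
        return caller.f_locals.get(name)
    finally:
        del frame                        # avoid keeping a reference cycle alive


def compute():
    secret = 42                          # an ordinary local variable
    return peek_caller_local("secret")   # the callee can still see it


seen = compute()
```

If a callee can observe locals like this, the runtime has to be careful about which values it treats as private to a function.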
Then there is the fun thing the C++ compiler gets up to with variables that are modified by different threads: while(!done) turning into while(true) because you didn't tell the compiler that done needs to be thread-safe is always fun.
Did you miss the part where I explained to you there's no way to identify that it's a member variable?
> Nobody in the real world expects this behaviour
As has already been explained to you by a sibling comment you are in fact wrong and there are in fact plenty of people in the real world who do actually expect this behavior.
So I'll repeat myself: lots of hot takes from just pure, unadulterated, possibly willful ignorance.
"Did you miss the part where I explained to you there's no way to identify that it's a member variable?"
No, but you did miss the case where that in itself can be considered nuts, or at least an unfortunate early decision.
"this just how things are dunn around diz here parts" is not an argument.
This is not a side implementation detail that they got wrong; this is a fundamental design goal of Python. You can find that nuts, but then just don't use Python, because that is one of the things that make Python Python.
Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.
Did I stutter when I wrote about "an unfortunate early decision"? Who said it has to be "an arbitrary name"?
Even so, you could add a bloody marker announcing an arbitrary name (which 99% would be self anyway) as so, as an instruction to the interpreter. If it fails, it fails, like countless other things that can fail during runtime in Python today.
The name `self` is a convention, yes, but interestingly in python methods the first parameter is special beyond the standard "bound method" stuff. See for example PEP 367 (New Super) for how `super()` resolution works (TL;DR the super function is a special builtin that generates extra code referencing the first parameter and the lexically defining class)
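A short sketch of that behaviour, with hypothetical class names. Zero-argument `super()` works even when the first parameter is not called `self`, and on CPython the compiler adds a hidden `__class__` cell to any method that mentions `super` (the `co_freevars` check below is a CPython implementation detail):

```python
class Base:
    def greet(this):
        return "base"


class Child(Base):
    def greet(this):                 # first parameter deliberately not named `self`
        # Zero-argument super() uses the first positional argument and the
        # compiler-provided __class__ cell, not the parameter's name.
        return "child+" + super().greet()


greeting = Child().greet()
free_variables = Child.greet.__code__.co_freevars
```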
Still churning on it, will probably publish it and do a proper blog post once I've built something interesting with the language itself.
https://github.com/abilian/p2w
NB: some preliminary results:
p2w is 4.03x SLOWER than gcc (geometric mean)
p2w is 5.50x FASTER than cpython (geometric mean)
p2w is 1.24x FASTER than pypy (geometric mean)
It is called type hints, and is already there. TS typing doesn't bring any perf benefits over plain JS.
class Foo:
    __slots__ = ("a", "b")
    a: int
    b: float
There are multiple issues with Python that prevent optimizations:
* a user can define a subtype `class my_int(int)`, so you cannot optimize the layout of `class Foo`
* the builtin `int` and `float` are big-int like numbers, so operations on them are branchy and allocating.
And the fact that Foo is mutable and that `id(foo.a)` has to produce something complicates things further.
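A hedged sketch of why the annotations on `Foo` give the runtime nothing it can rely on. `MyInt` is a hypothetical subclass added for illustration; annotations are not enforced, and a subclass of `int` can change what `+` does.

```python
class Foo:
    __slots__ = ("a", "b")
    a: int
    b: float


class MyInt(int):
    """Hypothetical int subclass that overrides addition."""

    def __add__(self, other):
        return "surprise"


foo = Foo()
foo.a = MyInt(1)          # passes any isinstance(..., int) check
foo.b = "not a float"     # annotations are not checked at runtime
result = foo.a + 1        # dispatches to MyInt.__add__
big = 2**100              # builtin ints are arbitrary precision
```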
Therefore Python has no use for TS-like superset, because it already has facilities for static analysis with no bearing on runtime, which is what TS provides.
1) Add TS like language on top of Python in backwards compatible way
2) Introduce frozen/final runtime types
3) Use 1 and 2 to drive runtime optimizations
From all posts it looks like what OP wants is a different language that looks somewhat like Python syntax-wise, so calling for "backwards-compatible" superset is pointless, because stuff that is being demanded would break compatibility by necessity.
And what prevents someone from designing such a language?
Funnily enough I’ve found Python to be excellent for modelling my problem domain with Pydantic (so far basically unparalleled, open for suggestions in Go/Rust), while the language also gets out of my way when I get creative with list expressions and the like. So overall, still it is extremely productive for the work I’m doing, I just need to spin up more containers in prod.
You could make this clean break and call it Python 4 but frankly I fear it won't be Python anymore.
Great idea, but I'm not convinced that they learned anything from the Python 2 to 3 transition, so I wouldn't hold my breath.
If you want a language system without contempt for backward compatibility, you're probably better off with Java/C++/JavaScript/etc. (though using JS libraries is like building on quicksand.) Bit of a shame since I want to like Python/Rust/Swift/other modern-ish languages, but it turns out that formal language specifications were actually a pretty good idea. API stability is another.
TL;DR: SPy is a variant of Python specifically designed to be statically compilable while retaining a lot of the "useful" dynamic parts of Python.
The effort is led by Antonio Cuni, Principal Software Engineer at Anaconda. Still very early days but it seems promising to me.
It has nothing to do with whether the list is empty. It has nothing to do with lists at all. It's the behaviour of default arguments.
It happens at the time that the function object is created, which is during runtime.
You only notice because lists are mutable. You should already prefer not to mutate parameters, and it especially doesn't make sense to mutate a parameter that has a default value because the point of mutating parameters is that the change can be seen by the caller, but a caller that uses a default value can't see the default value.
The behaviour can be used intentionally. (I would argue that it's overused intentionally; people use it to "bind" loop variables to lambdas when they should be using `functools.partial`.)
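For readers who have not seen the loop-variable idiom mentioned above, a small sketch comparing the three variants: a plain closure, the default-argument trick, and `functools.partial`.

```python
import functools

# Plain closures look up `i` when called, after the loop has finished.
late = [lambda: i for i in range(3)]

# The default-argument trick evaluates `i` when each lambda is created.
bound_default = [lambda i=i: i for i in range(3)]

# functools.partial binds the value explicitly, without touching the signature.
bound_partial = [functools.partial(lambda i: i, i) for i in range(3)]

late_results = [f() for f in late]
default_results = [f() for f in bound_default]
partial_results = [f() for f in bound_partial]
```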
If you're getting got by this, you're fundamentally expecting Python to work in a way that Pythonistas consider not to make sense.
It's just slightly annoying having to work around this by defaulting to None.
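A minimal sketch of the behaviour under discussion and the usual `None` workaround. The function names are hypothetical.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when `def` executes,
    # and is reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # The common workaround: default to None and build the list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_a = append_shared(1)
shared_b = append_shared(2)
fresh_a = append_fresh(1)
fresh_b = append_fresh(2)
```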
The entire point of it being an executable statement is to let you change things on the fly. This is key to how the REPL works. If I have `def foo(): ...` twice, the second one overwrites the first. There's no need to do any checks ahead of time, and it works the same way in the REPL as in a source file, without any special logic, for the exact same reason that `foo = 1` works when done twice. It's actually very elegant.
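The "`def` is just an assignment" point can be seen directly. In this small sketch the second `def` rebinds the name, while the first function object stays alive for as long as something still refers to it:

```python
def foo():
    return "first"


original = foo      # keep a reference to the first function object


def foo():          # executing this statement rebinds the name `foo`
    return "second"


current_result = foo()
original_result = original()
```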
People who don't like these decisions have plenty of other options for languages they can use. Only Python is Python. Python should not become not-Python in order to satisfy people who don't like Python and don't understand what Python is trying to be.
def make_foo():  # hypothetical enclosing function, implied by `return foo`
    b = ComplexObject(...)
    # do things with b
    def foo(self, arg=b):
        # use b
        ...
    return foo
Should it create a copy of b every time the function is invoked? If you want that right now, you can just call b.copy(); but if a copy were always created, you could no longer implement the current behaviour.
Should the semantics of this be any different?
def foo(self, arg=ComplexObject(...)):
Now imagine a: ComplexObject = list
def foo(self, arg=expression):
could, and should work as if it was written like this (pseudocode)
def foo(self, arg?):
    if is_not_given(arg):
        arg = expression
if "expression" is a literal or a constructor, it'd be called right there and produce new object, if "expression" is a reference to an object in outer scope, it'd be still the same object.
it's a simple code transformation, very, very predictable behavior, and most languages with closures and default values for arguments do it this way. Except python.
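The transformation proposed above can already be written by hand in today's Python with a private sentinel. This is only a sketch of the proposed semantics, not of anything Python does automatically; `_MISSING`, `shared`, and both function names are hypothetical.

```python
_MISSING = object()   # sentinel meaning "argument not given"
shared = []           # an object in the outer scope


def fresh_each_call(arg=_MISSING):
    if arg is _MISSING:
        arg = []      # "expression" is a constructor: evaluated on every call
    return arg


def outer_reference(arg=_MISSING):
    if arg is _MISSING:
        arg = shared  # "expression" names an outer object: same object each call
    return arg


first = fresh_each_call()
second = fresh_each_call()
ref_a = outer_reference()
ref_b = outer_reference()
```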
def foo (self, arg=lambda : expression):
Assignment of unevaluated expressions is not a thing yet in Python and would be really surprising. If you really want that, that is what you get with a lambda.
> most languages with closures and default values for arguments do it this way.
Do these also evaluate function definitions at runtime?
There might not be that many of them, depending on how you count, but they're not rare in the slightest. For example, you have to use `is` in the common case where you want the default value of a function argument to be an empty list.
https://github.com/python/cpython/blob/3.14/Lib/json/encoder...
Default value is evaluated once, and accessing parameter is much cheaper than global
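A sketch of that idiom, assuming a hypothetical helper `hypot_fast`. The global lookup of `math.sqrt` happens once, when `def` runs, and inside the function `_sqrt` is an ordinary local. Whether this is measurably faster depends on the interpreter version, so no timing is claimed here.

```python
import math


def hypot_fast(a, b, _sqrt=math.sqrt):
    # `_sqrt` was bound at definition time; no module or global lookup here.
    return _sqrt(a * a + b * b)


length = hypot_fast(3, 4)
bound_defaults = hypot_fast.__defaults__
```

The cost is that the extra parameter leaks into the signature, which is why such parameters usually get a leading underscore.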
Similarly, I don't entirely understand refcount elimination; I've seen the codegen difference, but since the codegen happens at build time, does this mean each opcode is possibly split into two (or more?) stencils, with and without removed increfs/decrefs? With so many opcodes and their specialized variants, how many stencils are there now?
Thanks for your interest. This is something we could improve on. We were supposed to document the JIT better in 3.15, but right now we're crunching for the 3.15 release. I'll try to get to updating the docs soon if there's enough interest. PEP 744 does not document the new frontend.
I wrote a somewhat high-level overview here in a previous blog post https://fidget-spinner.github.io/posts/faster-jit-plan.html#...
> does this mean each opcode is possibly split into two (or more?) stencils, with and without removed increfs/decrefs?
This is a great question, the answer is not exactly! The key is to expose the refcount ops in the intermediate representation (IR) as one single op. For example, BINARY_OP becomes BINARY_OP, POP_TOP (DECREF), POP_TOP (DECREF). That way, instead of optimizing for n operations, we just need to expose refcounting of n operations and optimize only 1 op (POP_TOP). Thus, we just need to refactor the IR to expose refcounting (which was the work I divided up among the community).
If you have any more questions, I'm happy to answer them either in public or email.
https://discuss.python.org/t/pep-744-jit-compilation/50756/8... here's one thing
I do think you can also just outright ask questions about it on the forums and you'll get some answers.
At the end of the day there's only so many people working on this though.
I love playing with compilers for fun, so maybe I can shed some light. I’ll explain it in a simplified way for everyone’s benefit (going to ignore the stack):
When an object is passed between functions in Python, it doesn’t get copied. Instead, a reference to the object’s memory address is sent. This reference acts as a pointer to the object’s data. Think of it like a sticky note with the object’s memory address written on it. Now, imagine throwing away one sticky note every time a function that used a reference returns.
When an object has zero references, it can be freed from memory and reused. Ensuring the number of references, or the “reference count” is always accurate is therefore a big deal. It is often the source of memory leaks, but I wouldn’t attribute it to a speed up (only if it replaces GC, then yes).
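The sticky-note picture can be observed on CPython with `sys.getrefcount`. The absolute numbers it reports vary between versions, so this sketch only compares counts relative to each other; `Payload` is a hypothetical class.

```python
import sys


class Payload:
    """Hypothetical object whose references we count."""


obj = Payload()
baseline = sys.getrefcount(obj)    # absolute value is version dependent

alias = obj                        # one more sticky note
with_alias = sys.getrefcount(obj)

del alias                          # throw that note away again
after_del = sys.getrefcount(obj)
```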
Although your general sentiment is something I agree with (if it's going to be painful, do it and get it over with), I don't believe anybody knew or could've guessed what the reaction of the ecosystem would be.
Your last point about being able to change internals more freely is also great in theory but very difficult (if not impossible) to achieve in practice.
I don't know. Having maintained some small projects that were free and open source, I saw the hostility and entitlement that can come from that position. And those projects were a speck of dust next to something like Python. So I think the core team is doing the best they can. It was always going to be damned if you do, damned if you don't.
Slight tangent: if Claude can decimate IBM stock price by migrating off Cobol for cheap, surely we can do Python 2 to 3 now, too?
About the internals: we sort of missed an opportunity there, but back then they also didn't quite know what they were doing (or at least we have better ideas of what's useful today). And making the step from 2 to 3 even bigger might have been a bad idea?
In my experience, the problem had always been maintaining the business logic and any integrations with third-party software that also may be running legacy code-bases or have been abandoned. It can get quite complicated, from what I've seen. Now of course if you're talking about well maintained code-bases with 100%, or close to 100% test coverage, and that includes the integration part along with having the ability to maintain the user experience and/or user interface then yes it becomes a relatively easy process of "just write the code". But, in my experience, this has never been the case.
For the 2.x code-bases I maintain, the customers simply don't want to pay for any of it. They might choose to at a later time, but so far it has been more cost effective for them to pay me to maintain that legacy code than pay to have it migrated. Other customers have different needs and thus budget differently.
I'll refrain from judging if 2 to 3 was a missed opportunity or not. I believe the core team does actually know what they're doing and that any decision would've been criticized.
"IBM Sinks Most Since 2000 as Anthropic Touts Cobol Tool"
https://finance.yahoo.com/news/ibm-sinks-most-since-2000-210...
It may not be "cheap", but possibly cheaper than IBM's consulting.
To me, there's a big difference between saying that migration projects can now be assisted with some AI tooling and saying that it is cheap and to just get Claude to do it.
Maybe I am out of touch but the former is realistic and the latter is just magical hand-waving.
I agree with the latter. About the former: they probably made good decisions given the information available at the time. I mean that nowadays they know more than they did in the past.
It does much better with good tests. In my case the output was a statically generated website, so I could just say 'make the same website, given these inputs'.
Since the switch we have seen enormous companies being built from scratch. There is no reason for anyone to be complaining about it being too hard to upgrade in 2026.
It wasn't until much later (I would say 3.4 or 3.5?) that we had good tooling to allow for migrating from Python 2 to Python 3 gradually, which is what most tools needed to do.
The final thing that made Python upgrading easy was making a bunch of changes (along with stuff like six) so that you could write code that would run identically in Python 2 and Python 3. That lets you do refactors over time, little cleanups, and not have the huge "move to Python 3" commit.
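A hand-rolled sketch of the kind of shim that `six` packaged up, so the same source could run on both major versions. This is an illustration in the spirit of `six`, not its actual code; `PY2`, `text_type`, and `iteritems` are named after the well-known `six` helpers but defined locally here.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821  (this name only exists on Python 2)

    def iteritems(d):
        return d.iteritems()
else:
    text_type = str

    def iteritems(d):
        return iter(d.items())


# Code written against the shim reads the same on either version.
pairs = sorted(iteritems({"b": 2, "a": 1}))
label = text_type("ok")
```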
The switch had nothing to do with Python's rise in popularity though; it was because NumPy and later PyTorch were adopted by data scientists and later for machine learning tasks that themselves became very popular. Python's popularity rose alongside those.
> There is no reason for anyone to be complaining about it being too hard to upgrade in 2026
The "complaints" are about unnecessary and pointless breakage that made upgrading very difficult for many codebases for years. That most of these codebases have by now been abandoned, upgraded, or left on Python 2 until the end of time doesn't mean these pains didn't happen, nor that inflicting them on users was a good idea just because some largely unrelated external factors made the language popular several years later.
In case people have forgotten: Python 3.3 through 3.5 (and 3.6 I think) each had to reintroduce something that was removed, to make the upgrade easier. Because of this, jumping from 2.7 to 3.3 (or higher, depending on what you needed) was the recommended route; it was less work than going to 3.0, 3.1, or 3.2.
It's widely regarded as a disaster for good reason, one that forced some corrections in Python to fix it. Just because it's fine now does not mean it was always fine.
if sys.version_info.major == 2:
    import old
else:
    import new
Or worse, people used try/except in their imports.
Anyway you can already try free-threaded builds that have the GIL disabled, but my experience is that most of your dependencies won't work.
Even the main driver for Python 3, the bytes-Unicode split, has unfortunately turned out to be sub-optimal. Python essentially bet on UTF-32 (with space-saving optimisations), while everyone else has chosen UTF-8.
How so? Python3 strings are unicode and all the encoding/decoding functions default to utf-8. In practice this means all the python I write is utf-8 compatible unicode and I don't ever have to think about it.
While most characters might be encodable as a single code point, Python does not normalize strings, so there is no guarantee that even relatively normal characters are actually stored as single code points.
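The normalization point can be checked with the standard `unicodedata` module. A short sketch:

```python
import unicodedata

decomposed = "a\u0308"                                  # "a" + COMBINING DIAERESIS
composed = unicodedata.normalize("NFC", decomposed)     # single code point U+00E4

same_text = decomposed == composed                      # str comparison is by code point
round_trip = unicodedata.normalize("NFD", composed)     # back to two code points
```

Both strings render as the same character, but they compare unequal and have different lengths until one of them is normalized explicitly.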
Try this in Python:
s = "a\u0308"
print(s)
print(s[0])
You will see:
ä
a
Languages that use UTF-8 natively don't need those functions at all. And the ones in Python aren't trivial - see, for example, `surrogateescape`.
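For context on `surrogateescape`, a small sketch of what that error handler does with bytes that are not valid UTF-8. The sample bytes are an arbitrary Latin-1 encoded word chosen for illustration.

```python
raw = b"caf\xe9"    # Latin-1 bytes; \xe9 on its own is not valid UTF-8

# Undecodable bytes are smuggled through as lone surrogates (U+DC80..U+DCFF).
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the lone surrogate.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```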
As the sibling comment says, the only benefit of all this encoding/decoding is that it allows strings to support constant-time indexing of code points, which isn't something that's commonly needed.
IMO, while this may not be optimal, it's far better than the more arcane choice made by other systems. For example, due to reasons only Microsoft can understand, Windows is stuck with UTF-16.
[1] Actually it's more intelligent. For example, Python automatically uses uint8 instead of uint32 for ASCII strings.
>>> x = '日本語'*100000000
>>> import time
>>> t = time.time(); y = x.encode(); time.time() - t # takes nontrivial time
>>> t = time.time(); y = x.encode(); time.time() - t # not cached; not any faster
Generally, the only reason this would happen implicitly is for I/O; actual operations on the string operate directly on the internal representation.
Python uses either 8, 16 or 32 bits per character according to the maximum code point found in the string; uint8 is thus used for all strings representable in Latin-1, not just "ASCII". (It does have other optimizations for ASCII strings.)
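The per-character width can be estimated on CPython with `sys.getsizeof`. This sketch measures the size difference between two lengths of the same repeated character so that the fixed object header cancels out; the helper name is hypothetical, and the result is specific to CPython's string representation.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by differencing two string sizes."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


ascii_width = bytes_per_char("a")             # ASCII
latin1_width = bytes_per_char("\u00e9")       # é, still within Latin-1
bmp_width = bytes_per_char("\u210d")          # ℍ, needs 16 bits
astral_width = bytes_per_char("\U0001f600")   # emoji outside the BMP
```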
The reason for Windows being stuck with UTF-16 is quite easy to understand: backwards compatibility. Those APIs were introduced before there were supplementary Unicode planes, such that "UTF-16" could be equated with UCS-2; then the surrogate-pair logic was bolted on top of that. Basically the same thing that happened in Java.
No there certainly is. This is documented in the official API documentation:
UTF-8 representation is created on demand and cached in the Unicode object.
https://docs.python.org/3/c-api/unicode.html#unicode-objects
In particular, Python's Unicode object (PyUnicodeObject) contains a field named utf8. This field is populated when PyUnicode_AsUTF8AndSize() is first called and reused thereafter. You can check the exact code I'm talking about here: https://github.com/python/cpython/blob/main/Objects/unicodeo...
Is it clear enough?
It did nothing of the sort. UTF-8 is the default source file encoding and has been the target for many APIs. It likely would have been the default for all I/O stuff if we lived in a world where Windows had functioning Unicode in the terminal the whole time and didn't base all its internal APIs on UTF-16.
I assume you're referring to the internal representation of strings. Describing it as "UTF-32 with space-saving optimizations" is missing the point, and also a contradiction in terms. Yes, it is a system that uses the same number of bytes per character within a given string (and chooses that width according to the string contents). This makes random access possible. Doing anything else would have broken historical expectations about string slicing. There are good arguments that one shouldn't write code like that anyway, but it's hard to identify anything "sub-optimal" about the result except that strings like "I'm learning 日本語" use more memory than they might be able to get away with. (But there are other strings, like "ℍℯℓ℗", that can use a 2-byte width while the UTF-8 encoding would need 3 bytes per character.)
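The two example strings can be compared directly. In this sketch the UTF-16-LE encoded length stands in for "2 bytes per character" storage, which is a fair proxy only because every character here is in the Basic Multilingual Plane; it ignores CPython's object header.

```python
mixed = "I'm learning 日本語"
symbols = "ℍℯℓ℗"

mixed_utf8 = len(mixed.encode("utf-8"))           # 13 ASCII bytes + 3 * 3 bytes
mixed_fixed2 = len(mixed.encode("utf-16-le"))     # 16 characters * 2 bytes

symbols_utf8 = len(symbols.encode("utf-8"))       # 4 characters * 3 bytes
symbols_fixed2 = len(symbols.encode("utf-16-le")) # 4 characters * 2 bytes
```

So the fixed-width layout costs more for mostly-ASCII text with a few wide characters and less for text made entirely of three-byte characters.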
There is a story that Python is harder to optimize than, say, Typescript, with Python flexibility and the C API getting mentioned. Maybe, if the list of troublesome Python features was out there, programmers could know to avoid those features with the promise of activating the JIT when it can prove the feature is not in use. This could provide a way out of the current Python hard-to-JIT trap. It's just a gist of an idea, but certainly an interesting first step would be to hear from the JIT people which Python features they find troublesome.
[1] https://fidget-spinner.github.io/posts/faster-jit-plan.html
I think __del__ is tricky though. In theory __del__ is not meant to be reliable. In practice CPython reliably calls it cuz it reference counts. So people know about it and use it (though I've only really seen it used for best effort cleanup checks)
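A sketch of the behaviour people rely on, next to the portable alternative. The first half depends on CPython's reference counting and is not guaranteed by the language; `Resource` and `Managed` are hypothetical classes.

```python
events = []


class Resource:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # On CPython this runs as soon as the last reference disappears.
        events.append("finalized " + self.name)


class Managed:
    """Context manager: cleanup happens at a defined point on any implementation."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append("closed")
        return False


r = Resource("tmp")
del r               # CPython: refcount hits zero here, __del__ runs immediately

with Managed():
    pass            # __exit__ runs when the block ends
```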
In a world where more people were using PyPy we could have pressure from that perspective to avoid leaning into it. And that would also generate more pressure to implement code that is performant in "any" system.
A big part of the problem is that much of the power of the Python ecosystem comes specifically from extensions/bindings written in languages with manual (C) or RAII/ref-counted (C++, Rust) memory management, and having predictable Python-level cleanup behavior can be pretty necessary to making cleanup behavior in bound C/C++/Rust objects work. Breaking this behavior or causing too much of a performance hit is basically a non-starter for a lot of Python users, even if doing so would improve the performance of "pure" Python programs.
Doesn't FinalizationRegistry let you do exactly that?
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
> A conforming JavaScript implementation, even one that does garbage collection, is not required to call cleanup callbacks. When and whether it does so is entirely down to the implementation of the JavaScript engine. When a registered object is reclaimed, any cleanup callbacks for it may be called then, or some time later, or not at all. It's likely that major implementations will call cleanup callbacks at some point during execution, but those calls may be substantially after the related object was reclaimed. Furthermore, if there is an object registered in two registries, there is no guarantee that the two callbacks are called next to each other — one may be called and the other never called, or the other may be called much later. There are also situations where even implementations that normally call cleanup callbacks are unlikely to call them:
I remember a couple of years ago (well probably around 2021) reading about GC exposure concerns and seeing some line in some TC39 doc like "users should not have visibility into collection" but if we've shipped weakrefs sounds like we're not thinking about that anymore
A proposal to add new ways of observing garbage collection will still be shot down immediately without a damn good justification.
This is more pedantry than a serious question. JavaScript has WeakRef; sure, it'd be cumbersome and inefficient because you'd need to manually make and poll each thing you wanted to observe, but could it not be said that it does provide a view on deallocations?
Note that 90% of the uses for them actually shouldn't be using them, usually for subtle reasons. It's always a big cause for debate.
> Using str.frobnicate prevents TurboJit on line 63
> By using only a single instruction and two tables, we only increase the interpreter by a size of 1 instruction, and also keep the base interpreter ultra fast. I affectionally call this mechanism dual dispatch.
I really do hope they'll write that better explanation one day because this sounds pretty intriguing all on its own.
I recently read an interview about implementing free-threading and getting modifications through the ecosystem to really enable it: https://alexalejandre.com/programming/interview-with-ngoldba...
The guy said he hopes the free-threaded build'll be the only one in "3.16 or 3.17", I wonder if that should apply to the JIT too or how the JIT and interpreter interact.
Having to have thread safe code all over the place just for the 1% of users who need to have multi-threading in Python and can't use subinterpreters for some reason is nuts.
Way more than 1% of the community, particularly of the community actively developing Python, wants free-threaded. The problem here is that the Python community consists of several different groups:
1. Basically pure Python code with no threading
2. Basically pure Python with appropriate thread safety
3. Basically pure Python code with already broken threaded code, just getting lucky for now
4. Mixed Python and C/C++/Rust code, with appropriate threading behavior in the C or C++ components
5. Mixed Python and C or C++ code, with C and C++ components depending on GIL behavior
Group 1 gets a slightly reduced performance. Groups 2 and 4 get a major win with free-threaded Python, being able to use threading through their interfaces to C/C++/Rust components. Group 3 is already writing buggy code and will probably see worse consequences from their existing bugs. Group 5 will have to either avoid threading in their Python code or rewrite their C/C++ components.
Right now, a big portion of the Python language developer base consists of Groups 2 and 4. Group 5 is basically perceived as holding Python-the-language and Python-the-implementations back.
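To make the difference between Groups 2 and 3 concrete, here is a minimal sketch of what "appropriate thread safety" looks like for a shared counter. The `Counter` class and `run_workers` helper are made up for this example. The increment is a read-modify-write, so it is done under a lock; dropping the lock gives Group 3 code, which is not guaranteed correct on any build and is more likely to visibly lose updates without a GIL.

```python
import threading


class Counter:
    """Shared counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a read, an add, and a write. Holding the
        # lock makes the three steps atomic with respect to other threads.
        with self._lock:
            self.value += 1


def run_workers(counter, n_threads=4, n_increments=10_000):
    """Hammer the counter from several threads and return the final value."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_workers(Counter()))
```

With the lock in place the result is always `n_threads * n_increments`, with or without the GIL.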
Native code can already be multi-threaded so if you are using Python to drive parallelized native code, there's no win there. If your Python code is the bottleneck, well then you could have subinterpreters with shared buffers and locks. If you really need to have shared objects, do you actually need to mutate them from multiple interpreters? If not, what about exploring language support for frozen objects or proxies?
The only thing that free threading gives you is concurrent mutations to Python objects, which is like, whatever. In all my years of writing Python I have never once found myself thinking "I wish I could mutate the same object from two different threads".
I think the GIL provides python with a great guarantee, I would probably prefer single-thread performance improvements over multithreading in python to be honest.
Anyway if I need performance, Python would probably not be my first choice
Microsoft used to do this for their C runtime library.
Kudos to those involved into making it happen.
`from __future__ import time_travel`
But I do agree that it would be a bit clearer to talk in terms of time taken rather than speedup % i.e. instead of "20% slowdown to over 100% speedup" it's clearer to say "takes between 50% and 125% of the original time". (Especially since people very often say things like "3 times faster", which technically means 4 times as fast, when they should say "3 times as fast"; "takes 1/3 of the time" is unambiguous.)
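The conversion is easy to check. This small sketch assumes "N% speedup" means the speed is multiplied by (1 + N/100) and "N% slowdown" means it is multiplied by (1 - N/100), which is the reading that gives the 50% and 125% figures above; `relative_time` is a helper name made up for the example.

```python
def relative_time(speed_change_percent):
    """Return new time as a fraction of the original time.

    speed_change_percent: +100 means a 100% speedup (2x the speed),
    -20 means a 20% slowdown (0.8x the speed).
    """
    speed_ratio = 1 + speed_change_percent / 100
    return 1 / speed_ratio


# 20% slowdown and 100% speedup, as fractions of the original time
print(relative_time(-20), relative_time(100))

# "3 times faster" read literally is +300% (4x the speed);
# "3 times as fast" is +200% (3x the speed).
print(relative_time(300), relative_time(200))
```

Under that reading, "3 times faster" works out to a quarter of the original time and "3 times as fast" to a third of it.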
A lot of Python code still leans on CPython internals, C extensions, debuggers, or odd platform behavior, so PyPy works until some dependency or tool turns that gap into a support problem.
The JIT helps on hot loops, but for mixed workloads the warmup cost and compatibility tax are enough to keep most teams on the interpreter their deps target first.
The PSF is primarily a political advocacy organisation, so it wouldn't make sense for them to use the money for Python.
See https://github.com/numpy/numpy/issues/30416 for example. It's not being updated for compatibility with new versions of Python.
Like, this is a big deal: getting a project to a state where volunteers are spun up, actively breaking down tasks, and getting work done, no? It's a Python JIT, something I know next to nothing about (as do most application developers), which tells you how difficult this must have been.
The funding was Microsoft employing most of the team. They were laid off (or at least, moved onto different projects), apparently because they weren't working on AI.
(The latter is probably more to do with the preferences they give it in the reinforcement learning phase than anything technical, though.)
That is not remotely the case for anyone who produces quality work.
If you care about quality you absolutely can guide a machine to produce that for you without writing a single line of code yourself.
And I expect the amount of guidance needed will continue to drop.
In my experience the people who care the most about code readability tend to be the people most opinionated on having the right abstractions, which are historically not available in Go.
This would be a potential case for a new major version number.
It will be interesting to see, moving forward, what languages survive. A 15% perf increase seems nice, until you realize that you get a 10x increase porting to Rust (and the AI does it for you).
Maybe library use/popularity is somewhat related to backwards compatibility.
Disclaimer: I teach Python for a living.
So, you keep reading/writing Python and push a button to get binary executables through whatever hoops are best today?
(I haven't seen the "fits your brain" tagline in the recent past ...)
> taking backwards compatibility so seriously
Python’s backward compatibility story still isn’t great compared to things like the Go 1.x compatibility promise, and languages with formal specs like JS and C.
The Python devs still make breaking changes, they’ve just learned not to update the major version number when they do so.
I would say it's probably worth it to clean up all the junk that Python has accumulated... But it's definitely not very high up the list of languages in terms of backwards compatibility. In fact I'm struggling to think of other languages that are worse. TypeScript probably? Certainly Go, C++ and Rust are significantly better.
The more likely reason is that there simply hasn't been that big a push for it. Ruby was dog slow before the JIT and Rails was very popular, so there was a lot of demand and room for improvement. PHP was the primary language used by Facebook for a long time, and they had deep pockets. JS powers the web, so there's a huge incentive for companies like Google to make it faster. Python never really had that same level of investment, at least from a performance standpoint.
To your point, though, the C API has made certain types of optimizations extremely difficult, as the PyPy team has figured out.
But the main problem was actually that PyPy was never adopted as "the JIT" mechanism. That would have made a huge difference a long time ago and made sure they evolved in lockstep.
AFAIK it was not driven by anything on the tech side. It was simply unlucky timing, the project getting caught in the middle of Microsoft's heavy-handed push to cut everything. So much so that the people who were hired by MS to work on this found out they were laid off in the middle of a conference where they were giving talks on it.
Or lack of incentive?
A lot of big Python projects that do machine learning and data processing offload the heavy work from pure Python code to libraries like NumPy and pandas, which use C API bindings to execute natively.
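The same pattern shows up in miniature inside the standard library. As a rough stand-in for NumPy (which isn't assumed to be installed here), this sketch compares a hand-written loop with the built-in `sum`, which CPython implements in C. The function names and the `compare` helper are made up for the example, and the timings will vary by machine and Python version.

```python
import timeit


def total_pure_python(values):
    # Every iteration goes through the bytecode interpreter.
    total = 0
    for v in values:
        total += v
    return total


def total_builtin(values):
    # The loop runs inside the C implementation of sum().
    return sum(values)


def compare(n=100_000, repeats=20):
    """Return (pure_python_seconds, builtin_seconds) for summing range(n)."""
    data = list(range(n))
    slow = timeit.timeit(lambda: total_pure_python(data), number=repeats)
    fast = timeit.timeit(lambda: total_builtin(data), number=repeats)
    return slow, fast


slow, fast = compare()
print(f"pure Python loop: {slow:.4f}s, built-in sum: {fast:.4f}s")
```

Both functions return the same result; only where the loop executes differs, which is the whole trick behind pushing hot paths into native libraries.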
A worthwhile JIT is a fully optimizing compiler, and that is the hard part. Language semantics are much less important - dynamic languages aren’t particularly harder here, but the performance roof is obviously just much lower.
Including simply implementing the slow parts in C, such as the high performance machine learning ecosystem that exists in Python.
blueberry (aarch64)
  Description: Raspberry Pi 5, 8GB RAM, 256GB SSD
  OS: Debian GNU/Linux 12 (bookworm)
  Owner: Savannah Ostrowski
ripley (x86_64)
  Description: Intel i5-8400 @ 2.80GHz, 8GB RAM, 500GB SSD
  OS: Ubuntu 24.04
  Owner: Savannah Ostrowski
jones (aarch64)
  Description: Apple M3 Pro, 18GB RAM, 512GB SSD
  OS: macOS
  Owner: Savannah Ostrowski
prometheus (x86_64)
  Description: AMD Ryzen 5 3600X @ 3.80GHz, 16GB RAM
  OS: Windows 11 Pro
  Owner: Savannah Ostrowski