A Python Interpreter Written in Python
157 points
by xk3
17 days ago
| 12 comments
| aosabook.org
| HN
BoppreH
13 days ago
[-]
> Byterun is a Python interpreter written in Python. This may strike you as odd, but it's no more odd than writing a C compiler in C.

I'm not so sure. The difference between a self-hosted compiler and a circular interpreter is that the compiler has a binary artifact that you can store.

With an interpreter, you still need some binary to run your interpreter, which will probably be CPython, making the new interpreter redundant. And if you add a language feature to the custom interpreter, and you want to use that feature in the interpreter itself, you need to run the whole chain at runtime: CPython -> Old Interpreter That Understands New Feature -> New Interpreter That Uses New Feature -> Target Program. And the chain only gets longer, with each iteration exponentially slower.

Meanwhile, with a self-hosted compiler, each iteration is "cached" in the form of a compiled binary. The chain is only in the history of the binary, not part of the runtime.
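
To put rough numbers on the interpreter chain described above, here is a small sketch. The 50x per-layer slowdown is an assumed figure chosen only for illustration, not a measurement of Byterun or CPython, and `chained_slowdown` is a hypothetical helper name.

```python
def chained_slowdown(per_layer_factor, layers):
    """Total slowdown when `layers` interpreters are stacked on top of each other.

    Each layer is assumed to run its guest `per_layer_factor` times slower
    than the layer beneath it, so the costs multiply.
    """
    return per_layer_factor ** layers


# Assumed figure for illustration only: every layer costs 50x.
for layers in range(1, 4):
    print(layers, "layer(s):", chained_slowdown(50, layers), "x slower")
```

A self-hosted compiler avoids this multiplication because only the final binary runs; the earlier generations are not on the execution path.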

---

Edit since this is now a top comment: I'm not complaining about the project! Interpreters are cool, and this is genuinely useful for learning and experimentation. It's also nice to demystify our tools.

reply
gwerbin
13 days ago
[-]
PyPy handled this by implementing PyPy in a restricted minimal subset of Python that they called RPython, and that seemed to work out well for them.
reply
mikepurvis
13 days ago
[-]
I was never a user of PyPy but I really appreciated the (successful) effort to cleanly extract from Python a layer of essential primitives upon which the rest of the language's features and sugar could be implemented.

It's more than just a question of what is syntax or a language feature. For example, RPython provides classes, but only very limited multiple inheritance; all the MRO stuff is implemented using RPython for PyPy itself.

reply
paulddraper
13 days ago
[-]
The key difference is that RPython is actually a compiled language.

I.e. PyPy DOESN'T have an interpreter written in an interpreted language.

reply
SJC_Hacker
13 days ago
[-]
This is the case only if the new interpreter does not simply include the layer that the old interpreter has for translating bytecode to native instructions. Once you have that, you can simply bootstrap any new interpreters from previous ones. Even in the case of supporting new architectures, you can still work at the Python level to produce the necessary binary, although the initial build would have to be done on an already supported architecture.
reply
direwolf20
13 days ago
[-]
Interpreters don't translate bytecode to native instructions.
reply
SJC_Hacker
13 days ago
[-]
The usual understanding of "interpreter" in a CS context is a program that executes source code directly without a compilation step. However, the binary that translates an intermediate bytecode to native machine code is at least sometimes called a "bytecode interpreter".

https://doc.pypy.org/en/latest/interpreter.html

reply
ghusbands
13 days ago
[-]
This is still incorrect. A bytecode interpreter, as its name indicates, interprets a bytecode. Typically, compiling a bytecode to native machine code is the work of a JIT compiler.
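
To make the distinction concrete, here is a minimal sketch of a bytecode interpreter. The opcode names (`PUSH`, `ADD`, `MUL`) and the `run` function are invented for this example and are not CPython's or Byterun's. The loop reads each instruction and performs its effect immediately; at no point does it emit machine code.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs using a value stack."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Hypothetical program computing (2 + 3) * 4.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(program))
```

A JIT compiler would instead translate a sequence like `program` into native instructions once and then execute those.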
reply
genxy
13 days ago
[-]
reply
ghusbands
13 days ago
[-]
That's a partial evaluator, not an interpreter, and it converts an interpreter into a compiler, which are different things.
reply
genxy
13 days ago
[-]
> Interpreters don't translate bytecode to native instructions.

> That's a partial evaluator, not an interpreter, and it converts an interpreter into a compiler, which are different things.

https://old.reddit.com/r/Compilers/comments/1sm90x5/retrofit...

reply
ghusbands
12 days ago
[-]
Yes, that's another great example of the same kind of thing - creating a JIT from an interpreter. It remains true that interpreters do not directly generate machine code.
reply
genxy
12 days ago
[-]
The author of weval is the top comment.

Reading the comments, one can see that, transitively, weval turns interpreters into compilers, allowing interpreters to generate machine code.

reply
direwolf20
12 days ago
[-]
If you turn milk into cheese it isn't milk any more, and it doesn't prove that milk is a yellow solid.
reply
genxy
8 days ago
[-]
We lost the plot here.

What are your goals? To let everyone know that interpreters, by definition, don't generate code? This isn't debate club.

I dropped a cool link that shows we have a machine that turns interpreters into compilers. I am talking about the machine. You are talking about the definition. We aren't talking about the same thing.

reply
ghusbands
8 days ago
[-]
Partly, it's simply that words matter. An interpreter is not a compiler, even if partial evaluators and Futamura transforms are very cool. Posting about them in a context that isn't a confusion about what interpreters are may have been more fruitful.
reply
anitil
13 days ago
[-]
Oooh it's a bytecode interpreter! I was wondering how they'd fit a parser/tokenizer in 500 lines unless the first was `import tokenizer, parser`. And it looks like 1500ish lines according to tokei

I think because python is a stack-based interpreter this is a really great way to get some exposure to how it works if you're not too familiar with C. A nice project!
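
For anyone who wants to see the stack-based bytecode directly, the standard library's `dis` module will print it. This is a small sketch; `add_and_double` is a made-up function, and the exact opcode names in the output vary between CPython versions, so no expected listing is shown.

```python
import dis


def add_and_double(a, b):
    return (a + b) * 2


# Print the bytecode CPython compiled this function to.
# Loads push values onto the stack; binary operations pop two and push one.
dis.dis(add_and_double)

# The same instructions, available programmatically.
opnames = [ins.opname for ins in dis.get_instructions(add_and_double)]
print(opnames)
```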

reply
cestith
13 days ago
[-]
The article contrasts Python to Perl, saying Perl is purely interpreted while Python has compilation. This is factually incorrect.

Perl is transformed into an AST. Then that is decorated into an opcode tree. The thing runs code nearly as fast as C in many instances, once the startup has completed and the code is actually running.

reply
throwpoaster
13 days ago
[-]
reply
jgbuddy
13 days ago
[-]
one liner:

eval(str)

reply
PhunkyPhil
13 days ago
[-]
I can do you one better:

```python3
import sys

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": f"generate valid python byte code this program compiles to: {sys.argv[1]}",
    }],
)

print(response.choices[0].message.content)
```

Actually, probably not better.

reply
nagaiaida
12 days ago
[-]
and as soon as one tries to meaningfully add features to this sort of metainterpreter, the usefulness of homoiconic syntax becomes abundantly clear
reply
nasretdinov
13 days ago
[-]
Went into comments looking for this exact comment. Wasn't disappointed
reply
_blk
13 days ago
[-]
Great minds think alike ;)
reply
tekknolagi
13 days ago
[-]
reply
bjoli
13 days ago
[-]
And, in some ways, PyPy. I still think it is the sanest way to implement Python.

It makes me sad that I have to write C to make any meaningful changes to Python. Same goes for ruby. Rubinius was such a nice project.

Hacking on schemes and lisps made me realize how much more fun it is when the language is implemented in the language itself. It also makes sure you have the right abstractions for solving a bunch of real problems.

reply
actionfromafar
13 days ago
[-]
Well, one could rewrite Python (perhaps piece by piece?) in Shedskin.

Shedskin is very nearly Python compatible, one could say it is an implementation of Python.

reply
anitil
13 days ago
[-]
> And, in some ways, PyPy

What do you mean by that? I'm not familiar with PyPy

reply
nxpnsv
13 days ago
[-]
PyPy is python implemented in python. It is fast.
reply
notpushkin
13 days ago
[-]
https://pypy.org/

It lags behind CPython in features and currently only supports Python versions up to 3.11. There was a big discussion a month ago: https://news.ycombinator.com/item?id=47293415

But you can help! https://pypy.org/howtohelp.html

https://opencollective.com/pypy

reply
Doxin
13 days ago
[-]
PyPy is python implemented in RPython, which is technically a python subset. It's so restricted it might as well be a different language though.
reply
bjoli
13 days ago
[-]
It is restricted in a way that you would restrict yourself to write high speed software in most languages, and I found it is not that restrictive compared to C that you would have to use if you were to write a fast Python library.
reply
Doxin
13 days ago
[-]
oh for sure, but I still feel like telling people pypy is written in python is misleading. it's written in something significantly like python, but it's not python.
reply
mjmas
13 days ago
[-]
> technically a python subset

So it can just run under CPython? If so, then that isn't too misleading.

reply
bjoli
13 days ago
[-]
Yes. It can run under Cpython (2.7).
reply
nxpnsv
12 days ago
[-]
PyRPy is just less catchy sounding
reply
wyldfire
13 days ago
[-]
The fact that it's written in python is often brought up in order to explain its name. But really, it's much less interesting than the fact that it has a tracing JIT. If it were called PyJIT I'd bet it would be clearer and more obvious that it's fast. And people would prob get less hung up on the distinction between python/rpython.
reply
vachanmn123
13 days ago
[-]
Very well written! Everyone used to tell me during Uni that stacks are used for running programs, never ACTUALLY understood where or how.
reply
woadwarrior01
13 days ago
[-]
aka A Metacircular Interpreter
reply
mapontosevenths
13 days ago
[-]
Do you think God stays in heaven because he too lives in fear of what he's created?
reply
blueybingo
13 days ago
[-]
the article glosses over something worth pausing on: the `getattr` trick for dispatching instructions (replacing the big if-elif chain) is actually a really elegant pattern that shows up in a lot of real interpreters and command dispatchers, not just toy ones -- worth studying that bit specifically if you're building anything with extensible command sets.
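
A minimal sketch of that dispatch pattern follows. The class name `TinyVM`, the `op_` prefix, and the instruction names are all hypothetical choices for this example, not the article's code. Each instruction name is mapped to a method via `getattr`, so supporting a new instruction means adding one method rather than another `elif` branch.

```python
class TinyVM:
    """Toy stack machine that looks up instruction handlers by name."""

    def __init__(self):
        self.stack = []

    def op_PUSH(self, arg):
        self.stack.append(arg)

    def op_ADD(self, arg):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def op_NEG(self, arg):
        self.stack.append(-self.stack.pop())

    def run(self, program):
        for name, arg in program:
            # Look up the handler method by name instead of an if-elif chain.
            handler = getattr(self, f"op_{name}", None)
            if handler is None:
                raise ValueError(f"unsupported instruction: {name}")
            handler(arg)
        return self.stack.pop()


print(TinyVM().run([("PUSH", 5), ("PUSH", 7), ("ADD", None), ("NEG", None)]))
```

The fixed prefix matters: it keeps instruction names from resolving to unrelated attributes such as `run` or `stack`.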
reply
johndough
13 days ago
[-]
Are you a bot? All your recent comments point out a thing in an article and contain LLM-isms.
reply
bdangubic
13 days ago
[-]
you asking a bot if it is a bot? :)
reply
gbacon
11 days ago
[-]
You know that Voight-Kampff test of yours? Did you ever take that test yourself?
reply
blueybingo
13 days ago
[-]
haha no im not a bot, but starting to realise i sound like one. need to be less cynical.
reply
bdangubic
13 days ago
[-]
exactly what the bot would say lol :)
reply
blueybingo
12 days ago
[-]
damn brain is becoming bot mush
reply
gield
13 days ago
[-]
(2012)
reply
em-bee
13 days ago
[-]
actually it was published as a chapter in "500 lines or less" in 2016: https://news.ycombinator.com/item?id=11796253

the text is based on python 3.5 which was released in 2015

other discussions:

https://news.ycombinator.com/item?id=16795049

https://news.ycombinator.com/item?id=12455104

https://news.ycombinator.com/item?id=11796253

reply
gield
13 days ago
[-]
Oops, I went by the publication date of the book
reply
em-bee
13 days ago
[-]
where did you see a publication date of 2012 if the book was published in 2016?
reply
andltsemi3
13 days ago
[-]
"Yaw dog I heard you liked python, so I put python in your python so you can interpret python while you interpret python"
reply
hcfman
13 days ago
[-]
Just wondering why you stopped there? Why not a python interpreter for a python interpreter for python?
reply
dnnddidiej
13 days ago
[-]
It already is that.
reply