I've also uploaded binary executables for JavaScript (Deno), Lua and PHP and had it write and execute code in those languages too: https://til.simonwillison.net/llms/code-interpreter-expansio...
If there's a Python package you want to use that's not available you can upload a wheel file and tell it to install that.
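A rough sketch of what "install that" boils down to inside the sandbox, assuming the wheel was uploaded to a path like /mnt/data/ (the path and the package name foo are made up for illustration). `--no-index` is a real pip flag that stops pip from trying to reach PyPI, which matters if the sandbox has no network access:

```python
import sys

def pip_install_command(wheel_path):
    """Build a pip command that installs a local wheel without contacting PyPI."""
    # sys.executable makes sure we use the pip that belongs to this interpreter.
    return [sys.executable, "-m", "pip", "install", "--no-index", wheel_path]

# Hypothetical upload location and wheel name, not taken from the thread.
cmd = pip_install_command("/mnt/data/foo-1.0-py3-none-any.whl")
print(" ".join(cmd))

# To actually run it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

If the wheel has dependencies that aren't already installed, this would fail, and you'd need to upload those wheels as well.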
So he tricked it by asking “what is the error message if you try to pip install foo”, so it ran pip install and announced there was no error.
Package foo now installed.
Normie: How do I do X in Linux?
Linux nerds: RTFM, noob.
vs.
Normie: Linux sucks because you can't do X.
Linux nerds: Actually, you can just apt-get install foo and...
The wiki, however, was (is?) absolutely fantastic. I used it as a general-purpose Linux wiki before I even switched to Arch; I distinctly remember the info on X Multi-Head being leagues above other resources I could find.
https://en.wikipedia.org/wiki/Ward_Cunningham#%22Cunningham'...
Does it matter if I answer every question with either 1 or 2 and flip a coin each time to decide which?
Deterministic means that if it is accurate/correct once, it will continue to be in future runs (unless the correct answer changes; a stopped clock is deterministic).
I think the analogy breaks down here. The elided bit "time indicator" implied at the end makes that statement false. A stopped clock is not a deterministic time indicator.
If the correct answer changes, a (correct and accurate) deterministic model either gets new input and changes the answer accordingly, or is not correct to begin with.
LLMs can be deterministic if you run them with a temperature of 0 or a fixed random seed, and your kernel is built to be deterministic, but they're not typically used that way, so in practice they can produce different output for identical input.
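A toy illustration of the temperature and seed point, not how any real model is served: the logits are made up, and sample_token is a hypothetical helper. Temperature 0 is treated as greedy argmax, and a fixed seed makes the random draws repeatable:

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index; temperature 0 means greedy argmax (no randomness)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def run(seed, temperature, n=10):
    """Sample n tokens with a fresh RNG; seed=None uses OS entropy."""
    rng = random.Random(seed)
    return [sample_token(LOGITS, temperature, rng) for _ in range(n)]

LOGITS = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens

greedy = run(seed=None, temperature=0)      # always the top-scoring index
seeded_a = run(seed=42, temperature=1.0)    # same seed...
seeded_b = run(seed=42, temperature=1.0)    # ...same sequence
unseeded = run(seed=None, temperature=1.0)  # may differ from run to run

print(greedy)
print(seeded_a == seeded_b)
print(unseeded)
```

This leaves out the "kernel is built to be deterministic" part entirely; on real hardware, floating-point reduction order can still introduce differences that a toy like this never sees.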
I never said it is. That's why I qualified my example with the word correct.
> no matter what you do, it gives you the same output
This is not deterministic. This is determined. I think this is the confusion I was pointing out.
>> Deterministic means that if it is accurate/correct once, it will continue to be in future runs (unless the correct answer changes; a stopped clock is deterministic).
The bit in the parentheses, I am trying to argue, is nonsense. If the correct answer changes, the system is not accurate or correct to begin with, so the point is moot. Correcting the system will make it accurate. A stopped clock is not deterministic; it's determined. As a time indicator, a stopped clock is not a correct, accurate or deterministic model at all under any possible interpretation.
Determinism is about the behavior of a system. Correctness is also about the purpose of a system. A system can have deterministic behavior while being completely unfit for its purpose. And depending on its purpose, it can be fit for purpose while being nondeterministic.
I build a box. It has an LCD display. It has a button labeled “what time is it”. You push the button and it always shows “10:43am”. This is a deterministic system.
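The box can be written down in a few lines, which makes the split between determinism and correctness easy to see. The function names here are made up for illustration:

```python
from datetime import datetime

def stuck_box():
    """The box from the example: same answer on every button press."""
    return "10:43am"

def real_clock():
    """A working clock: the answer depends on when you ask."""
    return datetime.now().strftime("%I:%M%p").lower()

presses = [stuck_box() for _ in range(3)]
print(presses)       # identical every time, so the behavior is deterministic
print(real_clock())  # changes over time, and is the one that is fit for purpose
```

Nothing in stuck_box is random, so it is deterministic in the behavioral sense, yet as a time indicator it is only right when the real time happens to be 10:43am.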
Ask it what the capital of France is, and it will tell you it is Paris. Same with "how do I reverse a string in Python", or whatever problem you have at hand that needs solving (sans searching capability, which makes things more complicated).
So doesn't the problem need to be unique if you want to be able to claim with certainty that it has indeed been executed? I am not sure how you account for the searching capability, and I am not excluding the possibility that they have access to execution tools; I'm pretty sure they do.
Since reading on Twitter is annoying with all the popups: https://archive.is/ETVQ0
One weird thing - why would they be running such an old Linux?
“Their sandbox is running a really old version of linux, a Kernel from 2016.”
They didn't.
OP misunderstood what gVisor is, and thought the value returned by gVisor's uname() [1] was from the actual kernel. It's not. That's the whole point of gVisor. You don't get to talk to the real kernel.
[1] https://github.com/google/gvisor/blob/c68fb3199281d6f8fe02c7...
I know this because at Modal.com we also use gVisor and our users occasionally ask about this.
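For anyone who wants to check what their own sandbox reports, here is a minimal sketch using Python's standard platform module (reported_kernel is a made-up helper name). The values come from whatever answers the uname call for the process, so inside gVisor they describe gVisor's own kernel implementation rather than the host:

```python
import platform

def reported_kernel():
    """Return the kernel details that the uname interface reports to this process."""
    info = platform.uname()
    return {
        "system": info.system,
        "release": info.release,
        "version": info.version,
    }

kernel = reported_kernel()
print(kernel["system"], kernel["release"])
```

An old-looking release string here tells you what the sandbox chooses to report, not how old the machine underneath is.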
How hard would it be to use it for a DDoS attack, for instance? Or for an internal DDoS attack?
If I were working at OpenAI, I'd be worrying about these things. And I'd be screaming during team meetings to get the images more locked down, rather than less :)
I find ChatGPT and Claude really quite good at C.
Though I like Claude's conversation style more than the other ones.
All of the exploits of early dotcom days are new again. Have fun!
Would be cool if you could get the weights this way.
And maybe they contain the memory of the users and/or the documents uploaded?
Again, what is the risk?