AI Agent Guidelines for CS336 at Stanford
148 points
2 hours ago
| 29 comments
| github.com
| HN
aaaronic
46 minutes ago
[-]
I'm trying something similar this semester with my course via AGENTS.md. I think this one is overly verbose and probably falls out of context windows pretty quickly, based on my experience (for me, a very terse but clear set of 30 lines performed better than providing examples and more nuanced explanations during my testing with a few models).

I have included the basic "I am a student -- help me learn, don't just do everything for me," but I am also trying out telling it to generate a .history folder with a markdown history of every prompt and a summary of the action taken in response.

I _know_ there are some tools that offer the prompt history automatically, but I've told students they can use _whatever_ tool they want, but should let me know if the folder isn't showing up as they work.

The .history folder is required if they used AI and I intend to review it and try to give specific feedback to the students using it as too much of a crutch.

I just started this last Friday, so wish me luck!
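
For illustration only, here is a minimal sketch of what a `.history` log like the one described above might look like if written by a small helper. The function name `log_prompt`, the one-file-per-day layout, and the entry format are all assumptions made for this example; the comment does not specify any of them, and in practice the agent itself would write these entries.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def log_prompt(history_dir: Path, prompt: str, summary: str,
               now: Optional[datetime] = None) -> Path:
    """Append one prompt/summary pair to a dated markdown file.

    Hypothetical layout: <history_dir>/YYYY-MM-DD.md, one section per prompt.
    """
    now = now or datetime.now(timezone.utc)
    history_dir.mkdir(parents=True, exist_ok=True)
    log_file = history_dir / f"{now:%Y-%m-%d}.md"
    entry = (
        f"## {now:%H:%M:%S} UTC\n\n"
        f"**Prompt:** {prompt}\n\n"
        f"**Summary of action:** {summary}\n\n"
    )
    # Append so earlier entries from the same day are preserved.
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return log_file
```

Calling `log_prompt(Path(".history"), "Why does my loop never end?", "Pointed at the loop condition; no code written")` would add one section to today's file, which an instructor could later skim.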

reply
andersmurphy
1 hour ago
[-]
Seems like a pretty close copy of Carson's (of HTMX fame) agent.md from 5 months ago

https://gist.github.com/1cg/a6c6f2276a1fe5ee172282580a44a7ac

reply
philipportner
53 minutes ago
[-]
They reference the gist of 1cg in the honor code section of CS336.

https://cs336.stanford.edu/

reply
ohmahjong
2 hours ago
[-]
This seems somewhat sensible to me - the genie _is_ out of the bottle, and students absolutely will use AI agents to finish assignments without learning a thing, but there is some value to showing how agents can be used as teaching tools and what healthy use _can_ look like
reply
asdff
1 hour ago
[-]
Same issue as with CliffsNotes. Easy way out means the easy way will be taken. Unless you actually design a decent assignment or exam. In-person essays or exams, heavily weighted, you are simply screwed if you didn't study the old-fashioned way. A couple of my more serious classes were like this: no homework, no projects, entire grade based on 3 exams. That put the fear of whatever deity you subscribe to into you like nothing else to study hard and not fall behind. One bad exam you can't really come back from. Better luck next year when you retake it. Or, you dig in like hell.
reply
hibikir
15 minutes ago
[-]
3 tests was already better than the traditional Spanish university class: 1 exam, which is probably written by the department head, not your teacher, and he isn't in any way interested in a high pass rate. Failing 90% of the class might even be positive for them. At that point classes aren't even important: You purchase the tests from the last 10 years, and then you have a prayer of knowing what the bar might be this year.

Teaching, fairness and measuring student performance might seem like similar goals, but it's just so very easy to make sure you succeed at one while messing up the others.

reply
asdff
9 minutes ago
[-]
I tested out of all but the last required Spanish class so I probably skipped over some early stuff and avoided the deeper stuff. But at that level I remember we'd do oral exams with the TA 1 on 1 maybe 15 mins in the hallway. I forget the logistics of it all now. I remember making presentations and class participation in Spanish being important. I can't remember how the written exams went.
reply
MengerSponge
1 hour ago
[-]
The insidious thing here is that students can think they're studying and practicing by chatting with an AI "tutor", which shifts them into a passive observation role that's no better than watching YouTube videos.

It turns out that it's much less memorable if you're too "clear and helpful", so nothing helpful sticks for students. A good teacher (tutor, educator, pick a word) challenges students and makes them the right amount of uncomfortable.

reply
asdff
15 minutes ago
[-]
These resources often suck for the college major level anyhow. YouTube and such are usually all dumbed down. Or if it isn't dumbed down, you risk studying beyond the scope of the lecture. Every class I took, the professor would say something like "anything in lecture could end up on the exam." And indeed, every exam was comprised of something that came from the slides, and nothing that didn't come from the slides. Even if there was an assigned textbook, there would be so much skipped over, either subtopics or entire chapters. Emphasis can vary by lecturer for the same class as well. The class might fall behind or run ahead of whatever is outlined on the syllabus; that is more an aspirational goal than a solid plan of what to expect.

The best tutor, as always, is your TA or professor, during office hours that you already pay for in tuition. No one takes advantage though, well the students who were getting As already do just to validate their understanding. The students who really ought to go never go.

reply
hluska
31 minutes ago
[-]
I used to love classes like that and now that I’m a few decades beyond university, I realize they helped me the most. That "do it properly now or everything is going to suck" pressure is good prep for the real world.
reply
JohnMakin
1 hour ago
[-]
They're only cheating themselves in a world that increasingly cares about knowledge (market trend of seniors being preferable hires to fresh out of school juniors) and not the piece of paper that "proved" you had such knowledge.
reply
bigstrat2003
18 minutes ago
[-]
I agree with you that they are cheating themselves. Unfortunately, a bunch of 18-22 year olds also don't tend to have the maturity to realize that fact. I imagine that the university is trying to nudge them to do the courses in a way that helps themselves because they know otherwise the students won't be wise enough to do that.
reply
llbbdd
1 hour ago
[-]
Agreed. I don't know how they plan to enforce this but this is way better than some other articles that have come up indicating educational bans on AI use, in-person proctoring, verbal assessments, pen and paper exams etc. This is the first attempt at an approach I've seen that doesn't seek to isolate education from reality; students that are effective at integrating AI into their work and actually understand what they're doing are going to get jobs, which is ultimately the goal of school.
reply
NickNaraghi
2 hours ago
[-]
This would be an interesting approach if the course supplied a custom Harness (perhaps in place of a textbook) and this was part of the instruction set inside of it. As a standalone thing you ask students to import into their agent, seems unlikely to work.
reply
abahgat
1 hour ago
[-]
To be fair, shipping these guidelines as AGENTS.md/CLAUDE.md in the repo that contains the assignments will make it so that agents will pick this up without needing students to opt in explicitly. Seems like a reasonable first step to me
reply
simonw
2 hours ago
[-]
Hah, I like that these are presented as a CLAUDE.md.

(They have the same content duplicated in an AGENTS.md as well - I really wish Anthropic would hurry up and teach Claude Code to check for that file too.)

reply
globular-toast
1 minute ago
[-]
> I really wish Anthropic would hurry up and teach Claude Code to check for that file too.

Surely such a trivial feature could be implemented in seconds using e.g. Claude? It's not about them not "hurrying up".

reply
israrkhan
1 hour ago
[-]
We symlink AGENTS.md and CLAUDE.md to a single file in our repo
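
A minimal sketch of the symlink approach mentioned above, assuming (hypothetically) that AGENTS.md is the real file and CLAUDE.md is the link; the comment does not say which file is canonical. The helper name `link_agent_files` is made up for this example. On Windows, creating symlinks may require extra privileges.

```python
import os
from pathlib import Path


def link_agent_files(repo: Path, canonical: str = "AGENTS.md",
                     alias: str = "CLAUDE.md") -> Path:
    """Make repo/alias a relative symlink to repo/canonical."""
    target = repo / canonical
    if not target.is_file():
        raise FileNotFoundError(target)
    link = repo / alias
    # Replace any existing file or stale link with the new symlink.
    if link.is_symlink() or link.exists():
        link.unlink()
    # Relative target, so the link still resolves if the repo directory moves.
    os.symlink(canonical, link)
    return link
```

With this in place, editing AGENTS.md changes what both file names show, so the two sets of instructions cannot drift apart.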
reply
cpeterso
1 hour ago
[-]
You can also include other md files like AGENTS.md in CLAUDE.md:

  @AGENTS.md
reply
bakugo
1 hour ago
[-]
They won't, because forcing the file to be named after their product is an intentional marketing choice. Free advertising on every repo that has it.
reply
dymk
25 minutes ago
[-]
They won’t until the winds change, and people start talking about the tradeoffs of Claude Code vs any of the other thousand good quality agent harnesses out there that recognize AGENTS.md

Opencode is good enough for most workflows IME, even if it doesn’t have the kitchen sink of features as cc

reply
matltc
2 hours ago
[-]
I wouldn't hold my breath.
reply
cush
1 hour ago
[-]
This is such a realistic balance between completely banning coding agents and embracing the spirit of higher education
reply
joshmayer
26 minutes ago
[-]
i think people out of school underestimate the power of exams. there's a huge difference in classes recently between ones with and without exams. if there is an exam, people are way more likely to study and therefore actually learn
reply
recursivedoubts
2 hours ago
[-]
I think these are based on the one I posted a while back:

https://gist.github.com/1cg/a6c6f2276a1fe5ee172282580a44a7ac

reply
alexhans
45 minutes ago
[-]
Congrats. This seems like a great prompt to ensure a useful default experience. People should not confuse this with "anti-cheating"; it is instead about helping people learn how to learn.

Do you have further insights on AI and education since?

reply
brunborg
1 hour ago
[-]
Yes absolutely! We linked your version inside the extended AI policy document, but forgot to add it to our website cs336.stanford.edu
reply
sgirard
2 hours ago
[-]
This is interesting. I don't know how the AI agent guidelines will be enforced because there will always be a model outside the curriculum that a student can use to bypass the guidelines. Encouraging academic integrity is useful but requires the student to buy into the idea that they are paying for an education, not a diploma. This is a tough problem and I have been wondering how CS departments are incorporating AI into the curriculum while encouraging appropriate use in a learning environment.
reply
earthnail
1 hour ago
[-]
Stanford has an honour code. Meant no oversight even during exams. Worked surprisingly well when I was there. The flipside is, if you’re ever caught cheating, there are no second chances.

I imagine this applies here, too, if they want to enforce it strictly.

reply
asdff
1 hour ago
[-]
>Worked surprisingly well when I was there.

How could you tell? I proctored. People cheat pretty frequently and other students are none the wiser. It really takes like 4 proctors if you want to do it right. Even then I'm sure the clever ones are slipping through. These were scantron though. Short response/essay format you'd be screwed if you didn't know your stuff.

reply
henry2023
1 hour ago
[-]
Marc Tessier-Lavigne was Stanford's president from 2016 to 2023. Not sure if the honor code means anything nowadays.
reply
shimman
1 hour ago
[-]
You mean it worked well for cheaters right? The more I learn about these "honor codes" the more I realize how sheltered these American elites have become.
reply
itopaloglu83
1 hour ago
[-]
Well, no amount of instructions would work if the student has no intention to learn anything.
reply
gchamonlive
1 hour ago
[-]
In an ideal world guidelines should be suggestions for those willing to make the best of the course and improve as a person and professional. However a degree has real world value and repercussions, so enabling someone incompetent to do a dangerous job can put innocent lives in jeopardy. It's tough, but I hope in time we learn how to live with this new tech.
reply
overgard
35 minutes ago
[-]
It's a good idea, at the very least it communicates intent to students, but couldn't students just modify CLAUDE.md and not check in their changes to that?
reply
xydac
1 hour ago
[-]
This is a very good baseline for future courses to build on. There will always be a group that wants to jailbreak this, and that's okay, but having baseline agent support for learning is needed in this AI-first world.
reply
ezfe
1 hour ago
[-]
Jailbreaking isn’t even needed - you can just modify the file
reply
walrusted
1 hour ago
[-]
I just took a C1 Spanish class and it had almost exactly the same instructions. Hmmm and I do not wonder why...
reply
ritzaco
2 hours ago
[-]
yeah I don't think that's going to work - it would be kind of like "we're releasing model answers to all assignments but please only use them as a teaching aid and don't copy from them"

best to

a) adapt assignments so that agents are bad at producing solutions

b) have more scenarios where students have to do things in controlled environments. Universities managed to adapt to 'any solution you need is readily available online' so I don't think it will be that different to have several times a month/year where students have to go into a room with nothing but pencil and paper to prove what knowledge they have vs what they have the skills to access

reply
harikb
2 hours ago
[-]
Laptop without internet access, sure. Pencil and paper? that is brutal :)
reply
dybber
32 minutes ago
[-]
20 years ago this was not unheard of. In one exam we had to translate C code to assembly for one of the exercises, convert numbers to IEEE 754 representations and similar, both tasks where access to a laptop would make it possible to cheat. Also had to modify some small computer architecture diagrams if I recall correctly.

For the linear algebra written exam it didn’t work, because if you learned to solve the previous 4 years' exams, you could be sure most of it would be familiar, so you could just prepare for a few standard exercises without really understanding the content.

Our advanced algorithm course used a bit of a combination, with a project take home exam (knapsack like optimization problem - competing for the fastest implementation) combined with a two hour written exam with multiple choice answers, but again only with books, pencil and paper to get to the right answer. This I think could work today, having both the open-ended project + some multiple choice with pencil/paper.
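
To show why a laptop would trivialize the IEEE 754 exercise mentioned above, here is a short sketch that prints the single-precision bit pattern of a number. It assumes the exam used 32-bit floats, which the comment does not state; the function name `float32_bits` is made up for this example.

```python
import struct


def float32_bits(x: float) -> str:
    """Return the IEEE 754 single-precision encoding as 'sign exponent fraction'."""
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f"{raw:032b}"
    # 1 sign bit, 8 exponent bits (bias 127), 23 fraction bits.
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


print(float32_bits(-6.25))
```

By hand, -6.25 is -1.1001 (binary) times 2^2, so the sign bit is 1, the exponent field is 127 + 2 = 129 (10000001), and the fraction starts with 1001 followed by zeros. That is the working the exam was asking for, and the function reproduces it in one call.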

reply
artificialLimbs
1 hour ago
[-]
I did most of my CS class tests this way within the last year. It’s not that bad because the prof doesn’t care so much about syntax (unless that’s what we’re being tested on, of course) and details, wanting instead to make sure we understand broader concepts.
reply
jastanton
1 hour ago
[-]
I agree it's not a complete solution. But since those don't exist, as a society we are looking for a step in the right direction, and IMO this is one such step. You may argue that it's not a very large step, but I would argue it's still in the right direction and therefore necessary, especially in the education space, and I'm happy to see someone publishing an attempt.
reply
rossant
1 hour ago
[-]
Interesting. It makes me think of the idea of fighting piracy by providing a solid legal alternative through streaming platforms, etc.
reply
georgemcbay
2 hours ago
[-]
> What AI Agents SHOULD NOT Do

> * Run bash commands

Students who prefer to use zsh keep winning.

reply
MengerSponge
57 minutes ago
[-]
zsh is fine, but I prefer fish. It has a funnier name!
reply
soldeace
1 hour ago
[-]
I'm definitely going to use a variation of it for learning new programming languages.
reply
baddash
34 minutes ago
[-]
Even though it seems radical, I think the right approach is to simply allow the students to use AI to its full potential, to generate answers, code, whatever.

The onus should be on the instructor to make sure that the student ends up actually understanding and being able to code/solve problems that they pose without using coding agents.

Why? Because:

1. this is exactly what is going on in the real world. People are able to get AI to do whatever the hell they want, but the ones who just use it lazily end up with huge cognitive debts and codebases riddled with opaque bugs that they do not understand whatsoever. If we prevent students from confronting this temptation, then we are sort of coddling or shielding them from it, and not really preparing them to avoid pitfalls of this type.

2. you can actually learn a LOT by being given the answer, if you actually care to learn. i personally think it's pretty fucking lame to handicap a student's ability to learn in an attempt to prevent lazy abuse. isn't the whole point of a grade to measure how well you understand things? can't you have pop quizzes, assignments on a computer with no agent use, written tests, etc etc. to catch the lazy abusers? this is an unnecessary prevention of lazy abuse that unfairly handicaps learning

reply
__float
27 minutes ago
[-]
> you can actually learn a LOT by being given the answer, if you actually care to learn.

Even if you "actually care to learn", this is a huge mental shortcut and you're deceiving yourself if you think deep learning is happening from looking at the answer.

On top of that, the pressures to just finish the coursework and move on to your other homework due tomorrow seems pretty high. Your suggestion means we're no longer coddling/shielding students, but we also aren't actively helping them, are we?

reply
gaiagraphia
1 hour ago
[-]
Is this all an elite educational institution with about $50bil in assets could muster, lol? This is completely and utterly unenforceable, and as such, worthless.

There really needs to be diversity in delivery styles for different modules of courses according to their aims, with 'ai access' as a key variable.

If AI is allowed, it should be based on $x of usage/student, with an audit trail to prove no external funding was used, and module aims based on using AI to the max while conserving token use. Like actually creating wild, ambitious shit which takes cutting edge services to the max.

If AI is not allowed for a module, then it really needs to go back to the old skool, with handwritten exams, or coding using old machines and textbooks. Some skills, techniques, etc, really do need drilling.

Straddling the middle will help nobody, result in accusations, increase the burden on teaching staff, and result in a course without a realistic focus.

Though I guess if you're a big brand university, you don't really need to care about innovating. The money will keep pouring in. The whole further education sector is in dire need of a shake up.

reply
chalupa-supreme
1 hour ago
[-]
I don’t really know why this is getting downvoted. It’s clear that higher education is degrading because of easy to reach AI solutions that have no type of penalties for use.

During my undergrad it was normal to see people refer to Chegg solutions to get their answers, or ask a friend for theirs.

Maybe there’s a reason my first CS professor wrote out Java code with pencil and paper I guess.

reply
brcmthrowaway
28 minutes ago
[-]
What's the estimated RoI on doing this course?
reply
farmeroy
1 hour ago
[-]
I really like this. I'm currently doing a part time BSc and my current module explicitly allows AI usage as long as you 'cite it'. The guidelines are out of date in that they assume you are using a chatbot and not a coding harness. The temptation to have claude write all my pandas code has become too difficult for my self control, but at the same time I actively feel my education is suffering from using it. As I write my final paper I am thankful that I at least despise AI writing too much to use it for the actual marked assessment, but I still feel that I have cheated myself out of part of my education and probably wasted a lot of time going fast in the wrong direction because generating data frames, graphs, statistics, etc. is just so easy with claude
reply
xyzal
1 hour ago
[-]
I am really baffled by the comments in the spirit of "this is unenforceable, and therefore worthless".

I bet most people would not steal even if they knew they could get away with it.

reply
gaiagraphia
1 hour ago
[-]
Students are struggling to get work after graduating because they're dropped into a competitive environment. Ideals aren't enough to get jobs in the current environment.

Universities should be places which are at the bleeding edge of development and providing society with the best new ideas, tech, etc. they have to offer. Junior workers should be hotbeds of exciting talent which have the ability to revolutionise industries.

By creating such milquetoast environments to study in, which are seemingly scared or unable to prepare people for the future, students are being done a disservice.

Far too many people are far too comfortable with their cushty positions, and it's not doing the youth any favours.

reply
CamperBob2
1 hour ago
[-]
I mean, some would say that's how this whole thing got started.
reply
cute_boi
2 hours ago
[-]
And yes, students are going to follow it....
reply
ChrisArchitect
1 hour ago
[-]
Related:

CS336: Language Modeling from Scratch

https://news.ycombinator.com/item?id=48357075

reply
echelon
2 hours ago
[-]
This is ridiculous. The genie is not going to go back into the bottle. This is the equivalent of "you wouldn't download a car". (Yes, we would.)

The solution is to scale the difficulty of the objective measures. Expect far more from students.

Reorient the university around physical laboratories and timesharing resources no single student could afford. It's already like this in many STEM disciplines.

More internships, more networking, more large projects. Less trivial tests of knowledge and credentialism.

reply
mi_lk
2 hours ago
[-]
good intention but useless let's be real
reply
LVB
2 hours ago
[-]
Seeing my own kids (teens) go through some of this, I'm becoming slightly less pessimistic as it all shakes out. Among their peer groups there does seem to be an opinion forming that sure, anyone can just ask ChatGPT for quick answers on assignments, but actually knowing stuff is a bit of a "flex" that's respected.
reply
datakan
1 hour ago
[-]
300 years ago when I was in high school I had a friend choose to go the HVAC trade school route instead of college. He chose the hardest school in the country where they did most things manually so that students understood how things work. It removed the "magic" some tools provide. I was pretty impressed he was wise enough to do that. He's exceptional at his job by the way.

I think we have a tendency to think the worst of young people. They frequently surprise me though.

reply
neutronicus
1 hour ago
[-]
Teens also fucking hate AI, on a cultural-ideological level.
reply
bdangubic
1 hour ago
[-]
I think they may hate what it may be doing to their future outlooks but they use it as much as they do social media
reply
neutronicus
54 minutes ago
[-]
Yeah, that's exactly why I added the second clause.

Nevertheless. The peer pressure is to be anti-AI.

reply
xiaoyu2006
2 hours ago
[-]
I always wonder why there is such a course. Using an agent AI coding tool is trivial.
reply
hn_throwaway_99
2 hours ago
[-]
When calorie dense food and gas powered vehicles came on the scene, humans (generally) got fat and out of shape. "Why eat that salad and go for a run?" one might say, "This cheesecake tastes much better and I can just drive wherever I want to go."

Getting fat is one thing, but getting stupid is another, and I really fear for the future of humanity when it becomes so easy to sidestep the processes that let us actually learn and grow because stuff like "using agent ai coding is trivial".

reply
gaiagraphia
1 hour ago
[-]
There's different skills at play, and they're both as valuable as each other.

They shouldn't be thrown into a big soup with shaky aims.

We still - as a society - manage to have PE and driving as different subjects. The same can equally apply here.

reply
walrusted
1 hour ago
[-]
using a coding tool is trivial, correct. so is using a microwave oven or its larger counterparts. you need a certain level of person to know if what came out of it was Michelin-star or not and I do not think Stanford is going for Hot Pockets here.
reply
lukeigel
1 hour ago
[-]
Pangram reports as 100% AI generated. Makes sense for a README, but a tad bit funny given that their students must hand-write code
reply
cba_wllm
32 minutes ago
[-]
Stanford is industry compliant and teaches the youth to outsource their thinking to BigTech. No surprise here, the donors will be happy.

The "but we do not let them write code directly" is a smoke screen to appease critics and parents. Yes, hello parents, you pay for your offspring to become a mindless industry tool.

reply
londons_explore
21 minutes ago
[-]
As an employer, I want AI to be fully allowed for assignments, and the assignments to be made trickier to compensate.

Let's train people to use all the tools available to solve the hardest problems, rather than solving toy problems with a slide rule.

reply
jimbokun
10 minutes ago
[-]
You have to balance that with teaching the skills needed to understand the domain sufficiently to take over when the model gets things wrong.
reply
unglaublich
14 minutes ago
[-]
What a ridiculous take. Just because solutions exist in the LLM training data doesn't make these problems 'toy' or 'easy'. The human 'engineering hardness' scale doesn't align with what an LLM can and can't do.
reply