Giving university exams in the age of chatbots
150 points | 6 hours ago | 22 comments | ploum.net
zahlman
5 minutes ago
[-]
> I realized that my students are so afraid of cheating that they mostly don’t collaborate before their exams! At least not as much as what we were doing.

This is radically different from the world that's been described to me. Even 20 years ago cheating was endemic and I've only heard of it getting worse.

reply
knallfrosch
4 hours ago
[-]
I don't understand.

10 years ago, we wrote exams by hand with whatever we understood (in our heads).

No colleagues, no laptops, no internet, no LLMs.

This approach still works, why do something else? Unless you're specifically testing a student's ability to Google, they don't need access to it.

reply
recursivedoubts
17 minutes ago
[-]
I am returning to this model in my classes: pen-and-paper quizzes, no digital devices. I also give seven equally weighted quizzes to lower the stakes of any single one. I have reduced the project/programming weight from 60-80% of the grade to 50%, because it is no longer possible to tell whether the students actually did the work.
reply
bArray
19 minutes ago
[-]
> This approach still works, why do something else?

One issue is that the time provided to mark each piece of work continues to decrease. Sometimes you are only getting 15 minutes for 20 pages, and management believe that you can mark back-to-back from 9-5 with a half-hour lunch. The only thing keeping people sane is the students who fail to submit, or who submit something obviously sub-par. So where possible, even when designing exams, you try to limit text altogether: multiple choice, drawing lines, a basic diagram, a calculation, etc.

Some students have terrible handwriting. I wouldn't be against the use of a dumb terminal in an exam room/hall. Maybe in the background it could be syncing the text and backing it up.

> Unless you're specifically testing a student's ability to Google, they don't need access to it.

I've been the person testing students, and I don't always remember everything. Sometimes it is good enough for students to demonstrate that they understand the topic well enough to know where to find the correct information, based on good intuition.

reply
Balgair
7 minutes ago
[-]
I want to echo this.

Your blue book is being graded by a stressed out and very underpaid grad student with many better things to do. They're looking for keywords to count up, that's it. The PI gave them the list of keywords, the rubric. Any flourishes, turns of phrase, novel takes, those don't matter to your grader at 11 pm after the 20th blue book that night.

Yeah sure, that's not your school, but that is the reality of ~50% of US undergrads.

reply
nikitau
3 hours ago
[-]
Open book exams are not a new thing and I've often had them for STEM disciplines (maths and biology). Depending on the subject, you will often fail those unless you had a good prior understanding of the material.

If you can pass an exam just by googling something, it means you're only testing rote memorization, and maybe a better design is needed, one where synthesis and critical-thinking skills are evaluated more actively.

reply
HWR_14
3 hours ago
[-]
Open book, sure. But you don't even need a computer for that.
reply
freehorse
3 hours ago
[-]
And even if you are allowed to use a computer, you cannot use the internet (and it should not be hard to prevent that).
reply
croes
2 hours ago
[-]
Local LLMs
reply
DonHopkins
2 hours ago
[-]
Be sure to bring an extra power strip for all your plugs and adaptors.

https://www.tomshardware.com/pc-components/gpus/tiny-corp-su...

reply
lostmsu
2 hours ago
[-]
My laptop runs gpt-oss 120B with none of that. I don't know for how long, though; I suspect a couple of hours of continuous use.
reply
HWR_14
2 hours ago
[-]
Which laptop?
reply
lostmsu
1 hour ago
[-]
ROG Flow Z13 with maxed-out RAM.
reply
n4r9
1 hour ago
[-]
One potential answer is that this tests more heavily for the ability to memorise, as opposed to understanding. My last exams were over ten years ago and I was always good at them because I have a good medium-term memory for names and numbers. But it's not clearly useful to test for this, as most names and numbers can just be looked up.

When I was studying at university there was a rumour that one of the dons had scraped through his fourth-year exams despite barely attending lectures, because he had a photographic memory and just so happened to leaf through a book containing a required proof the night before the exam. That gave him enough points despite not necessarily understanding what he was writing.

Obviously very few students have that sort of memory, but it's not necessarily fair to give an advantage to those like me who can simply remember things more easily.

reply
Almondsetat
32 minutes ago
[-]
Have you ever seen a programmer who really understands C going to Stack Overflow every time they have to use fopen()? Memorization is part of understanding. You cannot understand something without it being readily available in your head.
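
For illustration, the interface in question, as a minimal sketch in standard C (the file name is made up):

    #include <stdio.h>

    int main(void) {
        /* mode "r": open an existing file for reading; fopen returns NULL on failure */
        FILE *f = fopen("notes.txt", "r");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }

        char line[256];
        /* fgets reads at most sizeof line - 1 characters, stopping at newline or EOF */
        while (fgets(line, sizeof line, f) != NULL) {
            fputs(line, stdout);
        }

        fclose(f);
        return 0;
    }

The signature is small enough that anyone who uses it regularly ends up knowing it by heart, which is rather the point.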
reply
n4r9
9 minutes ago
[-]
Yes and no. I mean, fopen is a fairly simple interface, so there's not much memorisation needed in the first place. But there are many cases where you can forget the names of things so long as you understand the fundamental structures and how to look up names when necessary. For example, I might remember that there is some way to get a unified diff of stashed changes in git, but I have to look up that this needs a -p parameter. It's pointless for me to spend time trying to memorise that it's a p.
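
For the record, the commands in question, assuming stock git (the last line just illustrates addressing an older entry):

    # diffstat summary of the most recent stash
    git stash show

    # full unified diff: the -p ("patch") flag is the part worth looking up
    git stash show -p

    # the same, for an older stash entry
    git stash show -p stash@{1}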
reply
zahlman
16 minutes ago
[-]
When I was in university, in my program, the most common format was that you were allowed to bring in a single page of notes (which you prepared ahead of time based on your understanding of what topics were likely to come up). That seemed to work fine for everyone.
reply
technothrasher
1 hour ago
[-]
> because he had a photographic memory and just so happened to leaf through a book containing a required proof

It makes for good rumours and TV show plots, but this sort of "photographic memory" has never been shown to actually exist.

reply
n4r9
44 minutes ago
[-]
Huh, TIL [0]. Thanks. There are people who can perform extraordinary memory feats, but they're very rare and/or self-trained.

[0] https://skeptoid.com/episodes/542

reply
willis936
2 hours ago
[-]
I was in university around the same time. While there, I saw a concerted effort to push online courses. Professors would survey students, fishing for interest. It was unpopular. To me the motivation seemed clear: charge the same or more for tuition, but reduce opex. Maybe even admit more students and just have them be remote. It watered down the value of the degree while working towards a worse product. Why would a nonprofit public university be working on maximizing profit?
reply
ecshafer
43 minutes ago
[-]
I had some take-home exams in physics where you could use the internet, books, anything except other people (though that was honor-code based). Those were some of the hardest exams I ever took in my life: pages and pages of mathematical derivations. An LLM, given how good they are at constructing mathematics, would actually have handled those pretty well.
reply
everdrive
2 hours ago
[-]
People really struggle to go back once a technology has been adopted. I think for the most part, people cannot really evaluate whether or not the technology is a net positive; the adoption is more social than it is rational, and so it'd be like asking people to change their social values or behaviors.
reply
SwtCyber
3 hours ago
[-]
I think the key difference is what you're trying to measure
reply
zavec
3 hours ago
[-]
It was the same when I graduated 6 years ago. We had projects to test our ability to use tools and such, and I guess in that context LLMs might be a concern. But exams were pencil and paper only.
reply
tpoacher
3 hours ago
[-]
Optics.
reply
quacked
4 hours ago
[-]
Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"

For most of us--myself included--once you graduate from college, the answer is: "enough to not get fired". This is far less than most curriculums ask you to know, and every year, "enough to not get fired" is a lower and lower bar. With LLMs, it's practically on the floor for 90% of full-time jobs.

That is why I propose exactly the opposite regimen from this course, although I admire the writer's free thinking. Return to tradition, with a twist. Closed-book exams, no note sheets, all handwritten. Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage "completionist mindset", where the turning-in of the assignment feels more real than understanding the assignment. Publish problem sets thousands of problems large, with worked-out solutions, to remove the incentive to cheat.

"Memorization is a prerequisite for creativity" -- paraphrase of an HN comment about a fondly remembered physics professor who made the students memorize every equation in the class. In the age of the LLM, I suspect this is triply true.

reply
themafia
4 hours ago
[-]
> once you graduate from college, the answer is: "enough to not get fired"

I thought the point was to continue in the same vein and contribute to the sum total of all human knowledge. I suppose this is why people criticize colleges as having lost their core principles and simply responded to market forces to produce the types of graduates that corporate America currently wants.

> "enough to not get fired" is a lower and lower bar.

Usually people get fired for their actions, not their knowledge or lack thereof. It may be that David Graeber's core thesis was correct: most jobs are actually "bullshit jobs," and in the era of the Internet, they don't actually require any formal education to perform.

reply
qsort
4 hours ago
[-]
> Closed-book exams, no note sheets, all handwritten. Add a verbal examination

You are describing how school worked for me (in Italy, but much of Europe is the same I think?) from middle school through university. The idea of graded homework has always struck me as incredibly weird.

> In the age of the LLM, I suspect this is triply true.

They do change what is worth learning though? I completely agree that "oh no the grades" is a ridiculous reaction, but adapting curricula is not an insane idea.

reply
mnky9800n
2 hours ago
[-]
I had an electrodynamics professor say that there was no reason to memorize the equations; you would never remember them anyway. The goal was to understand how the relationships were formed in the first place, so that you would understand the relationship each equation represents. That, I think, is the basis for this statement: memorization of the equations gives you a basis for understanding the relationships. So I guess the hope is that this is enough. I would argue it isn't, since physics isn't really about math or equations; it's about the structure and dynamics of how systems evolve over time. Equations give one representation of the evolution of those systems, but not the only one.
reply
dns_snek
3 hours ago
[-]
> Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage "completionist mindset"

To the horror of anyone struggling with anxiety, ADHD, or any other source of memory-recall issues under examination pressure. This further optimizes everything for students who can memorize and recall information on the spot under artificial pressure, and who don't suffer from any of the problems I mentioned.

In grade school you could put me on the spot and I would blank on questions about subjects that I understood rather well and that I could answer 5 minutes before the exam and 5 minutes after the exam, but not during the exam. The best way for me to display my understanding and knowledge is through project assignments where that knowledge is put to practical use, or worked "homework" examples that you want to remove.

Do you have any ideas for accommodating people who process information differently and find it easier to demonstrate their knowledge and understanding in different ways?

reply
ecshafer
30 minutes ago
[-]
Maybe those people just won't get as good grades, and that's acceptable. It is strange that the educational system decided it wasn't. If I go to a university and try to walk onto the NCAA Division 1 basketball team, it's fine for them to tell me that I am too short, too slow, too weak, can't shoot, or that my performance anxiety means I mess up every game, and I am off the team. If I try to go for art but my art is bad, I am rejected. If I try to go for music but my performance anxiety messes up my performances, I am rejected.

Why ought there to be an exception for academics? Do you want your lawyer or surgeon to have performance anxiety? This seems like a perfectly acceptable thing to filter on.

reply
TheOtherHobbes
3 hours ago
[-]
The question is no longer "How do we educate people?" but "What are work and competence even for?"

The culture has moved from competence to performance. Where universities used to be a gateway to a middle-class life, now they're a source of debt. And social performances of all kinds are far more valuable than the ability to work competently.

Competence used to be central, now it's more and more peripheral. AI mirrors and amplifies that.

reply
6LLvveMx2koXfwn
4 hours ago
[-]
This is all very well if the goal were to sift the wheat from the chaff - but modern western education is about passing as many fee-paying students as possible, preferably with a passably enjoyable experience, for the institutional kudos.
reply
sersi
3 hours ago
[-]
I think that really depends on the country. I went to an engineering school where only 15% of applicants out of high school were admitted, and of those who were admitted only around 75% graduated.

Western education passing as many fee-paying students as possible seems to be very much a UK/US phenomenon; it doesn't seem to be the case in European countries where the best schools are public and fees are very low (in France, private engineering schools rank lower).

reply
josephg
4 hours ago
[-]
I wonder if education will bifurcate back out as a result of AI. Small, bespoke institutions which insist on knowledge and difficult tests. And degree factories. It seems like students want the degree factory experience with the prestige of an elite institution. But - obviously - that can’t last long. Colleges and universities should decide what they are and commit accordingly.
reply
graemep
3 hours ago
[-]
I think the UK has been heading this way for a while -- since before AI. It's not the size of the institutions that has changed, but the "elite" universities tend to give students more individual attention. A number of them (not just Oxford and Cambridge) have tutorial systems where a lot of learning is done in a small group (usually two or three students). They have always done this.

At the other extreme are universities offering low-quality courses that are definitely degree factories. They tend to have a strong vocational focus, but nonetheless they are not effective in improving employability. In the last few decades we have expanded the university system, and there are now far more of these.

There is no clear cutoff and a lot of variation in between, so it's not quite a bifurcation, but the quality vs factory difference is there.

reply
Ekaros
3 hours ago
[-]
On the other side, in Western systems funded by taxes, the incentive is still to give out as many degrees as possible, as schools get funding based on degrees produced.

Mostly done to get more degree holders, who are seen as "more productive". Or at least higher paid...

reply
SwtCyber
3 hours ago
[-]
What I like about the approach in the article is that it confronts the "why should I know this?" question directly. By making students accountable for reasoning (even when tools are available) it exposes the difference between having access to information and having a mental model
reply
casualscience
4 hours ago
[-]
Honestly, I feel like I have to know more and more these days, as the AIs have unlocked significantly more domains that I can impact. Everyone is contributing to every part of the stack in the tech world all of a sudden, and "I am not an expert on that piece of the system" is no longer a reasonable position.

This is in tech now; we're the first adopters, but soon it will come to other fields.

To your broader question

> Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"

You should know things because these AIs are wrong all the time, because if you want any control in your life you need to be able to make an educated guess at what is true and what isn't.

As to how to teach students: I think we're in an age of experimentation here. I like the idea of letting students use all the tools available for the job. But I also agree that if you do give exams and homework, you'd better make them handwritten/oral only.

Overall, I think education needs to focus more on building portfolios for students, and less on giving them grades.

reply
monsieurbanana
4 hours ago
[-]
> and "I am not an expert on that piece of the system" is no longer a reasonable position

Gosh, that sounds horrifying. I am not an expert on that piece of the system; no, I do not want to take responsibility for whatever the LLMs have produced for that piece of the system; I am not an expert and cannot verify it.

reply
dyauspitr
4 hours ago
[-]
This is like the Indian education system and presumably other Asian ones. Homework counts for very little towards your grade. 90% of your grade comes from the midterms and the finals. All hand written, no notes, no calculators.
reply
keybored
3 hours ago
[-]
That’s a terrible indictment of society if true. People are so far from self-realization, so estranged from their natural curiosity, that there is no motivation to learn anything beyond what will get you fed and housed. How can anyone be okay with that? Because even most chronically alienated people have had glimpses of self-actualization, of curiosity, of intrinsic motivation; most have had times when they were inspired to use the intellectual and bodily gifts that nature has endowed them with.

But the response to that will be further beatings until morale improves.

What about technology professionals? From my biased reading of this site alone: both further beatings and pain relievers in the form of even more dulling and pacifying technology. Followed by misanthropic, thought-terminating clichés: well, people are inherently dumb/unmotivated/unworthy, so the topic is not really worth our genuine attention; furthermore, now with LLMs, we are seeing just how easy it is to mimic these lumps of meat; in fact they can act both better and more pathetically than human meat bags, you just have to adjust the prompts...

reply
nemomarx
1 minute ago
[-]
People who aren't fed and employed generally struggle to be self-actualized, right? First you need to work for your supper; then you can focus on learning for its own sake.

As more jobs started requiring degrees, the motivation has to change. If people can get food and housing without a degree again, to a comfortable extent, then the type of person getting a degree will change again too.

If you let them, they'll alienate you until you have no free time and no space for rest or hobbies or learning. Labour movements had to fight hard to end the 60-hour workweek, but we're creeping back up from 40, right?

reply
emil-lp
5 hours ago
[-]
> Most Students Don’t Want to Use Chatbots

I think this is changing rapidly.

I'm a university professor, and the number of students who seem to need an LLM as a crutch is growing really exponentially.

We are still in a place where the oldest students did their first year completely without LLMs. But younger students have used LLMs throughout their studies, and I fear that in the future, we will see full generations of students completely incapable of working without LLM assistance.

reply
jcattle
2 hours ago
[-]
I think the professor here presented them with a "special" case which cannot be generalized outside of the exam context.

If you're presented with the choice between "Don't use AI" and "Use AI, but live with the consequences" (consequences like mistakes being judged more harshly when made with AI than without), I do not think chatbots will be a desirable choice if you've properly prepared for the exam.

reply
raesene9
4 hours ago
[-]
Reading the article, it seemed to me that both the professor and the students were interested in the material being taught and therefore actively wanted to learn it, so using an LLM isn't the best tactic.

My feeling is that for many/most students, getting a great understanding of the course material isn't the primary goal, passing the course so they can get a good job is the primary goal. For this group using LLMs makes a lot of sense.

I know when I was a student doing a course I was not particularly interested in because my parents/school told me that was the right thing to do, if LLMs had been around, I absolutely would have used them :).

reply
Jedd
1 hour ago
[-]
> ... is growing really exponentially.

Or geometrically?

reply
teaearlgraycold
5 hours ago
[-]
It will be very interesting to see what happens when LLM providers start charging users the true cost. With many people priced out, how would they cope?
reply
ben_w
4 hours ago
[-]
May happen, but I suspect not in the way implied by that question.

Hardware is still improving, though not as fast as it used to; it's very plausible that even the current largest open-weights models will run on affordable PCs and laptops in 5 years, and on high-end smartphones in 7.

I don't know how big the SOTA closed-weights models are; that may come later.

But: to the extent that a model that runs on your phone can do your job, your employer will ask "why are we paying you so much?" and now you can't afford the phone.

Even if the SOTA is always running ahead of local models, Claude Code could cost 1500 times as much and still have the average American business asking "So why did we hire a junior? You say the juniors learn when we train them, I don't care, let some other company do that and we only hire mid-tier and up now."

(The threshold is less than 1500 elsewhere; I just happened to have recently seen the average US pay for junior-grade software developers, $85k*, which makes the tool roughly 350x cheaper, plus my own observation that the models are not only junior quality but also much faster at producing output than a junior.)

* but note that while looking for a citation, the search results made claims varying from $55k to $97.7k
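
Rough arithmetic behind the ~350x figure, assuming a $20/month subscription tier (my assumption, not a quoted price):

    $20/month x 12 months = $240/year
    $85,000 / $240 ≈ 354, i.e. roughly 350x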

reply
alphabetag675
4 hours ago
[-]
They would fall behind in the world just like people from developing and poor countries do today.
reply
josephg
4 hours ago
[-]
Very few people fall behind at the moment due to lack of access to information. People in poor countries largely have access to the internet now. It doesn’t magically make people educated and economically prosperous.
reply
alphabetag675
3 hours ago
[-]
You are arguing the converse. Access to information doesn't make people educated, but lack of access definitely puts people at a big disadvantage. Chatbots are not just information; they are tools, and using them takes training because they hallucinate.
reply
KeplerBoy
4 hours ago
[-]
It's not that expensive unless you run millions of tokens through an agent. For use cases where you actually read all the input and output by yourself (i.e. an actual conversation), it is insanely cheap.
reply
tucnak
2 hours ago
[-]
Yeah in my last job, unsupervised dataset-scale transformations amounted to 97% of all spending. We were using gemini 2.5 flash in batch/prefill-caching mode in Vertex, and always the latest/brightest for ChatGPT-like conversations.
reply
johndough
3 hours ago
[-]
What do you think the "true cost" is?
reply
themafia
4 hours ago
[-]
Google destroyed search and replaced it with that dippy LLM box.

Are you sure student desire is the driving force here?

reply
kubb
5 hours ago
[-]
Please, you don’t need to counter-narrative everything. Maybe talk about what the professor did here and why students didn’t trust the output in an exam context in this particular subject.
reply
nicce
4 hours ago
[-]
> Second, I learned that cheating, however lightly, is now considered a major crime. It might result in the student being banned from any university in the country for three years. Discussing exam with someone who has yet to pass it might be considered cheating. Students have very strict rules on their Discord.

This also has something to do with it. Hard to draw very accurate conclusions.

reply
witcher
5 hours ago
[-]
Quite a thoughtful way to adapt exams to the wave of new tools for students, and to learn along the way.

I wish other universities adapted so quickly too (and had such a mindful attitude toward students, e.g. trying to understand them, being upfront with expectations, learning from students, etc.).

The majority of professors are stressed and treat students as idiots... at least that was the case a decade ago!

reply
ploum
4 hours ago
[-]
OP here: The majority of professors became professors because they were very good at passing standard exams (and, TBH, some are not good at anything else).

I'm different because I was a bad student. I only managed to get my diploma with minimal grades, always rebelling against everything. But some good people at my university thought that Open Source was really important and that they needed someone with a good career in that field. I was that person (and I'm really thankful they offered me that position).

reply
lucb1e
5 hours ago
[-]
> 3. I allow students to discuss among themselves [during an exam] if it is on topic.

Makes me wonder if they should also get a diploma together then, saying "may not have the tested knowledge if not accompanied by $other_student"

I know of some companies that support hiring people as a team (either all or none get hired and they're meant to then work together well), so it wouldn't necessarily be a problem if they wish to be a team like that

reply
ploum
4 hours ago
[-]
OP here: I teach Open Source Strategies.

The main strategy is collaboration. If you are smart enough to:

1. Identify your problem
2. Ask someone about it
3. Get an answer which improves your understanding

Then you are doing pretty well by all standards.

Another trick I sometimes use: I take one student who has a hard time articulating a concept, and a second student who doesn't understand that concept. I say to student 1: "You have 20 minutes to teach student 2 the concept. When I come back, you will be graded according to his answers."

(I, of course, don't grade only that. But it forces both of them to make an extra effort, student 2 not wanting to be the cause of student 1's demise.)

reply
zahlman
14 minutes ago
[-]
> student 2 not wanting to be the cause of student 1's demise

I would very much not count on that.

reply
witcher
5 hours ago
[-]
Yeah, curious too about some more rules, e.g. both parties have to contribute to the discussion (:
reply
kubb
5 hours ago
[-]
I think we should send all diplomas to OpenAI and end higher education.

Less educated people are easier to steer via TikTok feeds anyway.

reply
elbci
5 hours ago
[-]
Ha ha, fair enough - but he does mention there's a culture of isolation and cut-throat competition at the school, so maybe it's just a reaction to that.
reply
palijer
53 minutes ago
[-]
I'm back in school part-time for a bachelor's, and recently had a class with a professor who really understood how to incorporate LLMs into the class.

Our written assignments were a lot of "have an LLM generate a business proposal, then annotate it yourself"

The final exam was a 30-minute meeting where we just talked as peers, kind of like a cultural job interview. Sure, there's lots of potential for bias there, but I think it's better than just blindly passing students who use LLMs for the final exam.

reply
shevy-java
1 hour ago
[-]
He mostly describes a process where the exam itself, or rather testing the knowledge of a student, is not so important.

I think not all exams can work like that. In some cases you just have to test someone's knowledge of a specific topic, and knowing facts is a very easy way to test this. I would agree that focusing purely on facts is overrated these days, but I would still argue it is not a useless metric. So when the author describes "bring your own exam questions", it really means that the exam itself is not so relevant, which is fine; but saying that university exams are now useless in the age of auto-solving chatbots is simply wrong. It just means that this particular exam is not so important; that in itself does not automatically mean that ALL exams or exam styles are useless. Also, it depends on what you test. For instance, solving math questions: yes, chatbots can solve them, but can a student solve the same problems without a chatbot? How about practical skills? OK, 3D printing will dominate, but the ability to craft something with your own hands is still a skill that may be useful, at least to some extent.

I feel that the whole discussion about chatbots gets dumbed down a lot. Skills have not become irrelevant just because chatbots exist.

reply
fexed
2 hours ago
[-]
What a wonderful article, and what a wonderful way of engaging with students and adapting to the new tech. I wish all professors were like you.
reply
barbegal
5 hours ago
[-]
Only 2 students actually used an LLM in his exam, one well and one poorly, so I'm not sure there is much you can draw from this experience.

In my experience LLMs can significantly speed up the process of solving exam questions. They can surface relevant material I don't know about, they can remember how other similar problems are solved far better than I can, and they can check for mistakes in my answers. Yes, when you get into very niche areas they start to fail (and often in a misleading way), but if you run through practice papers at all you can tell this and either avoid using the LLM or do some fine-tuning on past papers.

reply
diamondgeezer
5 hours ago
[-]
Very interesting write up, would be curious to know more about what an Open Source Strategies course entails, as far as I can remember I never had anything like that on offer at my university.
reply
pautasso
4 hours ago
[-]
The problem is when students just blindly copy and paste from the chatbot and submit it as their own answer without even reading it.

They should be encouraged to read and review the LLM output so they can critically understand it and take ownership of it.

reply
themafia
3 hours ago
[-]
They should be encouraged to not turn in casual plagiarism as their own work.

I believe there is a mechanism for this already.

reply
dyauspitr
4 hours ago
[-]
In my experience, reading a solution and even understanding it doesn't go very far in teaching you how to do something. I can look at calculus solutions all day, but only when I actually try to solve problems myself do I run into all kinds of roadblocks, which is where the real learning happens.
reply
mkirsten
5 hours ago
[-]
Interesting write-up! I've thought about how university exams can be done effectively nowadays. I took my degree in CS almost 20 years ago, and as a user of LLMs, I can't really see how any of my old exams would work today if students were allowed LLMs.
reply
brainwad
5 hours ago
[-]
I graduated 15 years ago, and I think the exams in my degree were actually the most LLM-proof part of the student assessment. They were no-aid written exams with pencil and paper, whereas the assignments were online-submitted code only, which an LLM could easily write.
reply
emil-lp
5 hours ago
[-]
Spoiler: they don't.

CS exercises that we can expect an average student to solve are trivially solved by LLMs. Even by smaller local models.

reply
Ekaros
3 hours ago
[-]
This comes down not to the "smartness" of LLMs, but to the reality that we do not even want anything novel in these exercises or exams. And the same areas are repeated multiple times, so naturally there is a lot of this in the training data.

This is one area where LLMs really should excel. And that doesn't mean students should not also learn the material and be able to solve the same problems themselves. Which is a real dilemma for the school system...

reply
bandrami
3 hours ago
[-]
Except sometimes they aren't solved by LLMs, but only appear to be. A CS student should be able to look at the output and tell the difference.
reply
SwtCyber
3 hours ago
[-]
If anything, this reinforces the idea that chatbots don't fundamentally change education... they just amplify whatever incentives and structures already exist
reply
yazantapuz
1 hour ago
[-]
Paper and pencil.
reply
burgerone
5 hours ago
[-]
I wish we could take our exams this way. It seems like a very interesting approach :)
reply
Zababa
2 hours ago
[-]
> Mistakes made by chatbots will be considered more important than honest human mistakes, resulting in the loss of more points.

> I thought this was fair. You can use chatbots, but you will be held accountable for it.

So you're actually held more accountable for the output? I'd be interested in how many students would choose to use LLMs if faults weren't penalized more.

reply
jcattle
2 hours ago
[-]
I thought this part especially was quite ingenious.

If you have this great resource available to you (an LLM), you'd better show that you read and checked its output. If there's something in the LLM output you don't understand or can't check to be true, you'd better remove it.

If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote it. If there's something flawed in an LLM answer, the likelihood that you have no justification except "the LLM said so" is quite high, and it should thus be penalized more heavily.

One shows a misunderstanding, the other doesn't necessarily show any understanding at all.

reply
Zababa
1 hour ago
[-]
> If you have this great resource available to you (an LLM), you'd better show that you read and checked its output. If there's something in the LLM output you don't understand or can't check to be true, you'd better remove it.

You could say the same about what people find on the web, yet LLMs are penalized more than web search.

> If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote it. If there's something flawed in an LLM answer, the likelihood that you have no justification except "the LLM said so" is quite high, and it should thus be penalized more heavily.

Swap "LLMs" for "websites" and you could say the exact same thing.

The author has this in their conclusions:

> One clear conclusion is that the vast majority of students do not trust chatbots. If they are explicitly made accountable for what a chatbot says, they immediately choose not to use it at all.

This is not true. What is true is that if students are more accountable for their use of LLMs than for their use of websites, they prefer using websites. How much "more"? We have no idea; the author doesn't say. It could be that an error from a website or your own mind costs -1 point and one from an LLM costs -2, so LLMs have to make half as many mistakes as websites or your own mind. It could be -1 and -1.25. It could be -1 and -10.

The author even says themselves:

> In retrospect, my instructions were probably too harsh and discouraged some students from using chatbots.

But they don't note the bias their grading scheme introduced against LLMs.

reply
anal_reactor
1 hour ago
[-]
> I was completely flabbergasted because, to me, discussing "What questions did you have?" was always part of the collaboration between students

When I was a student, professors maintained a public archive of past exams. The reason was obvious: next time the questions would be different, and memorizing past answers wouldn't help you if you didn't understand the core ideas being taught. Then I took part in an exchange program, went to some shit-tier uni, and realized that collaboration there was explicitly forbidden, because professors would usually ask questions along the lines of "what was on slide 54". My favorite part was when a professor said "I can't publish the slides online because they're stolen from another professor, but you can buy them in the faculty's shop".

My uni maintained a giant presence on Facebook - we'd share a lot of information, and the most popular group was "easy courses" for students who wanted to graduate but couldn't afford a difficult elective course.

The exchange uni had none of that. Literally no community, no collaboration, nothing. It's astonishing.

BTW, regarding the stream of consciousness: I distinctly remember taking an exam and doing my best to force my brain to think about the exam questions rather than the porn I had been watching the previous day.

reply
Joel_Mckay
4 hours ago
[-]
"Marking Exam Done by A.I." (Sixty Symbols)

https://www.youtube.com/watch?v=JcQPAZP7-sE

LLM reasoning models are very good at searching well-documented problems. =3

reply
bsder
3 hours ago
[-]
> We were imposed GitHub for so many exercises!

I'm sympathetic to both sides here.

As a professor who had to run Subversion for students (a bit before Git et al.), I can say it's a nightmare to put the infrastructure together, keep it reliable under spiky loads (there is always a crush at the deadline), act as customer support for students who manage to do something weird or lose their password, etc. You wind up spending a non-trivial amount of time being sysadmin for the class on top of your teaching duties. Being able to say "Put it on GitHub" short-circuits all of that. It sucks, but it makes life a huge amount easier for the professor.

From the students' point of view, sure, it sucks that nobody mentioned that Git could be used independently (or jj, or Mercurial, or ...). However, GitHub is going to be better than what 99.9% of professors will put together or be able to use. Sure, you can use Git by itself, but then it needs to go somewhere the professor can look at it, get submitted to automated testing, etc. That's not a trivial step. My students were happy that I had the "Best Homework Submission System" (said about Subversion, of all things ...) because everybody else used the dumbass university enterprise thing that was completely useless (not going to mention its name because it deserves to die in the blazing fires of the lowest circle of Hell). However, it wasn't straightforward for me to put that together. And the probability of getting a professor with my motivation and skill is pretty low.

reply
i_am_proteus
1 hour ago
[-]
Agree about the possibility of infra nightmare, especially in the "SVN era" -- but in 2026, it's pretty straightforward to run a gitlab instance (takes about an hour to set up, most of which is DNS and TLS stuff, ime) for a course and set up actions, or use other submission infra like CMU autolab. I do this.

Agree with your comment about probability, motivation, and skill.

reply
elbci
5 hours ago
[-]
A rarity here: well written and insightful; I would take this course. I'm curious why he penalized chatbot mistakes more. At first glance it sounds like just discouraging their use, but the whole setup indicates a genuine desire to let it be a possibility. In my mind the rule should be "same penalty, and extra super cookies for catching chatbot mistakes".
reply
jcattle
2 hours ago
[-]
I wrote this before in reply to another comment like yours:

I thought this part (penalizing mistakes made with the help of LLMs more heavily) was quite ingenious.

If you have this great resource available to you (an LLM), you'd better show that you read and checked its output. If there's something in the LLM output you don't understand or can't check to be true, you'd better remove it.

If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote it. If there's something flawed in an LLM answer, the likelihood that you have no justification except "the LLM said so" is quite high, and it should thus be penalized more heavily.

One shows a misunderstanding, the other doesn't necessarily show any understanding at all.

reply
qwertytyyuu
4 hours ago
[-]
Here is my guess: usually marks are given for partially correct answers, partly to be less punishing of human error caused by stress or other factors; there's a good chance the student understood the topic. If instead they are using a chatbot and didn't catch the mistake themselves, that indicates less understanding, and it is marked accordingly.
reply
veltas
4 hours ago
[-]
> The third chatbot-using student had a very complex setup where he would use one LLM, then ask another unrelated LLM for confirmation. He had walls of text that were barely readable. When glancing at his screen, I immediately spotted a mistake (a chatbot explaining that "Sepia Search is a compass for the whole Fediverse"). I asked if he understood the problem with that specific sentence. He did not. Then I asked him questions for which I had seen the solution printed in his LLM output. He could not answer even though he had the answer on his screen.

Is it possible, and this is an interesting one to me, that this is the smartest kid in the class? I think maybe.

That guy who is playing with the latest tech, forcing it to do the job (badly), and couldn't care less about university or the course he's on. There's a time and a place where that guy is the one you want working for you. Maybe he's not the number 1 student, but I think there should be some room for this to be the Chaotic Neutral pick.

reply
watwut
4 hours ago
[-]
> Is it possible, and this is an interesting one to me, that this is the smartest kid in the class? I think maybe.

He might as well be the dumbest guy in the class. Playing with tech is not proof of being smart in itself.

reply