Exams in the Age of AI: What Are We Really Testing?
I asked this question in my notes last week and got some interesting answers, so I decided to write a post about it: "I just saw in the news that professors are going back to oral exams to prevent students from using AI. What do you think about this?" What surprised me wasn't just the range of opinions, but how each response touched a different layer of the same problem. As a data scientist, I can't help but see this less as an "AI cheating" issue and more as a signal that our evaluation systems are being stress-tested by a new variable they weren't designed for.
One of the comments mentioned that many STEM exams in Russian universities have traditionally been oral and extremely difficult. That resonated with me. Oral exams force you to expose your reasoning in real time; there's no hiding behind polished output. You can't just arrive at the right answer; you have to justify how you got there, respond to follow-up questions, and adapt when challenged. From a data science perspective, this is closer to how expertise actually shows up in practice. In real work, nobody hands you a perfectly phrased problem and unlimited time to respond. You're constantly asked to explain assumptions, defend trade-offs, and revise your thinking on the fly. Oral exams, when done well, measure understanding rather than pattern reproduction.
Another comment pointed out that this doesn't mean students can't use AI; it just means they still need to learn the material well enough to speak about it. I think that distinction is crucial. AI is already a learning tool, whether institutions like it or not. Trying to ban it entirely is like trying to ban calculators after they already exist. The more interesting question is how we design assessments so that AI use shifts from shortcut to scaffold. If a student uses AI to study, practice explanations, or test their understanding, and then has to demonstrate that understanding orally, that's not a failure of education; it's arguably a better alignment between tools and outcomes.
There was also a comment praising oral exams for valuing communication skills, which are undeniably important in the real world. As someone working in data science, I can say this is true but incomplete. Yes, being able to explain complex ideas clearly is a career advantage. But another comment rightly raised concerns about accessibility: oral exams can be a real barrier for people with speech impediments, neurodivergence, anxiety, or even just discomfort hearing their own voice. From a systems perspective, this matters. If an assessment method systematically disadvantages certain groups, then its signal is noisy. You're no longer measuring "who understands the material"; you're measuring "who performs best under a very specific social constraint."
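To make that "noisy signal" point concrete, here's a toy simulation with made-up numbers (not data from any study): if the exam format itself imposes a penalty on some subgroup that has nothing to do with understanding, the observed score tracks the thing we actually care about less well, especially for that subgroup.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000

# True understanding: the quantity we actually want to measure.
understanding = rng.normal(loc=70, scale=10, size=n)

# Hypothetical subgroup (20%) for whom the oral format itself imposes a
# penalty unrelated to understanding (anxiety, speech, etc.). The 20% and
# the penalty size are illustrative assumptions, nothing more.
affected = rng.random(n) < 0.20
penalty = np.where(affected, rng.normal(15, 5, size=n), 0.0)

# Observed oral-exam score = understanding - format penalty + grading noise.
observed = understanding - penalty + rng.normal(0, 3, size=n)

# How well does the observed score track what we meant to measure?
print("corr, everyone:      ", round(np.corrcoef(understanding, observed)[0, 1], 3))
print("corr, affected group:", round(np.corrcoef(understanding[affected], observed[affected])[0, 1], 3))
print("mean score gap for affected group:", round(float((observed - understanding)[affected].mean()), 1))
```

The exact numbers don't matter; the point is that the metric now partly measures the constraint instead of the competence.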
What I take away from all these comments is that oral exams aren't inherently good or bad; they're a tool. And like any tool, their value depends on how and why they're used. AI didn't break education; it exposed how much of our testing was optimized for recall rather than reasoning. The real question isn't "how do we stop students from using AI?" but "are our pedagogy and evaluation methods keeping up with the world students are entering?" Through a data scientist's lens, if your metric is easy to game, you don't blame the player; you redesign the metric.


I've been teaching different engineering subjects since 2014, and maybe this sounds heretical, but I haven't used written exams in a long time.