An Essay in Seconds: What AI Reveals About Broken Assessments and the Need for Pedagogical Reimagining
Published on Apr 03, 2025 by Stephen Wheeler.

It took less than half an hour.
I uploaded a set of references and a 1500-word essay brief for a first-year undergraduate science module into ChatGPT, gave it a few prompts, and in return received a fully referenced essay, a detailed critique, and a revised, glowing A+ version — despite my having no prior knowledge of the subject. No special trickery. No prompt-engineering wizardry. Just a straightforward use of what’s now freely available technology.
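For the technically curious, the whole exchange reduces to three prompts in one running conversation. Here is a minimal sketch using the OpenAI Python SDK; the model name, file names, and prompt wording are my assumptions for illustration, since I actually used the ordinary ChatGPT web interface:

```python
# A hypothetical sketch of the three-step exchange described above.
# Assumptions: the OpenAI Python SDK, an OPENAI_API_KEY in the environment,
# the gpt-4o model, and paraphrased prompt wording.
from openai import OpenAI

client = OpenAI()

def ask(conversation):
    """Send the running conversation to the model and append its reply."""
    response = client.chat.completions.create(model="gpt-4o", messages=conversation)
    reply = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": reply})
    return reply

brief = open("essay_brief.txt").read()        # the 1500-word essay brief
references = open("reading_list.txt").read()  # the supplied references

conversation = [{"role": "user", "content":
    "Write a fully referenced 1500-word undergraduate essay.\n\n"
    f"Brief:\n{brief}\n\nUse only these references:\n{references}"}]
essay = ask(conversation)

conversation.append({"role": "user", "content":
    "Critique that essay as a strict first-year marker would."})
critique = ask(conversation)

conversation.append({"role": "user", "content":
    "Revise the essay to address every point in your critique."})
revised = ask(conversation)

print(revised)
```

Nothing in that sketch requires subject knowledge: the entire “essay” is driven by the brief and the reference list alone.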
It’s the kind of assessment I see constantly in my day-to-day work: here’s a reading list, here’s a question, now write an essay. It’s a familiar formula. It looks academically rigorous. It fits neatly into existing workflows. But it’s also — as this brief experiment demonstrates — worryingly hollow.
This post isn’t about blaming anyone. If anything, it’s about revealing how the very design of our assessments invites exactly this kind of short-circuiting. And if a middle-aged ed tech practitioner like me, with no background in the subject area, can produce what appears to be a very respectable undergraduate essay in half an hour with AI — what are we really assessing? And what might we be missing?
The Essay Factory Model
The traditional essay, especially at the introductory undergraduate level, carries a lot of cultural and institutional weight. It’s how we’ve always done it. There’s a clear brief, a fixed word count, and an illusion of objectivity in the marking rubric. For educators, it’s convenient: it generates a product that can be marked in bulk, anonymised, and graded against preset criteria. For students, it’s often just as much a performance: research, structure, polish, submit, forget.
But what is it really assessing? Does a 1500-word essay demonstrate understanding, or just the ability to reproduce a well-worn pattern? As Wiliam (2011) argues, assessment often fails when it becomes a proxy for learning rather than an integrated part of the learning process. In this case, the proxy is so transparent that even a generative language model can simulate it with minimal input.
The problem isn’t just the potential for cheating. The deeper issue is that this form of assessment privileges surface-level synthesis over meaningful engagement. When the rules of the game are this predictable, is it any wonder that students — and increasingly, machines — play it strategically?
AI as a Mirror, Not a Threat
There’s been a lot of panic around AI in education lately — some of it warranted, much of it reactionary. But what my experiment showed wasn’t so much the power of AI as the fragility of our assessment design. ChatGPT didn’t “cheat” the system; it simply played by the rules we’ve laid down. And it did so disturbingly well.
I didn’t need to understand the scientific concepts in the essay. I didn’t need to read the references beyond verifying the citation format. I didn’t even need to do the thinking. Yet the result was an essay that would likely pass muster with most markers. The only thing missing was genuine learning.
This is something I’ve written about before: how educational technology often exposes the cracks in our practices more than it solves them. As I noted in a previous post, “Technology doesn’t fix bad pedagogy; it just makes it more obvious.” AI is the latest and most disruptive version of this. It forces us to confront the assumptions we’ve made about what assessment is for — and who it’s really serving.
Beyond Essays: Towards More Authentic Assessment
If AI can write the essay for you, then maybe we’re asking the wrong question. Maybe the question isn’t how to stop students from using AI, but how to design assessment that AI can’t easily game — or, more radically, how to design assessment where the use of AI is part of the learning process rather than an evasion of it.
Authentic assessment — tasks that mirror real-world challenges and require students to apply knowledge in context — is not new, but it’s never been more urgent. As Biggs and Tang (2011) argue, we need constructive alignment between learning outcomes, activities, and assessment. This means assessments that allow students to demonstrate their learning in situated, active ways — not just reproduce information.
Some alternatives include:
- Multimodal reflections: Students submit a short video explaining how they approached a problem, what resources they used, and how their thinking evolved.
- Collaborative projects: Groups work together on a shared artefact — a website, a policy brief, a podcast — that demonstrates not just the end result but the process.
- Problem-based learning tasks: Students tackle a complex, open-ended problem and justify their solutions with evidence and reasoning.
- Portfolio assessment: Rather than a single submission, students document their learning journey across multiple touchpoints.
All of these are harder to fake. They require synthesis, communication, and — crucially — reflection. They also make space for failure, iteration, and growth — things that a single summative essay rarely encourages.
What AI Can Help Us With
Paradoxically, the same tools that expose the weaknesses of our assessments can also help us build better ones. ChatGPT can assist with brainstorming, restructuring, checking clarity, or even generating simulated peer feedback. Used transparently and ethically, AI can scaffold learning rather than replace it.
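To make that concrete, here is one hypothetical shape a simulated peer-feedback step could take. It is a sketch only: the rubric, the prompt wording, and the model choice are my assumptions, not a tested tool.

```python
# Hypothetical sketch: generating simulated peer feedback on a student draft.
# The rubric, prompt wording, and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = (
    "- Is the argument clearly structured?\n"
    "- Is every claim supported by evidence?\n"
    "- Does the conclusion follow from the discussion?"
)

def simulated_peer_feedback(draft: str) -> str:
    """Return formative feedback voiced as a supportive peer, not a marker."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content":
                "You are a fellow student giving constructive, specific feedback. "
                "Ask questions; do not assign a grade."},
            {"role": "user", "content":
                f"Rubric:\n{RUBRIC}\n\nDraft:\n{draft}\n\n"
                "Offer three strengths, three questions, and one next step."},
        ],
    )
    return response.choices[0].message.content

print(simulated_peer_feedback(open("draft_essay.txt").read()))
```

The point is not the code but the framing: the tool sits inside a formative step the student engages with openly, rather than something to detect or block.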
But to do that, we need to move beyond a compliance mindset. Blocking ChatGPT or pretending it doesn’t exist won’t prepare students for a world where AI is integrated into almost every aspect of knowledge work. What we need is a pedagogy of critical engagement — one that encourages students to question, challenge, and co-create with the tools available to them.
A Closing Thought
What this little experiment revealed to me wasn’t just the ease with which AI can write a decent essay. It was the hollowness of a form of assessment that has long outlived its usefulness. If our goal is to foster critical, curious, creative thinkers, then we need to assess them in ways that honour those qualities.
The rise of generative AI doesn’t mark the end of assessment. But it might just mark the end of bad assessment — if we’re willing to use it as a mirror, rather than a scapegoat.
References
Biggs, J. and Tang, C., 2011. Teaching for quality learning at university: What the student does. 4th ed. Maidenhead: Open University Press.
Wiliam, D., 2011. Embedded formative assessment. Bloomington, IN: Solution Tree Press.