Blog post · Part of series: Artificial Intelligence in educational research and practice
Evaluating authentic assessments and academic integrity in the age of generative AI
The rapid evolution of generative artificial intelligence (GenAI) technologies, such as ChatGPT and Claude, is prompting urgent reflection across the education sector. While much of the early discourse focused on concerns about plagiarism and cheating (Lee et al., 2024), there is a growing need to examine how GenAI challenges the core assumptions of assessment design – especially in higher education settings.
Authentic assessments, such as Dragon’s Den-style presentations or live projects, have typically been considered more resilient to academic misconduct because they require students to draw on personal experience, engage with real-world scenarios and produce bespoke outputs (Sotiriadou et al., 2020).
This blog post focuses on our study (Kofinas et al., 2025) suggesting that authentic assessments are not a panacea against the challenges that GenAI poses.
Authentic assessments alone are not enough
A key takeaway from our research is that authentic assessments alone do not ensure academic integrity in a GenAI-rich environment. This challenges the widespread assumption that authentic tasks are inherently resistant to misconduct (Arnold & Croxford, 2025). Students can use GenAI to generate reflections, analyse case studies, or simulate project-based work with astonishing fluency. Moreover, they can do so without academics being able to detect the involvement of AI unequivocally (Spennemann et al., 2024). As educators, we must therefore ask not just what students produce, but how they produce it – and how we can assess that process meaningfully.
Human judgment, AI detection and the problem of false positives
‘GenAI is not merely a new technology that may lead to plagiarism or third-party writing; it represents a shift in how knowledge is created, synthesised and communicated.’
Another concern we highlight in the paper is the risk of misjudging how AI was used in an assessment. As some institutions turn to detection tools or rely on educators’ intuition, we warn against overreliance on either approach. Our research points to the real possibility of both false positives – wrongly accusing a student of using AI when they did not – and false negatives – failing to detect AI use when it was relied upon. Both outcomes are problematic – undermining trust in the system, putting students at risk, and leaving educators uncertain about how to respond fairly.
Essentially, GenAI is not merely a new technology that may lead to plagiarism or third-party writing; it represents a shift in how knowledge is created, synthesised and communicated.
From product to process: A paradigmatic shift in assessment
In response, we propose that higher education must reconsider the underlying goals and design of assessments. Instead of focusing on the final product (a written essay, a project report, and so on), we argue for a renewed emphasis on the process of learning, and on assessments designed to evaluate that process (Kofinas et al., 2025).
This would mean moving away from purely written, asynchronous assessments towards more interpersonal, synchronous assessments – such as live presentations, oral examinations, reflective interviews and collaborative work. These formats make it more difficult to delegate the task to an AI system, but, more importantly, they also help to develop key employability skills that go beyond content knowledge.
Looking forward: Embracing change with purpose
We argue that GenAI offers enormous potential for supporting learning, sparking creativity and reducing inequalities – if used thoughtfully. But we must confront the reality that traditional notions of academic integrity, assessment fairness and individual authorship are all currently being redefined.
As educators, we have a responsibility to rethink our assessment design – not by reverting to surveillance or suspicion, but by rethinking what we value in higher education and how we measure it.
This blog post is based on the article by Alexander Kofinas, Crystal Han-Huei Tsay and David Pike, published in the British Journal of Educational Technology.
References
Arnold, L., & Croxford, J. (2025). Is it time to stop talking about authentic assessment? Teaching in Higher Education, 30(3), 735–743.
Kofinas, A. K., Tsay, C. H., & Pike, D. (2025). The impact of generative AI on academic integrity of authentic assessments within a higher education context. British Journal of Educational Technology, 56(6), 2522–2549.
Lee, V. R., Pope, D., Miles, S., & Zárate, R. C. (2024). Cheating in the age of generative AI: A high school survey study of cheating behaviors before and after the release of ChatGPT. Computers and Education: Artificial Intelligence, 7, 100253.
Sotiriadou, P., Logan, D., Daly, A., & Guest, R. (2020). The role of authentic assessment to preserve academic integrity and promote skill development and employability. Studies in Higher Education, 45(11), 2132–2148.
Spennemann, D. H. R., Biles, J., Brown, L., Ireland, M. F., Longmore, L., Singh, C. L., Wallis, A., & Ward, C. (2024). ChatGPT giving advice on how to cheat in university assignments: How workable are its suggestions? Interactive Technology and Smart Education, 21(4), 690–707.