The ‘wicked problem’ of AI in assessment

New testing methods are piloted to keep pace with rapidly evolving AI models

A 2025 study describes artificial intelligence (AI) use in assessments as a “wicked problem,” one “that [defies] neat solutions,” forcing educators to redesign how they assess students.
Led by Thomas Corbin, a research fellow at the Centre for Research in Assessment and Digital Learning at Deakin University, the study interviewed 20 university-level educators across four faculties at Australian universities. Through referrals, the researchers identified educators leading assessment redesign in their respective faculties and asked them about the challenges generative AI poses for student assessment, as well as the changes they have implemented in response. Although the researchers did not initially hypothesize that generative AI was a “wicked problem,” the educators’ responses suggested it is one.
The term “wicked problem” was popularized in the 1970s by theorists Horst Rittel and Melvin Webber. Unlike “tame problems,” which are linear and can be solved through step-by-step processes, “wicked problems” are unpredictable: they admit “better” or “worse” answers, but no objectively “true” or “false” ones.
Illustrating the scale of the issue, a survey of academic integrity violations in the UK found nearly 7,000 proven cases of AI-assisted cheating between 2023 and 2024. This number is expected to rise as generative AI use becomes more mainstream, and experts suggest the proven cases represent only a small fraction of actual violations. In response, one solution educators across North America have been experimenting with is a greater use of oral exams.
Kyle Maclean, an assistant professor at the Ivey Business School in London, Ontario, is researching both the benefits and downsides of this ancient form of testing. In an interview with the CBC, Maclean explained that “scalable oral exams” were introduced in his faculty during the COVID-19 lockdown. Instead of face-to-face oral exams, which are not feasible with large class sizes, the Ivey Business School gave students five minutes to respond to a question with a video. This format is a significant departure from the testing methods students are accustomed to: it demands quick thinking and does not allow them to return to previous questions, an unfamiliar challenge for many. It also risks disadvantaging certain groups. Students for whom English is an additional language, for instance, may struggle to articulate complex ideas under pressure, and the stress of a face-to-face evaluation can further hinder performance.
Furthermore, the AI issue in universities is layered and constantly shifting. Educators are expected to redesign their curricula immediately in response to changing AI systems, yet generative AI keeps adapting and improving in functionality faster than they can respond. In the last three years alone, the global generative AI market has grown by 54.7 per cent, evidence of its explosive pace.
This attractive market for investors has propelled generative AI development, with new versions released only months apart. For example, OpenAI released GPT-5, the model behind ChatGPT, in August 2025, and by November it had released two 5.1 versions, GPT-5.1 Instant and GPT-5.1 Thinking. For educators, the implication is that curriculum adjustments made at the beginning of a semester may become irrelevant before the term ends. This is especially problematic given that occupational stress has been identified as a major problem in Western universities, with excessive workloads a primary culprit; adding constant redesign to educators’ workloads only compounds the issue.
Ultimately, in reckoning with an ever-evolving AI ecosystem, the study suggests educators and universities must acknowledge the wicked role AI plays, and will continue to play, in assessment, dispelling any belief that the issue is inherently “solvable.” Instead, the researchers argue that constant negotiation is necessary, with trade-offs at the heart of every curriculum. While “wicked problems” have no true solutions, the study urges giving individual educators more bargaining power in assessment design, since every proposed style involves trade-offs that better suit the needs of some groups than others. Rather than sustaining the illusion of rigid perfection in university assessments, this modified objective emphasizes flexibility, easing the burden on educators to create a uniform, one-size-fits-all solution to assessment.