Published 2026-03-10
Keywords
- AI-generated feedback; AI-assisted Assessment; Formative Assessment; Peer Review; Higher Education
Copyright (c) 2026 Xiaolei Li, David Chen

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Providing effective formative feedback on assessment consistently is integral to learning, yet it is one of the most demanding aspects of teaching, especially at scale. It is therefore unsurprising that, amid wider exploration of AI tools in teaching and learning, interest is growing in applying Large Language Models (LLMs) to support formative feedback. Using student submissions from a course that included structured peer review activities, this study compares AI-generated feedback and scoring (GPT-4) with human marking. The findings show that AI-generated feedback provides structured and detailed comments on assessment components that are based on clear and objective criteria, showing strong agreement with human marking practices. Limitations appeared in areas requiring subjective judgment, such as evaluating the quality of peer reviews and the depth of student reflection. In these cases, human educators provided more nuanced interpretations grounded in contextual and pedagogical understanding. The findings underscore the importance of human oversight for qualitative and interpretive evaluation and suggest that a balanced human-AI approach, grounded in pedagogical intent and careful integration, is essential for the effective use of AI-assisted feedback in higher education.
