The core technology behind and beyond ChatGPT: A comprehensive review of language models in educational research

Kelvin Leong; Anna Sung; Lewis Jones

doi:10.46661/ijeri.8449

Authors

Kelvin Leong, University of Chester, https://orcid.org/0000-0002-5896-0181
Anna Sung, University of Chester, https://orcid.org/0000-0002-0801-3119
Lewis Jones, University of Chester, https://orcid.org/0009-0005-8973-8152

Keywords:

ChatGPT, Language model, EdTech, AI

Abstract

ChatGPT has garnered significant attention within the education industry. Given that the core technology behind ChatGPT is the language model, this study critically reviews related publications and suggests future directions for language models in educational research. We address three questions: i) what is the core technology behind ChatGPT, ii) what is the state of knowledge in related research, and iii) what are the potential research directions. A critical review of related publications was conducted to evaluate the current state of knowledge of language models in educational research. In addition, we propose a purpose-oriented guiding framework for future research on language models in education. Our study responds promptly to the concerns that ChatGPT has raised in the education industry and offers the industry a comprehensive and systematic overview of the related technologies. We believe this is the first study to systematically review the state of knowledge of language models in educational research.
