reeves2023evaluating

BibTeX:

@inproceedings{reeves2023evaluating,
author = {Reeves, Brent and Sarsa, Sami and Prather, James and Denny, Paul and Becker, Brett A. and Hellas, Arto and Kimmel, Bailey and Powell, Garrett and Leinonen, Juho},
title = {Evaluating the Performance of Code Generation Models for Solving Parsons Problems With Small Prompt Variations},
year = {2023},
isbn = {9798400701382},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3587102.3588805},
doi = {10.1145/3587102.3588805},
abstract = {The recent emergence of code generation tools powered by large language models has attracted wide attention. Models such as OpenAI Codex can take natural language problem descriptions as input and generate highly accurate source code solutions, with potentially significant implications for computing education. Given the many complexities that students face when learning to write code, they may quickly become reliant on such tools without properly understanding the underlying concepts. One popular approach for scaffolding the code writing process is to use Parsons problems, which present solution lines of code in a scrambled order. These remove the complexities of low-level syntax, and allow students to focus on algorithmic and design-level problem solving. It is unclear how well code generation models can be applied to solve Parsons problems, given the mechanics of these models and prior evidence that they underperform when problems include specific restrictions. In this paper, we explore the performance of the Codex model for solving Parsons problems over various prompt variations. Using a corpus of Parsons problems we sourced from the computing education literature, we find that Codex successfully reorders the problem blocks about half of the time, a much lower rate of success when compared to prior work on more free-form programming tasks. Regarding prompts, we find that small variations in prompting have a noticeable effect on model performance, although the effect is not as pronounced as between different problems.},
booktitle = {Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1},
pages = {299--305},
numpages = {7},
keywords = {CS1, ML, ChatGPT, Codex, introductory programming, natural language processing, deep learning, large language models, code generation, GPT-3, AI, computer programming, neural networks, OpenAI, Copilot, GitHub, machine learning, generative AI, artificial intelligence, academic integrity, code writing, novice programming},
location = {Turku, Finland},
series = {ITiCSE 2023}
}

EndNote:

%0 Conference Paper
%T Evaluating the Performance of Code Generation Models for Solving Parsons Problems With Small Prompt Variations
%@ 9798400701382
%U https://doi.org/10.1145/3587102.3588805
%R 10.1145/3587102.3588805
%B Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1
%I Association for Computing Machinery
%A Brent Reeves
%A Sami Sarsa
%A James Prather
%A Paul Denny
%A Brett A. Becker
%A Arto Hellas
%A Bailey Kimmel
%A Garrett Powell
%A Juho Leinonen
%D 2023
%P 299–305
%K code writing, Copilot, large language models, AI, GPT-3, neural networks, deep learning, artificial intelligence, academic integrity, GitHub, machine learning, novice programming, Codex, generative AI, CS1, code generation, computer programming, ChatGPT, ML, OpenAI, introductory programming, natural language processing
%C Turku, Finland

ACM:

Brent Reeves, Sami Sarsa, James Prather, Paul Denny, Brett A. Becker, Arto Hellas, Bailey Kimmel, Garrett Powell, and Juho Leinonen. 2023. Evaluating the Performance of Code Generation Models for Solving Parsons Problems With Small Prompt Variations. In Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1 (ITiCSE 2023). Association for Computing Machinery, New York, NY, USA, 299–305. https://doi.org/10.1145/3587102.3588805