Do Programmers Prefer Predictable Expressions in Code?. Issue 12 (12th December 2020)
- Record Type:
- Journal Article
- Title:
- Do Programmers Prefer Predictable Expressions in Code?. Issue 12 (12th December 2020)
- Main Title:
- Do Programmers Prefer Predictable Expressions in Code?
- Authors:
- Casalnuovo, Casey
Lee, Kevin
Wang, Hulin
Devanbu, Prem
Morgan, Emily
- Abstract:
- Source code is a form of human communication, albeit one where the information shared between the programmers reading and writing the code is constrained by the requirement that the code executes correctly. Programming languages are more syntactically constrained than natural languages, but they are also very expressive, allowing a great many different ways to express even very simple computations. Still, code written by developers is highly predictable, and many programming tools have taken advantage of this phenomenon, relying on language model surprisal as a guiding mechanism. While surprisal has been validated as a measure of cognitive load in natural language, its relation to human cognitive processes in code is still poorly understood. In this paper, we explore the relationship between surprisal and programmer preference at a small granularity: do programmers prefer more predictable expressions in code? Using meaning‐preserving transformations, we produce equivalent alternatives to developer‐written code expressions and run a corpus study on Java and Python projects. In general, language models rate the code expressions developers choose to write as more predictable than these transformed alternatives. Then, we perform two human subject studies asking participants to choose between two equivalent snippets of Java code with different surprisal scores (one original and transformed). We find that programmers do prefer more predictable variants, and that stronger language models like the transformer align more often and more consistently with these preferences.
- Is Part Of:
- Cognitive science. Volume 44:Issue 12(2020)
- Journal:
- Cognitive science
- Issue:
- Volume 44:Issue 12(2020)
- Issue Display:
- Volume 44, Issue 12 (2020)
- Year:
- 2020
- Volume:
- 44
- Issue:
- 12
- Issue Sort Value:
- 2020-0044-0012-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-12-12
- Subjects:
- Surprisal -- Language models -- Dual channel constraints -- Source code expressions -- Meaning‐preserving transformations -- Human preference
Cognition -- Periodicals
Psycholinguistics -- Periodicals
Artificial intelligence -- Periodicals
153.05
- Journal URLs:
- http://firstsearch.oclc.org/journal=0364-0213;screen=info;ECOIP
http://www3.interscience.wiley.com/journal/121670282/home
http://onlinelibrary.wiley.com/
http://www.sciencedirect.com/science/journal/03640213
- DOI:
- 10.1111/cogs.12921
- Languages:
- English
- ISSNs:
- 0364-0213
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3292.885000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15287.xml