Parsing Excel formulas: A grammar and its application on 4 large datasets. Issue 12 (8th September 2017)
- Record Type:
- Journal Article
- Title:
- Parsing Excel formulas: A grammar and its application on 4 large datasets. Issue 12 (8th September 2017)
- Main Title:
- Parsing Excel formulas: A grammar and its application on 4 large datasets
- Authors:
- Aivaloglou, Efthimia
Hoepelman, David
Hermans, Felienne
- Other Names:
- Khomh, Foutse (guest editor)
Lo, David (guest editor)
Godfrey, Michael W. (guest editor)
- Abstract:
- Abstract: Spreadsheets are popular end user programming tools, especially in the industrial world. This makes them interesting research targets. However, there does not exist a reliable grammar that is concise enough to facilitate formula parsing and analysis and to support research on spreadsheet codebases. This paper presents a grammar for spreadsheet formulas that can successfully parse 99.99% of more than 8 million unique formulas extracted from 4 spreadsheet datasets. Our grammar is compatible with the spreadsheet formula language, recognizes the spreadsheet formula elements that are required for supporting spreadsheets research, and produces parse trees aimed at further manipulation and analysis. Additionally, we use the grammar to analyze the characteristics of the formulas of the 4 datasets in 3 different dimensions: complexity, functionality, and data utilization. Our results show that (1) most Excel formulas are simple, however formulas with more than 50 functions or operations exist, (2) almost all formulas use data from other cells, which is often not local, and (3) a surprising number of referring mechanisms are used by less than 1% of the formulas.
Abstract: The paper presents a grammar for spreadsheet formulas that recognizes the spreadsheet formula elements that are required for supporting spreadsheets research. The grammar is used for analyzing the characteristics of 8 million unique formulas extracted from 4 spreadsheet datasets in 3 different dimensions: complexity, functionality, and data utilization. It is found that most Excel formulas are simple; however, surprisingly complex formulas exist, almost all formulas use data from other cells, and a large number of referring mechanisms are rarely used.
- Is Part Of:
- Journal of software. Volume 29:Issue 12(2017)
- Journal:
- Journal of software
- Issue:
- Volume 29:Issue 12(2017)
- Issue Display:
- Volume 29, Issue 12 (2017)
- Year:
- 2017
- Volume:
- 29
- Issue:
- 12
- Issue Sort Value:
- 2017-0029-0012-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2017-09-08
- Subjects:
- formula grammar -- spreadsheets -- syntax
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.1895
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5529.xml