Managing rater effects through the use of FACETS analysis: the case of a university placement test. Issue 2 (3rd March 2016)
- Record Type:
- Journal Article
- Title:
- Managing rater effects through the use of FACETS analysis: the case of a university placement test. Issue 2 (3rd March 2016)
- Main Title:
- Managing rater effects through the use of FACETS analysis: the case of a university placement test
- Authors:
- Wu, Siew Mei
Tan, Susan
- Abstract:
- Rating essays is a complex task where students' grades could be adversely affected by test-irrelevant factors such as rater characteristics and rating scales. Understanding these factors and controlling their effects are crucial for test validity. Rater behaviour has been extensively studied through qualitative methods such as questionnaires and think-aloud protocols and quantitatively through the use of the multi-faceted Rasch model (MFRM) [Congdon, P.J., & McQueen, J. (2000). The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 37(2), 163–178; Engelhard, G. (1992). The measurement of writing ability with a multi-faceted Rasch model. Applied Measurement in Education, 5(3), 171–191; Lumley, T., & McNamara, T.F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(54), 54–71; Weigle, S.C. (1998). Using FACETS to model rater training effects. Language Testing, 15(2), 263–287]. While these studies have yielded a rich understanding of rater characteristics and rating, less is known about the use of quantitative analysis to help manage and make adjustments for differences in students' scores. This study uses the MFRM [Linacre, J.M. (1989). Multi-faceted Rasch measurement. Chicago: MESA Press] to investigate raters' scoring behaviour and ascertain how it affects students' scores in a large-scale placement test. It proposes the use of the anchoring method within the MFRM to manage the placement of students where it is not possible to have all raters score all scripts. The analysis shows that raters, while mostly internally consistent, have different levels of severity despite training. These differences would significantly affect a student's placement in the test if no measures were taken to manage this problem. The MFRM also shows that a few raters may be scoring the essays in a more holistic manner over time, probably due to the halo effect [Engelhard, G. (1998). Evaluating the quality of ratings obtained from standard-setting judges. Educational and Psychological Measurement, 58(2), 179–196]. The study demonstrates how the MFRM can reveal patterns in raters' scoring and, most importantly, the analysis yields data that allow targeted strategies to handle the practical issue of moderation of scores to manage rater differences.
- Is Part Of:
- Higher education research & development. Volume 35:Issue 2(2016)
- Journal:
- Higher education research & development
- Issue:
- Volume 35:Issue 2(2016)
- Issue Display:
- Volume 35, Issue 2 (2016)
- Year:
- 2016
- Volume:
- 35
- Issue:
- 2
- Issue Sort Value:
- 2016-0035-0002-0000
- Page Start:
- 380
- Page End:
- 394
- Publication Date:
- 2016-03-03
- Subjects:
- assessment -- facets -- fit -- rater -- reliability -- severity -- writing
Education, Higher -- Australia -- Periodicals
378.94
- Journal URLs:
- http://www.tandfonline.com/toc/cher20/current
http://www.tandfonline.com/
- DOI:
- 10.1080/07294360.2015.1087381
- Languages:
- English
- ISSNs:
- 0729-4360
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4307.389000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2717.xml