The handling of missing binary data in language research

Francois Pichette*, Sebastien Beland, Shahab Jolani, Justyna Lesniewska

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Web of Science)

Abstract

Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Graham, 2002) that data replacement and deletion methods are common in research. Language researchers declared that when faced with missing answers of the yes/no type (that translate into zero or one in data tables), the three most common solutions they adopt are to exclude the participant's data from the analyses, to leave the square empty, or to fill in with zero, as for an incorrect answer. This study then examines the impact on Cronbach's α of five types of data insertion, using simulated and actual data with various numbers of participants and missing percentages. Our analyses indicate that the three most common methods we identified among language researchers are the ones with the greatest impact on Cronbach's α coefficients; in other words, they are the least desirable solutions to the missing data problem. On the basis of our results, we make recommendations for language researchers concerning the best way to deal with missing data. Given that none of the most common simple methods works properly, we suggest that the missing data be replaced either by the item's mean or by the participants' overall mean to provide a better, more accurate image of the instrument's internal consistency.
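
The treatments named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure or code: the function names (`cronbach_alpha`, `listwise_delete`, `fill_zero`, `fill_item_mean`, `fill_person_mean`) are hypothetical, missing answers are assumed to be coded as `None`, and the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores) is assumed, with sample variances. Leaving the cell empty is not covered.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list per participant, each holding k item scores (no None).
    Assumes at least two participants, k >= 2, and non-zero total variance.
    """
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def listwise_delete(rows):
    """Exclude every participant who has at least one missing answer."""
    return [r for r in rows if None not in r]

def fill_zero(rows):
    """Treat each missing answer as incorrect (0)."""
    return [[0 if x is None else x for x in r] for r in rows]

def fill_item_mean(rows):
    """Replace a missing answer with the mean of the observed answers to that item.

    Assumes every item has at least one observed answer.
    """
    k = len(rows[0])
    col_means = [mean([r[j] for r in rows if r[j] is not None]) for j in range(k)]
    return [[col_means[j] if x is None else x for j, x in enumerate(r)] for r in rows]

def fill_person_mean(rows):
    """Replace a missing answer with that participant's mean over observed answers.

    Assumes every participant answered at least one item.
    """
    filled = []
    for r in rows:
        person_mean = mean([x for x in r if x is not None])
        filled.append([person_mean if x is None else x for x in r])
    return filled

if __name__ == "__main__":
    # Toy yes/no data (1 = correct, 0 = incorrect, None = missing); purely illustrative.
    data = [
        [1, 1, 1],
        [1, None, 0],
        [0, 0, 0],
        [1, 0, None],
        [1, 1, 0],
    ]
    for name, treat in [
        ("listwise deletion", listwise_delete),
        ("zero fill", fill_zero),
        ("item mean", fill_item_mean),
        ("person mean", fill_person_mean),
    ]:
        print(f"{name}: alpha = {cronbach_alpha(treat(data)):.3f}")
```

Running the script on a data set of one's own prints one coefficient per treatment, which makes it easy to see how much the choice of treatment alone shifts the estimate; the toy values above carry no empirical meaning.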

Original language: English
Pages (from-to): 153-169
Number of pages: 17
Journal: Studies in Second Language Learning and Teaching
Volume: 5
Issue number: 1
Publication status: Published - Mar 2015
Externally published: Yes

Keywords

  • missing data
  • Cronbach's alpha
  • participant exclusion
  • second language testing
  • RESEARCH METHODOLOGY
  • NONRESPONSE
  • STATISTICS
  • MODELS
  • IMPACT
  • FIT
