Chen, X. B., & Meurers, D. (2017).

Assessment of text readability is important for assigning texts at the appropriate level to readers at different proficiency levels. The present research approached readability assessment from the lexical perspective of word frequencies derived from corpora assumed to reflect typical language experience. Three studies were conducted to test how the word-level feature of word frequency can be aggregated to characterize text-level readability. The results show that an effective use of word frequency for text readability assessment should take a range of characteristics of the distribution of word frequencies into account. For characterizing text readability, taking into account the standard deviation in addition to the mean of the word frequencies already significantly improves results. The best results are obtained using the mean frequencies of the words in language frequency bands or in bands obtained by agglomerative clustering of the word frequencies in the documents, though a comparison of within-corpus and cross-corpus results shows the limited generalizability of using high numbers of fine-grained frequency bands. Overall, the study advances our understanding of the relationship between word frequency and text readability and provides concrete options for making more effective use of lexical frequency information in practice.
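To make the aggregation idea concrete, the following is a minimal sketch of how word-level frequencies might be turned into text-level features: the mean and standard deviation of the frequencies, and the mean frequency within each of several frequency bands. It is an illustration under stated assumptions, not the procedure used in the studies: the frequency table `TOY_FREQ`, the function names, the use of log10 frequencies, and the equal-sized, rank-based banding are all hypothetical choices made for the example.

```python
import math
import statistics

# Hypothetical toy frequency list (occurrences per million words). A real
# application would load values from a corpus-derived frequency resource.
TOY_FREQ = {"the": 50000.0, "on": 7000.0, "cat": 60.0, "sat": 40.0, "mat": 8.0}


def log_frequencies(tokens, freq_table):
    """Return log10 frequencies for the tokens found in the frequency table."""
    return [math.log10(freq_table[t]) for t in tokens if t in freq_table]


def mean_sd_features(tokens, freq_table):
    """Mean and (population) standard deviation of the log word frequencies."""
    logs = log_frequencies(tokens, freq_table)
    return statistics.mean(logs), statistics.pstdev(logs)


def band_features(tokens, freq_table, n_bands):
    """Mean log frequency of the text's tokens within each frequency band.

    Assumption: bands are formed by ranking the vocabulary of the frequency
    table from most to least frequent and cutting it into groups of equal
    size. A band with no tokens from the text gets the value 0.0.
    """
    ranked = sorted(freq_table, key=freq_table.get, reverse=True)
    size = math.ceil(len(ranked) / n_bands)
    band_of = {word: i // size for i, word in enumerate(ranked)}

    sums = [0.0] * n_bands
    counts = [0] * n_bands
    for t in tokens:
        if t in band_of:
            b = band_of[t]
            sums[b] += math.log10(freq_table[t])
            counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


tokens = ["the", "cat", "sat", "on", "the", "mat"]
mean_lf, sd_lf = mean_sd_features(tokens, TOY_FREQ)
bands = band_features(tokens, TOY_FREQ, n_bands=2)
print(mean_lf, sd_lf, bands)
```

The resulting values could serve as inputs to a readability classifier or regression model. Bands derived by agglomerative clustering of the frequencies in a document would replace the rank-based `band_of` mapping with cluster assignments.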

