
Sign up to save your podcasts
Or


Given a document in English, how can you estimate the ease with which someone will find they can read it? Does it require a college-level of reading comprehension or is it something a much younger student could read and understand?
While these questions are useful to ask, they don't admit a simple answer. One option is to use one of the (essentially identical) two Flesch Kincaid Readability Tests. These are simple calculations which provide you with a rough estimate of the reading ease.
In this episode, Kyle shares his thoughts on this tool and when it could be appropriate to use as part of your feature engineering pipeline towards a machine learning objective.
For empirical validation of these metrics, the plot below compares English language Wikipedia pages with "Simple English" Wikipedia pages. The analysis Kyle describes in this episode yields the intuitively pleasing histogram below. It summarizes the distribution of Flesch reading ease scores for 1000 pages examined from both Wikipedias.
By Kyle Polich4.4
475475 ratings
Given a document in English, how can you estimate the ease with which someone will find they can read it? Does it require a college-level of reading comprehension or is it something a much younger student could read and understand?
While these questions are useful to ask, they don't admit a simple answer. One option is to use one of the (essentially identical) two Flesch Kincaid Readability Tests. These are simple calculations which provide you with a rough estimate of the reading ease.
In this episode, Kyle shares his thoughts on this tool and when it could be appropriate to use as part of your feature engineering pipeline towards a machine learning objective.
For empirical validation of these metrics, the plot below compares English language Wikipedia pages with "Simple English" Wikipedia pages. The analysis Kyle describes in this episode yields the intuitively pleasing histogram below. It summarizes the distribution of Flesch reading ease scores for 1000 pages examined from both Wikipedias.

271 Listeners

289 Listeners

1,090 Listeners

623 Listeners

585 Listeners

303 Listeners

334 Listeners

146 Listeners

269 Listeners

207 Listeners

63 Listeners

142 Listeners

95 Listeners

514 Listeners

227 Listeners