Data Skeptic

Flesch Kincaid Readability Tests



Given a document in English, how can you estimate how easily someone will be able to read it?  Does it require college-level reading comprehension, or is it something a much younger student could read and understand?

While these questions are useful to ask, they don't admit a simple answer.  One option is to use one of the two closely related Flesch-Kincaid readability tests: Flesch reading ease and the Flesch-Kincaid grade level.  Both are simple calculations based on average sentence length and average syllables per word, and they give you a rough estimate of reading ease.
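To make the calculations concrete, here is a minimal Python sketch of both scores using the standard published coefficients. The sentence splitter, the word tokenizer, and especially the vowel-group syllable counter are rough assumptions made for illustration; they are not necessarily what Kyle uses in the episode, and real text will need more careful handling.

```python
import re


def count_syllables(word: str) -> int:
    """Heuristic (assumption): one syllable per group of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores indicate easier text."""
    sentences, words, syllables = text_stats(text)
    if sentences == 0 or words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade level; higher means harder text."""
    sentences, words, syllables = text_stats(text)
    if sentences == 0 or words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both functions use the same two ratios (words per sentence and syllables per word) with different weights, which is why the two tests tend to rank documents in nearly the same order. Either value could be attached to a document as a single numeric feature.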

In this episode, Kyle shares his thoughts on this tool and on when it could be appropriate to use it as part of a feature engineering pipeline for a machine learning objective.

As an empirical check on these metrics, the plot below compares English-language Wikipedia pages with "Simple English" Wikipedia pages.  The analysis Kyle describes in this episode yields an intuitively pleasing histogram, which summarizes the distribution of Flesch reading ease scores for 1000 pages examined from both Wikipedias.
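A comparison like this can be tabulated in a few lines once each page has a score. The sketch below assumes the reading ease scores for each corpus have already been computed and are passed in as plain lists; the function names and the default bin width of 10 are illustrative choices, not details taken from the episode.

```python
from collections import Counter


def score_histogram(scores, bin_width=10):
    """Count scores per bin; each key is the lower edge of its bin."""
    bins = Counter(int(score // bin_width) * bin_width for score in scores)
    return dict(sorted(bins.items()))


def compare_histograms(scores_a, scores_b, bin_width=10):
    """Return rows of (bin lower edge, count in corpus A, count in corpus B)."""
    hist_a = score_histogram(scores_a, bin_width)
    hist_b = score_histogram(scores_b, bin_width)
    edges = sorted(set(hist_a) | set(hist_b))
    return [(edge, hist_a.get(edge, 0), hist_b.get(edge, 0)) for edge in edges]
```

If the metric tracks readability, the simpler corpus should have most of its mass in higher-scoring bins than the standard one.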

 


Data Skeptic by Kyle Polich
