What is it about computational communication science?

Observing Opinions: What is Pre-Processing?


In this episode, Prof. Jamal Abdul Nasir from the University of Galway reveals why pre-processing is the backbone of all text analysis. He breaks down key steps like defining documents, tokenization, removing stop words, unification, and stemming vs. lemmatization. Jamal also explains unigrams vs. bigrams and how modern NLP techniques like byte-pair encoding are changing the game. Plus, he shares practical tips for making your pre-processing transparent and reproducible, helping your research stand strong and scale up.
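
To make the steps named above concrete, here is a minimal sketch of tokenization, lowercasing as a simple form of unification, stop-word removal, and unigram/bigram extraction. It is not taken from the episode: the function names, the regular expression, and the tiny stop-word list are all illustrative assumptions, and stemming, lemmatization, and byte-pair encoding are left out because they usually rely on dedicated libraries.

```python
import re

# Tiny illustrative stop-word list (an assumption; a real project would
# document and cite the list it uses so the step stays reproducible).
STOP_WORDS = {"the", "is", "of", "a", "and", "all"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in stop_words]

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

doc = "Pre-processing is the backbone of all text analysis."
tokens = remove_stop_words(tokenize(doc))
unigrams = ngrams(tokens, 1)  # single tokens
bigrams = ngrams(tokens, 2)   # pairs of adjacent tokens
print(tokens)
print(bigrams)
```

Note that this simple tokenizer splits "Pre-processing" into two tokens at the hyphen. Whether that is desirable depends on the research question, which is one reason to report each pre-processing decision explicitly.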

By Emese Domahidi & Mario Haim