
Sign up to save your podcasts
Or


In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Maria Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews.
The conversation covers Alberto's research journey through knowledge graphs, graph-based recommenders, privacy considerations, and recommendation novelty. He explains why small modifications in datasets can significantly impact research outcomes, the importance of offline evaluation, and DataRec's vision as a lightweight library that integrates with existing frameworks rather than replacing them. Whether you're benchmarking new algorithms or exploring recommendation techniques, this episode offers practical insights into one of the most critical yet overlooked aspects of reproducible ML research.
By Kyle Polich4.4
475475 ratings
In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Maria Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews.
The conversation covers Alberto's research journey through knowledge graphs, graph-based recommenders, privacy considerations, and recommendation novelty. He explains why small modifications in datasets can significantly impact research outcomes, the importance of offline evaluation, and DataRec's vision as a lightweight library that integrates with existing frameworks rather than replacing them. Whether you're benchmarking new algorithms or exploring recommendation techniques, this episode offers practical insights into one of the most critical yet overlooked aspects of reproducible ML research.

32,324 Listeners

30,871 Listeners

288 Listeners

1,095 Listeners

624 Listeners

582 Listeners

301 Listeners

345 Listeners

201 Listeners

201 Listeners

309 Listeners

98 Listeners

535 Listeners

99 Listeners

225 Listeners