Datacast

Episode 13: Transition from Academia to Data Science with Martina Pugliese



Show Notes:



  • (2:20) Martina recalled her experience earning Bachelor’s, Master’s, and Ph.D. degrees in Physics from Sapienza Università di Roma.

  • (3:35) Martina discussed her Ph.D. thesis, in which she studied Complex Systems in relation to Linguistics, focusing on how natural language evolves over time.

  • (6:04) Martina talked about her experience doing the S2DS bootcamp in London after finishing her Ph.D.

  • (7:10) Martina gave her reasons to move to the Greater UK while looking for a tech job.

  • (8:05) Martina talked about the importance of software engineering from her time working for the education company Twig World.

  • (10:07) Martina discussed her current job as a Data Scientist at Mallzee, also known as “Tinder for Fashion.”

  • (11:46) Martina briefly went over her work in recommendation systems, data analytics, and statistical modeling in the first two years at Mallzee.

  • (13:50) Martina explained the unique features of fashion that make it a fertile field to do data science work.

  • (17:30) Martina emphasized the importance of communication for a data scientist.

  • (22:20) Martina talked about her transition to the Data Science Lead role at Mallzee, which she has held since 2017.

  • (24:26) Martina gave her opinion about the relationship between the technical side and the scientific side of data science. (“Data Science Down the Line”)

  • (26:54) Martina talked about the importance of learning statistics for people coming from an engineering background who want to get into data science.

  • (30:30) Martina explained the analogy in her post “Don’t make recipes out of them” where she compared doing data science to cooking.

  • (35:00) Martina discussed the fundamental problem in academia that is making it lose its appeal to young and talented individuals. (“Scientific publishing”)

  • (38:48) Martina talked about her experience organizing the PyData Edinburgh Meetup.

  • (42:04) Martina advocated for contributing to conversations to raise awareness about women in the scientific field. (“Women amaze”)

  • (46:45) Martina discussed her project, which used the corpora available in the NLTK library to analyze how the number of types (distinct words) grows with text size. (“The growth of vocabulary in different languages”)

  • (50:52) Martina mentioned her attempt to learn D3 to visualize data. (“Rallying into D3”)

  • (53:20) Martina moved on to her project analyzing tags on Stack Overflow. (“Stackoverflow Tags”)

  • (59:30) Martina talked about using TensorFlow for her deep learning project. (“TensorFlow: Create the training set for the object detection”)

  • (01:03:23) Martina gave her prediction on how Data Science and Machine Learning will evolve in the next couple of years.

  • (01:07:08) Closing segments.


Her Contact Info:



  • Twitter

  • Website

  • LinkedIn

  • GitHub

  • DataLab Podcast Interview


Her Recommended Resources:



  • Natural Language Toolkit (NLTK)

  • Scott Murray’s “Interactive Data Visualization For The Web”

  • Dashing D3.js

  • Elijah Meeks’s “D3.js In Action”

  • Malcolm Maclean’s “D3 Tips and Tricks”

  • Stack Overflow Developer Survey 2019

  • Trevor Hastie, Robert Tibshirani, and Jerome Friedman’s “The Elements of Statistical Learning”

  • François Chollet’s “Deep Learning with Python”




Datacast by James Le