March 14, 2023

S2E10: Leveraging Synthetic Data and Privacy Guarantees with Lipika Ramaswamy (Gretel.ai)

45 minutes

This week, we welcome Lipika Ramaswamy, Senior Applied Scientist at Gretel.ai, a privacy tech company that makes it simple to generate anonymized and safe synthetic data via APIs. Previously, Lipika worked as a Data Scientist at LeapYear Technologies and as a Machine Learning Researcher at Harvard University's Privacy Tools Project.

Lipika’s interest in both machine learning and privacy comes from her love of math and things that can be defined with equations. Her interest was piqued in grad school, when she accidentally walked into a classroom holding a lecture on Applying Differential Privacy for Data Science. The intersection of data and the privacy guarantees available today has kept her hooked ever since.

---------
Thank you to our sponsor, Privado, the developer-friendly privacy platform
---------

There's a lot to unpack when it comes to synthetic data & privacy guarantees, and Lipika takes listeners on a deep dive into these compelling topics. She finds it elegant that privacy assurances like differential privacy revolve around math and statistics at their core. Essentially, she loves building things with 'usable privacy' & security that people can easily use. We also delve into the metrics tracked in the Gretel Synthetic Data Report, which assesses both 'statistical integrity' & 'privacy levels' of a customer's training data.

Topics Covered:

The definition of 'synthetic data,' & good use cases
The process of creating synthetic data
How to ensure that synthetic data is 'privacy-preserving'
Privacy problems that may arise from overtraining ML models
When to use synthetic data rather than other techniques like tokenization, anonymization, aggregation & others
Examples of good use cases vs poor use cases for using synthetic data
Common misperceptions around synthetic data
Gretel.ai's approach to 'privacy assurance,' including a focus on 'privacy filters,' which prevent some privacy harms outputted by LLMs
How to plug into the 'synthetic data' community
Who bears the responsibility for educating the public about new technology like LLMs and potential harms
Highlights from Gretel.ai's Synthesize 2023 conference

Resources Mentioned:

Join Gretel's Synthetic Data Community on Discord
Watch Talks on Synthetic Data on YouTube

Guest Info:

Connect with Lipika on LinkedIn

Privado.ai
Privacy assurance at the speed of product development. Get instant visibility w/ privacy code scans.

Shifting Privacy Left Media
Where privacy engineers gather, share, & learn

By Debra J. Farber (Shifting Privacy Left)
