O'Reilly Data Show Podcast

Machine learning on encrypted data



In this episode of the Data Show, I spoke with Alon Kaufman, CEO and co-founder of Duality Technologies, a startup building tools that will allow companies to apply analytics and machine learning to encrypted data. In a recent talk, I described the importance of data, various methods for estimating the value of data, and emerging tools for incentivizing data sharing across organizations. As I noted, the main motivation for improving data liquidity is the growing importance of machine learning. We’re all familiar with the importance of data security and privacy, but probably not as many people are aware of the emerging set of tools at the intersection of machine learning and security. Kaufman and his stellar roster of co-founders are doing some of the most interesting work in this area.
Here are some highlights from our conversation:
Running machine learning models on encrypted data
Four or five years ago, techniques for running machine learning models on data while it’s encrypted were being discussed in the academic world. We did a few trials of this and although the results were fascinating, it still wasn’t practical.
… There have been big breakthroughs that have led to it becoming feasible. A few years ago, it was more theoretical. Now it’s becoming feasible. This is the right time to build a company. Not only because of the technology feasibility but definitely because of the need in the market.
From inference to training
A classical example would be model inference. I have data; you have some predictive model. I want to consume your model. I’m not willing to share my data with you, so I’ll encrypt my data; you’ll apply your model to the encrypted data, so you’ll never see the data. I will never see your model. The result that comes out of this computation, which is encrypted as well, will be decrypted only by me, as I have the key. This means I can basically utilize your predictive insight, you can sell your model, and no one ever exchanged data or models between the parties.
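The exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Duality's implementation: it uses a toy version of the Paillier cryptosystem (an additively homomorphic scheme chosen here for brevity), demo-sized keys that are not secure, and a hypothetical integer linear model. The function names, features, weights, and bias are all invented for the example.

```python
import math
import secrets

# Toy Paillier cryptosystem (additively homomorphic).
# ASSUMPTION: demo-sized key built from two known Mersenne primes; NOT secure.
P, Q = 2**31 - 1, 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m under public key n, using generator g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n)
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n mod n^2
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Decrypt c and map the result back to a signed integer."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = (u - 1) // n * mu % n
    return m - n if m > n // 2 else m


def encrypted_linear_score(n, enc_x, weights, bias):
    """Model owner's side: compute Enc(w . x + b) from ciphertexts only.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to
    the power w multiplies its plaintext by w.
    """
    n2 = n * n
    acc = (1 + (bias % n) * n) % n2  # unblinded encryption of the bias
    for c, w in zip(enc_x, weights):
        acc = acc * pow(c, w % n, n2) % n2
    return acc


# Data owner: generates keys, encrypts features, shares only n and ciphertexts.
n, priv = keygen()
features = [3, 7, 2]  # hypothetical private data
enc_features = [encrypt(n, x) for x in features]

# Model owner: applies hypothetical private weights without seeing the data.
weights, bias = [5, -2, 4], 10
enc_score = encrypted_linear_score(n, enc_features, weights, bias)

# Data owner: the only party holding the private key, so the only one who
# can read the result.
score = decrypt(n, priv, enc_score)
print(score)  # 5*3 - 2*7 + 4*2 + 10 = 19
```

The data owner never sends plaintext features, and the model owner never sends weights; only the public modulus, the ciphertexts, and the encrypted score cross the boundary. Production systems typically rely on lattice-based schemes and vetted libraries rather than a hand-rolled construction like this one.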
… The next frontier of research is doing model training with these types of technologies. We have some great results, and there are others who are starting to implement some things in hardware. … Some of our recent work around applying deep learning to encrypted data combines different methods. Homomorphic encryption has its pros and cons; secure multi-party computation has other advantages and disadvantages. We basically mash various methods together to derive very, very interesting results. … For example, we have applied algorithms to genomic data at scale and we obtained impressive performance.
Related resources:
Sharad Goel and Sam Corbett-Davies on “Why it’s hard to design fair machine learning models”
Chang Liu on “How privacy-preserving techniques can lead to more robust machine learning models”
“How to build analytic products in an age when data privacy has become critical”
“Data collection and data markets in the age of privacy and machine learning”
“What machine learning means for software development”
“Lessons learned turning machine learning models into real products and services”