Public Spark

Identifying Personal Identifiable Information (PII) in Unstructured Data with Microsoft Presidio



Source: https://www.statcan.gc.ca/en/data-science/network/identifying-personal-identifiable-information
In this episode, we explore how organizations can protect sensitive personal data using Microsoft Presidio, an open-source tool for PII detection and anonymization. Saptarshi Dutta Gupta from Statistics Canada explains the importance of safeguarding personally identifiable information in compliance with Canadian privacy laws such as the Privacy Act and PIPEDA. Learn about Presidio's key features, including its ability to detect PII in unstructured text using pattern recognition, Named Entity Recognition, and contextual analysis. The episode covers how to customize Presidio to identify new types of PII entities, support multiple languages, and apply anonymization techniques such as replacing, redacting, masking, and encrypting sensitive information. It also walks through practical examples of using Presidio in a data protection strategy to maintain privacy compliance while working with large datasets.
The contents and hosts of this podcast are AI generated.
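
To make the pattern-recognition and anonymization ideas in the description above concrete, here is a minimal sketch in plain Python. It does not use Presidio or its API; the patterns, entity labels, and the function names detect and anonymize are all hypothetical and chosen only for illustration. It shows regex-based detection followed by three of the operators mentioned (replace, redact, mask); encryption is left out.

```python
import re

# Hypothetical, deliberately simple patterns (assumption: not Presidio's own recognizers).
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def detect(text):
    """Return (entity_type, start, end) for every pattern match, sorted by start offset."""
    findings = []
    for entity, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((entity, match.start(), match.end()))
    return sorted(findings, key=lambda f: f[1])


def anonymize(text, findings, mode="replace"):
    """Apply one operator to every finding: 'replace', 'redact', or 'mask'."""
    # Work from right to left so earlier offsets stay valid after each substitution.
    for entity, start, end in sorted(findings, key=lambda f: f[1], reverse=True):
        if mode == "replace":
            new = f"<{entity}>"          # swap the value for its entity label
        elif mode == "redact":
            new = ""                     # drop the value entirely
        elif mode == "mask":
            new = "*" * (end - start)    # keep the length, hide the characters
        else:
            raise ValueError(f"unknown mode: {mode}")
        text = text[:start] + new + text[end:]
    return text


sample = "Contact Jane at jane.doe@example.com or 613-555-0199."
found = detect(sample)
print(anonymize(sample, found, "replace"))
# prints: Contact Jane at <EMAIL_ADDRESS> or <PHONE_NUMBER>.
```

Note that the name "Jane" passes through untouched: fixed patterns cannot find names, which is where the Named Entity Recognition and contextual analysis mentioned in the description would come in.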

Public Spark, by Doug Keefe