Partially Redacted: Data, AI, Security, and Privacy

Privacy-aware Data Pipelines with Skyflow’s Piper Keyes


A data analytics pipeline matters to modern businesses because it lets them extract insights from the large amounts of data they generate and collect every day. This can lead to better decision-making, improved efficiency, and increased ROI.

However, despite your best efforts, sensitive customer data tends to find its way into your analytics pipelines, ending up in your data warehouses and metrics dashboards. Replicating customer PII to your downstream services greatly increases your compliance scope and makes maintaining data privacy and security significantly more challenging.

In this episode, Piper Keyes, Engineering Lead at Skyflow, joins the show to discuss what goes into building a privacy-aware data pipeline, which tools and technologies you should be using, and how Skyflow addresses this problem.

Topics:

  • What is a data analytics pipeline?
  • What does it mean to build a privacy-aware data pipeline?
  • Can you give some examples of use cases where privacy-aware data pipelines are particularly important?
  • What’s it mean to de-identify data and how does that work?
  • What are some common techniques used to preserve privacy in data pipelines?
  • How does analytics work for de-identified data?
  • How do you balance the need for data privacy with the need for actually being able to use the data?
  • What’s it take to build a privacy-aware pipeline from scratch?
  • What are some of the biggest challenges in building privacy-aware data pipelines?
  • How does something like this work with Skyflow?
  • Let’s say I have customers’ transactional data from Visa. How could I ingest that data into my data warehouse without having to build PCI compliance infrastructure? Walk me through how that works.
  • Could you build a machine learning model based on the de-identified data?
  • Once I have the data in my warehouse, let’s say I need to inform a clinical trial participant about an issue while maintaining their privacy. How could I perform an operation like that?
  • What other use cases does this product enable?

Resources:

  • Running Secure Workflows with Sensitive Customer Data
  • Maximize Privacy while Preserving Utility for Data Analytics
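
For a rough sense of what de-identifying data before it reaches a warehouse can look like, here is a minimal sketch in Python. It is not Skyflow's method or API. It uses keyed hashing (HMAC-SHA256) as a stand-in for tokenization, and the field names, the `tokenize` and `deidentify` helpers, and the demo key are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical field names; a real pipeline would take these from its own schema.
SENSITIVE_FIELDS = {"name", "email", "card_number"}


def tokenize(value: str, key: bytes) -> str:
    """Return a deterministic, keyed pseudonym for a sensitive value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, key: bytes) -> dict:
    """Replace sensitive fields with tokens and leave other fields untouched."""
    return {
        field: tokenize(str(value), key) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Assumption: in practice the key is kept outside the analytics environment.
    key = b"demo-key-only"
    raw = {"name": "Ada Example", "email": "ada@example.com", "amount": 42.5}
    print(deidentify(raw, key))
```

Because the same input and key always produce the same token, downstream queries can still group, join, and count by customer without seeing the underlying value. Mapping a token back to a person (for example, to contact a clinical trial participant) would need a separate, access-controlled lookup that this sketch does not include.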
Partially Redacted: Data, AI, Security, and Privacy, by Skyflow
