Data Engineering Podcast

ThreatStack: Data Driven Cloud Security with Pete Cheslock and Patrick Cable - Episode 25


Summary

Cloud computing and ubiquitous virtualization have changed the way that our applications are built and deployed. This new environment requires a new way of tracking and addressing the security of our systems. ThreatStack is a platform that collects all of the data that your servers generate, monitors it for anomalous behavior that could indicate a breach, and notifies you in near real time. In this episode, ThreatStack’s director of operations, Pete Cheslock, and senior infrastructure security engineer, Patrick Cable, discuss the data infrastructure that supports their platform, how they capture and process the data from client systems, and how that information can be used to keep your systems safe from attackers.

Preamble
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
  • For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine-learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt.
  • Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
  • Your host is Tobias Macey and today I’m interviewing Pete Cheslock and Pat Cable about the data infrastructure and security controls at ThreatStack

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Why don’t you start by explaining what ThreatStack does?
    • What was lacking in the existing options (services and self-hosted/open source) that ThreatStack solves for?
  • Can you describe the type(s) of data that you collect and how it is structured?
  • What is the high level data infrastructure that you use for ingesting, storing, and analyzing your customer data?
    • How do you ensure a consistent format of the information that you receive?
    • How do you ensure that the various pieces of your platform are deployed using the proper configurations and operating as intended?
    • How much configuration do you provide to the end user in terms of the captured data, such as sampling rate or additional context?
  • I understand that your original architecture used RabbitMQ as your ingest mechanism, which you then migrated to Kafka. What was your initial motivation for that change?
    • How much of a benefit has that been in terms of overall complexity and cost (both time and infrastructure)?
  • How do you ensure the security and provenance of the data that you collect as it traverses your infrastructure?
  • What are some of the most common vulnerabilities that you detect in your client’s infrastructure?
  • For someone who wants to start using ThreatStack, what does the setup process look like?
  • What have you found to be the most challenging aspects of building and managing the data processes in your environment?
  • What are some of the projects that you have planned to improve the capacity or capabilities of your infrastructure?

Contact Info
  • Pete Cheslock
    • @petecheslock on Twitter
    • Website
    • petecheslock on GitHub
  • Patrick Cable
    • @patcable on Twitter
    • Website
    • patcable on GitHub
  • ThreatStack
    • Website
    • @threatstack on Twitter
    • threatstack on GitHub

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
  • ThreatStack
  • SecDevOps
  • Sonian
  • EC2
  • Snort
  • Snorby
  • Suricata
  • Tripwire
  • Syscall (System Call)
  • AuditD
  • CloudTrail
  • Naxsi
  • Cloud Native
  • File Integrity Monitoring (FIM)
  • Amazon Web Services (AWS)
  • RabbitMQ
  • ZeroMQ
  • Kafka
  • Spark
  • Slack
  • PagerDuty
  • JSON
  • Microservices
  • Cassandra
  • Elasticsearch
  • Sensu
  • Service Discovery
  • Honeypot
  • Kubernetes
  • PostgreSQL
  • Druid
  • Flink
  • LaunchDarkly
  • Chef
  • Consul
  • Terraform
  • CloudFormation

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Data Engineering Podcast, by Tobias Macey