
Today we talk with Matvey Kukuy and Tal Borenstein, co-founders of Keep, a startup focused on helping companies manage and make sense of their alert systems. The discussion comes three years after Matvey's previous appearance - https://shipit.show/36 - where he talked about Grafana Labs' acquisition of his previous startup Amixr (now Grafana OnCall).
Keep tackles a significant challenge in modern tech infrastructure: managing the overwhelming volume of alerts that companies receive from their various monitoring systems. Some enterprises deal with up to 70,000 alerts daily, making it crucial to identify which ones represent actual incidents requiring attention.
We explore real-world examples of major incidents, including the CrowdStrike outage in July 2024, which caused widespread system crashes and resulted in an estimated $10 billion in damages worldwide. This incident highlighted how critical it is to quickly identify and respond to serious issues among numerous alerts. Matvey also tells us about the biggest black swan event he has experienced.
The episode concludes with a hint that some of Keep's AI features may eventually be released as open source once they're sufficiently polished.
LINKS
EPISODE CHAPTERS