
Uber manages the car rides for millions of people. The Uber system must remain operational 24/7, and the app involves financial transactions and the safety of passengers.
Uber's infrastructure runs across thousands of server instances and produces terabytes of monitoring data. The monitoring data is used to understand the health of the software systems as well as relevant business metrics, such as driver efficiency, daily revenues, and user satisfaction.
Uber adopted the Prometheus monitoring system to manage their monitoring data. Prometheus regularly scrapes metrics across infrastructure to gather time series data about the state of everything across Uber. As the usage of Prometheus has grown within the company, Uber has had to figure out how to scale their monitoring platform.
M3 is a monitoring system built at Uber to scale Prometheus and provide a platform that can effectively scale both data storage and query serving. Rob Skillington is a staff software engineer at Uber, and he joins the show to talk about monitoring at Uber, from the requirements of the system to the implementation of M3.
At Uber, M3 powers dashboards, ad-hoc queries, and alerting. M3 was open sourced to give other users access to a scalable Prometheus solution. In a previous episode with Bryan Boreham, we discussed one strategy for scaling Prometheus. Today’s episode covers another scalability solution, with M3.
The post Uber’s Monitoring Platform with Rob Skillington appeared first on Software Engineering Daily.