Jim shares his Nagios tips and Wes chimes in with some modern tools as we chat monitoring in the wake of some high-profile outages.
Plus we turn our eye to hardware and get excited about the latest Ryzen line from AMD.
Links:
- Third parties confirm AMD’s outstanding Ryzen 3000 numbers | Ars Technica — AMD debuted its new Ryzen 3000 desktop CPU line a few weeks ago at E3, and it looked fantastic. For the first time in 20 years, it looked like AMD could go head to head with Intel's desktop CPU line-up across the board. The question: would independent, third-party testing back up AMD's assertions?
- The Internet broke today: Facebook, Verizon, and more see major outages | Ars Technica — Last week, Verizon caused a major BGP misroute that took large chunks of the Internet, including CDN company Cloudflare, partially down for a day. This week, the rest of the Internet has apparently asked Verizon to hold its beer.
- It was a really bad month for the internet | TechCrunch — In the past month there were several major internet outages affecting millions of users across the world. Sites buckled, services broke, images wouldn’t load, direct messages ground to a halt and calendars and email were unavailable for hours at a time.
- Cloudflare outage caused by bad software deploy (updated) — For about 30 minutes today, visitors to Cloudflare sites received 502 errors caused by a massive spike in CPU utilization on our network. This CPU spike was caused by a bad software deploy that was rolled back.
- How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Today — Today at 10:30 UTC, the Internet had a small heart attack. A small company in Northern Pennsylvania became a preferred path of many Internet routes through Verizon (AS701), a major Internet transit provider.
- Getting started | Prometheus — This guide is a "Hello World"-style tutorial which shows how to install, configure, and use Prometheus in a simple example setup.
- prometheus/node_exporter — Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
- Using netdata with Prometheus — Prometheus is a distributed monitoring system which offers a very simple setup along with a robust data model. Recently netdata added support for Prometheus.
- prometheus/nagios_plugins — Nagios plugin for alerting on Prometheus query results.
- RobustPerception/nrpe_exporter — The NRPE exporter exposes metrics on commands sent to a running NRPE daemon.
- m-lab/prometheus-nagios-exporter — The Prometheus Nagios exporter reads status and performance data from Nagios plugins via the MK Livestatus Nagios plugin and publishes this in a form that can be scraped by Prometheus.
- Comparison to alternatives | Prometheus — Prometheus is a full monitoring and trending system that includes built-in and active scraping, storing, querying, graphing, and alerting based on time series data.
- Quality server monitoring solution using NetData/Prometheus/Grafana — I’m going to quickly show you how to install both netdata and Prometheus on the client and server. We can then use Grafana pointed at Prometheus to obtain the long-term metrics netdata offers.
- Monitoring stack by using Grafana + Prometheus + Netdata — With this monitoring stack you can monitor in real time with Netdata and see the history by using Grafana.
- Monitoring Agent · NCPA — New to NCPA? See some of the awesome features present in the Web GUI and API, available on any operating system.
- Nagios 101: Understanding the Fundamentals - Nagios
- Nagios Documentation