TechSNAP

407: Old School Outages



Jim shares his Nagios tips and Wes chimes in with some modern tools as we chat monitoring in the wake of some high-profile outages.

Plus we turn our eye to hardware and get excited about the latest Ryzen line from AMD.

Links:

  • Third parties confirm AMD’s outstanding Ryzen 3000 numbers | Ars Technica — AMD debuted its new Ryzen 3000 desktop CPU line a few weeks ago at E3, and it looked fantastic. For the first time in 20 years, it looked like AMD could go head to head with Intel's desktop CPU line-up across the board. The question: would independent, third-party testing back up AMD's assertions?
  • The Internet broke today: Facebook, Verizon, and more see major outages | Ars Technica — Last week, Verizon caused a major BGP misroute that took large chunks of the Internet, including CDN company Cloudflare, partially down for a day. This week, the rest of the Internet has apparently asked Verizon to hold its beer.
  • It was a really bad month for the internet | TechCrunch — In the past month there were several major internet outages affecting millions of users across the world. Sites buckled, services broke, images wouldn’t load, direct messages ground to a halt and calendars and email were unavailable for hours at a time.
  • Cloudflare outage caused by bad software deploy (updated) — For about 30 minutes today, visitors to Cloudflare sites received 502 errors caused by a massive spike in CPU utilization on our network. This CPU spike was caused by a bad software deploy that was rolled back.
  • How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Today — Today at 10:30 UTC, the Internet had a small heart attack. A small company in Northern Pennsylvania became a preferred path of many Internet routes through Verizon (AS701), a major Internet transit provider.
  • Getting started | Prometheus — This guide is a "Hello World"-style tutorial which shows how to install, configure, and use Prometheus in a simple example setup.
  • prometheus/node_exporter — Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
  • Using netdata with Prometheus — Prometheus is a distributed monitoring system which offers a very simple setup along with a robust data model. Recently netdata added support for Prometheus.
  • prometheus/nagios_plugins — Nagios plugin for alerting on prometheus query results.
  • RobustPerception/nrpe_exporter — The NRPE exporter exposes metrics on commands sent to a running NRPE daemon.
  • m-lab/prometheus-nagios-exporter — The Prometheus Nagios exporter reads status and performance data from Nagios plugins via the MK Livestatus Nagios plugin and publishes this in a form that can be scraped by Prometheus.
  • Comparison to alternatives | Prometheus — Prometheus is a full monitoring and trending system that includes built-in and active scraping, storing, querying, graphing, and alerting based on time series data.
  • Quality server monitoring solution using NetData/Prometheus/Grafana — I’m going to quickly show you how to install both netdata and Prometheus on the client and server. We can then use grafana pointed at Prometheus to obtain long-term metrics netdata offers.
  • Monitoring stack by using Grafana + Prometheus + Netdata — With this monitoring stack, you can monitor in real time with Netdata and view the history using Grafana.
  • Monitoring Agent · NCPA — New to NCPA? See some of the awesome features present in the Web GUI and API, available on any operating system.
  • Nagios 101: Understanding the Fundamentals - Nagios
  • Nagios Documentation

TechSNAP, by Jupiter Broadcasting
