Services
I’m offering both coaching and hands-on services to help you get up and running with SRE ways of working, observability & incident response.
Managing Monitoring Mayhem
- monitoring & alerting review to identify antipatterns
- propose changes to the setup to improve signal-to-noise ratio
- workshop on how to implement changes to get results ASAP
product pre-requisites:
- monitoring & alerting deployed (e.g. Prometheus, Grafana)
Incident Commander Class
- workshop focused on commanding incidents
- how to be successful as an Incident Commander and how to help others succeed
- how to build confidence & excellence in handling incidents
- how to build an incident response process
Practical Service Level Monitoring
- practical workshop on SLI/SLO/SLA
- how to start building SLOs and put them to actual use
- how to build your alerting around SLOs
product pre-requisites:
- monitoring & alerting deployed (e.g. Prometheus, Grafana)
- alert routing system (e.g. OpsGenie, PagerDuty)
Game Day Coaching
- let’s set up a guided game day (planned practice incident) for your team
- learn what to focus on as a manager during a game day
- learn what to keep an eye out for and how to use the game day to improve your incident response procedures
product pre-requisites:
- monitoring & alerting deployed (e.g. Prometheus, Grafana)
- alert routing system (e.g. OpsGenie, PagerDuty)
- (ideally) Managing Monitoring Mayhem and Practical Service Level Monitoring
Guided Virtual Incident
- an easy way to start practicing incident response
- we’ll use a game to simulate an incident and practice for the real thing
- games are included in the cost of the training
product pre-requisites:
- none