runbooks   17

Some notes on running new software in production
This is really good -- how to approach new infrastructure/software dependencies in production with reliability and uptime in mind.

(via Tony Finch)
reliability  uptime  slas  kubernetes  envoy  outages  runbooks  ops 
4 weeks ago by jm
Automatron
Pretty much Nagios + IFTTT. A way to automate runbooks.
devops  runbooks 
june 2018 by tobym
Dashboards for DevOps: Examples of What to Measure / 18 DevOps Leaders to Follow Online / DevOps in Practice at Dealertrack: 11 Things That Help Build a Delivery Culture
An inside look at how we approach DevOps internally at New Relic, including the key practices and technology tools we use and the benefits we’ve seen.

#developers #devops #engineers #operations
devops  styleguide    documentation  bestpractices  runbooks  automation 
june 2017 by michaelfox
Things I Learned Managing Site Reliability for Some of the World’s Busiest Gambling Sites
Solid article proselytising runbooks/playbooks (or in this article's parlance, "Incident Models") for dev/ops handover and operational knowledge
ops  process  sre  devops  runbooks  playbooks  incident-models 
april 2017 by jm
Annotated tenets of SRE
A Google SRE annotates the Google SRE book with his own thoughts. The source material is great, but the commentary improves it alright.

Particularly good for the error budget concept.

Also: when did "runbooks" become "playbooks"? Don't particularly care either way, but needless renaming is annoying.
runbooks  playbooks  ops  google  sre  error-budget  via:jm 
march 2017 by micktwomey
Annotated tenets of SRE
A Google SRE annotates the Google SRE book with his own thoughts. The source material is great, but the commentary improves it alright.

Particularly good for the error budget concept.

Also: when did "runbooks" become "playbooks"? Don't particularly care either way, but needless renaming is annoying.
runbooks  playbooks  ops  google  sre  error-budget 
march 2017 by jm
Runbooks are stupid and you’re doing them wrong | @jgoldschrafe
(Those of you following the DevOps movement: look up that video about Etsy’s dashboards. The best and brightest ops people these days are keeping an eye on business metrics, like sales figures or numbers of code deployments, rather than low-level system metrics.)

runbooks  sysadmin  process 
october 2016 by pdaukintis
Introducing Winston
'Event driven Diagnostic and Remediation Platform' -- aka 'runbooks as code'
runbooks  winston  netflix  remediation  outages  mttr  ops  devops 
august 2016 by jm
Chandler Hub Runbook
a pretty good example of a well-worked-out runbook, this one for OSAF's Chandler
chandler  runbooks  ops  sysadmin  operations  procedures 
november 2007 by jmason
Runbook - Wikipedia, the free encyclopedia
'a routine compilation of the procedures and operations being made by the administrator or operator of the system. Typically, it will contain the procedures to begin, stop and supervise the system.' Good to know there's a name for these files ;)
runbooks  systems  configuration  sysadmin  procedures  upgrading  ops 
november 2007 by jmason

