Error Budget formula - ✔️✔️Error Budget = 100% - SLO
error budget calculation - ✔️✔️SLO = 99.8%
100% - 99.8% = 0.2%
0.2% Error Budget
0.2% = 0.002
0.002 × 30 days/month × 24 hours/day × 60 minutes/hour = 86.4 minutes/month
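The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `error_budget_minutes` and the default 30-day month are assumptions chosen to match the worked example.

```python
def error_budget_minutes(slo_percent: float, days: int = 30) -> float:
    """Return the allowed downtime in minutes for a period of `days` days.

    Hypothetical helper: applies Error Budget = 100% - SLO, then converts
    that fraction of the period into minutes.
    """
    if not 0.0 <= slo_percent <= 100.0:
        raise ValueError("slo_percent must be between 0 and 100")
    budget_fraction = (100.0 - slo_percent) / 100.0  # e.g. 0.2% -> 0.002
    minutes_in_period = days * 24 * 60               # 30 days -> 43,200 minutes
    return budget_fraction * minutes_in_period


if __name__ == "__main__":
    # SLO of 99.8% over a 30-day month, as in the card above.
    print(f"{error_budget_minutes(99.8):.1f} minutes/month")  # prints 86.4 minutes/month
```

The result is formatted to one decimal place because floating-point subtraction leaves a tiny rounding error in the raw value.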
Availability formula - ✔️✔️Availability = Successful Requests / Total Requests
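As a small sketch of the availability formula, assuming request counts are already available from logs or metrics (the function name `availability` and the sample counts are illustrative, not from the notes):

```python
def availability(successful_requests: int, total_requests: int) -> float:
    """Availability = Successful Requests / Total Requests, as a fraction."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= successful_requests <= total_requests:
        raise ValueError("successful_requests must be between 0 and total_requests")
    return successful_requests / total_requests


if __name__ == "__main__":
    # Illustrative numbers: 9,980 successes out of 10,000 requests.
    print(f"{availability(9_980, 10_000):.1%}")  # prints 99.8%
```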
Why are error budgets good? - ✔️✔️They leave room for:
Releasing new features
Expected system changes
Inevitable failures in networks, etc.
Planned downtime
Risky experiments
Unforeseen circumstances
Toil - ✔️✔️Work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows.
Characteristics of Toil - ✔️✔️Manual: This includes running a script, which, although it saves time, must still be run by hand.
Repetitive: If a task is repeated multiple times, not just once or twice, then the work is toil.
Automatable: If the task can be done by a machine just as well as by a person, you can consider it toil.
Tactical: Toil, by its very nature, is not proactive or strategy-driven. Rather, it is reactive and interrupt-driven, e.g., pager alerts.
Devoid of enduring value: Tasks that permanently improve the service are not considered toil; work that leaves the service in the same state is.
Scales linearly as service grows: The best-designed service can grow by at least one order of magnitude without change; tasks that scale up with service size or traffic are toil.
Overhead - ✔️✔️Email, expense reports, commuting, meetings: activities that are not tied to running a production service
Toil Reduction Benefits - ✔️✔️• Increased engineering time
• Higher team morale, lower burnout
• Increased process standardization
• Enhanced team technical skills
• Fewer human error outages
• Shorter incident response times
3 Top Tips for Reducing Toil - ✔️✔️Identify toil, Estimate time to automate, Measure everything
White-Box monitoring - ✔️✔️Metrics exposed by the internals of
the system
• Focus on predicting problems
• Heavy use recommended
• Best for detecting imminent issues
Black-Box monitoring - ✔️✔️Testing externally visible behavior as
a user would see it
• Symptom-oriented, active problems
• Moderate use, for critical issues
• Best for paging of incidents
Metrics - ✔️✔️Numerical measurements representing attributes and events
Error Budget Burn Rate - ✔️✔️Error Budget Burn Rate = (100% - SLO) × (Events over set time)
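One way to read the formula above is as (100% - SLO) multiplied by the number of events in a window, which gives the number of failed events the budget allows in that window. The sketch below follows that reading. Both function names are hypothetical, and `budget_consumed` (observed failures divided by allowed failures) is an added assumption for illustration, not something stated in the notes.

```python
def allowed_failed_events(slo_percent: float, events_in_window: int) -> float:
    """(100% - SLO) x events over the window: failures the budget allows."""
    if not 0.0 <= slo_percent <= 100.0:
        raise ValueError("slo_percent must be between 0 and 100")
    return (100.0 - slo_percent) / 100.0 * events_in_window


def budget_consumed(failed_events: int, slo_percent: float, events_in_window: int) -> float:
    """Assumed helper: fraction of the window's error budget already spent."""
    allowed = allowed_failed_events(slo_percent, events_in_window)
    if allowed == 0:
        raise ValueError("an SLO of 100% or an empty window leaves no budget")
    return failed_events / allowed


if __name__ == "__main__":
    # Illustrative numbers: 1,000,000 requests in the window, SLO of 99.8%.
    print(round(allowed_failed_events(99.8, 1_000_000)))      # prints 2000
    print(f"{budget_consumed(500, 99.8, 1_000_000):.0%}")     # prints 25%
```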