AWS CCP Notes

22-07 Operational Excellence Design Principles

Last updated Aug 17, 2023

# Perform operations as code

Apply the same engineering discipline to your cloud infrastructure that you apply to application code. By treating operations as code, you can limit human error and enable consistent responses to events.

Example

Infrastructure as Code (IaC)
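
As a rough illustration of IaC, the sketch below builds a CloudFormation-style template in Python and renders it as JSON, so the infrastructure definition can be reviewed and versioned like any other code. The logical ID `NotesBucket` and the helper names `build_template` and `render` are made up for this example; actually deploying the template is out of scope here.

```python
import json


def build_template(bucket_logical_id: str = "NotesBucket") -> dict:
    """Return a CloudFormation-style template describing one versioned S3 bucket.

    The logical ID is a hypothetical name chosen for this sketch.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: one versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


def render(template: dict) -> str:
    """Serialize deterministically so diffs in version control stay small."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render(build_template()))
```

Because the template is generated by code, the same input always produces the same output, which is the property that makes responses to events consistent.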


# Make frequent, small, reversible changes

Design workloads so that components can be updated regularly, and make changes in small increments that can be reversed if they fail.

Example

Rollbacks, incremental changes, blue/green deployments, CI/CD
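
The idea of a reversible change can be sketched with a toy blue/green switch. This is only an in-memory model, not a real load balancer API: `Router`, `deploy`, and the `health_check` callback are hypothetical names for this example. Traffic is pointed at the new environment, and if the health check fails the router is returned to the previous one.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Router:
    """Toy stand-in for a load balancer that points at one environment."""

    live: str = "blue"
    history: list = field(default_factory=list)

    def switch_to(self, env: str) -> None:
        # Remember the current environment so the change can be reversed.
        self.history.append(self.live)
        self.live = env

    def rollback(self) -> None:
        # Restore the most recent previous environment, if there is one.
        if self.history:
            self.live = self.history.pop()


def deploy(router: Router, new_env: str, health_check: Callable[[str], bool]) -> bool:
    """Switch traffic to new_env; roll back if the health check fails."""
    router.switch_to(new_env)
    if health_check(new_env):
        return True
    router.rollback()
    return False


if __name__ == "__main__":
    router = Router()
    succeeded = deploy(router, "green", lambda env: True)
    print(succeeded, router.live)
```

Keeping the old environment running until the new one is verified is what makes the change cheap to undo.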


# Refine operations procedures frequently

Continuously look for opportunities to improve your operations procedures.

Example

Use game days to simulate traffic or failure events on your production workloads.


# Anticipate failure

Perform pre-mortems to identify potential sources of failure before they occur, write test code for failure scenarios, and kill production servers to test recovery.
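
Killing servers to test recovery can be rehearsed on a toy model first. The sketch below is a pure simulation under stated assumptions: `Fleet` and `recovery_drill` are hypothetical names, and `heal` stands in for whatever mechanism replaces failed instances in a real workload. It assumes `desired` is at least 1.

```python
import random


class Fleet:
    """Toy model of a server group that replaces failed instances."""

    def __init__(self, desired: int) -> None:
        self.desired = desired
        self.next_id = 0
        self.servers = set()
        self.heal()

    def heal(self) -> int:
        """Launch replacements until the fleet is back at desired capacity."""
        launched = 0
        while len(self.servers) < self.desired:
            self.servers.add(self.next_id)
            self.next_id += 1
            launched += 1
        return launched

    def kill_random(self, rng: random.Random) -> int:
        """Remove one server chosen at random and return its id."""
        victim = rng.choice(sorted(self.servers))
        self.servers.remove(victim)
        return victim


def recovery_drill(fleet: Fleet, rounds: int, seed: int = 0) -> bool:
    """Kill one server per round and check that the fleet recovers each time."""
    rng = random.Random(seed)
    for _ in range(rounds):
        fleet.kill_random(rng)
        fleet.heal()
        if len(fleet.servers) != fleet.desired:
            return False
    return True


if __name__ == "__main__":
    print(recovery_drill(Fleet(desired=3), rounds=5))
```

A real drill would target actual instances and measure recovery time as well as capacity; the simulation only shows the shape of the test.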


# Learn from all operational failures

Share lessons learned from operational events and failures in a knowledge base that is available across your entire organization.