Site reliability engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems.
SRE is "what happens when a software engineer is tasked with what used to be called operations."
软件工程有的时候和养孩子类似:虽然生育的过程是痛苦和困难的,但是养育孩子成人的过程才是真正需要花费绝大部分精力的地方。 一个软件系统的 40%~90% 的花销其实是花在开发建设完成之后不断维护过程中的。
| |
| |
| |
| |
| |
|
| |
| |
| |
| |
| |
|
豆瓣 SRE
https://book.douban.com/subject/26875239/
wikipedia: Site reliability engineering
https://en.wikipedia.org/wiki/Site_reliability_engineering
wikipedia: Controllability
https://en.wikipedia.org/wiki/Controllability
wikipedia: Observability
https://en.wikipedia.org/wiki/Observability
site: google sre
https://sre.google/