Es-operator: Building an Operator From the Bottom Up
1. Es-operator
Building an Operator
From the Bottom Up
MIKKEL LARSEN
@mikkeloscar
2019-05-21
2. $ whoami
Mikkel Larsen
Software Engineer
Cloud Infrastructure (Kubernetes/AWS)
@ Zalando SE
@mikkeloscar
3. “EUROPE’S LEADING ONLINE FASHION PLATFORM”
4. WE BRING FASHION TO PEOPLE IN 17 COUNTRIES
17 markets
7 fulfillment centers
26 million active customers
5.4 billion € revenue 2018
250 million visits per month
15,000 employees in Europe
5. KUBERNETES @ ZALANDO
● Default deployment target
● ~125 clusters
● ~1400 nodes
● Node autoscaling
● Since Oct 2016
● From v1.4 to v1.13
6. SEARCH @ ZALANDO
● 300k+ products per country
● 45% mobile traffic
● ~2000 brands
● ~700 categories
● 12K QPS
● 8K updates/s
7.
8. WORKLOAD
EC2 (~200 instances)
K8S
9. RUNNING ELASTICSEARCH IN KUBERNETES
1. Safe automatic updates
(Including Kubernetes cluster updates)
2. Advanced auto-scaling for cost efficiency
10. UPDATING ELASTICSEARCH (STATEFULSET)
1) PreStop Hook (bash script)
● Exclude node in ES
● Wait for node to drain (up to 1h)
● Data is moved to existing nodes
2) PostStart Hook (bash script)
● Remove all excludes
● Let ES rebalance from existing nodes
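The slides do not show the hook scripts themselves. As a rough sketch of the two Elasticsearch calls such hooks would likely rely on, the Python below builds the body for `PUT /_cluster/settings` using the `cluster.routing.allocation.exclude._ip` setting, and checks a `GET /_cat/shards?format=json` style listing to see whether a node is drained. The function names and the example IP are hypothetical, and the HTTP calls are left out so the example runs without a cluster.

```python
import json

# Real Elasticsearch cluster setting for allocation filtering by IP.
EXCLUDE_KEY = "cluster.routing.allocation.exclude._ip"


def exclude_payload(excluded_ips):
    """Body for PUT /_cluster/settings excluding the given IPs from shard allocation."""
    # An empty set yields None (JSON null), which resets the setting.
    value = ",".join(sorted(excluded_ips)) or None
    return {"transient": {EXCLUDE_KEY: value}}


def is_drained(pod_ip, shards):
    """True when no shard in a /_cat/shards?format=json style listing lives on pod_ip."""
    return all(shard.get("ip") != pod_ip for shard in shards)


# PreStop: exclude the node, then poll is_drained() until it holds no shards.
print(json.dumps(exclude_payload({"10.2.0.7"})))
# PostStart: remove all excludes so ES can rebalance.
print(json.dumps(exclude_payload(set())))
```

Polling `is_drained` with a timeout corresponds to the "wait for node to drain (up to 1h)" step.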
[Diagram: ES Pods on two nodes, before the update (both ready) and during it (one node draining, its ES Pod terminating)]
11. OPERATOR PATTERN
coreos.com/blog/introducing-operators.html
12. v0: MANAGE STATEFULSET
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-cluster
  annotations:
    es-operator/desired-replicas: "3"
spec:
  updateStrategy:
    type: OnDelete
  replicas: 2
  template: # PodTemplate
    {...}
● Complicated to update without changing replicas.
● State must be stored in annotations.
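To illustrate the second point, here is a minimal sketch of what reading state back from an annotation looks like. The annotation key comes from the manifest on this slide; the helper name and the fallback to `spec.replicas` are assumptions made for illustration, not the operator's actual behavior.

```python
ANNOTATION = "es-operator/desired-replicas"


def desired_replicas(statefulset):
    """Read the desired replica count from the annotation.

    Falls back to spec.replicas when the annotation is missing
    (an assumption for this sketch).
    """
    annotations = statefulset.get("metadata", {}).get("annotations", {})
    raw = annotations.get(ANNOTATION)
    if raw is None:
        return statefulset["spec"]["replicas"]
    return int(raw)  # annotation values are always strings


sts = {
    "metadata": {"name": "test-cluster", "annotations": {ANNOTATION: "3"}},
    "spec": {"replicas": 2},
}
print(desired_replicas(sts))
```

Every piece of state kept this way has to be serialized to a string and parsed again, which is part of what makes the approach awkward.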
13. v1: ELASTICSEARCH DATA SETS
apiVersion: zalando.org/v1
kind: ElasticsearchDataSet
metadata:
  name: test-cluster
spec:
  scaling:
    {...}
  replicas: 3
  template: # PodTemplate
    {...}
  volumeClaimTemplates:
    {...}
github.com/zalando-incubator/es-operator
14. ELASTICSEARCH DATA SETS
[Diagram: the ES Operator managing an ES Cluster made up of three ES Master pods and nine ES Data pods]
15. UPDATING ELASTICSEARCH (OPERATOR)
1) Scale out by 1
2) Drain node
3) Delete Pod
[Diagram: the ES Operator acting on ES Pods behind the ES Service; pods are shown as ready or draining]
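The three steps on this slide can be sketched as a small simulation over an in-memory pod list. This is an illustration only: the pod naming, the single scale-out at the start, and the assumption that a deleted pod comes back on the new version are not taken from the slides.

```python
def rolling_update(pods, new_version):
    """Simulate the update; return (updated pods, ordered list of actions)."""
    pods = [dict(p) for p in pods]  # work on copies, leave the input untouched
    actions = []

    # 1) Scale out by 1 so the cluster has spare capacity while draining.
    spare = f"es-data-{len(pods)}"
    pods.append({"name": spare, "version": new_version})
    actions.append(("scale_out", spare))

    for pod in pods:
        if pod["version"] != new_version:
            actions.append(("drain", pod["name"]))   # 2) move shards off the pod
            actions.append(("delete", pod["name"]))  # 3) delete the pod
            pod["version"] = new_version             # assumed: recreated from the new template
    return pods, actions


old = [{"name": "es-data-0", "version": "v1"}, {"name": "es-data-1", "version": "v1"}]
updated, steps = rolling_update(old, "v2")
for step in steps:
    print(step)
```

Because draining happens before deletion, no pod is removed while it still holds data.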
16. SCALING UP ELASTICSEARCH (1)
METRICS
Thresholds
● CPU
● Duration
● Cooldown
Boundaries
● Max # Pod replicas
● Min # Shards per node
Increase pod replicas
[Diagram: as pod replicas increase, the same shards are spread over more ES Pods: one pod with 6 shards, then two pods with 3 shards each, then three pods with 2 shards each, then six pods]
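One way to read this slide is as a decision function: scale up only when CPU has stayed above the threshold for the whole observation window and no cooldown is active, then add just enough pods to lower the shards-per-node ratio by one, within the two boundaries. The sketch below follows that reading; the function name, parameters, and the "lower the ratio by one" rule are assumptions, not the operator's confirmed algorithm.

```python
import math


def scale_up_pods(pods, total_shards, cpu_samples, cpu_threshold,
                  max_pods, min_shards_per_node, in_cooldown=False):
    """Return the new pod replica count, or `pods` when no scale-up applies."""
    if in_cooldown:
        return pods
    # Threshold + duration: every sample in the window must exceed the threshold.
    if not cpu_samples or min(cpu_samples) <= cpu_threshold:
        return pods
    # Aim for one shard fewer per node than the current ratio.
    target = math.ceil(total_shards / pods) - 1
    if target < max(min_shards_per_node, 1):
        return pods  # boundary: min shards per node reached
    wanted = math.ceil(total_shards / target)
    return max(pods, min(wanted, max_pods))  # boundary: max pod replicas


pods = 1
for _ in range(3):
    pods = scale_up_pods(pods, total_shards=6, cpu_samples=[80, 90],
                         cpu_threshold=50, max_pods=10, min_shards_per_node=1)
    print(pods)
```

With 6 shards this rule steps through 2, 3, and 6 pods, which matches the progression drawn on the slide.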
17. SCALING DOWN ELASTICSEARCH
METRICS
Thresholds
● CPU
● Duration
● Cooldown
Boundaries
● Min # Replica
● Max # Shards per node
● Max disk usage (%)
DON'T OPERATE WHEN CLUSTER IS NOT GREEN!
Decrease Pod replicas
[Diagram: as pod replicas decrease, shards are packed onto fewer ES Pods: six pods, then three pods with 2 shards each, then two pods with 3 shards each, then one pod with 6 shards]
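The scale-down rules on this slide can be sketched the same way. The function below refuses to act unless the cluster is green, CPU has stayed below the threshold for the whole window, and disk usage is under its limit; it then removes just enough pods to raise the shards-per-node ratio, within the boundaries. All names and the exact stepping rule are assumptions made for illustration.

```python
import math


def scale_down_pods(pods, total_shards, cpu_samples, cpu_threshold,
                    min_pods, max_shards_per_node,
                    disk_usage_percent, max_disk_usage_percent,
                    cluster_status):
    """Return the new pod replica count, or `pods` when no scale-down applies."""
    if cluster_status != "green":
        return pods  # don't operate when the cluster is not green
    # Threshold + duration: every sample in the window must be below the threshold.
    if not cpu_samples or max(cpu_samples) >= cpu_threshold:
        return pods
    if disk_usage_percent >= max_disk_usage_percent:
        return pods  # boundary: max disk usage
    current_ratio = math.ceil(total_shards / pods)
    # Try the next higher shards-per-node ratios until one needs fewer pods.
    for target in range(current_ratio + 1, max_shards_per_node + 1):
        wanted = math.ceil(total_shards / target)
        if wanted < pods:
            return min(pods, max(wanted, min_pods))  # boundary: min replicas
    return pods  # boundary: max shards per node reached


pods = 6
for _ in range(3):
    pods = scale_down_pods(pods, 6, [10, 12], 30, 1, 6, 40, 75, "green")
    print(pods)
```

With 6 shards this steps down through 3, 2, and 1 pods, mirroring the scale-up sequence in reverse.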
18. SCALING UP ELASTICSEARCH (2)
METRICS
Thresholds
● CPU
● Duration
● Cooldown
Boundaries
● Min # Shards per node
● Max # Pod replicas
Increase index replicas
[Diagram: once pods cannot hold fewer shards each, index replicas are increased so that additional ES Pods have shards to serve]
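A possible reading of this second scale-up mode: once the shards-per-node ratio has hit its minimum, adding pods alone no longer helps, so the number of index replicas is raised to create more shard copies to spread out. The sketch below chooses between the two actions; the function name, the return shape, and the sizing rule are hypothetical, and it assumes `min_shards_per_node` is at least 1.

```python
import math


def plan_scale_up(pods, primaries, index_replicas, min_shards_per_node, max_pods):
    """Pick the next scale-up action (assumes min_shards_per_node >= 1)."""
    total = primaries * (1 + index_replicas)
    shards_per_node = math.ceil(total / pods)
    if shards_per_node > min_shards_per_node:
        # Still room to spread the existing shards over more pods.
        return {
            "action": "increase_pod_replicas",
            "index_replicas": index_replicas,
            "pods": min(math.ceil(total / (shards_per_node - 1)), max_pods),
        }
    # Ratio is at its minimum: add one index replica and size the pods to match.
    new_total = primaries * (2 + index_replicas)
    return {
        "action": "increase_index_replicas",
        "index_replicas": index_replicas + 1,
        "pods": min(math.ceil(new_total / shards_per_node), max_pods),
    }


print(plan_scale_up(pods=3, primaries=3, index_replicas=0,
                    min_shards_per_node=1, max_pods=10))
```

Raising index replicas adds read capacity only, since every replica holds a full copy of its primary shard.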
19. SCALING IN PRODUCTION (7d)
20. SCALING IN PRODUCTION (24h)
21. LESSONS LEARNED / TAKEAWAYS
● Turn those bash scripts into an operator!
● Assume Operator can die at any point.
● Start simple, add abstractions only when needed.
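The second takeaway is often handled by deriving each step from observed state rather than from in-memory progress, so a restarted operator picks up where the previous one stopped. A minimal sketch of that idea follows, with a hypothetical `observed` structure and action names that are not taken from the es-operator code.

```python
def next_action(observed):
    """Derive the next step from observed state only, never from in-memory progress."""
    if observed["replicas"] < observed["desired_replicas"]:
        return "scale_out"
    for pod in observed["pods"]:
        if pod["outdated"] and pod["shards"] > 0:
            return f"drain {pod['name']}"
        if pod["outdated"]:
            return f"delete {pod['name']}"
    return "nothing"


state = {
    "replicas": 3,
    "desired_replicas": 3,
    "pods": [
        {"name": "es-data-0", "outdated": True, "shards": 0},
        {"name": "es-data-1", "outdated": True, "shards": 2},
    ],
}
# A freshly restarted operator sees the same state and picks the same step.
print(next_action(state))
```

Since the function keeps no memory between calls, dying halfway through a drain costs nothing beyond repeating the same decision.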
22. OPEN SOURCE
Elasticsearch Operator
github.com/zalando-incubator/es-operator
Kubernetes on AWS
github.com/zalando-incubator/kubernetes-on-aws
Postgres Operator
github.com/zalando/postgres-operator
Kubernetes Operator Pythonic Framework (Kopf)
github.com/zalando-incubator/kopf
23. THANK YOU!
MIKKEL LARSEN
mikkel.larsen@zalando.de
@mikkeloscar
2019-05-21