How to categorize this issue?
/area control-plane
/area performance
/area scalability
/kind enhancement
What would you like to be added:
Today etcd-druid deploys an etcd cluster in which each member uses a single SSD, shared between WAL and snapshot files. These SSDs come with IOPS limits. For clusters with heavy etcd read/write activity, etcd can slow down significantly, which in turn causes timeouts from kube-apiserver:
```
Trace[1428471700]: ---"Txn call failed" err:etcdserver: request timed out 7015ms (06:23:13.521)
E0729 06:23:13.532618 1 status.go:71] "Unhandled Error" err="apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0xe, desc:\"etcdserver: request timed out\"}: etcdserver: request timed out" logger="UnhandledError"
```
Details of one such occurrence can be seen in Live Issue #7539.
Upstream etcd recommends using a dedicated disk for the WAL (https://etcd.io/docs/v2.3/admin_guide/). Since an additional SSD comes at an additional cost, this should be made configurable via the Etcd resource, as sketched below.
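A minimal sketch of what such an API could look like. The `walVolume` field and its sub-fields are hypothetical and not part of the current Etcd API; `storageClass`, `storageCapacity`, and `spec.etcd` are existing fields, and the storage class names are cluster-specific placeholders:

```yaml
apiVersion: druid.gardener.cloud/v1alpha1
kind: Etcd
metadata:
  name: etcd-main
spec:
  replicas: 3
  # Existing volume for the data directory and snapshots.
  storageClass: gp3
  storageCapacity: 25Gi
  etcd:
    # Hypothetical, opt-in field: when set, etcd-druid would provision a
    # second PVC per member and place the WAL on it.
    walVolume:
      storageClass: premium-ssd   # assumed class name, cluster-specific
      storageCapacity: 8Gi
```

Keeping the field optional would preserve the default single-disk layout, and its cost, for clusters that do not need a dedicated WAL disk.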
Why is this needed:
etcd clusters rely heavily on extremely fast SSDs, and their response times are sensitive to disk performance. For large or busy etcd clusters, IOPS can easily exceed the limits of the SSD used. To prevent timeouts from the kube-apiserver, which result in outages, it is essential to provide an option for individual etcd members to use multiple SSDs.
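For illustration, at the pod level this would amount to mounting a second PVC and pointing etcd at it. The volume names and mount paths below are illustrative only; `--data-dir` and `--wal-dir` are real upstream etcd flags:

```yaml
# Excerpt of a member pod spec (names are illustrative).
containers:
- name: etcd
  command:
  - etcd
  - --data-dir=/var/etcd/data
  # Real upstream flag; places the WAL on the dedicated disk.
  - --wal-dir=/var/etcd/wal
  volumeMounts:
  - name: etcd-data        # existing SSD for data directory and snapshots
    mountPath: /var/etcd/data
  - name: etcd-wal         # hypothetical second SSD dedicated to the WAL
    mountPath: /var/etcd/wal
```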