f8a59446e8
When etcd exceeds its memory limit, it becomes useless but keeps running. We should let the OOM killer kill the etcd process in the container, so that systemd can spot the problem and restart etcd according to the "Restart" setting in the etcd.service unit file. If the out-of-memory problem keeps repeating, i.e. it happens on every single restart, systemd will eventually back off and stop restarting it anyway.

--restart=on-failure:5 in this file has no effect, because a memory allocation error does not by itself cause the process to die.

Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2

This more or less reverts a change introduced in #1860.
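As a rough illustration of why letting the OOM killer act makes the failure visible: a process killed by the kernel's OOM killer receives SIGKILL (signal 9), and shells conventionally report that as exit status 128 + 9 = 137. The sketch below uses a hypothetical helper named `describe_exit` (not part of this repository) to classify an exit status. It assumes the supervising process actually sees the exit status of the etcd container.

```shell
#!/bin/bash
# Hypothetical helper for illustration only; not part of the original script.
# Classifies a process exit status using the shell convention that
# statuses above 128 mean "terminated by signal (status - 128)".
describe_exit() {
  local status=$1
  if [ "$status" -eq 0 ]; then
    echo "clean exit"
  elif [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "failed with status $status"
  fi
}

# Example: 137 = 128 + 9, i.e. SIGKILL, the signal sent by the OOM killer.
describe_exit 137
```

A non-zero status like this is what lets a supervisor treat the run as failed, whereas a process that merely stalls after a failed allocation never produces one.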
21 lines
712 B
Django/Jinja
#!/bin/bash
{{ docker_bin_dir }}/docker run \
  --restart=on-failure:5 \
  --env-file=/etc/etcd.env \
  --net=host \
  -v /etc/ssl/certs:/etc/ssl/certs:ro \
  -v {{ etcd_cert_dir }}:{{ etcd_cert_dir }}:ro \
  -v {{ etcd_data_dir }}:{{ etcd_data_dir }}:rw \
{% if etcd_memory_limit is defined %}
  --memory={{ etcd_memory_limit|regex_replace('Mi', 'M') }} \
{% endif %}
{% if etcd_cpu_limit is defined %}
  --cpu-shares={{ etcd_cpu_limit|regex_replace('m', '') }} \
{% endif %}
{% if etcd_blkio_weight is defined %}
  --blkio-weight={{ etcd_blkio_weight }} \
{% endif %}
  --name={{ etcd_member_name | default("etcd") }} \
  {{ etcd_image_repo }}:{{ etcd_image_tag }} \
  /usr/local/bin/etcd \
  "$@"
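The two `regex_replace` filters in the template rewrite the limit values before they are passed to `docker run`: `Mi` becomes `M` for `--memory`, and the `m` suffix is dropped for `--cpu-shares`. The sketch below mimics those substitutions in plain shell with `sed`. The function names `to_docker_memory` and `to_cpu_shares` and the sample values `512Mi` and `300m` are assumptions made for illustration; they do not appear in the repository.

```shell
#!/bin/bash
# Illustrative equivalents of the template's regex_replace filters.
# Function names and sample inputs are hypothetical.

# Mirrors: etcd_memory_limit|regex_replace('Mi', 'M')
to_docker_memory() {
  printf '%s\n' "$1" | sed 's/Mi/M/g'
}

# Mirrors: etcd_cpu_limit|regex_replace('m', '')
to_cpu_shares() {
  printf '%s\n' "$1" | sed 's/m//g'
}

to_docker_memory 512Mi   # prints 512M
to_cpu_shares 300m       # prints 300
```

Both substitutions are purely textual: the second passes the numeric part of the CPU value straight through as a share weight without rescaling it.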