c12s-kubespray/roles/etcd/templates/etcd.j2
Damian Nowak f8a59446e8 Enable OOM killing
When etcd exceeds its memory limit, it becomes useless but keeps running.
We should let OOM killer kill etcd process in the container, so systemd can spot
the problem and restart etcd according to "Restart" setting in etcd.service unit file.
If OOME problem keep repeating, i.e. it happens every single restart,
systemd will eventually back off and stop restarting it anyway.

--restart=on-failure:5 in this file has no effect because memory allocation error
doesn't by itself cause the process to die

Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2

This kind of reverts a change introduced in #1860.
2018-02-09 11:00:13 -06:00

22 lines
712 B
Django/Jinja

#!/bin/bash
{{ docker_bin_dir }}/docker run \
--restart=on-failure:5 \
--env-file=/etc/etcd.env \
--net=host \
-v /etc/ssl/certs:/etc/ssl/certs:ro \
-v {{ etcd_cert_dir }}:{{ etcd_cert_dir }}:ro \
-v {{ etcd_data_dir }}:{{ etcd_data_dir }}:rw \
{% if etcd_memory_limit is defined %}
--memory={{ etcd_memory_limit|regex_replace('Mi', 'M') }} \
{% endif %}
{% if etcd_cpu_limit is defined %}
--cpu-shares={{ etcd_cpu_limit|regex_replace('m', '') }} \
{% endif %}
{% if etcd_blkio_weight is defined %}
--blkio-weight={{ etcd_blkio_weight }} \
{% endif %}
--name={{ etcd_member_name | default("etcd") }} \
{{ etcd_image_repo }}:{{ etcd_image_tag }} \
/usr/local/bin/etcd \
"$@"