Compare commits

...

14 commits

Author SHA1 Message Date
Kenichi Omichi eeeca4a1d0
[2.17] Update kubernetes version to 1.21.6 (#8142) 2021-11-02 01:32:58 -07:00
Sébastien Masset 7e296b1523
Fixed default DNS min replica for single node clusters (#8109) 2021-10-26 23:59:25 -07:00
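The fix caps the default replica count at the cluster size via the Jinja `min` filter, as the dns_min_replicas diff below shows; for illustration:
```yml
# single-node cluster: [2, 1] | min -> 1 replica
# five-node cluster:   [2, 5] | min -> 2 replicas
dns_min_replicas: "{{ [ 2, groups['k8s_cluster'] | length ] | min }}"
```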
Utku Özdemir 488fbd8a37
Implement drain fallback with --disable-eviction to ignore PDBs (#8102)
Signed-off-by: Utku Ozdemir <uoz@protonmail.com>
2021-10-21 06:14:09 -07:00
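The fallback is opt-in; a minimal inventory sketch using the defaults introduced in the drain diff at the end of this comparison (values illustrative):
```yml
drain_fallback_enabled: true        # retry a failed drain with --disable-eviction, bypassing PDBs
drain_fallback_grace_period: 300
drain_fallback_timeout: 360s
drain_fallback_retries: 0
drain_fallback_retry_delay_seconds: 10
```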
Cristian Calin f7242d39b9
Calico: increase calico node probe timeouts and allow tuning (#7981) (#8103) 2021-10-21 05:06:10 -07:00
Mathieu Parent 87fee0cccf
[2.17] Fix containerd failed to start if apparmor is not installed (#8042)
* Ensure apparmor is installed (#8011)

Kubespray deployment failed when using the containerd backend on nodes where apparmor was not installed or had been removed. This PR ensures apparmor is installed by adding it to the required_pkgs var.

(cherry picked from commit 4bace2491d)

* Ensure apparmor is installed (#8036)

Kubespray deployment failed when using the containerd backend on nodes where apparmor was not installed or had been removed. This PR ensures apparmor is installed by adding it to the required_pkgs var.

(cherry picked from commit af04906b51)

Co-authored-by: rtsp <git@rtsp.us>
2021-10-01 10:00:24 -07:00
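After the fix, the Debian-family package lists end with apparmor; a sketch based on the required_pkgs diffs below (leading entries elided in the hunks):
```yml
required_pkgs:
  - apt-transport-https
  - software-properties-common
  - conntrack
  - apparmor
```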
Kenichi Omichi 45018ac077
Check if openstack application credentials are empty since they always exist (#8021) (#8038)
Co-authored-by: Hugo Blom <bl0m1@users.noreply.github.com>
2021-09-30 08:02:08 -07:00
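The template guard switched from a definedness check to an emptiness check, presumably because the credential variables default to empty strings and so are always defined; a sketch under that assumption:
```yml
# assumed inventory defaults: defined but empty
external_openstack_application_credential_id: ""
external_openstack_application_credential_name: ""
# the template below now emits username/password only when both are ""
```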
Kenichi Omichi 9fafe9849b
Add proxy for subscription-manager (#8012) (#8039)
If using a proxy, it is necessary to configure it before running the
"subscription-manager status" command.
This adds that step.
2021-09-30 02:20:08 -07:00
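The added task (see the diff below) derives the proxy host and port from `http_proxy` with two `regex_replace` filters; a hedged, self-contained illustration:
```yml
- name: Show how the proxy host and port are derived  # illustration only
  debug:
    msg: >-
      host={{ http_proxy | regex_replace(':\\d+$') }}
      port={{ http_proxy | regex_replace('^.*:') }}
  vars:
    http_proxy: "proxy.example.com:8080"  # assumed example value
```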
Kenichi Omichi 3b2b618cd2
check if 'plugins' key exists in calico_cni_config object (#7717) (#8040)
* check if 'plugins' key exists in calico_cni_config object

* fix whitespace linting error

* fixed when list indentation

Co-authored-by: David Louks <2402775+dlouks@users.noreply.github.com>
2021-09-30 02:12:07 -07:00
Kenichi Omichi bf1bb5984b
Use kube_config_dir for kubeconfig (#7996) (#8037)
The path of the kubeconfig should be configurable, and its default value
is /etc/kubernetes/admin.conf. Most references to the file were configurable
but some were not. This makes all of them configurable.
2021-09-30 02:08:08 -07:00
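With this change, overriding the directory once in the inventory propagates to the token, drain, and uncordon tasks shown later in this comparison; a sketch:
```yml
kube_config_dir: /etc/kubernetes   # the default; tasks now build "{{ kube_config_dir }}/admin.conf"
```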
Kenichi Omichi 04a8a19ce6
Issue 8004: Fix typha prometheus (#8005) (#8035)
The typha prometheus settings were in the `volumeMounts` section of the
spec and not in the `env` section. This was causing the deployment to
fail because the fields were validated as a volumeMount.

```
failed: [controller-001.a2.da.dev.logdna.net] (item=calico-typha.yml) => {"ansible_loop_var": "item", "changed": false, "item": {"ansible_loop_var": "item", "changed": true, "checksum": "598ac79530749e8e2110793b53fc49ac208e7130", "dest": "/etc/kubernetes/calico-typha.yml", "diff": [], "failed": false, "gid": 0, "group": "root", "invocation": {"module_args": {"_original_basename": "calico-typha.yml.j2", "attributes": null, "backup": false, "checksum": "598ac79530749e8e2110793b53fc49ac208e7130", "content": null, "delimiter": null, "dest": "/etc/kubernetes/calico-typha.yml", "directory_mode": null, "follow": false, "force": true, "group": null, "local_follow": null, "mode": null, "owner": null, "regexp": null, "remote_src": null, "selevel": null, "serole": null, "setype": null, "seuser": null, "src": "/home/core/.ansible/tmp/ansible-tmp-1632349768.56-75434-32452975679246/source", "unsafe_writes": null, "validate": null}}, "item": {"file": "calico-typha.yml", "name": "calico", "type": "typha"}, "md5sum": "53c00ac7f562cf9ecbbfd27899ea066d", "mode": "0644", "owner": "root", "size": 5378, "src": "/home/core/.ansible/tmp/ansible-tmp-1632349768.56-75434-32452975679246/source", "state": "file", "uid": 0}, "msg": "error running kubectl (/opt/bin/kubectl --namespace=kube-system apply --force --filename=/etc/kubernetes/calico-typha.yml) command (rc=1), out='service/calico-typha unchanged\n', err='error: error validating \"/etc/kubernetes/calico-typha.yml\": error validating data: [ValidationError(Deployment.spec.template.spec.containers[0].volumeMounts[2]): unknown field \"value\" in io.k8s.api.core.v1.VolumeMount, ValidationError(Deployment.spec.template.spec.containers[0].volumeMounts[2]): missing required field \"mountPath\" in io.k8s.api.core.v1.VolumeMount, ValidationError(Deployment.spec.template.spec.containers[0].volumeMounts[3]): unknown field \"value\" in io.k8s.api.core.v1.VolumeMount, ValidationError(Deployment.spec.template.spec.containers[0].volumeMounts[3]): missing required field \"mountPath\" in io.k8s.api.core.v1.VolumeMount]; if you choose to ignore these errors, turn validation off with --validate=false\n'"}
```

Co-authored-by: Eric Lake <ericlake@gmail.com>
2021-09-29 10:22:49 -07:00
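For reference, the corrected manifest renders the two Typha settings under `env` rather than `volumeMounts`; a sketch of the fragment (port value assumed, templated from `typha_prometheusmetricsport`):
```yml
env:
  - name: TYPHA_PROMETHEUSMETRICSENABLED
    value: "true"
  - name: TYPHA_PROMETHEUSMETRICSPORT
    value: "9093"  # assumed example port
```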
Kenichi Omichi ae1fb69382
Fix cilium operator metrics activation (#8000) (#8033)
This is a cherry-pick of 598f178054

Co-authored-by: Léopold Jacquot <leopold.jacquot@infomaniak.com>
2021-09-29 01:32:49 -07:00
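Per the ConfigMap diff below, the operator metrics now piggyback on the existing Prometheus flag:
```yml
cilium_enable_prometheus: true
# renders into the cilium ConfigMap:
#   prometheus-serve-addr: ":9090"
#   operator-prometheus-serve-addr: ":6942"
#   enable-metrics: "true"
```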
Kenichi Omichi dfee7a8ec5
Fix k8s-certs-renew cp path (#7992) (#8032)
This is a cherry-pick of 2211504790

Signed-off-by: Wang Zhen <lazybetrayer@gmail.com>

Co-authored-by: Wang Zhen <lazybetrayer@gmail.com>
2021-09-29 01:28:48 -07:00
Kenichi Omichi bd4407199c
Add metrics_server_resizer option (#8018) (#8031)
The addon-resizer container can reduce the cpu and memory resource
limits of the metrics-server container in the pod, and that caused
OOMKills.
In addition, the original metrics-server manifest doesn't contain
the addon-resizer container, as shown in [1].
So this adds a metrics_server_resizer option to control the addon-resizer
container deployment; the default value is false to keep it stable
for most environments.

This is a cherry-pick of 8d3961edbe

[1]: 527679e5e8/manifests/base/deployment.yaml
2021-09-28 11:15:16 -07:00
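A sketch of opting back in to the sidecar with the new flag (names from the addons diff below):
```yml
metrics_server_enabled: true
metrics_server_resizer: true   # default false; when false the nanny container is not rendered
```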
Kenichi Omichi 6cfa3bbb22
Remove allowPrivilegeEscalation from metrics-server (#8014) (#8025)
"allowPrivilegeEscalation: false" blocks deploying metrics-server
on CentOS7. In addition, the original metrics-server manifest doesn't
contain it as [1]. This removes it.

[1]: 527679e5e8/manifests/base/deployment.yaml
2021-09-27 23:54:43 -07:00
26 changed files with 107 additions and 26 deletions

View file

@@ -130,7 +130,7 @@ Note: Upstart/SysV init based OS types are not supported.
## Supported Components
- Core
- [kubernetes](https://github.com/kubernetes/kubernetes) v1.21.5
- [kubernetes](https://github.com/kubernetes/kubernetes) v1.21.6
- [etcd](https://github.com/coreos/etcd) v3.4.13
- [docker](https://www.docker.com/) v20.10 (see note)
- [containerd](https://containerd.io/) v1.4.9

View file

@@ -189,7 +189,7 @@ To re-define default action please set the following variable in your inventory:
calico_endpoint_to_host_action: "ACCEPT"
```
## Optional : Define address on which Felix will respond to health requests
### Optional : Define address on which Felix will respond to health requests
Since Calico 3.2.0, HealthCheck default behavior changed from listening on all interfaces to just listening on localhost.
@@ -199,6 +199,15 @@ To re-define health host please set the following variable in your inventory:
calico_healthhost: "0.0.0.0"
```
### Optional : Configure Calico Node probe timeouts
Under certain conditions a deployer may need to tune the Calico liveness and readiness probe timeout settings. These can be configured like this:
```yml
calico_node_livenessprobe_timeout: 10
calico_node_readinessprobe_timeout: 10
```
## Config encapsulation for cross server traffic
Calico supports two types of encapsulation: [VXLAN and IP in IP](https://docs.projectcalico.org/v3.11/networking/vxlan-ipip). VXLAN is supported in some environments where IP in IP is not (for example, Azure).

View file

@@ -14,6 +14,7 @@ registry_enabled: false
# Metrics Server deployment
metrics_server_enabled: false
# metrics_server_resizer: false
# metrics_server_kubelet_insecure_tls: true
# metrics_server_metric_resolution: 15s
# metrics_server_kubelet_preferred_address_types: "InternalIP"

View file

@@ -17,7 +17,7 @@ kube_token_dir: "{{ kube_config_dir }}/tokens"
kube_api_anonymous_auth: true
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.21.5
kube_version: v1.21.6
# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)

View file

@@ -103,3 +103,7 @@
# Enable calico traffic encryption with wireguard
# calico_wireguard_enabled: false
# Under certain situations liveness and readiness probes may need tuning
# calico_node_livenessprobe_timeout: 10
# calico_node_readinessprobe_timeout: 10

View file

@@ -16,6 +16,13 @@
become: true
when: not skip_http_proxy_on_os_packages
- name: Add proxy to RHEL subscription-manager if http_proxy is defined
command: /sbin/subscription-manager config --server.proxy_hostname={{ http_proxy | regex_replace(':\\d+$') }} --server.proxy_port={{ http_proxy | regex_replace('^.*:') }}
become: true
when:
- not skip_http_proxy_on_os_packages
- http_proxy is defined
- name: Check RHEL subscription-manager status
command: /sbin/subscription-manager status
register: rh_subscription_status

View file

@@ -143,6 +143,7 @@ kubelet_checksums:
v1.22.2: 941e639b0f859eba65df0c66be82808ea6be697ed5dbf4df8e602dcbfa683aa3
v1.22.1: f42bc00f274be7ce0578b359cbccc48ead03894b599f5bf4d10e44c305fbab65
v1.22.0: 4354dc8db1d8ca336eb940dd73adcd3cf17cbdefbf11889602420f6ee9c6c4bb
v1.21.6: 20571caa4edcab5c17c448099cff74f0c0c54087c91888a23fc59407b8836127
v1.21.5: 9130b8b5677fc82b8292f115996370311021ebec404b9be01ff572b187efd45d
v1.21.4: b3ca234719d75df246f5f3ae2426cb2a2659fcb2f42bae15ed2017f29b911e4d
v1.21.3: 7375096bf6985ca3df94285bc69216b827ccabbc459b738984318df904679958
@@ -181,6 +182,7 @@ kubelet_checksums:
v1.22.2: f5fe3d6f4b2df5a794ebf325dc17fcdfe905a188e25f7c7e47d9cd15f14f8c2d
v1.22.1: d5ffd67d8285fb224a1c49622fd739131f7b941e3d68f233dec96e72c9ebee63
v1.22.0: cea637a7da4f1097b16b0195005351c07032a820a3d64c3ff326b9097cfac930
v1.21.6: 041441623c31bc6b0295342b8a2a5930d87545473e7c761ea79f3ff186c0ff52
v1.21.5: 746a535956db55807ef71772d2a4afec5cc438233da23952167ec0aec6fe937b
v1.21.4: 12c849ccc627e9404187adf432a922b895c8bdecfd7ca901e1928396558eb043
v1.21.3: 5d21da1145c25181605b9ad0810401545262fc421bbaae683bdb599632e834c1
@@ -219,6 +221,7 @@ kubelet_checksums:
v1.22.2: 0fd6572e24e3bebbfd6b2a7cb7adced41dad4a828ef324a83f04b46378a8cb24
v1.22.1: 2079780ad2ff993affc9b8e1a378bf5ee759bf87fdc446e6a892a0bbd7353683
v1.22.0: fec5c596f7f815f17f5d7d955e9707df1ef02a2ca5e788b223651f83376feb7f
v1.21.6: 422c29a1ba3bfeb2fc26ebd1c3596847fbbeeeef0ce2694515504513dc907813
v1.21.5: 600f70fe0e69151b9d8ac65ec195bcc840687f86ba397fce27be1faae3538a6f
v1.21.4: cdd46617d1a501531c62421de3754d65f30ad24d75beae2693688993a12bb557
v1.21.3: 5bd542d656caabd75e59757a3adbae3e13d63c7c7c113d2a72475574c3c640fe
@@ -258,6 +261,7 @@ kubectl_checksums:
v1.22.2: a16f7d70e65589d2dbd5d4f2115f6ccd4f089fe17a2961c286b809ad94eb052a
v1.22.1: 50991ec4313ee42da03d60e21b90bc15e3252c97db189d1b66aad5bbb555997b
v1.22.0: 6d7c787416a148acffd49746837df4cebb1311c652483dc3d2c8d24ce1cc897e
v1.21.6: 9100bc13498f770a5a1524665a9dc2470d3a15518e53aba68c700f10f3def978
v1.21.5: 51955c2fec47b83c904004fedde970b6c8f37a7a5f3c2910b6dd63b99fa697e5
v1.21.4: bb741dae49b17b7784dc2460467c876e9f961c14f628de7553d023cdef85b1ac
v1.21.3: 603b6e57c5546c079faee6b606014e83b95ea076146fbf73329f3069968f83bf
@@ -296,6 +300,7 @@ kubectl_checksums:
v1.22.2: c5bcc7e5321d34ac42c4635ad4f6fe8bd4698e9c879dc3367be542a0b301297b
v1.22.1: 5c7ef1e505c35a8dc0b708f6b6ecdad6723875bb85554e9f9c3fe591e030ae5c
v1.22.0: 8d9cc92dcc942f5ea2b2fc93c4934875d9e0e8ddecbde24c7d4c4e092cfc7afc
v1.21.6: a193997181cdfa00be0420ac6e7f4cfbf6cedd6967259c5fda1d558fa9f4efe0
v1.21.5: fca8de7e55b55cceab9902aae03837fb2f1e72b97aa09b2ac9626bdbfd0466e4
v1.21.4: 8ac78de847118c94e2d87844e9b974556dfb30aff0e0d15fd03b82681df3ac98
v1.21.3: 2be58b5266faeeb93f38fa72d36add13a950643d2ae16a131f48f5a21c66ef23
@@ -334,6 +339,7 @@ kubectl_checksums:
v1.22.2: aeca0018958c1cae0bf2f36f566315e52f87bdab38b440df349cd091e9f13f36
v1.22.1: 78178a8337fc6c76780f60541fca7199f0f1a2e9c41806bded280a4a5ef665c9
v1.22.0: 703e70d49b82271535bc66bc7bd469a58c11d47f188889bd37101c9772f14fa1
v1.21.6: 810eadc2673e0fab7044f88904853e8f3f58a4134867370bf0ccd62c19889eaa
v1.21.5: 060ede75550c63bdc84e14fcc4c8ab3017f7ffc032fc4cac3bf20d274fab1be4
v1.21.4: 9410572396fb31e49d088f9816beaebad7420c7686697578691be1651d3bf85a
v1.21.3: 631246194fc1931cb897d61e1d542ef2321ec97adcb859a405d3b285ad9dd3d6
@@ -373,6 +379,7 @@ kubeadm_checksums:
v1.22.2: 6ccc26494160e19468b0cb55d56b2d5c62d21424fac79cb66402224c2bf73a0d
v1.22.1: cc08281c5261e860df9a0b5040b8aa2e6d202a243daf25556f5f6d3fd8f2e1e9
v1.22.0: 6a002deb0ee191001d5c0e0435e9a995d70aa376d55075c5f61e70ce198433b8
v1.21.6: 02951dae946dd5588ccda71b6e28f0d91adf7a94b57792b412635fcce7099d74
v1.21.5: 39c98582b0a2444e7d6bc85dc5eac5217aee5dd18c2de7e1d5aed09415023201
v1.21.4: f1ff5765439624c162489e4f037d12d9f8adf96c04cb298c06aeb7217d620349
v1.21.3: 25eac1922276a0b4aabda92df67882be25a2462e84245f4231f5a888a8ab8bae
@@ -411,6 +418,7 @@ kubeadm_checksums:
v1.22.2: 77b4c6a56ae0ec142f54a6f5044a7167cdd7193612b04b77bf433ffe1d1918ef
v1.22.1: 85df7978b2e5bb78064ed0bcce14a39d105a1a3968bb92ee5d2f96a1fa09ed12
v1.22.0: 9fc14b993de2c275b54445255d7770bd1d6cdb49f4cf9c227c5b035f658a2351
v1.21.6: 498325da2521ce67b27902967daf4087153c5797070e03bf0bdd7c846f4d61a8
v1.21.5: 5a273b023eaa60d7820436b0f0062c4bd467274d6f2b86a9e13270c91d663618
v1.21.4: 30645f57296281d214a9dd787a90bd16207df4b1fca7ac320913c616818a92cd
v1.21.3: 5bff1c6cd1d683ce191d271b968d7b776ae5ed7403bdab5fa88446100e74972c
@@ -449,6 +457,7 @@ kubeadm_checksums:
v1.22.2: 4ff09d3cd2118ee2670bc96ed034620a9a1ea6a69ef38804363d4710a2f90d8c
v1.22.1: 50a5f0d186d7aefae309539e9cc7d530ef1a9b45ce690801655c2bee722d978c
v1.22.0: 90a48b92a57ff6aef63ff409e2feda0713ca926b2cd243fe7e88a84c483456cc
v1.21.6: fef4b40acd982da99294be07932eabedd476113ce5dc38bb9149522e32dada6d
v1.21.5: e384171fcb3c0de924904007bfd7babb0f970997b93223ed7ffee14d29019353
v1.21.4: 286794aed41148e82a77087d79111052ea894796c6ae81fc463275dcd848f98d
v1.21.3: 82fff4fc0cdb1110150596ab14a3ddcd3dbe53f40c404917d2e9703f8f04787a

View file

@@ -3,7 +3,7 @@
dns_memory_limit: 170Mi
dns_cpu_requests: 100m
dns_memory_requests: 70Mi
dns_min_replicas: 2
dns_min_replicas: "{{ [ 2, groups['k8s_cluster'] | length ] | min }}"
dns_nodes_per_replica: 16
dns_cores_per_replica: 256
dns_prevent_single_point_failure: "{{ 'true' if dns_min_replicas|int > 1 else 'false' }}"

View file

@@ -1,6 +1,6 @@
[Global]
auth-url="{{ external_openstack_auth_url }}"
{% if external_openstack_application_credential_id is not defined and external_openstack_application_credential_name is not defined %}
{% if external_openstack_application_credential_id == "" and external_openstack_application_credential_name == "" %}
username="{{ external_openstack_username }}"
password="{{ external_openstack_password }}"
{% endif %}

View file

@@ -1,4 +1,5 @@
---
metrics_server_resizer: false
metrics_server_kubelet_insecure_tls: true
metrics_server_kubelet_preferred_address_types: "InternalIP"
metrics_server_metric_resolution: 15s

View file

@@ -67,7 +67,6 @@ spec:
failureThreshold: 3
initialDelaySeconds: 40
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["all"]
add: ["NET_BIND_SERVICE"]
@@ -82,6 +81,7 @@ spec:
requests:
cpu: {{ metrics_server_requests_cpu }}
memory: {{ metrics_server_requests_memory }}
{% if metrics_server_resizer %}
- name: metrics-server-nanny
image: {{ addon_resizer_image_repo }}:{{ addon_resizer_image_tag }}
imagePullPolicy: {{ k8s_image_pull_policy }}
@@ -119,6 +119,7 @@ spec:
# Specifies the smallest cluster (defined in number of nodes)
# resources will be scaled to.
- --minClusterSize={{ metrics_server_min_cluster_size }}
{% endif %}
volumes:
- name: metrics-server-config-volume
configMap:

View file

@@ -150,8 +150,8 @@
- name: Create hardcoded kubeadm token for joining nodes with 24h expiration (if defined)
shell: >-
{{ bin_dir }}/kubeadm --kubeconfig /etc/kubernetes/admin.conf token delete {{ kubeadm_token }} || :;
{{ bin_dir }}/kubeadm --kubeconfig /etc/kubernetes/admin.conf token create {{ kubeadm_token }}
{{ bin_dir }}/kubeadm --kubeconfig {{ kube_config_dir }}/admin.conf token delete {{ kubeadm_token }} || :;
{{ bin_dir }}/kubeadm --kubeconfig {{ kube_config_dir }}/admin.conf token create {{ kubeadm_token }}
changed_when: false
when:
- inventory_hostname == groups['kube_control_plane']|first
@@ -161,7 +161,7 @@
- kubeadm_token
- name: Create kubeadm token for joining nodes with 24h expiration (default)
command: "{{ bin_dir }}/kubeadm --kubeconfig /etc/kubernetes/admin.conf token create"
command: "{{ bin_dir }}/kubeadm --kubeconfig {{ kube_config_dir }}/admin.conf token create"
changed_when: false
register: temp_token
retries: 5

View file

@@ -62,7 +62,7 @@
- name: kubeadm | scale down coredns replicas to 0 if not using coredns dns_mode
command: >-
{{ bin_dir }}/kubectl
--kubeconfig /etc/kubernetes/admin.conf
--kubeconfig {{ kube_config_dir }}/admin.conf
-n kube-system
scale deployment/coredns --replicas 0
register: scale_down_coredns

View file

@@ -14,7 +14,7 @@ echo "## Restarting control plane pods managed by kubeadm ##"
{% endif %}
echo "## Updating /root/.kube/config ##"
/usr/bin/cp {{ kube_config_dir }}/admin.conf /root/.kube/config
cp {{ kube_config_dir }}/admin.conf /root/.kube/config
echo "## Waiting for apiserver to be up again ##"
until printf "" 2>>/dev/null >>/dev/tcp/127.0.0.1/6443; do sleep 1; done

View file

@@ -6,3 +6,4 @@ required_pkgs:
- software-properties-common
- conntrack
- iptables
- apparmor

View file

@@ -5,3 +5,4 @@ required_pkgs:
- apt-transport-https
- software-properties-common
- conntrack
- apparmor

View file

@@ -5,3 +5,4 @@ required_pkgs:
- apt-transport-https
- software-properties-common
- conntrack
- apparmor

View file

@@ -15,7 +15,7 @@ is_fedora_coreos: false
disable_swap: true
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.21.3
kube_version: v1.21.6
## The minimum version working
kube_version_min_required: v1.19.0

View file

@@ -12,7 +12,9 @@
- name: Set fact calico_datastore to etcd if needed
set_fact:
calico_datastore: etcd
when: "'etcd_endpoints' in calico_cni_config.plugins.0"
when:
- "'plugins' in calico_cni_config"
- "'etcd_endpoints' in calico_cni_config.plugins.0"
when: calico_cni_config_slurp.content is defined
- name: Calico | Get kubelet hostname

View file

@@ -305,6 +305,7 @@ spec:
{% endif %}
periodSeconds: 10
initialDelaySeconds: 10
timeoutSeconds: {{ calico_node_livenessprobe_timeout | default(10) }}
failureThreshold: 6
readinessProbe:
exec:
@@ -315,6 +316,7 @@ spec:
{% endif %}
- -felix-ready
periodSeconds: 10
timeoutSeconds: {{ calico_node_readinessprobe_timeout | default(10) }}
failureThreshold: 6
volumeMounts:
- mountPath: /lib/modules

View file

@@ -108,14 +108,6 @@ spec:
value: /etc/typha/server_certificate.pem
- name: TYPHA_SERVERKEYFILE
value: /etc/typha/server_key.pem
volumeMounts:
- mountPath: /etc/typha
name: typha-server
readOnly: true
- mountPath: /etc/ca/ca.crt
subPath: ca.crt
name: cacert
readOnly: true
{% endif %}
{% if typha_prometheusmetricsenabled %}
# Since Typha is host-networked,
@@ -124,6 +116,16 @@ spec:
value: "true"
- name: TYPHA_PROMETHEUSMETRICSPORT
value: "{{ typha_prometheusmetricsport }}"
{% endif %}
{% if typha_secure %}
volumeMounts:
- mountPath: /etc/typha
name: typha-server
readOnly: true
- mountPath: /etc/ca/ca.crt
subPath: ca.crt
name: cacert
readOnly: true
{% endif %}
# Needed for version >=3.7 when the 'host-local' ipam is used
# Should never happen given templates/cni-calico.conflist.j2

View file

@@ -38,6 +38,8 @@ data:
# scheduled.
{% if cilium_enable_prometheus %}
prometheus-serve-addr: ":9090"
operator-prometheus-serve-addr: ":6942"
enable-metrics: "true"
{% endif %}
# If you want to run cilium in debug mode change this value to true

View file

@@ -9,7 +9,7 @@
- name: remove-node | Drain node except daemonsets resource # noqa 301
command: >-
{{ bin_dir }}/kubectl --kubeconfig /etc/kubernetes/admin.conf drain
{{ bin_dir }}/kubectl --kubeconfig {{ kube_config_dir }}/admin.conf drain
--force
--ignore-daemonsets
--grace-period {{ drain_grace_period }}

View file

@@ -1,6 +1,6 @@
---
- name: Uncordon node
command: "{{ bin_dir }}/kubectl --kubeconfig /etc/kubernetes/admin.conf uncordon {{ kube_override_hostname|default(inventory_hostname) }}"
command: "{{ bin_dir }}/kubectl --kubeconfig {{ kube_config_dir }}/admin.conf uncordon {{ kube_override_hostname|default(inventory_hostname) }}"
delegate_to: "{{ groups['kube_control_plane'][0] }}"
when:
- needs_cordoning|default(false)

View file

@@ -6,6 +6,12 @@ drain_nodes: true
drain_retries: 3
drain_retry_delay_seconds: 10
drain_fallback_enabled: false
drain_fallback_grace_period: 300
drain_fallback_timeout: 360s
drain_fallback_retries: 0
drain_fallback_retry_delay_seconds: 10
upgrade_node_always_cordon: false
upgrade_node_uncordon_after_drain_failure: true
upgrade_node_fail_if_drain_fails: true

View file

@@ -73,18 +73,50 @@
{{ bin_dir }}/kubectl drain
--force
--ignore-daemonsets
--grace-period {{ drain_grace_period }}
--timeout {{ drain_timeout }}
--grace-period {{ hostvars['localhost']['drain_grace_period_after_failure'] | default(drain_grace_period) }}
--timeout {{ hostvars['localhost']['drain_timeout_after_failure'] | default(drain_timeout) }}
--delete-local-data {{ kube_override_hostname|default(inventory_hostname) }}
{% if drain_pod_selector %}--pod-selector '{{ drain_pod_selector }}'{% endif %}
when: drain_nodes
register: result
failed_when:
- result.rc != 0
- not drain_fallback_enabled
until: result.rc == 0
retries: "{{ drain_retries }}"
delay: "{{ drain_retry_delay_seconds }}"
- name: Drain fallback
block:
- name: Set facts after regular drain has failed
set_fact:
drain_grace_period_after_failure: "{{ drain_fallback_grace_period }}"
drain_timeout_after_failure: "{{ drain_fallback_timeout }}"
delegate_to: localhost
delegate_facts: yes
run_once: yes
- name: Drain node - fallback with disabled eviction
command: >-
{{ bin_dir }}/kubectl drain
--force
--ignore-daemonsets
--grace-period {{ drain_fallback_grace_period }}
--timeout {{ drain_fallback_timeout }}
--delete-local-data {{ kube_override_hostname|default(inventory_hostname) }}
{% if drain_pod_selector %}--pod-selector '{{ drain_pod_selector }}'{% endif %}
--disable-eviction
register: drain_fallback_result
until: drain_fallback_result.rc == 0
retries: "{{ drain_fallback_retries }}"
delay: "{{ drain_fallback_retry_delay_seconds }}"
when:
- drain_nodes
- drain_fallback_enabled
- result.rc != 0
rescue:
- name: Set node back to schedulable
command: "{{ bin_dir }}/kubectl --kubeconfig /etc/kubernetes/admin.conf uncordon {{ inventory_hostname }}"
command: "{{ bin_dir }}/kubectl --kubeconfig {{ kube_config_dir }}/admin.conf uncordon {{ inventory_hostname }}"
when: upgrade_node_uncordon_after_drain_failure
- name: Fail after rescue
fail: