c12s-kubespray/roles/recover_control_plane/master/tasks/main.yml
Commit ac2135e450 by qvicksilver, 2020-02-11 01:38:01 -08:00:
Fix recover-control-plane to work with etcd 3.3.x and add CI (#5500)
* Fix recover-control-plane to work with etcd 3.3.x and add CI

* Set default values for testcase

* Add actual test jobs

* Attempt to satisfy gitlab ci linter

* Fix ansible targets

* Set etcd_member_name as stated in the docs...

* Recovering from 0 masters is not supported yet

* Add other master to broken_kube-master group as well (see the inventory sketch after this commit message)

* Increase number of retries to see if etcd needs more time to heal

* Make number of retries for ETCD loops configurable, increase it for recovery CI and document it (see the retry sketch after the tasks below)
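The inventory changes these bullets refer to are not part of this file. As a rough illustration only, a broken master might be declared like this; the host names (master-1 through master-3) and the etcd1-style member names are hypothetical, while broken_kube-master is the group the tasks below iterate over:

# Hypothetical inventory sketch (YAML inventory format); names are illustrative.
all:
  hosts:
    master-1:
      etcd_member_name: etcd1
    master-2:
      etcd_member_name: etcd2
    master-3:
      etcd_member_name: etcd3
  children:
    kube-master:
      hosts:
        master-1:
        master-2:
        master-3:
    broken_kube-master:
      hosts:
        master-3: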


---
# Wait until a surviving apiserver answers before touching the cluster.
- name: Wait for apiserver
  shell: "{{ bin_dir }}/kubectl get nodes"
  environment:
    - KUBECONFIG: "{{ ansible_env.HOME | default('/root') }}/.kube/config"
  register: apiserver_is_ready
  until: apiserver_is_ready.rc == 0
  retries: 6
  delay: 10
  changed_when: false
  when: groups['broken_kube-master']

# Remove each broken master's Node object; failures are recorded rather than
# fatal, so the next task can decide which ones actually matter.
- name: Delete broken kube-master nodes from cluster
  shell: "{{ bin_dir }}/kubectl delete node {{ item }}"
  environment:
    - KUBECONFIG: "{{ ansible_env.HOME | default('/root') }}/.kube/config"
  with_items: "{{ groups['broken_kube-master'] }}"
  register: delete_broken_kube_masters
  failed_when: false
  when: groups['broken_kube-master']

# A NotFound error just means the node is already gone; anything else fails.
- name: Fail if unable to delete broken kube-master nodes from cluster
  fail:
    msg: "Unable to delete broken kube-master node: {{ item.item }}"
  loop: "{{ delete_broken_kube_masters.results }}"
  changed_when: false
  when:
    - groups['broken_kube-master']
    - "item.rc != 0 and not 'NotFound' in item.stderr"