* Run 'container-engine' after drain.
Move possibly disruptive role 'container-engine' to run after the node
is drained.
As that role have to be run on non-cluster nodes as well (etcd and
calico-rr), and those nodes are not drained, add play for that case.
* Check if api is up before upgrade.
If container engine is restarted in previous role, api controller can
take some time to start. This check ensures api is up before upgrade.
When kube-router is used as cni, rules might be added to the mangle table
to support external IPs. Therefore, mangle table should be flushed during
reset as well.
This 38688a4486 change replaces the
value for dockerproject_.+_repo_.+ docker variables but their new
value was previously defined in other variables. This change removes
the dockerproject_.+_repo_.+ docker variables in favor of the older
ones.
* Fix incorrect assertion comparison for kube_network_node_prefix
* Ignore assertion comparison for kube_network_node_prefix when using calico
* Adding more var docs description for kube_network_node_prefix
* Fixing trailing whitespaces
* External OpenStack Cloud Controller Manager implementation
* Adding controller image tag
* Minor fixes
* Restructuring the external cloud controller to work with KubeADM
* Added in code to allow control over pull policy for local path provisioner
* change to imagePullPolicy to use globally used variable k8s_image_pull_policy
* removed unusued variable from defaults
* updated contiv-etcd and cinder-csi-controllerplugin to use k8s_image_pull_policy variable
* Introduce kubelet_config_extra_args and kubelet_node_config_extra_args to pass params to kubelet via YAML config
* kubelet_config_extra_args is not the alternative
* Fix recover-control-plane to work with etcd 3.3.x and add CI
* Set default values for testcase
* Add actual test jobs
* Attempt to satisty gitlab ci linter
* Fix ansible targets
* Set etcd_member_name as stated in the docs...
* Recovering from 0 masters is not supported yet
* Add other master to broken_kube-master group as well
* Increase number of retries to see if etcd needs more time to heal
* Make number of retries for ETCD loops configurable, increase it for recovery CI and document it
* containerd: add proxy support
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubespray-defaults: add kube_service_addresses / kube_pods_subnet to no_proxy
CIDR notation in no_proxy is supported by a lot of programs/languages,
including go: https://github.com/golang/go/issues/16704
Without that containerd cannot talk the the API server (kube_apiserver_ip),
but it should not go through an external proxy for the nodes/pods/services
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Change dockerproject.org to download.docker.com
dockerproject.org was deprecated in 2017 and has gone down.
* Restore yum repo for containerd
Change-Id: I883bb512a2164a85865b1bd4fb569af0358c8c2b
Co-authored-by: Craig Rodrigues <rodrigc@crodrigues.org>
When running with serial != 100%, like upgrade_cluster.yml, we need to apply this fixup each time
Problem was introduced in 05dc2b3a09
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Raises limit from 100 to 300 because the default is far too low
and the pod can handle 300 with the given resources.
Change-Id: Ib1eec10da3d09d198933fcfe87291587e58d7cdb
I've tested this update by deploying a containerd / etcd cluster on top CentOS7,
MetalLB + NGINX Ingress. Upgrade using upgrade-cluster.yml
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Resolves issue where kubectl cache of <v1.16 api schema
interferes with interacting with daemonsets and deployments.
Change-Id: I63b7046958f2008eb144b6da0004c598f945e0ae
* Fix crictl
* Reload systemd daemon before enabling service
* Typo
* Add crictl template
* Remove seccomp.json for ubuntu
* Set runtime path of runc for ubuntu
* Change path to conmon
There is no cri-tools package in CentOS/EPEL/Red Hat.
Additionally, cri-tools is provided into the installation via
roles/download/defaults/main.yml:104:crictl_download_url.
* Fix python3-libselinux installation for RHEL/CentOS 8
In bootstrap-centos.yml we haven't gathered the facts,
so #5127 couldn't work
Minimum ansible version to run kubespray is 2.7.8,
so ansible_distribution_major_version is defined an there is no need to default it
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Restart NetworkManager for RHEL/CentOS 8
network.service doesn't exist anymore
# systemctl status network
Unit network.service could not be found.
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Add module_hotfixes=True to docker / containerd yum repo config
https://bugzilla.redhat.com/show_bug.cgi?id=1734081https://bugzilla.redhat.com/show_bug.cgi?id=1756473
Without this setting you end up with the following error:
# yum install docker-ce
Failed to set locale, defaulting to C
Last metadata expiration check: 0:03:21 ago on Thu Sep 26 22:00:05 2019.
Error:
Problem: package docker-ce-3:19.03.2-3.el7.x86_64 requires containerd.io >= 1.2.2-3, but none of the providers can be installed
- cannot install the best candidate for the job
- package containerd.io-1.2.2-3.3.el7.x86_64 is excluded
- package containerd.io-1.2.2-3.el7.x86_64 is excluded
- package containerd.io-1.2.4-3.1.el7.x86_64 is excluded
- package containerd.io-1.2.5-3.1.el7.x86_64 is excluded
- package containerd.io-1.2.6-3.3.el7.x86_64 is excluded
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Initially this was to fix a mis-indented approvers key. However, it turns
out that 'oilbeater' is not a member of kubernetes-sigs nor
kubernetes-incubator (the org this repo was migrated from). Thus this
OWNERS file is failing prow's validation check.
As a workaround I've opted to move them to emeritus_approver, which
isn't valiated and can be used as a hint for other approvers in this
repo
This fixes the scenario where masters are upgraded one at a time
and coredns gets improperly scaled back up to 2 replicas.
Change-Id: I7cc9283f40efcfd61b5813c89a5805c95d901567
Kubespray Pull Request #5084 (https://github.com/kubernetes-sigs/kubespray/pull/5084) caused more problems than it solved due to limitations with the synchronize module. See comments on Kubespray Issues #5059 (https://github.com/kubernetes-sigs/kubespray/issues/5059) and #5116 (https://github.com/kubernetes-sigs/kubespray/issues/5116). Details from Ansible documentation: "Currently, synchronize is limited to elevating permissions via passwordless sudo. This is because rsync itself is connecting to the remote machine and rsync doesn’t give us a way to pass sudo credentials in. ... Currently there are only a few connection types which support synchronize (ssh, paramiko, local, and docker) because a sync strategy has been determined for those connection types. Note that the connection for these must not need a password as rsync itself is making the connection and rsync does not provide us a way to pass a password to the connection. ..." Thus, reverting Pull Request #5084.
* Add support for Kubernetes 1.16.1
* Defaults to 1.16.1
* add 1.16.2 checksums and set new version as default
* correct 1.16.2 checksums and add 1.15.5 checksums
When using cluster.yml or scale.yml to add/scale nodes in the existing
k8s cluster, the `kubeadm init` wouldn't run. As a result, kube-proxy
wouldn't be created, and therefore the kube-proxy deletion task would
fail, e.g. in the case where kube-router is used and "kube_proxy_remove"
is set to true. As a workaround, add ignore_errors to the kube-proxy
deletion task.
The script is not usable unless you are in the '.vagrant/provisioners/ansible/inventory/artifacts' folder.
This update makes this usable from anywhere.
- do not run etcd role when etcd_kubeadm_enabled == true
- remove default value 'systemd' for cgroup driver in containerd role.
this value override autodetect in kubelet_cgroup_driver_detected from docker info
This allows to easily override the gcr, quay, and docker repos with the
mirror repos in countries like China, where the default accesses are
blocked or unstable.
mydict.keys() should be converted to list,
otherwise it causes errors in loop iteration.
Remove extra space after class name, which broke configmap.
Also allow set reclaimPolicy property.
Cleaned up deprecated APIs:
apps/v1beta1
apps/v1beta2
extensions/v1beta1 for ds,deploy,rs
Add workaround for deploying helm using incompatible
deployment manifest.
Change-Id: I78b36741348f47a999df3841ee63cf4e6f377830