Commit graph

1349 commits

Author SHA1 Message Date
Etienne Champetier a35b6dc1af
Fix scaling (#5889)
* etcd: etcd-events doesn't depend on etcd_cluster_setup

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: remove condition already present on include_tasks

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: fix scaling up

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: use *access_addresses, do not delegate to etcd[0]

We want to wait for the full cluster to be healthy,
so use all the cluster addresses
Also we should be able to run the playbook when etcd[0] is down
(not tested), so do not delegate to etcd[0]

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: use failed_when for health check

unhealthy cluster is expected on first run, so use failed_when
instead of ignore_errors to remove scary red messages

Also use run_once

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* kubernetes/preinstall: ensure ansible_fqdn is up to date after changing /etc/hosts

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* kubernetes/master: regenerate apiserver cert if needed

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-04-08 01:27:43 -07:00
spaced 0c51352a74
remove unused kubelet options (#5903) 2020-04-07 11:51:44 -07:00
Vinayaka V Ladwa f8ad44a99f
Azure vmss - kubelet: failed to get instance ID from cloud provider: instance not found #5824 (#5855)
* kubernetes-sigs-kubespray #5824

Added support nodes which are part of Virtual Machine Scale Sets(VMSS)

* kubernetes-sigs-kubespray #5824

* kubernetes-sigs-kubespray #5824

Added comments and updatetd azure docs.

* kubernetes-sigs-kubespray #5824

Added supported values comments for "azure_vmtype" in azure.yml
2020-03-31 10:12:40 -07:00
Xiaodu 63fa406c3c
Move host_architecture to kubespray-defaults (#5811)
The variable is defined in `kubernetes/preinstall` role and used in several roles. Since `kubernetes/preinstall` is not always included when `ansible-playbook` is run with tag selectors (see #5734 for reason), they will fail, or individual roles must copy the same fact definitions (as in #3846). Moving the definition to the always-included `kubespray-defaults` role will resolve the dependency problem.
2020-03-25 12:58:25 -07:00
Stephen Schmidt 0379a52f03
Fix etcd install with docker and etcd_kubeadm_enabled (#5777)
- This solves issue #5721 & #5713 (dupes)
  - Provide a cleaner default usage pattern for the download role
    around etcd that supports 'host' and 'docker' properly
  - Extract the 'etcdctl' as a separate task install piece and reuse it where
    appropriate
  - Update the kubeadm-etcd task to reflect the above change
2020-03-24 08:12:47 -07:00
Sergey b8d628c5f3
rename handler to fix ansible 2.8 issue (#5801) 2020-03-20 13:54:08 -07:00
Maxime Guyot a7a204ebca
Add kube_encryption_resources variable to configure which resources are encrypted at rest (#5797) 2020-03-20 04:14:36 -07:00
spaced 8ce5a9dd19
remove atomic support because reached end of live (#5783) 2020-03-17 14:31:27 -07:00
spaced 876d4de6be
Fedora CoreOS support (#5657)
* fedora coreos support
- bootstrap and new fact for

* fedora coreos support
- fix bootstrap condition

* fedora coreos support
- allow customize packages for fedora coreos bootstrap

* fedora coreos support
- prevent install ptyhon3 and epel via dnf for fedora coreos

* fedora coreos support
- handle all ostree like os in same way

* fedora coreos support
- handle all ostree like os in same way for crio

* fedora coreos support
- add fcos documentations
2020-03-17 03:12:21 -07:00
Qingkun Li 43020bd064
Fix the command for kube-proxy cleanup (#5671) 2020-03-13 05:32:39 -07:00
Xiaodu c47f441b13
fix kube-proxy server address when local apiserver lb is disabled (#5730)
refs #5277

As the issue describes, when no external or local load-balanced is used,
kube-proxy won't be able to contact apiserver at 127.0.0.1. So the
config map should be left as is.
2020-03-12 10:40:39 -07:00
Sergey 9f3ed7d855
change ignore_errors: to when: in assert tasks (#5716) 2020-03-10 08:09:36 -07:00
Sergey 221b429c24
move var preinstall_selinux_state: to roles/kubespray-defaults/defaults/main.yaml (#5715) 2020-03-10 07:45:35 -07:00
Kubernetes Prow Robot 66408a87ee
Refactor download role (#5697)
* download file

* download containers

* fix push image to nodes

* pull if none image on host

* fix

* improve docker image tag checks.
do not pull already cached images

* rebase fix merge conflict

* add support download_run_once when upgrade and scale cluster
add some test with download_run_once

* set default values to temp flag for every download cycle

* add save,load abilty for containerd and crio when download_run_once=true

* return redefine image save/load command to  set_docker_image_facts.yml

* move set command to set_container_facts

* ctr in containerd_bin_dir

* fix order of ctr image export arguments

* temporary disable download_run_once for containerd and crio
due https://github.com/containerd/containerd/issues/4075

* remove unused files

* fix strict yaml linter warning and errors

* refactor logical conditions to pull and cache container images

* remove comment due lint check

* document role

* remove image_load_on_localhost, because cached images are always loaded to docker on remote sites

* remove XXX from debug output
2020-03-05 07:31:39 -08:00
Sergey 678ed5ced5
fix upgrade procedure when in playbook (#5695)
exists role kubernetes/preinstall and not exists role container-engine

 error 'yum_repo_dir' is undefined
2020-02-28 01:56:38 -08:00
Lovro Seder 7f87ce0362
Upgrade container-engine after draining (#5601)
* Run 'container-engine' after drain.

Move possibly disruptive role 'container-engine' to run after the node
is drained.

As that role have to be run on non-cluster nodes as well (etcd and
calico-rr), and those nodes are not drained, add play for that case.

* Check if api is up before upgrade.

If container engine is restarted in previous role, api controller can
take some time to start. This check ensures api is up before upgrade.
2020-02-27 11:47:28 -08:00
Javeria Khan 6368c626c5
Ignore assertion comparison for kube_network_node_prefix when using calico (#5632)
* Fix incorrect assertion comparison for kube_network_node_prefix

* Ignore assertion comparison for kube_network_node_prefix when using calico

* Adding more var docs description for kube_network_node_prefix

* Fixing trailing whitespaces
2020-02-20 00:39:03 -08:00
Adrien Gooris da86457cda
remove unused var 'kube_apiserver_admission_control' (#5648) (#5651) 2020-02-19 05:08:25 -08:00
Ali Sanhaji 646fd5f47b
External OpenStack Cloud Controller Manager implementation (#5491)
* External OpenStack Cloud Controller Manager implementation

* Adding controller image tag

* Minor fixes

* Restructuring the external cloud controller to work with KubeADM
2020-02-18 04:47:28 -08:00
Sylvain Chateau 0ca7aa126b
added "Flatcar", "Flatcar Container Linux by Kinvolk" for all coreOS role (#5607) 2020-02-18 00:15:29 -08:00
Sergey 36c1f32ef9
remove legacy docker repo in kubernetes/preinstall before any packages installed (#5640) 2020-02-17 08:59:28 -08:00
Erwan Miran 26700e7882
kubelet_config_extra_args and kubelet_node_config_extra_args (#5623)
* Introduce kubelet_config_extra_args and kubelet_node_config_extra_args to pass params to kubelet via YAML config

* kubelet_config_extra_args is not the alternative
2020-02-14 16:05:28 -08:00
Sergey 14b1cab5d2
force rotate control plane certifcate on master node when upgrade cluster (#5596) 2020-02-10 06:09:54 -08:00
Florian Ruynat e570e2e736
Remove last rkt references (#5606) 2020-02-07 02:19:43 -08:00
aca 9d32e2c3b0
fix duplicates when scheduler_extra_volumes defined (#5566) 2020-02-07 02:09:44 -08:00
Etienne Champetier 5e9479cded Ensure we always fixup kube-proxy kubeconfig (#5524)
When running with serial != 100%, like upgrade_cluster.yml, we need to apply this fixup each time
Problem was introduced in 05dc2b3a09

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-01-14 02:45:09 -08:00
Damon Wang 48c41bcbe7 kube-proxy need conntrack (#5478) 2020-01-06 02:31:35 -08:00
Matthew Mosesohn 5fab610fab Clean kubectl cache after upgrade on first master (#5479)
Resolves issue where kubectl cache of <v1.16 api schema
interferes with interacting with daemonsets and deployments.

Change-Id: I63b7046958f2008eb144b6da0004c598f945e0ae
2020-01-06 02:23:35 -08:00
Matthew Mosesohn 696fcaf391 Ensure 0644 mode for ca.crt on nodes (#5428)
Change-Id: I5e018dfaeffe314300b373aeb7ed5f59929cf4f9
2019-12-11 00:54:04 -08:00
Sergey 9fda84b1c9 set node label via kubectl label command (#5257)
* set varios node label via kubectl label command, not kubelet options

* remove node_labels from KUBELET_ARGS
2019-12-09 01:43:09 -08:00
Etienne Champetier 42702dc1a3 Fixes for CentOS 8 (#5213)
* Fix python3-libselinux installation for RHEL/CentOS 8

In bootstrap-centos.yml we haven't gathered the facts,
so #5127 couldn't work

Minimum ansible version to run kubespray is 2.7.8,
so ansible_distribution_major_version is defined an there is no need to default it

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* Restart NetworkManager for RHEL/CentOS 8

network.service doesn't exist anymore
 # systemctl status network
 Unit network.service could not be found.

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* Add module_hotfixes=True to docker / containerd yum repo config

https://bugzilla.redhat.com/show_bug.cgi?id=1734081
https://bugzilla.redhat.com/show_bug.cgi?id=1756473
Without this setting you end up with the following error:
 # yum install docker-ce
 Failed to set locale, defaulting to C
 Last metadata expiration check: 0:03:21 ago on Thu Sep 26 22:00:05 2019.
 Error:
  Problem: package docker-ce-3:19.03.2-3.el7.x86_64 requires containerd.io >= 1.2.2-3, but none of the providers can be installed
   - cannot install the best candidate for the job
   - package containerd.io-1.2.2-3.3.el7.x86_64 is excluded
   - package containerd.io-1.2.2-3.el7.x86_64 is excluded
   - package containerd.io-1.2.4-3.1.el7.x86_64 is excluded
   - package containerd.io-1.2.5-3.1.el7.x86_64 is excluded
   - package containerd.io-1.2.6-3.3.el7.x86_64 is excluded
 (try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2019-12-09 01:37:10 -08:00
Maxime Guyot b15d41a96a Add support to Ansible 2.9 (#5361) 2019-12-05 07:24:32 -08:00
Hugo Blom f7aea8ed89 update oidc to contain quotes (#5406) 2019-12-05 00:24:32 -08:00
Matthew Mosesohn 57fef8f75e Allow customizing kubelet healthz port and bind addr (#5403)
Change-Id: I1634ba2d2d3337243ffcdea86750003a559f2576
2019-12-03 11:56:58 -08:00
Matthew Mosesohn f599a4a859 force other resolvers to be secondary when using systemd-resolved (#5391)
Change-Id: I33d46c7e0c5374467e22c5a652b282d1703dea85
2019-12-02 08:41:04 -08:00
Matthew Mosesohn 18cee65c4b Add support for k8s v1.17.0-rc.1, remove hyperkube (#5378)
Change-Id: I3fff04f0211cd9c2e8235acaf51c3aa98abc8bb7
2019-11-28 05:41:03 -08:00
Yujun Zhang aec5080a47 kubernetes/masters: fix task name in kubeadm setup (#5377) 2019-11-27 06:05:20 -08:00
Michael Shen 6924c6e5a3 [FIX] fix match because trim removes leading/trailing whitespace (#5356) 2019-11-19 22:35:18 -08:00
Matthew Mosesohn 85c851f519 scale down coredns on each master during graceful upgrade (#5344)
This fixes the scenario where masters are upgraded one at a time
and coredns gets improperly scaled back up to 2 replicas.

Change-Id: I7cc9283f40efcfd61b5813c89a5805c95d901567
2019-11-18 00:13:41 -08:00
Matthew Mosesohn 8b67159239 Do not run kubeadm upgrade on first deploy (#5339)
Change-Id: I68a962a9dd28c83ef07eaeaf53eb98287f38bca9
2019-11-14 02:05:34 -08:00
LuciferInLove 4f70da2731 Added Amazon Linux 2 support for deploying with docker (#5301) 2019-11-11 07:05:41 -08:00
Matthew Mosesohn db5040e6ea Set certs and files with kubeadm token to mode 0640 (#5325)
Change-Id: I298496e55a6889c158b2085fcadeda5e679a873e
2019-11-11 05:41:41 -08:00
Matthew Mosesohn 1c25ed669c Remove unnecessary and risky reload network for resolvconf propagation (#5322)
Change-Id: I54d706f7941b4b86c4c6cd45340295577155b884
2019-11-06 10:11:52 -08:00
Matthew Mosesohn a005d19f6f Enable systemd-resolved DNS resolution mode (#5318)
Change-Id: If3e253a40782e03cde7fc4a91493517ae31fda17
2019-11-06 03:33:52 -08:00
Matthew Mosesohn 471589f1f4 Scale down coredns created by kubeadm upgrade to 0 replicas (#5308)
Change-Id: I128b0f9c1acbb956d9a6c4e5510b45a36e296af7
2019-11-05 03:34:38 -08:00
Matthew Mosesohn 186ec13579 Fix incorrect suggestion to enable old k8s apis (#5292)
Change-Id: If965cc6aa0daaca232dcf2ca0efd649aa097497f
2019-10-30 01:58:53 -07:00
Matthew Mosesohn 81da231b1e Set cluster DNS in kubeadm config for kubelet dynamic config (#5293)
Change-Id: I23116efefe8626d361d1904fc6fb8448f66cf3c5
2019-10-25 02:23:40 -07:00
Sergey 3118437e10 check on all cluster node - kubelet_max_pods <= (2 ** (32 - kube_network_node_prefix | int)) - 2 (#5279) 2019-10-17 05:48:38 -07:00
Michael Oglesby c672681ce5 Revert Pull Request #5084 (#5120)
Kubespray Pull Request #5084 (https://github.com/kubernetes-sigs/kubespray/pull/5084) caused more problems than it solved due to limitations with the synchronize module. See comments on Kubespray Issues #5059 (https://github.com/kubernetes-sigs/kubespray/issues/5059) and #5116 (https://github.com/kubernetes-sigs/kubespray/issues/5116). Details from Ansible documentation: "Currently, synchronize is limited to elevating permissions via passwordless sudo. This is because rsync itself is connecting to the remote machine and rsync doesn’t give us a way to pass sudo credentials in. ... Currently there are only a few connection types which support synchronize (ssh, paramiko, local, and docker) because a sync strategy has been determined for those connection types. Note that the connection for these must not need a password as rsync itself is making the connection and rsync does not provide us a way to pass a password to the connection. ..." Thus, reverting Pull Request #5084.
2019-10-17 05:26:37 -07:00
yelhouti d332a254ee install python3 instead of python2 for fedora >= 30 fixes 5056, fixes 4802 (#5111) 2019-10-17 05:04:38 -07:00