no_proxy is a pain to get right, and having proxy variables present causes issues
(k8s components get proxy configuration after upgrade, see #7100)
It's better to only configure what require proxy:
- the runtime (containerd/docker/crio)
- the package manager + apt_key
- the download tasks
Tested with the following clusters
- 4 CentOS 8 nodes
- 1 Ubuntu 20.04 node
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Update hashes and set default version to 1.19.5
Signed-off-by: anthr76 <hello@anthonyrabbito.com>
* Reorder hashes
1.19.5 hashes should be near 1.19.x
* Added back blank line
* copying ssh key no longer required, works with password auth
* use copy module instead of synchronize (which requires sshpass)
* less tasks and always changed tasks
This new version uses the same base image as kube-proxy
(k8s.gcr.io/build-image/debian-iptables)
This allow to automatically pick iptables-legacy or iptables-nft,
and be compatible with RHEL/CentOS 8
https://github.com/kubernetes/dns/pull/367
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* fix flake8 errors in Kubespray CI - tox-inventory-builder
* Invalidate CRI-O kubic repo's cache
Signed-off-by: Victor Morales <v.morales@samsung.com>
* add support to configure pkg install retries
and use in CI job tf-ovh_ubuntu18-calico (due to it failing often)
* Switch Calico, Cilium and MetalLB image repos to Quay.io
Co-authored-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
The 0d0cc8cf9c change creates several
DaemonSets to cover the Flannel CNI installation for different CPU
architectures. This change removes the unnecessary architecture value
from the docker tag value.
Signed-off-by: Victor Morales <v.morales@samsung.com>
* calico: add constant calico_min_version_required
and verify current deployed version against it.
* calico: remove upgrade support with data migration
The tool was used pre v3.0.0 and is no longer needed.
* calico: remove old version support from tasks
* calico: remove old ver support from policy ctrl
* calico: remove old ver support from node
* canal: remove old ver support
* remove unused calicoctl download checksums
calico_min_version_required is the oldest version that can be installed
Older versions can be removed.
* Make metallb image repos configurable
* Moved metallb image repo definitions to download role defaults
* Removed comment. These are set in download defaults
* add snapshot-controller and v1beta1 snapshot api
* fix typo
* udpate manifest to v1beta1
* update
* update manifests
* fix spelling
* wait until crd is applied
* fix missing info in kube module
* revert snapshotclass
* add snapshot crds before applying the csi driver
* add crds, missed them in last commit
* use pull policy from kubespray
Support for Ambassador OSS as an Ingress Controller when
settings `ingress_ambassador_enabled: true`.
Signed-off-by: Alvaro Saurin <alvaro.saurin@gmail.com>
with the Python ruamel.yml library
- Change True/False to true/false in a few places so file can
be more easily round-tripped with the Python ruamel.yml library
* bump to dashboard 2.0 rc6 with metrics scrapper
* fix missing yaml seperator making Replicaset complaining about missing ServiceAccount
* unwanted legay gross hack forgot to remove before
* no need namespace on CrBinding
* bump to 2.0.0 release
* remove dashboard_metrics_scrapper_enabled
* added required permissions for querying endpointslice resources
* copy-pasted role permissions from cilium install manifests
* bumped cilium version to v1.7.2
- This solves issue #5721 & #5713 (dupes)
- Provide a cleaner default usage pattern for the download role
around etcd that supports 'host' and 'docker' properly
- Extract the 'etcdctl' as a separate task install piece and reuse it where
appropriate
- Update the kubeadm-etcd task to reflect the above change
* Upgrade etcd to 3.3.18
* Try with etcd 3.3.15 (kubeadm 1.16.7 default)
* Back to square one
* Try with 3.3.11
* Upgrade etcd to 3.3.18 (take 2)
* Try with 3.3.12
* download file
* download containers
* fix push image to nodes
* pull if none image on host
* fix
* improve docker image tag checks.
do not pull already cached images
* rebase fix merge conflict
* add support download_run_once when upgrade and scale cluster
add some test with download_run_once
* set default values to temp flag for every download cycle
* add save,load abilty for containerd and crio when download_run_once=true
* return redefine image save/load command to set_docker_image_facts.yml
* move set command to set_container_facts
* ctr in containerd_bin_dir
* fix order of ctr image export arguments
* temporary disable download_run_once for containerd and crio
due https://github.com/containerd/containerd/issues/4075
* remove unused files
* fix strict yaml linter warning and errors
* refactor logical conditions to pull and cache container images
* remove comment due lint check
* document role
* remove image_load_on_localhost, because cached images are always loaded to docker on remote sites
* remove XXX from debug output
I've tested this update by deploying a containerd / etcd cluster on top CentOS7,
MetalLB + NGINX Ingress. Upgrade using upgrade-cluster.yml
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Add support for Kubernetes 1.16.1
* Defaults to 1.16.1
* add 1.16.2 checksums and set new version as default
* correct 1.16.2 checksums and add 1.15.5 checksums
This allows to easily override the gcr, quay, and docker repos with the
mirror repos in countries like China, where the default accesses are
blocked or unstable.
Cleaned up deprecated APIs:
apps/v1beta1
apps/v1beta2
extensions/v1beta1 for ds,deploy,rs
Add workaround for deploying helm using incompatible
deployment manifest.
Change-Id: I78b36741348f47a999df3841ee63cf4e6f377830
* run 'task download_container | Copy image to ansible host cache' with synchronize on download_delegate host
* try to run task copy file to ansible host on all inventory, not only on first random host
Fixes situation when using manual mode because it
tries to download coredns v1.3.1 from the same
image repository where kubernetes images are
downloaded from.
Change-Id: Ibbec8a72c8162ce8befa74e2013a268737ea5f8a
* Enable containerd to deploy vanilla containerd package
Fixes kubeadm references to CRI socket for containerd
Fixes download role cache feature to work with containerd
Change-Id: I2ab8f0031107e2f0d1a85c39b4beb66f08509a01
* use containerd for flannel-addons job
Change-Id: Ied375c7d65e64a625ffbd995ff16f2374067dee6
* add containerd vars
Change-Id: Ib9a8a04e501c481a86235413cbec63f3672baf91
* fixup vars
Change-Id: Ibea64e4b18405a578b52a13da100384582aa24c2
* more fixes
* fix rh repo
Change-Id: I00575a77cfb7b81d6095db5d918a52023c8f13ba
* Adjust helm host install for containerd
* Add calico 3.7.3 support
* add calico_datastore variable to policy controller role
* add missing clusterrole rules for calico policy controller
* disable calico kube controller when kdd mode is used for versions < 3.6
* Use K8s 1.15
* Use Kubernetes 1.15 and use kubeadm.k8s.io/v1beta2 for
InitConfiguration.
* bump to v1.15.0
* Remove k8s 1.13 checksums.
* Update README kubernetes version 1.15.0.
* Update metrics server 0.3.3 for k8s 1.15
* Remove less than k8s 1.14 related code
* Use kubeadm with --upload-certs instead of --experimental-upload-certs due to depricate
* Update dnsautoscaler 1.6.0
* Skip certificateKey if it's not defined
* Add kubeadm-conftolplane.v2beta2 for k8s 1.15 or later
* Support kubeadm control plane for k8s 1.15
* Update sonobuoy version 0.15.0 for k8s 1.15
* Add limited containerd support
Containerd support for Ubuntu + Calico
* Added CRI-O support for ubuntu
* containerd support.
* Reset containerd support.
* fix lint.
* implemented feedback
* Change task name cri xx instead of cri-o in reset task and timeout condition.
* set crictl to fixed version
* Use docker-ce's container.io package for containerd.
* Add check containerd is installable or not.
* Avoid stop docker when use containerd and optimize retry for reset.
* Add config.toml.
* Fixed containerd for kubelet.env.
* Merge PR #4629
* Remove unused ubuntu variable for containerd
* Polish code for containerd and cri-o
* Refactoring cri socket configuration.
* Configurable conmon.
* Remove unused crictl/runc download
* Now crictl and runc is downloaded by common crictl.yml.
* fixed yamllint error
* Fixed brokenfiles by conflict.
* Remove commented line in config.toml
* Remove readded v1.12.x version
* Fixed broken set_docker_image_facts
* Fix yamllint errors.
* Remove unused apt source
* Fix crictl could not be installed
* Add containerd config from skolekonov's PR #4601
* Require minimum version of Kubernetes
* Remove checksums for kubernetes version 1.12
* Add kube_version to precheck output and add min required version to README
* Fix merge
* Fix defaults
* Fix typo in precheck
* File and container image downloads are now cached localy, so that repeated vagrant up/down runs do not trigger downloading of those files. This is especially useful on laptops with kubernetes runnig locally on vm's. The total size of the cache, after an ansible run, is currently around 800MB, so bandwidth (=time) savings can be quite significant.
* When download_run_once is false, the default is still not to cache, but setting download_force_cache will still enable caching.
* The local cache location can be set with download_cache_dir and defaults to /tmp/kubernetes_cache
* A local docker instance is no longer required to cache docker images; Images are cached to file. A local docker instance is still required, though, if you wish to download images on localhost.
* Fixed a FIXME, wher the argument was that delegate_to doesn't play nice with omit. That is a correct observation and the fix is to use default(inventory_host) instead of default(omit). See ansible/ansible#26009
* Removed "Register docker images info" task from download_container and set_docker_image_facts because it was faulty and unused.
* Removed redundant when:download.{container,enabled,run_once} conditions from {sync,download}_container.yml
* All features of commit d6fd0d2aca by Timoses <timosesu@gmail.com>, merged May 1st 2019, are included in this patch. Not all code was included verbatim, but each feature of that commit was checked to be working in this patch. One notable change: The actual downloading of the kubeadm images was moved to {download,sync)_container, to enable caching.
Note 1: I considered splitting this patch, but most changes that are not directly related to caching, are a pleasant by-product of implementing the caching code, so splitting would be impractical.
Note 2: I have my doubts about the usefulness of the upload, download and upgrade tags in the download role. Must they remain or can they be removed? If anybody knows, then please speak up.
* Add support for arm images for hyperkube, kubeadm and cni_binary
* Add dummy etcd checksum for arm
This commit adds dummy etcd checksum for arm to avoid "no attribute" error
during setup.
* Add etcd host assert check
* Add 1.13.4 checksums of kubeadm and hyperkube for arm
* Update checksums of kubeadm and hyperkube for arm
* Add dummy checksums for calicoctl_binary_checksums dict
* disable gather_facts because it causes tests to fail
* Remove architecture check for etcd, due to unable to run tests
Without this, pulls are considered for all
hosts groups, even if not targetted by the downloads
`groups` list. Hence, a download/sync is triggered
even though the host does not require the image.
* Download to delegate and sync files when download_run_once
* Fail on error after saving container image
* Do not set changed status when downloaded container was up to date
* Only sync containers when they are actually required
Previously, non-required images (pull_required=false as
image existed on target host) were synced to the target
hosts. This failed as the image was not downloaded to
the download_delegate and hence was not available for
syncing.
* Sync containers when only missing on some hosts
* Consider images with multiple repo tags
* Enable kubeadm images pull/syncing with download_delegate
* Use kubeadm images list to pull/sync
'kubeadm config images pull' is replaced by collecting the images
list with 'kubeadm config images list' and using the commonly
used method of pull/syncing the images.
* Ensure containers are downloaded and synced for all hosts
* Fix download/syncing when download_delegate is a kubernetes host
* Use K8s 1.14 and add kubeadm experimental control plane mode
This reverts commit d39c273d96.
* Cleanup kubeadm setup run on first master
* pin kubeadm_certificate_key in test
* Remove kubelet autolabel of kube-node, add symlink for pki dir
Change-Id: Id5e74dd667c60675dbfe4193b0bc9fb44380e1ca
* Add ansible-lint as gitlab-ci step
* Fix jinja2 syntax in include_tasks that breaks ansible-lint
* Use a block scalar to get around gitlab quoting/escaping rules
* Run ansible-lint in verbose mode in CI
Both kubedns and dnsmasq modes are long not maintained.
We should run dns_late steps at the end because sshd
makes DNS lookups during Ansible run and has 2s timeouts
for each failed lookup trying to connect to coredns before
it is ready.
* feat(external-provisioner/local-path-provisioner): adds support for local path provisioner
Helpful for local development but also in production workloads (once the
permission model is worked out) where you have redundancy built into the
software uses the PVCs (e.g. database cluster with synchronous
replication)
* feat(local-path-provisioner): adds debug flag, image tag group var
* fix(local-path-provisioner): moves image repo/tag to download role
* test(gce_centos7-flannel): enables local-path-provisioner in test case
* fix(addons): add image repo/tag to commented default values
* fix(local-path-provisioner): typo in jinja template for local path provisioner
* style(local-path-provisioner): debug flag condition re-formatted
* fix(local-path-provisioner): adds missing default value for debug flag
* fix(local-path-provisioner): syntax fix for debug if condition end
* fix(local-path-provisioner): jinja template syntax: if condition white space
Currently, the task `container_download | download images for kubeadm config images` fetches etcd image even though it's not required (etcd is bootstrapped by kubespray, not kubeadm).
`kubeadm-images.yaml` is only a subset of `kubeadm-config.yaml`, therefore ``kubeadm config images pull` will try to get all this list (including etcd)
```
# kubeadm config images list --config /etc/kubernetes/kubeadm-images.yaml
k8s.gcr.io/kube-apiserver:v1.13.2
k8s.gcr.io/kube-controller-manager:v1.13.2
k8s.gcr.io/kube-scheduler:v1.13.2
k8s.gcr.io/kube-proxy:v1.13.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.6
```
When using the `kubeadm-config.yaml` though, it doesn't list etcd image:
```
# kubeadm config images list --config /etc/kubernetes/kubeadm-config.yaml
k8s.gcr.io/kube-apiserver:v1.13.2
k8s.gcr.io/kube-controller-manager:v1.13.2
k8s.gcr.io/kube-scheduler:v1.13.2
k8s.gcr.io/kube-proxy:v1.13.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/coredns:1.2.6
```
This change just adds the etcd endpoints in the `kubeadm-images.yaml` to give a hint to kubeadm it doesn't need etcd image for its boostrapping as etcd is "external".
I confess it is a ugly hack, a better way would be to use a single `kubeadm-config.yaml` for both tasks, but they are triggered by different roles (`kubeadm-images.yaml` is used by download, `kubeadm-config.yaml` by kubernetes/master) at different steps and I didn't want to refactor too many things to prevent breakage.
This is specially useful for offline installation where a whitelist of container images is mirrored on a local private container registry. `k8s.gcr.io/etcd` and `quay.io/coreos/etcd` are two different repositories hosting the same images but using *different tags*!
* coreos/etcd:v3.2.24
* k8s.gcr.io/etcd:3.2.24 (note the missing 'v' in the tag name)
Addressing the discussion started in #4064, this PR moves kubeadm and
hyperkube binaries to /usr/local/bin before running them on the master
nodes.
It is to address the case where local_release_dir points to /tmp
(kubespray default) and /tmp is mounted with noexec mode, preventing
any binaries to be run in that partition.
In role "node", we still move kubeadm to bin_dir only on the worker
nodes.
* Add support for running a nodelocal dns cache
After encountering dns issues in a cluster I was recently working on I
noticed Kubernetes 1.13 introduced support for running a nodelocal dns
cache.
I believe this can usefull for more people.
73b548db06https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/0030-nodelocal-dns-cache.md
* Add requested changes
* Add additional requested changes + documentation
* Add requested changes after review
* Replace incorrect variable
* Upgrade kubernetes to v1.13.0
* Remove all precense of scheduler.alpha.kubernetes.io/critical-pod in templates
* Fix cert dir
* Use kubespray v2.8 as baseline for gitlab
* Remove non-kubeadm deployment
* More cleanup
* More cleanup
* More cleanup
* More cleanup
* Fix gitlab
* Try stop gce first before absent to make the delete process work
* More cleanup
* Fix bug with checking if kubeadm has already run
* Fix bug with checking if kubeadm has already run
* More fixes
* Fix test
* fix
* Fix gitlab checkout untill kubespray 2.8 is on quay
* Fixed
* Add upgrade path from non-kubeadm to kubeadm. Revert ssl path
* Readd secret checking
* Do gitlab checks from v2.7.0 test upgrade path to 2.8.0
* fix typo
* Fix CI jobs to kubeadm again. Fix broken hyperkube path
* Fix gitlab
* Fix rotate tokens
* More fixes
* More fixes
* Fix tokens
* Remove variables defined in download role. Fixes#3799
* Cleanup some more variables
* Fix bad templating
* Minor fix
* Add dashboard to download role. Fixes#3736
When `ansible_user` is not root, using `-b` option.
And with `download_run_once` and `download_localhost` set `true`.
Ansible will executes `container_download | upload container images to nodes` task.
It uses rsync to upload images to `/tmp/release/container/`, but the
`container` directory owned by `root`.
* Support Metrics Server as addon (#3560).
* Update metrics server v0.3.1.
* Add metrics server test.
* Replace metrics server manifests with kubernetes/cluster/addons's.
* Modify metrics server manifests for kubespray.
* Follow PR#3558 node label node-role.kubernetes.io/master change
* Fix metrics server parameters base_metrics_server_... to metrics_server_...
* Fix too hard corded metrics_server_memory_per_node
* Add configurable insecure tls for metrics-apiservice
* Downloadable addon-resizer and extract parameter as variables
* Remove metrics server version from deployment name
* Metrics Server work when all masters has node role
* Download metrics-server and add-resizer container only on master
* ServiceAccount and ConfigMap is separated and fix application name
* Remove old metrics server clusterrole template
* Fix addon-resizer image specify
* Make InternalIP default for metrics_server_kubelet_preferred_address_types
Make InternalIP default because multiple preferrred address types does not work.
* Enable AutoScaler for CoreDNS
* Only use one template for dns autoscaler
* Rename a few variables for replicas and minimum pods
* Rename a few variables for replicas and minimum pods
* Remove replicas to make autoscale work
* Cleanup kubedns-autoscaler as it has been renamed
* Adds support for Multus (multiple interfaces) CNI plugin
Multus is a latin word for "Multi". As the name suggests, it acts as a
Multi plugin in Kubernetes and provides multiple network interface
support in a pod. Multus uses the concept of invoking delegates by
grouping multiple plugins into delegates and invoking them in the
sequential order of the CNI configuration file provided in json format.
* Change CNI version (0.1.0->0.3.1) of Contiv to be compatible with Multus
kube-router v0.2.1 highlights from changelog:
- IPv6 WIP but pretty close to full working functionality
- fully support network policy semantics with addition of support for
ipblock and except
* failed
* version_compare
* succeeded
* skipped
* success
* version_compare becomes version since ansible 2.5
* ansible minimal version updated in doc and spec
* last version_compare
* [jjo] add kube-router support
Fixescloudnativelabs/kube-router#147.
* add kube-router as another network_plugin choice
* support most used kube-router flags via
`kube_router_foo` vars as other plugins
* implement replacing kube-proxy (--run-service-proxy=true) via
`kube_proxy_mode: none`, verified in a _non kubeadm_enabled_
install, should also work for recent kubeadm releases via
`skipKubeProxyInstall: true` config
* [jjo] address PR#3339 review from @woopstar
* add busybox image used by kube-router to downloads
* fix busybox download groups key
* rework kubeadm_enabled + kube_router_run_service_proxy
- verify it working ok w/the kubeadm_enabled and
kube_router_run_service_proxy true or false
- introduce `kube_proxy_remove` fact, to decouple logic
from kube_proxy_mode (which affects kubeadm configmap
settings, thus no-good to ab-use it to 'none')
* improve kube-router.md re: kubeadm_enabled and kube_router_run_service_proxy
* address @woopstar latest review
* add inventory/sample/group_vars/k8s-cluster/k8s-net-kube-router.yml
* fix kube_router_run_service_proxy conditional for kube-proxy removal
* fix kube_proxy_remove fact (w/ |bool), add some needed kube-proxy tags on my and existing changes
* update kube-router tolerations for 1.12 compatibility
* add PriorityClass to kube-router DaemonSet
According to the documentation, container images are described
by vars like `foo_image_repo` and `foo_image_tag`.
The variables netcheck_{agent,server}_{img_repo,tag} do not
follow that convention.