This new version uses the same base image as kube-proxy
(k8s.gcr.io/build-image/debian-iptables)
This allow to automatically pick iptables-legacy or iptables-nft,
and be compatible with RHEL/CentOS 8
https://github.com/kubernetes/dns/pull/367
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* fix flake8 errors in Kubespray CI - tox-inventory-builder
* Invalidate CRI-O kubic repo's cache
Signed-off-by: Victor Morales <v.morales@samsung.com>
* add support to configure pkg install retries
and use in CI job tf-ovh_ubuntu18-calico (due to it failing often)
* Switch Calico, Cilium and MetalLB image repos to Quay.io
Co-authored-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
calico PODs are first started and then in a handler killed and
restarted for no reason, nothing has changed.
By using the existing variable 'calico_cni_config' (only defined when
calico has already started) the restart can be skipped.
* create a wrapper script with pki options
* supports all kubespray managed container engines
Co-authored-by: Hans Feldt <hafe@users.noreply.github.com>
* Allow the eventRecordQPS setting to be set.
The eventRecordQPS parameter controls rate limiting for event recording. When zero, unlimited events can cause denial-of-service situations. For my situation, I don't need more than a setting of "5". This change allows me to configure the setting before creating the cluster.
* Allow the eventRecordQPS setting to be set.
The default settings (see types.go) is five. So, this change does not affect the cluster provisioning. However, it does allow for the setting to be changed.
Fedora 31 uses Cgroups v2 by default. This change by passes the kernel
parameter systemd.unified_cgroup_hierarchy=0.
Signed-off-by: Victor Morales <v.morales@samsung.com>
* update version of ingress-nginx controller.
Change tag from controller-v0.34.0 to controller-v0.40.2 to use newest tag.
* Update docs about aws deploy templates.
In the yaml templates, there is no mention of idle timeouts. This is why I removed the documentation about it. This might be a mistake. Please verify this. I don't know enough to verify it myself.
* Change label when checking version.
When checking for `app.kubernetes.io/name=ingress-nginx`, a completed pod was selected which is not helpful when trying to `exec`. Changing the label selects the running controller pod.
* put back the information about ELB Idle Timeouts.
When I removed the information, I had overlooked that it was mentioned in the L7 yaml file. Thanks.
and thereby support upgrade from e.g. 1.18.x to 1.19.y
Included OSes:
- Centos7/8
- Ubuntu18/20
New variables for overriding by default installed packages:
- centos_crio_packages
- ubuntu_crio_packages
* Enable Kata Containers for CRI-O runtime
Kata Containers is an OCI runtime where containers are run inside
lightweight VMs. This runtime has been enabled for containerd runtime
thru the kata_containers_enabled variable. This change enables Kata
Containers to CRI-O container runtime.
Signed-off-by: Victor Morales <v.morales@samsung.com>
* Set appropiate conmon_cgroup when crio_cgroup_manager is 'cgroupfs'
* Set manage_ns_lifecycle=true when KataContainers is enabed
* Add preinstall check for katacontainers
Signed-off-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Pasquale Toscano <pasqualetoscano90@gmail.com>
Command line flags aren't added to kube-proxy which results in missing
feature gates set in this component. Add appropriate setting to
ConfigMap instead.
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
'ansible.vars.hostvars.HostVarsVars object' has no attribute 'kubeadm_upload_cert'
kubeadm_upload_cert will never be found as a hostvar for the first
master since the task is executed for a worker.
Fix by executing the upload task for the first master and register
the needed key. After that, workers can read hostvars for the master
Var kubeadm_etcd_refresh_cert_key removed since it no longer has
any use.
* Adding option to disable gloablly applying a proxy to etc/yum.conf
* Change made to proxy_yum_globaly basedon reviewer feedback
* fix trailing spaces in ymllint
This fixes the Containerd + EL8 case that was missed in 7d1ab3374e
On CentOS 8 with proxy ansible render inline `proxy` and `module_hotfixes` options.
For example:
```
proxy=http://127.0.0.1:3128module_hotfixes=True
```
But expected result:
```
proxy=http://127.0.0.1:3128
module_hotfixes=True
```
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
crio refuses to delete pods when cni is unavailable which is the
case e.g. using calico with kdd datastore. See:
https://github.com/cri-o/cri-o/issues/4084
Fix by deleting storage associated with containers. Stop and disable
crio service so switching container runtime can be done.
* Added option to force apiserver and respective client certificate to be regenerated without necessarily needing to bump the K8S cluster version
* Removed extra blank line
Handlers with the same name (Kubeadm | restart kubelet) leads to incorrect playbook execution. As a result, after completing the tasks, kubelet does not restart. This PR fix this behavior
After upgrading to newer Kubernetes(v1.17 at least), kubectl command
shows the following warning message:
WARNING: Kubernetes configuration file is group-readable.
This is insecure. Location: /home/foo/.kube/config
The kubeconfig was copied from {{ artifacts_dir }}/admin.conf with
kubeconfig_localhost feature. It is better to set valid file mode
at getting it on Kubespray.
The 0d0cc8cf9c change creates several
DaemonSets to cover the Flannel CNI installation for different CPU
architectures. This change removes the unnecessary architecture value
from the docker tag value.
Signed-off-by: Victor Morales <v.morales@samsung.com>
In case multiple nodeselectors are specified in ingress_nginx_nodeselector, the generated daemonset yaml template for nginx is invalid due to missing indentation starting with the second nodeselector
When stopping at the check of "Stop if ip var does not match local ips"
the error message is like:
fatal: [single-k8s]: FAILED! => {
"assertion": "ip in ansible_all_ipv4_addresses",
"changed": false,
"evaluated_to": false,
"msg": "Assertion failed"
}
That doesn't contain actual IP addresses and it is difficult to understand
what was wrong. This adds the error message which contain actual IP addresses
to investigate the issue if happens.
* calico: add constant calico_min_version_required
and verify current deployed version against it.
* calico: remove upgrade support with data migration
The tool was used pre v3.0.0 and is no longer needed.
* calico: remove old version support from tasks
* calico: remove old ver support from policy ctrl
* calico: remove old ver support from node
* canal: remove old ver support
* remove unused calicoctl download checksums
calico_min_version_required is the oldest version that can be installed
Older versions can be removed.
* Add retries to update calico-rr data in etcd through calicoctl
* Update update-node yaml syntax
* Add comment to clarify ansible block loop
* Remove trailing space
* Fix reserved memory unit in kubelet configuration
Signed-off-by: Wang Zhen <lazybetrayer@gmail.com>
* Move systemReserved default values from template
Signed-off-by: Wang Zhen <lazybetrayer@gmail.com>
* Added ability to set calico vxlan vni and port. defaults to calico's documented defaults.
* Check if calico_network_backend is defined prior to checking value
* Removed calico hidden defaults for vxlan port and vni
* Fixed FELIX_VXLANVNI typo
* Added support for setting tiller_service_account and tiller_replicas
* Specify helm 2 version to ensure we have a test path that still hits helm 2 code
* Moved tiller_service_account to defaults.yml. Fixed is tiller_replicas defined check.
* Make metallb image repos configurable
* Moved metallb image repo definitions to download role defaults
* Removed comment. These are set in download defaults
After host reboot kubelet and crio goes into a loop and no container is started.
storage_driver in crio.conf overrides system defaults in etc/containers/storage.conf
/etc/containers/storage.conf is installed by package containers-common dependency
installed from cri-o (centos7) and contains "overlay".
Hosts already configured with overlay2 should be reconfigured and the
/var/lib/containers content removed.
* Add comment from roles/kubespray-defaults/defaults/main.yaml clarifying network allocation and sizes
Signed-off-by: Mikael Johansson <mik.json@gmail.com>
* Rewrite of the comment and added new examples
Signed-off-by: Mikael Johansson <mik.json@gmail.com>
* remove podman cni plugin
* configure networkamanger global dns
* allow installation of python3-libselinux by disabling update repo temporary
* remove ipv4 section because it is not a valid configuration
Removes these startup warnings:
Warning: For remote container runtime, --pod-infra-container-image is ignored in kubelet, which should be set in that remote runtime instead
Using "/var/run/crio/crio.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/crio/crio.sock".
* add snapshot-controller and v1beta1 snapshot api
* fix typo
* udpate manifest to v1beta1
* update
* update manifests
* fix spelling
* wait until crd is applied
* fix missing info in kube module
* revert snapshotclass
* add snapshot crds before applying the csi driver
* add crds, missed them in last commit
* use pull policy from kubespray
* Update CustomResourceDefinition for kubecontrollersconfigurations.crd.projectcalico.org to v1
* Align ClusterRole for kube-controllers with upstream (calico)
* Add support for openstack application credentials
* Add some lines for readability
* Update external_openstack_tenant_id check
Do not check external_openstack_tenant_id when application credentials are defined
* Add check for external_openstack_domain_id
* Fix typo
By default do not allow "unqualified" (without a registry) images
because it is considered unsecure and subject to mitm attacks.
To enable insecure pull configure for example:
crio_registries:
- "docker.io"
- "quay.io"
* Use proper openssl command to differentiate between host and ip in current certificate check
* fixup! Use proper openssl command to differentiate between host and ip in current certificate check
The bootstrap-os role uses a bootstrap script to provision a
python interpreter on flatcar and container os hosts. As the
pypy project switched to another hoster, the download url changed.
If applied this will use the new proper pypy download url in bootstrap script
* Update the cilium svc proxy test to HA mode
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
* Fix cilium strict kube-proxy in HA
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
* Add a single global endpoint variable
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
* Add cilium docs about kube-proxy replacement
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
* Fix issues in docs
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
* Option for MetalLB to talk BGP
* Check for BGP peers when metallb_protocol is bgp
* README clarification
* Commented values as documentation only in the sample inventory
* layer 2 or BGP, not both
* log level by default increased to 'info'
* cgroup manager by default set to 'systemd'
* stream port (used by kubelet) bound to 127.0.0.1 for security reasons
* metrics can be enabled and port specified
Trying to layer this package on Fedora 32 causes the install to crash
and furthermore it looks like the original bug linked to in the comment
has been resolved for Fedora 31
* Update calico_veth_mtu to FELIX_IPINIP variable
calico_veth_mtu is specified in the configuration, but since it only works for wireguard, modify it to work for IP-in-IP users.
* Update template with more cleaner expression
CI job 624031102 failed with:
fatal: [ubuntu1804]: FAILED! => {"changed": false, "msg": "Failed to download key at https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable/xUbuntu_18.04/Release.key: Request failed: <urlopen error [Errno -3] Temporary failure in name resolution>"}
Assuming its a temporary problem it should get more robust with a
couple of retries like in other roles.
Co-authored-by: Hans Feldt <hafe@users.noreply.github.com>
* Add additional metadata configuration option to external Openstack CCM (kubernetes-sigs#6338)
* Set the variable external_openstack_metadata_search_order undefined by default
* Fix kubelet cgroup driver detection for crio
Remove fact standalone_kubelet since it is not used
* Fix yamllint complaints of roles/kubernetes/node/tasks/facts.yml
Co-authored-by: Hans Feldt <hafe@users.noreply.github.com>
This changes MetalLB contrib to one of addons for deploying MetalLB with
Kubernetes cluster deployment. By the default, Kubespray doesn't deploy
MetalLB addon.
Support for Ambassador OSS as an Ingress Controller when
settings `ingress_ambassador_enabled: true`.
Signed-off-by: Alvaro Saurin <alvaro.saurin@gmail.com>
* Install Kata Containers as additional container runtime
* Create RuntimeClasses for Kata Containers
* Updated Vagrant to optionally run without Docker as container manager
* Updated Vagrant to optionally use Libvirt nested virtualization
* Add Kata Containers documentation
* Fix lint errors
* Add kata_containers_enabled to kubespray-defaults
* Fixed typo error
* Fixed typo error
We need to specify either external_openstack_tenant_name or
external_openstack_tenant_id. Those values were checked by seeing they
are defined or they have actual values separately.
However those values are always defined because of the following code
of openstack/defaults/main.yml:
external_openstack_tenant_id: "{{ lookup('env','OS_TENANT_ID')| default(lookup('env','OS_PROJECT_ID'),true) }}"
external_openstack_tenant_name: "{{ lookup('env','OS_TENANT_NAME')| default(lookup('env','OS_PROJECT_NAME'),true) }}"
So even if not specifying both values, those checks could not detect
the misconfiguration. This fixes this to detect the misconfiguration.
* MINOR: Check kernel version before enable modprobe nf_conntrack
* CLEANUP: no more need to ignore error of this task
* MINOR: Fixing yaml and ansible lint error - remove trailling-space
If the special parameter "$@" is not quoted, the following command will not work:
./kubectl.sh patch storageclass my-storage-class -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
* Add oraclelinux8 and disable firewalld
Add oraclelinux8 image and disable firewalld on oraclelinux VMs
* Fix Oracle Linux repositories
As documented in: http://yum.oracle.com/getting-started.html#installing-software-from-oracle-linux-yum-server
public-yum-ol7.repo was deprecated on release 7.6. Some repos were integrated into oracle-linux-ol7.repo (i.e.: ol7_latest, ol7_addons) and other are available as packages (epel). This also adds support for oraclelinux8
* Fix to use ansible_distribution_version
Instead of ansible_distribution_major_version
* Update README.md
On OpenStack history, we used to call "tenant" for separeted namespace.
However we use "project" now instead.
Then we have replaced "tenant" with "project". Then all "TENANT" variables
also are renamed to "PROJECT".
This makes Kubespray search "PROJECT" variable also for newer OpenStack
clouds.
with the Python ruamel.yml library
- Change True/False to true/false in a few places so file can
be more easily round-tripped with the Python ruamel.yml library
flannel, ovn and multus network plugins did not support all taint keys. This
update changes the tolerations to support them all.
According to the documentation:
```
There are two special cases: An empty key with operator Exists matches all keys,
values and effects which means this will tolerate everything. An empty effect matches
all effects with key key.
```
Usage of the empty `key` and `effect` ensures the network plugin daemonset will
be deployed on every nodes (ex: in case of custom taints, or NoExecute effect)
Since weave 2.5.1, `NoExecute` taint effect is no more supported,
this changes the daemonset tolerations to change this behavior.
Also remove the toleration key `CriticalAddonsOnly` not required anymore.
* fix(kubelet): exec notify restart kubelet service when kube-config.yml changed
* Revert "refactor(kubelet handler): change task name("reload kubelet") this is misleading"
This reverts commit 8f5d29560802c7c997293adb1ce9f84d3b20b6cb.
* fix(handlers,kubelet): setting right notify task name
* Add additional network configuration options to external Openstack CCM (#6083)
* Change the default version of external openstack cloud controller image to v1.18.1 since there was an issue in v1.18.0 where some IPs of the private network were ignored
* Change Network section in external-openstack-cloud-config.j2 to Networking
* Add networking customization information in the openstack documentation
The 98e7a07fba commit udpates the
dashboard version to 2.0.0 but it enable skip login flag wasn't
updated. This change updates its identation to avoid issues when
dashboard_skip_login is enabled.
* bump to dashboard 2.0 rc6 with metrics scrapper
* fix missing yaml seperator making Replicaset complaining about missing ServiceAccount
* unwanted legay gross hack forgot to remove before
* no need namespace on CrBinding
* bump to 2.0.0 release
* remove dashboard_metrics_scrapper_enabled
* replace removed repo with kubic repository for centos 7
* add crio configuration for centos8
* add crio configurations for debian
* use correct crio version for fedora
* simplify calulation of required crio version
- gives possibility to overwrite
* change default path for runc
* change default for seccomp path
* change default for conmon
* declare kubic repo for ubuntu
* do not install crictl twice
* move fedora repo modular tasks to crio_repo file
* move centos repo tasks to crio_repo
* declare crio version matrix for ubuntu
* update documentation crio support for ubuntu
* Add proxy support to CRI-O service
The crio.service requires proxy environment variables when it's
deployed behind a corporated network. This change creates a systemd
configuration file when the proxy variables are defined.
* Remove unnecesary crio's tasks
The playbook that bootstrap openSUSE servers assumes that the
/etc/sysconfig/proxy file exists but the execution fails when
these file is not present. This change guarantees its existence.
* assembly fallback_ips and no_proxy var only one time on localhost and populate result on all hosts
* add tag always, fix ansible lint errors
* workaround to mitogen issue dw/mitogen#663
* do not gather fact before install python on coreos like distros
* try to pass docker molecule test
* fix upgrade of crio on fcos
- update documents
* install conntrack required by kube-proxy
- like commit 48c41bcbe7
* enable fedora modular repo for crio
* allow to override crio configuration
- set cgroup manager same to kubelet_cgroup_driver if defined
- path of seccomp_profile depends on distribution
* allow to override crio configuration
- fix path for ubuntu
* allow to override crio configuration
- fix cni path for fcos
* added required permissions for querying endpointslice resources
* copy-pasted role permissions from cilium install manifests
* bumped cilium version to v1.7.2
* Fix proxy and module_hotfixes
On CentOS 8 with proxy ansible render inline `proxy` and `module_hotfixes` options.
For example:
`proxy=http://127.0.0.1:3128module_hotfixes=True`
But expected result:
```
proxy=http://127.0.0.1:3128
module_hotfixes=True
```
* Use ini_file module for work with ini files
* Prevent duplicates proxy= option in /etc/yum.conf
Module `lineinfile` is weak, use most powerful module `ini_file` and add or remove `proxy=` when `http_proxy` is defined or not.
* etcd: etcd-events doesn't depend on etcd_cluster_setup
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: remove condition already present on include_tasks
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: fix scaling up
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: use *access_addresses, do not delegate to etcd[0]
We want to wait for the full cluster to be healthy,
so use all the cluster addresses
Also we should be able to run the playbook when etcd[0] is down
(not tested), so do not delegate to etcd[0]
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: use failed_when for health check
unhealthy cluster is expected on first run, so use failed_when
instead of ignore_errors to remove scary red messages
Also use run_once
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubernetes/preinstall: ensure ansible_fqdn is up to date after changing /etc/hosts
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubernetes/master: regenerate apiserver cert if needed
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Fix chicken and egg problem with proxy_env not defined on the first envinronment usage.
* Disable fact gathering for the first proxy_env evaluation.
* Move proxy_env var set up from the role defaults to the root playbooks as fact.
* kubernetes-sigs-kubespray #5824
Added support nodes which are part of Virtual Machine Scale Sets(VMSS)
* kubernetes-sigs-kubespray #5824
* kubernetes-sigs-kubespray #5824
Added comments and updatetd azure docs.
* kubernetes-sigs-kubespray #5824
Added supported values comments for "azure_vmtype" in azure.yml