Running remove-node.yml tasks for clean up cluster on Fedora CoreOS.
The task failed to restart network daemon (task name: "reset | Restart network").
Fedora CoreOS is essentially using NetworkManager, but this task returns network.
Signed-off-by: Takashi IIGUNI <iiguni.tks@gmail.com>
* Add unique annotation on coredns deployment and only remove existing deployment if annotation is missing.
* Ignore errors when gathering coredns deployment details to handle case where it doesn't exist yet
* Remove run_once, deletegate_to and add to when statement
* Added force_etcd_cert_refresh var to maintain existing functionality. Broke out etcd node cert syncing from member and admin cert sync logic. Now first etcd will sync node certs to other etcd members on every run to keep all etcds up to date after adding additional worker nodes to the cluster
* Updated etcd cert check tasks to better detect when new certificates need to be generated
* Move usage of force_etcd_cert_refresh var to gen_certs fact set
* Force etcd cert generation per server if force_etcd_cert_refresh is set to true
* Include gathering of node certs even if k8s-cluster member and in etcd group.
* Removed run_once due to when statement
Helm v3.5.2 is a security (patch) release. Users are strongly
recommended to update to this release. It fixes two security issues in
upstream dependencies and one security issue in the Helm codebase.
See https://github.com/helm/helm/releases/tag/v3.5.2
This makes the docker role work the same as the containerd role.
Being able to override this is needed when you have your own debian
repository. E.g. when performing an airgapped installation
* update local-path-storage config template to version v0.0.19
* changes local_path_provisioner image tag to v0.0.19
* removes copy paste example from rancher local-path-provisioner repo
According to the following recommendation, this moves the directory
to control-plane:
The Kubernetes project is moving away from wording that is considered
offensive. A new working group WG Naming was created to track this work,
and the word "master" was declared as offensive. A proposal was formalized
for replacing the word "master" with "control plane".
Previous check for presence of NM assumed "systemctl show
NetworkManager" would exit with a nonzero status code, which seems not
the case anymore with recent Flatcar Container Linux.
This new check also checks the activeness of network manager, as
`is-active` implies presence.
Signed-off-by Jorik Jonker <jorik@kippendief.biz>
This was introduced in 143e2272ff
Extra repo is enabled by default in CentOS, and is not the right repo for EL8
Instead of adding a CentOS repo to RHEL, enable the needed RHEL repos with rhsm_repository
For RHEL 7, we need the "extras" repo for container-selinux
For RHEL 8, we need the "appstream" repo for container-selinux, ipvsadm and socat
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Only checking the kubernetes api on the first master when upgrading is not enough.
Each master needs to be checked before it's upgrade.
Signed-off-by: Rick Haan <rickhaan94@gmail.com>
yum_repository expect really different params, so nothing to factor here
Ubuntu is not an ansible_os_family, the OS family for Ubuntu is Debian
Check for ansible_pkg_mgr == apt
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
we don't need rpm_key, so nothing to factor here
Ubuntu is not an ansible_os_family, the OS family for Ubuntu is Debian
Check for ansible_pkg_mgr == apt
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Here the desciption from Ansible docs
Corresponds to the --force-yes to apt-get and implies allow_unauthenticated: yes
This option will disable checking both the packages' signatures and the certificates of the web servers they are downloaded from.
This option *is not* the equivalent of passing the -f flag to apt-get on the command line
**This is a destructive operation with the potential to destroy your system, and it should almost never be used.** Please also see man apt-get for more information.
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
TASK [Generate a list of information about the images on a node]
registers list of container images to docker_images.
Then the next TASK [Set pull_required if the desired image is not
yet loaded] does based on expecting images are registered.
However sometimes the first TASK was failed as [1] but the failure
is ignored due to failed_when:false and it makes another issue.
This removes this unnecessary failed_when to detect the failure
at the point.
In addition, this removes no_log:true also because the output doesn't
contain any sensitive data and now it just makes debugging difficult.
[1]: https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/934714534#L2953
no_proxy is a pain to get right, and having proxy variables present causes issues
(k8s components get proxy configuration after upgrade, see #7100)
It's better to only configure what require proxy:
- the runtime (containerd/docker/crio)
- the package manager + apt_key
- the download tasks
Tested with the following clusters
- 4 CentOS 8 nodes
- 1 Ubuntu 20.04 node
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
In some environments, it might not be possible to ping the IP address
of the nodes, e.g., because ICMP echo is blocked.
This commit allows kubespray to be configured to disable the ping
check, while performing all other checks.
TASK [network_plugin/calico : Calico | Configure calico network pool] **********
task path: /builds/kargo-ci/kubernetes-sigs-kubespray/roles/network_plugin/calico/tasks/install.yml:138
Friday 08 January 2021 17:10:12 +0000 (0:00:01.521) 0:11:36.885 ********
[WARNING]: The value {'kind': 'IPPool', 'apiVersion': 'projectcalico.org/v3',
'metadata': {'name': 'default-pool'}, 'spec': {'blockSize': 24, 'cidr':
'10.233.64.0/18', 'ipipMode': 'Always', 'vxlanMode': 'Never', 'natOutgoing':
True}} (type dict) in a string field was converted to "{'kind': 'IPPool',
'apiVersion': 'projectcalico.org/v3', 'metadata': {'name': 'default-pool'},
'spec': {'blockSize': 24, 'cidr': '10.233.64.0/18', 'ipipMode': 'Always',
'vxlanMode': 'Never', 'natOutgoing': True}}" (type string). If this does not
look like what you expect, quote the entire value to ensure it does not change.
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Improve how we set 'proxy=' in yum.conf or dnf.conf
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Fixup spaces in no_proxy
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Add svc,svc.{{ dns_domain }} to no_proxy
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Remove because of empty need_http_proxy.rc if http/https_proxy and skip_http_proxy_on_os_packages=true is set
* Modify sample for debian and centos skip_http_proxy
* Modify sample for debian and centos skip_http_proxy
If some settings were changed from the default but not commited into an inventory repo,
we risk breaking the cluster / cause downtime, so add some extra checks
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Upgrading docker / containerd without adapting the configuration might break the node,
so disable docker-ce repo by default.
We are already using dpkg hold for Debian.
All containerd.io packages provide /usr/bin/runc, so no need to check
yum_conf was never used for containerd
module_hotfixes should not be needed with the EL8 repo
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Fedora CoreOS: Fix for ethtool pre-installed
Fix error in rpm-ostree when ethtool is already insatlled (FCOS >= 32.20201104.3.0)
* Fedora CoreOS: Fix connection lost
Fedora CoreOS: Ignore connection lost due to reboot and continues the playbook
We are currently setting the IP variable to hostIP,
Before https://github.com/projectcalico/node/pull/593 (not yet released)
Calico interpret that as hostIP/32
Using 'can-reach' we get the future behavior
This fixes vxlan and IPIP CrossSubnet modes
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* update files to handle multi-asn bgp peering conditions.
* put back in the serviceClusterIPs. Bad merge.
* remove extraneous environment var.
* update files as discussed with mirwan
* update titles.
* add not in.
* add a conditional for using bgp to advertise cluster ips.
Co-authored-by: marlow-h <mweston@habana.ai>
If cluster-name is not set, the default value "kubernetes" is used.
The loadbalancees created by Kubernetes follow the format:
kube_service_clusterName_serviceNamespace_serviceName
If 2 clusters create a loadbalancer for the same service in the same
namespace, they will share the same non-working loadbalancer.
Signed-off-by: Cedric Hnyda <cedric.hnyda@itera.io>
* Update hashes and set default version to 1.19.5
Signed-off-by: anthr76 <hello@anthonyrabbito.com>
* Reorder hashes
1.19.5 hashes should be near 1.19.x
* Added back blank line
This fixes the following warning:
[kubernetes/client : Generate admin kubeconfig with external api endpoint]
[WARNING]: Consider using the file module with state=directory rather than
running 'mkdir'. If you need to use command because file is insufficient
you can
RedHat 8.3 merged nf_conntrack_ipv4 in nf_conntrack but still advertise 4.18
so just try to modprobe and decide depending on the success
Also nf_conntrack is a dependency of ip_vs, so no need to care about it
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Ensure libseccomp is installed before starting containerd on CentOS 8
* Simplify libseccomp install on CentOS 8
- Uses `package` module
- Replaces complex version check with 'state: latest'. The version must
be > 2.3 when using with cri-o.
- Removes unnecessary `not is_ostree` condition as CentOS 8 does not use
ostree
* copying ssh key no longer required, works with password auth
* use copy module instead of synchronize (which requires sshpass)
* less tasks and always changed tasks
* containerd docker hub registry mirror support
* add docs
* fix typo
* fix yamllint
* fix indent in sample
and ansible-playbook param in testcases_run
* fix md
* mv common vars to tests/common/_docker_hub_registry_mirror.yml
* checkout vars to upgrade tests
If crictl (and docker) binaries are deployed to the directories
that are not in standard PATH (e.g. /usr/local/bin), it is required
to specify full path to the binaries.
The task outputs the following warning:
TASK [kubernetes/preinstall : Enable ip forwarding]
[WARNING]: The value 1 (type int) in a string field was converted
to u'1' (type string). If this does not look like what you expect,
quote the entire value to ensure it does not change.
This new version uses the same base image as kube-proxy
(k8s.gcr.io/build-image/debian-iptables)
This allow to automatically pick iptables-legacy or iptables-nft,
and be compatible with RHEL/CentOS 8
https://github.com/kubernetes/dns/pull/367
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* fix flake8 errors in Kubespray CI - tox-inventory-builder
* Invalidate CRI-O kubic repo's cache
Signed-off-by: Victor Morales <v.morales@samsung.com>
* add support to configure pkg install retries
and use in CI job tf-ovh_ubuntu18-calico (due to it failing often)
* Switch Calico, Cilium and MetalLB image repos to Quay.io
Co-authored-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
calico PODs are first started and then in a handler killed and
restarted for no reason, nothing has changed.
By using the existing variable 'calico_cni_config' (only defined when
calico has already started) the restart can be skipped.
* create a wrapper script with pki options
* supports all kubespray managed container engines
Co-authored-by: Hans Feldt <hafe@users.noreply.github.com>
* Allow the eventRecordQPS setting to be set.
The eventRecordQPS parameter controls rate limiting for event recording. When zero, unlimited events can cause denial-of-service situations. For my situation, I don't need more than a setting of "5". This change allows me to configure the setting before creating the cluster.
* Allow the eventRecordQPS setting to be set.
The default settings (see types.go) is five. So, this change does not affect the cluster provisioning. However, it does allow for the setting to be changed.
Fedora 31 uses Cgroups v2 by default. This change by passes the kernel
parameter systemd.unified_cgroup_hierarchy=0.
Signed-off-by: Victor Morales <v.morales@samsung.com>