* fix(misc): terraform/aws
- handles deployment with a single availability zone
- handles deployment with more than two availability zone
- handles etcd collocation with control-plane nodes (`aws_etcd_num=0`)
- allows to set a bastion instances count (`aws_bastion_num`)
- allows to set bastion/etcd/control-plane/workers rootfs volume size
- removes variables from terraform.tfvars that were not re-used
- adds .terraform.lock.hcl to .gitignore
- changes/updates base image from ubuntu-18.03 to debian-10
tested by a few coworkers of mine, and myself: thanks for the outstanding
work, on both those terraform samples and kubespray playbooks.
I did not test ubuntu deployments, I could still swap from buster to
focal. LMK.
* fix(gitlab-ci)
AFAIU, terraform.tfvars indentation should be fixed for / no diff
returned running `terraform fmt -check -diff`
https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/1445622114
When running the script, I faced the following error but it was
difficult to know the root problem due to lack of error handling.
docker tag" requires exactly 2 arguments.
See 'docker tag --help'.
Usage: docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
To investigate such errors easily, this adds an error handling.
* Ansible: move to Ansible 3.4.0 which uses ansible-base 2.10.10
* Docs: add a note about ansible upgrade post 2.9.x
* CI: ensure ansible is removed before ansible 3.x is installed to avoid pip failures
* Ansible: use newer ansible-lint
* Fix ansible-lint 5.0.11 found issues
* syntax issues
* risky-file-permissions
* var-naming
* role-name
* molecule tests
* Mitogen: use 0.3.0rc1 which adds support for ansible 2.10+
* Pin ansible-base to 2.10.11 to get package fix on RHEL8
* terraform/openstack: Use path.root for ansible_bastion_template.txt
The path.root variable points to the root module path. Using this
instead of a relative path makes less assumptions about the current
working directory.
* terraform/openstack: Add group_vars_path variable
Previously, the group_vars path was assumed to be in CWD. The
default value for the group_vars_path variable is still relative
to CWD and thus should be backwards compatible if unset.
* Packet->Equinix Metal rename #6901
Updates throughout to reflect #6901 renaming for Packet to Equinix Metal.
* Rename Packet to Equinix Metal throughout the project #6901
Packet is renamed to Equinix Metal in more contexts including
documentation links. The Terraform provider used is still the Packet
provider. The environment variables and configuration options still
refer to the Packet name.
Signed-off-by: Marques Johansson <mjohansson@equinix.com>
Co-authored-by: Edward Vielmetti <ed@packet.net>
Basically we need to make necessary TCP/UDP ports open.
However the necessary ports are so many, and sometimes it is difficult
to figure out that is due to firewall issues or not if facing deployment
issues.
To distinguish a root problem on such situation, this adds contrib
playbook to disable the service firewall for Kubespray development
and test.
* rename ansible groups to use _ instead of -
k8s-cluster -> k8s_cluster
k8s-node -> k8s_node
calico-rr -> calico_rr
no-floating -> no_floating
Note: kube-node,k8s-cluster groups in upgrade CI
need clean-up after v2.16 is tagged
* ensure old groups are mapped to the new ones
Context: Load-balancing in Exoscale is performed by associating many
workers with the same EIP. This works, however, the workers cannot access
themselves via the EIP, which is needed at least for cert-managers
"self-test".
Problem: The old iptables based workaround felt fragile and disappointed
me at least once.
New solution: Add the EIP to a loopback interface on each worker.
* Remove contrib/vault
This is marked as broken since 2018 / 3dcb914607
This still reference apiserver.pem, not used since ddffdb63bf
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
* Finish nuking vault from the codebase
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
This replaces kube-master with kube_control_plane because of [1]:
The Kubernetes project is moving away from wording that is
considered offensive. A new working group WG Naming was created
to track this work, and the word "master" was declared as offensive.
A proposal was formalized for replacing the word "master" with
"control plane". This means it should be removed from source code,
documentation, and user-facing configuration from Kubernetes and
its sub-projects.
NOTE: The reason why this changes it to kube_control_plane not
kube-control-plane is for valid group names on ansible.
[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
* terraform support for UpCloud
* terraform support for UpCloud
* terraform support for UpCloud
* terraform support for UpCloud
* terraform support for UpCloud
* terraform support for UpCloud
* terraform support for UpCloud
* Updates to README.md and main.tf files
* formatting and updating readme
* added a .terraform_validate CI job
* fixed format issue
* added sample inventory
* added symbolic link to group_vars
* added missing tf variables and minor fixes
* added text formatting
* minor formatting fixes
* Add terraform scripts for vSphere
* Fixup: Add terraform scripts for vSphere
* Add inventory generation
* Use machines var to provide IPs
* Add README file
* Add default.tfvars file
* Fix newlines at the end of files
* Remove master.count and worker.count variables
* Fixup cloud-init formatting
* Fixes after initial review
* Add warning about disabled DHCP
* Fixes after second review
* Add sample-inventory
This replaces KUBE_MASTERS with KUBE_CONTROL_HOSTS because of [1]:
```
The Kubernetes project is moving away from wording that is
considered offensive. A new working group WG Naming was created
to track this work, and the word "master" was declared as offensive.
A proposal was formalized for replacing the word "master" with
"control plane". This means it should be removed from source code,
documentation, and user-facing configuration from Kubernetes and
its sub-projects.
```
[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
* contrib/terraform/exoscale: Rework SSH public keys
Exoscale has a few limitations with `exoscale_ssh_keypair` resources.
Creating several clusters with these scripts may lead to an error like:
```
Error: API error ParamError 431 (InvalidParameterValueException 4350): The key pair "lj-sc-ssh-key" already has this fingerprint
```
This patch reworks handling of SSH public keys. Specifically, we rely on
the more cloud-agnostic way of configuring SSH public keys via
`cloud-init`.
* contrib/terraform/exoscale: terraform fmt
* contrib/terraform/exoscale: Add terraform validate
* contrib/terraform/exoscale: Inline public SSH keys
The Terraform scripts need to install some SSH key, so that Kubespray
(i.e., the "Ansible part") can take over. Initially, we pointed the
Terraform scripts to `~/.ssh/id_rsa.pub`. This proved to be suboptimal:
Operators sharing responbility for a cluster risk unnecessarily replacing resources.
Therefore, it has been determined that it's best to inline the public
SSH keys. The chosen variable `ssh_public_keys` provides some uniformity
with `contrib/azurerm`.
* Fix Terraform Exoscale test
* Fix Terraform 0.14 test
This variable was added as KUBE_MASTERS_MASTERS. That's probably a typo.
Remove the redundant `_MASTERS` suffix. Also, document the variable in the
help message.
This fixes the following failures:
./contrib/offline/README.md:14:1 MD014/commands-show-output Dollar signs used before commands without showing output [Context: "$ ./manage-offline-container-i..."]
./contrib/offline/README.md:20:1 MD014/commands-show-output Dollar signs used before commands without showing output [Context: "$ ./manage-offline-container-i..."]
* [terraform/aws] Fix Terraform >=0.13 warnings
Terraform >=0.13 gives the following warning:
```
Warning: Interpolation-only expressions are deprecated
```
The fix was tested as follows:
```
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
which gave no errors nor warnings.
* [terraform/openstack] Fixes for Terraform >=0.13
Terraform >=0.13 gives the following error:
```
Error: Failed to install providers
Could not find required providers, but found possible alternatives:
hashicorp/openstack -> terraform-provider-openstack/openstack
```
This patch fixes these errors.
This fix was tested as follows:
```
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
which gave no errors nor warnings for Terraform 0.13.5 and Terraform
0.14.3. Unfortunately, 0.12.x gives a harmless warning, but
with 0.14.3 out the door, I guess we need to move on.
* [terraform/packet] Fixes for Terraform >=0.13
This fix was tested as follows:
```
export PACKET_AUTH_TOKEN=blah-blah
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
Errors are gone, but warnings still remain. It is impossible to please
all three versions of Terraform.
* Add tests for Terraform >=0.13
Now markdownlint covers ./README.md and md files under ./docs only.
However we have a lot of md files under different directories also.
This enables markdownlint for other md files also.
* fix flake8 errors in Kubespray CI - tox-inventory-builder
* Invalidate CRI-O kubic repo's cache
Signed-off-by: Victor Morales <v.morales@samsung.com>
* add support to configure pkg install retries
and use in CI job tf-ovh_ubuntu18-calico (due to it failing often)
* Switch Calico, Cilium and MetalLB image repos to Quay.io
Co-authored-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
* Add note about changing private IP in admin.conf.
When I run kubespray, a load balancer is created which should be used instead of the ip of the controller node.
* Procedure to find load balancer and update admin.conf
When I run kubespray, a load balancer is used instead of the private ip of the controller.
I kept seeing `TLS handshake error from 10.250.250.158:63770: EOF` from two IP addresses that correlate to my ELB. Changing the health check from TCP to HTTPS stopped the errors from being generated.