[calico] don't enable ipip encapsulation by default and use vxlan in CI (#8434)

* [calico] make vxlan encapsulation the default

* don't enable ipip encapsulation by default
* set calico_network_backend by default to vxlan
* update sample inventory and documentation

* [CI] pin default calico parameters for upgrade tests to ensure proper upgrade

* [CI] improve netchecker connectivity testing

* [CI] show logs for tests

* [calico] tweak task name

* [CI] Don't run the provisioner from vagrant since we run it in testcases_run.sh

* [CI] move kube-router tests to vagrant to avoid network connectivity issues during netchecker check

* service proxy mode still fails connectivity tests so keeping it in manual mode

* [kube-router] account for containerd use-case
Cristian Calin 2022-03-18 03:05:39 +02:00 committed by GitHub
parent a86d9bd8e8
commit dd2d95ecdf
26 changed files with 229 additions and 82 deletions

@ -100,16 +100,6 @@ packet_ubuntu16-flannel-ha:
extends: .packet_pr
when: manual
packet_ubuntu16-kube-router-sep:
stage: deploy-part2
extends: .packet_pr
when: manual
packet_ubuntu16-kube-router-svc-proxy:
stage: deploy-part2
extends: .packet_pr
when: manual
packet_debian10-cilium-svc-proxy:
stage: deploy-part2
extends: .packet_periodic
@ -165,11 +155,6 @@ packet_fedora34-docker-weave:
extends: .packet_pr
when: on_success
packet_fedora35-kube-router:
stage: deploy-part2
extends: .packet_pr
when: on_success
packet_opensuse-canal:
stage: deploy-part2
extends: .packet_periodic
@ -218,11 +203,6 @@ packet_centos7-calico-ha:
extends: .packet_pr
when: manual
packet_centos7-kube-router:
stage: deploy-part2
extends: .packet_pr
when: manual
packet_centos7-multus-calico:
stage: deploy-part2
extends: .packet_pr

@ -66,3 +66,24 @@ vagrant_ubuntu20-flannel:
stage: deploy-part2
extends: .vagrant
when: on_success
vagrant_ubuntu16-kube-router-sep:
stage: deploy-part2
extends: .vagrant
when: manual
# Service proxy test fails connectivity testing
vagrant_ubuntu16-kube-router-svc-proxy:
stage: deploy-part2
extends: .vagrant
when: manual
vagrant_fedora35-kube-router:
stage: deploy-part2
extends: .vagrant
when: on_success
vagrant_centos7-kube-router:
stage: deploy-part2
extends: .vagrant
when: manual

Vagrantfile

@ -240,6 +240,7 @@ Vagrant.configure("2") do |config|
}
# Only execute the Ansible provisioner once, when all the machines are up and ready.
# And limit the action to gathering facts; the full playbook is run by testcases_run.sh
if i == $num_instances
node.vm.provision "ansible" do |ansible|
ansible.playbook = $playbook
@ -252,7 +253,7 @@ Vagrant.configure("2") do |config|
ansible.host_key_checking = false
ansible.raw_arguments = ["--forks=#{$num_instances}", "--flush-cache", "-e ansible_become_pass=vagrant"]
ansible.host_vars = host_vars
#ansible.tags = ['download']
ansible.tags = ['facts']
ansible.groups = {
"etcd" => ["#{$instance_name_prefix}-[1:#{$etcd_instances}]"],
"kube_control_plane" => ["#{$instance_name_prefix}-[1:#{$kube_master_instances}]"],

@ -210,23 +210,42 @@ calico_node_readinessprobe_timeout: 10
## Config encapsulation for cross server traffic
Calico supports two types of encapsulation: [VXLAN and IP in IP](https://docs.projectcalico.org/v3.11/networking/vxlan-ipip). VXLAN is supported in some environments where IP in IP is not (for example, Azure).
Calico supports two types of encapsulation: [VXLAN and IP in IP](https://docs.projectcalico.org/v3.11/networking/vxlan-ipip). VXLAN is the more mature implementation and is enabled by default. If you need *IP in IP* encapsulation, check that your environment supports it.
*IP in IP* and *VXLAN* are mutually exclusive modes.
Configure Ip in Ip mode. Possible values is `Always`, `CrossSubnet`, `Never`.
```yml
calico_ipip_mode: 'Always'
```
Configure VXLAN mode. Possible values is `Always`, `CrossSubnet`, `Never`.
### IP in IP mode
To configure IP in IP mode, you need to use the bird network backend.
```yml
calico_ipip_mode: 'Always' # Possible values are `Always`, `CrossSubnet`, `Never`
calico_vxlan_mode: 'Never'
calico_network_backend: 'bird'
```
If you use VXLAN mode, BGP networking is not required. You can disable BGP to reduce the moving parts in your cluster by `calico_network_backend: vxlan`
### VXLAN mode (default)
To configure VXLAN mode you can use the default settings; the example below is provided for reference.
```yml
calico_ipip_mode: 'Never'
calico_vxlan_mode: 'Always' # Possible values are `Always`, `CrossSubnet`, `Never`.
calico_network_backend: 'vxlan'
```
In VXLAN mode, BGP networking is not required.
Setting `calico_network_backend: vxlan` disables BGP, which reduces the moving parts in your cluster.
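As a further illustration of the `CrossSubnet` value, the sketch below limits VXLAN encapsulation to traffic that crosses subnet boundaries. It is an assumed variant of the defaults shown above, not a tested recommendation.

```yml
calico_ipip_mode: 'Never'
calico_vxlan_mode: 'CrossSubnet' # encapsulate only traffic between nodes in different subnets
calico_network_backend: 'vxlan'
```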
### BGP mode
To enable BGP no-encapsulation mode:
```yml
calico_ipip_mode: 'Never'
calico_vxlan_mode: 'Never'
calico_network_backend: 'bird'
```
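Inventories written before this change may still set the legacy `ipip` or `ipip_mode` variables, which the preinstall checks now reject. A minimal sketch of an explicit replacement, assuming the previous default behaviour (`ipip: true`, that is, IP in IP always on) should be kept:

```yml
# Legacy variables to remove from the inventory:
# ipip: true
# ipip_mode: 'Always'

# Explicit equivalent using the current variable names
calico_ipip_mode: 'Always'
calico_vxlan_mode: 'Never'
calico_network_backend: 'bird'
```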
## Configuring interface MTU

@ -61,12 +61,12 @@ gcloud compute networks subnets create kubernetes \
#### Firewall Rules
Create a firewall rule that allows internal communication across all protocols.
It is important to note that the ipip protocol has to be allowed in order for
It is important to note that VXLAN traffic (UDP, port 4789 by default) has to be allowed in order for
the calico (see later) networking plugin to work.
```ShellSession
gcloud compute firewall-rules create kubernetes-the-kubespray-way-allow-internal \
--allow tcp,udp,icmp,ipip \
--allow tcp,udp,icmp \
--network kubernetes-the-kubespray-way \
--source-ranges 10.240.0.0/24
```

@ -21,7 +21,9 @@ Some variables of note include:
* *containerd_version* - Specify version of containerd to use when setting `container_manager` to `containerd`
* *docker_containerd_version* - Specify which version of containerd to use when setting `container_manager` to `docker`
* *etcd_version* - Specify version of ETCD to use
* *ipip* - Enables Calico ipip encapsulation by default
* *calico_ipip_mode* - Configures Calico ipip encapsulation - valid values are 'Never', 'Always' and 'CrossSubnet' (default 'Never')
* *calico_vxlan_mode* - Configures Calico vxlan encapsulation - valid values are 'Never', 'Always' and 'CrossSubnet' (default 'Always')
* *calico_network_backend* - Configures Calico network backend - valid values are 'none', 'bird' and 'vxlan' (default 'vxlan')
* *kube_network_plugin* - Sets k8s network plugin (default Calico)
* *kube_proxy_mode* - Changes k8s proxy mode to iptables mode
* *kube_version* - Specify a given Kubernetes version

@ -75,15 +75,15 @@
# typha_max_connections_lower_limit: 300
# Set calico network backend: "bird", "vxlan" or "none"
# bird enable BGP routing, required for ipip mode.
# calico_network_backend: bird
# bird enables BGP routing, required for ipip and no-encapsulation modes
# calico_network_backend: vxlan
# IP in IP and VXLAN are mutually exclusive modes.
# set IP in IP encapsulation mode: "Always", "CrossSubnet", "Never"
# calico_ipip_mode: 'Always'
# calico_ipip_mode: 'Never'
# set VXLAN encapsulation mode: "Always", "CrossSubnet", "Never"
# calico_vxlan_mode: 'Never'
# calico_vxlan_mode: 'Always'
# set VXLAN port and VNI
# calico_vxlan_vni: 4096

@ -36,6 +36,24 @@
- kube_network_plugin is defined
- not ignore_assert_errors
- name: Stop if legacy encapsulation variables are detected (ipip)
assert:
that:
- ipip is not defined
msg: "'ipip' configuration variable is deprecated, please configure your inventory with 'calico_ipip_mode' set to 'Always' or 'CrossSubnet' according to your specific needs"
when:
- kube_network_plugin == 'calico'
- not ignore_assert_errors
- name: Stop if legacy encapsulation variables are detected (ipip_mode)
assert:
that:
- ipip_mode is not defined
msg: "'ipip_mode' configuration variable is deprecated, please configure your inventory with 'calico_ipip_mode' set to 'Always' or 'CrossSubnet' according to your specific needs"
when:
- kube_network_plugin == 'calico'
- not ignore_assert_errors
- name: Stop if incompatible network plugin and cloudprovider
assert:
that:

@ -6,16 +6,17 @@ nat_outgoing: true
calico_pool_name: "default-pool"
calico_ipv4pool_ipip: "Off"
# Use IP-over-IP encapsulation across hosts
ipip: true
ipip_mode: "{{ 'Always' if ipip else 'Never' }}" # change to "CrossSubnet" if you only want ipip encapsulation on traffic going across subnets
calico_ipip_mode: "{{ ipip_mode }}"
calico_vxlan_mode: 'Never'
# Change encapsulation mode, by default we enable vxlan which is the most mature and well tested mode
calico_ipip_mode: Never # valid values are 'Always', 'Never' and 'CrossSubnet'
calico_vxlan_mode: Always # valid values are 'Always', 'Never' and 'CrossSubnet'
calico_ipip_mode_ipv6: Never
calico_vxlan_mode_ipv6: Never
calico_pool_blocksize_ipv6: 116
# Calico network backend can be 'bird', 'vxlan' and 'none'
calico_network_backend: vxlan
calico_cert_dir: /etc/calico/certs
# Global as_num (/calico/bgp/v1/global/as_num)

@ -11,8 +11,6 @@
that:
- "calico_network_backend in ['bird', 'vxlan', 'none']"
msg: "calico network backend is not 'bird', 'vxlan' or 'none'"
when:
- calico_network_backend is defined
- name: "Check ipip and vxlan mode defined correctly"
assert:

@ -194,7 +194,7 @@
- inventory_hostname == groups['kube_control_plane'][0]
- 'calico_conf.stdout == "0"'
- name: Calico | Configure calico ipv6 network pool (version >= v3.3.0)
- name: Calico | Configure calico ipv6 network pool
command:
cmd: "{{ bin_dir }}/calicoctl.sh apply -f -"
stdin: >

@ -15,12 +15,12 @@ data:
# essential.
typha_service_name: "calico-typha"
{% endif %}
{% if calico_network_backend is defined %}
cluster_type: "kubespray"
calico_backend: "{{ calico_network_backend }}"
{% else %}
{% if calico_network_backend == 'bird' %}
cluster_type: "kubespray,bgp"
calico_backend: "bird"
{% else %}
cluster_type: "kubespray"
calico_backend: "{{ calico_network_backend }}"
{% endif %}
{% if inventory_hostname in groups['k8s_cluster'] and peer_with_router|default(false) %}
as: "{{ local_as|default(global_as_num) }}"

@ -176,7 +176,7 @@ spec:
- name: WAIT_FOR_DATASTORE
value: "true"
{% endif %}
{% if calico_network_backend is defined and calico_network_backend == 'vxlan' %}
{% if calico_network_backend == 'vxlan' %}
- name: FELIX_VXLANVNI
value: "{{ calico_vxlan_vni }}"
- name: FELIX_VXLANPORT
@ -319,7 +319,7 @@ spec:
command:
- /bin/calico-node
- -felix-live
{% if calico_network_backend|default("bird") == "bird" %}
{% if calico_network_backend == "bird" %}
- -bird-live
{% endif %}
periodSeconds: 10
@ -330,7 +330,7 @@ spec:
exec:
command:
- /bin/calico-node
{% if calico_network_backend|default("bird") == "bird" %}
{% if calico_network_backend == "bird" %}
- -bird-ready
{% endif %}
- -felix-ready

@ -62,6 +62,14 @@ spec:
- --metrics-path={{ kube_router_metrics_path }}
- --metrics-port={{ kube_router_metrics_port }}
{% endif %}
{% if kube_router_enable_dsr %}
{% if container_manager == "docker" %}
- --runtime-endpoint=unix:///var/run/docker.sock
{% endif %}
{% if container_manager == "containerd" %}
- --runtime-endpoint=unix:///run/containerd/containerd.sock
{% endif %}
{% endif %}
{% for arg in kube_router_extra_args %}
- "{{ arg }}"
{% endfor %}
@ -86,9 +94,16 @@ spec:
privileged: true
volumeMounts:
{% if kube_router_enable_dsr %}
{% if container_manager == "docker" %}
- name: docker-socket
mountPath: /var/run/docker.sock
readOnly: true
{% endif %}
{% if container_manager == "containerd" %}
- name: containerd-socket
mountPath: /run/containerd/containerd.sock
readOnly: true
{% endif %}
{% endif %}
- name: lib-modules
mountPath: /lib/modules
@ -118,10 +133,18 @@ spec:
- operator: Exists
volumes:
{% if kube_router_enable_dsr %}
{% if container_manager == "docker" %}
- name: docker-socket
hostPath:
path: /var/run/docker.sock
type: Socket
{% endif %}
{% if container_manager == "containerd" %}
- name: containerd-socket
hostPath:
path: /run/containerd/containerd.sock
type: Socket
{% endif %}
{% endif %}
- name: lib-modules
hostPath:

@ -79,4 +79,4 @@ create-vagrant:
cp /builds/kargo-ci/kubernetes-sigs-kubespray/inventory/sample/vagrant_ansible_inventory $(INVENTORY)
delete-vagrant:
vagrant destroy -f
vagrant destroy -f

@ -12,3 +12,11 @@ etcd_deployment_type: docker
# Make docker happy
docker_containerd_version: latest
# Pin disabling ipip mode to ensure proper upgrade
ipip: false
calico_vxlan_mode: Always
calico_network_backend: bird
# Needed to bypass deprecation check
ignore_assert_errors: true

@ -6,3 +6,11 @@ mode: default
# Docker specific settings:
container_manager: docker
etcd_deployment_type: docker
# Pin disabling ipip mode to ensure proper upgrade
ipip: false
calico_vxlan_mode: Always
calico_network_backend: bird
# Needed to bypass deprecation check
ignore_assert_errors: true

@ -0,0 +1,15 @@
$num_instances = 2
$vm_memory ||= 2048
$os = "centos"
$kube_master_instances = 1
$etcd_instances = 1
# For CI we are not worried about data persistence across reboot
$libvirt_volume_cache = "unsafe"
# Checking for box update can trigger API rate limiting
# https://www.vagrantup.com/docs/vagrant-cloud/request-limits.html
$box_check_update = false
$network_plugin = "kube-router"

@ -0,0 +1,15 @@
$num_instances = 2
$vm_memory ||= 2048
$os = "fedora35"
$kube_master_instances = 1
$etcd_instances = 1
# For CI we are not worried about data persistence across reboot
$libvirt_volume_cache = "unsafe"
# Checking for box update can trigger API rate limiting
# https://www.vagrantup.com/docs/vagrant-cloud/request-limits.html
$box_check_update = false
$network_plugin = "kube-router"

@ -0,0 +1,15 @@
$num_instances = 2
$vm_memory ||= 2048
$os = "ubuntu1604"
$kube_master_instances = 1
$etcd_instances = 1
# For CI we are not worried about data persistence across reboot
$libvirt_volume_cache = "unsafe"
# Checking for box update can trigger API rate limiting
# https://www.vagrantup.com/docs/vagrant-cloud/request-limits.html
$box_check_update = false
$network_plugin = "kube-router"

@ -0,0 +1,10 @@
$os = "ubuntu1604"
# For CI we are not worried about data persistence across reboot
$libvirt_volume_cache = "unsafe"
# Checking for box update can trigger API rate limiting
# https://www.vagrantup.com/docs/vagrant-cloud/request-limits.html
$box_check_update = false
$network_plugin = "kube-router"

@ -62,7 +62,6 @@
- debug: # noqa unnamed-task
var: nca_pod.stdout_lines
failed_when: not nca_pod is success
when: inventory_hostname == groups['kube_control_plane'][0]
- name: Get netchecker agents
@ -78,16 +77,7 @@
agents.content[0] == '{' and
agents.content|from_json|length >= groups['k8s_cluster']|intersect(ansible_play_hosts)|length * 2
failed_when: false
no_log: true
- debug: # noqa unnamed-task
var: agents.content | from_json
failed_when: not agents is success and not agents.content=='{}'
run_once: true
when:
- agents.content is defined
- agents.content
- agents.content[0] == '{'
no_log: false
- name: Check netchecker status
uri:
@ -96,12 +86,12 @@
return_content: yes
delegate_to: "{{ groups['kube_control_plane'][0] }}"
run_once: true
register: result
register: connectivity_check
retries: 3
delay: "{{ agent_report_interval }}"
until: result.content|length > 0 and
result.content[0] == '{'
no_log: true
until: connectivity_check.content|length > 0 and
connectivity_check.content[0] == '{'
no_log: false
failed_when: false
when:
- agents.content != '{}'
@ -109,20 +99,19 @@
- debug: # noqa unnamed-task
var: ncs_pod
run_once: true
when: not result is success
- name: Get kube-proxy logs
command: "{{ bin_dir }}/kubectl -n kube-system logs -l k8s-app=kube-proxy"
no_log: false
when:
- inventory_hostname == groups['kube_control_plane'][0]
- not result is success
- not connectivity_check is success
- name: Get logs from other apps
command: "{{ bin_dir }}/kubectl -n kube-system logs -l k8s-app={{ item }} --all-containers"
when:
- inventory_hostname == groups['kube_control_plane'][0]
- not result is success
- not connectivity_check is success
no_log: false
with_items:
- kube-router
@ -131,27 +120,51 @@
- calico-node
- cilium
- debug: # noqa unnamed-task
var: result.content | from_json
failed_when: not result is success
- name: Parse agents list
set_fact:
agents_check_result: "{{ agents.content | from_json }}"
delegate_to: "{{ groups['kube_control_plane'][0] }}"
run_once: true
when:
- not agents.content == '{}'
- result.content
- result.content[0] == '{'
- agents is success
- agents.content is defined
- agents.content[0] == '{'
- debug: # noqa unnamed-task
var: result
failed_when: not result is success
var: agents_check_result
delegate_to: "{{ groups['kube_control_plane'][0] }}"
run_once: true
when:
- not agents.content == '{}'
- agents_check_result is defined
- name: Parse connectivity check
set_fact:
connectivity_check_result: "{{ connectivity_check.content | from_json }}"
delegate_to: "{{ groups['kube_control_plane'][0] }}"
run_once: true
when:
- connectivity_check is success
- connectivity_check.content is defined
- connectivity_check.content[0] == '{'
- debug: # noqa unnamed-task
msg: "Cannot get reports from agents, consider as PASSING"
var: connectivity_check_result
delegate_to: "{{ groups['kube_control_plane'][0] }}"
run_once: true
when:
- agents.content == '{}'
- connectivity_check_result is defined
- name: Check connectivity with all netchecker agents
assert:
that:
- agents_check_result is defined
- connectivity_check_result is defined
- agents_check_result.keys() | length > 0
- not connectivity_check_result.Absent
- not connectivity_check_result.Outdated
msg: "Connectivity check to netchecker agents failed"
delegate_to: "{{ groups['kube_control_plane'][0] }}"
run_once: true
- name: Create macvlan network conf
# We cannot use only shell: below because Ansible will render the text