Large deployments of K8s
For large-scale deployments, consider the following configuration changes:
- Tune Ansible settings for the `forks` and `timeout` vars to fit the large number of nodes being deployed.
- Override containers' `foo_image_repo` vars to point to an intranet registry.
- Set `download_run_once: true` and/or `download_localhost: true`. See download modes for details.
- Adjust the `retry_stagger` global var as appropriate. It should keep the load on the delegate (the first K8s control plane node) sane when failed push or download operations are retried.
- Tune parameters for DNS-related applications. Those are `dns_replicas`, `dns_cpu_limit`, `dns_cpu_requests`, `dns_memory_limit`, and `dns_memory_requests`. Please note that limits must always be greater than or equal to requests.
- Tune CPU/memory limits and requests. Those are located in the roles' defaults and are named like `foo_memory_limit`, `foo_memory_requests`, `foo_cpu_limit`, and `foo_cpu_requests`. Note that 'Mi' memory units for K8s are submitted as 'M' when applied to `docker run`, and K8s CPU units have the 'm' dropped for Docker as well. This is required because Docker does not understand K8s units well.
- Tune `kubelet_status_update_frequency` to increase the reliability of kubelet, and tune `kube_controller_node_monitor_grace_period`, `kube_controller_node_monitor_period`, `kube_apiserver_pod_eviction_not_ready_timeout_seconds`, and `kube_apiserver_pod_eviction_unreachable_timeout_seconds` for better Kubernetes reliability. Check out Kubernetes Reliability.
- Tune network prefix sizes. Those are `kube_network_node_prefix`, `kube_service_addresses`, and `kube_pods_subnet`.
- Add calico-rr nodes if you are deploying with Calico or Canal. Nodes recover from host/network interruption much quicker with calico-rr. Note that the calico-rr role must be on a host without the kube_control_plane or kube-node role (but the etcd role is okay).
- Check out the Inventory section of the Getting started guide for tips on creating a large-scale Ansible inventory.
- Set `etcd_events_cluster_setup: true` to store events in a separate dedicated etcd instance.
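
The DNS and per-role resource items in the list require that limits are never smaller than requests, and describe how K8s units are rewritten for `docker run`. The following is a minimal Python sketch of both rules, not Kubespray's own implementation. The helper names (`parse_quantity`, `check_limits`, `to_docker_units`) and the sample values are hypothetical, and only a subset of K8s quantity suffixes is handled.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes quantity suffixes (assumption:
# only suffixes commonly used for CPU and memory are covered).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "m": Decimal(1) / 1000,  # millicores, e.g. "100m" CPU
}


def parse_quantity(value: str) -> Decimal:
    """Convert a K8s quantity such as '170Mi' or '100m' to a plain number."""
    # Longest suffixes are tried first as a defensive ordering.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if value.endswith(suffix):
            return Decimal(value[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(value)


def check_limits(settings: dict) -> list:
    """Return the '*_limit' names whose value is below the matching '*_requests'."""
    violations = []
    for name, limit in settings.items():
        if not name.endswith("_limit"):
            continue
        requests_name = name[: -len("_limit")] + "_requests"
        if requests_name in settings and parse_quantity(limit) < parse_quantity(
            settings[requests_name]
        ):
            violations.append(name)
    return violations


def to_docker_units(value: str) -> str:
    """Apply the unit rewrite described in the list: 'Mi' becomes 'M', CPU 'm' is dropped."""
    if value.endswith("Mi"):
        return value[:-2] + "M"
    if value.endswith("m"):
        return value[:-1]
    return value


# Hypothetical values; the memory request is deliberately above the limit.
example = {
    "dns_cpu_limit": "300m",
    "dns_cpu_requests": "100m",
    "dns_memory_limit": "64Mi",
    "dns_memory_requests": "128Mi",
}

print(check_limits(example))      # ['dns_memory_limit']
print(to_docker_units("512Mi"))   # 512M
print(to_docker_units("300m"))    # 300
```

A check like this could be run against overridden vars before a deployment so that an inverted limit/requests pair is caught early.
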
For example, when deploying 200 nodes, you may want to run Ansible with `--forks=50` and `--timeout=600`, and define `retry_stagger: 60`.
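
For the same 200-node case, the network prefix sizes mentioned in the list bound how many nodes can receive a pod subnet. The sketch below uses Python's standard `ipaddress` module and assumes that each node is allocated exactly one block of size `kube_network_node_prefix` out of `kube_pods_subnet`; whether that holds depends on the network plugin in use. The helper name `max_nodes` and the CIDR values are illustrative assumptions, not values taken from this document.

```python
import ipaddress


def max_nodes(kube_pods_subnet: str, kube_network_node_prefix: int) -> int:
    """Number of per-node blocks of the given prefix length that fit in the pod subnet."""
    pods = ipaddress.ip_network(kube_pods_subnet)
    if kube_network_node_prefix < pods.prefixlen:
        raise ValueError("node prefix must not be shorter than the pod subnet prefix")
    return 2 ** (kube_network_node_prefix - pods.prefixlen)


node_count = 200  # node count from the example in the text

# Illustrative values only: a /18 pod subnet split into /24 per-node blocks.
small = max_nodes("10.233.64.0/18", 24)
# A wider /16 pod subnet with the same per-node prefix.
large = max_nodes("10.233.0.0/16", 24)

print(small, small >= node_count)  # 64 False
print(large, large >= node_count)  # 256 True
```

Under these assumptions, a /18 pod subnet with /24 node blocks would run out of blocks well before 200 nodes, so either the pod subnet has to be widened or the per-node prefix made longer.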