Red Hat has this theory that binaries in sbin are too dangerous to be on
the default path, but we need them anyway.
RH7 has /sbin and /usr/sbin as symlinks, so that is no longer important.
I'm adding it to the `PATH` instead of making the path to `modinfo`
absolute because I am worried about breaking support for other
distributions.
On Aarch64, the default cgroup driver for docker is systemd
instead of cgroupfs. Should conform kubelet to use systemd
as cgroup driver as well to keep it consistent with docker.
Without this change, below exception will be raised.
/usr/bin/docker-current: Error response from daemon: shim
error: docker-runc not installed on system.
Change-Id: Id496ec9eaac6580e4da2f3ef1a386c9abc2a5129
The number of pods on a given node is determined by the --max-pods=k
directive. When the address space is exhausted, no more pods can be
scheduled even if from the --max-pods-perspective, the node still has
capacity.
The special case that a pod is scheduled and uses the node IP in the
host network namespace is too "soft" to derive a guarantee.
Comparing kubelet_max_pods with kube_network_node_prefix when given
allows to assert that pod limits match the CIDR address space.
* Move front-proxy-client certs back to kube mount
We want the same CA for all k8s certs
* Refactor vault to use a third party module
The module adds idempotency and reduces some of the repetitive
logic in the vault role
Requires ansible-modules-hashivault on ansible node and hvac
on the vault hosts themselves
Add upgrade test scenario
Remove bootstrap-os tags from tasks
* fix upgrade issues
* improve unseal logic
* specify ca and fix etcd check
* Fix initialization check
bump machine size
* sysctl file should be in defaults so that it can be overriden
* Change sysctl_file_path to be consistent with roles/kubernetes/preinstall/defaults/main.yml
While `do` looks cleaner, forcing this extra option in ansible.cfg
seems to be more invasive. It would be better to keep the traditional
approach of `set dummy = ` instead.
Added CoreDNS to downloads
Updated with labels. Should now work without RBAC too
Fix DNS settings on hosts
Rename CoreDNS service from kube-dns to coredns
Add rotate based on http://edgeofsanity.net/rant/2017/12/20/systemd-resolved-is-broken.html
Updated docs with CoreDNS info
Added labels and fixed minor settings from official yaml file: https://github.com/kubernetes/kubernetes/blob/release-1.9/cluster/addons/dns/coredns.yaml.sed
Added a secondary deployment and secondary service ip. This is to mitigate dns timeouts and create high resitency for failures. See discussion at 'https://github.com/coreos/coreos-kubernetes/issues/641#issuecomment-281174806'
Set dns list correct. Thanks to @whereismyjetpack
Only download KubeDNS or CoreDNS if selected
Move dns cleanup to its own file and import tasks based on dns mode
Fix install of KubeDNS when dnsmask_kubedns mode is selected
Add new dns option coredns_dual for dual stack deployment. Added variable to configure replicas deployed. Updated docs for dual stack deployment. Removed rotate option in resolv.conf.
Run DNS manifests for CoreDNS and KubeDNS
Set skydns servers on dual stack deployment
Use only one template for CoreDNS dual deployment
Set correct cluster ip for the dns server
* Added cilium support
* Fix typo in debian test config
* Remove empty lines
* Changed cilium version from <latest> to <v1.0.0-rc3>
* Add missing changes for cilium
* Add cilium to CI pipeline
* Fix wrong file name
* Check kernel version for cilium
* fixed ci error
* fixed cilium-ds.j2 template
* added waiting for cilium pods to run
* Fixed missing EOF
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed too many blank lines
* Updated tolerations,annotations in cilium DS template
* Set cilium_version to iptables-1.9 to see if bug is fixed in CI
* Update cilium image tag to v1.0.0-rc4
* Update Cilium test case CI vars filenames
* Add optional prometheus flag, adjust initial readiness delay
* Update README.md with cilium info
* allow installs to not have hostname overriden with fqdn from inventory
* calico-config no longer requires local as and will default to global
* when cloudprovider is not defined, use the inventory_hostname for cni-calico
* allow reset to not restart network (buggy nodes die with this cmd)
* default kube_override_hostname to inventory_hostname instead of ansible_hostname
kube-proxy is complaining of missing modules at startup. There is a plan
to also support an LVS implementation of kube-proxy in additon to
userspace and iptables
Starting with Kubernetes v1.8.4, kubelet ignores the AWS cloud
provider string and uses the override hostname, which fails
Node admission checks.
Fixes#2094
As we have seen with other containers, sometimes container removal fails on the first attempt due to some Docker bugs. Retrying typically corrects the issue.
* Add Contiv support
Contiv is a network plugin for Kubernetes and Docker. It supports
vlan/vxlan/BGP/Cisco ACI technologies. It support firewall policies,
multiple networks and bridging pods onto physical networks.
* Update contiv version to 1.1.4
Update contiv version to 1.1.4 and added SVC_SUBNET in contiv-config.
* Load openvswitch module to workaround on CentOS7.4
* Set contiv cni version to 0.1.0
Correct contiv CNI version to 0.1.0.
* Use kube_apiserver_endpoint for K8S_API_SERVER
Use kube_apiserver_endpoint as K8S_API_SERVER to make contiv talks
to a available endpoint no matter if there's a loadbalancer or not.
* Make contiv use its own etcd
Before this commit, contiv is using a etcd proxy mode to k8s etcd,
this work fine when the etcd hosts are co-located with contiv etcd
proxy, however the k8s peering certs are only in etcd group, as a
result the etcd-proxy is not able to peering with the k8s etcd on
etcd group, plus the netplugin is always trying to find the etcd
endpoint on localhost, this will cause problem for all netplugins
not runnign on etcd group nodes.
This commit make contiv uses its own etcd, separate from k8s one.
on kube-master nodes (where net-master runs), it will run as leader
mode and on all rest nodes it will run as proxy mode.
* Use cp instead of rsync to copy cni binaries
Since rsync has been removed from hyperkube, this commit changes it
to use cp instead.
* Make contiv-etcd able to run on master nodes
* Add rbac_enabled flag for contiv pods
* Add contiv into CNI network plugin lists
* migrate contiv test to tests/files
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
* Add required rules for contiv netplugin
* Better handling json return of fwdMode
* Make contiv etcd port configurable
* Use default var instead of templating
* roles/download/defaults/main.yml: use contiv 1.1.7
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
* Defaults for apiserver_loadbalancer_domain_name
When loadbalancer_apiserver is defined, use the
apiserver_loadbalancer_domain_name with a given default value.
Fix unconsistencies for checking if apiserver_loadbalancer_domain_name
is defined AND using it with a default value provided at once.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Define defaults for LB modes in common defaults
Adjust the defaults for apiserver_loadbalancer_domain_name and
loadbalancer_apiserver_localhost to come from a single source, which is
kubespray-defaults. Removes some confusion and simplefies the code.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Fixes an issue where apiserver and friends (controller manager, scheduler) were prevented from restarting after manifests/secrets are changed. This occurred when a replaced kubelet doesn't reconcile new master manifests, which caused old master component versions to linger during deployment. In my case this was causing upgrades from k8s 1.6/1.7 -> k8s 1.8 to fail
* Improves transitions from kubelet container to host kubelet by preventing issues where kubelet container reappeared during the deployment
Some time ago I think the hardcoded `/var/lib/docker` was required, but kubelet running in a container has been aware of the Docker path since at least as far back as k8s 1.6.
Without this change, you see a large number of errors in the kubelet logs if you installed with a non-default `docker_daemon_graph`
* Refactor downloads to use download role directly
Also disable fact delegation so download delegate works acros OSes.
* clean up bools and ansible_os_family conditionals
* Rename dns_server to dnsmasq_dns_server so that it includes role prefix
as the var name is generic and conflicts when integrating with existing ansible automation.
* Enable selinux state to be configurable with new var preinstall_selinux_state
PID namespace sharing is disabled only in Kubernetes 1.7.
Explicitily enabling it by default could help reduce unexpected
results when upgrading to or downgrading from 1.7.
This follows pull request #1677, adding the cgroup-driver
autodetection also for kubeadm way of deploying.
Info about this and the possibility to override is added to the docs.
Red Hat family platforms run docker daemon with `--exec-opt
native.cgroupdriver=systemd`. When kubespray tried to start kubelet
service, it failed with:
Error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
Setting kubelet's cgroup driver to the correct value for the platform
fixes this issue. The code utilizes autodetection of docker's cgroup
driver, as different RPMs for the same distro may vary in that regard.
* kubeadm support
* move k8s master to a subtask
* disable k8s secrets when using kubeadm
* fix etcd cert serial var
* move simple auth users to master role
* make a kubeadm-specific env file for kubelet
* add non-ha CI job
* change ci boolean vars to json format
* fixup
* Update create-gce.yml
* Update create-gce.yml
* Update create-gce.yml
This sets br_netfilter and net.bridge.bridge-nf-call-iptables sysctl from a single play before kube-proxy is first ran instead of from the flannel and weave network_plugin roles after kube-proxy is started
* Updates Controller Manager/Kubelet with Flannel's required configuration for CNI
* Removes old Flannel installation
* Install CNI enabled Flannel DaemonSet/ConfigMap/CNI bins and config (with portmap plugin) on host
* Uses RBAC if enabled
* Fixed an issue that could occur if br_netfilter is not a module and net.bridge.bridge-nf-call-iptables sysctl was not set
* Adding yaml linter to ci check
* Minor linting fixes from yamllint
* Changing CI to install python pkgs from requirements.txt
- adding in a secondary requirements.txt for tests
- moving yamllint to tests requirements
If Kubernetes > 1.6 register standalone master nodes w/ a
node-role.kubernetes.io/master=:NoSchedule taint to allow
for more flexible scheduling rather than just marking unschedulable.
Change kubelet deploy mode to host
Enable cri and qos per cgroup for kubelet
Update CoreOS images
Add upgrade hook for switching from kubelet deployment from docker to host.
Bump machine type for ubuntu-rkt-sep
According to code apiserver, scheduler, controller-manager, proxy don't
use resolution of objects they created. It's not harmful to change
policy to have external resolver.
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>