Added CoreDNS to downloads
Updated with labels. Should now work without RBAC too
Fix DNS settings on hosts
Rename CoreDNS service from kube-dns to coredns
Add rotate based on http://edgeofsanity.net/rant/2017/12/20/systemd-resolved-is-broken.html
Updated docs with CoreDNS info
Added labels and fixed minor settings from official yaml file: https://github.com/kubernetes/kubernetes/blob/release-1.9/cluster/addons/dns/coredns.yaml.sed
Added a secondary deployment and secondary service ip. This is to mitigate dns timeouts and create high resitency for failures. See discussion at 'https://github.com/coreos/coreos-kubernetes/issues/641#issuecomment-281174806'
Set dns list correct. Thanks to @whereismyjetpack
Only download KubeDNS or CoreDNS if selected
Move dns cleanup to its own file and import tasks based on dns mode
Fix install of KubeDNS when dnsmask_kubedns mode is selected
Add new dns option coredns_dual for dual stack deployment. Added variable to configure replicas deployed. Updated docs for dual stack deployment. Removed rotate option in resolv.conf.
Run DNS manifests for CoreDNS and KubeDNS
Set skydns servers on dual stack deployment
Use only one template for CoreDNS dual deployment
Set correct cluster ip for the dns server
* Added cilium support
* Fix typo in debian test config
* Remove empty lines
* Changed cilium version from <latest> to <v1.0.0-rc3>
* Add missing changes for cilium
* Add cilium to CI pipeline
* Fix wrong file name
* Check kernel version for cilium
* fixed ci error
* fixed cilium-ds.j2 template
* added waiting for cilium pods to run
* Fixed missing EOF
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed too many blank lines
* Updated tolerations,annotations in cilium DS template
* Set cilium_version to iptables-1.9 to see if bug is fixed in CI
* Update cilium image tag to v1.0.0-rc4
* Update Cilium test case CI vars filenames
* Add optional prometheus flag, adjust initial readiness delay
* Update README.md with cilium info
kube-proxy is complaining of missing modules at startup. There is a plan
to also support an LVS implementation of kube-proxy in additon to
userspace and iptables
* Add Contiv support
Contiv is a network plugin for Kubernetes and Docker. It supports
vlan/vxlan/BGP/Cisco ACI technologies. It support firewall policies,
multiple networks and bridging pods onto physical networks.
* Update contiv version to 1.1.4
Update contiv version to 1.1.4 and added SVC_SUBNET in contiv-config.
* Load openvswitch module to workaround on CentOS7.4
* Set contiv cni version to 0.1.0
Correct contiv CNI version to 0.1.0.
* Use kube_apiserver_endpoint for K8S_API_SERVER
Use kube_apiserver_endpoint as K8S_API_SERVER to make contiv talks
to a available endpoint no matter if there's a loadbalancer or not.
* Make contiv use its own etcd
Before this commit, contiv is using a etcd proxy mode to k8s etcd,
this work fine when the etcd hosts are co-located with contiv etcd
proxy, however the k8s peering certs are only in etcd group, as a
result the etcd-proxy is not able to peering with the k8s etcd on
etcd group, plus the netplugin is always trying to find the etcd
endpoint on localhost, this will cause problem for all netplugins
not runnign on etcd group nodes.
This commit make contiv uses its own etcd, separate from k8s one.
on kube-master nodes (where net-master runs), it will run as leader
mode and on all rest nodes it will run as proxy mode.
* Use cp instead of rsync to copy cni binaries
Since rsync has been removed from hyperkube, this commit changes it
to use cp instead.
* Make contiv-etcd able to run on master nodes
* Add rbac_enabled flag for contiv pods
* Add contiv into CNI network plugin lists
* migrate contiv test to tests/files
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
* Add required rules for contiv netplugin
* Better handling json return of fwdMode
* Make contiv etcd port configurable
* Use default var instead of templating
* roles/download/defaults/main.yml: use contiv 1.1.7
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
Some time ago I think the hardcoded `/var/lib/docker` was required, but kubelet running in a container has been aware of the Docker path since at least as far back as k8s 1.6.
Without this change, you see a large number of errors in the kubelet logs if you installed with a non-default `docker_daemon_graph`
* Rename dns_server to dnsmasq_dns_server so that it includes role prefix
as the var name is generic and conflicts when integrating with existing ansible automation.
* Enable selinux state to be configurable with new var preinstall_selinux_state
PID namespace sharing is disabled only in Kubernetes 1.7.
Explicitily enabling it by default could help reduce unexpected
results when upgrading to or downgrading from 1.7.
This follows pull request #1677, adding the cgroup-driver
autodetection also for kubeadm way of deploying.
Info about this and the possibility to override is added to the docs.
Red Hat family platforms run docker daemon with `--exec-opt
native.cgroupdriver=systemd`. When kubespray tried to start kubelet
service, it failed with:
Error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
Setting kubelet's cgroup driver to the correct value for the platform
fixes this issue. The code utilizes autodetection of docker's cgroup
driver, as different RPMs for the same distro may vary in that regard.
* kubeadm support
* move k8s master to a subtask
* disable k8s secrets when using kubeadm
* fix etcd cert serial var
* move simple auth users to master role
* make a kubeadm-specific env file for kubelet
* add non-ha CI job
* change ci boolean vars to json format
* fixup
* Update create-gce.yml
* Update create-gce.yml
* Update create-gce.yml
* Updates Controller Manager/Kubelet with Flannel's required configuration for CNI
* Removes old Flannel installation
* Install CNI enabled Flannel DaemonSet/ConfigMap/CNI bins and config (with portmap plugin) on host
* Uses RBAC if enabled
* Fixed an issue that could occur if br_netfilter is not a module and net.bridge.bridge-nf-call-iptables sysctl was not set
If Kubernetes > 1.6 register standalone master nodes w/ a
node-role.kubernetes.io/master=:NoSchedule taint to allow
for more flexible scheduling rather than just marking unschedulable.
According to code apiserver, scheduler, controller-manager, proxy don't
use resolution of objects they created. It's not harmful to change
policy to have external resolver.
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
In kubernetes 1.6 ClusterFirstWithHostNet was added as an option. In
accordance to it kubelet will generate resolv.conf based on own
resolv.conf. However, this doesn't create 'options', thus the proper
solution requires some investigation.
This patch sets the same resolv.conf for kubelet as host
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
Updates based on feedback
Simplify checks for file exists
remove invalid char
Review feedback. Use regular systemd file.
Add template for docker systemd atomic
* Leave all.yml to keep only optional vars
* Store groups' specific vars by existing group names
* Fix optional vars casted as mandatory (add default())
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Kubelet is responsible for creating symlinks from /var/lib/docker to /var/log
to make fluentd logging collector work.
However without using host's /var/log those links are invisible to fluentd.
This is done on rkt configuration too.
Since systemd kubelet.service has {{ ssl_ca_dirs }}, fact should be
gathered before writing kubelet.service.
Closes: #1007
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
- Exclude kubelet CPU/RAM (kube-reserved) from cgroup. It decreases a
chance of overcommitment
- Add a possibility to modify Kubelet node-status-update-frequency
- Add a posibility to configure node-monitor-grace-period,
node-monitor-period, pod-eviction-timeout for Kubernetes controller
manager
- Add Kubernetes Relaibility Documentation with recomendations for
various scenarios.
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
kubelet lost the ability to load kernel modules. This
puts that back by adding the lib/modules mount to kubelet.
The new variable kubelet_load_modules can be set to true
to enable this item. It is OFF by default.
- Docker 1.12 and further don't need nsenter hack. This patch removes
it. Also, it bumps the minimal version to 1.12.
Closes#776
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
* Drop linux capabilities for unprivileged containerized
worlkoads Kargo configures for deployments.
* Configure required securityContext/user/group/groups for kube
components' static manifests, etcd, calico-rr and k8s apps,
like dnsmasq daemonset.
* Rework cloud-init (etcd) users creation for CoreOS.
* Fix nologin paths, adjust defaults for addusers role and ensure
supplementary groups membership added for users.
* Add netplug user for network plugins (yet unused by privileged
networking containers though).
* Grant the kube and netplug users read access for etcd certs via
the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove priveleged mode for calico-rr and run it under its uid/gid
and supplementary etcd_cert group.
* Adjust docs.
* Align cpu/memory limits and dropped caps with added rkt support
for control plane.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Add restart for weave service unit
* Reuse docker_bin_dir everythere
* Limit systemd managed docker containers by CPU/RAM. Do not configure native
systemd limits due to the lack of consensus in the kernel community
requires out-of-tree kernel patches.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Also place in global vars and do not repeat the kube_*_config_dir
and kube_namespace vars for better code maintainability and UX.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Fixes#655.
This is a teporary solution for long-polling idle connections to
apiserver. It will make Nginx not cut them for the duration of expected
timeout. It will also make Nginx extremely slow in realizing that there
is some issue with connectivity to apiserver as well, so it might not be
perfect permanent solution.
* Add dns_replicas, dns_memory/cpu_limit/requests vars for
dns related apps.
* When kube_log_level=4, log dnsmasq queries as well.
* Add log level control for skydns (part of kubedns app).
* Add limits/requests vars for dnsmasq (part of kubedns app) and
dnsmasq daemon set.
* Drop string defaults for kube_log_level as it is int and
is defined in the global vars as well.
* Add docs
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
According to http://kubernetes.io/docs/user-guide/images/ :
By default, the kubelet will try to pull each image from the
specified registry. However, if the imagePullPolicy property
of the container is set to IfNotPresent or Never, then a local\
image is used (preferentially or exclusively, respectively).
Use IfNotPresent value to allow images prepared by the download
role dependencies to be effectively used by kubelet without pull
errors resulting apps to stay blocked in PullBackOff/Error state
even when there are images on the localhost exist.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>