Both kubedns and dnsmasq modes are long not maintained.
We should run dns_late steps at the end because sshd
makes DNS lookups during Ansible run and has 2s timeouts
for each failed lookup trying to connect to coredns before
it is ready.
* Add support for running a nodelocal dns cache
After encountering dns issues in a cluster I was recently working on I
noticed Kubernetes 1.13 introduced support for running a nodelocal dns
cache.
I believe this can usefull for more people.
73b548db06https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/0030-nodelocal-dns-cache.md
* Add requested changes
* Add additional requested changes + documentation
* Add requested changes after review
* Replace incorrect variable
Added CoreDNS to downloads
Updated with labels. Should now work without RBAC too
Fix DNS settings on hosts
Rename CoreDNS service from kube-dns to coredns
Add rotate based on http://edgeofsanity.net/rant/2017/12/20/systemd-resolved-is-broken.html
Updated docs with CoreDNS info
Added labels and fixed minor settings from official yaml file: https://github.com/kubernetes/kubernetes/blob/release-1.9/cluster/addons/dns/coredns.yaml.sed
Added a secondary deployment and secondary service ip. This is to mitigate dns timeouts and create high resitency for failures. See discussion at 'https://github.com/coreos/coreos-kubernetes/issues/641#issuecomment-281174806'
Set dns list correct. Thanks to @whereismyjetpack
Only download KubeDNS or CoreDNS if selected
Move dns cleanup to its own file and import tasks based on dns mode
Fix install of KubeDNS when dnsmask_kubedns mode is selected
Add new dns option coredns_dual for dual stack deployment. Added variable to configure replicas deployed. Updated docs for dual stack deployment. Removed rotate option in resolv.conf.
Run DNS manifests for CoreDNS and KubeDNS
Set skydns servers on dual stack deployment
Use only one template for CoreDNS dual deployment
Set correct cluster ip for the dns server
Do not repeat options and nameservers in the dhclient hooks.
Do not prepend nameservers for dhclient but supersede and fail back
to the upstream_dns_resolvers then default_resolver. Fixes order of
nameservers placement, which is cluster DNS ip goes always first.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* For Debian/RedHat OS families (with NetworkManager/dhclient/resolvconf
optionally enabled) prepend /etc/resolv.conf with required nameservers,
options, and supersede domain and search domains via the dhclient/resolvconf
hooks.
* Drop (z)nodnsupdate dhclient hook and re-implement it to complement the
resolvconf -u command, which is distro/cloud provider specific.
Update docs as well.
* Enable network restart to apply and persist changes and simplify handlers
to rely on network restart only. This fixes DNS resolve for hostnet K8s
pods for Red Hat OS family. Skip network restart for canal/calico plugins,
unless https://github.com/projectcalico/felix/issues/1185 fixed.
* Replace linefiles line plus with_items to block mode as it's faster.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
skip_dnsmasq_k8s var as not needed anymore.
* Preconfigure DNS stack early, which may be the case when downloading
artifacts from intranet repositories. Do not configure
K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be
not existing).
* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
was set up and before K8s apps to be created.
* Move docker install task to early stage as well and unbind it from the
etcd role's specific install path. Fix external flannel dependency on
docker role handlers. Also fix the docker restart handlers' steps
ordering to match the expected sequence (the socket then the service).
* Add default resolver fact, which is
the cloud provider specific and remove hardcoded GCE resolver.
* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
domains combined with high ndots values lead to poor performance of
DNS stack and make ansible workers to fail very often with the
"Timeout (12s) waiting for privilege escalation prompt:" error.
* Update docs.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add a var for ndots (default 5) and put it hosts' /etc/resolv.conf.
* Poke kube dns container image to v1.7
* In order to apply changes to kubelet, notify it to
be restarted on changes made to /etc/resolv.conf. Ignore errors as the kubelet
may yet to be present up to the moment of the notification being processed.
* Remove unnecessary kubelet restart for master role as the node role ensures
it is up and running. Notify master static pods waiters for apiserver,
scheduler, controller-manager instead.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
- Update docs and a drawing to clarify DNS setup.
- Change order of nameservers placement to match
changes in https://github.com/kubespray/kargo/pull/501
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Change additional dnsmasq opts:
- Adjust caching size and TTL
- Disable resolve conf to not create loops
- Change dnsPolicy to default (similarly to kubedns's dnsmasq). The
ClusterFirst should not be used to not create loops
- Disable negative NXDOMAIN replies to be cached
- Make its very installation as optional step (enabled by default).
If you don't want more than 3 DNS servers, including 1 for K8s, disable
it.
- Add docs and a drawing to clarify DNS setup.
- Fix stdout logs for dnsmasq/kubedns app configs
- Add missed notifies to resolvconf -u handler
- Fix idempotency of resolvconf head file changes
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>