* Leave only optional vars in all.yml
* Store group-specific vars under the existing group names
* Fix optional vars treated as mandatory (add default(); see the sketch below)
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes
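A minimal sketch of the default() pattern for such an optional var; the
variable names are illustrative, not the exact ones used in the repo:

    # Jinja usage in a template or task, falling back when the optional var is unset
    address: "{{ some_optional_ip | default(ansible_default_ipv4.address) }}"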
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Docker 1.13 changes the default iptables behaviour from allow
to drop. This patch disables Docker's iptables management,
restoring the Docker 1.12 behaviour [1].
[1] https://github.com/docker/docker/pull/28257
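A sketch of how the pre-1.13 behaviour can be kept, assuming the flag is
carried in the daemon options variable (the variable name is illustrative;
--iptables is a real dockerd flag):

    docker_options: "--iptables=false {{ docker_extra_opts | default('') }}"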
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
kubelet lost the ability to load kernel modules. This
puts that back by adding the /lib/modules mount to kubelet.
The new variable kubelet_load_modules can be set to true
to enable the mount; it is OFF by default (see the sketch below).
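An illustrative fragment of the conditional mount in a kubelet container
wrapper template (the surrounding docker run command is assumed, not shown):

    {% if kubelet_load_modules|default(false) %}
      -v /lib/modules:/lib/modules:ro \
    {% endif %}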
* Drop linux capabilities for unprivileged containerized
workloads Kargo configures for deployments.
* Configure the required securityContext/user/group/groups for kube
components' static manifests, etcd, calico-rr and k8s apps,
like the dnsmasq daemonset (see the sketch below).
* Rework cloud-init (etcd) user creation for CoreOS.
* Fix nologin paths, adjust defaults for the addusers role and ensure
supplementary group membership is added for users.
* Add a netplug user for network plugins (though not yet used by privileged
networking containers).
* Grant the kube and netplug users read access to etcd certs via
the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove privileged mode for calico-rr and run it under its uid/gid
and the supplementary etcd_cert group.
* Adjust docs.
* Align CPU/memory limits and dropped caps with the added rkt support
for the control plane.
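A minimal sketch of the kind of securityContext this introduces for an
unprivileged component; the uid and the capability list are illustrative, not
the exact values used:

    securityContext:
      runAsUser: 999            # illustrative uid of a dedicated system user
      capabilities:
        drop:
          - ALL
        add:
          - NET_BIND_SERVICE    # only where a low port must be bound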
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Add restart for weave service unit
* Reuse docker_bin_dir everywhere
* Limit systemd-managed docker containers by CPU/RAM (see the sketch below).
Do not configure native systemd limits, as the lack of consensus in the
kernel community means that would require out-of-tree kernel patches.
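An illustrative shape of the limits as docker run flags in a unit's ExecStart;
the variable names, values and image are examples only:

    ExecStart={{ docker_bin_dir }}/docker run \
      --memory={{ some_component_memory_limit | default('400M') }} \
      --cpu-shares={{ some_component_cpu_shares | default(100) }} \
      --name=some-component {{ some_component_image }}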
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Also place the kube_*_config_dir and kube_namespace vars in global vars and do
not repeat them, for better code maintainability and UX.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to the preinstall role. Remove the
skip_dnsmasq_k8s var as it is not needed anymore.
* Preconfigure the DNS stack early, which is needed when downloading
artifacts from intranet repositories. Do not configure K8s DNS
resolvers in hosts' /etc/resolv.conf at that early stage (as they may
not exist yet).
* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
has been set up and before K8s apps are created.
* Move the docker install task to an early stage as well and unbind it from
the etcd role's specific install path. Fix the external flannel dependency
on docker role handlers. Also fix the ordering of the docker restart
handlers' steps to match the expected sequence (first the socket, then the
service).
* Add a default resolver fact, which is cloud provider specific, and
remove the hardcoded GCE resolver.
* Reduce the default ndots for hosts' /etc/resolv.conf to 2 (see the sketch
below). Multiple search domains combined with high ndots values lead to poor
DNS stack performance and make ansible workers fail very often with the
"Timeout (12s) waiting for privilege escalation prompt:" error.
* Update docs.
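The resulting shape of a host's /etc/resolv.conf; the addresses and search
domain are illustrative:

    search example.com
    nameserver 10.233.0.3    # K8s resolver, added only after kubedns/dnsmasq is up
    nameserver 8.8.8.8       # default/cloud-specific resolver, configured early
    options ndots:2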
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add an option to deploy a K8s app that tests e2e network connectivity
and cluster DNS resolution via Kubedns for nethost/simple pods
(defaults to false).
* Parametrize the existing k8s app templates with kube_namespace and
kube_config_dir instead of hardcoded values (see the sketch below).
* For CoreOS, ensure nameservers from the inventory are put in the
first place to allow hostnet pods connectivity via short names
or FQDNs, and hostnet agents to pass as well, if netchecker is
deployed.
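An illustrative fragment of the template parametrization and its destination
path; the resource name is an example only:

    # app template fragment
    metadata:
      name: netchecker-agent
      namespace: "{{ kube_namespace }}"

    # task fragment
    dest: "{{ kube_config_dir }}/netchecker-agent.yml"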
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
According to http://kubernetes.io/docs/user-guide/images/ :
By default, the kubelet will try to pull each image from the
specified registry. However, if the imagePullPolicy property
of the container is set to IfNotPresent or Never, then a local
image is used (preferentially or exclusively, respectively).
Use the IfNotPresent value to allow images prepared by the download
role dependencies to be used by kubelet without pull errors that
leave apps stuck in the ImagePullBackOff/Error state even when the
images exist on the local host.
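The relevant pod spec field, shown for an arbitrary container; the name and
image variables are illustrative:

    containers:
      - name: some-component
        image: "{{ some_image_repo }}:{{ some_image_tag }}"
        imagePullPolicy: IfNotPresent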
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
test to change the machine type
Revert "test to change the machine type"
This reverts commit 7a91f1b5405a39bee6cb91940b09a0b0f9d3aee1.
Use the Google DNS server when no upstream DNS servers are defined
Comment out upstream_dns_servers
Update documentation
remove deprecated kubelet flags
Revert "remove deprecated kubelet flags"
This reverts commit 21e3b893c896d0291c36a07d0414f4cb88b8d8ac.
Also adds all masters by hostname and localhost/127.0.0.1 to the
apiserver SSL certificate.
Includes a documentation update on how the localhost loadbalancer works.
* Add a var for ndots (default 5) and put it into hosts' /etc/resolv.conf.
* Bump the kubedns container image to v1.7.
* In order to apply changes to kubelet, notify it to be restarted on changes
made to /etc/resolv.conf (see the sketch below). Ignore errors, as the
kubelet may not yet be present at the moment the notification is processed.
* Remove the unnecessary kubelet restart for the master role, as the node
role ensures it is up and running. Notify the master static pod waiters for
the apiserver, scheduler and controller-manager instead.
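A sketch of the resolv.conf task and the restart handler with errors ignored;
the task/handler names and module choice are assumptions:

    - name: Add ndots option to /etc/resolv.conf
      lineinfile:
        dest: /etc/resolv.conf
        line: "options ndots:{{ ndots | default(5) }}"
      notify: restart kubelet

    # handler
    - name: restart kubelet
      service:
        name: kubelet
        state: restarted
      ignore_errors: true    # kubelet may not be present yet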
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Change additional dnsmasq opts:
- Adjust the caching size and TTL (see the flags sketched below)
- Disable reading resolv.conf to avoid loops
- Change dnsPolicy to Default (similarly to kubedns's dnsmasq);
ClusterFirst should not be used, as it creates loops
- Disable caching of negative NXDOMAIN replies
- Make its installation an optional step (enabled by default).
If you don't want more than 3 DNS servers, including 1 for K8s, disable
it.
- Add docs and a drawing to clarify the DNS setup.
- Fix stdout logs for dnsmasq/kubedns app configs
- Add missing notifies to the resolvconf -u handler
- Fix idempotency of resolvconf head file changes
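The dnsmasq options involved (real dnsmasq flags; sizes and TTLs are
illustrative), plus the pod spec change:

    --cache-size=1000
    --max-cache-ttl=10
    --no-resolv       # do not read /etc/resolv.conf, avoids loops
    --no-negcache     # do not cache NXDOMAIN replies

    dnsPolicy: Default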
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add the retry_stagger var to tweak push and retry timing strategies
(sketched below).
* Add docs related to large deployments.
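A hedged sketch of how such a stagger can be applied to a download task; the
task itself and the delay expression are illustrative:

    - name: Download | pull container image
      command: "{{ docker_bin_dir | default('/usr/bin') }}/docker pull {{ image_repo }}:{{ image_tag }}"
      register: pull_result
      until: pull_result.rc == 0
      retries: 4
      delay: "{{ retry_stagger | default(5) | random + 3 }}"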
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add HA docs for API server.
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and use cases.
* Use facts for kube_apiserver to avoid repeating code and to enable use of LB
endpoints.
* Use a /healthz check for the wait-for-apiserver step (see the sketch below).
* Use a single endpoint for kubelet instead of the list of apiservers.
* Specify kube_apiserver_count for the HA layout.
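A sketch of the /healthz wait, assuming the uri module; the endpoint fact name
and retry values are illustrative:

    - name: Wait for the apiserver to answer on /healthz
      uri:
        url: "{{ kube_apiserver_endpoint }}/healthz"
        validate_certs: no
      register: healthz
      until: healthz.status == 200
      retries: 20
      delay: 6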
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and use cases.
* Add loadbalancer_apiserver_localhost (default false). If enabled, override
the external LB and expect localhost:443/8080 to be the new internal-only
frontends.
* Add kube_apiserver_multiaccess to ignore loadbalancers and make clients
access the apiservers via a comma-separated list of access_ip/ip/ansible IP
(the default mode). When disabled, allow clients to use the given
loadbalancers.
* Define the connections security mode for kube controllers, schedulers and
proxies. It is insecure by default, which is the current deployment choice.
* Rework the groups['kube-master'][0] hardcode that defines the apiserver
endpoints.
* Improve grouping of vars and add facts for kube_apiserver (see the sketch
below).
* Define kube_apiserver_insecure_bind_address as a fact, and add more
facts for ease of use.
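An illustrative shape of one such fact; the exact expressions in the repo may
differ:

    - set_fact:
        kube_apiserver_insecure_endpoint: >-
          http://{{ kube_apiserver_insecure_bind_address }}:{{ kube_apiserver_insecure_port | default(8080) }}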
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Enforce an etcd-proxy role for k8s-cluster group members. This
provides an HA layout for all of the k8s cluster's internal clients.
* Proxies run on each node in the group as separate etcd
instances in readwrite proxy mode and listen on the given endpoint,
which is either access_ip:2379 or localhost:2379.
* The notion of 'kube_etcd_multiaccess' is: ignore endpoints and
loadbalancers and use the etcd members' IPs as a comma-separated
list. Otherwise, clients use the local endpoint provided by the
etcd-proxy instance on each node. Networking plugins always
use that access mode.
* Fix the apiserver's etcd servers args to use the etcd_access_endpoint.
* Fix the networking plugins flannel/calico to use the etcd_endpoint.
* Fix the name env var to be set for non-masters as well.
* Fix etcd_client_url, which was not used anywhere, and etcd_* facts
evaluation, which was duplicated in a few places.
* Define proxy modes only in the env file, if not a master. Drop
the automatic proxy mode decisions for etcd nodes in init/unit scripts.
* Use Wants= instead of Requires=, as "This is the recommended way to
hook start-up of one unit to the start-up of another unit".
* Make apiserver/calico Wants= etcd-proxy to keep it always up (see the
sketch below).
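The systemd dependency shape described above, as an illustrative unit drop-in:

    [Unit]
    Wants=etcd-proxy.service
    # After= is shown only to illustrate ordering; Wants= alone does not order units
    After=etcd-proxy.service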
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
Many use cases of k8s involve running a local
registry; chances are the person running this
will learn the hard way that they need to allow
an insecure registry on the `kube_service_addresses`
network.
We should just default to setting this in
`inventory/group_vars/all.yml` to reduce
potential friction for first-time users.
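A hedged sketch of that default, assuming the daemon flag is carried in a
group_vars variable (the variable name is illustrative):

    # inventory/group_vars/all.yml
    docker_options: "--insecure-registry={{ kube_service_addresses }}"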
This allows you to simply run `vagrant up` to get a 3-node HA cluster.
* Creates a dynamic inventory and uses inventory/group_vars/all.yml
* Commented lines in inventory.example so that ansible doesn't try to use it.
* Added requirements.txt to give an easy way to install ansible/ipaddr
* Added gitignore files to stop attempts to save unwanted files
* Changed `Check if kube-system exists` to `failed_when: false` instead of
`ignore_errors` (see the sketch below)
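The changed check, sketched; the kubectl path and module details are
assumptions:

    - name: Check if kube-system exists
      command: "{{ bin_dir | default('/usr/local/bin') }}/kubectl get ns kube-system"
      register: kubesystem
      changed_when: false
      failed_when: false   # instead of ignore_errors: true, so the run output stays clean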
Currently kubespray does not install Kubernetes in a way that allows Cinder volumes to be used. This commit provides the necessary cloud configuration file and configures kubelet and kube-apiserver to use it (sketched below).
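A hedged sketch of the pieces involved, assuming the OpenStack cloud provider;
the file path, the config keys shown and the flag placement are illustrative:

    # /etc/kubernetes/cloud_config
    [Global]
    auth-url = "{{ lookup('env', 'OS_AUTH_URL') }}"
    username = "{{ lookup('env', 'OS_USERNAME') }}"
    password = "{{ lookup('env', 'OS_PASSWORD') }}"
    tenant-name = "{{ lookup('env', 'OS_TENANT_NAME') }}"

    # kubelet and kube-apiserver flags
    --cloud-provider=openstack --cloud-config=/etc/kubernetes/cloud_config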
Each node can have 3 IPs:
1. ansible_default_ipv4 - whatever ansible thinks is the first IPv4 address,
usually the one with the default gw.
2. ip - An address to use on the local node to bind listeners and do local
communication. For example, Vagrant boxes have a first address that is the
NAT bridge and is common for all nodes; the second address/interface should
be used.
3. access_ip - An address to use for node-to-node access. This is assumed to
be used by other nodes to access the node and may not actually be assigned
on the node. For example, an AWS public IP that is not assigned to the node.
This updates the places where addresses are used to use either ip or access_ip,
walking up the list to find an address (see the sketch below).
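The fallback chain, as an illustrative Jinja expression:

    # walk up: access_ip -> ip -> ansible_default_ipv4.address
    address: "{{ access_ip | default(ip | default(ansible_default_ipv4.address)) }}"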