Some installation are failing to authenticate with peers due to
etcd picking up/resoling the wrong node.
By setting 'etcd_peer_client_auth' to "False" you can disable peer client cert
authentication.
Signed-off-by: Sébastien Han <seb@redhat.com>
Update checksum for kubeadm
Use v1.9.0 kubeadm params
Include hash of ca.crt for kubeadm join
Update tag for testing upgrades
Add workaround for testing upgrades
Remove scale CI scenarios because of slow inventory parsing
in ansible 2.4.x.
Change region for tests to us-central1 to
improve ansible performance
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled).
Rework of #1937 with kubeadm support
Also, fixed an issue in `kubeadm-migrate-certs` where the old apiserver cert was copied as the kubeadm key
* Add Contiv support
Contiv is a network plugin for Kubernetes and Docker. It supports
vlan/vxlan/BGP/Cisco ACI technologies. It support firewall policies,
multiple networks and bridging pods onto physical networks.
* Update contiv version to 1.1.4
Update contiv version to 1.1.4 and added SVC_SUBNET in contiv-config.
* Load openvswitch module to workaround on CentOS7.4
* Set contiv cni version to 0.1.0
Correct contiv CNI version to 0.1.0.
* Use kube_apiserver_endpoint for K8S_API_SERVER
Use kube_apiserver_endpoint as K8S_API_SERVER to make contiv talks
to a available endpoint no matter if there's a loadbalancer or not.
* Make contiv use its own etcd
Before this commit, contiv is using a etcd proxy mode to k8s etcd,
this work fine when the etcd hosts are co-located with contiv etcd
proxy, however the k8s peering certs are only in etcd group, as a
result the etcd-proxy is not able to peering with the k8s etcd on
etcd group, plus the netplugin is always trying to find the etcd
endpoint on localhost, this will cause problem for all netplugins
not runnign on etcd group nodes.
This commit make contiv uses its own etcd, separate from k8s one.
on kube-master nodes (where net-master runs), it will run as leader
mode and on all rest nodes it will run as proxy mode.
* Use cp instead of rsync to copy cni binaries
Since rsync has been removed from hyperkube, this commit changes it
to use cp instead.
* Make contiv-etcd able to run on master nodes
* Add rbac_enabled flag for contiv pods
* Add contiv into CNI network plugin lists
* migrate contiv test to tests/files
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
* Add required rules for contiv netplugin
* Better handling json return of fwdMode
* Make contiv etcd port configurable
* Use default var instead of templating
* roles/download/defaults/main.yml: use contiv 1.1.7
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled). It's working, but so far I have had to:
1. Make the `uri` module "Wait for apiserver up" checks use `kube_apiserver_port` (HTTPS)
2. Add apiserver client cert/key to the "Wait for apiserver up" checks
3. Update apiserver liveness probe to use HTTPS ports
4. Set `kube_api_anonymous_auth` to true to allow liveness probe to hit apiserver's /healthz over HTTPS (livenessProbes can't use client cert/key unfortunately)
5. RBAC has to be enabled. Anonymous requests are in the `system:unauthenticated` group which is granted access to /healthz by one of RBAC's default ClusterRoleBindings. An equivalent ABAC rule could allow this as well.
Changes 1 and 2 should work for everyone, but 3, 4, and 5 require new coupling of currently independent configuration settings. So I also added a new settings check.
Options:
1. The problem goes away if you have both anonymous-auth and RBAC enabled. This is how kubeadm does it. This may be the best way to go since RBAC is already on by default but anonymous auth is not.
2. Include conditional templates to set a different liveness probe for possible combinations of `kube_apiserver_insecure_port = 0`, RBAC, and `kube_api_anonymous_auth` (won't be possible to cover every case without a guaranteed authorizer for the secure port)
3. Use basic auth headers for the liveness probe (I really don't like this, it adds a new dependency on basic auth which I'd also like to leave independently configurable, and it requires encoded passwords in the apiserver manifest)
Option 1 seems like the clear winner to me, but is there a reason we wouldn't want anonymous-auth on by default? The apiserver binary defaults anonymous-auth to true, but kubespray's default was false.
* Add possibility to insert more ip adresses in certificates
* Add newline at end of files
* Move supp ip parameters to k8s-cluster group file
* Add supplementary addresses in kubeadm master role
* Improve openssl indexes
This role only support Red Hat type distros and is not maintained
or used by many users. It should be removed because it creates
feature disparity between supported OSes and is not maintained.
* Rename dns_server to dnsmasq_dns_server so that it includes role prefix
as the var name is generic and conflicts when integrating with existing ansible automation.
* Enable selinux state to be configurable with new var preinstall_selinux_state
New files: /etc/kubernetes/admin.conf
/root/.kube/config
$GITDIR/artifacts/{kubectl,admin.conf}
Optional method to download kubectl and admin.conf if
kubeconfig_lcoalhost is set to true (default false)
* kubeadm support
* move k8s master to a subtask
* disable k8s secrets when using kubeadm
* fix etcd cert serial var
* move simple auth users to master role
* make a kubeadm-specific env file for kubelet
* add non-ha CI job
* change ci boolean vars to json format
* fixup
* Update create-gce.yml
* Update create-gce.yml
* Update create-gce.yml
Change kubelet deploy mode to host
Enable cri and qos per cgroup for kubelet
Update CoreOS images
Add upgrade hook for switching from kubelet deployment from docker to host.
Bump machine type for ubuntu-rkt-sep
By default Calico CNI does not create any network access policies
or profiles if 'policy' is enabled in CNI config. And without any
policies/profiles network access to/from PODs is blocked.
K8s related policies are created by calico-policy-controller in
such case. So we need to start it as soon as possible, before any
real workloads.
This patch also fixes kube-api port in calico-policy-controller
yaml template.
Closes#1132
It is now possible to deactivate selected authentication methods
(basic auth, token auth) inside the cluster by adding
removing the required arguments to the Kube API Server and generating
the secrets accordingly.
The x509 authentification is currently not optional because disabling it
would affect the kubectl clients deployed on the master nodes.
To use OpenID Connect Authentication beside deploying an OpenID Connect
Identity Provider it is necesarry to pass additional arguments to the Kube API Server.
These required arguments were added to the kube apiserver manifest.
Operator can specify any port for kube-api (6443 default) This helps in
case where some pods such as Ingress require 443 exclusively.
Closes: 820
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
* Leave all.yml to keep only optional vars
* Store groups' specific vars by existing group names
* Fix optional vars casted as mandatory (add default())
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Docker 1.13 changes the behaviour of iptables defaults from allow
to drop. This patch disables docker's iptables management as it was
in Docker 1.12 [1]
[1] https://github.com/docker/docker/pull/28257
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
kubelet lost the ability to load kernel modules. This
puts that back by adding the lib/modules mount to kubelet.
The new variable kubelet_load_modules can be set to true
to enable this item. It is OFF by default.
* Drop linux capabilities for unprivileged containerized
worlkoads Kargo configures for deployments.
* Configure required securityContext/user/group/groups for kube
components' static manifests, etcd, calico-rr and k8s apps,
like dnsmasq daemonset.
* Rework cloud-init (etcd) users creation for CoreOS.
* Fix nologin paths, adjust defaults for addusers role and ensure
supplementary groups membership added for users.
* Add netplug user for network plugins (yet unused by privileged
networking containers though).
* Grant the kube and netplug users read access for etcd certs via
the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove priveleged mode for calico-rr and run it under its uid/gid
and supplementary etcd_cert group.
* Adjust docs.
* Align cpu/memory limits and dropped caps with added rkt support
for control plane.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Add restart for weave service unit
* Reuse docker_bin_dir everythere
* Limit systemd managed docker containers by CPU/RAM. Do not configure native
systemd limits due to the lack of consensus in the kernel community
requires out-of-tree kernel patches.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Also place in global vars and do not repeat the kube_*_config_dir
and kube_namespace vars for better code maintainability and UX.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
skip_dnsmasq_k8s var as not needed anymore.
* Preconfigure DNS stack early, which may be the case when downloading
artifacts from intranet repositories. Do not configure
K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be
not existing).
* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
was set up and before K8s apps to be created.
* Move docker install task to early stage as well and unbind it from the
etcd role's specific install path. Fix external flannel dependency on
docker role handlers. Also fix the docker restart handlers' steps
ordering to match the expected sequence (the socket then the service).
* Add default resolver fact, which is
the cloud provider specific and remove hardcoded GCE resolver.
* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
domains combined with high ndots values lead to poor performance of
DNS stack and make ansible workers to fail very often with the
"Timeout (12s) waiting for privilege escalation prompt:" error.
* Update docs.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add an option to deploy K8s app to test e2e network connectivity
and cluster DNS resolve via Kubedns for nethost/simple pods
(defaults to false).
* Parametrize existing k8s apps templates with kube_namespace and
kube_config_dir instead of hardcode.
* For CoreOS, ensure nameservers from inventory to be put in the
first place to allow hostnet pods connectivity via short names
or FQDN and hostnet agents to pass as well, if netchecker
deployed.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
According to http://kubernetes.io/docs/user-guide/images/ :
By default, the kubelet will try to pull each image from the
specified registry. However, if the imagePullPolicy property
of the container is set to IfNotPresent or Never, then a local\
image is used (preferentially or exclusively, respectively).
Use IfNotPresent value to allow images prepared by the download
role dependencies to be effectively used by kubelet without pull
errors resulting apps to stay blocked in PullBackOff/Error state
even when there are images on the localhost exist.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
test to change the machine type
Revert "test to change the machine type"
This reverts commit 7a91f1b5405a39bee6cb91940b09a0b0f9d3aee1.
use google dns server when no upstream dns are defined
comment upstream_dns_servers
update documentation
remove deprecated kubelet flags
Revert "remove deprecated kubelet flags"
This reverts commit 21e3b893c896d0291c36a07d0414f4cb88b8d8ac.
Also adds all masters by hostname and localhost/127.0.0.1 to
apiserver SSL certificate.
Includes documentation update on how localhost loadbalancer works.
* Add a var for ndots (default 5) and put it hosts' /etc/resolv.conf.
* Poke kube dns container image to v1.7
* In order to apply changes to kubelet, notify it to
be restarted on changes made to /etc/resolv.conf. Ignore errors as the kubelet
may yet to be present up to the moment of the notification being processed.
* Remove unnecessary kubelet restart for master role as the node role ensures
it is up and running. Notify master static pods waiters for apiserver,
scheduler, controller-manager instead.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Change additional dnsmasq opts:
- Adjust caching size and TTL
- Disable resolve conf to not create loops
- Change dnsPolicy to default (similarly to kubedns's dnsmasq). The
ClusterFirst should not be used to not create loops
- Disable negative NXDOMAIN replies to be cached
- Make its very installation as optional step (enabled by default).
If you don't want more than 3 DNS servers, including 1 for K8s, disable
it.
- Add docs and a drawing to clarify DNS setup.
- Fix stdout logs for dnsmasq/kubedns app configs
- Add missed notifies to resolvconf -u handler
- Fix idempotency of resolvconf head file changes
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add the retry_stagger var to tweak push and retry time strategies.
* Add large deployments related docs.
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Add HA docs for API server.
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and usecases.
* Use facts for kube_apiserver to not repeat code and enable LB endpoints use.
* Use /healthz check for the wait-for apiserver.
* Use the single endpoint for kubelet instead of the list of apiservers
* Specify kube_apiserver_count to for HA layout
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>