* allows to override the bind addresses for controller-manager and scheduler
Useful for Prometheus metrics monitoring
* Add bind addr override support in kubeadm/v1beta1
Adds support for override of bind addresses for controller-manager
and scheduler in kubeadm/v1beta1
* Move location of bind address vars
* Remove double declaration of schedulerExtraArgs
The change implemented in #3908 remove line breaks for supplementary
addresses in kubeadm SANs, causing errors in the config file and
failure to bring cluster up. This commit reimplement line breaks in
between supplementary addresses.
- Creates and defaults an ansible variable for every configuration option in the `kubeproxy.config.k8s.io/v1alpha1` type spec
- Fixes vars that were orphaned by removing non-kubeadm
- Fixes previously harcoded kubeadm values
- Introduces a `main` directory for role default files per component (requires ansible 2.6.0+)
- Split out just `kube-proxy.yml` in this first effort
- Removes the kube-proxy server field patch task
We should continue to pull out other components from `main.yml` into their own defaults files as I did here for `defaults/main/kube-proxy.yml`. I hope for and will need others to join me in this refactoring across the project until each component config template has a matching role defaults file, with shared defaults in `kubespray-defaults` or `downloads`
* controlPlaneEndpoint set up through load balancer should be possible even in single master setups
Enable load balancer for single-master setups
Fixes an issue where single-master setups are not reachable using the usual admin.conf from outside the cluster.
controlPlaneEndpoint set up through load balancer should be possible even in single master setups
* add fix to other api versions
* remove obsolete check completely
* remove check, pass 2
* removes checks in client configuration
* delete 'and'
* Upgrade kubernetes to v1.13.0
* Remove all precense of scheduler.alpha.kubernetes.io/critical-pod in templates
* Fix cert dir
* Use kubespray v2.8 as baseline for gitlab
* Remove non-kubeadm deployment
* More cleanup
* More cleanup
* More cleanup
* More cleanup
* Fix gitlab
* Try stop gce first before absent to make the delete process work
* More cleanup
* Fix bug with checking if kubeadm has already run
* Fix bug with checking if kubeadm has already run
* More fixes
* Fix test
* fix
* Fix gitlab checkout untill kubespray 2.8 is on quay
* Fixed
* Add upgrade path from non-kubeadm to kubeadm. Revert ssl path
* Readd secret checking
* Do gitlab checks from v2.7.0 test upgrade path to 2.8.0
* fix typo
* Fix CI jobs to kubeadm again. Fix broken hyperkube path
* Fix gitlab
* Fix rotate tokens
* More fixes
* More fixes
* Fix tokens
* Set configure-cloud-routes=false as default if no network plugin is used
As configure-cloud-routes default value is `true`, so it need to be set to `false` when not required to avoid error messages like:
"Couldn't reconcile node routes: error listing routes: unable to find route table for AWS cluster"
on, for example, AWS installations that don't use cloud native routing.
* Update kube-controller-manager.manifest.j2
remove extra spaces
* warning on meta flush_handlers
* avoid rm
* avoid "Module remote_tmp /root/.ansible/tmp did not exist and was created with a mode of 0700, this may cause issues when running as another user. To avoid this, create the remote_tmp dir with the correct permissions manually" warning on subsequent tasks using blockinfile
* is match
* failed
* version_compare
* succeeded
* skipped
* success
* version_compare becomes version since ansible 2.5
* ansible minimal version updated in doc and spec
* last version_compare
* [jjo] add kube-router support
Fixescloudnativelabs/kube-router#147.
* add kube-router as another network_plugin choice
* support most used kube-router flags via
`kube_router_foo` vars as other plugins
* implement replacing kube-proxy (--run-service-proxy=true) via
`kube_proxy_mode: none`, verified in a _non kubeadm_enabled_
install, should also work for recent kubeadm releases via
`skipKubeProxyInstall: true` config
* [jjo] address PR#3339 review from @woopstar
* add busybox image used by kube-router to downloads
* fix busybox download groups key
* rework kubeadm_enabled + kube_router_run_service_proxy
- verify it working ok w/the kubeadm_enabled and
kube_router_run_service_proxy true or false
- introduce `kube_proxy_remove` fact, to decouple logic
from kube_proxy_mode (which affects kubeadm configmap
settings, thus no-good to ab-use it to 'none')
* improve kube-router.md re: kubeadm_enabled and kube_router_run_service_proxy
* address @woopstar latest review
* add inventory/sample/group_vars/k8s-cluster/k8s-net-kube-router.yml
* fix kube_router_run_service_proxy conditional for kube-proxy removal
* fix kube_proxy_remove fact (w/ |bool), add some needed kube-proxy tags on my and existing changes
* update kube-router tolerations for 1.12 compatibility
* add PriorityClass to kube-router DaemonSet
* Changes to assign pod priority to kube components.
* Removed the boolean flag pod_priority_assignment
* Created new priorityclass k8s-cluster-critical
* Created new priorityclass k8s-cluster-critical
* Fixed the trailing spaces
* Fixed the trailing spaces
* Added kube version check while creating Priority Class k8s-cluster-critical
* Moved k8s-cluster-critical.yml
* Moved k8s-cluster-critical.yml to kube_config_dir
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
remove empty when line
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
force kubeadm upgrade due to failure without --force flag
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
added nodeSelector to have compatibility with hybrid cluster with win nodes, also fix for download with missing container type
fixes in syntax and LF for newline in files
fix on yamllint check
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
some cleanup for innecesary lines
remove conditions for nodeselector
Upgrade Kubernetes to V1.11.2
The kubeadm configuration file version has been upgraded from v1alpha1 to v1alpha2
Add bootstrap kubeadm-config.yaml with external etcd
* Move front-proxy-client certs back to kube mount
We want the same CA for all k8s certs
* Refactor vault to use a third party module
The module adds idempotency and reduces some of the repetitive
logic in the vault role
Requires ansible-modules-hashivault on ansible node and hvac
on the vault hosts themselves
Add upgrade test scenario
Remove bootstrap-os tags from tasks
* fix upgrade issues
* improve unseal logic
* specify ca and fix etcd check
* Fix initialization check
bump machine size
The current way to setup the etc cluster is messy and buggy.
- It checks for cluster is healthy before the cluster is even created.
- The unit files are started on handlers, not in the task, so you mess with "flush handlers".
- The join_member.yml is not used.
- etcd events cluster is not configured for kubeadm
- remove duplicate runs between running the role on etcd nodes and k8s nodes
* Added option for encrypting secrets to etcd
* Fix keylength to 32
* Forgot the default
* Rename secrets.yaml to secrets_encryption.yaml
* Fix static path for secrets file to use ansible variable
* Rename secrets.yaml.j2 to secrets_encryption.yaml.j2
* Base64 encode the token
* Fixed merge error
* Changed path to credentials dir
* Update path to secrets file which is now readable inside the apiserver container. Set better file permissions
* Add encryption option to k8s-cluster.yml
Setting the following:
```
kube_kubeadm_controller_extra_args:
address: 0.0.0.0
terminated-pod-gc-threshold: "100"
```
Results in `terminated-pod-gc-threshold: 100` in the kubeadm config file. But it has to be a string to work.
to the API server configuration.
This solves the problem where if you have non-resolvable node names,
and try to scale the server by adding new nodes, kubectl commands
start to fail for newly added nodes, giving a TCP timeout error when
trying to resolve the node hostname against a public DNS.
* Set filemode to 0640
weave-net.yml file is readable by all users on the host. It however contains the weave_password to encrypt all pod communication. It should only be readable by root.
* Set mode 0640 on users_file with basic auth
* Added cilium support
* Fix typo in debian test config
* Remove empty lines
* Changed cilium version from <latest> to <v1.0.0-rc3>
* Add missing changes for cilium
* Add cilium to CI pipeline
* Fix wrong file name
* Check kernel version for cilium
* fixed ci error
* fixed cilium-ds.j2 template
* added waiting for cilium pods to run
* Fixed missing EOF
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed too many blank lines
* Updated tolerations,annotations in cilium DS template
* Set cilium_version to iptables-1.9 to see if bug is fixed in CI
* Update cilium image tag to v1.0.0-rc4
* Update Cilium test case CI vars filenames
* Add optional prometheus flag, adjust initial readiness delay
* Update README.md with cilium info
Update checksum for kubeadm
Use v1.9.0 kubeadm params
Include hash of ca.crt for kubeadm join
Update tag for testing upgrades
Add workaround for testing upgrades
Remove scale CI scenarios because of slow inventory parsing
in ansible 2.4.x.
Change region for tests to us-central1 to
improve ansible performance
As we have seen with other containers, sometimes container removal fails on the first attempt due to some Docker bugs. Retrying typically corrects the issue.
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled).
Rework of #1937 with kubeadm support
Also, fixed an issue in `kubeadm-migrate-certs` where the old apiserver cert was copied as the kubeadm key
* Allow setting --bind-address for apiserver hyperkube
This is required if you wish to configure a loadbalancer (e.g haproxy)
running on the master nodes without choosing a different port for the
vip from that used by the API - in this case you need the API to bind to
a specific interface, then haproxy can bind the same port on the VIP:
root@overcloud-controller-0 ~]# netstat -taupen | grep 6443
tcp 0 0 192.168.24.6:6443 0.0.0.0:* LISTEN 0 680613 134504/haproxy
tcp 0 0 192.168.24.16:6443 0.0.0.0:* LISTEN 0 653329 131423/hyperkube
tcp 0 0 192.168.24.16:6443 192.168.24.16:58404 ESTABLISHED 0 652991 131423/hyperkube
tcp 0 0 192.168.24.16:58404 192.168.24.16:6443 ESTABLISHED 0 652986 131423/hyperkube
This can be achieved e.g via:
kube_apiserver_bind_address: 192.168.24.16
* Address code review feedback
* Update kube-apiserver.manifest.j2
* Defaults for apiserver_loadbalancer_domain_name
When loadbalancer_apiserver is defined, use the
apiserver_loadbalancer_domain_name with a given default value.
Fix unconsistencies for checking if apiserver_loadbalancer_domain_name
is defined AND using it with a default value provided at once.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Define defaults for LB modes in common defaults
Adjust the defaults for apiserver_loadbalancer_domain_name and
loadbalancer_apiserver_localhost to come from a single source, which is
kubespray-defaults. Removes some confusion and simplefies the code.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Thought this wasn't required at first but I forgot there's no auto flush at the end of these tasks since the `kubernetes/master` role is not the end of the play.
* Fixes an issue where apiserver and friends (controller manager, scheduler) were prevented from restarting after manifests/secrets are changed. This occurred when a replaced kubelet doesn't reconcile new master manifests, which caused old master component versions to linger during deployment. In my case this was causing upgrades from k8s 1.6/1.7 -> k8s 1.8 to fail
* Improves transitions from kubelet container to host kubelet by preventing issues where kubelet container reappeared during the deployment
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled). It's working, but so far I have had to:
1. Make the `uri` module "Wait for apiserver up" checks use `kube_apiserver_port` (HTTPS)
2. Add apiserver client cert/key to the "Wait for apiserver up" checks
3. Update apiserver liveness probe to use HTTPS ports
4. Set `kube_api_anonymous_auth` to true to allow liveness probe to hit apiserver's /healthz over HTTPS (livenessProbes can't use client cert/key unfortunately)
5. RBAC has to be enabled. Anonymous requests are in the `system:unauthenticated` group which is granted access to /healthz by one of RBAC's default ClusterRoleBindings. An equivalent ABAC rule could allow this as well.
Changes 1 and 2 should work for everyone, but 3, 4, and 5 require new coupling of currently independent configuration settings. So I also added a new settings check.
Options:
1. The problem goes away if you have both anonymous-auth and RBAC enabled. This is how kubeadm does it. This may be the best way to go since RBAC is already on by default but anonymous auth is not.
2. Include conditional templates to set a different liveness probe for possible combinations of `kube_apiserver_insecure_port = 0`, RBAC, and `kube_api_anonymous_auth` (won't be possible to cover every case without a guaranteed authorizer for the secure port)
3. Use basic auth headers for the liveness probe (I really don't like this, it adds a new dependency on basic auth which I'd also like to leave independently configurable, and it requires encoded passwords in the apiserver manifest)
Option 1 seems like the clear winner to me, but is there a reason we wouldn't want anonymous-auth on by default? The apiserver binary defaults anonymous-auth to true, but kubespray's default was false.