Commit graph

383 commits

Author SHA1 Message Date
Vladimir Vasilkin
f0a04b4d65 wait 5 * 4 secs until Tiller starts 2018-03-30 00:09:36 +03:00
Vladimir Vasilkin
760ca1c3a9 adding checking for prometheus_operator_enabled 2018-03-29 23:03:43 +03:00
Vladimir Vasilkin
23b3833806 running on the first master only. 2018-03-29 22:51:46 +03:00
Kuldip Madnani
daeeae1a91 Added retries in pre-upgrade.yml and retries while applying kube-dns.yml (#2553)
* Added retries in pre-upgrade.yml and retries while applying kube-dns.yml

* Removed trailing spaces
2018-03-29 11:37:32 -05:00
Vladimir Vasilkin
19e1b11d98 prometheus operator, metrics for k8s cluster
install using Helm:
- Prometheus Operator
- metrics for k8s cluster including: grafana dashboard, alertmanager, node exporters

base project:
https://github.com/coreos/prometheus-operator

the issue:
https://github.com/kubernetes-incubator/kubespray/issues/2042

Previous PR, raw ansible without Helm:
https://github.com/kubernetes-incubator/kubespray/pull/2499
2018-03-28 21:23:30 +03:00
Andreas Krüger
03117d9572
Merge pull request #2488 from LuckySB/ingress-nginx-node-role
Dedicated node for ingress nginx controller
2018-03-28 14:07:40 +02:00
Michael Zehrer
b8d1652baf
Remove kibana_base_url
The default for kibana_base_url does not make sense an makes kibana unusable. The default path forces a 404 when you try to open kibana in the browser. Not setting kibana_base_url works just fine.
2018-03-25 16:08:07 +02:00
Wong Hoi Sing Edison
206e24448b CephFS Provisioner Addon Fixup 2018-03-22 23:03:13 +08:00
Wong Hoi Sing Edison
bb1eb9fec8 Add labels for namespace 2018-03-22 21:33:32 +08:00
Keyvan Hedayati
b0d7115e9b hswong3i/kubespray#3: Use {{ cluster_name }} for valid FQDN in REGISTRY_HOST 2018-03-22 21:33:32 +08:00
Wong Hoi Sing Edison
f8ebd08e75 Registry Addon Fixup 2018-03-22 21:33:32 +08:00
gorazio
96e46c4209
bump after CLA signing 2018-03-20 10:23:50 +03:00
gorazio
aa30fa8009
Add prometheus annotations to spec in ingress
Added annotations from metadata to spec.template.metadata. Without it, pod does not get any annotations, and Prometheus didn't see it
2018-03-20 08:47:36 +03:00
Andreas Holmsten
14ac7d797b Rotate local-volume-provisioner token
When tokens need to rotate, include local-volume-provisioner
2018-03-19 13:04:18 +01:00
Sergey Bondarev
038da7255f check if group kube-ingress is not empty
fix spelling mistaker ingress_nginx_host_network
set default value for ingress_nginx_host_network: false
2018-03-19 12:59:38 +03:00
woopstar
f1d2f84043 Only apply roles from first master node to fix regression 2018-03-18 16:15:01 +01:00
Sergey Bondarev
1481f7d64b Dedicated node for ingress nginx controller
The ability to create dedicated node for ingress nginx controller
host type network for nginx controller

and add from example https://github.com/kubernetes/ingress-nginx/blob/master/docs/examples/static-ip/nginx-ingress-controller.yaml
terminationGracePeriodSeconds: 60
2018-03-17 02:54:46 +03:00
Chad Swenson
7d33650019
Merge pull request #2462 from woopstar/coredns-patch
Add CoreDNS support
2018-03-16 18:33:36 -05:00
woopstar
e40368ae2b Add CoreDNS support with various fixes
Added CoreDNS to downloads

Updated with labels. Should now work without RBAC too

Fix DNS settings on hosts

Rename CoreDNS service from kube-dns to coredns

Add rotate based on http://edgeofsanity.net/rant/2017/12/20/systemd-resolved-is-broken.html

Updated docs with CoreDNS info

Added labels and fixed minor settings from official yaml file: https://github.com/kubernetes/kubernetes/blob/release-1.9/cluster/addons/dns/coredns.yaml.sed

Added a secondary deployment and secondary service ip. This is to mitigate dns timeouts and create high resitency for failures. See discussion at 'https://github.com/coreos/coreos-kubernetes/issues/641#issuecomment-281174806'

Set dns list correct. Thanks to @whereismyjetpack

Only download KubeDNS or CoreDNS if selected

Move dns cleanup to its own file and import tasks based on dns mode

Fix install of KubeDNS when dnsmask_kubedns mode is selected

Add new dns option coredns_dual for dual stack deployment. Added variable to configure replicas deployed. Updated docs for dual stack deployment. Removed rotate option in resolv.conf.

Run DNS manifests for CoreDNS and KubeDNS

Set skydns servers on dual stack deployment

Use only one template for CoreDNS dual deployment

Set correct cluster ip for the dns server
2018-03-16 21:51:37 +01:00
Qasim Sarfraz
8ee2091955
Merge pull request #3 from kubernetes-incubator/master
Sync Upstream
2018-03-16 17:21:54 +01:00
Oleg Vyukov
d843e3d562 Fix indent Custom ConfigMap ingress-nginx (#2447) 2018-03-15 22:18:18 +03:00
MQasimSarfraz
1bcc641dae Create vsphere clusterrole only if it doesnt exists 2018-03-14 11:29:35 +00:00
MQasimSarfraz
9a4aa4288c Fix vsphere cloud_provider RBAC permissions 2018-03-12 18:07:08 +00:00
Spencer Smith
3a714fd4ac
Merge pull request #2427 from hswong3i/local_volume_provisioner_default
FIXUP #2424: local_provisioner directory should be created only if enabled
2018-03-10 09:00:35 -05:00
chadswen
b0ab92c921 Prefix system:node CRB
Change the name of `system:node` CRB to `kubespray:system:node` to avoid
conflicts with the auto-reconciled CRB also named `system:node`

Fixes #2121
2018-03-08 23:56:46 -06:00
Wong Hoi Sing Edison
6402004018 FIXUP #2424: local_provisioner directory should be created only if enabled 2018-03-08 11:57:46 +08:00
Wong Hoi Sing Edison
3f96b2da7a Add Custom ConfigMap Support for ingress-nginx 2018-03-07 21:37:45 +08:00
Wong Hoi Sing Edison
e65904eee3 Add labels for ingress_nginx_namespace, also only setup serviceAccountName if rbac_enabled 2018-03-05 23:11:18 +08:00
Jonas Kongslund
585303ad66 Start with three dashes for consistency 2018-03-03 10:05:05 +04:00
Jonas Kongslund
a800ed094b Added support for webhook authentication/authorization on the secure kubelet endpoint 2018-03-03 10:00:09 +04:00
Wong Hoi Sing Edison
fd46442188 Integrate kubernetes/ingress-nginx 0.11.0 to Kubespray 2018-03-02 23:33:19 +08:00
Aivars Sterns
8b21034b31
Merge pull request #2344 from hswong3i/local_volume_provisioner_fixup
Upgrade Local Volume Provisioner Addon to v2.0.0
2018-03-01 13:12:44 +02:00
Dmitry Vlasov
977e7ae105 remove obsolete init image, bump dashboard version 1.8.1 -> 1.8.3 2018-02-28 12:52:59 +03:00
Wong Hoi Sing Edison
deef47c923 Upgrade Local Volume Provisioner Addon to v2.0.0 2018-02-21 13:41:25 +08:00
melkosoft
f13e76d022 Added cilium support (#2236)
* Added cilium support

* Fix typo in debian test config

* Remove empty lines

* Changed cilium version from <latest> to <v1.0.0-rc3>

* Add missing changes for cilium

* Add cilium to CI pipeline

* Fix wrong file name

* Check kernel version for cilium

* fixed ci error

* fixed cilium-ds.j2 template

* added waiting for cilium pods to run

* Fixed missing EOF

* Fixed trailing spaces

* Fixed trailing spaces

* Fixed trailing spaces

* Fixed too many blank lines

* Updated tolerations,annotations in cilium DS template

* Set cilium_version to iptables-1.9 to see if bug is fixed in CI

* Update cilium image tag to v1.0.0-rc4

* Update Cilium test case CI vars filenames

* Add optional prometheus flag, adjust initial readiness delay

* Update README.md with cilium info
2018-02-16 21:37:47 -06:00
Sebastian Söderqvist
ba2107ea8c is-default-class is case sensative so we must return a lowercase string 2018-02-15 10:51:42 +01:00
southquist
3f44a33738 allow for configurable openstack storage class 2018-02-14 11:32:56 +01:00
Wong Hoi Sing Edison
07075add3d Add optional StorageClass name with cephfs_provisioner_storage_class 2018-02-10 20:31:34 +08:00
Antoine Legrand
275b1d6897
Merge pull request #2274 from mirwan/local_volume_provisioner_configmap_in_daemonset
Local volume provisioner fixes
2018-02-09 00:59:47 +01:00
Erwan Miran
e9a676951b storageClass name template as suggested by @eyeofthefrog 2018-02-09 00:11:07 +01:00
Wong Hoi Sing Edison
b25e0f82b1 Add cephfs_provisioner Support for Kubespray 2018-02-08 22:27:54 +08:00
Maxim Krasilnikov
cae1c683aa
Merge pull request #2271 from leseb/retry-get-token
kubernetes-apps: retry get default token name
2018-02-08 16:46:32 +03:00
Erwan Miran
e1aaef7d4d Removal of surnumerary slash 2018-02-08 09:06:17 +01:00
Erwan Miran
abfb147292 MountDir in configmap and daemonset must be the same 2018-02-07 18:42:42 +01:00
Erwan Miran
44eb03f78a typo 2018-02-07 17:57:54 +01:00
Erwan Miran
857784747b local-provisioner:v1.0.1 still expects json configmap 2018-02-07 17:47:05 +01:00
Erwan Miran
7a2cb5e41c local-provisioner:v1.0.1 still uses VOLUME_CONFIG_NAME env to read ConfigMap 2018-02-07 17:01:19 +01:00
Sébastien Han
34bd47de79 kubernetes-apps: retry get default token name
In some installation, it can take up to 3sec to get the value. Retrying
for 5 sec will ensure the command won't return 1.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-02-07 12:09:51 +01:00
Erwan Miran
ca08614641 yamllint fix 2018-02-07 09:12:28 +01:00
Erwan Miran
b4e264251f JSON/YAML syntax fix 2018-02-06 17:17:10 +01:00
Antoine Legrand
37cfd289d8
Merge pull request #2248 from hswong3i/dashboard.yml.j2
Dashboard template should not suffix with .yml.j2
2018-02-06 11:25:02 +01:00
Antoine Legrand
9f3081580a
Merge pull request #2249 from hswong3i/kubedns-deploy.yml.j2
KubeDNS template should not suffix with .yml.j2
2018-02-06 11:24:19 +01:00
Antoine Legrand
a3248379db
Merge branch 'master' into local_volume_provisioner 2018-02-06 09:28:27 +01:00
Wong Hoi Sing Edison
4ad53339f6 KubeDNS template should not suffix with .yml.j2 2018-02-05 16:26:54 +08:00
Wong Hoi Sing Edison
a4d3da6a8e Dashboard template should not suffix with .yml.j2 2018-02-05 16:18:21 +08:00
Wong Hoi Sing Edison
7954ea2525 Migrate Kubernetes v1.9.1 cluster/addons/registry to Kubespray 2018-02-05 12:21:09 +08:00
Wong Hoi Sing Edison
bc2e26d7ef update apiVersion 2018-02-01 14:16:32 +08:00
Wong Hoi Sing Edison
fd80013917 lint and cleanup local_volume_provisioner 2018-02-01 14:14:18 +08:00
RongZhang
3846384d56 Bump kube-dns to 1.14.8 (#2204)
Bump kube-dns to 1.14.8
2018-01-30 19:23:37 +03:00
Matthew Mosesohn
dc6a17e092
Use include/import tasks (#2192)
import_tasks will consume far less memory, so it should be
used whenever it is compatible.
2018-01-29 14:37:48 +03:00
Brad Beam
08fe61e058
Merge pull request #2071 from riverzhang/dashboard
Update dashboard version to v1.8.1
2018-01-24 20:10:05 -06:00
Brad Beam
0c8bed21ee
Merge pull request #2019 from chadswen/disable-api-insecure-port
Support for disabling apiserver insecure port (the sequel)
2018-01-24 19:58:53 -06:00
rong.zhang
df21fc8643 Remove initContainer 2018-01-10 12:17:17 +08:00
Lukasz Piatkowski
12eb242224 fix fluentd template 2018-01-08 13:40:47 +00:00
rong.zhang
6ed2a60978 fix run dashboard error 2018-01-04 13:13:36 +08:00
Matthew Mosesohn
ad6fecefa8
Update Kubernetes to v1.9.0 (#2100)
Update checksum for kubeadm
Use v1.9.0 kubeadm params
Include hash of ca.crt for kubeadm join
Update tag for testing upgrades
Add workaround for testing upgrades
Remove scale CI scenarios because of slow inventory parsing
in ansible 2.4.x.

Change region for tests to us-central1 to
improve ansible performance
2017-12-25 08:57:45 +00:00
rong.zhang
5aef52e8c0 fix dashboard certs secret 2017-12-22 11:17:05 +08:00
Stanislav Makar
b2cb0725ac Default OpenStack Cinder Storage Class (#2083)
Add possibility to create default OpenStack Cinder Storage Class

Closes: #1609
2017-12-19 14:47:00 +00:00
rong.zhang
b974b144a8 Add RBAC to binding Dahsboard UI 2017-12-18 23:07:19 +08:00
rong.zhang
0771cd8599 Remove dashboard_tls_key and dashboard_tls_cert 2017-12-13 15:42:20 +08:00
rong.zhang
40edf8c6f5 Update dashboard version to v1.8.0
Update dependencies to be compatible with Kubernetes v1.8
2017-12-13 12:50:44 +08:00
Chad Swenson
b8788421d5 Support for disabling apiserver insecure port
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled).

Rework of #1937 with kubeadm support

Also, fixed an issue in `kubeadm-migrate-certs` where the old apiserver cert was copied as the kubeadm key
2017-12-05 09:13:45 -06:00
Brad Beam
c2347db934
Merge pull request #1953 from chadswen/dashboard-refactor
Kubernetes Dashboard v1.7.1 Refactor
2017-12-05 08:50:55 -06:00
Matthew Mosesohn
a0225507a0
Set helm deployment type to host (#2012) 2017-11-29 19:52:54 +00:00
unclejack
e5d353d0a7 contiv network support (#1914)
* Add Contiv support

Contiv is a network plugin for Kubernetes and Docker. It supports
vlan/vxlan/BGP/Cisco ACI technologies. It support firewall policies,
multiple networks and bridging pods onto physical networks.

* Update contiv version to 1.1.4

Update contiv version to 1.1.4 and added SVC_SUBNET in contiv-config.

* Load openvswitch module to workaround on CentOS7.4

* Set contiv cni version to 0.1.0

Correct contiv CNI version to 0.1.0.

* Use kube_apiserver_endpoint for K8S_API_SERVER

Use kube_apiserver_endpoint as K8S_API_SERVER to make contiv talks
to a available endpoint no matter if there's a loadbalancer or not.

* Make contiv use its own etcd

Before this commit, contiv is using a etcd proxy mode to k8s etcd,
this work fine when the etcd hosts are co-located with contiv etcd
proxy, however the k8s peering certs are only in etcd group, as a
result the etcd-proxy is not able to peering with the k8s etcd on
etcd group, plus the netplugin is always trying to find the etcd
endpoint on localhost, this will cause problem for all netplugins
not runnign on etcd group nodes.
This commit make contiv uses its own etcd, separate from k8s one.
on kube-master nodes (where net-master runs), it will run as leader
mode and on all rest nodes it will run as proxy mode.

* Use cp instead of rsync to copy cni binaries

Since rsync has been removed from hyperkube, this commit changes it
to use cp instead.

* Make contiv-etcd able to run on master nodes

* Add rbac_enabled flag for contiv pods

* Add contiv into CNI network plugin lists

* migrate contiv test to tests/files

Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>

* Add required rules for contiv netplugin

* Better handling json return of fwdMode

* Make contiv etcd port configurable

* Use default var instead of templating

* roles/download/defaults/main.yml: use contiv 1.1.7

Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
2017-11-29 14:24:16 +00:00
Christopher Randles
208ff8e350 Allow for more customization of the tiller deploy (#1946) 2017-11-28 18:33:57 +00:00
Kevin Lefevre
9368dbe0e7 update calico to 2.6.2 (#1874)
Move RS to deployment so no need to take care of the revision history
limits :
  - Delete the old RS
  - Make Calico manifest a deployment
  - move deployments to apps/v1beta2 API since Kubernetes 1.8
2017-11-28 12:01:30 +00:00
Matthew Mosesohn
67419e8d0a
Run rotate_tokens role only once (#1970) 2017-11-15 18:50:23 +00:00
Chad Swenson
a89ee8c406 Add ability to use custom cert secret instead of init container provisioned self-signed certs 2017-11-15 10:05:52 -06:00
Chad Swenson
0c6f172e75 Kubernetes Dashboard v1.7.1 Refactor
This version required changing the previous access model for dashboard completely but it's a change for the better. Docs were updated.

* New login/auth options that use apiserver auth proxying by default
* Requires RBAC in `authorization_modes`
* Only serves over https
* No longer available at https://first_master:6443/ui until apiserver is updated with the https proxy URL:
* Can access from https://first_master:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login you will be prompted for credentials
* Or you can run 'kubectl proxy' from your local machine to access dashboard in your browser from: http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
* It is recommended to access dashboard from behind a gateway that enforces an authentication token, details and other access options here: https://github.com/kubernetes/dashboard/wiki/Accessing-Dashboard---1.7.X-and-above
2017-11-15 10:05:48 -06:00
Matthew Mosesohn
f9b68a5d17
Revert "Support for disabling apiserver insecure port" (#1974) 2017-11-14 13:41:28 +00:00
Stanislav Makar
037edf1215 Fix failed task of setting up bash completion for helm (#1968)
Closes: #1967
2017-11-13 10:15:53 +00:00
Chad Swenson
0c7e1889e4 Support for disabling apiserver insecure port
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled). It's working, but so far I have had to:

1. Make the `uri` module "Wait for apiserver up" checks use `kube_apiserver_port` (HTTPS)
2. Add apiserver client cert/key to the "Wait for apiserver up" checks
3. Update apiserver liveness probe to use HTTPS ports
4. Set `kube_api_anonymous_auth` to true to allow liveness probe to hit apiserver's /healthz over HTTPS (livenessProbes can't use client cert/key unfortunately)
5. RBAC has to be enabled. Anonymous requests are in the `system:unauthenticated` group which is granted access to /healthz by one of RBAC's default ClusterRoleBindings. An equivalent ABAC rule could allow this as well.

Changes 1 and 2 should work for everyone, but 3, 4, and 5 require new coupling of currently independent configuration settings. So I also added a new settings check.

Options:

1. The problem goes away if you have both anonymous-auth and RBAC enabled. This is how kubeadm does it. This may be the best way to go since RBAC is already on by default but anonymous auth is not.
2. Include conditional templates to set a different liveness probe for possible combinations of `kube_apiserver_insecure_port = 0`, RBAC, and `kube_api_anonymous_auth` (won't be possible to cover every case without a guaranteed authorizer for the secure port)
3. Use basic auth headers for the liveness probe (I really don't like this, it adds a new dependency on basic auth which I'd also like to leave independently configurable, and it requires encoded passwords in the apiserver manifest)

Option 1 seems like the clear winner to me, but is there a reason we wouldn't want anonymous-auth on by default? The apiserver binary defaults anonymous-auth to true, but kubespray's default was false.
2017-11-06 14:01:10 -06:00
Günther Grill
0d55ed3600 Avoid that some read-only tasks cause an ansible-change (#1910) 2017-11-06 13:51:07 +00:00
Spencer Smith
a595c84f7e
Merge pull request #1928 from chadswen/flannel-rbac-fix
Flannel RBAC Fix
2017-11-03 18:12:16 -04:00
Matthew Mosesohn
66c67dbe73
Add optional helm deployment mode for host (#1920) 2017-11-03 07:09:24 +00:00
Chad Swenson
16ae2c1809 Flannel RBAC Fix
Fixes a bug that can occur if `cni-flannel-rbac.yml` was written but the playbook failed before it was applied. Uses the same approach as calico.
2017-11-02 23:20:23 -05:00
Matthew Mosesohn
520103df78 Change namespace for provisioner account 2017-11-02 10:16:08 +00:00
Matthew Mosesohn
c0e989b17c
New addon: local_volume_provisioner (#1909) 2017-11-01 14:25:35 +00:00
Spencer Smith
ef0a91da27
Merge pull request #1891 from rsmitty/proxy-fixes
Improved proxy support
2017-10-31 14:32:12 -04:00
Spencer Smith
74a9eedb93 helm template check for http/https_proxy 2017-10-30 13:11:04 -04:00
Spencer Smith
b27453d8d8 improved proxy support 2017-10-30 11:42:14 -04:00
Andrew Greenwood
c383c7e2c1
Update kubedns image to latest 2017-10-29 21:58:05 -04:00
Matthew Mosesohn
ec53b8b66a Move cluster roles and system namespace to new role
This should be done after kubeconfig is set for admin and
before network plugins are up.
2017-10-26 14:36:05 +01:00
Matthew Mosesohn
86fb669fd3 Idempotency fixes (#1838) 2017-10-25 21:19:40 +01:00
Spencer Smith
46cf6b77cf Merge pull request #1857 from pmontanari/patch-1
Use same kubedns_version: 1.14.5 in downloads  and kubernetes-apps/ansible roles
2017-10-25 10:05:43 -04:00
Matthew Mosesohn
33c4d64b62 Make ClusterRoleBinding to admit all nodes with right cert (#1861)
This is to work around #1856 which can occur when kubelet
hostname and resolvable hostname (or cloud instance name)
do not match.
2017-10-24 17:05:58 +01:00
pmontanari
8371a060a0 Update main.yml
Match kubedns_version with roles/download/defaults/main.yml:kubedns_version: 1.14.5
2017-10-22 23:48:51 +02:00
Matthew Mosesohn
fc9a65be2b Refactor downloads to use download role directly (#1824)
* Refactor downloads to use download role directly

Also disable fact delegation so download delegate works acros OSes.

* clean up bools and ansible_os_family conditionals
2017-10-19 09:17:11 +01:00
Matthew Mosesohn
4efb0b78fa Move CI vars out of gitlab and into var files (#1808) 2017-10-18 17:28:54 +01:00
Aivars Sterns
688e589e0c fix #1788 lock dashboard version to 1.6.3 version while 1.7.x is not working (#1805) 2017-10-17 11:04:55 +01:00
刘旭
6c98201aa4 remove kube-dns versions and images in kubernetes-apps/ansible/defaults/main.yaml (#1807) 2017-10-17 11:03:53 +01:00
Matthew Mosesohn
ef47a73382 Add new addon Istio (#1744)
* add istio addon

* add addons to a ci job
2017-10-13 15:42:54 +01:00
Simon Li
c14bbcdbf2 Include bin_dir when patching helm tiller with kubectl 2017-10-09 15:17:52 +01:00
Aivars Sterns
9c86da1403 Normalize tags in all places to prepare for tag fixing in future (#1739) 2017-10-05 08:43:04 +01:00
Matthew Mosesohn
a56738324a Move set_facts to kubespray-defaults defaults
These facts can be generated in defaults with a performance
boost.

Also cleaned up duplicate etcd var names.
2017-10-04 14:02:47 +01:00
Matthew Mosesohn
dae9f6d3c2 Test if tokens are expired from host instead of inside container (#1727)
* Test if tokens are expired from host instead of inside container

* Update main.yml
2017-10-02 13:14:50 +01:00
Matthew Mosesohn
3ff5f40bdb fix graceful upgrade (#1704)
Fix system namespace creation
Only rotate tokens when necessary
2017-09-27 14:49:20 +01:00
Matthew Mosesohn
327ed157ef Verify valid settings before deploy (#1705)
Also fix yaml lint issues

Fixes #1703
2017-09-27 14:47:47 +01:00
Matthew Mosesohn
bd272e0b3c Upgrade to kubeadm (#1667)
* Enable upgrade to kubeadm

* fix kubedns upgrade

* try upgrade route

* use init/upgrade strategy for kubeadm and ignore kubedns svc

* Use bin_dir for kubeadm

* delete more secrets

* fix waiting for terminating pods

* Manually enforce kube-proxy for kubeadm deploy

* remove proxy. update to kubeadm 1.8.0rc1
2017-09-26 10:38:58 +01:00
Matthew Mosesohn
b294db5aed fix apply for netchecker upgrade (#1659)
* fix apply for netchecker upgrade and graceful upgrade

* Speed up daemonset upgrades. Make check wait for ds upgrades.
2017-09-15 13:19:37 +01:00
Matthew Mosesohn
6744726089 kubeadm support (#1631)
* kubeadm support

* move k8s master to a subtask
* disable k8s secrets when using kubeadm
* fix etcd cert serial var
* move simple auth users to master role
* make a kubeadm-specific env file for kubelet
* add non-ha CI job

* change ci boolean vars to json format

* fixup

* Update create-gce.yml

* Update create-gce.yml

* Update create-gce.yml
2017-09-13 19:00:51 +01:00
Matthew Mosesohn
75b13caf0b Fix kube-apiserver status checks when changing insecure bind addr (#1633) 2017-09-09 23:41:48 +03:00
Matthew Mosesohn
649388188b Fix netchecker update side effect (#1644)
* Fix netchecker update side effect

kubectl apply should only be used on resources created
with kubectl apply. To workaround this, we should apply
the old manifest before upgrading it.

* Update 030_check-network.yml
2017-09-09 23:38:38 +03:00
Matthew Mosesohn
9fa1873a65 Add kube dashboard, enabled by default (#1643)
* Add kube dashboard, enabled by default

Also add rbac role for kube user

* Update main.yml
2017-09-09 23:38:03 +03:00
Matthieu
0453ed8235 Fix an error with Canal when RBAC are disabled (#1619)
* Fix an error with Canal when RBAC are disabled

* Update using same rbac strategy used elsewhere
2017-09-06 11:32:32 +03:00
ArthurMa
c77d11f1c7 Bugfix (#1616)
lost executable path
2017-09-05 08:35:14 +03:00
Matthew Mosesohn
d279d145d5 Fix non-rbac deployment of resources as a list (#1613)
* Use kubectl apply instead of create/replace

Disable checks for existing resources to speed up execution.

* Fix non-rbac deployment of resources as a list

* Fix autoscaler tolerations field

* set all kube resources to state=latest

* Update netchecker and weave
2017-09-05 08:23:12 +03:00
Matthew Mosesohn
660282e82f Make daemonsets upgradeable (#1606)
Canal will be covered by a separate PR
2017-09-04 11:30:01 +03:00
Matthew Mosesohn
77602dbb93 Move calico to daemonset (#1605)
* Drop legacy calico logic

* add calico as a daemonset
2017-09-04 11:29:51 +03:00
Matthew Mosesohn
a3e6896a43 Add RBAC support for canal (#1604)
Refactored how rbac_enabled is set
Added RBAC to ubuntu-canal-ha CI job
Added rbac for calico policy controller
2017-09-04 11:29:40 +03:00
Matthew Mosesohn
13d08af054 Fix upgrade for canal and apiserver cert
Fixes #1573
2017-08-29 22:08:30 +01:00
Chad Swenson
a39e78d42d Initial version of Flannel using CNI (#1486)
* Updates Controller Manager/Kubelet with Flannel's required configuration for CNI
* Removes old Flannel installation
* Install CNI enabled Flannel DaemonSet/ConfigMap/CNI bins and config (with portmap plugin) on host
* Uses RBAC if enabled
* Fixed an issue that could occur if br_netfilter is not a module and net.bridge.bridge-nf-call-iptables sysctl was not set
2017-08-25 10:07:50 +03:00
Matthew Mosesohn
6bb3463e7c Enable scheduling of critical pods and network plugins on master
Added toleration to DNS, netchecker, fluentd, canal, and
calico policy.

Also small fixes to make yamllint pass.
2017-08-24 10:41:17 +01:00
Brad Beam
8b151d12b9 Adding yamllinter to ci steps (#1556)
* Adding yaml linter to ci check

* Minor linting fixes from yamllint

* Changing CI to install python pkgs from requirements.txt

- adding in a secondary requirements.txt for tests
- moving yamllint to tests requirements
2017-08-24 12:09:52 +03:00
Matthew Mosesohn
df28db0066 Fix cert and netchecker upgrade issues (#1543)
* Bump tag for upgrade CI, fix netchecker upgrade

netchecker-server was changed from pod to deployment, so
we need an upgrade hook for it.

CI now uses v2.1.1 as a basis for upgrade.

* Fix upgrades for certs from non-rbac to rbac
2017-08-18 15:46:22 +03:00
Brad Beam
af007c7189 Fixing netchecker-server type - pod => deployment (#1509) 2017-08-14 18:43:56 +03:00
Seungkyu Ahn
b22bef5cfb Apply RBAC to efk and create fluentd.conf
Making fluentd.conf as configmap to change configuration.
Change elasticsearch rc to deployment.
Having installed previous elastaicsearch as rc, first should delete that.
2017-08-11 05:31:50 +00:00
Brad Beam
1155008719 Merge pull request #1481 from magnon-bliex/fluentd-template-fix-typo
fixed typo in fluentd-ds.yml.j2
2017-08-10 08:19:59 -05:00
magnon-bliex
38eb1d548a fixed typo 2017-07-28 14:10:13 +09:00
jwfang
789910d8eb remote unused netchecker-agent-hostnet-ds.j2 2017-07-17 19:29:59 +08:00
jwfang
a8e6a0763d run netchecker-server with list pods 2017-07-17 19:29:59 +08:00
jwfang
e1386ba604 only patch system:kube-dns role for old dns 2017-07-17 19:29:59 +08:00
jwfang
83deecb9e9 Revert "no need to patch system:kube-dns"
This reverts commit c2ea8c588aa5c3879f402811d3599a7bb3ccab24.
2017-07-17 19:29:59 +08:00
jwfang
d8dcb8f6e0 no need to patch system:kube-dns 2017-07-17 19:29:59 +08:00
jwfang
0b3badf3d8 revert calico-related changes 2017-07-17 19:29:59 +08:00
jwfang
2cda982345 binding group system:nodes to clusterrole calico-role 2017-07-17 19:29:59 +08:00
jwfang
c9734b6d7b run calico-policy-controller with proper sa/role/rolebinding 2017-07-17 19:29:59 +08:00
jwfang
092bf07cbf basic rbac support 2017-07-17 19:29:59 +08:00
Spencer Smith
bba555bb08 Merge pull request #1346 from Starefossen/patch-1
Set kubedns minimum replicas to 2
2017-07-06 09:14:11 -04:00
Hans Kristian Flaatten
38f5d1b18e Set kubedns minimum replicas to 2 2017-07-04 16:58:16 +02:00
Chad Swenson
8467bce2a6 Fix inconsistent kubedns version and parameterize kubedns autoscaler image vars 2017-06-27 10:19:31 -05:00
Seungkyu Ahn
d5516a4ca9 Make kubedns up to date
Update kube-dns version to 1.14.2
https://github.com/kubernetes/kubernetes/pull/45684
2017-06-27 00:57:29 +00:00
Seungkyu Ahn
91dff61008 Fixed helm bash complete 2017-06-19 15:33:50 +09:00
Gregory Storme
266ca9318d Use the kube_apiserver_insecure_port variable instead of static 8080 2017-06-12 09:20:59 +02:00
Spencer Smith
efa2dff681 remove conditional 2017-05-12 17:16:49 -04:00
Spencer Smith
31a7b7d24e default to kubedns and set nxdomain in kubedns deployment if that's the dns_mode 2017-05-12 15:57:24 -04:00
moss2k13
791ea89b88 Updated helm installation
Added full path for helm
2017-05-08 09:27:06 +02:00
Aleksandr Didenko
883ba7aa90 Add support for different tags for netcheck containers
Replace 'netcheck_tag' with 'netcheck_version' and add additional
'netcheck_server_tag' and 'netcheck_agent_tag' config options to
provide ability to use different tags for server and agent
containers.
2017-04-27 17:15:28 +02:00
Aleksey Kasatkin
2638ab98ad add MY_NODE_NAME variable into netchecker-agent environment 2017-04-24 17:19:42 +03:00
Matthew Mosesohn
bc3068c2f9 Merge pull request #1251 from FengyunPan/fix-helm-home
Specify a dir and attach it to helm for HELM_HOME
2017-04-24 15:17:28 +03:00
FengyunPan
2bde9bea1c Specify a dir and attach it to helm for HELM_HOME 2017-04-21 10:51:27 +08:00
Spencer Smith
5c4980c6e0 Merge pull request #1231 from holser/fix_netchecker-server
Reschedule netchecker-server in case of HW failure.
2017-04-14 10:50:07 -04:00
Sergii Golovatiuk
45044c2d75 Reschedule netchecker-server in case of HW failure.
Pod opbject is not reschedulable by kubernetes. It means that if node
with netchecker-server goes down, netchecker-server won't be scheduled
somewhere. This commit changes the type of netchecker-server to
Deployment, so netchecker-server will be scheduled on other nodes in
case of failures.
2017-04-14 10:49:16 +02:00
Joe Duhamel
072b3b9d8c Update kubedns-autoscaler change target
The target was a replicationcontroller but kubedns is currently a deployment
2017-04-13 14:55:25 -04:00
Spencer Smith
9b3aa3451e Merge pull request #1218 from bradbeam/efkidempotent
Fixing resource type for kibana
2017-04-11 19:04:13 -04:00
Brad Beam
bd130315b6 Excluding bash completion for helm on CoreOS 2017-04-10 11:07:15 -05:00
Brad Beam
504711647e Fixing resource type for kibana 2017-04-10 11:01:12 -05:00
Matthew Mosesohn
75ea001bfe Merge pull request #1208 from mattymo/1.6-flannel
Update to k8s 1.6 with flannel and centos fixes
2017-04-06 13:04:02 +03:00
Matthew Mosesohn
ff2fb9196f Fix flannel for 1.6 and apply fixes to enable containerized kubelet 2017-04-06 10:06:21 +04:00
Sergii Golovatiuk
2670eefcd4 Refactoring resolv.conf
- Renaming templates for netchecker
- Add dnsPolicy: ClusterFirstWithHostNet to kube-proxy

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-04-05 09:28:01 +02:00
Sergii Golovatiuk
1cfe0beac0 Set ClusterFirstWithHostNet for Pods with hostnetwork: true
In kubernetes 1.6 ClusterFirstWithHostNet was added as an option. In
accordance to it kubelet will generate resolv.conf based on own
resolv.conf. However, this doesn't create 'options', thus the proper
solution requires some investigation.

This patch sets the same resolv.conf for kubelet as host

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-04-04 16:34:13 +02:00
Matthew Mosesohn
0f64f8db90 Merge pull request #1155 from mattymo/helm
Add helm deployment
2017-03-20 17:00:06 +03:00
Matthew Mosesohn
b69d4b0ecc Add helm deployment 2017-03-17 20:24:41 +03:00
Matthew Mosesohn
e1faeb0f6c Fix weave on RHEL deployment
Reduce retry delay checking weave
Always load br_netfilter module
2017-03-17 18:17:47 +03:00
Aleksandr Didenko
3a39904011 Move calico-policy-controller into separate role
By default Calico CNI does not create any network access policies
or profiles if 'policy' is enabled in CNI config. And without any
policies/profiles network access to/from PODs is blocked.

K8s related policies are created by calico-policy-controller in
such case. So we need to start it as soon as possible, before any
real workloads.

This patch also fixes kube-api port in calico-policy-controller
yaml template.

Closes #1132
2017-03-17 11:21:52 +01:00
Matthew Mosesohn
6453650895 Merge pull request #1093 from mattymo/scaledns
Add autoscalers for dnsmasq and kubedns
2017-03-02 16:58:56 +03:00
Matthew Mosesohn
9cb12cf250 Add autoscalers for dnsmasq and kubedns
By default kubedns and dnsmasq scale when installed.
Dnsmasq is no longer a daemonset. It is now a deployment.
Kubedns is no longer a replicationcluster. It is now a deployment.
Minimum replicas is two (to enable rolling updates).

Reduced memory erquirements for dnsmasq and kubedns
2017-03-02 13:44:22 +03:00
Sergii Golovatiuk
d31c040dc0 Change kube-api default port from 443 to 6443
Operator can specify any port for kube-api (6443 default) This helps in
case where some pods such as Ingress require 443 exclusively.

Closes: 820
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-28 15:45:35 +01:00
Brad Beam
56664b34a6 Lower default memory requests
This is to address out of memory issues on CI as well as help
fit deployments for people starting out with kargo on smaller
machines
2017-02-27 10:53:43 -06:00
Bogdan Dobrelya
712872efba Rework inventory all by real groups' vars
* Leave all.yml to keep only optional vars
* Store groups' specific vars by existing group names
* Fix optional vars casted as mandatory (add default())
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-02-23 09:43:42 +01:00
Matthew Mosesohn
3cc1491833 Merge branch 'master' into pedantic-syntax-cleanup 2017-02-20 20:19:38 +03:00
Andrew Greenwood
ca9ea097df Cleanup legacy syntax, spacing, files all to yml
Migrate older inline= syntax to pure yml syntax for module args as to be consistant with most of the rest of the tasks
Cleanup some spacing in various files
Rename some files named yaml to yml for consistancy
2017-02-17 16:22:34 -05:00
Antoine Legrand
b84cc14694 Merge pull request #1029 from mattymo/graceful
Add graceful upgrade process
2017-02-17 21:24:32 +01:00
Matthew Mosesohn
617edda9ba Adjust weave daemonset for serial deployment 2017-02-16 18:24:30 +03:00
Vladimir Rutsky
7ab04b2e73 fix typo in "kibana_base_url" variable name
This typo lead to kibana_base_url being undefined and Kibana used
default base URL ("/") which is incorrect with default proxy-based
access.
2017-02-16 18:17:06 +03:00
Matthew Mosesohn
6ae70e03cb fixup upgrades for canal and weave 2017-02-10 13:27:41 +03:00
Matthew Mosesohn
29fd957352 Enable weave upgrade from previous versions
Raise readiness probe initial time to 60 (was 30)
2017-02-09 21:39:31 +03:00
Matthew Mosesohn
3c713a3f53 Fix upgrade for all daemonset type resources
Daemonsets cannot be simply upgraded through a single API call,
regardless of any kubectl documentation. The resource must be
purged and then recreated in order to make any changes.
2017-02-08 18:16:00 +03:00
Matthew Mosesohn
bfd1ea1da1 Merge pull request #971 from bradbeam/efk
Adding EFK logging stack
2017-02-08 14:28:04 +03:00
Aleksandr Didenko
54af533b31 Update playbooks to support new netchecker
Netchecker is rewritten in Go lang with some new args instead of
env variables. Also netchecker-server no longer requires kubectl
container. Updating playbooks accordingly.
2017-02-07 15:20:34 +01:00
Matthew Mosesohn
fd30131dc2 Revert "Drop linux capabilities and rework users/groups" 2017-02-06 15:58:54 +03:00
Bogdan Dobrelya
cae2982d81 Merge pull request #911 from bogdando/DROP_CAPS
Drop linux capabilities and rework users/groups
2017-02-06 12:05:51 +01:00
Brad Beam
df3e11bdb8 Adding EFK logging stack 2017-02-03 16:27:08 -06:00
Sergii Golovatiuk
f2e4ffcac2 Fix weave-net after upgrade to 1.82
- Set recommended CPU settings
- Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
- Limit DS creation to master
- Combined 2 tasks into one with better condition
2017-02-02 10:31:58 +01:00
Matthew Mosesohn
39d87a96aa Rename weave-kube to weave-net
Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
2017-01-31 18:47:27 +03:00
Matthew Mosesohn
6463a01e04 Merge pull request #880 from bradbeam/weave-kube
Weave kube
2017-01-31 13:31:09 +03:00
Brad Beam
a11b9d28bd Upgrading weave to weave-kube 2017-01-27 17:05:25 -06:00
Brad Beam
b54eb609bf Consolidating kube.py module 2017-01-27 11:28:11 -06:00
Bogdan Dobrelya
cb2e5ac776 Drop linux capabilities and rework users/groups
* Drop linux capabilities for unprivileged containerized
  worlkoads Kargo configures for deployments.
* Configure required securityContext/user/group/groups for kube
  components' static manifests, etcd, calico-rr and k8s apps,
  like dnsmasq daemonset.
* Rework cloud-init (etcd) users creation for CoreOS.
* Fix nologin paths, adjust defaults for addusers role and ensure
  supplementary groups membership added for users.
* Add netplug user for network plugins (yet unused by privileged
  networking containers though).
* Grant the kube and netplug users read access for etcd certs via
  the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove priveleged mode for calico-rr and run it under its uid/gid
  and supplementary etcd_cert group.
* Adjust docs.
* Align cpu/memory limits and dropped caps with added rkt support
  for control plane.

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-20 08:50:42 +01:00
Greg Althaus
61dab8dc0b Should only check for api-server running on the master.
If this runs on other nodes, it will fail the playbook.
2017-01-17 15:57:34 -06:00
Aleksandr Didenko
0909368339 Set latest stable versions for Calico images
Change version for calico images to v1.0.0. Also bump versions for
CNI and policy controller.

Also removing images repo and tag duplication from netchecker role
2017-01-09 12:05:49 +01:00
Alexander Block
1d2a18b355 Introduce dns_mode and resolvconf_mode and implement docker_dns mode
Also update reset.yml to do more dns/network related cleanup.
2017-01-05 23:38:51 +01:00
Bogdan Dobrelya
d8a2941e9e Fix cert paths for flannel/calico policy apps
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-03 16:12:54 +01:00
Bogdan Dobrelya
a56d9de502 Systemd units, limits, and bin path fixes
* Add restart for weave service unit
* Reuse docker_bin_dir everythere
* Limit systemd managed docker containers by CPU/RAM. Do not configure native
  systemd limits due to the lack of consensus in the kernel community
  requires out-of-tree kernel patches.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-28 15:49:42 +01:00
Bogdan Dobrelya
bb0c3537cb Do not forward bogus domains for upstream resolvers
Also fix kube log level 4 to log dnsmasq queries.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-23 11:53:14 +01:00
Matthew Mosesohn
ad796d188d Individual etcd ssl certs
Includes hooks for triggering calico, kubelet, and kube-apiserver restarts
if etcd certs changed.
2016-12-22 13:31:11 +03:00
Bogdan Dobrelya
de8cd5cd7f Merge pull request #786 from mattymo/bug777
Add wait for kube-apiserver to kubernetes-apps
2016-12-22 11:02:50 +01:00
Bogdan Dobrelya
f10d1327d4 Revert "Do not forward private domains for upstream resolvers" 2016-12-21 15:24:17 +01:00
Matthew Mosesohn
d314174149 Add wait for kube-apiserver to kubernetes-apps
Fixes #777
2016-12-21 15:39:39 +03:00