C12s/c12s-kubespray

Author	SHA1	Message	Date
Bogdan Dobrelya	8679f10f71	Rework DNS stack to meet hostnet pods needs * For Debian/RedHat OS families (with NetworkManager/dhclient/resolvconf optionally enabled) prepend /etc/resolv.conf with required nameservers, options, and supersede domain and search domains via the dhclient/resolvconf hooks. * Drop (z)nodnsupdate dhclient hook and re-implement it to complement the resolvconf -u command, which is distro/cloud provider specific. Update docs as well. * Enable network restart to apply and persist changes and simplify handlers to rely on network restart only. This fixes DNS resolve for hostnet K8s pods for Red Hat OS family. Skip network restart for canal/calico plugins, unless https://github.com/projectcalico/felix/issues/1185 fixed. * Replace linefiles line plus with_items to block mode as it's faster. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com> Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>	2016-12-12 17:43:47 +01:00
Bogdan Dobrelya	aefe4a99d2	Preconfigure DNS stack and docker early In order to enable offline/intranet installation cases: * Move DNS/resolvconf configuration to preinstall role. Remove skip_dnsmasq_k8s var as not needed anymore. * Preconfigure DNS stack early, which may be the case when downloading artifacts from intranet repositories. Do not configure K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be not existing). * Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq was set up and before K8s apps to be created. * Move docker install task to early stage as well and unbind it from the etcd role's specific install path. Fix external flannel dependency on docker role handlers. Also fix the docker restart handlers' steps ordering to match the expected sequence (the socket then the service). * Add default resolver fact, which is the cloud provider specific and remove hardcoded GCE resolver. * Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search domains combined with high ndots values lead to poor performance of DNS stack and make ansible workers to fail very often with the "Timeout (12s) waiting for privilege escalation prompt:" error. * Update docs. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-12-09 17:30:55 +01:00
Bogdan Dobrelya	0b1ce03167	Add tags Add tags to allow more granular tasks filtering. Add generator script for MD formatted tags found. Add docs for tags how-to. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-12-09 12:14:28 +01:00
Matthew Mosesohn	ddf2200f2a	Merge pull request #693 from kubernetes-incubator/upgrades-doc Add document outlining upgrade process	2016-12-08 13:02:55 +03:00
Bogdan Dobrelya	1b17efee19	Merge pull request #692 from bogdando/gce_fixes Change GCE sysctls placement and docs	2016-12-07 16:17:30 +01:00
Matthew Mosesohn	cf20a61e68	Add document outlining upgrade process	2016-12-07 16:33:08 +03:00
Bogdan Dobrelya	965b27e48e	Change GCE sysctls placement and docs Override GCE sysctl in /etc/sysctl.d/99-sysctl.conf instead of the /etc/sysctl.d/11-gce-network-security.conf. It is recreated by GCE, f.e. if gcloud CLI invokes some security related changes, thus losing customizations we want to be persistent. Update cloud providers firewall requirements in calico docs. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-12-07 12:53:45 +01:00
Aleksandr Didenko	992fcd1680	Calico: fix peering with routers for new version In new `calicoctl` version nodes peering with routers is broken. We need to use predictable node names for calico-node and the same names in calico `bgpPeer` resources and CNI.	2016-12-06 17:17:39 +01:00
Bogdan Dobrelya	aafdbebd48	Reduce CI test matrix Reduce the test cases from 15 to 9, bearing in mind that: * Disable weave/coreos gate unless its deployment fixed * If debian/centos7 fails with net plugin X, ubuntu-xenial/rhel-7 will likely fail as well * Canal also covers the flannel plugin deployment, but keep at least one of the flannel plugin deployment, unless it's superseded and removed. * Keep at least one of each OS/plugin family to be tested in the separate nodes layout * Keep at least one of each OS family to be tested against each of the plugin types in default nodes layout * Rebalance GCE regions for instances, replace asia to eu/us as they are the longest running jobs. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-11-29 18:53:43 +01:00
Antoine Legrand	6630348164	Merge pull request #657 from smelchior/master add azure support for kargo	2016-11-29 12:20:49 +01:00
Bogdan Dobrelya	d4305d1a64	Switch to standard debian/centos/rhel for CI Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-11-29 10:25:07 +01:00
Sebastian Melchior	254e02c69e	add basic azure support for kargo	2016-11-29 10:20:28 +01:00
Antoine Legrand	f75e2c5119	Merge pull request #529 from bogdando/netcheck Add a k8s app for advanced e2e netcheck for DNS	2016-11-28 15:26:30 +01:00
Bogdan Dobrelya	d5b21b34c2	Add advanced net check for DNS K8s app * Add an option to deploy K8s app to test e2e network connectivity and cluster DNS resolve via Kubedns for nethost/simple pods (defaults to false). * Parametrize existing k8s apps templates with kube_namespace and kube_config_dir instead of hardcode. * For CoreOS, ensure nameservers from inventory to be put in the first place to allow hostnet pods connectivity via short names or FQDN and hostnet agents to pass as well, if netchecker deployed. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-11-28 13:23:25 +01:00
Bogdan Dobrelya	c34c49d4d9	Tune dnsmasq/kubedns limits, replicas, logging * Add dns_replicas, dns_memory/cpu_limit/requests vars for dns related apps. * When kube_log_level=4, log dnsmasq queries as well. * Add log level control for skydns (part of kubedns app). * Add limits/requests vars for dnsmasq (part of kubedns app) and dnsmasq daemon set. * Drop string defaults for kube_log_level as it is int and is defined in the global vars as well. * Add docs Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-11-25 12:49:17 +01:00
Bogdan Dobrelya	417a931f78	Fix download dnsmasq image dependency on docker When download_run_once with download_localhost is used, docker is expected to be running on the delegate localhost. That may be not the case for a non localhost delegate, which is the kube-master otherwise. Then the dnsmasq role, had it been invoked early before deployment starts, would fail because of the missing docker dependency. * Fix that dependency on docker and do not pre download dnsmasq image for the dnsmasq role, if download_localhost is disabled. * Remove become: false for docker CLI invocation because that's not the common pattern to allow users access docker CLI w/o sudo. * Fix opt bin path hack for localhost delegate to ignore errors when it fails with "sudo password required" otherwise. * Describe download_run_once with download_localhost use case in docs as well. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-11-24 18:31:26 +01:00
Bogdan Dobrelya	539a47b0fa	Merge pull request #621 from xenolog/calico_network_backend Add ability to define network backend for Calico.	2016-11-22 14:55:47 +01:00
Sergey Vasilenko	e73c86c6aa	Add ability to define network backend for Calico. This patch introduce `calico_network_backend` global variable, which allow to describe alternative network backend. Default behavior is unchanged.	2016-11-18 16:38:18 +03:00
Alexandre Bourget	ecf40bc377	Update roadmap.md	2016-11-17 12:44:30 -05:00
Bogdan Dobrelya	2514d04c6b	Improve CI test matrix For Travis CI and GCE, add a naive generator script into a markdown table. Add GCE/Travis CI matrix docs. Add CoreOS test cases. Rework existing cases w/o losing coverage. Rework postinstall tests to support CoreOS as well. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-11-15 18:17:30 +01:00
Artem Panchenko	9d0a79a777	Support new version of 'calicoctl' (>=v1.0.0) Since version 'v1.0.0-beta' calicoctl is written in Go and its API differs from old Python based utility. Added support of both old and new version of the utility.	2016-11-10 17:11:29 +02:00
Matthew Mosesohn	b8ca4e4f45	Remove etcd-proxy from all nodes and use etcd multiaccess	2016-11-09 13:31:12 +03:00
Smaine Kahlouch	a8155ee35f	Merge pull request #554 from bogdando/kubeadm_adoption Update roadmap for the kubeadm LCM track	2016-10-18 13:52:55 +02:00
Smana	9d2f161430	update roadmap, kubeadm adoption	2016-10-18 13:51:36 +02:00
Bogdan Dobrelya	72a25bfd06	Update roadmap for the kubeadm LCM track Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-10-18 13:44:45 +02:00
Bogdan Dobrelya	dd82ed2a1f	Update ha docs Fix mismatch in code and docs, see https://github.com/kubespray/kargo/pull/528 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-10-17 15:42:30 +02:00
Smaine Kahlouch	9df4502909	Merge pull request #528 from kubespray/proxy-nginx Use nginx proxy on non-master nodes to proxy apiserver traffic	2016-10-05 19:19:32 +02:00
Matthew Mosesohn	73066f308d	use nginx proxy on non-master nodes to proxy apiserver traffic Also adds all masters by hostname and localhost/127.0.0.1 to apiserver SSL certificate. Includes documentation update on how localhost loadbalancer works.	2016-10-05 20:09:10 +03:00
keglevich3	52cdb911f7	changed to the correct link	2016-09-29 17:44:24 +03:00
Bogdan Dobrelya	6ab133d0a3	Allow subdomains of dns_domain and fix kubelet restarts * Add a var for ndots (default 5) and put it hosts' /etc/resolv.conf. * Poke kube dns container image to v1.7 * In order to apply changes to kubelet, notify it to be restarted on changes made to /etc/resolv.conf. Ignore errors as the kubelet may yet to be present up to the moment of the notification being processed. * Remove unnecessary kubelet restart for master role as the node role ensures it is up and running. Notify master static pods waiters for apiserver, scheduler, controller-manager instead. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-27 14:32:49 +02:00
Bogdan Dobrelya	14529c1ea3	Add more DNS docs Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-26 13:47:34 +02:00
Bogdan Dobrelya	8aeeb62719	Adjust DNS picture Reflect changes made to DNS setup Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-23 17:11:07 +02:00
Bogdan Dobrelya	2908f92524	Fix docs and dns servers placement order - Update docs and a drawing to clarify DNS setup. - Change order of nameservers placement to match changes in https://github.com/kubespray/kargo/pull/501 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-23 16:16:00 +02:00
Bogdan Dobrelya	34d0c5c676	Make dnsmasq daemon set optional Change additional dnsmasq opts: - Adjust caching size and TTL - Disable resolve conf to not create loops - Change dnsPolicy to default (similarly to kubedns's dnsmasq). The ClusterFirst should not be used to not create loops - Disable negative NXDOMAIN replies to be cached - Make its very installation as optional step (enabled by default). If you don't want more than 3 DNS servers, including 1 for K8s, disable it. - Add docs and a drawing to clarify DNS setup. - Fix stdout logs for dnsmasq/kubedns app configs - Add missed notifies to resolvconf -u handler - Fix idempotency of resolvconf head file changes Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-23 12:59:06 +02:00
Sean M. Collins	9c862a8b44	Rename large-deploymets.md to large-deployments.md Filename was a typo	2016-09-19 11:51:37 -04:00
Bogdan Dobrelya	ae8e5908ef	Add retry_stagger var for failed download/pushes. * Add the retry_stagger var to tweak push and retry time strategies. * Add large deployments related docs. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-15 16:43:58 +02:00
Spencer Smith	79d749b136	merge with current master, update typos in doc	2016-08-24 09:56:42 -04:00
Spencer Smith	a2fcf0be5d	updated to no longer handle gce as cloud-provider. provided aws setup doc	2016-08-24 09:48:32 -04:00
Bogdan Dobrelya	575ec168a3	Add HA/LB endpoints for kube-apiserver * Add HA docs for API server. * Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver vars and usecases. * Use facts for kube_apiserver to not repeat code and enable LB endpoints use. * Use /healthz check for the wait-for apiserver. * Use the single endpoint for kubelet instead of the list of apiservers * Specify kube_apiserver_count for HA layout Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-07-25 17:25:45 +02:00
Bogdan Dobrelya	7639fadaba	Add ha docs Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-07-22 14:44:36 +02:00
Bogdan Dobrelya	fd83ec6526	Add etcd proxy support * Enforce a etcd-proxy role to a k8s-cluster group members. This provides an HA layout for all of the k8s cluster internal clients. * Proxies to be run on each node in the group as a separate etcd instances with a readwrite proxy mode and listen the given endpoint, which is either the access_ip:2379 or the localhost:2379. * A notion for the 'kube_etcd_multiaccess' is: ignore endpoints and loadbalancers and use the etcd members IPs as a comma-separated list. Otherwise, clients shall use the local endpoint provided by a etcd-proxy instances on each etcd node. Networking plugins always use that access mode. * Fix apiserver's etcd servers args to use the etcd_access_endpoint. * Fix networking plugins flannel/calico to use the etcd_endpoint. * Fix name env var for non masters to be set as well. * Fix etcd_client_url was not used anywhere and other etcd_* facts evaluation was duplicated in a few places. * Define proxy modes only in the env file, if not a master. Del an automatic proxy mode decisions for etcd nodes in init/unit scripts. * Use Wants= instead of Requires= as "This is the recommended way to hook start-up of one unit to the start-up of another unit" * Make apiserver/calico Wants= etcd-proxy to keep it always up Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com> Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>	2016-07-19 14:09:40 +02:00
Jean-Christophe Sirot	8c602db758	Some additional roadmap items	2016-07-08 16:32:01 +02:00
Smana	e739fd6b2b	a small change in the roadmap	2016-07-08 09:40:12 +02:00
Smaine Kahlouch	a889c5d29b	first version of the roadmap	2016-07-08 09:21:33 +02:00
Jean-Christophe Sirot	48035d30d5	Add CI test matrix	2016-07-07 10:35:59 +02:00
Smaine Kahlouch	d62294255c	add documentation	2016-07-04 14:37:30 +02:00

46 commits