Commit graph

64 commits

Author SHA1 Message Date
Bogdan Dobrelya
aefe4a99d2 Preconfigure DNS stack and docker early
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
  skip_dnsmasq_k8s var as not needed anymore.

* Preconfigure DNS stack early, which may be the case when downloading
  artifacts from intranet repositories. Do not configure
  K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be
  not existing).

* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
  was set up and before K8s apps to be created.

* Move docker install task to early stage as well and unbind it from the
  etcd role's specific install path. Fix external flannel dependency on
  docker role handlers. Also fix the docker restart handlers' steps
  ordering to match the expected sequence (the socket then the service).

* Add default resolver fact, which is
  the cloud provider specific and remove hardcoded GCE resolver.

* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
  domains combined with high ndots values lead to poor performance of
  DNS stack and make ansible workers to fail very often with the
  "Timeout (12s) waiting for privilege escalation prompt:" error.

* Update docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 17:30:55 +01:00
Bogdan Dobrelya
0b1ce03167 Add tags
Add tags to allow more granular tasks filtering.
Add generator script for MD formatted tags found.
Add docs for tags how-to.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 12:14:28 +01:00
Spencer Smith
08fb5b6c01 Merge pull request #627 from kubernetes-incubator/issue-626
add restart flag for docker run kubelet
2016-12-06 08:47:18 -08:00
Dan Bode
aad73ea90e Ensure that etcd health checks always pass
in the etcd handler, the reload etcd action
was called after ansible waits for etcd to be
up, this means that the health checks which are
called immediately after fail (resulting in the etcd
role always failing and never finishing)

This patch changes the order to move the 'wait for etcd
up' resource after the 'reload etcd resource', ensuring that
the service is up before the health check is called.
2016-11-18 14:15:00 -08:00
Spencer Smith
106dcc3898 updated all instances of restart always to restart on-failure with a max of 5 times 2016-11-18 14:33:22 -05:00
Matthew Mosesohn
b141c41fee Fix ca certificate loading on CoreOS 2016-11-14 08:47:09 +04:00
Matthew Mosesohn
0dceb685ea Add etcd TLS support 2016-11-09 18:38:28 +03:00
Matthew Mosesohn
b8ca4e4f45 Remove etcd-proxy from all nodes and use etcd multiaccess 2016-11-09 13:31:12 +03:00
Smaine Kahlouch
f440f74e3b Merge pull request #562 from kubespray/enable_standalone_node
Enable standalone node deployment
2016-10-24 13:10:53 +02:00
Matthew Mosesohn
0d62e53939 Use only native cachable hostvars for etcd set_facts 2016-10-21 14:39:58 +03:00
Chad Swenson
5b08697679 Use absolute path for etcdctl
Small fix. The shell module won't automatically resolve the path to the etcdctl binary, so i prefixed with {{ bin_dir }}/
2016-10-20 14:56:52 -05:00
Bogdan Dobrelya
ae8e5908ef Add retry_stagger var for failed download/pushes.
* Add the retry_stagger var to tweak push and retry time strategies.
* Add large deployments related docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-15 16:43:58 +02:00
Bogdan Dobrelya
da71ad9375 Download containers and save all
Move version/repo vars to download role.
Add container to download params, which overrides url/source_url,
if enabled.
Fix networking plugins download depending on kube_network_plugin.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-15 16:43:56 +02:00
Bogdan Dobrelya
97c14ec8b7 Add retries for copying binaries from containers
Closes issue: https://github.com/kubespray/kargo/issues/479

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-13 15:09:34 +02:00
Smaine Kahlouch
599e829919 Merge pull request #448 from kubespray/etcdnosync
Add --no-sync to etcdctl member list
2016-08-29 18:58:14 +02:00
Matthew Mosesohn
2af778044d Rebase etcd to v3.0.6
Fixes #450
2016-08-29 15:31:05 +03:00
Matthew Mosesohn
b54aacc62a Add --no-sync to etcdctl member list
Fixes #447
2016-08-29 12:51:43 +03:00
Bogdan Dobrelya
516b55734e Refactor roles and hosts
Shorten deployment time with:
- Remove redundand roles if duplicated by a dependency and vice versa
- When a member of k8s-cluster, always install docker as a dependency
  of the etcd role and drop the docker role from cluster.yaml.
- Drop etcd and node role dependencies from master role as they are
  covered by the node role in k8s-cluster group as well. Copy defaults
  for master from node role.
- Decouple master, node, secrets roles handlers and vars to be used w/o
  cross references.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-08-25 13:27:57 +02:00
Matthew Mosesohn
53b3601cfa Fix init scripts for etcd. Fixes #383
Fixes Ubuntu 14.04 deployment of etcd.
2016-08-15 14:09:42 +03:00
Matthew Mosesohn
16358e8aae Fix etcd restart and handler systemd tasks
Changed Wants=docker.service to docker.socket

Renamed handlers for reloading systemd to contain role in task name.
2016-07-29 16:32:35 +03:00
Matthew Mosesohn
df4c44ceef Fix etcd user for etcd-proxy service
Only affects sys V OSes (Ubuntu 14.04)

Fixes ##383
2016-07-27 11:54:47 +03:00
Matthew Mosesohn
2f1f7a492d Fix etcd standalone deployment
etcd facts are generated in kubernetes/preinstall, so etcd nodes need
to be evaluated first before the rest of the deployment.

Moved several directory facts from kubernetes/node to
kubernetes/preinstall because they are not backward dependent.
2016-07-26 18:15:06 +03:00
mattymo
623becabf8 Merge branch 'master' into etcddockerdefault 2016-07-20 19:16:47 +03:00
Matthew Mosesohn
40207937f7 Set default etcd deployment to docker
Improved docker reload command to wait for etcd to be
up before proceeding. Switched reload to run restart
because it can't reload if it is not guaranteed to be
in running state.
2016-07-20 18:26:16 +03:00
Bogdan Dobrelya
01e554fdb3 Fix set_facts visibility
Move set_facts to the preinstall scope, so every role
may see it. For example, network plugins to see the etcd_endpoint.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-07-20 11:41:09 +02:00
Bogdan Dobrelya
fd83ec6526 Add etcd proxy support
* Enforce a etcd-proxy role to a k8s-cluster group members. This
provides an HA layout for all of the k8s cluster internal clients.
* Proxies to be run on each node in the group as a separate etcd
instances with a readwrite proxy mode and listen the given endpoint,
which is either the access_ip:2379 or the localhost:2379.
* A notion for the 'kube_etcd_multiaccess' is: ignore endpoints and
loadbalancers and use the etcd members IPs as a comma-separated
list. Otherwise, clients shall use the local endpoint provided by a
etcd-proxy instances on each etcd node. A Netwroking plugins always
use that access mode.
* Fix apiserver's etcd servers args to use the etcd_access_endpoint.
* Fix networking plugins flannel/calico to use the etcd_endpoint.
* Fix name env var for non masters to be set as well.
* Fix etcd_client_url was not used anywhere and other etcd_* facts
evaluation was duplicated in a few places.
* Define proxy modes only in the env file, if not a master. Del
an automatic proxy mode decisions for etcd nodes in init/unit scripts.
* Use Wants= instead of Requires= as "This is the recommended way to
hook start-up of one unit to the start-up of another unit"
* Make apiserver/calico Wants= etcd-proxy to keep it always up

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
2016-07-19 14:09:40 +02:00
Bogdan Dobrelya
70c37ec77b Fix systemd service unit for etcd
See https://github.com/coreos/etcd/issues/4308

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-07-15 16:22:17 +02:00
Smana
28204d18cb deployment idempotent 2016-07-14 21:33:24 +02:00
Matthew Mosesohn
eaed005045 Add optional deployment mode for Docker etcd_deployment_type
Running etcd in Docker reduces the number of individual file
downloads and services running on the host.

Note: etcd container v3.0.1 moves bindir to /usr/local/bin

Fixes: #298
2016-07-07 19:31:28 +03:00
Smana
ed1ecbd35f uprade to etcd v3.0.1 2016-07-02 14:14:32 +02:00
Smana
cbb30d5cb5 upgrade etcd version to 2.3.7 2016-06-28 12:31:57 +02:00
Evgeny L
0500f27db8 Scale-up functionality for etcd cluster
* Set ETCD_INITIAL_CLUSTER_STATE from `new` to `existing`,
because parameter `new` makes sense only on cluster assembly
stage.
* If cluster exists and current node is not a part
of the cluster, add it with command `etcdctl add member name url`.

Closes kubespray/kargo/#270
2016-05-31 18:23:46 +03:00
Paul Czarkowski
7de87d958e turn adduser/download roles into meta roles
This should make things a little more composable,
by making these roles meta roles that perform no
actions by default we allow each role to own its own
resources.
2016-05-22 17:25:52 -05:00
David Reuss
180f2d1fde Pull correct variable for etcd initial variable
This shouldn't use the `inventory_hostname` variable, as that will just yield the same variable, but rather use the `host` which we're looping over.
2016-04-29 14:37:01 +02:00
Christopher M Luciano
47982ea21c Use ansible array format instead of dot-notation.
This fixes the ansible error ```'dict object' has no attribute
'ansible_default_ipv4'"}```. Closes #215
2016-04-25 08:45:58 -04:00
Smana
fca384e24c first version of CoreOS on GCE
Please enter the commit message for your changes. Lines starting
2016-02-21 00:06:36 +01:00
Smana
b013b125bc Upgrade Calico and etcd 2016-02-15 12:41:27 +01:00
Smana
a649aa8b7e use ansible_service_mgr to detect init system 2016-02-13 11:46:53 +01:00
Smaine Kahlouch
6358cf788f etcd initd startup command fix 2016-01-30 22:31:41 +01:00
Greg Althaus
bedcca922c Add variables and defaults for multiple types of ip addresses.
Each node can have 3 IPs.
1. ansible_default_ip4 - whatever ansible things is the first IPv4 address
   usually with the default gw.
2. ip - An address to use on the local node to bind listeners and do local
   communication.  For example, Vagrant boxes have a first address that is the
   NAT bridge and is common for all nodes.  The second address/interface should
   be used.
3. access_ip - An address to use for node-to-node access.  This is assumed to
   be used by other nodes to access the node and may not be actually assigned
   on the node.  For example, AWS public ip that is not assigned to node.

This updates the places addresses are used to use either ip or access_ip and walk
up the list to find an address.
2016-01-27 16:05:39 -06:00
Antoine Legrand
b9781fa7c2 Symlink dnsmasq conf 2016-01-26 00:30:29 +01:00
Smaine Kahlouch
baaa6efc2b workaround_ha_apiserver 2016-01-25 12:07:32 +01:00
ant31
56b92812fa Fix systemd reload and calico unit 2016-01-25 10:54:07 +01:00
Smaine Kahlouch
4984b57aa2 use rsync instead of command 2016-01-23 18:26:07 +01:00
Smaine Kahlouch
283c4169ac run apiserver as a service
reorder master handlers

typo for sysvinit
2016-01-23 14:21:04 +01:00
Smaine Kahlouch
cb59559835 use command instead of synchronize 2016-01-22 16:37:07 +01:00
Antoine Legrand
078b67c50f Remove downloader host 2016-01-22 09:59:39 +01:00
Greg Althaus
28e530e005 Fix etcd synchronize to other nodes from the downloader 2016-01-21 11:21:25 -06:00
Smaine Kahlouch
9715962356 etcd directly in host
fix etcd configuration for nodes

fix wrong calico checksums

using a var name etcd_bin_dir

fix etcd handlers for sysvinit

using a var name etcd_bin_dir

sysvinit script

review etcd configuration
2016-01-21 11:36:11 +01:00
Smaine Kahlouch
8127e8f8e8 Flannel running as pod 2016-01-15 13:03:27 +01:00