Commit graph

924 commits

Author SHA1 Message Date
Sergii Golovatiuk
5122697f0b Improve Weave
- Remove weave CPU limits from .gitlab-ci.yml. Closes: #975
- Fix weave version in documentation

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-06 13:24:40 +01:00
Antoine Legrand
bd1c764a1a Merge pull request #963 from rutsky/bastion-ansible-host
handle both 'ansible_host' and 'ansible_ssh_host' in bastion configration
2017-02-04 15:42:39 -05:00
Brad Beam
df3e11bdb8 Adding EFK logging stack 2017-02-03 16:27:08 -06:00
Bogdan Dobrelya
5a7a3f6d4a Merge pull request #949 from vmtyler/master
Fixes Support for OpenStack v3 credentials
2017-02-03 12:22:00 +01:00
Vladimir Rutsky
b4327fdc99 handle both 'ansible_host' and 'ansible_ssh_host' in bastion configuration
'absible_ssh_host' is deprecated in Ansible 2.0 and at least
'contrib/inventory_builder/inventory.py' uses 'ansible_host' instead.
2017-02-02 18:34:53 +03:00
Matthew Mosesohn
10f924a617 Merge pull request #927 from holser/nsenter_fix
Remove nsenter workaround
2017-02-02 18:18:15 +03:00
Matthew Mosesohn
3dd6a01c8b Merge pull request #901 from galthaus/dns-tweak
DHCP Hook protections
2017-02-02 16:47:16 +03:00
Sergii Golovatiuk
585afef945 Remove nsenter workaround
- Docker 1.12 and further don't need nsenter hack. This patch removes
  it.  Also, it bumps the minimal version to 1.12.

Closes #776

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-02 14:38:11 +01:00
Sergii Golovatiuk
f2e4ffcac2 Fix weave-net after upgrade to 1.82
- Set recommended CPU settings
- Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
- Limit DS creation to master
- Combined 2 tasks into one with better condition
2017-02-02 10:31:58 +01:00
Matthew Mosesohn
ae66b6e648 Merge pull request #957 from mattymo/weave-net-naming
Rename weave-kube to weave-net
2017-02-02 10:18:02 +03:00
Greg Althaus
923057c1a8 This continues the DHCP hook checks. Also protect the create side
if the system doesn't have any config files at all.
2017-01-31 09:56:27 -06:00
Matthew Mosesohn
0f6e08d34f Merge pull request #951 from mattymo/k8s-certs-scale
Fix cert distribution at scale
2017-01-31 18:49:26 +03:00
Matthew Mosesohn
4889a3e2e1 Merge pull request #954 from artem-panchenko/improve_dnsmasq
Explicitly set config path for DNSMasq
2017-01-31 18:48:46 +03:00
Matthew Mosesohn
39d87a96aa Rename weave-kube to weave-net
Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
2017-01-31 18:47:27 +03:00
Matthew Mosesohn
08822ec684 Fix cert distribution at scale
Use stdin instead of bash args to pass node filenames and base64 data.
Use tempfile for master cert data
2017-01-31 16:27:45 +03:00
Matthew Mosesohn
6463a01e04 Merge pull request #880 from bradbeam/weave-kube
Weave kube
2017-01-31 13:31:09 +03:00
Artem Panchenko
1418fb394b Explicitly set config path for DNSMasq
When DNSMasq is configured to read its settings
from a folder ('-7' or '--conf-dir' option) it only
checks that the directory exists and doesn't fail if
it's empty. It could lead to a situation when DNSMasq
is running and handles requests, but not properly
configured, so some of queries can't be resolved.
2017-01-31 12:14:57 +02:00
Matthew Mosesohn
e4eda88ca9 Merge pull request #944 from tureus/skip-cloud-config-on-etcd
Bugfix: skip cloud_config on etcd
2017-01-30 20:12:36 +03:00
Brad Beam
a11b9d28bd Upgrading weave to weave-kube 2017-01-27 17:05:25 -06:00
Brad Beam
b54eb609bf Consolidating kube.py module 2017-01-27 11:28:11 -06:00
Tyler Britten
f8ffa1601d Fixed for non-null output 2017-01-27 10:47:59 -05:00
Tyler Britten
da01bc1fbb Updated OpenStack vars to check for tenant_id (v2) and project_id (v3) 2017-01-27 10:26:20 -05:00
neith00
bbc8c09753 Using the command module instead of raw
Using the command module instead of raw.
Also fixed the syntax.
2017-01-26 16:28:48 +01:00
Xavier Lange
e5fdc63bdd Bugfix: skip cloud_config on etcd 2017-01-25 14:09:21 -08:00
Aleksandr Didenko
46c177b982 Switch to ansible_hostname in calico
For consistancy with kubernetes services we should use the same
hostname for nodes, which is 'ansible_hostname'.

Also fixing missed 'kube-node' in templates, Calico is installed
on 'k8s-cluster' roles, not only 'kube-node'.
2017-01-25 11:49:58 +01:00
Matthew Mosesohn
f4b7474ade Merge pull request #926 from adidenko/fix-calico-rr-for-masters
Fix calico-rr peering with k8s masters
2017-01-24 12:38:52 +03:00
Alexander Block
9bf792ce0b Pin docker version on RedHat and CentOS to the desired version 2017-01-23 12:39:54 +01:00
Aleksandr Didenko
f05aaeb329 Fix calico-rr peering with k8s masters
Calico-rr is broken for deployments with separate k8s-master and
k8s-node roles. In order to fix it we should peer k8s-cluster
nodes with calico-rr, not just k8s-node. The same for peering
with routers.

Closes #925
2017-01-23 10:19:09 +01:00
Matthew Mosesohn
8ce32eb3e1 Merge pull request #905 from galthaus/async-runs
Add tasks to ensure that the first nodes have their directories for cert gen
2017-01-19 18:32:27 +03:00
Matthew Mosesohn
aae0314bda Merge pull request #904 from galthaus/nginx-port-config
Add nginx local balancer port configuration variable
2017-01-19 18:31:57 +03:00
Matthew Mosesohn
35d5248d41 Merge pull request #913 from galthaus/apps-master-only
Ansible apps should only check for api-server running on the master.
2017-01-19 18:30:58 +03:00
Matthew Mosesohn
0ccc2555d3 Merge pull request #917 from mattymo/rkt_resolvconf
Fix setting resolvconf when using rkt deploy mode
2017-01-19 18:30:21 +03:00
Matthew Mosesohn
b26a711e96 Merge pull request #916 from mattymo/update_ansible
Update Ansible to 2.2.1
2017-01-19 18:13:45 +03:00
Matthew Mosesohn
2218a052b2 Merge pull request #921 from mattymo/docker113
Add docker 1.13, update 1.12 to 1.12.6
2017-01-19 18:13:21 +03:00
Matthew Mosesohn
33fbcc56d6 Add docker 1.13, update 1.12 to 1.12.6
Fixes #903
2017-01-19 13:58:36 +03:00
Sergii Golovatiuk
61d05dea58 Allow to specify number of concurrent DNS queries
ndots creates overhead as every pod creates 5 concurrent connections
that are forwarded to sky dns. Under some circumstances dnsmasq may
prevent forwarding traffic with "Maximum number of concurrent DNS
queries reached" in the logs.

This patch allows to configure the number of concurrent forwarded DNS
queries "dns-forward-max" as well as "cache-size" leaving the default
values as they were before.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-01-19 11:47:37 +01:00
Matthew Mosesohn
8a821060a3 Update Ansible to 2.2.1 2017-01-19 13:46:46 +03:00
Greg Althaus
0d44599a63 Add explicit name printing in task names for deletgated task during
cert creation
2017-01-18 14:06:50 -06:00
Matthew Mosesohn
b6c3e61603 Fix setting resolvconf when using rkt deploy mode
rkt deploy mode doesn't create {{ bin_dir }}/kubelet, so
let's rely on kubelet.env file instad.
2017-01-18 19:18:47 +03:00
Matthew Mosesohn
5420fa942e Merge pull request #897 from holser/flush_handlers_before_etcd
Flush handlers before etcd restart
2017-01-18 12:27:01 +03:00
Matthew Mosesohn
1ee33d3a8d Merge pull request #910 from mattymo/escape_curly
Fix ansible 2.2.1 handling of registered vars
2017-01-18 11:13:01 +03:00
Greg Althaus
61dab8dc0b Should only check for api-server running on the master.
If this runs on other nodes, it will fail the playbook.
2017-01-17 15:57:34 -06:00
Matthew Mosesohn
b2a27ed089 Fix bash completion installation 2017-01-17 20:36:58 +03:00
Matthew Mosesohn
d8ae50800a Work around escaping curly braces for docker inspect 2017-01-17 20:35:38 +03:00
Sergii Golovatiuk
43fa72b7b7 Flush handlers before etcd restart
systemctl daemon-reload should be run before when task modifies/creates
union for etcd. Otherwise etcd won't be able to start

Closes #892

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-01-17 15:04:25 +01:00
Matthew Mosesohn
73204c868d Merge pull request #909 from mattymo/docker-upgrade
Always trigger docker restart when docker package changes
2017-01-17 11:37:42 +03:00
Matthew Mosesohn
74b78e75a1 Always trigger docker restart when docker package changes
Docker upgrade doesn't auto-restart docker, causing failures
when trying to start another container
2017-01-16 17:52:28 +03:00
Greg Althaus
6905edbeb6 Add a variable that defaults to kube_apiserver_port that defines
the which port the local nginx proxy should listen on for HA
local balancer configurations.
2017-01-14 23:38:07 -06:00
Greg Althaus
6c69da1573 This PR adds/or modifies a few tasks to allow for the playbook to
be run by limit on each node without regard for order.

The changes make sure that all of the directories needed to do
certificate management are on the master[0] or etcd[0] node regardless
of when the playbook gets run on each node.  This allows for separate
ansible playbook runs in parallel that don't have to be synchronized.
2017-01-14 23:24:34 -06:00
Greg Althaus
95bf380d07 If the inventory name of the host exceeds 63 characters,
the openssl tools will fail to create signing requests because
the CN is too long.  This is mainly a problem when FQDNs are used
in the inventory file.

THis will truncate the hostname for the CN field only at the
first dot.  This should handle the issue for most cases.
2017-01-13 10:02:23 -06:00
Matthew Mosesohn
80703010bd Use only one certificate for all apiservers
https://github.com/kubernetes/kubernetes/issues/25063
2017-01-13 14:03:20 +03:00
Bogdan Dobrelya
e88c10670e Merge pull request #891 from galthaus/selinux-order
preinstall fails on AWS CentOS7 image
2017-01-13 11:51:18 +01:00
Alexander Block
1054f37765 Don't try to delete kargo specific config from dhclient when file does not exist
Also remove the check for != "RedHat" when removing the dhclient hook,
as this had also to be done on other distros. Instead, check if the
dhclienthookfile is defined.
2017-01-13 10:56:10 +01:00
Greg Althaus
f77257cf79 When running on CentOS7 image in AWS with selinux on, the order of
the tasks fail because selinux prevents ip-forwarding setting.

Moving the tasks around addresses two issues.  Makes sure that
the correct python tools are in place before adjusting of selinux
and makes sure that ipforwarding is toggled after selinux adjustments.
2017-01-12 10:12:21 -06:00
Bogdan Dobrelya
f004cc07df Merge pull request #830 from mattymo/k8sperhost
Generate individual certificates for k8s hosts
2017-01-12 12:42:14 +01:00
Alexander Block
a7bf7867d7 Add tasks to undo changes to hosts /etc/resolv.conf and dhclient configs 2017-01-11 16:56:16 +01:00
Matthew Mosesohn
3f274115b0 Generate individual certificates for k8s hosts 2017-01-11 12:58:07 +03:00
Matthew Mosesohn
3b0918981e Merge pull request #878 from bradbeam/rkt-cni
Adding /opt/cni /etc/cni to rkt run kubelet
2017-01-11 12:22:04 +03:00
Bogdan Dobrelya
d8cef34d6c Merge pull request #872 from mattymo/bug868
Bind nginx localhost proxy to localhost
2017-01-10 17:09:25 +01:00
Brad Beam
db8173da28 Adding /opt/cni /etc/cni to rkt run kubelet 2017-01-10 08:48:58 -06:00
Bogdan Dobrelya
bcdfb3cfb0 Merge pull request #793 from kubernetes-incubator/fix_dhclientconf_path
Fix wrong path of dhclient on CentOS+Azure
2017-01-10 13:23:55 +01:00
Bogdan Dobrelya
79aeb10431 Merge pull request #858 from bradbeam/calicoctl-canal
Misc updates for canal
2017-01-10 12:24:59 +01:00
Matthew Mosesohn
38338e848d Merge pull request #860 from adidenko/fix-calico-rr-certs
Fix etcd cert generation for calico-rr role
2017-01-09 18:34:02 +03:00
Bogdan Dobrelya
10dbd0afbd Merge pull request #871 from mattymo/fix_system_search_domains
Fix docker dns host scenario with no search domains
2017-01-09 15:52:12 +01:00
Matthew Mosesohn
e22f938ae5 Bind nginx localhost proxy to localhost
This proxy should only be listening for local connections, not 0.0.0.0.

Fixes #868
2017-01-09 17:19:54 +03:00
Matthew Mosesohn
1dce56e2f8 Fix docker dns host scenario with no search domains
Fixes scenario where docker-dns.conf tries to create an empty
search entry
2017-01-09 16:36:44 +03:00
Aleksandr Didenko
d9539e0f27 Fix etcd cert generation for calico-rr role
"etcd_node_cert_data" variable is undefinded for "calico-rr" role.
This patch adds "calico-rr" nodes to task where "etcd_node_cert_data"
variable is registered.
2017-01-09 12:06:25 +01:00
Aleksandr Didenko
0909368339 Set latest stable versions for Calico images
Change version for calico images to v1.0.0. Also bump versions for
CNI and policy controller.

Also removing images repo and tag duplication from netchecker role
2017-01-09 12:05:49 +01:00
Bogdan Dobrelya
091b634ea1 Merge pull request #799 from kubernetes-incubator/docker_dns
Implement "dockerd --dns-xxx" based dns mode
2017-01-09 11:38:02 +01:00
Alexander Block
a8b5b856d1 Only use default resolver in dnsmasq when we are using host_resolvconf mode 2017-01-06 10:21:07 +01:00
Alexander Block
1d2a18b355 Introduce dns_mode and resolvconf_mode and implement docker_dns mode
Also update reset.yml to do more dns/network related cleanup.
2017-01-05 23:38:51 +01:00
Spencer Smith
4a59340182 remove assertion for family not being CoreOS 2017-01-05 13:36:25 -05:00
Brad Beam
cf042b2a4c Create network policy directory for canal 2017-01-05 10:54:27 -06:00
Brad Beam
65c86377fc Adding calicoctl to canal deployment 2017-01-05 10:54:27 -06:00
Bogdan Dobrelya
5af2c42bde Better fix for different CoreOS os family facts
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-05 16:32:08 +01:00
Bogdan Dobrelya
f7447837c5 Rename CoreOS fact
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-05 14:02:29 +01:00
Bogdan Dobrelya
6546869c42 Merge branch 'master' into rkt 2017-01-05 10:34:18 +01:00
Brad Beam
4b6f29d5e1 Adding kubelet in rkt 2017-01-03 14:49:48 -06:00
Brad Beam
8dc19374cc Allowing etcd to run via rkt 2017-01-03 10:10:38 -06:00
Brad Beam
a8f2af0503 Adding initial rkt support 2017-01-03 10:08:43 -06:00
Bogdan Dobrelya
d8a2941e9e Fix cert paths for flannel/calico policy apps
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-03 16:12:54 +01:00
Alexander Block
ab7df10a7d Upgrade docker version and do some cleanups for unsupported distros/docker versions 2017-01-02 18:05:50 +01:00
Bogdan Dobrelya
93663e987c Merge pull request #847 from bogdando/bug_769
Fix etc hosts for cluster nodes
2017-01-02 17:47:23 +01:00
Bogdan Dobrelya
97f96a6376 Fix etc hosts for cluster nodes
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-02 13:20:51 +01:00
Bogdan Dobrelya
58062be2a3 Drop non systemd OS types support
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-02 12:14:03 +01:00
Matthew Mosesohn
1f9f885379 Fix etcd cert generation to support large deployments
Due to bash max args limits, we should pass all node filenames and
base64-encoded tar data through stdin/stdout instead.

Fixes #832
2016-12-30 12:55:26 +03:00
Bogdan Dobrelya
a56d9de502 Systemd units, limits, and bin path fixes
* Add restart for weave service unit
* Reuse docker_bin_dir everythere
* Limit systemd managed docker containers by CPU/RAM. Do not configure native
  systemd limits due to the lack of consensus in the kernel community
  requires out-of-tree kernel patches.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-28 15:49:42 +01:00
Matthew Mosesohn
f0c0390646 Fix creation and sync of etcd certs
Admin certs only go to etcd nodes
Only generate cert-data for nodes that need sync
2016-12-28 14:21:17 +04:00
Matthew Mosesohn
e7a1949d85 Merge pull request #818 from mattymo/calico-rr-certs
Fix calico-rr to use etcd certs instead of kube certs
2016-12-28 08:47:16 +03:00
Matthew Mosesohn
6d9cd2d720 Fix calico-rr to use etcd certs instead of kube certs 2016-12-27 17:04:50 +03:00
Bogdan Dobrelya
79996b557b Rework ignore_errors to report no reds
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2016-12-27 13:00:50 +01:00
Bogdan Dobrelya
bb0c3537cb Do not forward bogus domains for upstream resolvers
Also fix kube log level 4 to log dnsmasq queries.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-23 11:53:14 +01:00
Matthew Mosesohn
385f7f6e75 Update etcd.j2 2016-12-22 22:29:24 +03:00
Matthew Mosesohn
9f1e3db906 Adjust etcd server certificates
ETCD doesn't need cert/key options set. It only requires peer
cert options.
2016-12-22 23:05:17 +04:00
Spencer Smith
b63d900625 Workaround etcdctl not yet being installed (#797)
workaround case for etcdctl not yet being installed, only allow for return code of 0 (no error)
2016-12-22 12:41:38 -05:00
Matthew Mosesohn
a4bce333a3 Merge pull request #760 from genti-t/issue-748-flannel-options
Fix Flannel network on CoreOS
2016-12-22 19:02:31 +03:00
Genti Topija
7c2785e083 Fix Flannel network on CoreOS
Resolves: #748
2016-12-22 16:50:04 +01:00
Matthew Mosesohn
ad796d188d Individual etcd ssl certs
Includes hooks for triggering calico, kubelet, and kube-apiserver restarts
if etcd certs changed.
2016-12-22 13:31:11 +03:00
Bogdan Dobrelya
de8cd5cd7f Merge pull request #786 from mattymo/bug777
Add wait for kube-apiserver to kubernetes-apps
2016-12-22 11:02:50 +01:00
Alexander Block
8e4e3998dd Fix wrong path of dhclient on CentOS+Azure
This was alredy fixed in #755 but had to be reverted. This PR should be
more intelligent about deciding which path to use.
2016-12-21 21:51:07 +01:00
Spencer Smith
8d9f207836 create systemd drop-in path if not existent 2016-12-21 13:06:12 -05:00
Bogdan Dobrelya
f10d1327d4 Revert "Do not forward private domains for upstream resolvers" 2016-12-21 15:24:17 +01:00
Matthew Mosesohn
d314174149 Add wait for kube-apiserver to kubernetes-apps
Fixes #777
2016-12-21 15:39:39 +03:00
Bogdan Dobrelya
b8bc8eee41 Add download_always_pull check and sha256 for docker images
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-20 17:02:09 +01:00
Bogdan Dobrelya
11380769cd Merge pull request #722 from bogdando/dnsmasq_armors
Do not forward private domains for upstream resolvers
2016-12-20 14:25:17 +01:00
Bogdan Dobrelya
843d439898 Merge pull request #775 from kubernetes-incubator/register_master
Register master node as unschedulable
2016-12-20 14:17:55 +01:00
Bogdan Dobrelya
c1e4cef75b Merge pull request #774 from kubernetes-incubator/ant31-patch-2
check if calico_peer_rr is defined
2016-12-19 18:19:03 +01:00
Matthew Mosesohn
348fc5b109 Fix etcd to-SSL upgrade and task register vars 2016-12-19 15:05:49 +03:00
Bogdan Dobrelya
101864c050 Do not forward private domains for upstream resolvers
Also fix kube log level 4 to log dnsmasq queries.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
2016-12-19 11:01:41 +01:00
Alexander Block
fe150d4e4d Register master node as unschedulable
Also refactor generation of kubelet args to not repeat args.
2016-12-19 10:47:43 +01:00
Antoine Legrand
048ac264a3 Update main.yml 2016-12-17 20:22:39 +01:00
Antoine Legrand
768fe05eea Merge pull request #704 from vwfs/bastion_hosts
Add support for bastion hosts
2016-12-17 12:08:49 +01:00
Antoine Legrand
1c48a001df Merge pull request #763 from bogdando/resolver_fallback
Fallback to default resolver if no nameservers
2016-12-17 12:03:41 +01:00
Antoine Legrand
a7276901a3 Merge pull request #766 from kubernetes-incubator/docker12point5
Update docker to 1.12.5
2016-12-17 11:55:06 +01:00
Bogdan Dobrelya
1782d19e1f Fallback to default resolver if no nameservers
Current design expects users to define at least one
nameserver in the nameservers var to backup host OS DNS config
when the K8s cluster DNS service IP is not available and hosts
still have to resolve external or intranet FQDNs.

Fix undefined nameservers to fallback to the default_resolver.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-16 14:51:34 +01:00
Bogdan Dobrelya
e2476fbd0b Revert "Fix wrong path for dhclient.conf on RedHat/CentOS" 2016-12-16 14:49:26 +01:00
Matthew Mosesohn
07cd81ef58 Update docker to 1.12.5
Note the new ubuntu/debian version string change:
https://github.com/docker/docker/issues/29355
2016-12-16 16:30:46 +03:00
Bogdan Dobrelya
92f542938c Merge pull request #745 from kubernetes-incubator/fix_weave_start
Fix weave restart after docker daemon restart
2016-12-16 14:06:48 +01:00
Matthew Mosesohn
495d0b659a Fix weave restart after docker daemon restart 2016-12-16 14:15:22 +03:00
Antoine Legrand
a2f8f17270 Merge pull request #757 from kubernetes-incubator/issue754
Add dns_domain for each host to /etc/hosts
2016-12-15 21:42:59 +01:00
Bogdan Dobrelya
0e2329b59e Merge pull request #755 from kubernetes-incubator/fix_dhclientconf_path
Fix wrong path for dhclient.conf on RedHat/CentOS
2016-12-15 19:08:31 +01:00
Bogdan Dobrelya
70143d87bf Merge pull request #746 from kubernetes-incubator/etcd_ssl_upgrade_fix
Fix etcd member list when upgrading ETCD from an old version
2016-12-15 12:31:34 +01:00
Matthew Mosesohn
68ad4ff4d9 Add dns_domain for each host to /etc/hosts
Fixes #754
2016-12-15 13:34:59 +04:00
Bogdan Dobrelya
725f9ea3bd Merge pull request #749 from kubernetes-incubator/azure_ip_forward
Set net.ipv4.ip_forward=1 on all systems, not only on GCE
2016-12-15 10:19:43 +01:00
Alexander Block
a9684648ab Fix wrong path for dhclient.conf on RedHat/CentOS
/etc/dhclient.conf is ignored on RedHat/CentOS
Correct location is /etc/dhcp/dhclient.conf
2016-12-15 10:11:16 +01:00
Matthew Mosesohn
9cc73bdf08 Fix etcd member list when upgrading ETCD from an old version 2016-12-15 12:00:45 +04:00
Bogdan Dobrelya
114ab5e4e6 Merge pull request #721 from adidenko/calico-add-rr
Add calico/routereflector support
2016-12-14 17:22:00 +01:00
Smaine Kahlouch
29874baf8a Merge pull request #708 from vwfs/cloud_network
Add support for cloud-provider based networking
2016-12-14 16:23:20 +01:00
Alexander Block
81317505eb Set net.ipv4.ip_forward=1 on all systems, not only on GCE 2016-12-14 15:08:13 +01:00
Aleksandr Didenko
d57c27ffcf Add calico/routereflector support
Add BGP route reflectors support in order to optimize BGP topology
for deployments with Calico network plugin.

Also bump version of calico/ctl for some bug fixes.
2016-12-14 13:44:10 +01:00
Alexander Block
d50eb60827 Add --reconcile-cidr flag to kubelet to support cloud network plugin in 1.4 2016-12-13 17:30:10 +01:00
Alexander Block
dbd9aaf1ea Add check for azure_route_table_name and add it to all.yml 2016-12-13 17:30:10 +01:00
Alexander Block
d20d5e648f Add pseudo network plugin called "cloud" to use cloud provider for network
Allow to let the cloud provider configure proper routing for nodes.
2016-12-13 17:30:10 +01:00
Alexander Block
06584ee3aa Add support for bastion hosts 2016-12-13 17:29:47 +01:00
Antoine Legrand
26e3142c95 Merge branch 'master' into standalone_kubelet 2016-12-13 17:26:21 +01:00
Alexander Block
665ce82d71 Move kube_version to group_vars/all to allow easier changing of version
Also allows to perform version dependent logic in Ansible roles.
2016-12-13 17:21:00 +01:00
Alexander Block
444b1dafdc Pass --anonymous-auth to apiserver
Fixes #732
2016-12-13 17:06:53 +01:00
Bogdan Dobrelya
d6174b22e9 Merge pull request #731 from bogdando/fix_resolvconf
Fix resolvconf
2016-12-13 16:48:37 +01:00
Bogdan Dobrelya
c75f394707 Address standalone kubelet config case
Also place in global vars and do not repeat the kube_*_config_dir
and kube_namespace vars for better code maintainability and UX.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-13 16:35:53 +01:00
Bogdan Dobrelya
0515814e0c Fix resolvconf
Do not repeat options and nameservers in the dhclient hooks.
Do not prepend nameservers for dhclient but supersede and fail back
to the upstream_dns_resolvers then default_resolver. Fixes order of
nameservers placement, which is cluster DNS ip goes always first.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-13 15:48:53 +01:00
Alexander Block
1cfaf927c9 Fix reverse umount in reset role
The Jinja2 filter 'reverse' returned an iterator instead of a list,
resulting in the umount task to fail.

Intead of using the reverse filter, we use 'tac' to reverse the output
of the previous task.
2016-12-13 14:21:24 +01:00
Bogdan Dobrelya
45135ad3e4 Merge pull request #705 from vwfs/centos7-azure
Better support for CentOS 7 on Azure
2016-12-13 10:36:58 +01:00
Bogdan Dobrelya
4e721bfd9d Merge pull request #667 from bogdando/fix_dns
Rework DNS stack to meet hostnet pods needs
2016-12-12 21:38:13 +01:00
Bogdan Dobrelya
f52ed9f91e Update main.yml 2016-12-12 21:37:16 +01:00
Bogdan Dobrelya
3117858dcd Rework DNS stack to meet hostnet pods needs
* For Debian/RedHat OS families (with NetworkManager/dhclient/resolvconf
  optionally enabled) prepend /etc/resolv.conf with required nameservers,
  options, and supersede domain and search domains via the dhclient/resolvconf
  hooks.

* Drop (z)nodnsupdate dhclient hook and re-implement it to complement the
  resolvconf -u command, which is distro/cloud provider specific.
  Update docs as well.

* Enable network restart to apply and persist changes and simplify handlers
  to rely on network restart only. This fixes DNS resolve for hostnet K8s
  pods for Red Hat OS family. Skip network restart for canal/calico plugins,
  unless https://github.com/projectcalico/felix/issues/1185 fixed.

* Replace linefiles line plus with_items to block mode as it's faster.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
2016-12-12 17:43:47 +01:00
Alexander Block
5176e5c968 Make growpart only run on Azure 2016-12-12 14:14:22 +01:00
Bogdan Dobrelya
774f4dbbf7 Merge branch 'master' into tags_download 2016-12-12 11:44:00 +01:00
Matthew Mosesohn
b1e852a785 Merge pull request #707 from vwfs/reset_playbook
Add playbook and role to reset the cluster
2016-12-12 12:43:00 +03:00
Alexander Block
9fd14cb6ea Add growpart role to allow growing the root partition on CentOS
At least the OS images from Azure do not grow the root FS automatically.
2016-12-12 09:55:28 +01:00
Alexander Block
4e34803b1e Disable fastestmirror on CentOS
It actually slows down things dramatically when used in combination
with Ansible.
2016-12-12 09:54:39 +01:00
Alexander Block
7abcf6e0b9 Remove requiretty from sudoers to actually make pipelining work
Some systems (e.g. CentOS on Azure) have requiretty in sudoers which makes
pipelining fail.
2016-12-12 09:54:39 +01:00
Matthew Mosesohn
e5ad0836bc Merge pull request #713 from kubernetes-incubator/bump_kubedns
Bump kubedns version to 1.9
2016-12-10 11:08:42 +03:00
Bogdan Dobrelya
2c50f20429 Merge pull request #696 from bogdando/intranet_dns
Preconfigure dns stack early
2016-12-09 21:46:03 +01:00
Bogdan Dobrelya
a15d626771 Preconfigure DNS stack and docker early
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
  skip_dnsmasq_k8s var as not needed anymore.

* Preconfigure DNS stack early, which may be the case when downloading
  artifacts from intranet repositories. Do not configure
  K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be
  not existing).

* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
  was set up and before K8s apps to be created.

* Move docker install task to early stage as well and unbind it from the
  etcd role's specific install path. Fix external flannel dependency on
  docker role handlers. Also fix the docker restart handlers' steps
  ordering to match the expected sequence (the socket then the service).

* Add default resolver fact, which is
  the cloud provider specific and remove hardcoded GCE resolver.

* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
  domains combined with high ndots values lead to poor performance of
  DNS stack and make ansible workers to fail very often with the
  "Timeout (12s) waiting for privilege escalation prompt:" error.

* Update docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 17:30:55 +01:00
Bogdan Dobrelya
fd9b26675e More granular control for download/upload images/binaries
Add upload tag allow users to exclude distributing images across nodes
when running with the download tag set.
Add related tags and update docs as well.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 17:04:55 +01:00
Alexander Block
eb33f085b6 Changes according to code review 2016-12-09 16:33:10 +01:00
Matthew Mosesohn
459bee6d2c Bump kubedns version to 1.9
Version 1.9 has reduced verbosity for federation dns queries
which flood container logs.
2016-12-09 17:57:54 +03:00
Alexander Block
8a5ba6b20c Use proper style (spacing) for docker_storage_options 2016-12-09 13:56:56 +01:00
Alexander Block
c3ec3ff902 Allow to specify docker storage driver 2016-12-09 13:56:56 +01:00
Bogdan Dobrelya
7897c34ba3 Merge pull request #700 from bogdando/tags
Add tags
2016-12-09 13:23:56 +01:00
Bogdan Dobrelya
8cc84e132a Add tags
Add tags to allow more granular tasks filtering.
Add generator script for MD formatted tags found.
Add docs for tags how-to.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 12:14:28 +01:00
Alexander Block
00ad151186 Add playbook and role to reset the cluster
This deletes everything related to the cluster and allows to start from
scratch.
2016-12-09 11:15:36 +01:00
Aleksandr Didenko
ee8d6ab4fc Convert docker_versioned_pkg dict keys to string
This will allow to use '-e docker_version=1.12' in ansible playbook
execution. It's also backward-compatible and will work with floating
docker_version format in custom yaml files.

Closes #702
2016-12-09 09:17:36 +01:00
Matthew Mosesohn
a80745b5bd Merge pull request #668 from bodepd/etcd_access_address
Use etcd host ip instead of hostname to build etcd_access_addresses
2016-12-09 07:54:12 +03:00
Bogdan Dobrelya
710d5ae48e Merge pull request #691 from adidenko/calico-old-cni-fix
Fix possible problems with legacy calicoctl
2016-12-08 12:00:08 +01:00
Dan Bode
eec2ed5809 Allow etcd_access_addresses to be more flexible
The variale etcd_access_addresses is used to determine
how to address communication from other roles to
the etcd cluster.

It was set to the address that ansible uses to
connect to instance ({{ item }})s and not the
the variable:
  ip_access
which had already been created and could already
be overridden through the access_ip variable.

This change allows ansible to connect to a machine using
a different address than the one used to access etcd.
2016-12-07 10:33:15 -08:00
Matthew Mosesohn
bfc9bcb8c7 Force hardlink for calico/canal certs
Fixes: #669
2016-12-07 19:03:22 +03:00
Bogdan Dobrelya
8eb26c21be Merge pull request #692 from bogdando/gce_fixes
Change GCE sysctls placement and docs
2016-12-07 16:17:30 +01:00
Bogdan Dobrelya
f0f2b81276 Change GCE sysctls placement and docs
Override GCE sysctl in /etc/sysctl.d/99-sysctl.conf instead of
the /etc/sysctl.d/11-gce-network-security.conf. It is recreated
by GCE, f.e. if gcloud CLI invokes some security related changes,
thus losing customizations we want to be persistent.

Update cloud providers firewall requirements in calico docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-07 12:53:45 +01:00
Aleksandr Didenko
c9290182be Fix possible problems with legacy calicoctl
When running legacy calicoctl we do not specify calico hostname in
calico-node container thus we should not specify it in CNI config.

Also move 'legacy_calicoctl' set_fact task to the top.
2016-12-07 12:26:44 +01:00
fen4o
246c8209c1 add cluster-signing to kube-controller-manager
kube-controller-manager's cluster signing cert and key points by default to not
existing `/etc/kubernetes/ca/ca.pem` and `/etc/kubernetes/ca/ca.key` [docs][1]

[1]: http://kubernetes.io/docs/admin/kube-controller-manager/#options
2016-12-07 11:20:18 +02:00
Bogdan Dobrelya
36fe2cb5ea Merge pull request #584 from chadswen/docker-options-refactor
Docker Options Refactor
2016-12-07 07:57:53 +01:00
Bogdan Dobrelya
9d6cc3a8d5 Merge pull request #684 from adidenko/fix-calico-peering
Calico: fix peering with routers for new version
2016-12-06 22:42:02 +01:00
Spencer Smith
8870178a2d Merge pull request #627 from kubernetes-incubator/issue-626
add restart flag for docker run kubelet
2016-12-06 08:47:18 -08:00
Aleksandr Didenko
b0079ccd77 Calico: fix peering with routers for new version
In new `calicoctl` version nodes peering with routers is broken.
We need to use predictable node names for calico-node and the
same names in calico `bgpPeer` resources and CNI.
2016-12-06 17:17:39 +01:00
Bogdan Dobrelya
2c1db56213 Merge pull request #678 from adidenko/update-calico-unit
Update calico-node systemd unit
2016-12-06 13:51:37 +01:00
Aleksandr Didenko
f1d7af11ee Update calico-node systemd unit
New calicoctl does not support --detach=false option, so we should
use a recommended way to run calico-node service:
http://docs.projectcalico.org/v2.0/usage/configuration/as-service

Closes #674, #675
2016-12-06 11:34:12 +01:00
Bogdan Dobrelya
59a097b255 Merge pull request #679 from kubernetes-incubator/kube-proxy-dbus
Add dbus socket dir to kube-proxy
2016-12-06 11:08:16 +01:00
Matthew Mosesohn
7a3a473ccf Fix ipv4 forwarding on GCE
ipv4 forwarding gets broken when restarting networking, which
breaks all networking for all pods.
2016-12-06 11:57:57 +03:00
Matthew Mosesohn
2cdf752481 Add dbus socket dir to kube-proxy 2016-12-05 19:25:27 +03:00
Chad Swenson
8b5b27bb51 Docker Options Refactor 2016-12-02 15:07:51 -06:00
Bogdan Dobrelya
7328e0e1ac Merge pull request #672 from kubernetes-incubator/fail_all_on_error
Fail all nodes on error
2016-12-02 17:08:10 +01:00
Bogdan Dobrelya
c13d0db0cc Merge pull request #656 from YorikSar/nginx-proxy-timeout
Set proxy_timeout to 10m in nginx.conf
2016-12-02 12:48:18 +01:00
ant31
dba2026002 Fail all nodes on error 2016-12-02 12:37:22 +01:00
Sebastian Melchior
bb55f68f95 add basic azure support for kargo 2016-11-29 10:20:28 +01:00
Yuriy Taraday
658543c949 Set proxy_timeout to 10m in nginx.conf
Fixes #655.

This is a teporary solution for long-polling idle connections to
apiserver. It will make Nginx not cut them for the duration of expected
timeout. It will also make Nginx extremely slow in realizing that there
is some issue with connectivity to apiserver as well, so it might not be
perfect permanent solution.
2016-11-28 20:27:47 +03:00
Antoine Legrand
5b382668f5 Merge pull request #529 from bogdando/netcheck
Add a k8s app for advanced e2e netcheck for DNS
2016-11-28 15:26:30 +01:00
Bogdan Dobrelya
b7692fad09 Add advanced net check for DNS K8s app
* Add an option to deploy K8s app to test e2e network connectivity
  and cluster DNS resolve via Kubedns for nethost/simple pods
  (defaults to false).
* Parametrize existing k8s apps templates with kube_namespace and
  kube_config_dir instead of hardcode.
* For CoreOS, ensure nameservers from inventory to be put in the
  first place to allow hostnet pods connectivity via short names
  or FQDN and hostnet agents to pass as well, if netchecker
  deployed.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-28 13:23:25 +01:00
Bogdan Dobrelya
fbdda81515 Merge pull request #652 from kubernetes-incubator/debug_mode
Tune dnsmasq/kubedns limits, replicas, logging
2016-11-25 16:57:15 +01:00
Bogdan Dobrelya
2d18e19263 Tune dnsmasq/kubedns limits, replicas, logging
* Add dns_replicas, dns_memory/cpu_limit/requests vars for
dns related apps.
* When kube_log_level=4, log dnsmasq queries as well.
* Add log level control for skydns (part of kubedns app).
* Add limits/requests vars for dnsmasq (part of kubedns app) and
  dnsmasq daemon set.
* Drop string defaults for kube_log_level as it is int and
  is defined in the global vars as well.
* Add docs

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-25 12:49:17 +01:00
Aleksandr Didenko
ff7d489f2d Update calico/ctl image tag
We no longer need to use v0.22.0 for calicoctl since Kargo has
support for new calicoctl CLI format.

Also fixing condition logic for calico pool task.
2016-11-25 11:23:27 +01:00
Bogdan Dobrelya
6d29a5981c Merge pull request #651 from bogdando/fix_docker_install
Fix download dnsmasq image dependency on docker
2016-11-24 18:44:12 +01:00
Bogdan Dobrelya
10b75d1d51 Merge pull request #648 from artem-panchenko/fix_calicoctl_node_run
Fix Calico jinja template (systemd)
2016-11-24 18:33:34 +01:00
Bogdan Dobrelya
aa447585c4 Fix download dnsmasq image dependency on docker
When download_run_once with download_localhost is used, docker is
expected to be running on the delegate localhost. That may be not
the case for a non localhost delegate, which is the kube-master
otherwise. Then the dnsmasq role, had it been invoked early before
deployment starts, would fail because of the missing docker dependency.

* Fix that dependency on docker and do not pre download dnsmasq image
  for the dnsmasq role, if download_localhost is disabled.
* Remove become: false for docker CLI invocation because that's not
  the common pattern to allow users access docker CLI w/o sudo.
* Fix opt bin path hack for localhost delegate to ignore errors when
  it fails with "sudo password required" otherwise.
* Describe download_run_once with download_localhost use case in docs
  as well.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-24 18:31:26 +01:00
Bogdan Dobrelya
d208896c46 Ensure /etc/resolv.conf content for CoreOS
Use cloud-init config to replace /etc/resolv.conf with the
content for kubelet to properly configure hostnet pods.

Do not use systemd-resolved yet, see
https://coreos.com/os/docs/latest/configuring-dns.html
"Only nss-aware applications can take advantage of the
systemd-resolved cache. Notably, this means that statically
linked Go programs and programs running within Docker/rkt
will use /etc/resolv.conf only, and will not use the
systemd-resolve cache."

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-23 16:51:49 +01:00
Artem Panchenko
2c4b11f321 Fix Calico jinja template (systemd) 2016-11-23 11:43:53 +02:00
Bogdan Dobrelya
d890d2f277 Fix nginx container download for download_run_once mode
W/o this patch, the "Download containers" task may be skipped
when running on the delegate node due to wrong "when" confition.
Then it fails to upload nginx image to the nodes as well.

Fix download nginx dependency so it always can be pushed to
nodes when download_run_once is enabled.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-23 10:37:08 +01:00
Bogdan Dobrelya
793f3990a0 Merge pull request #642 from kubernetes-incubator/k8s_imgpull
Allow pre-downloaded images to be used effectively
2016-11-22 18:09:38 +01:00
Aleksandr Didenko
db03f17486 Set defaults for ansible_ssh_user
When setting permission for containers download/upload dir we're
using `ansible_ssh_user`. But if playbook is executed without
user being explicitly set `ansible_ssh_user` may be undefined.
In such situations dir ownership will default to `ansible_user_id`

Closes: #644
2016-11-22 18:00:56 +01:00
Bogdan Dobrelya
dff78f616e Allow pre-downloaded images to be used effectively
According to http://kubernetes.io/docs/user-guide/images/ :
By default, the kubelet will try to pull each image from the
specified registry. However, if the imagePullPolicy property
of the container is set to IfNotPresent or Never, then a local\
image is used (preferentially or exclusively, respectively).

Use IfNotPresent value to allow images prepared by the download
role dependencies to be effectively used by kubelet without pull
errors resulting apps to stay blocked in PullBackOff/Error state
even when there are images on the localhost exist.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-22 16:16:04 +01:00