Commit graph

389 commits

Author SHA1 Message Date
Matthew Mosesohn ec54b36e05
add retries for calico/canal etcd commands (#2007) 2017-11-28 16:39:55 +00:00
Simon Li 7be2521a31 Add flannel hairping mode 2017-11-21 10:43:50 +00:00
Spencer Smith bc1a4e12ad fix broken variable in ansible 2.4.1.0 and ensure tasks for calico-rr (#1982) 2017-11-16 18:44:15 +00:00
Aivars Sterns 5e558c361b update weave-net to 2.0.5 version (#1877) 2017-11-13 16:11:47 +00:00
Hyunsun Moon 37125866ca Make calico_node_ignorelooserpf have an effect (#1945) 2017-11-13 09:35:13 +00:00
Chad Swenson 16ae2c1809 Flannel RBAC Fix
Fixes a bug that can occur if `cni-flannel-rbac.yml` was written but the playbook failed before it was applied. Uses the same approach as calico.
2017-11-02 23:20:23 -05:00
Matthew Mosesohn 86fb669fd3 Idempotency fixes (#1838) 2017-10-25 21:19:40 +01:00
Matthew Mosesohn fc9a65be2b Refactor downloads to use download role directly (#1824)
* Refactor downloads to use download role directly

Also disable fact delegation so download delegate works acros OSes.

* clean up bools and ansible_os_family conditionals
2017-10-19 09:17:11 +01:00
Matthew Mosesohn d4b10eb9f5 Fix path for calico get node names (#1816) 2017-10-17 10:54:48 +01:00
Kevin Lefevre 6ec45b10f1 Update network-plugins to use portmap plugin (#1763)
Portmap allow to use hostPort with CNI plugins. Should fix #1675
2017-10-16 07:11:38 +01:00
Matthew Mosesohn 10dd049912 Revert "Security fixes for etcd (#1778)" (#1786)
This reverts commit 4209f1cbfd.
2017-10-12 14:02:51 +01:00
Matthew Mosesohn 4209f1cbfd Security fixes for etcd (#1778)
* Security fixes for etcd

* Use certs when querying etcd
2017-10-12 13:32:54 +01:00
Brad Beam b81c0d869c Adding calico/node env vars for prometheus configuration 2017-10-05 08:46:01 -05:00
Matthew Mosesohn f14f04c5ea Upgrade to kubernetes v1.8.0 (#1730)
* Upgrade to kubernetes v1.8.0

hyperkube no longer contains rsync, so now use cp

* Enable node authorization mode

* change kube-proxy cert group name
2017-10-05 10:51:21 +01:00
Aivars Sterns 9c86da1403 Normalize tags in all places to prepare for tag fixing in future (#1739) 2017-10-05 08:43:04 +01:00
Matthew Mosesohn a56738324a Move set_facts to kubespray-defaults defaults
These facts can be generated in defaults with a performance
boost.

Also cleaned up duplicate etcd var names.
2017-10-04 14:02:47 +01:00
Matthew Mosesohn d94e3a81eb Use api lookup for kubelet hostname when using cloudprovider (#1686)
The value cannot be determined properly via local facts, so
checking k8s api is the most reliable way to look up what hostname
is used when using a cloudprovider.
2017-09-24 09:22:15 +01:00
Matthew Mosesohn b294db5aed fix apply for netchecker upgrade (#1659)
* fix apply for netchecker upgrade and graceful upgrade

* Speed up daemonset upgrades. Make check wait for ds upgrades.
2017-09-15 13:19:37 +01:00
Matthew Mosesohn 6744726089 kubeadm support (#1631)
* kubeadm support

* move k8s master to a subtask
* disable k8s secrets when using kubeadm
* fix etcd cert serial var
* move simple auth users to master role
* make a kubeadm-specific env file for kubelet
* add non-ha CI job

* change ci boolean vars to json format

* fixup

* Update create-gce.yml

* Update create-gce.yml

* Update create-gce.yml
2017-09-13 19:00:51 +01:00
Brad Beam 69fac8ea58 Merge pull request #1634 from bradbeam/calico_cni
fix for calico cni plugin node name
2017-09-11 22:18:06 -05:00
Matthew Mosesohn 5d99fa0940 Purge old upgrade hooks and unused tasks (#1641) 2017-09-09 23:41:20 +03:00
Brad Beam eeffbbb43c Updating calicocni.hostname to calicocni.nodename 2017-09-08 12:47:40 +00:00
Brad Beam aaa0105f75 Flexing calicocni.hostname based on cloud provider 2017-09-08 12:47:40 +00:00
Chad Swenson e26aec96b0 Consolidate kube-proxy module and sysctl loading (#1586)
This sets br_netfilter and net.bridge.bridge-nf-call-iptables sysctl from a single play before kube-proxy is first ran instead of from the flannel and weave network_plugin roles after kube-proxy is started
2017-09-06 15:11:51 +03:00
Matthew Mosesohn 660282e82f Make daemonsets upgradeable (#1606)
Canal will be covered by a separate PR
2017-09-04 11:30:01 +03:00
Matthew Mosesohn 77602dbb93 Move calico to daemonset (#1605)
* Drop legacy calico logic

* add calico as a daemonset
2017-09-04 11:29:51 +03:00
Matthew Mosesohn a3e6896a43 Add RBAC support for canal (#1604)
Refactored how rbac_enabled is set
Added RBAC to ubuntu-canal-ha CI job
Added rbac for calico policy controller
2017-09-04 11:29:40 +03:00
Matthew Mosesohn 13d08af054 Fix upgrade for canal and apiserver cert
Fixes #1573
2017-08-29 22:08:30 +01:00
Chad Swenson a39e78d42d Initial version of Flannel using CNI (#1486)
* Updates Controller Manager/Kubelet with Flannel's required configuration for CNI
* Removes old Flannel installation
* Install CNI enabled Flannel DaemonSet/ConfigMap/CNI bins and config (with portmap plugin) on host
* Uses RBAC if enabled
* Fixed an issue that could occur if br_netfilter is not a module and net.bridge.bridge-nf-call-iptables sysctl was not set
2017-08-25 10:07:50 +03:00
Brad Beam 71dca67ca2 Merge pull request #1508 from tmjd/update-calico-2-4-0
Update Calico to 2.4.1 release.
2017-08-24 14:57:29 -05:00
Yuki KIRII a98b866a66 Verify if br_netfilter module exists (#1492) 2017-08-24 17:47:32 +03:00
Matthew Mosesohn 6bb3463e7c Enable scheduling of critical pods and network plugins on master
Added toleration to DNS, netchecker, fluentd, canal, and
calico policy.

Also small fixes to make yamllint pass.
2017-08-24 10:41:17 +01:00
Brad Beam 8b151d12b9 Adding yamllinter to ci steps (#1556)
* Adding yaml linter to ci check

* Minor linting fixes from yamllint

* Changing CI to install python pkgs from requirements.txt

- adding in a secondary requirements.txt for tests
- moving yamllint to tests requirements
2017-08-24 12:09:52 +03:00
Erik Stidham 9f9f70aade Update Calico to 2.4.1 release.
- Switched Calico images to be pulled from quay.io
- Updated Canal too
2017-08-21 09:33:12 -05:00
Vijay Katam c92506e2e7 Add calico variable that enables ignoring Kernel's RPF Setting (#1493) 2017-08-20 14:01:09 +03:00
timtoum 3e457e4edf Enable weave seed mode for kubespray (#1414)
* Enable weave seed mode for kubespray

* fix task Weave seed | Set peers if existing peers

* fix mac address variabilisation

* fix default values

* fix include seed condition

* change weave var to default values

* fix Set peers if existing peers
2017-07-26 19:09:34 +03:00
AtzeDeVries e160018826 Fixed conflicts, ipip:true as defualt and added ipip_mode 2017-07-08 14:36:44 +02:00
Kevin Jing Qiu a742d10c54 Allow calico ipPool to be created with mode "cross-subnet" 2017-07-04 19:05:16 -04:00
AtzeDeVries 61b74f9a5b updated to direct control over ipip 2017-06-23 09:16:05 +02:00
AtzeDeVries 7332679678 Give more control over IPIP, but with same default behaviour 2017-06-20 14:50:08 +02:00
Brad Beam 780308c194 Merge pull request #1174 from jlothian/atomic-docker-restart
Fix docker restart in atomic
2017-06-07 12:05:32 -05:00
Justin Hunthrop af55e179c7 adding --skip-exists flag for peer_with_router 2017-05-25 14:29:18 -05:00
Josh Lothian 7ae5785447 Removed the other unused handler
With live-restore: true, we don't need a special docker restart
2017-05-19 09:50:10 -05:00
Josh Lothian ef8d3f684f Remove unused handler
Previous patch removed the step that sets live-restore
back to false, so don't try to notify that handler any more
2017-05-19 09:45:46 -05:00
Josh Lothian 6f67367b57 Leave 'live-restore' false
Leave live-restore false to updates always pick
up new network configuration
2017-05-17 14:31:49 -05:00
Josh Lothian 9ee0600a7f Update handler names and explanation 2017-05-17 14:31:49 -05:00
Josh Lothian 30cc7c847e Reconfigure docker restart behavior on atomic
Before restarting docker, instruct it to kill running
containers when it restarts.

Needs a second docker restart after we restore the original
behavior, otherwise the next time docker is restarted by
an operator, it will unexpectedly bring down all running
containers.
2017-05-17 14:31:49 -05:00
Josh Lothian a5bb24b886 Fix docker restart in atomic
In atomic, containers are left running when docker is restarted.
When docker is restarted after the flannel config is put in place,
the docker0 interface isn't re-IPed because docker sees the running
containers and won't update the previous config.

This patch kills all the running containers after docker is stopped.
We can't simply `docker stop` the running containers, as they respawn
before we've got a chance to stop the docker daemon, so we need to
use runc to do this after dockerd is stopped.
2017-05-17 14:31:49 -05:00
Sergii Golovatiuk 674b71b535 Ansible 2.3 support
- Fix when clauses in various places
- Update requirements.txt
- Fix README.md

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-04-26 15:22:10 +02:00
Matthew Mosesohn e9a294fd9c Significantly reduce memory requirements
Canal runs more pods and upgrades need a bit of extra
room to load new pods in and get the old ones out.
2017-03-27 13:28:37 +03:00
Matthew Mosesohn e1faeb0f6c Fix weave on RHEL deployment
Reduce retry delay checking weave
Always load br_netfilter module
2017-03-17 18:17:47 +03:00
Aleksandr Didenko 3a39904011 Move calico-policy-controller into separate role
By default Calico CNI does not create any network access policies
or profiles if 'policy' is enabled in CNI config. And without any
policies/profiles network access to/from PODs is blocked.

K8s related policies are created by calico-policy-controller in
such case. So we need to start it as soon as possible, before any
real workloads.

This patch also fixes kube-api port in calico-policy-controller
yaml template.

Closes #1132
2017-03-17 11:21:52 +01:00
Matthew Mosesohn a422ad0d50 More idempotency fixes
Fixed sync_tokens fact
Fixed sync_certs for k8s tokens fact
Disabled register docker images changability
Fixed CNI dir permission
Fix idempotency for etcd pre upgrade checks
2017-03-15 19:06:39 +03:00
Artem Panchenko fa05d15093 Allow connections from pods to local endpoints
By default Calico blocks traffic from endpoints
to the host itself by using an iptables DROP
action. It could lead to a situation when service
has one alive endpoint, but pods which run on
the same node can not access it. Changed the action
to RETURN.
2017-03-01 09:21:02 +02:00
Brad Beam 56664b34a6 Lower default memory requests
This is to address out of memory issues on CI as well as help
fit deployments for people starting out with kargo on smaller
machines
2017-02-27 10:53:43 -06:00
Bogdan Dobrelya 712872efba Rework inventory all by real groups' vars
* Leave all.yml to keep only optional vars
* Store groups' specific vars by existing group names
* Fix optional vars casted as mandatory (add default())
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-02-23 09:43:42 +01:00
Ivan Shvedunov 0006e5ab45 Fix shell special vars 2017-02-21 22:22:40 +03:00
Andrew Greenwood ca9ea097df Cleanup legacy syntax, spacing, files all to yml
Migrate older inline= syntax to pure yml syntax for module args as to be consistant with most of the rest of the tasks
Cleanup some spacing in various files
Rename some files named yaml to yml for consistancy
2017-02-17 16:22:34 -05:00
Sergii Golovatiuk e91e58aec9 Fix fact tags
Ansible playbook fails when tags are limited to "facts,etcd" or to
"facts". This patch allows to run ansible-playbook to gather facts only
that don't require calico/flannel/weave components to be verified. This
allows to run ansible with 'facts,bootstrap-os' or just 'facts' to
gether facts that don't require specific components.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-17 12:32:33 +01:00
Vladimir Rutsky 09847567ae set "check_mode: no" for read-only "shell" steps that registers result
"shell" step doesn't support check mode, which currently leads to failures,
when Ansible is being run in check mode (because Ansible doesn't run command,
assuming that command might have effect, and no "rc" or "output" is registered).

Setting "check_mode: no" allows to run those "shell" commands in check mode
(which is safe, because those shell commands doesn't have side effects).
2017-02-13 18:53:41 +03:00
Matthew Mosesohn 6ae70e03cb fixup upgrades for canal and weave 2017-02-10 13:27:41 +03:00
Matthew Mosesohn 29fd957352 Enable weave upgrade from previous versions
Raise readiness probe initial time to 60 (was 30)
2017-02-09 21:39:31 +03:00
Matthew Mosesohn 0a7c6eb9dc Merge pull request #998 from mattymo/fix_upgrade_daemonsets
Fix upgrade for all daemonset type resources
2017-02-09 20:02:21 +03:00
Sergii Golovatiuk 4124d84c00 Lower weave RAM settings.
- Since Weave 1.8.x was rewritten in Golang we may decrease RAM settings
  to continue using g1-small for CI
2017-02-08 18:50:36 +01:00
Matthew Mosesohn 3c713a3f53 Fix upgrade for all daemonset type resources
Daemonsets cannot be simply upgraded through a single API call,
regardless of any kubectl documentation. The resource must be
purged and then recreated in order to make any changes.
2017-02-08 18:16:00 +03:00
Matthew Mosesohn fd30131dc2 Revert "Drop linux capabilities and rework users/groups" 2017-02-06 15:58:54 +03:00
Bogdan Dobrelya cae2982d81 Merge pull request #911 from bogdando/DROP_CAPS
Drop linux capabilities and rework users/groups
2017-02-06 12:05:51 +01:00
Sergii Golovatiuk f2e4ffcac2 Fix weave-net after upgrade to 1.82
- Set recommended CPU settings
- Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
- Limit DS creation to master
- Combined 2 tasks into one with better condition
2017-02-02 10:31:58 +01:00
Matthew Mosesohn 39d87a96aa Rename weave-kube to weave-net
Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
2017-01-31 18:47:27 +03:00
Brad Beam a11b9d28bd Upgrading weave to weave-kube 2017-01-27 17:05:25 -06:00
Aleksandr Didenko 46c177b982 Switch to ansible_hostname in calico
For consistancy with kubernetes services we should use the same
hostname for nodes, which is 'ansible_hostname'.

Also fixing missed 'kube-node' in templates, Calico is installed
on 'k8s-cluster' roles, not only 'kube-node'.
2017-01-25 11:49:58 +01:00
Aleksandr Didenko f05aaeb329 Fix calico-rr peering with k8s masters
Calico-rr is broken for deployments with separate k8s-master and
k8s-node roles. In order to fix it we should peer k8s-cluster
nodes with calico-rr, not just k8s-node. The same for peering
with routers.

Closes #925
2017-01-23 10:19:09 +01:00
Bogdan Dobrelya cb2e5ac776 Drop linux capabilities and rework users/groups
* Drop linux capabilities for unprivileged containerized
  worlkoads Kargo configures for deployments.
* Configure required securityContext/user/group/groups for kube
  components' static manifests, etcd, calico-rr and k8s apps,
  like dnsmasq daemonset.
* Rework cloud-init (etcd) users creation for CoreOS.
* Fix nologin paths, adjust defaults for addusers role and ensure
  supplementary groups membership added for users.
* Add netplug user for network plugins (yet unused by privileged
  networking containers though).
* Grant the kube and netplug users read access for etcd certs via
  the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove priveleged mode for calico-rr and run it under its uid/gid
  and supplementary etcd_cert group.
* Adjust docs.
* Align cpu/memory limits and dropped caps with added rkt support
  for control plane.

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-20 08:50:42 +01:00
Bogdan Dobrelya 79aeb10431 Merge pull request #858 from bradbeam/calicoctl-canal
Misc updates for canal
2017-01-10 12:24:59 +01:00
Aleksandr Didenko d9539e0f27 Fix etcd cert generation for calico-rr role
"etcd_node_cert_data" variable is undefinded for "calico-rr" role.
This patch adds "calico-rr" nodes to task where "etcd_node_cert_data"
variable is registered.
2017-01-09 12:06:25 +01:00
Brad Beam cf042b2a4c Create network policy directory for canal 2017-01-05 10:54:27 -06:00
Brad Beam 65c86377fc Adding calicoctl to canal deployment 2017-01-05 10:54:27 -06:00
Bogdan Dobrelya 5af2c42bde Better fix for different CoreOS os family facts
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-05 16:32:08 +01:00
Bogdan Dobrelya f7447837c5 Rename CoreOS fact
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-05 14:02:29 +01:00
Bogdan Dobrelya d8a2941e9e Fix cert paths for flannel/calico policy apps
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-03 16:12:54 +01:00
Bogdan Dobrelya 58062be2a3 Drop non systemd OS types support
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-02 12:14:03 +01:00
Bogdan Dobrelya a56d9de502 Systemd units, limits, and bin path fixes
* Add restart for weave service unit
* Reuse docker_bin_dir everythere
* Limit systemd managed docker containers by CPU/RAM. Do not configure native
  systemd limits due to the lack of consensus in the kernel community
  requires out-of-tree kernel patches.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-28 15:49:42 +01:00
Matthew Mosesohn e7a1949d85 Merge pull request #818 from mattymo/calico-rr-certs
Fix calico-rr to use etcd certs instead of kube certs
2016-12-28 08:47:16 +03:00
Matthew Mosesohn 6d9cd2d720 Fix calico-rr to use etcd certs instead of kube certs 2016-12-27 17:04:50 +03:00
Bogdan Dobrelya 79996b557b Rework ignore_errors to report no reds
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2016-12-27 13:00:50 +01:00
Matthew Mosesohn a4bce333a3 Merge pull request #760 from genti-t/issue-748-flannel-options
Fix Flannel network on CoreOS
2016-12-22 19:02:31 +03:00
Genti Topija 7c2785e083 Fix Flannel network on CoreOS
Resolves: #748
2016-12-22 16:50:04 +01:00
Matthew Mosesohn ad796d188d Individual etcd ssl certs
Includes hooks for triggering calico, kubelet, and kube-apiserver restarts
if etcd certs changed.
2016-12-22 13:31:11 +03:00
Spencer Smith 8d9f207836 create systemd drop-in path if not existent 2016-12-21 13:06:12 -05:00
Bogdan Dobrelya 92f542938c Merge pull request #745 from kubernetes-incubator/fix_weave_start
Fix weave restart after docker daemon restart
2016-12-16 14:06:48 +01:00
Matthew Mosesohn 495d0b659a Fix weave restart after docker daemon restart 2016-12-16 14:15:22 +03:00
Bogdan Dobrelya 114ab5e4e6 Merge pull request #721 from adidenko/calico-add-rr
Add calico/routereflector support
2016-12-14 17:22:00 +01:00
Smaine Kahlouch 29874baf8a Merge pull request #708 from vwfs/cloud_network
Add support for cloud-provider based networking
2016-12-14 16:23:20 +01:00
Aleksandr Didenko d57c27ffcf Add calico/routereflector support
Add BGP route reflectors support in order to optimize BGP topology
for deployments with Calico network plugin.

Also bump version of calico/ctl for some bug fixes.
2016-12-14 13:44:10 +01:00
Alexander Block d20d5e648f Add pseudo network plugin called "cloud" to use cloud provider for network
Allow to let the cloud provider configure proper routing for nodes.
2016-12-13 17:30:10 +01:00
Bogdan Dobrelya c75f394707 Address standalone kubelet config case
Also place in global vars and do not repeat the kube_*_config_dir
and kube_namespace vars for better code maintainability and UX.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-13 16:35:53 +01:00
Bogdan Dobrelya a15d626771 Preconfigure DNS stack and docker early
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
  skip_dnsmasq_k8s var as not needed anymore.

* Preconfigure DNS stack early, which may be the case when downloading
  artifacts from intranet repositories. Do not configure
  K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be
  not existing).

* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
  was set up and before K8s apps to be created.

* Move docker install task to early stage as well and unbind it from the
  etcd role's specific install path. Fix external flannel dependency on
  docker role handlers. Also fix the docker restart handlers' steps
  ordering to match the expected sequence (the socket then the service).

* Add default resolver fact, which is
  the cloud provider specific and remove hardcoded GCE resolver.

* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
  domains combined with high ndots values lead to poor performance of
  DNS stack and make ansible workers to fail very often with the
  "Timeout (12s) waiting for privilege escalation prompt:" error.

* Update docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 17:30:55 +01:00
Bogdan Dobrelya 8cc84e132a Add tags
Add tags to allow more granular tasks filtering.
Add generator script for MD formatted tags found.
Add docs for tags how-to.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 12:14:28 +01:00
Bogdan Dobrelya 710d5ae48e Merge pull request #691 from adidenko/calico-old-cni-fix
Fix possible problems with legacy calicoctl
2016-12-08 12:00:08 +01:00
Matthew Mosesohn bfc9bcb8c7 Force hardlink for calico/canal certs
Fixes: #669
2016-12-07 19:03:22 +03:00
Aleksandr Didenko c9290182be Fix possible problems with legacy calicoctl
When running legacy calicoctl we do not specify calico hostname in
calico-node container thus we should not specify it in CNI config.

Also move 'legacy_calicoctl' set_fact task to the top.
2016-12-07 12:26:44 +01:00
Bogdan Dobrelya 36fe2cb5ea Merge pull request #584 from chadswen/docker-options-refactor
Docker Options Refactor
2016-12-07 07:57:53 +01:00
Aleksandr Didenko b0079ccd77 Calico: fix peering with routers for new version
In new `calicoctl` version nodes peering with routers is broken.
We need to use predictable node names for calico-node and the
same names in calico `bgpPeer` resources and CNI.
2016-12-06 17:17:39 +01:00
Aleksandr Didenko f1d7af11ee Update calico-node systemd unit
New calicoctl does not support --detach=false option, so we should
use a recommended way to run calico-node service:
http://docs.projectcalico.org/v2.0/usage/configuration/as-service

Closes #674, #675
2016-12-06 11:34:12 +01:00
Chad Swenson 8b5b27bb51 Docker Options Refactor 2016-12-02 15:07:51 -06:00
Aleksandr Didenko ff7d489f2d Update calico/ctl image tag
We no longer need to use v0.22.0 for calicoctl since Kargo has
support for new calicoctl CLI format.

Also fixing condition logic for calico pool task.
2016-11-25 11:23:27 +01:00
Artem Panchenko 2c4b11f321 Fix Calico jinja template (systemd) 2016-11-23 11:43:53 +02:00
Bogdan Dobrelya dff78f616e Allow pre-downloaded images to be used effectively
According to http://kubernetes.io/docs/user-guide/images/ :
By default, the kubelet will try to pull each image from the
specified registry. However, if the imagePullPolicy property
of the container is set to IfNotPresent or Never, then a local\
image is used (preferentially or exclusively, respectively).

Use IfNotPresent value to allow images prepared by the download
role dependencies to be effectively used by kubelet without pull
errors resulting apps to stay blocked in PullBackOff/Error state
even when there are images on the localhost exist.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-22 16:16:04 +01:00
Bogdan Dobrelya dc58159d16 Merge pull request #621 from xenolog/calico_network_backend
Add ability to define network backend for Calico.
2016-11-22 14:55:47 +01:00
Bogdan Dobrelya 66f27ed1f3 Download images as dependencies of roles
Pre download all required container images as roles' deps.
Drop unused flannel-server-helper images pre download.
Improve pods creation post-install test pre downloaded busybox.
Improve logs collection script with kubectl describe, fix sudo/etcd/weave
commands.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-22 11:13:57 +01:00
Sergey Vasilenko f6d69d0a00 Add ability to define network backend for Calico.
This patch introduce `calico_network_backend` global variable,
which allow to describe alternative network backend.
Default behavior is unchanged.
2016-11-18 16:38:18 +03:00
Aleksandr Didenko e3470b28c5 Move CNI config and add MTU support for calico-cni
- Move CNI configuration creation for Calico to appropriate
network_plugin role from kubernetes/node.
- Add support for MTU configuration in Calico.
2016-11-15 18:05:11 +01:00
Bogdan Dobrelya e587e82f7f Merge pull request #600 from adidenko/calico-cni-container-support
Replace calico-cni binaries with calico/cni container
2016-11-15 15:40:13 +01:00
Bogdan Dobrelya 876c4df1b6 Fix mountflags and kubelet config
Add missing --require-kubeconfig to the if..else stanza.
Make sure certs dirs mounted in RO.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-15 11:22:23 +01:00
Aleksandr Didenko caa81f3ac2 Fix etcd ssl for canal
- Move CNI configuration from `kubernetes/node` role to
`network_plugin/canal`
- Create SSL dir for Canal and symlink etcd SSL files
- Add needed options to `canal-config` configmap
- Run flannel and calico-node containers with proper configuration
2016-11-14 14:49:17 +01:00
Aleksandr Didenko 965a1234d3 Replace calico-cni binaries with calico/cni container
Calico CNI binaries are also released/shipped in calico/cni
container. This patch replaces download of calico CNI binaries with
calico/cni container.
2016-11-14 12:19:58 +01:00
Artem Panchenko c58bd33af7 Support new version of 'calicoctl' (>=v1.0.0)
Since version 'v1.0.0-beta' calicoctl is written
in Go and its API differs from old Python based
utility. Added support of both old and new version
of the utility.
2016-11-10 17:11:29 +02:00
Matthew Mosesohn fe16fecd8f Fix canal's calico networking config for ETCD TLS
Also fixes kube-apiserver upgrade that was erroneously
deleted in a previous commit.
2016-11-10 12:49:47 +03:00
Matthew Mosesohn 9ea9604b3f Merge pull request #591 from kubernetes-incubator/etcdtls
Add etcd tls support
2016-11-10 12:32:13 +03:00
Matthew Mosesohn a32cd85eb7 Add etcd TLS support 2016-11-09 18:38:28 +03:00
Matthew Mosesohn 95b460ae94 Remove etcd-proxy from all nodes and use etcd multiaccess 2016-11-09 13:31:12 +03:00
Aleksandr Didenko 60a217766f Add ConfigMap for basic configuration options
Container settings moved from deamonset yaml to a separate
configmap.
2016-11-08 12:57:34 +01:00
Aleksandr Didenko 309240cd6f Adding support for canal network plugin
This patch provides support for Canal network plugin installation
as a self-hosted app, see the following link for details:

https://github.com/tigera/canal/tree/master/k8s-install
2016-11-08 11:04:01 +01:00
Smana 056f4b6c00 upgrade to kubernetes version 1.4.0
test to change the machine type

Revert "test to change the machine type"

This reverts commit 7a91f1b5405a39bee6cb91940b09a0b0f9d3aee1.

use google dns server when no upstream dns are defined

comment upstream_dns_servers

update documentation

remove deprecated kubelet flags

Revert "remove deprecated kubelet flags"

This reverts commit 21e3b893c896d0291c36a07d0414f4cb88b8d8ac.
2016-10-10 22:44:47 +02:00
Smaine Kahlouch 9ca374a88d Merge pull request #491 from kubespray/calicopools
Allow calico to configure pool if tree exists, but no pools defined
2016-10-05 17:12:26 +02:00
Aleksandr Didenko 2b6866484e Allow to use custom "canalized" calico cni
- Allow to overwrite calico cni binaries copied from hyperkube
  by the custom ones.
- Fix calico-ipam deployment (it had wrong source in rsync)
- Make copy from hyperkube idempotent (use rsync instead of cp)
- Remove some orphaned comments
2016-09-28 18:09:20 +02:00
Anthony Haussmann 550bda951e Change method to set use_hyperkube_cni var bool
The precedent method returb a string "True\n" or "False\n", it seems to be an Ansible bug.
New method return a boolean
2016-09-27 16:41:09 +02:00
Matthew Mosesohn a3fe1e78df Copy hyperkube CNI plugins when using weave 2016-09-26 12:02:19 +03:00
Matthew Mosesohn a93639650f Allow calico to configure pool if tree exists, but no pools defined 2016-09-19 15:27:47 +03:00
Bogdan Dobrelya 5ed3916f82 Fix use_hyperkube_cni logic
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-16 13:07:04 +02:00
Bogdan Dobrelya 390764c2b4 Add retry_stagger var for failed download/pushes.
* Add the retry_stagger var to tweak push and retry time strategies.
* Add large deployments related docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-15 16:43:58 +02:00
Bogdan Dobrelya 422428908a Download containers and save all
Move version/repo vars to download role.
Add container to download params, which overrides url/source_url,
if enabled.
Fix networking plugins download depending on kube_network_plugin.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-15 16:43:56 +02:00
Matthew Mosesohn b69d5f6e6e Fix logic handling for use_hyperkube_cni 2016-09-15 16:09:40 +03:00
Smaine Kahlouch 125cb0aa64 Merge pull request #481 from bogdando/issue/479
Add retries for copying binaries from containers and packages
2016-09-14 10:04:32 +02:00
Bogdan Dobrelya 6fdcaa1a63 Add retries for copying binaries from containers
Closes issue: https://github.com/kubespray/kargo/issues/479

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-13 15:09:34 +02:00
Anthony Haussmann d47a2d03b4 Delete default variable use_hyperkube_cni
The variable is now set via a task depending of the version of kube
2016-09-13 14:59:50 +02:00
Anthony Haussmann 739cf59953 Determine hyperkube cni to use
Starting from version 1.3.4 of hyperkube, calico is "canalized" which requires flannel and hostonly cni plugins.So we let hyperkube ship necessary cni
2016-09-13 14:58:29 +02:00
Smaine Kahlouch 69f09e0f18 Merge pull request #461 from kubespray/issue-369
Issue 369
2016-08-31 15:09:33 +02:00
Matthew Mosesohn 26a0406669 Disable calicoctl from creating a default pool
Sometimes invoking calicoctl to create a pool also
creates a default pool, which causes errors in deploy.
2016-08-31 12:54:05 +03:00
Spencer Smith a746d63177 ensure docker.service.d exists 2016-08-30 09:34:34 -07:00
Spencer Smith 0fc5e70c18 incorrect file name 2016-08-30 09:26:14 -07:00
Spencer Smith b74c2f89f0 lay down a systemd dropin instead of the /run/flannel_docker_opts.env symlink 2016-08-30 09:17:41 -07:00
Matthew Mosesohn b92404fd0a Enable customization of calico-node docker image
New vars: calico_node_image_repo and claico_node_image_tag
Defaults: calico/node and {{ calico_version }}, respectively
2016-08-27 16:25:39 +04:00
Bogdan Dobrelya 8168689caa Refactor roles and hosts
Shorten deployment time with:
- Remove redundand roles if duplicated by a dependency and vice versa
- When a member of k8s-cluster, always install docker as a dependency
  of the etcd role and drop the docker role from cluster.yaml.
- Drop etcd and node role dependencies from master role as they are
  covered by the node role in k8s-cluster group as well. Copy defaults
  for master from node role.
- Decouple master, node, secrets roles handlers and vars to be used w/o
  cross references.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-08-25 13:27:57 +02:00
Matthew Mosesohn f073ee91ea Copy hyperkube cni plugins optionally for calico deployment
Hyperkube from CoreOS now ships with all binaries required for
calico and flannel (but not weave). It simplifies deployment for
some network plugin scenarios to not download CNI images.

TODO: Optionally disable downloading calico to /opt/cni/bin
2016-08-10 15:35:53 +03:00
Bogdan Dobrelya d2c57142d3 Fix calico-node service unit
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-08-08 12:06:32 +02:00
Matthew Mosesohn e8a1c7a53f Move docker systemd unit creation to docker role
Creating the unit using default settings early on
and then changing it during network_plugin section
leads to too many docker restarts and duplicated code.

Reversed Wants= dependence on docker.service so it does not
restart docker when reloading systemd

Consolidated all docker restart handlers.
2016-08-02 17:56:24 +03:00
Bogdan Dobrelya 2af71f31b4 Rework systemd service units
* Add for docker system units:
    ExecReload=/bin/kill -s HUP $MAINPID
    Delegate=yes
    KillMode=process.
* Add missed DOCKER_OPTIONS for calico/weave docker systemd unit.
* Change Requires= to a less strict and non-faily Wants=, add missing
  Wants= for After=.
* Align wants/after in a wat if Wants=foo, After= has foo as well.
* Make wants/after docker.service to ask for the docker.socket as well.
* Move "docker rm -f" commands from ExecStartPre= to ExecStopPost=.
  hooks to ensure non-destructive start attempts issued by Wants=.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-08-02 10:55:42 +02:00
Matthew Mosesohn c7fef6cb76 Fix weave deployment task names 2016-07-30 23:12:41 +04:00
Antoine Legrand 6a7308d5c7 Merge pull request #372 from adidenko/calico-ipip-support
Support --ipip option for calico pool
2016-07-29 08:05:00 -07:00
Antoine Legrand 4419662fa0 Merge pull request #330 from jonbec/master
Add settable flannel image tag & image repo
2016-07-29 08:02:18 -07:00
Matthew Mosesohn 5668e5f767 Fix etcd restart and handler systemd tasks
Changed Wants=docker.service to docker.socket

Renamed handlers for reloading systemd to contain role in task name.
2016-07-29 16:32:35 +03:00
Aleksandr Didenko c52c5f5056 Add run_once to define calico pool task name 2016-07-27 15:55:41 +02:00
Bogdan Dobrelya 731d32afda Add HA/LB endpoints for kube-apiserver
* Add HA docs for API server.
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and usecases.
* Use facts for kube_apiserver to not repeat code and enable LB endpoints use.
* Use /healthz check for the wait-for apiserver.
* Use the single endpoint for kubelet instead of the list of apiservers
* Specify kube_apiserver_count to for HA layout

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-07-25 17:25:45 +02:00
Matthew Mosesohn d0a1e15ef3 Deploy kubelet and kube-apiserver as containers
kubelet via docker
kube-apiserver as a static pod

Fixed etcd service start to be more tolerant of slow start.

Workaround for kube_version to stay in download role, but not
download an files by creating a new "nothing" download entry.
2016-07-22 16:42:34 +03:00
Aleksandr Didenko f94eb0b997 Support --ipip option for calico pool
Adds new boolean configuration variable for calico network plugin
`ipip`. When it's enabled calico pool is created with '--ipip'
option (IP-over-IP encapsulation across hosts).

Also refactor pool creation tasks to simplify logic and make tasks
more readable.
2016-07-21 13:05:40 +02:00
Matthew Mosesohn 7a86b6c73e Set default etcd deployment to docker
Improved docker reload command to wait for etcd to be
up before proceeding. Switched reload to run restart
because it can't reload if it is not guaranteed to be
in running state.
2016-07-20 18:26:16 +03:00
Bogdan Dobrelya 32cd6e99b2 Add etcd proxy support
* Enforce a etcd-proxy role to a k8s-cluster group members. This
provides an HA layout for all of the k8s cluster internal clients.
* Proxies to be run on each node in the group as a separate etcd
instances with a readwrite proxy mode and listen the given endpoint,
which is either the access_ip:2379 or the localhost:2379.
* A notion for the 'kube_etcd_multiaccess' is: ignore endpoints and
loadbalancers and use the etcd members IPs as a comma-separated
list. Otherwise, clients shall use the local endpoint provided by a
etcd-proxy instances on each etcd node. A Netwroking plugins always
use that access mode.
* Fix apiserver's etcd servers args to use the etcd_access_endpoint.
* Fix networking plugins flannel/calico to use the etcd_endpoint.
* Fix name env var for non masters to be set as well.
* Fix etcd_client_url was not used anywhere and other etcd_* facts
evaluation was duplicated in a few places.
* Define proxy modes only in the env file, if not a master. Del
an automatic proxy mode decisions for etcd nodes in init/unit scripts.
* Use Wants= instead of Requires= as "This is the recommended way to
hook start-up of one unit to the start-up of another unit"
* Make apiserver/calico Wants= etcd-proxy to keep it always up

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
2016-07-19 14:09:40 +02:00
Jonathan Beckman d4dfdf68a6 Add settable flannel image tag & image repo
New settings with defaults:
flannel_server_helper_image_repo: "gcr.io/google_containers/"
flannel_server_helper_image_tag: "0.1"
flannel_image_repo: "quay.io/coreos/flannel"
flannel_image_tag: "0.5.5"
2016-07-11 13:18:20 +08:00
Smaine Kahlouch 83da5d7657 Merge pull request #335 from mattymo/calicoctl
Change calicoctl deployment to use container
2016-07-07 21:47:40 +02:00
Alexandre Bourget 3b7eaf66b6 flanneld: don't redirect logs to an unreadable location, let docker/k8s see
and aggregate them.
2016-07-06 16:25:11 -04:00
Matthew Mosesohn baf80b7d7e Change calicoctl deployment to use container
Improves upgradability of calicoctl by leveraging docker tags.
2016-07-05 13:49:03 +03:00
Chris Bell 9e59c74c24 Maintain backwards compatibility with EL6 2016-06-22 09:51:49 -04:00
Chris Bell d94253ff6a Modify calico docker.service 2016-06-22 09:44:31 -04:00
Matthew Mosesohn 153b82a803 Add docker_options to calico networking 2016-06-14 19:33:44 +03:00
Smana 4a7d8c6fea clean conditions into docker templates 2016-06-02 21:01:41 +02:00
Spencer Smith 87757d4fcf provides initial docker options support 2016-05-25 12:56:45 -04:00
Paul Czarkowski 7de87d958e turn adduser/download roles into meta roles
This should make things a little more composable,
by making these roles meta roles that perform no
actions by default we allow each role to own its own
resources.
2016-05-22 17:25:52 -05:00
Smana ae5ff890d4 fix flannel deployment, remove docker bridge before restarting 2016-05-13 18:10:00 +02:00
Paul Czarkowski bd064e8094 fix flannel's cross vm networking for vagrant
* set flannel backend type to `host-gw`
* set flannel interface to be eth1 ip
2016-05-08 23:42:42 -05:00
Smaine Kahlouch 68fafd030d choose between gce and aws cloud providers 2016-03-23 17:27:06 +01:00
Smana ede3aad2ab flannel backend type option 2016-03-04 14:55:04 +01:00
Smana 152c409022 calico: enabling nat outgoing by default 2016-02-21 17:11:49 +01:00
Smana fca384e24c first version of CoreOS on GCE
Please enter the commit message for your changes. Lines starting
2016-02-21 00:06:36 +01:00
Smana a649aa8b7e use ansible_service_mgr to detect init system 2016-02-13 11:46:53 +01:00
Smana 91fca69aa0 generate secrets on deployment machine
test travis with sudo=true instead of required
2016-02-13 06:51:54 +01:00
Smana 793d665db4 specify weave version 2016-02-10 18:19:03 +01:00
Smana ab007e4ab8 weave network plugin 2016-02-09 17:55:12 +01:00
Smaine Kahlouch 4f92417a5d split network plugins into distinct roles 2016-02-09 11:42:00 +01:00
Smaine Kahlouch 779299de15 calico uses --ip option 2016-02-01 15:53:23 +01:00
Smaine Kahlouch 3bb6066558 add option '--nat-outgoing' for calico on clouds 2016-02-01 10:47:34 +01:00
Greg Althaus 6163fe166e Update docker for CentOS issues in AWS and general
variables.

1. AWS has issues with ext4 (use xfs instead for CentOS only)
2. Make sure all the centos config files are include in the systemd config
3. Make sure that network options are set in the correct file by os family

This allows downstream items like opencontrail and others change variables
in expected locations.
2016-01-30 21:46:32 -06:00
Antoine Legrand b33713da4a Change calico condition --ipip 2016-01-29 14:07:21 +01:00
Antoine Legrand 83c1bd516d Update calico.yml 2016-01-29 12:23:29 +01:00
Antoine Legrand 5d24cabc83 Merge pull request #116 from ansibl8s/calico_on_cloud
Add --ipip to calico if on cloud_proivder
2016-01-28 20:28:15 +01:00
Antoine Legrand 7127e6de54 Add --ipip to calico if on cloud_proivder 2016-01-28 20:13:50 +01:00
Greg Althaus bedcca922c Add variables and defaults for multiple types of ip addresses.
Each node can have 3 IPs.
1. ansible_default_ip4 - whatever ansible things is the first IPv4 address
   usually with the default gw.
2. ip - An address to use on the local node to bind listeners and do local
   communication.  For example, Vagrant boxes have a first address that is the
   NAT bridge and is common for all nodes.  The second address/interface should
   be used.
3. access_ip - An address to use for node-to-node access.  This is assumed to
   be used by other nodes to access the node and may not be actually assigned
   on the node.  For example, AWS public ip that is not assigned to node.

This updates the places addresses are used to use either ip or access_ip and walk
up the list to find an address.
2016-01-27 16:05:39 -06:00
Smaine Kahlouch a323335d36 use 'kube_pods_subnet' var for flannel conf 2016-01-27 22:00:12 +01:00
ant31 fd6ac61afc Use local etcd/etcdproxy for calico 2016-01-26 17:28:30 +01:00
Antoine Legrand b9781fa7c2 Symlink dnsmasq conf 2016-01-26 00:30:29 +01:00
Smaine Kahlouch 90ffb8489a fix some handlers 2016-01-25 22:49:24 +01:00
ant31 56b92812fa Fix systemd reload and calico unit 2016-01-25 10:54:07 +01:00
Greg Althaus bcd6ecb7fb Add flannel vars to enable vagrant and amazon environments 2016-01-24 16:18:35 +01:00
Smaine Kahlouch 4984b57aa2 use rsync instead of command 2016-01-23 18:26:07 +01:00
Smaine Kahlouch cb59559835 use command instead of synchronize 2016-01-22 16:37:07 +01:00
Antoine Legrand 078b67c50f Remove downloader host 2016-01-22 09:59:39 +01:00
Antoine Legrand 859f6322a0 Merge branch 'master' into add_set_remote_user 2016-01-19 21:08:52 +01:00
Antoine Legrand f68d8f3757 Add seT_remote_user in synchronize 2016-01-19 14:20:05 +01:00
Smaine Kahlouch 7cab7e5fef restarting kubelet is sometimes required after docker restart 2016-01-19 13:47:07 +01:00
Smaine Kahlouch 8415634016 use google hyperkube image 2016-01-16 22:55:49 +01:00