56 commits

Author SHA1 Message Date
Bogdan Dobrelya
4fea28fe7b Manual steps for Gitlab CI pipeline
* Reduce default testcase to 2 nodes, add HA case.
* Adjust gen_matrix script for Travis/Gitlab CIs.
* Enable netchecker deploy for gitlab CI.
* Sync other things from travis matrix and reorder them as build steps
  for pull requests, master branch, auto/manual.
* Do auto-step1 from part1 and manual step2,3 for branches/PRs.
* Do manual steps from part2, special for master merges.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-15 17:23:18 +01:00
Bogdan Dobrelya
f635d224ed Merge pull request #721 from adidenko/calico-add-rr
Add calico/routereflector support
2016-12-14 17:22:00 +01:00
Bogdan Dobrelya
a80cb5647e Rebalance CI GCE zones for better CPU per region usage
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-14 16:23:57 +01:00
Aleksandr Didenko
d5a9b34d9e Add calico/routereflector support
Add BGP route reflectors support in order to optimize BGP topology
for deployments with Calico network plugin.

Also bump version of calico/ctl for some bug fixes.
2016-12-14 13:44:10 +01:00
Antoine Legrand
1c34637b01 Merge pull request #730 from vwfs/azurerm
Add Azure Resource Group templates and scripts to contrib
2016-12-13 17:07:41 +01:00
Alexander Block
8b9c6164a3 Add documentation link for contrib/azurerm 2016-12-13 16:30:52 +01:00
Bogdan Dobrelya
bab6ec8477 Fix resolvconf
Do not repeat options and nameservers in the dhclient hooks.
Do not prepend nameservers for dhclient but supersede and fall back
to the upstream_dns_resolvers then default_resolver. Fixes the order of
nameserver placement, so that the cluster DNS IP always goes first.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-13 15:48:53 +01:00
Bogdan Dobrelya
659b482b62 Merge pull request #667 from bogdando/fix_dns
Rework DNS stack to meet hostnet pods needs
2016-12-12 21:38:13 +01:00
Bogdan Dobrelya
8679f10f71 Rework DNS stack to meet hostnet pods needs
* For Debian/RedHat OS families (with NetworkManager/dhclient/resolvconf
  optionally enabled) prepend /etc/resolv.conf with required nameservers,
  options, and supersede domain and search domains via the dhclient/resolvconf
  hooks.

* Drop (z)nodnsupdate dhclient hook and re-implement it to complement the
  resolvconf -u command, which is distro/cloud provider specific.
  Update docs as well.

* Enable network restart to apply and persist changes and simplify handlers
  to rely on network restart only. This fixes DNS resolve for hostnet K8s
  pods for Red Hat OS family. Skip network restart for canal/calico plugins,
  until https://github.com/projectcalico/felix/issues/1185 is fixed.

* Replace lineinfile plus with_items with block mode as it's faster.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>
2016-12-12 17:43:47 +01:00
Bogdan Dobrelya
5dd0caf1b5 Merge branch 'master' into tags_download 2016-12-12 11:44:00 +01:00
Bogdan Dobrelya
aefe4a99d2 Preconfigure DNS stack and docker early
In order to enable offline/intranet installation cases:
* Move DNS/resolvconf configuration to preinstall role. Remove
  skip_dnsmasq_k8s var as not needed anymore.

* Preconfigure DNS stack early, which may be the case when downloading
  artifacts from intranet repositories. Do not configure
  K8s DNS resolvers in the hosts' /etc/resolv.conf at this early stage
  (as they may not exist yet).

* Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq
  was set up and before K8s apps are created.

* Move docker install task to early stage as well and unbind it from the
  etcd role's specific install path. Fix external flannel dependency on
  docker role handlers. Also fix the docker restart handlers' steps
  ordering to match the expected sequence (the socket then the service).

* Add a default resolver fact, which is cloud provider specific,
  and remove the hardcoded GCE resolver.

* Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search
  domains combined with high ndots values lead to poor performance of
  the DNS stack and make ansible workers fail very often with the
  "Timeout (12s) waiting for privilege escalation prompt:" error.

* Update docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 17:30:55 +01:00
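Note on the ndots change in aefe4a99d2: the sketch below illustrates why a high ndots value combined with several search domains multiplies DNS lookups. It models the usual stub-resolver rule: a name with fewer dots than ndots is tried against every search domain before being tried as-is. The function name `candidate_queries` and the search domains are hypothetical and chosen for illustration only; they are not taken from the repository.

```python
def candidate_queries(name, search_domains, ndots):
    """Return the lookup order a stub resolver would typically try.

    Assumption: standard resolv.conf semantics, where a name containing
    at least `ndots` dots is tried as an absolute name first.
    """
    if name.endswith("."):
        # Already fully qualified: the search list is skipped entirely.
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain}." for domain in search_domains]
    if name.count(".") >= ndots:
        return [absolute] + searched
    return searched + [absolute]


# Hypothetical search list, similar in shape to a K8s host setup.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for ndots in (5, 2):
    queries = candidate_queries("api.github.com", search, ndots)
    wasted = queries.index("api.github.com.")
    print(f"ndots={ndots}: {wasted} lookups before the real name is tried")
```

With ndots set to 2, an external name such as api.github.com is resolved on the first attempt instead of after one failed lookup per search domain.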
Bogdan Dobrelya
10383c88ee More granular control for download/upload images/binaries
Add an upload tag to allow users to exclude distributing images across nodes
when running with the download tag set.
Add related tags and update docs as well.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 17:04:55 +01:00
Bogdan Dobrelya
0b1ce03167 Add tags
Add tags to allow more granular tasks filtering.
Add generator script for MD formatted tags found.
Add docs for tags how-to.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-09 12:14:28 +01:00
Matthew Mosesohn
ddf2200f2a Merge pull request #693 from kubernetes-incubator/upgrades-doc
Add document outlining upgrade process
2016-12-08 13:02:55 +03:00
Bogdan Dobrelya
1b17efee19 Merge pull request #692 from bogdando/gce_fixes
Change GCE sysctls placement and docs
2016-12-07 16:17:30 +01:00
Matthew Mosesohn
cf20a61e68 Add document outlining upgrade process 2016-12-07 16:33:08 +03:00
Bogdan Dobrelya
965b27e48e Change GCE sysctls placement and docs
Override GCE sysctl in /etc/sysctl.d/99-sysctl.conf instead of
the /etc/sysctl.d/11-gce-network-security.conf. The latter is recreated
by GCE, e.g. if gcloud CLI invokes some security related changes,
thus losing customizations we want to be persistent.

Update cloud providers firewall requirements in calico docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-12-07 12:53:45 +01:00
Aleksandr Didenko
992fcd1680 Calico: fix peering with routers for new version
In new `calicoctl` version nodes peering with routers is broken.
We need to use predictable node names for calico-node and the
same names in calico `bgpPeer` resources and CNI.
2016-12-06 17:17:39 +01:00
Bogdan Dobrelya
aafdbebd48 Reduce CI test matrix
Reduce the test cases from 15 to 9, bearing in mind that:
* Disable weave/coreos gate until its deployment is fixed
* If debian/centos7 fails with net plugin X, ubuntu-xenial/rhel-7 will
  likely fail as well
* Canal also covers the flannel plugin deployment, but keep at least one
  of the flannel plugin deployments, unless it's superseded and removed.
* Keep at least one of each OS/plugin family to be tested in the separate
  nodes layout
* Keep at least one of each OS family to be tested against each of the
  plugin types in default nodes layout
* Rebalance GCE regions for instances, replace asia with eu/us as they
  are the longest running jobs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-29 18:53:43 +01:00
Antoine Legrand
6630348164 Merge pull request #657 from smelchior/master
add azure support for kargo
2016-11-29 12:20:49 +01:00
Bogdan Dobrelya
d4305d1a64 Switch to standard debian/centos/rhel for CI
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-29 10:25:07 +01:00
Sebastian Melchior
254e02c69e add basic azure support for kargo 2016-11-29 10:20:28 +01:00
Antoine Legrand
f75e2c5119 Merge pull request #529 from bogdando/netcheck
Add a k8s app for advanced e2e netcheck for DNS
2016-11-28 15:26:30 +01:00
Bogdan Dobrelya
d5b21b34c2 Add advanced net check for DNS K8s app
* Add an option to deploy K8s app to test e2e network connectivity
  and cluster DNS resolve via Kubedns for nethost/simple pods
  (defaults to false).
* Parametrize existing k8s apps templates with kube_namespace and
  kube_config_dir instead of hardcode.
* For CoreOS, ensure nameservers from inventory to be put in the
  first place to allow hostnet pods connectivity via short names
  or FQDN and hostnet agents to pass as well, if netchecker
  deployed.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-28 13:23:25 +01:00
Bogdan Dobrelya
c34c49d4d9 Tune dnsmasq/kubedns limits, replicas, logging
* Add dns_replicas, dns_memory/cpu_limit/requests vars for
dns related apps.
* When kube_log_level=4, log dnsmasq queries as well.
* Add log level control for skydns (part of kubedns app).
* Add limits/requests vars for dnsmasq (part of kubedns app) and
  dnsmasq daemon set.
* Drop string defaults for kube_log_level as it is int and
  is defined in the global vars as well.
* Add docs

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-25 12:49:17 +01:00
Bogdan Dobrelya
417a931f78 Fix download dnsmasq image dependency on docker
When download_run_once with download_localhost is used, docker is
expected to be running on the delegate localhost. That may not be
the case for a non localhost delegate, which is the kube-master
otherwise. Then the dnsmasq role, had it been invoked early before
deployment starts, would fail because of the missing docker dependency.

* Fix that dependency on docker and do not pre download dnsmasq image
  for the dnsmasq role, if download_localhost is disabled.
* Remove become: false for docker CLI invocation because that's not
  the common pattern to allow users to access docker CLI w/o sudo.
* Fix opt bin path hack for localhost delegate to ignore errors when
  it fails with "sudo password required" otherwise.
* Describe download_run_once with download_localhost use case in docs
  as well.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-24 18:31:26 +01:00
Bogdan Dobrelya
539a47b0fa Merge pull request #621 from xenolog/calico_network_backend
Add ability to define network backend for Calico.
2016-11-22 14:55:47 +01:00
Sergey Vasilenko
e73c86c6aa Add ability to define network backend for Calico.
This patch introduces the `calico_network_backend` global variable,
which allows defining an alternative network backend.
Default behavior is unchanged.
2016-11-18 16:38:18 +03:00
Alexandre Bourget
ecf40bc377 Update roadmap.md 2016-11-17 12:44:30 -05:00
Bogdan Dobrelya
2514d04c6b Improve CI test matrix
For Travis CI and GCE, add a naive script that generates a markdown table.
Add GCE/Travis CI matrix docs.
Add CoreOS test cases.
Rework existing cases w/o losing coverage.
Rework postinstall tests to support CoreOS as well.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-11-15 18:17:30 +01:00
Artem Panchenko
9d0a79a777 Support new version of 'calicoctl' (>=v1.0.0)
Since version 'v1.0.0-beta' calicoctl is written
in Go and its API differs from old Python based
utility. Added support of both old and new version
of the utility.
2016-11-10 17:11:29 +02:00
Matthew Mosesohn
b8ca4e4f45 Remove etcd-proxy from all nodes and use etcd multiaccess 2016-11-09 13:31:12 +03:00
Smaine Kahlouch
a8155ee35f Merge pull request #554 from bogdando/kubeadm_adoption
Update roadmap for the kubeadm LCM track
2016-10-18 13:52:55 +02:00
Smana
9d2f161430 update roadmap, kubeadm adoption 2016-10-18 13:51:36 +02:00
Bogdan Dobrelya
72a25bfd06 Update roadmap for the kubeadm LCM track
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-10-18 13:44:45 +02:00
Bogdan Dobrelya
dd82ed2a1f Update ha docs
Fix mismatch in code and docs, see
https://github.com/kubespray/kargo/pull/528

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-10-17 15:42:30 +02:00
Smaine Kahlouch
9df4502909 Merge pull request #528 from kubespray/proxy-nginx
Use nginx proxy on non-master nodes to proxy apiserver traffic
2016-10-05 19:19:32 +02:00
Matthew Mosesohn
73066f308d use nginx proxy on non-master nodes to proxy apiserver traffic
Also adds all masters by hostname and localhost/127.0.0.1 to
apiserver SSL certificate.

Includes documentation update on how localhost loadbalancer works.
2016-10-05 20:09:10 +03:00
keglevich3
52cdb911f7 changed to the correct link 2016-09-29 17:44:24 +03:00
Bogdan Dobrelya
6ab133d0a3 Allow subdomains of dns_domain and fix kubelet restarts
* Add a var for ndots (default 5) and put it into hosts' /etc/resolv.conf.
* Poke kube dns container image to v1.7
* In order to apply changes to kubelet, notify it to
be restarted on changes made to /etc/resolv.conf. Ignore errors as the kubelet
may not yet be present at the moment the notification is processed.
* Remove unnecessary kubelet restart for master role as the node role ensures
it is up and running. Notify master static pods waiters for apiserver,
scheduler, controller-manager instead.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-27 14:32:49 +02:00
Bogdan Dobrelya
14529c1ea3 Add more DNS docs
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-26 13:47:34 +02:00
Bogdan Dobrelya
8aeeb62719 Adjust DNS picture
Reflect changes made to DNS setup

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-23 17:11:07 +02:00
Bogdan Dobrelya
2908f92524 Fix docs and dns servers placement order
- Update docs and a drawing to clarify DNS setup.
- Change order of nameservers placement to match
  changes in https://github.com/kubespray/kargo/pull/501

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-23 16:16:00 +02:00
Bogdan Dobrelya
34d0c5c676 Make dnsmasq daemon set optional
Change additional dnsmasq opts:
- Adjust caching size and TTL
- Disable resolve conf to not create loops
- Change dnsPolicy to default (similarly to kubedns's dnsmasq).
  ClusterFirst should not be used, to avoid creating loops
- Disable caching of negative NXDOMAIN replies
- Make its installation an optional step (enabled by default).
  If you don't want more than 3 DNS servers, including 1 for K8s, disable
  it.
- Add docs and a drawing to clarify DNS setup.
- Fix stdout logs for dnsmasq/kubedns app configs
- Add missed notifies to resolvconf -u handler
- Fix idempotency of resolvconf head file changes

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-23 12:59:06 +02:00
Sean M. Collins
9c862a8b44 Rename large-deploymets.md to large-deployments.md
Filename was a typo
2016-09-19 11:51:37 -04:00
Bogdan Dobrelya
ae8e5908ef Add retry_stagger var for failed download/pushes.
* Add the retry_stagger var to tweak push and retry time strategies.
* Add large deployments related docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-09-15 16:43:58 +02:00
Spencer Smith
79d749b136 merge with current master, update typos in doc 2016-08-24 09:56:42 -04:00
Spencer Smith
a2fcf0be5d updated to no longer handle gce as cloud-provider. provided aws setup doc 2016-08-24 09:48:32 -04:00
Bogdan Dobrelya
575ec168a3 Add HA/LB endpoints for kube-apiserver
* Add HA docs for API server.
* Add auto-evaluated internal endpoints and clarify the loadbalancer_apiserver
vars and usecases.
* Use facts for kube_apiserver to not repeat code and enable LB endpoints use.
* Use /healthz check for the wait-for apiserver.
* Use the single endpoint for kubelet instead of the list of apiservers
* Specify kube_apiserver_count for HA layout

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-07-25 17:25:45 +02:00
Bogdan Dobrelya
7639fadaba Add ha docs
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-07-22 14:44:36 +02:00