20180504_talk_5

Baremetal k8s

Intro

Wikimedia
  • 2 Primary DCs
  • Caching DCs
  • Ganeti cluster with VMs
Reasons for k8s
  • 2014: 2 main products
  • 2018: microservices

Also
+ Elasticity
+ Node failure Mgmt
+ Containers
+ Empower Devs

BUT new challenges
* More moving parts
* New paradigms
* Containers

Why bare metal?

No public cloud:
* Users' privacy
* Already have the infra
* Costs

No private cloud:
* We run k8s on OpenStack for other projects
* Reducing moving parts (OpenStack is a massive moving part)
* No practical advantage for production services

Cluster setup

1 - Setup

We build Debian packages
* k8s 1.7
* Calico
* etcd 2.2.1

Configure with Puppet
* TLS included
* but NOT the k8s resources

API servers HA
* 2 per cluster

2 - Infra resources
  • 2 prod clusters + 1 staging
  • Separate etcd clusters (3 servers, DC-local, on Ganeti VMs)
  • Kube-proxy in iptables mode
  • Docker as runtime, BUT interested in rkt
  • We host our own registry (no Docker Hub)
  • We rebuild our images from scratch
  • Backend on Swift (config sketch below)
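
The talk only says the registry backend is Swift; as a rough sketch, a Docker Distribution (registry v2) config using its Swift storage driver could look like this — endpoint, credentials, and container name are placeholders, not Wikimedia's values:

```yaml
# Minimal registry config sketch (Docker Distribution v2, Swift driver).
# All values below are hypothetical placeholders.
version: 0.1
log:
  level: info
http:
  addr: :5000
storage:
  swift:
    authurl: https://swift.example.org/auth/v1.0  # placeholder Swift auth endpoint
    username: registry                            # placeholder credentials
    password: CHANGE_ME
    container: docker-registry                    # Swift container holding the image blobs
```
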
3 - Sec
  • RBAC enabled since day 1 (sketch after this list)
  • 1 namespace / app
  • Token auth
  • Standard admission controllers
  • Firewalling enabled on all nodes via ferm
  • This works better than we thought
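
Putting the ingredients together (RBAC, one namespace per app, token auth), a minimal sketch of confining a token-authenticated user to its app's namespace — `myapp` and `deploy-myapp` are hypothetical names, and rbac/v1beta1 matches the k8s 1.7 era:

```yaml
# Role scoped to the app's own namespace.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: app-admin
  namespace: myapp                 # one namespace per app
rules:
  - apiGroups: ["", "apps", "extensions"]
    resources: ["pods", "services", "deployments", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
# Bind the static-token user to that Role; no cluster-wide rights.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: app-admin-binding
  namespace: myapp
subjects:
  - kind: User
    name: deploy-myapp             # user from the API server's token file
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-admin
  apiGroup: rbac.authorization.k8s.io
```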

Integration

Network

Topology
* Racks + TORs
* 2 switches / rack row
* 2 routers

Others
* Machines are distributed across rack rows for redundancy
* For backward compatibility, avoid overlay networks
* Calico BGP speakers on every node peer with the 2 routers (sketch below)
* No BGP full mesh
* Pods get IPv6 thanks to Calico
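
A sketch of that peering setup in the calicoctl v1 resource format (the Calico 2.x era, contemporary with k8s 1.7) — router IPs and the AS number are placeholders:

```yaml
# One global BGP peer per router: every Calico node peers with both routers
# instead of maintaining a node-to-node full mesh.
apiVersion: v1
kind: bgpPeer
metadata:
  peerIP: 203.0.113.1   # placeholder: router 1
  scope: global
spec:
  asNumber: 64600       # placeholder AS number
---
apiVersion: v1
kind: bgpPeer
metadata:
  peerIP: 203.0.113.2   # placeholder: router 2
  scope: global
spec:
  asNumber: 64600
```

The mesh itself would then be switched off, e.g. with `calicoctl config set nodeToNodeMesh off` in that calicoctl generation.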

Network Policies
* We use them (example below)
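
The talk doesn't show a concrete policy; a generic example of the kind of thing NetworkPolicy expresses (the `networking.k8s.io/v1` API is available as of k8s 1.7; the `myapp` namespace is hypothetical):

```yaml
# Allow ingress to all pods in the namespace only from pods
# in that same namespace; everything else is dropped.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: myapp        # hypothetical app namespace
spec:
  podSelector: {}         # selects every pod in the namespace
  ingress:
    - from:
        - podSelector: {} # ...reachable only from pods of this namespace
```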

Ingress
* Evaluated but not used yet
* Instead, an in-house Python daemon running on the LBs (PyBal: https://github.com/wikimedia/PyBal)
- NodePort with ExternalIPs (sketch below)
- PyBal manages LVS-DR entries on the LBs
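
A sketch of the NodePort-plus-ExternalIPs pattern; the service name and VIP are placeholders (the VIP would be the address PyBal announces via LVS-DR):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp             # hypothetical
  namespace: myapp
spec:
  type: NodePort
  externalIPs:
    - 203.0.113.10        # placeholder VIP managed by PyBal on the LBs
  selector:
    app: myapp
  ports:
    - port: 8080          # kube-proxy answers for VIP:8080 on every node
      targetPort: 8080
```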

Metrics

Infra with Prometheus
* Polling all API servers
* Polling all kubelets
* Polling the kubelets' cAdvisor
* Discovery mechanisms rule! (example below)
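
For the discovery part, a typical Prometheus scrape job using Kubernetes service discovery (a generic example, not necessarily Wikimedia's config; cert and token paths are placeholders):

```yaml
scrape_configs:
  - job_name: kubernetes-apiservers
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/k8s-ca.crt        # placeholder path
    bearer_token_file: /etc/prometheus/k8s.token # placeholder path
    kubernetes_sd_configs:
      - role: endpoints   # discover all endpoints, then filter below
    relabel_configs:
      # keep only the default/kubernetes:https endpoints, i.e. the API servers
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
```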

Applications with Prometheus too!

Alerting

App deployments

Helm
* Tiller in each app namespace with specific RBAC (sketch below)
* A “deploy” user is allowed to talk to Tiller
* Wrapper shell script around Helm (shameful) for a kind of “federation”
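
A sketch of what per-namespace Tiller RBAC could look like — a ServiceAccount for Tiller bound to a Role that only covers its own namespace (names are hypothetical, rbac/v1beta1 matches the k8s 1.7 era):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: myapp
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: tiller-manager
  namespace: myapp
rules:
  - apiGroups: ["", "apps", "extensions"]
    resources: ["*"]
    verbs: ["*"]          # full control, but only inside this namespace
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: tiller-binding
  namespace: myapp
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: myapp
roleRef:
  kind: Role
  name: tiller-manager
  apiGroup: rbac.authorization.k8s.io
```

Tiller would then be installed per namespace, e.g. `helm init --service-account tiller --tiller-namespace myapp` with Helm 2.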

Deploy
* On both DCs
* 2 Helm releases: canary and production
* Only production declares services
* Deploy canary, run tests and check metrics (automated), then promote to production (values sketch below)
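
One way the "only production declares services" split could be expressed in chart values — entirely hypothetical, since the talk doesn't describe the chart:

```yaml
# values-canary.yaml (hypothetical): the canary release renders no Service object.
service:
  enabled: false
replicas: 1
---
# values-production.yaml (hypothetical): only this release declares the Service.
service:
  enabled: true
replicas: 8
```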