Commit 98160a6

Add initial documentation for CPLB
Signed-off-by: Juan-Luis de Sousa-Valadas Castaño <[email protected]>
1 parent f1a762a commit 98160a6

4 files changed: +336 -0 lines changed

docs/configuration.md

+23
@@ -300,6 +300,29 @@ node-local load balancing.
| `apiServerBindPort` | Port number on which to bind the Envoy load balancer for the Kubernetes API server to on a worker's loopback interface. Default: `7443`. |
| `konnectivityServerBindPort` | Port number on which to bind the Envoy load balancer for the konnectivity server to on a worker's loopback interface. Default: `7132`. |

##### `spec.network.controlPlaneLoadBalancing`

Configuration options related to k0s's [control plane load balancing] feature.

| Element         | Description                                                                                            |
| --------------- | ------------------------------------------------------------------------------------------------------ |
| `vrrpInstances` | Configuration options related to VRRP. This is an array that allows configuring multiple virtual IPs.  |

[control plane load balancing]: cplb.md

##### `spec.network.controlPlaneLoadBalancing.vrrpInstances`

Configuration options required for using VRRP to configure VIPs in control plane load balancing.

| Element           | Description                                                                                                        |
| ----------------- | ------------------------------------------------------------------------------------------------------------------ |
| `name`            | The name of the VRRP instance. If omitted, a predictable name shared across all nodes is generated.                |
| `virtualIPs`      | A list of the CIDRs handled by the VRRP instance.                                                                  |
| `interface`       | The interface used by each VRRP instance. If undefined, k0s tries to auto-detect it based on the default gateway.  |
| `virtualRouterId` | Virtual router ID for the instance. Default: `51`.                                                                 |
| `advertInterval`  | Advertisement interval in seconds. Default: `1`.                                                                   |
| `authPass`        | The password used for accessing vrrpd. This field is mandatory and must be under 8 characters long.                |
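For illustration, here is a sketch of a `vrrpInstances` entry that uses every field described above. The instance name and interface are hypothetical values, and the virtual IP is the example address used in the [control plane load balancing] guide:

```yaml
spec:
  network:
    controlPlaneLoadBalancing:
      vrrpInstances:
      - name: my-vip                        # hypothetical; omit to use a generated name
        virtualIPs: ["192.168.122.200/24"]  # CIDRs served by this instance
        interface: eth0                     # hypothetical; omit to auto-detect
        virtualRouterId: 51                 # default
        advertInterval: 1                   # default, in seconds
        authPass: Example                   # mandatory, under 8 characters
```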
### `spec.controllerManager`

| Element | Description |

docs/cplb.md

+310
@@ -0,0 +1,310 @@
# Control plane load balancing

For clusters that don't have an [externally managed load balancer](high-availability.md#load-balancer) for the k0s
control plane, there is another option to get a highly available control plane called control plane load balancing (CPLB).

CPLB allows automatic assignment of predefined IP addresses using VRRP across the controller nodes.

## Technical functionality

The k0s control plane load balancer provides k0s with virtual IPs on each
controller node. This allows the control plane to be highly available as
long as the network infrastructure allows multicast and GARP.

[Keepalived](https://www.keepalived.org/) is the only load balancer that is
supported so far, and currently there are no plans to support other alternatives.
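Because VRRP advertisements are multicast packets using IP protocol 112 (see [networking](networking.md)), a quick way to check that they can flow between controllers is to capture them directly. A minimal sketch, assuming the controllers use `eth0`:

```shell
# Watch for VRRP advertisements (IP protocol 112, multicast to 224.0.0.18).
# eth0 is an assumed interface name; adjust it to your environment.
sudo tcpdump -i eth0 -n 'ip proto 112'
```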
## Enabling in a cluster

In order to use control plane load balancing, the cluster needs to comply with the
following:

* K0s isn't running as a [single node](k0s-single-node.md), i.e. it isn't
  started using the `--single` flag.
* The cluster should have multiple controller nodes. Technically CPLB also works
  with a single controller node, but it is only useful in conjunction with a highly
  available control plane.
* Unique `virtualRouterId` and `authPass` for each cluster in the same broadcast domain.
  These do not provide any sort of security against ill-intentioned attacks; they are
  safety features to prevent accidental misconfiguration.
Add the following to the cluster configuration (`k0s.yaml`):

```yaml
spec:
  api:
    externalAddress: <External address> # This isn't a requirement, but it's a common use case.
  network:
    controlPlaneLoadBalancing:
      vrrpInstances:
      - virtualIPs: ["<External address IP>/<external address IP netmask>"]
        authPass: <password>
```

Or alternatively, if using [`k0sctl`](k0sctl-install.md), add the following to
the k0sctl configuration (`k0sctl.yaml`):

```yaml
spec:
  k0s:
    config:
      spec:
        api:
          externalAddress: <External address> # This isn't a requirement, but it's a common use case.
        network:
          controlPlaneLoadBalancing:
            vrrpInstances:
            - virtualIPs: ["<External address IP>/<external address IP netmask>"]
              authPass: <password>
```

Because this is a feature intended to configure the apiserver, CPLB does not
support dynamic configuration; in order to make changes, you need to restart
the k0s controllers.
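For example, to apply a CPLB configuration change on a controller, restart the k0s service after editing `k0s.yaml` (a sketch assuming k0s was set up as a service with `k0s install controller`):

```shell
# Restart the k0s controller service so the new configuration takes effect.
# The k0scontroller service name is created by `k0s install controller`.
sudo systemctl restart k0scontroller
```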
## Full example using `k0sctl`

The following example shows a full `k0sctl` configuration file featuring three
controllers and three workers with control plane load balancing enabled:

```yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
  - role: controller
    ssh:
      address: controller-0.k0s.lab
      user: root
      keyPath: ~/.ssh/id_rsa
    k0sBinaryPath: /opt/k0s
    uploadBinary: true
  - role: controller
    ssh:
      address: controller-1.k0s.lab
      user: root
      keyPath: ~/.ssh/id_rsa
    k0sBinaryPath: /opt/k0s
    uploadBinary: true
  - role: controller
    ssh:
      address: controller-2.k0s.lab
      user: root
      keyPath: ~/.ssh/id_rsa
    k0sBinaryPath: /opt/k0s
    uploadBinary: true
  - role: worker
    ssh:
      address: worker-0.k0s.lab
      user: root
      keyPath: ~/.ssh/id_rsa
    k0sBinaryPath: /opt/k0s
    uploadBinary: true
  - role: worker
    ssh:
      address: worker-1.k0s.lab
      user: root
      keyPath: ~/.ssh/id_rsa
    k0sBinaryPath: /opt/k0s
    uploadBinary: true
  - role: worker
    ssh:
      address: worker-2.k0s.lab
      user: root
      keyPath: ~/.ssh/id_rsa
    k0sBinaryPath: /opt/k0s
    uploadBinary: true
  k0s:
    version: v{{{ extra.k8s_version }}}+k0s.0
    config:
      spec:
        api:
          externalAddress: 192.168.122.200
        network:
          controlPlaneLoadBalancing:
            vrrpInstances:
            - virtualIPs: ["192.168.122.200/24"]
              authPass: Example
```
Save the above configuration into a file called `k0sctl.yaml` and apply it in
order to bootstrap the cluster:

```console
$ k0sctl apply
⠀⣿⣿⡇⠀⠀⢀⣴⣾⣿⠟⠁⢸⣿⣿⣿⣿⣿⣿⣿⡿⠛⠁⠀⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀█████████ █████████ ███
⠀⣿⣿⡇⣠⣶⣿⡿⠋⠀⠀⠀⢸⣿⡇⠀⠀⠀⣠⠀⠀⢀⣠⡆⢸⣿⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀███ ███ ███
⠀⣿⣿⣿⣿⣟⠋⠀⠀⠀⠀⠀⢸⣿⡇⠀⢰⣾⣿⠀⠀⣿⣿⡇⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀███ ███ ███
⠀⣿⣿⡏⠻⣿⣷⣤⡀⠀⠀⠀⠸⠛⠁⠀⠸⠋⠁⠀⠀⣿⣿⡇⠈⠉⠉⠉⠉⠉⠉⠉⠉⢹⣿⣿⠀███ ███ ███
⠀⣿⣿⡇⠀⠀⠙⢿⣿⣦⣀⠀⠀⠀⣠⣶⣶⣶⣶⣶⣶⣿⣿⡇⢰⣶⣶⣶⣶⣶⣶⣶⣶⣾⣿⣿⠀█████████ ███ ██████████
k0sctl Copyright 2023, k0sctl authors.
Anonymized telemetry of usage will be sent to the authors.
By continuing to use k0sctl you agree to these terms:
https://k0sproject.io/licenses/eula
level=info msg="==> Running phase: Connect to hosts"
level=info msg="[ssh] worker-2.k0s.lab:22: connected"
level=info msg="[ssh] controller-2.k0s.lab:22: connected"
level=info msg="[ssh] worker-1.k0s.lab:22: connected"
level=info msg="[ssh] worker-0.k0s.lab:22: connected"
level=info msg="[ssh] controller-0.k0s.lab:22: connected"
level=info msg="[ssh] controller-1.k0s.lab:22: connected"
level=info msg="==> Running phase: Detect host operating systems"
level=info msg="[ssh] worker-2.k0s.lab:22: is running Fedora Linux 38 (Cloud Edition)"
level=info msg="[ssh] controller-2.k0s.lab:22: is running Fedora Linux 38 (Cloud Edition)"
level=info msg="[ssh] controller-0.k0s.lab:22: is running Fedora Linux 38 (Cloud Edition)"
level=info msg="[ssh] controller-1.k0s.lab:22: is running Fedora Linux 38 (Cloud Edition)"
level=info msg="[ssh] worker-0.k0s.lab:22: is running Fedora Linux 38 (Cloud Edition)"
level=info msg="[ssh] worker-1.k0s.lab:22: is running Fedora Linux 38 (Cloud Edition)"
level=info msg="==> Running phase: Acquire exclusive host lock"
level=info msg="==> Running phase: Prepare hosts"
level=info msg="==> Running phase: Gather host facts"
level=info msg="[ssh] worker-2.k0s.lab:22: using worker-2.k0s.lab as hostname"
level=info msg="[ssh] controller-0.k0s.lab:22: using controller-0.k0s.lab as hostname"
level=info msg="[ssh] controller-2.k0s.lab:22: using controller-2.k0s.lab as hostname"
level=info msg="[ssh] controller-1.k0s.lab:22: using controller-1.k0s.lab as hostname"
level=info msg="[ssh] worker-1.k0s.lab:22: using worker-1.k0s.lab as hostname"
level=info msg="[ssh] worker-0.k0s.lab:22: using worker-0.k0s.lab as hostname"
level=info msg="[ssh] worker-2.k0s.lab:22: discovered eth0 as private interface"
level=info msg="[ssh] controller-0.k0s.lab:22: discovered eth0 as private interface"
level=info msg="[ssh] controller-2.k0s.lab:22: discovered eth0 as private interface"
level=info msg="[ssh] controller-1.k0s.lab:22: discovered eth0 as private interface"
level=info msg="[ssh] worker-1.k0s.lab:22: discovered eth0 as private interface"
level=info msg="[ssh] worker-0.k0s.lab:22: discovered eth0 as private interface"
level=info msg="[ssh] worker-2.k0s.lab:22: discovered 192.168.122.210 as private address"
level=info msg="[ssh] controller-0.k0s.lab:22: discovered 192.168.122.37 as private address"
level=info msg="[ssh] controller-2.k0s.lab:22: discovered 192.168.122.87 as private address"
level=info msg="[ssh] controller-1.k0s.lab:22: discovered 192.168.122.185 as private address"
level=info msg="[ssh] worker-1.k0s.lab:22: discovered 192.168.122.81 as private address"
level=info msg="[ssh] worker-0.k0s.lab:22: discovered 192.168.122.219 as private address"
level=info msg="==> Running phase: Validate hosts"
level=info msg="==> Running phase: Validate facts"
level=info msg="==> Running phase: Download k0s binaries to local host"
level=info msg="==> Running phase: Upload k0s binaries to hosts"
level=info msg="[ssh] controller-0.k0s.lab:22: uploading k0s binary from /opt/k0s"
level=info msg="[ssh] controller-2.k0s.lab:22: uploading k0s binary from /opt/k0s"
level=info msg="[ssh] worker-0.k0s.lab:22: uploading k0s binary from /opt/k0s"
level=info msg="[ssh] controller-1.k0s.lab:22: uploading k0s binary from /opt/k0s"
level=info msg="[ssh] worker-1.k0s.lab:22: uploading k0s binary from /opt/k0s"
level=info msg="[ssh] worker-2.k0s.lab:22: uploading k0s binary from /opt/k0s"
level=info msg="==> Running phase: Install k0s binaries on hosts"
level=info msg="[ssh] controller-0.k0s.lab:22: validating configuration"
level=info msg="[ssh] controller-1.k0s.lab:22: validating configuration"
level=info msg="[ssh] controller-2.k0s.lab:22: validating configuration"
level=info msg="==> Running phase: Configure k0s"
level=info msg="[ssh] controller-0.k0s.lab:22: installing new configuration"
level=info msg="[ssh] controller-2.k0s.lab:22: installing new configuration"
level=info msg="[ssh] controller-1.k0s.lab:22: installing new configuration"
level=info msg="==> Running phase: Initialize the k0s cluster"
level=info msg="[ssh] controller-0.k0s.lab:22: installing k0s controller"
level=info msg="[ssh] controller-0.k0s.lab:22: waiting for the k0s service to start"
level=info msg="[ssh] controller-0.k0s.lab:22: waiting for kubernetes api to respond"
level=info msg="==> Running phase: Install controllers"
level=info msg="[ssh] controller-2.k0s.lab:22: validating api connection to https://192.168.122.200:6443"
level=info msg="[ssh] controller-1.k0s.lab:22: validating api connection to https://192.168.122.200:6443"
level=info msg="[ssh] controller-0.k0s.lab:22: generating token"
level=info msg="[ssh] controller-1.k0s.lab:22: writing join token"
level=info msg="[ssh] controller-1.k0s.lab:22: installing k0s controller"
level=info msg="[ssh] controller-1.k0s.lab:22: starting service"
level=info msg="[ssh] controller-1.k0s.lab:22: waiting for the k0s service to start"
level=info msg="[ssh] controller-1.k0s.lab:22: waiting for kubernetes api to respond"
level=info msg="[ssh] controller-0.k0s.lab:22: generating token"
level=info msg="[ssh] controller-2.k0s.lab:22: writing join token"
level=info msg="[ssh] controller-2.k0s.lab:22: installing k0s controller"
level=info msg="[ssh] controller-2.k0s.lab:22: starting service"
level=info msg="[ssh] controller-2.k0s.lab:22: waiting for the k0s service to start"
level=info msg="[ssh] controller-2.k0s.lab:22: waiting for kubernetes api to respond"
level=info msg="==> Running phase: Install workers"
level=info msg="[ssh] worker-2.k0s.lab:22: validating api connection to https://192.168.122.200:6443"
level=info msg="[ssh] worker-1.k0s.lab:22: validating api connection to https://192.168.122.200:6443"
level=info msg="[ssh] worker-0.k0s.lab:22: validating api connection to https://192.168.122.200:6443"
level=info msg="[ssh] controller-0.k0s.lab:22: generating a join token for worker 1"
level=info msg="[ssh] controller-0.k0s.lab:22: generating a join token for worker 2"
level=info msg="[ssh] controller-0.k0s.lab:22: generating a join token for worker 3"
level=info msg="[ssh] worker-2.k0s.lab:22: writing join token"
level=info msg="[ssh] worker-0.k0s.lab:22: writing join token"
level=info msg="[ssh] worker-1.k0s.lab:22: writing join token"
level=info msg="[ssh] worker-2.k0s.lab:22: installing k0s worker"
level=info msg="[ssh] worker-1.k0s.lab:22: installing k0s worker"
level=info msg="[ssh] worker-0.k0s.lab:22: installing k0s worker"
level=info msg="[ssh] worker-2.k0s.lab:22: starting service"
level=info msg="[ssh] worker-1.k0s.lab:22: starting service"
level=info msg="[ssh] worker-0.k0s.lab:22: starting service"
level=info msg="[ssh] worker-2.k0s.lab:22: waiting for node to become ready"
level=info msg="[ssh] worker-0.k0s.lab:22: waiting for node to become ready"
level=info msg="[ssh] worker-1.k0s.lab:22: waiting for node to become ready"
level=info msg="==> Running phase: Release exclusive host lock"
level=info msg="==> Running phase: Disconnect from hosts"
level=info msg="==> Finished in 2m20s"
level=info msg="k0s cluster version v{{{ extra.k8s_version }}}+k0s.0 is now installed"
level=info msg="Tip: To access the cluster you can now fetch the admin kubeconfig using:"
level=info msg="     k0sctl kubeconfig"
```
The cluster should now be up and running. Set up the kubeconfig
file in order to interact with it:

```shell
k0sctl kubeconfig > k0s-kubeconfig
export KUBECONFIG=$(pwd)/k0s-kubeconfig
```
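Since `externalAddress` in this example is the virtual IP, the fetched kubeconfig already points at the load-balanced address; you can confirm which endpoint is in use:

```shell
# Print the API server endpoint the kubeconfig targets; with this example
# configuration it should be the VIP, https://192.168.122.200:6443.
kubectl cluster-info
```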
All three worker nodes are ready:

```console
$ kubectl get nodes
NAME               STATUS   ROLES    AGE     VERSION
worker-0.k0s.lab   Ready    <none>   8m51s   v1.29.2+k0s
worker-1.k0s.lab   Ready    <none>   8m51s   v1.29.2+k0s
worker-2.k0s.lab   Ready    <none>   8m51s   v1.29.2+k0s
```
Each controller node has a dummy interface with the VIP and a /32 netmask,
but only one of them also has the VIP on the real NIC:

```console
$ for i in controller-{0..2} ; do echo $i ; ssh $i -- ip -4 --oneline addr show | grep -e eth0 -e dummyvip0; done
controller-0
2: eth0 inet 192.168.122.37/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0\ valid_lft 2381sec preferred_lft 2381sec
2: eth0 inet 192.168.122.200/24 scope global secondary eth0\ valid_lft forever preferred_lft forever
3: dummyvip0 inet 192.168.122.200/32 scope global dummyvip0\ valid_lft forever preferred_lft forever
controller-1
2: eth0 inet 192.168.122.185/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0\ valid_lft 2390sec preferred_lft 2390sec
3: dummyvip0 inet 192.168.122.200/32 scope global dummyvip0\ valid_lft forever preferred_lft forever
controller-2
2: eth0 inet 192.168.122.87/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0\ valid_lft 2399sec preferred_lft 2399sec
3: dummyvip0 inet 192.168.122.200/32 scope global dummyvip0\ valid_lft forever preferred_lft forever
```
The cluster is using control plane load balancing and is able to tolerate the
outage of one controller node. Shut down the first controller to simulate a
failure condition:

```console
$ ssh controller-0 'sudo poweroff'
Connection to 192.168.122.37 closed by remote host.
```
Control plane load balancing provides high availability: the VIP will have moved to a different node:

```console
$ for i in controller-{1..2} ; do echo $i ; ssh $i -- ip -4 --oneline addr show | grep -e eth0 -e dummyvip0; done
controller-1
2: eth0 inet 192.168.122.185/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0\ valid_lft 2173sec preferred_lft 2173sec
2: eth0 inet 192.168.122.200/24 scope global secondary eth0\ valid_lft forever preferred_lft forever
3: dummyvip0 inet 192.168.122.200/32 scope global dummyvip0\ valid_lft forever preferred_lft forever
controller-2
2: eth0 inet 192.168.122.87/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0\ valid_lft 2182sec preferred_lft 2182sec
3: dummyvip0 inet 192.168.122.200/32 scope global dummyvip0\ valid_lft forever preferred_lft forever
```
And the cluster will be working normally:

```console
$ kubectl get nodes
NAME               STATUS   ROLES    AGE     VERSION
worker-0.k0s.lab   Ready    <none>   8m51s   v1.29.2+k0s
worker-1.k0s.lab   Ready    <none>   8m51s   v1.29.2+k0s
worker-2.k0s.lab   Ready    <none>   8m51s   v1.29.2+k0s
```

docs/networking.md

+2
@@ -54,10 +54,12 @@ One goal of k0s is to allow for the deployment of an isolated control plane, whi
| TCP | 10250 | kubelet | controller, worker => host `*` | Authenticated kubelet API for the controller node `kube-apiserver` (and `heapster`/`metrics-server` addons) using TLS client certs
| TCP | 9443 | k0s-api | controller <-> controller | k0s controller join API, TLS with token auth
| TCP | 8132 | konnectivity | worker <-> controller | Konnectivity is used as "reverse" tunnel between kube-apiserver and worker kubelets
| VRRP | 112 | keepalived | controller <-> controller | Only required for control plane load balancing `vrrpInstances`. 112 is the VRRP IP protocol number (not a TCP port); advertisements are sent to the multicast IP address 224.0.0.18, defined in [RFC 3768].

You also need to enable all traffic to and from the [podCIDR and serviceCIDR] subnets on nodes with a worker role.

[podCIDR and serviceCIDR]: configuration.md#specnetwork
[RFC 3768]: https://datatracker.ietf.org/doc/html/rfc3768#section-5.2.2
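If the controllers run a host firewall, VRRP traffic must be allowed explicitly. A minimal sketch for iptables (adjust to your firewall tooling):

```shell
# Allow inbound VRRP advertisements (IP protocol 112) sent to 224.0.0.18.
sudo iptables -A INPUT -p 112 -d 224.0.0.18 -j ACCEPT
```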

## iptables

mkdocs.yml

+1
@@ -45,6 +45,7 @@ nav:
- IPv4/IPv6 Dual-Stack: dual-stack.md
- Control Plane High Availability: high-availability.md
- Node-local load balancing: nllb.md
- Control plane load balancing: cplb.md
- Shell Completion: shell-completion.md
- User Management: user-management.md
- Configuration of Environment Variables: environment-variables.md
