Commit 58bebaa

docs: faq network

1 parent 006c343 commit 58bebaa

File tree: 3 files changed, +181 −30 lines

docs/advanced/peering/inter-cluster-network.md

Lines changed: 4 additions & 0 deletions

@@ -103,6 +103,10 @@ liqo-tenant-cl02 gw-cl02 Client Connected 76s

In the second cluster (acting as gateway server) you can find the following resources:

+```{admonition} Note
+If the status reports **Error**, check the FAQ section [Debug gateway-to-gateway communication issues](../../faq/faq.md#debug-gateway-to-gateway-communication-issues) for a hint on how to solve the issue.
+```
+
```bash
kubectl get gatewayservers.networking.liqo.io -A
```
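
Healthy output should report the server connection as established; an illustrative listing (column values are assumptions mirroring the client-side listing in the hunk context, not taken from a real cluster) might look like:

```text
NAMESPACE          NAME      TYPE     STATUS      AGE
liqo-tenant-cl01   gw-cl01   Server   Connected   76s
```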

docs/conf.py

Lines changed: 1 addition & 0 deletions

@@ -64,6 +64,7 @@

    'https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/1645-multi-cluster-services-api#service-types',
    'https://ieeexplore.ieee.org',
    'https://dl.acm.org',  # often 403
+   'https://scholar.google.com'
]
docs/faq/faq.md

Lines changed: 176 additions & 30 deletions

@@ -2,24 +2,8 @@

This section contains the answers to the most frequently asked questions by the community (Slack, GitHub, etc.).

-## Table of contents
-
-* [General](FAQGeneralSection)
-  * [Cluster limits](FAQClusterLimits)
-  * [Why DaemonSets pods (e.g., Kube-Proxy, CNI pods) scheduled on Virtual Nodes are in OffloadingBackOff?](FAQDaemonsetBackOff)
-* [Installation](FAQInstallationSection)
-  * [Upgrade the Liqo version installed on a cluster](FAQUpgradeLiqo)
-  * [How to install Liqo on DigitalOcean](FAQInstallLiqoDO)
-* [Peering](FAQPeeringSection)
-  * [How to force unpeer a cluster?](FAQForceUnpeer)
-  * [Is it possible to peer clusters using an ingress?](FAQPeerOverIngress)
-
-(FAQGeneralSection)=
-
## General

-(FAQClusterLimits)=
-
### Cluster limits

The official Kubernetes documentation presents some [general best practices and considerations for large clusters](https://kubernetes.io/docs/setup/best-practices/cluster-large/), defining some cluster limits.

@@ -30,8 +14,6 @@ For instance, the limitation of 110 pods per node is not enforced on Liqo virtua

The same consideration applies to the maximum number of nodes (5000), since all the remote nodes are hidden by a single virtual node.
You can find additional information [here](https://github.com/liqotech/liqo/issues/1863).

-(FAQDaemonsetBackOff)=
-
### Why DaemonSets pods (e.g., Kube-Proxy, CNI pods) scheduled on virtual nodes are in OffloadingBackOff?

The virtual nodes generated by Liqo have a [taint](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) that prevents pods from being scheduled on a virtual node (so on the remote cluster) unless the pod is created in an [offloaded namespace](../usage/namespace-offloading.md).

@@ -54,32 +36,22 @@ nodeAffinity:

This ensures that a pod is **not** created on any nodes with the `liqo.io/type` label.
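
The `nodeAffinity` block this sentence refers to is elided from the hunk; a sketch of an affinity term with this effect (assuming the `DoesNotExist` operator on the `liqo.io/type` key, not necessarily the exact snippet in the file) looks like:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        # Schedule only on nodes that do NOT carry the virtual-node label
        - key: liqo.io/type
          operator: DoesNotExist
```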

-(FAQInstallationSection)=
-
## Installation

-(FAQUpgradeLiqo)=
-
### Upgrade the Liqo version installed on a cluster

Unfortunately, this feature is not currently fully supported.
At the moment, upgrading through `liqoctl install` or `helm update` will update manifests and Docker images (excluding the *virtual-kubelet* one, as it is created dynamically by the *controller-manager*), but it will not apply any CRD-related changes (see this [issue](https://github.com/liqotech/liqo/issues/1831) for further details).
The easiest way is to unpeer all existing clusters and then uninstall and reinstall Liqo on all of them (make sure to have the same Liqo version on all peered clusters).

-(FAQInstallLiqoDO)=
-
### How to install Liqo on DigitalOcean

The installation of Liqo on a DigitalOcean cluster does not work out of the box.
The problem is related to the `liqo-gateway` service and the DigitalOcean load balancer health check (which does not support UDP-based health checks).
This [issue](https://github.com/liqotech/liqo/issues/1668) presents a step-by-step solution to overcome this problem.

-(FAQPeeringSection)=
-
## Peering

-(FAQForceUnpeer)=
-
### How to force unpeer a cluster?

It is highly recommended to first unpeer all existing foreignclusters before upgrading/uninstalling Liqo.

@@ -98,8 +70,6 @@ This is a not recommended solution, use this only as a last resort if no other v

Future upgrades will make it easier to unpeer a cluster or uninstall Liqo.
```

-(FAQPeerOverIngress)=
-
### Is it possible to peer clusters using an ingress?

It is possible to use an ingress to expose the `liqo-auth` service instead of a NodePort/LoadBalancer using Helm values.

@@ -108,3 +78,179 @@ Make sure to set `auth.ingress.enable` to `true` and configure the rest of the v

```{admonition} Note
The `liqo-gateway` service can't be exposed through a common ingress (proxies like nginx, which work with HTTP only) because it uses UDP.
```

## Network

### Debug gateway-to-gateway communication issues

Follow these steps only if you are receiving an **error** in the **connection** resources.
Run the following command to check the status of the connections:

```bash
kubectl get connection -A
```
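
Illustrative output (column values are assumptions, not from a real cluster); continue with this section only if `STATUS` reports `Error`:

```text
NAMESPACE          NAME      TYPE     STATUS   AGE
liqo-tenant-cl02   gw-cl02   Client   Error    3m
```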

#### Check the UDP service

Liqo exposes the **gateway server** using a UDP service.

In the majority of cases, the issue is related to missing support for UDP services on the cloud provider or in your on-premises environment.

You can manually test whether your UDP **LoadBalancer** or **NodePort** services work correctly by creating a dummy UDP echo server:

```yaml
# echo-server.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
      - name: echo-server
        image: ghcr.io/liqotech/udpecho
        ports:
        - containerPort: 5000
          protocol: UDP
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server-lb
spec:
  selector:
    app: echo-server
  type: LoadBalancer
  ports:
  - protocol: UDP
    port: 5000
    targetPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server-np
spec:
  selector:
    app: echo-server
  type: NodePort
  ports:
  - protocol: UDP
    port: 5000
    targetPort: 5000
```

Save this file and apply the manifests to create the echo server and expose it:

```bash
kubectl apply -f echo-server.yaml
```

Now you can test the UDP service exposed by the echo server using the following command:

```bash
nc -u <IP> <PORT>
```

To test the **LoadBalancer** service, replace `<IP>` and `<PORT>` with the values of the `echo-server-lb` service. To test **NodePort** connectivity instead, replace `<IP>` with the IP of one of your nodes and `<PORT>` with the node port of the `echo-server-np` service.
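
If you need to look these values up, the following `kubectl` queries are one way to do it (a minimal sketch; on some providers the load-balancer address is exposed as `hostname` instead of `ip`):

```bash
# LoadBalancer: external IP and UDP port of echo-server-lb
kubectl get svc echo-server-lb \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip} {.spec.ports[0].port}{"\n"}'

# NodePort: the allocated node port of echo-server-np (pair it with any node IP)
kubectl get svc echo-server-np -o jsonpath='{.spec.ports[0].nodePort}{"\n"}'
kubectl get nodes -o wide
```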

After you have run the `nc` command, you can type a message and press `Enter`. If you see the message echoed back in upper case, the UDP service is working correctly.
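
A successful session looks like the following (address, port, and message are placeholders):

```text
$ nc -u 198.51.100.7 5000
hello
HELLO
```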

### Debug pod-to-pod communication issues

These steps are intended to gather information about network issues between **two clusters**, to share with the maintainers when asking for help.

Before starting, check the **connection** resources on your clusters using `kubectl get connection -A`.
If you get an error in their status, refer to the [Debug gateway-to-gateway communication issues](./faq.md#debug-gateway-to-gateway-communication-issues) section.

```{warning}
It is strongly recommended to use two clusters with different **pod CIDRs** for debugging.
```

#### Deploy debug pods

Create 2 namespaces, one in each cluster, and deploy a debug pod in each namespace.
You don't need to offload them.

```bash
# Run these commands on both clusters
kubectl create ns liqo-debug
kubectl create deployment nginx --image=nginx -n liqo-debug
```
#### Enter the debug pod

Run an interactive shell in the debug pod to test the connectivity between the two clusters.

```bash
# Run these commands on both clusters
kubectl exec -it deployments/nginx -n liqo-debug -- /bin/bash
```

Now install the required tools to test the connectivity.

```bash
apt update
apt install iputils-ping -y
```
#### Get the remote pod IP

We need to obtain the IPs to ping in order to test the connectivity.

If you are using two different pod CIDRs, you can use the original pod IPs:

```bash
kubectl get pods -n liqo-debug -o wide
```

If you are using the same pod CIDR, you need to **remap** the IPs of the pods.

If you have two clusters called `cluster A` and `cluster B`, to remap the pod IP on `cluster B` you need to get the **configuration** resource on `cluster A` related to `cluster B`:

```bash
kubectl get configuration -A
```

Now take the **REMAPPED POD CIDR** value, keep the **network** part of the CIDR, and replace the **host** part with that of the pod you want to reach on `cluster B`.
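
As a worked example (all addresses are illustrative, not from a real cluster): suppose the **REMAPPED POD CIDR** reported on `cluster A` is `10.81.0.0/16` and the debug pod on `cluster B` has IP `10.243.2.15`:

```text
REMAPPED POD CIDR on cluster A:  10.81.0.0/16  ->  network part: 10.81
pod IP on cluster B:             10.243.2.15   ->  host part:    2.15
address to ping from cluster A:  10.81.2.15
```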

If you want a more detailed explanation, you can find an example of remapping [here](../advanced/external-ip-remapping.md).

#### Sniff the traffic inside the gateway

In your tenant namespace, you can find a pod called `gw-<CLUSTER_ID>`. This pod routes the traffic between the clusters.

To check whether the traffic is correctly routed, you can sniff the traffic inside the gateway pod.

Let's start by opening a shell in the gateway pod:

```bash
kubectl exec -it gw-<CLUSTER_ID> -n liqo-gateway -- /bin/bash
```

Now you can use `tcpdump` to sniff the traffic:

```bash
tcpdump -tnl -i any icmp
```

#### Test the connectivity

Now you can test the connectivity between the two pods.

Run the following command in the shell of one of the two debug pods, targeting the IP of the other debug pod:

```bash
ping -c1 <REMOTE_POD_IP>
```

Now check the packets in the gateway pod, and share the output with the maintainers.
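
If the traffic is routed correctly, the capture in the gateway shows both the echo request and the corresponding reply; healthy output might look like this (addresses reuse the illustrative remapping example above, and the exact format varies with the tcpdump version):

```text
IP 10.243.1.7 > 10.81.2.15: ICMP echo request, id 53, seq 1, length 64
IP 10.81.2.15 > 10.243.1.7: ICMP echo reply, id 53, seq 1, length 64
```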
