Commit f8810cb

Extend troubleshooting doc (#2072)
Problem: As a user, I want to know how to collect info to diagnose and get support when failures occur. Solution: Extend the troubleshooting doc to contain info about collecting status, events, and logs.
1 parent 38b5498 commit f8810cb

1 file changed: +74 -0 lines changed

site/content/how-to/monitoring/troubleshooting.md

@@ -9,6 +9,80 @@ docs: "DOCS-1419"
This topic describes possible issues users might encounter when using NGINX Gateway Fabric. When possible, suggested workarounds are provided.

### General troubleshooting

When investigating a problem or requesting help, collect the following data points to help identify the underlying issue.

##### Resource status

To check the status of a resource, use `kubectl describe`. This example checks the status of the `coffee` HTTPRoute, which has an error:

```shell
kubectl describe httproutes.gateway.networking.k8s.io coffee [-n namespace]
```

```text
...
Status:
  Parents:
    Conditions:
      Last Transition Time:  2024-05-31T17:20:51Z
      Message:               The route is accepted
      Observed Generation:   4
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-31T17:20:51Z
      Message:               spec.rules[0].backendRefs[0].name: Not found: "bad-backend"
      Observed Generation:   4
      Reason:                BackendNotFound
      Status:                False
      Type:                  ResolvedRefs
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
    Parent Ref:
      Group:         gateway.networking.k8s.io
      Kind:          Gateway
      Name:          gateway
      Namespace:     default
      Section Name:  http
```

If a resource has errors relating to its configuration or relationship to other resources, they can likely be read in the status. The `Observed Generation` in each status condition should match the resource's `metadata.generation`. If it doesn't, the resource may not have been processed yet, or the status may have failed to update.

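The generation comparison described above can also be scripted. The following is a minimal sketch, not part of NGINX Gateway Fabric: the function names `generations_match` and `route_status_is_current` are hypothetical, and it assumes the HTTPRoute reports its conditions under `.status.parents[*].conditions[*]`, as in the output shown earlier.

```shell
# Succeeds only when at least one observed generation is given and every
# one of them equals the resource generation.
# Usage: generations_match <generation> <observedGeneration>...
generations_match() {
  expected="$1"
  shift
  # No observed generations at all means no status has been written yet.
  [ "$#" -gt 0 ] || return 1
  for observed in "$@"; do
    [ "$observed" = "$expected" ] || return 1
  done
  return 0
}

# Hypothetical helper: reads both values with kubectl and compares them.
# Usage: route_status_is_current <route-name> [namespace]
route_status_is_current() {
  route="$1"
  ns="${2:-default}"
  resource="httproutes.gateway.networking.k8s.io"
  generation="$(kubectl get "$resource" "$route" -n "$ns" \
    -o jsonpath='{.metadata.generation}')"
  observed="$(kubectl get "$resource" "$route" -n "$ns" \
    -o jsonpath='{.status.parents[*].conditions[*].observedGeneration}')"
  # $observed is left unquoted on purpose so each value becomes one argument.
  generations_match "$generation" $observed
}
```

For example, `route_status_is_current coffee default && echo "status is current"` prints the message only when every condition reflects the latest version of the route.
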
##### Events

Events created by NGINX Gateway Fabric or other Kubernetes components could indicate system or configuration issues. To see events:

```shell
kubectl get events [-n namespace]
```

For example, a warning event is created when the NginxGateway configuration resource is deleted:

```text
kubectl -n nginx-gateway get event
LAST SEEN   TYPE      REASON            OBJECT                    MESSAGE
5s          Warning   ResourceDeleted   nginxgateway/ngf-config   NginxGateway configuration was deleted; using defaults
```

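In a busy namespace, it can help to narrow the list to warnings only. The sketch below wraps `kubectl get events` with a field selector and a sort order; the function name `warning_events` is hypothetical, and the default namespace `nginx-gateway` is taken from the example above.

```shell
# List only Warning events in a namespace, sorted by creation time.
# Usage: warning_events [namespace]   (defaults to nginx-gateway)
warning_events() {
  ns="${1:-nginx-gateway}"
  kubectl get events -n "$ns" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}
```

Run `warning_events` for the NGINX Gateway Fabric namespace, or pass the namespace of your Gateway API resources, for example `warning_events default`.
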
##### Logs

Logs from the NGINX Gateway Fabric control plane and data plane can contain information that isn't available in status or events. These can include errors in processing or passing traffic.

To see logs for the control plane container:

```shell
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx-gateway
```

To see logs for the data plane container:

```shell
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx
```

To see logs from a crashed or killed container, add the `-p` (`--previous`) flag to the commands above.

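When requesting help, it is often convenient to save all of these logs to files. The following is a minimal sketch under stated assumptions: the function name `collect_ngf_logs` and the default output directory `ngf-logs` are hypothetical, and the namespace and container names are the ones used in the commands above.

```shell
# Save current and previous logs for both containers of one pod.
# Usage: collect_ngf_logs <ngf-pod-name> [output-dir]
collect_ngf_logs() {
  pod="$1"
  out_dir="${2:-ngf-logs}"
  mkdir -p "$out_dir"
  for container in nginx-gateway nginx; do
    # Logs from the currently running container.
    kubectl -n nginx-gateway logs "$pod" -c "$container" \
      > "$out_dir/$container.log"
    # -p fails when there is no previous container instance, so that
    # error is ignored and the file is simply left empty.
    kubectl -n nginx-gateway logs "$pod" -c "$container" -p \
      > "$out_dir/$container.previous.log" 2>/dev/null || true
  done
}
```

Calling `collect_ngf_logs <ngf-pod-name>` writes four files, one current and one previous log per container, which can then be attached to a support request.
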
### NGINX fails to reload

#### Description
