Installing kube-ovn environments using helm charts causes issues in the context of subnet/pod deletion and pod IP allocation.
Installing and then uninstalling a kube-ovn environment using helm charts results in orphaned subnet resources (might be related to #4898). Subnets can only be deleted manually by removing the corresponding finalizers.
Additionally, dynamic IP allocation for pods deployed in a predefined namespace causes the PostStartHook to fail, because my PostStartHook modifies the routing table. The pod acquires an IP address from the ovn-default subnet, not from the subnet bound to the namespace it was deployed in, so the route table modification fails. Interestingly, if I assign the pod a static IP address within its namespace's IP pool, it works correctly. Furthermore, if I manually deploy the NetworkAttachmentDefinitions and subnets first using kubectl (instead of a combined helm chart), everything works correctly (static as well as dynamic IP allocation). If I then manually remove all pods first and then manually remove the subnets, no orphaned subnets remain.
As far as I know, helm collects all kinds defined in a helm chart and deploys them in a predefined, static sequence. This leads me to suspect that both issues described here are symptoms of the same underlying problem: the CRDs might not be deployed in the expected sequence when using helm.
# r1pool
apiVersion: apps/v1
kind: Deployment
metadata:
  name: r1pool
  namespace: r1pool
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ippool
  template:
    metadata:
      labels:
        app: ippool
      # If explicit IPs are assigned, everything works as expected using helm
      # annotations:
      #   ovn.kubernetes.io/ip_pool: 10.1.1.100
    spec:
      containers:
      - name: r1pool
        image: docker.io/library/nginx:alpine
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true # Required for networking
          capabilities:
            add: ["NET_RAW", "NET_ADMIN"]
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "ip route del default && ip route add default via 10.1.1.254 dev eth0"]
---
Current Behavior
Given the following two manifests, everything works fine if they are deployed manually using kubectl. If hosts.yaml is removed first and subnet.yaml subsequently, everything works as expected.
If both are combined into a helm chart and the helm chart is installed, the pod receives an IP from the ovn-default subnet, not the subnet specified by the actual namespace r1pool. This causes the PostStartHook to fail. If an explicit IP is specified, the correct IP is acquired and the PostStartHook succeeds.
Uninstalling the helm chart causes the subnet deletion to fail; the subnets have to be removed manually.
Expected Behavior
Pods acquire the correct IP from the specified subnet without the need to specify an explicit IP address.
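As a possible experiment next to the static-IP workaround, the pod template could name its subnet explicitly instead of relying on the namespace binding. The fragment below is only a sketch: ovn.kubernetes.io/logical_switch is the Kube-OVN annotation for selecting a pod's subnet, but the subnet name r1pool is an assumption (subnet.yaml is not shown here), and whether this avoids the fallback to ovn-default when the subnet is created after the pod has not been verified.

```yaml
# Fragment of the Deployment's pod template (not a complete manifest).
spec:
  template:
    metadata:
      labels:
        app: ippool
      annotations:
        # Assumption: the subnet from subnet.yaml is named "r1pool".
        ovn.kubernetes.io/logical_switch: r1pool
```

If the pod still receives an ovn-default address with this annotation set, that would point more strongly at creation order than at the namespace-to-subnet binding.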
Hey @tz1112! I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together!
I found a similar closed issue that might be relevant to your problem: ip资源未被回收,子网ip占用残留 ("IP resources are not reclaimed; leftover subnet IP allocations remain"). This issue was resolved by a pull request that addressed the deletion of IP resources based on the pod name [1].
To resolve subnet removal and IPAM allocation issues with Kube-OVN, particularly regarding orphaned subnet resources and incorrect IP allocation in the PostStartHook, you can follow these steps:
Ensure Subnet Validity: Validate the CIDR when adding or updating a subnet to prevent invalid configurations that can cause orphaned resources.
Handle IP Conflicts: Check for conflicts with existing IPs when assigning static IPs to prevent incorrect IP allocation.
Release Addresses Properly: Ensure IP addresses are released correctly when pods are deleted or updated to prevent orphaned resources.
Update Excluded IPs: When updating a subnet's excluded IPs, ensure released addresses are not reused if they fall within the new exclusion range.
Check for No Available IPs: Handle scenarios where no available IPs are left due to exclusions or full allocation, and return appropriate errors.
Test Subnet Operations: Use test cases to ensure subnet operations such as adding, updating, and removing IP pools are functioning correctly.
Use IPAM Methods Correctly: Leverage IPAM methods like AddOrUpdateSubnet, DeleteSubnet, and ReleaseAddressByPod to manage subnets and IP allocations effectively.
Manage IP Pools: Use methods like AddOrUpdateIPPool and RemoveIPPool to manage IP pools within a subnet, ensuring IPs are correctly allocated and released [2][3][4][5].
Additionally, the issue with the PostStartHook failing due to incorrect IP allocation might be related to the sequence of resource deployment when using Helm charts. Helm may not deploy CRDs in the expected sequence, leading to these issues. Manually deploying NetworkAttachmentDefinitions and subnets first, as you've noted, seems to resolve the problem, indicating a potential sequence issue with Helm.
Your guess is correct. The result is affected by the order in which the resources are created. When a Pod is created and its subnet is not ready yet, the Pod is first allocated an IP address from the default subnet. I think this cannot be resolved on the kube-ovn side, as the order seen by the kube-ovn-controller is not deterministic: the controller does not know whether a subnet or namespace change will follow when the pod is created. Therefore, it's better to enforce the order by using Helm hooks.
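A minimal sketch of what the Helm hook approach could look like, assuming the chart ships the Subnet as a template. The file name, the subnet name r1pool, and the CIDR are assumptions (the original subnet.yaml is not shown in this thread; the CIDR is guessed from the 10.1.1.100 and 10.1.1.254 addresses in the Deployment). The helm.sh/hook and helm.sh/hook-weight annotations are standard Helm hook annotations.

```yaml
# templates/subnet.yaml (hypothetical file name)
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: r1pool                     # assumed subnet name
  annotations:
    # Create this resource before the chart's regular resources,
    # including the Deployment, are applied.
    "helm.sh/hook": pre-install
    # Lower weights run first; only matters if the chart has several hooks.
    "helm.sh/hook-weight": "-5"
spec:
  protocol: IPv4
  cidrBlock: 10.1.1.0/24           # assumed CIDR
  namespaces:
    - r1pool                       # namespace used by the Deployment
```

Two caveats apply. Helm treats a non-Job hook resource as ready as soon as it has been created, so this narrows the window but does not guarantee that kube-ovn-controller has finished processing the subnet before the pod is scheduled. Also, resources created by hooks are not managed as part of the release, so helm uninstall will not delete the subnet; it would have to be removed manually after the pods are gone, which matches the manual order reported to work above.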
It seems like a bug related to the deletion, but I can't reproduce it in my environment. Could you please attach the kube-ovn-controller log to help debug the issue?
Kube-OVN Version
v1.13.2
Kubernetes Version
v1.32.2
Operation-system/Kernel Version
6.11.0-17-generic
Steps To Reproduce
subnet.yaml
host.yaml