[BUG] cni initialisation fails #5078

Open
ibrokethecloud opened this issue Mar 13, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@ibrokethecloud

Kube-OVN Version

v1.13.3

Kubernetes Version

v1.31.4

Operation-system/Kernel Version

Harvester (based on SLE Micro 5.5)

Description

CNI initialisation fails with the following error:

I0313 04:47:48.478768    5084 gateway_linux.go:441] the first nat prerouting rule is ["-m" "comment" "--comment" "kube-ovn prerouting rules" "-j" "OVN-PREROUTING"]
W0313 04:47:48.478848    5084 gateway_linux.go:454] delete the nat prerouting rule: {nat PREROUTING 3 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}
I0313 04:47:48.478869    5084 gateway_linux.go:1074] delete iptables rule by pos 3: {nat PREROUTING 3 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}
W0313 04:47:48.481882    5084 gateway_linux.go:454] delete the nat prerouting rule: {nat PREROUTING 4 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}
I0313 04:47:48.481912    5084 gateway_linux.go:1074] delete iptables rule by pos 4: {nat PREROUTING 4 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}
W0313 04:47:48.485737    5084 gateway_linux.go:454] delete the nat prerouting rule: {nat PREROUTING 5 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}
I0313 04:47:48.485768    5084 gateway_linux.go:1074] delete iptables rule by pos 5: {nat PREROUTING 5 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}
E0313 04:47:48.488290    5084 gateway_linux.go:1076] failed to delete iptables PREROUTING rule "-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING": running [/usr/local/sbin/iptables -t nat -D PREROUTING 5 --wait]: exit status 1: iptables: Index of deletion too big.
E0313 04:47:48.488367    5084 gateway_linux.go:456] failed to delete rule {nat PREROUTING 5 [-m comment --comment kube-ovn prerouting rules -j OVN-PREROUTING]}: running [/usr/local/sbin/iptables -t nat -D PREROUTING 5 --wait]: exit status 1: iptables: Index of deletion too big.

I believe this is being caused by the changes in gateway_linux.go https://github.com/kubeovn/kube-ovn/blob/master/pkg/daemon/gateway_linux.go#L455

When deleting the rules, we iterate over them by position, but each deletion shifts the remaining rules up by one position, so we eventually try to delete a rule at a position that no longer exists.
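
The index shift can be illustrated with a toy model. This is only a sketch, not kube-ovn or iptables code: `chain`, `deleteAt`, and `deleteForward` are hypothetical names, and the five-rule chain with the duplicate rule at positions 3, 4 and 5 is an assumption that matches the positions seen in the log.

```go
package main

import (
	"errors"
	"fmt"
)

// chain is a toy stand-in for an iptables chain (hypothetical type):
// a slice of rule specs addressed by 1-based position.
type chain struct {
	rules []string
}

// deleteAt removes the rule at the 1-based position pos.
// Every rule after pos moves up by one position.
func (c *chain) deleteAt(pos int) error {
	if pos < 1 || pos > len(c.rules) {
		return errors.New("index of deletion too big")
	}
	c.rules = append(c.rules[:pos-1], c.rules[pos:]...)
	return nil
}

// deleteForward deletes the given positions in ascending order,
// using positions that were recorded before any deletion happened.
func deleteForward(c *chain, positions []int) error {
	for _, pos := range positions {
		if err := c.deleteAt(pos); err != nil {
			return fmt.Errorf("delete rule at position %d: %w", pos, err)
		}
	}
	return nil
}

func main() {
	// Assumed layout: two unrelated rules, then three copies of the kube-ovn rule.
	c := &chain{rules: []string{"rule-a", "rule-b", "kube-ovn", "kube-ovn", "kube-ovn"}}

	err := deleteForward(c, []int{3, 4, 5})
	fmt.Println(err)     // delete rule at position 5: index of deletion too big
	fmt.Println(c.rules) // [rule-a rule-b kube-ovn]
}
```

In this model the third deletion fails, and one copy of the duplicate rule is left behind, because the second deletion removed what was originally at position 5.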

Steps To Reproduce

  1. Install RKE2 v1.31.4
  2. Install Kube-OVN v1.13.3
  3. CNI initialisation fails

Current Behavior

CNI fails to initialise

Expected Behavior

CNI should initialise correctly

ibrokethecloud added the bug label on Mar 13, 2025

dosubot bot commented Mar 13, 2025

Hey there, @ibrokethecloud! I'm here to help you with your Kube-OVN issue. Let's work together to solve this bug and get everything up and running smoothly!

The issue you're encountering is related to the deletion of iptables rules in the gateway_linux.go file, where deleting a rule causes the subsequent rules to shift up, leading to attempts to delete a rule at a non-existent position. This results in the error "Index of deletion too big."

To address this, you can modify the logic to handle the shifting of rule indices during deletion. One common approach is to iterate over the rules in reverse order. By doing so, the shifting of indices does not affect the rules that are yet to be deleted, as they are processed after the current rule.

Here's a conceptual example of how you might adjust the logic:

// Assuming `rules` is a slice of rules to be deleted
for i := len(rules) - 1; i >= 0; i-- {
    rule := rules[i]
    if err := deleteIptablesRule(ipt, rule); err != nil {
        klog.Error(err)
        return err
    }
}

This way, you start deleting from the last rule and move towards the first, ensuring that the index shift does not affect the deletion process. You might want to check the specific implementation in gateway_linux.go and adjust it accordingly.


@zhangzujian
Member

I cannot reproduce it. Could you please provide more details?

@ibrokethecloud
Author

This can be reproduced on an RKE2 cluster running on an Ubuntu VM:

NAME           STATUS   ROLES                       AGE     VERSION          INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
kubeovn-test   Ready    control-plane,etcd,master   9m12s   v1.31.4+rke2r1   10.115.54.125   <none>        Ubuntu 24.04.1 LTS   6.8.0-45-generic   containerd://1.7.23-k3s2

To set up RKE2:

curl -fSL https://get.rke2.io | INSTALL_RKE2_VERSION=v1.31.4+rke2r1 RKE2_TOKEN=token sh -
systemctl enable rke2-server
systemctl start rke2-server
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml

The only change in the values.yaml of the helm chart is as follows:

MASTER_NODES_LABEL: "node-role.kubernetes.io/etcd=true"
networking:
  TUNNEL_TYPE: vxlan
ipv4:
  POD_CIDR: "10.42.0.0/16"
  POD_GATEWAY: "10.42.0.1"
  SVC_CIDR: "10.43.0.0/16"
  JOIN_CIDR: "100.64.0.0/16"
  PINGER_EXTERNAL_ADDRESS: "1.1.1.1"
  PINGER_EXTERNAL_DOMAIN: "kube-ovn.io."

After the install, I can see the index error in the CNI pods.

I did a custom build in which the deletion logic iterates from the last element of the fetched rules to the first, and I have not run into any issues with it.
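
For illustration only, the reverse-order idea can be sketched on a toy chain. This is not the custom build's patch and not kube-ovn code: `chain`, `deleteAt`, and `deleteReverse` are hypothetical names, and the chain layout is an assumption based on the positions in the log.

```go
package main

import (
	"fmt"
	"sort"
)

// chain is a toy stand-in for an iptables chain (hypothetical type),
// addressed by 1-based position.
type chain struct {
	rules []string
}

// deleteAt removes the rule at the 1-based position pos.
func (c *chain) deleteAt(pos int) error {
	if pos < 1 || pos > len(c.rules) {
		return fmt.Errorf("position %d out of range (chain has %d rules)", pos, len(c.rules))
	}
	c.rules = append(c.rules[:pos-1], c.rules[pos:]...)
	return nil
}

// deleteReverse deletes the given positions from highest to lowest.
// Removing a rule only shifts the rules after it, so the lower
// positions that are still pending stay valid.
func deleteReverse(c *chain, positions []int) error {
	ordered := append([]int(nil), positions...) // copy, so the caller's slice is untouched
	sort.Sort(sort.Reverse(sort.IntSlice(ordered)))
	for _, pos := range ordered {
		if err := c.deleteAt(pos); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Assumed layout: two unrelated rules, then three copies of the kube-ovn rule.
	c := &chain{rules: []string{"rule-a", "rule-b", "kube-ovn", "kube-ovn", "kube-ovn"}}

	if err := deleteReverse(c, []int{3, 4, 5}); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(c.rules) // [rule-a rule-b]
}
```

An alternative with the same effect would be to re-list the chain after every deletion and always remove the first match, at the cost of one extra listing per rule.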
