Error in logs duplicate key value violates unique constraint "argo_workflows_pkey" #14344

Open
3 of 4 tasks
hanneskaeufler opened this issue Apr 1, 2025 · 5 comments · May be fixed by #14357

@hanneskaeufler

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

The workflow controller logs the duplicate key error shown below. I expected no DB errors.

Version(s)

v3.6.4

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

Unclear

Logs from the workflow controller

workflow-controller-5d9976df6d-2gmnh 2025/04/01 09:31:56     Session ID:     00001
workflow-controller-5d9976df6d-2gmnh     Query:          INSERT INTO "argo_workflows" ("clustername", "namespace", "nodes", "uid", "version") VALUES ($1, $2, $3, $4,
workflow-controller-5d9976df6d-2gmnh     Arguments:      []interface {}{"default", "XXX", "{\"XXXX
workflow-controller-5d9976df6d-2gmnh \\",\\n  \\\"XXX\\\": \\\"XXX\\\",\\n  \\\"XXX\\\": \\\"XXX\\\",\\n  \\\"XXX\\\": \\\"
workflow-controller-5d9976df6d-2gmnh     Stack:
workflow-controller-5d9976df6d-2gmnh         fmt.(*pp).handleMethods@/usr/local/go/src/fmt/print.go:673
workflow-controller-5d9976df6d-2gmnh         fmt.(*pp).printArg@/usr/local/go/src/fmt/print.go:756
workflow-controller-5d9976df6d-2gmnh         fmt.(*pp).doPrint@/usr/local/go/src/fmt/print.go:1208
workflow-controller-5d9976df6d-2gmnh         fmt.Append@/usr/local/go/src/fmt/print.go:289
workflow-controller-5d9976df6d-2gmnh         log.(*Logger).Print.func1@/usr/local/go/src/log/log.go:261
workflow-controller-5d9976df6d-2gmnh         log.(*Logger).output@/usr/local/go/src/log/log.go:238
workflow-controller-5d9976df6d-2gmnh         log.(*Logger).Print@/usr/local/go/src/log/log.go:260
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/persist/sqldb.(*nodeOffloadRepo).Save@/go/src/github.com/argoproj/argo-workflows/p
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/hydrator.hydrator.Dehydrate.func1@/go/src/github.com/argoproj/argo-workfl
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/util/wait.Backoff.func1@/go/src/github.com/argoproj/argo-workflows/util/wait/backo
workflow-controller-5d9976df6d-2gmnh         k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection@/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/
workflow-controller-5d9976df6d-2gmnh         k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff@/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:46
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/util/wait.Backoff@/go/src/github.com/argoproj/argo-workflows/util/wait/backoff.go:
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/hydrator.hydrator.Dehydrate@/go/src/github.com/argoproj/argo-workflows/wo
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).persistUpdates@/go/src/github.com/argoproj/a
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).operate.func1@/go/src/github.com/argoproj/ar
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).operate@/go/src/github.com/argoproj/argo-wor
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).processNextItem@/go/src/github.com/argop
workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).runWorker@/go/src/github.com/argoproj/ar
workflow-controller-5d9976df6d-2gmnh     Error:          ERROR: duplicate key value violates unique constraint "argo_workflows_pkey" (SQLSTATE 23505)
workflow-controller-5d9976df6d-2gmnh     Time taken:     0.00538s
workflow-controller-5d9976df6d-2gmnh     Context:        context.Background
workflow-controller-5d9976df6d-2gmnh

Logs from your workflow's wait container

Not relevant?
@MasonM
Member

MasonM commented Apr 2, 2025

Thanks for the report. Some of the lines in those logs appear to be truncated, which makes it hard to tell what's happening here. Can you provide the untruncated logs? Specifically, I'd like to see the rest of this line:

workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/persist/sqldb.(*nodeOffloadRepo).Save@/go/src/github.com/argoproj/argo-workflows/p

@MasonM MasonM added the problem/more information needed Not enough information has been provided to diagnose this issue. label Apr 2, 2025
@hanneskaeufler
Author

Oh, right, good point. Here is the full log (I replaced the inline JSON with "").

argo-workflows-workflow-controller-5d9976df6d-2gmnh 2025/04/01 09:31:54     Session ID:     00001
argo-workflows-workflow-controller-5d9976df6d-2gmnh     Query:          INSERT INTO "argo_workflows" ("clustername", "namespace", "nodes", "uid", "version") VALUES ($1, $2, $3, $4, $5) RETURNING "clustername", "uid", "version"
argo-workflows-workflow-controller-5d9976df6d-2gmnh     Arguments:      []interface {}{"default", "pipeline", "":"ERROR: duplicate key value violates unique constraint \"argo_workflows_pkey\" (SQLSTATE 23505)","level":"info","msg":"Ignoring duplicate key error","time":"2025-04-01T09:31:54.620Z","uid":"45c010a6-671e-4d10-9b32-ab172d3e8a59","version":"fnv:4087120198"}
", "45c010a6-671e-4d10-9b32-ab172d3e8a59", "fnv:4087120198"}
argo-workflows-workflow-controller-5d9976df6d-2gmnh     Stack:          
argo-workflows-workflow-controller-5d9976df6d-2gmnh         fmt.(*pp).handleMethods@/usr/local/go/src/fmt/print.go:673
argo-workflows-workflow-controller-5d9976df6d-2gmnh         fmt.(*pp).printArg@/usr/local/go/src/fmt/print.go:756
argo-workflows-workflow-controller-5d9976df6d-2gmnh         fmt.(*pp).doPrint@/usr/local/go/src/fmt/print.go:1208
argo-workflows-workflow-controller-5d9976df6d-2gmnh         fmt.Append@/usr/local/go/src/fmt/print.go:289
argo-workflows-workflow-controller-5d9976df6d-2gmnh         log.(*Logger).Print.func1@/usr/local/go/src/log/log.go:261
argo-workflows-workflow-controller-5d9976df6d-2gmnh         log.(*Logger).output@/usr/local/go/src/log/log.go:238
argo-workflows-workflow-controller-5d9976df6d-2gmnh         log.(*Logger).Print@/usr/local/go/src/log/log.go:260
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/persist/sqldb.(*nodeOffloadRepo).Save@/go/src/github.com/argoproj/argo-workflows/persist/sqldb/offload_node_status_repo.go:89
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/hydrator.hydrator.Dehydrate.func1@/go/src/github.com/argoproj/argo-workflows/workflow/hydrator/hydrator.go:114
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/util/wait.Backoff.func1@/go/src/github.com/argoproj/argo-workflows/util/wait/backoff.go:15
argo-workflows-workflow-controller-5d9976df6d-2gmnh         k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection@/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:145
argo-workflows-workflow-controller-5d9976df6d-2gmnh         k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff@/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:461
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/util/wait.Backoff@/go/src/github.com/argoproj/argo-workflows/util/wait/backoff.go:13
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/hydrator.hydrator.Dehydrate@/go/src/github.com/argoproj/argo-workflows/workflow/hydrator/hydrator.go:112
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).persistUpdates@/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:742
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).operate.func1@/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:191
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).operate@/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:440
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).processNextItem@/go/src/github.com/argoproj/argo-workflows/workflow/controller/controller.go:898
argo-workflows-workflow-controller-5d9976df6d-2gmnh         github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).runWorker@/go/src/github.com/argoproj/argo-workflows/workflow/controller/controller.go:812
argo-workflows-workflow-controller-5d9976df6d-2gmnh     Error:          ERROR: duplicate key value violates unique constraint "argo_workflows_pkey" (SQLSTATE 23505)
argo-workflows-workflow-controller-5d9976df6d-2gmnh     Time taken:     0.00552s
argo-workflows-workflow-controller-5d9976df6d-2gmnh     Context:        context.Background

@hanneskaeufler
Author

These typically occur very close together as well

@MasonM MasonM removed the problem/more information needed Not enough information has been provided to diagnose this issue. label Apr 3, 2025
@MasonM MasonM self-assigned this Apr 3, 2025
MasonM added a commit to MasonM/argo-workflows that referenced this issue Apr 4, 2025
When offloading a workflow, it's possible for multiple workers to concurrently
call `Save()` with the same workflow, which leads to `ERROR: duplicate key value
violates unique constraint "argo_workflows_pkey"` messages in the logs, along
with a stack trace. These messages are cluttering the logs and confusing users,
but duplicate key errors are harmless. When it detects a duplicate key error,
`Save()` will return the `version` hash, and `version` is part of the primary
key of the `argo_workflows` table, which means it's guaranteed to be identical
to the previously-inserted row.

This decreases the log level of that message to `DEBUG` so that it doesn't
clutter the logs. I thought about removing it entirely, but I figured it's worth
keeping just in case.

I wasn't able to reproduce the error locally, but I verified that workflow offloading
works locally using `make PROFILE=postgres UI=true ALWAYS_OFFLOAD_NODE_STATUS=true`.

Signed-off-by: Mason Malone <[email protected]>
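
The argument in the commit message rests on `version` being derived from the content being saved: if two workers save the same nodes, they compute the same `version`, so the second insert can only collide with an identical row. A minimal sketch of that idea follows. The helper name `nodeVersion`, the `map[string]string` node type, and the choice of 32-bit FNV over the JSON encoding are assumptions made for illustration (the `fnv:` prefix is taken from the log above); this is not the controller's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// nodeVersion is a hypothetical helper that derives a version string from
// the content of the nodes. Identical content always yields the same string.
func nodeVersion(nodes map[string]string) (string, error) {
	// json.Marshal sorts map keys, so the encoding is deterministic.
	data, err := json.Marshal(nodes)
	if err != nil {
		return "", err
	}
	h := fnv.New32()
	_, _ = h.Write(data) // Write on a hash.Hash never returns an error
	return fmt.Sprintf("fnv:%d", h.Sum32()), nil
}

func main() {
	// Two workers holding the same node statuses, built in a different order.
	a, _ := nodeVersion(map[string]string{"step-1": "Succeeded", "step-2": "Running"})
	b, _ := nodeVersion(map[string]string{"step-2": "Running", "step-1": "Succeeded"})
	fmt.Println("worker A version:", a)
	fmt.Println("worker B version:", b)
	fmt.Println("same key:", a == b)
}
```

Because both workers arrive at the same key, whichever insert loses the race has nothing new to write, which is why the duplicate key error can be treated as a no-op.
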
@MasonM
Member

MasonM commented Apr 4, 2025

Thanks for the details @hanneskaeufler! I think these messages are harmless: the code already detects and handles duplicate key errors properly. But the messages are misleading (they look like a real problem), and the fact that they're cluttering the logs is definitely a problem. I opened a PR to reduce the log level to "debug" so they won't show up by default: #14357
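
For readers unfamiliar with the pattern, here is a rough sketch of what "detects and handles duplicate key errors" can look like. It is an illustration under stated assumptions, not the code in `offload_node_status_repo.go`: `fakeTable`, `dbError`, `key`, and `save` are hypothetical names, the in-memory map stands in for the `argo_workflows` table, and the key columns are assumed from the `RETURNING` clause in the log. Only the SQLSTATE code `23505` (PostgreSQL's unique violation) comes from the logs above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// uniqueViolation is the SQLSTATE PostgreSQL reports for a duplicate key.
const uniqueViolation = "23505"

// dbError is a hypothetical stand-in for a driver error carrying a SQLSTATE.
type dbError struct{ Code string }

func (e *dbError) Error() string { return "SQLSTATE " + e.Code }

// key holds the columns assumed to form the primary key.
type key struct{ cluster, uid, version string }

// fakeTable simulates a table with a unique constraint on key.
type fakeTable struct {
	mu   sync.Mutex
	rows map[key]string
}

// insert fails with a unique violation if the key already exists.
func (t *fakeTable) insert(k key, nodes string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, exists := t.rows[k]; exists {
		return &dbError{Code: uniqueViolation}
	}
	t.rows[k] = nodes
	return nil
}

// save inserts the row and treats a unique violation as success, reporting
// it only through the supplied debug logger.
func save(t *fakeTable, k key, nodes string, debugf func(string, ...any)) (string, error) {
	err := t.insert(k, nodes)
	var dbErr *dbError
	if errors.As(err, &dbErr) && dbErr.Code == uniqueViolation {
		debugf("ignoring duplicate key error uid=%s version=%s", k.uid, k.version)
		return k.version, nil
	}
	if err != nil {
		return "", err
	}
	return k.version, nil
}

func main() {
	table := &fakeTable{rows: map[key]string{}}
	k := key{"default", "45c010a6-671e-4d10-9b32-ab172d3e8a59", "fnv:4087120198"}
	quiet := func(string, ...any) {} // debug output disabled by default

	// Several workers save the same workflow version concurrently.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := save(table, k, "{}", quiet); err != nil {
				panic(err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("rows stored:", len(table.rows))
}
```

Every caller gets the same `version` back whether its insert won or lost the race, so callers need no special handling; the only visible difference is a debug-level message.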

@hanneskaeufler
Author

Thanks! Downgrading the log level works for me if we know this is harmless 💯
