unrecoverable error - A concurrent update was performed on this collection and corrupted its state #8299
Comments
The task buffer is owned by a resolver task which, together with its context, is a pooled resource. It sounds as if the context is being passed on beyond the resolver and used outside of the task. That would not be safe in any case.
How do you get to these lines from your stack trace? looks more like |
The stack trace is oddly formatted because it comes from the Application Insights Exceptions table associated with the Azure Container App. Do let me know if you require any other info. Thanks for looking into this!
I think this is a DAB issue, to be honest. The code in question is guaranteed to run on a single thread and is owned by the ResolverTask, which itself is a pooled instance. In the stack trace we are not in the complete phase but in the execute phase, so the code referenced in this issue is not relevant. Also, following this down to the Cosmos engine, the shared metastore is a Dictionary, which is not thread-safe. I will ping the DAB team.
Yep ... looks like the metastore could be it ... "method": "System.Collections.Generic.Dictionary`2.FindValue", this is even in your stack.
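To illustrate the point about a shared, non-thread-safe Dictionary, here is a minimal sketch of a cache backed by ConcurrentDictionary instead. `MetadataCache`, `GetOrAdd`, and the `Entity…` keys are hypothetical names made up for this example; this is not DAB's actual metastore code, only a demonstration that concurrent readers and writers are safe with this collection type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for a shared metadata store; not DAB's real type.
public static class MetadataCache
{
    // ConcurrentDictionary supports simultaneous reads and writes from many
    // threads, unlike Dictionary<TKey, TValue>, whose internal state can be
    // corrupted by unsynchronized concurrent writes.
    private static readonly ConcurrentDictionary<string, string> Store = new();

    // Returns the cached value for the entity, adding it if it is missing.
    public static string GetOrAdd(string entity) =>
        Store.GetOrAdd(entity, static name => $"container-for-{name}");

    public static int Count => Store.Count;
}

public static class Program
{
    public static void Main()
    {
        // Many parallel requests touching the same 50 hypothetical entities.
        Parallel.For(0, 10_000, i => MetadataCache.GetOrAdd($"Entity{i % 50}"));

        Console.WriteLine(MetadataCache.Count); // prints 50
    }
}
```

With a plain Dictionary in place of ConcurrentDictionary, the same parallel loop would be an unsynchronized concurrent write, which is the situation the exception message describes. Guarding a Dictionary with a lock would be another option; the choice here is only for brevity.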
Product
Hot Chocolate
Version
12.22.6.0
Link to minimal reproduction
graphql-platform/src/HotChocolate/Core/src/Execution/Processing/Tasks/ResolverTask.Execute.cs
Lines 33 to 34 in e338bf7
Steps to reproduce
See the issue I raised with DAB, which appears to point the finger at Hot Chocolate: Azure/data-api-builder#2694
Background
When using the /graphql endpoint in DAB, once this error has occurred even once, it persists for the affected Entities (source tables) until the DAB application is restarted.
Research
Apologies, I'm not sufficiently skilled to repro the issue, but I'm doing my hardest to show that some effort was put into documenting the issue and a possible root cause.
I've walked through the code trying to understand where a concurrency issue might occur, and this area appears to have the potential to cause the exception. I won't embarrass myself by putting in some AI suggestions for a fix, as I don't understand the optimisations at work here, but Claude explains to me that:
When you use `CollectionsMarshal.AsSpan(_taskBuffer)`, you're getting direct, low-level access to the memory of the collection. This is a performance optimization, but it comes with a major caveat: it bypasses the normal thread-safety mechanisms of collection classes.
The exception message "Operations that change non-concurrent collections must have exclusive access" indicates that:
`_taskBuffer` is a non-concurrent collection (like a standard List or array)
The collection is being modified from multiple threads simultaneously
Here are the links to the relevant code
graphql-platform/src/HotChocolate/Core/src/Execution/Processing/Tasks/ResolverTask.cs
Line 10 in e338bf7
graphql-platform/src/HotChocolate/Core/src/Execution/Processing/Tasks/ResolverTask.Execute.cs
Lines 33 to 34 in e338bf7
This is the line referenced in the stack trace
graphql-platform/src/HotChocolate/Core/src/Execution/Processing/Tasks/ResolverTask.Execute.cs
Line 57 in e338bf7
What is expected?
What is actually happening?
Entities that errored cannot be queried successfully until the application is restarted.
Note that non-errored Entities are still returned successfully in the `"data"` field of the JSON response (not included here).
Relevant log output
Additional context
No response