We are running the official image mcr.microsoft.com/azure-databases/data-api-builder:1.4.27
This issue is not new to this release; it is long-standing across multiple releases of DAB, on both .NET 6 and .NET 8.
We are querying a single CosmosDB database. The DAB configuration Entities map to around 20 Cosmos containers.
All the queries are point reads by primary key
The CosmosDB is not being altered during the period the error occurs
Error message
"Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct."
Symptoms
The request to /graphql returns HTTP 200, but only some Entities are resolved in the data property, with the errors field populated.
Once the error has occurred, the Entities that failed will consistently fail until the container is restarted or redeployed.
Curiously, the entities in the error state include some that have not been queried.
"errors": [
{
"message": "Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.",
"locations": [
{
"line": 46,
"column": 3
}
],
"path": [
"copernicusSlope_by_pk"
]
},
{
"message": "Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.",
"locations": [
{
"line": 102,
"column": 3
}
],
"path": [
"hadUKgroundfrost_by_pk"
]
},
...
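Because the HTTP status stays 200, a client has to inspect the body to notice the failure. A minimal sketch of that check follows, assuming the response has already been parsed into a dict. The `failed_paths` helper and the `sample` payload are illustrative only: the two paths come from the excerpt above, while the `data` contents and the `someOtherEntity_by_pk` field are hypothetical.

```python
# Prefix of the error message reported in this issue.
CORRUPTION_MARKER = (
    "Operations that change non-concurrent collections must have exclusive access"
)


def failed_paths(body: dict, marker: str = CORRUPTION_MARKER) -> list[str]:
    """Return the top-level GraphQL fields whose error message contains `marker`."""
    failed = []
    for err in body.get("errors") or []:
        if marker in err.get("message", ""):
            path = err.get("path") or []
            if path:
                failed.append(str(path[0]))
    return failed


# Hypothetical parsed response: partial data plus an errors array.
sample = {
    "data": {
        "copernicusSlope_by_pk": None,
        "hadUKgroundfrost_by_pk": None,
        "someOtherEntity_by_pk": {"id": "1"},  # hypothetical entity that resolved
    },
    "errors": [
        {"message": CORRUPTION_MARKER + ". A concurrent update was performed...",
         "path": ["copernicusSlope_by_pk"]},
        {"message": CORRUPTION_MARKER + ". A concurrent update was performed...",
         "path": ["hadUKgroundfrost_by_pk"]},
    ],
}

print(failed_paths(sample))
```

Matching on a message prefix is brittle, but the response shown here carries no error code to match on instead.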
Repeatability
We cannot reproduce it at will; hundreds of thousands of requests are handled successfully over a period of a week or two.
It appears to coincide with peak concurrent requests, in the region of 200-300 requests per minute.
Desired behaviour
The odd error here and there is acceptable, but getting stuck in a persistent errored state, whilst still returning HTTP 200 response codes, is difficult to manage operationally.
Possibly give some consideration to a /health API that can report on this persistent error state.
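Until something like that exists, the persistent state can be approximated from outside the service. The sketch below is one hypothetical way to do it, not a DAB feature: an external probe records whether each entity query succeeded and reports the service as unhealthy once any entity has failed a number of times in a row. The class name and the threshold of 3 are assumptions.

```python
from collections import defaultdict


class PersistentErrorTracker:
    """Track consecutive failures per GraphQL path (hypothetical external probe)."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold          # consecutive failures that count as "stuck"
        self._streaks = defaultdict(int)    # path -> current failure streak

    def record(self, path: str, ok: bool) -> None:
        # A success resets the streak, so isolated errors never trip the check.
        if ok:
            self._streaks[path] = 0
        else:
            self._streaks[path] += 1

    def stuck_paths(self) -> list[str]:
        return sorted(p for p, n in self._streaks.items() if n >= self.threshold)

    def healthy(self) -> bool:
        return not self.stuck_paths()


tracker = PersistentErrorTracker(threshold=3)
for _ in range(3):
    tracker.record("copernicusSlope_by_pk", ok=False)
tracker.record("hadUKgroundfrost_by_pk", ok=False)  # a single failure is tolerated

print(tracker.healthy(), tracker.stuck_paths())
```

Resetting on success is what separates the acceptable occasional error from the stuck state described above; the result of `healthy()` is the kind of signal an orchestrator's liveness check could use to restart the container.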
I analyzed this a bit, and it seems that it could be caused by the metastore that is accessed within the Cosmos provider ... the stack trace points to a concurrency issue when accessing the dictionary.
GetIdAndPartitionKey accesses the metastore via a dictionary.
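For context, this message is the one a .NET `Dictionary<TKey,TValue>` raises when it detects that unsynchronized concurrent writes have corrupted it; the usual remedies there are a `ConcurrentDictionary` or a lock around the shared map. The sketch below only illustrates that general pattern in Python and is not DAB's code: `MetadataCache`, `load`, and the returned value are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MetadataCache:
    """Hypothetical stand-in for a lazily filled, shared per-entity metadata map."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._entries = {}

    def get(self, entity: str) -> str:
        # The check and the insert happen under one lock, so only one
        # thread ever mutates the map for a given key.
        with self._lock:
            if entity not in self._entries:
                self._entries[entity] = self._loader(entity)
            return self._entries[entity]


calls = []


def load(entity: str) -> str:
    calls.append(entity)            # runs under the cache lock
    return f"metadata-for-{entity}"  # hypothetical value


cache = MetadataCache(load)
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(cache.get, ["copernicusSlope"] * 200))

print(len(calls), results[0])
```

A single coarse lock is the simplest option; if the loader is slow, per-key locking or a concurrent map would reduce contention at the cost of more code.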
Version
1.4.27.0
What database are you using?
CosmosDB NoSQL
What hosting model are you using?
Container Apps
Which API approach are you accessing DAB through?
GraphQL