Open
Description
Steps to reproduce the issue:
This is hard to reproduce because it happens only occasionally.
What we observed is as follows.
- All queue endpoints were suddenly removed at once, for an unknown reason.
- The DataManagementService tried to put the queue endpoint again, but data had already been written under that key.
- So it removed the queue, assuming another queue was already running.
- But no other queue was running, and the existing queue endpoint pointed to the same scheduler.
- As a result, there was no queue, but the queue endpoint data remained in etcd.
- Since the endpoint in etcd pointed to a scheduler without a queue, activations were sent to that scheduler repeatedly.
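
The sequence above can be sketched as a small simulation. This is only an illustration under assumptions, not the actual OpenWhisk code: `FakeKV`, `Scheduler`, `reregister`, and the `check_owner` flag are hypothetical names, and an in-memory dict stands in for etcd. With `check_owner=False` the scheduler drops its queue whenever the key already exists (the observed behaviour); with `check_owner=True` it first checks whether the existing endpoint is its own.

```python
class FakeKV:
    """In-memory stand-in for etcd (hypothetical, not a real etcd client)."""

    def __init__(self):
        self.data = {}

    def put_if_absent(self, key, value):
        """Write value only if key is missing. Returns (created, stored_value)."""
        if key in self.data:
            return False, self.data[key]
        self.data[key] = value
        return True, value


class Scheduler:
    """Hypothetical scheduler that tracks which queues it is running."""

    def __init__(self, name, kv):
        self.name = name
        self.kv = kv
        self.queues = set()

    def reregister(self, key, check_owner):
        created, owner = self.kv.put_if_absent(key, self.name)
        if created:
            return  # endpoint written, queue keeps running
        if check_owner and owner == self.name:
            return  # the existing endpoint is ours, so keep the queue
        # Assumes another scheduler owns the queue and stops the local one.
        self.queues.discard(key)


def simulate(check_owner):
    kv = FakeKV()
    scheduler = Scheduler("scheduler0", kv)
    key = "queue/namespace/action"  # made-up key layout

    # Normal state: queue is running and its endpoint is registered.
    scheduler.queues.add(key)
    kv.data[key] = scheduler.name

    # Endpoint data is abnormally removed, then reappears with the same owner
    # before the scheduler's own re-registration runs.
    del kv.data[key]
    kv.data[key] = scheduler.name

    scheduler.reregister(key, check_owner)
    return key in kv.data, key in scheduler.queues


if __name__ == "__main__":
    print("without owner check:", simulate(False))
    print("with owner check:   ", simulate(True))
```

Without the owner check, the simulation ends with the endpoint still registered but no queue behind it, which matches the state described above. Whether comparing the stored endpoint with the local scheduler is the right fix for the real service is an open question.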
Additional information you deem important:
- We need to test the case where etcd data is abnormally removed and then recovered.