DiskPool stuck Creating #1823
Comments
Hi @Mefinst, could you please try the latest version of openebs (4.3.0)?
How can I obtain 4.3.0?
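(For reference, a minimal sketch of upgrading the umbrella chart, assuming it was installed as a Helm release named openebs in the openebs namespace and that 4.3.0 is published to the chart repo; names and values flags may differ in your setup:)

```shell
# Assumes the OpenEBS umbrella chart is installed as release "openebs"
# in namespace "openebs"; adjust the release name and namespace to your install.
helm repo update
helm upgrade openebs openebs/openebs -n openebs --version 4.3.0 --reuse-values
```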
Sorry, it's actually the
Those logs from version
Could you please share a support bundle for the clean install?
Hey, this is still mayastor v2.7.1, are you sure you're on openebs 4.2?
Yep. That was 4.1.1. Here is a new one from 4.2.0.
Hmm, indeed something is quite wrong.
Deleting io-engine is not helpful, though I collected the support info after I deleted io-engine.
Could you please scale down the agent-core deployment?
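(A minimal sketch, assuming the default openebs namespace and an agent-core deployment named openebs-agent-core; the actual name may differ in your install:)

```shell
# Scale the core agent down (and later back up); the deployment name and
# namespace are assumptions, check them with "kubectl get deploy -n openebs".
kubectl -n openebs scale deployment openebs-agent-core --replicas=0
kubectl -n openebs scale deployment openebs-agent-core --replicas=1
```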
Is there anything else to try?
Hmm, I think there might be some problem with multiple slow pools at the same time.
Same result. I had also tried that before filing this issue. I also posted a link to openebs/openebs#3820, another issue where timeouts were a problem for large/slow disks. There was some kind of solution there that involved increasing timeouts via CLI arguments, but whether or not that would help, I don't even know where I should pass those.
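(One place to look is the arguments the core agent is started with; a sketch of inspecting them, where the deployment name and namespace are assumptions:)

```shell
# Print the CLI arguments of the agent-core container to locate any
# timeout-related flags (deployment name/namespace are assumptions).
kubectl -n openebs get deployment openebs-agent-core \
  -o jsonpath='{.spec.template.spec.containers[0].args}'
```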
Recreated the agent-core deployment using the config parameters from the previous message. Both pools became online. Could you provide some insight into the potential side effects of such a configuration?
The downside is that some operations may take a very long time to fail or may get stuck for a long time.
Yep. Do "some operations" include volume attachment/detachment during pod creation/restarts and IO operations in the volume?
Not IO operations, but control-plane operations like volume create and publish. We still need to track down why the latest build didn't work with the default parameters, but we haven't had time to delve into this.
@Mefinst could you test the scenario once again on version 4.2, by creating a pool on a similar block device? Share the logs if you are able to see the same issue.
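(For reference, a minimal sketch of creating such a pool; the apiVersion, node name, and device path below are placeholders and may need adjusting for your cluster and release:)

```shell
# Hypothetical DiskPool manifest; node, disk path, and apiVersion are assumptions.
kubectl apply -f - <<EOF
apiVersion: openebs.io/v1beta2
kind: DiskPool
metadata:
  name: pool-on-node-1
  namespace: openebs
spec:
  node: worker-node-1
  disks: ["/dev/sdb"]
EOF
```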
Deleted DiskPools. mayastor-2025-04-08--11-15-22-UTC.tar.gz It's for the same two drives; I have only those to test on now.
Describe the bug
DiskPool resources hang in the Creating state because the IO-engine is unable to send an Ok response after the pool has in fact been created.
IO-engine pod logs
From those I conclude that the IO-engine successfully creates or destroys a pool when requested to do so.
The create_pool method times out. Because it times out, the IO-engine fails to send the success response.
The operator sends commands one after another to destroy and import the pool: it destroys the previously created pool, then tries to import it, then tries to create a new one, but hits the timeout again.
I believe the DiskPool operator should not send DestroyPoolRequest commands during the creation process.
DiskPool operator logs
Not very informative. They contain the same messages that are posted to kubectl describe diskpool.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The DiskPool is created and comes online.
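(One way to confirm this, assuming the openebs namespace:)

```shell
# Watch the DiskPool resources until the state leaves Creating.
kubectl -n openebs get diskpools -w
```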
OS info:
I use the OpenEBS chart, versions 4.1.1, 4.1.2, and 4.2.0, with default values.
I cannot disclose infrastructure details due to an NDA.