
Race between blkid and destroy_vbd_frontend can cause hang #41

Open
@zultron

On EL6:

When building a PV VM with pygrub, create_vbd_frontend attaches the VM's boot block device to dom0 for pygrub to operate on. This triggers udev to start blkid.

If blkid does not finish before pygrub, destroy_vbd_frontend will fail to close the device, since blkid is holding it open.
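To confirm that a process such as blkid really is holding the device open at the moment of failure, something along these lines could be run in dom0. This is only a diagnostic sketch: the `holders` helper and its `proc_root` parameter are names made up for illustration, and it assumes a Linux-style `/proc/<pid>/fd` layout.

```python
import os

def holders(path, proc_root="/proc"):
    """Return the PIDs that have `path` open, by scanning <proc_root>/<pid>/fd.

    `holders` and `proc_root` are hypothetical names for this sketch;
    nothing like this exists in the toolstack as far as I know.
    """
    target = os.path.realpath(path)
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            # Each fd entry is a symlink to whatever the process has open.
            if os.path.realpath(os.path.join(fd_dir, fd)) == target:
                pids.append(int(entry))
                break
    return pids
```

Calling `holders()` on the attached block device path just before destroy_vbd_frontend runs would show whether blkid is still among the open handles.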

After this, things go badly wrong. The task hangs, the VDI remains attached to dom0, and the blkid process can't be killed. A reboot is required, but the reboot itself hangs when stopping the 'blk-availability' service, so the host must be power cycled.

The following links suggest running something like 'udevadm settle', which will wait for the udev event queue to empty, and then exit:

https://www.redhat.com/archives/libguestfs/2012-February/msg00023.html

https://rwmj.wordpress.com/2012/01/19/udev-unexpectedness/#content

As a cheap hack, I added a 'udevadm settle' call to the end of the pygrub script, and the problem seems to have disappeared. Of course pygrub isn't the right place for this, but I'm not sure what is. The above links suggest it's possible to run 'udevadm settle' too early, before the event has been placed in the udev queue, so perhaps the call belongs in destroy_vbd_frontend.
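For illustration, the "settle before detaching" idea might look roughly like the following. This is a sketch under assumptions, not the actual toolstack code: `settle_command`, `settle_then`, and the `detach` callable are hypothetical names, and `detach` stands in for whatever destroy_vbd_frontend does to release the device. The only real external command used is `udevadm settle` with its `--timeout` option.

```python
import subprocess

def settle_command(timeout_s=30):
    """Build the argv for 'udevadm settle' with a bounded wait."""
    return ["udevadm", "settle", "--timeout=%d" % timeout_s]

def settle_then(detach, timeout_s=30, run=subprocess.call):
    """Wait for the udev event queue to drain, then call detach().

    `detach` is a hypothetical callable standing in for the real
    device teardown; `run` is injectable so the sketch can be
    exercised without udevadm installed.
    """
    rc = run(settle_command(timeout_s))
    # A non-zero status means the queue did not drain in time. The
    # detach is still attempted, and the status is returned so the
    # caller can log it.
    return rc, detach()
```

The timeout is there so that a stuck udev queue cannot turn into yet another indefinite hang. Whether 30 seconds is a sensible default is a guess.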
