Skip to content

Add interrupt traits and a KVM based implementation #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

jiangliu
Copy link
Member

This PR add consts/structs/traits to manage interrupt sources for backend devices.
It also provides a KVM hypervisor based implementation for x86 legacy interrupts, PCI MSI/MSI-x interrupts.

@jiangliu
Copy link
Member Author

Please refer to #8, #6 for related discussions.
Needs help from ARM experts to make it work on ARM/ARM64 platforms.

@sameo
Copy link

sameo commented Oct 28, 2019

@jiangliu Does that PR replace #8 ?

@jiangliu
Copy link
Member Author

@jiangliu Does that PR replace #8 ?

Yes, #11 and #9 replaces #8 .

Switch to rust 2018 edition and turn on deny(missing_docs).

Signed-off-by: Liu Jiang <[email protected]>
Introduce traits InterruptManager and InterruptSourceGroup to manage
interrupt sources for virtual devices.

Signed-off-by: Liu Jiang <[email protected]>
Signed-off-by: Bin Zha <[email protected]>
@jiangliu jiangliu force-pushed the irq_v2 branch 2 times, most recently from ef4fa92 to 355e1a8 Compare November 7, 2019 09:20
Comment on lines +49 to +50
//! * the VMM creates a device manager, passing on an reference to the interrupt manager
//! * the device manager passes on an reference to the interrupt manager to all registered devices
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After we deciding whether device manager is responsible for passing the reference of interrupt manager to each registered devices, we can update the comments or the PR #12.

Copy link

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but nice work :)

Implement infrastructure to manage interrupt sources based on Linux KVM
kernel module.

Signed-off-by: Liu Jiang <[email protected]>
Signed-off-by: Bin Zha <[email protected]>
Copy link
Member

@andreeaflorescu andreeaflorescu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few general concerns here:

  1. vm-device has too many dependencies
  2. it looks very x86_64 and linux specific. @petrutlucian94 can you offer some insight into interrupts on Windows? Is the trait generic enough to work for your usecase as well?
  3. At the design level it looks like the current trait doesn't follow the open-closed design principle. Specifically, if you need to add a new interrupt type, you need to change the following things:
  • define a new InterruptSourceType with the name of your interrupt
  • define a new InterruptSourceConfig

I haven't have the time to dive deep into how interrupts work, but before merging this I would like to take some more time and discuss to see if we can get to a simpler Interrupt interface that can be useful for Firecracker as well. I find this of particular importance since we want the Virtio implementation to depend on the Interrupt trait.

Comment on lines +11 to +12
kvm-bindings = { version = ">=0.1.1, <1.0", optional = true }
kvm-ioctls = { git = "https://github.com/rust-vmm/kvm-ioctls.git", branch = "master", optional = true }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not particularly happy with adding dependencies on kvm related wrappers here. Even though you specify them as optional, their meta-data would still be pulled even if you don't use them. Furthermore, it makes the vm-device a very coupled crate. If there is a way to decouple this, I would prefer to not have these dependencies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the granularity for rust-vmm crates. We may have separate interface definition and implementation crates.
My suggestion is that:

  1. integrate interface and implementation within the same crate if there's a majority implementation existing. Vm-memory and interrupt falls into this case.
  2. separate interface definition and implementations if there are multiple possible implementations. VmDevice(DeviceIo) falls into this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split this PR in two? One PR that adds the interface so people can experiment with it and one PR for actual implementation?

#[allow(clippy::trivially_copy_pass_by_ref)]
pub trait InterruptSourceGroup: Send + Sync {
/// Get type of interrupt sources managed by the group.
fn interrupt_type(&self) -> InterruptSourceType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that having the interrupt_type accessible from outside defeats the purpose of having a generic trait for working with interrupts. Unfortunately this means that in the crates depending on InterruptSourceGroup instead of counting only on the trait interface, you will use the type to alter the parameters of trait functions. This already happens with the current proposal of Virtio: https://github.com/rust-vmm/vm-virtio/blob/5b37629a8edae71004d11f17da5b0b0856531bb3/src/device.rs#L59

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have some other technical ways to solve this issue, such as change

fn trigger(&self, index: InterruptIndex, flags: u32)

as

fn trigger(&self, index: InterruptIndex, subindex: u32)

The Intel interrupt remapping tech has used this style terms.
But the device backend drivers need to know about interrupt working mode, it's not easy to hide all interrupt details from device backend drivers.

ty: InterruptSourceType,
base: InterruptIndex,
count: InterruptIndex,
) -> Result<Arc<Box<dyn InterruptSourceGroup>>>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the caller of create_group make sure to add the InterruptSourceGroup in an Arc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The InterruptManager implementation needs to hold a reference to Arc<Box> for host keeping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I still don't understand why create_group can't return InterruptSourceGroup and the caller of create_group can wrap the InterruptSourceGroup in an Arc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When creating a new interrupt group, the new interrupt group object will be inserted into a hash map for house-keeping. So we need to maintain the ownership by either:

  1. use Arc::clone()
  2. use reference.
    And it's a little complex to handle lifetime parameter when using reference. So an <Box> object is returned.
    fn create_group(
        &self,
        type_: InterruptSourceType,
        base: InterruptIndex,
        count: InterruptIndex,
    ) -> Result<Arc<Box<dyn InterruptSourceGroup>>>;

struct KvmIrqManagerObj {
    vmfd: Arc<VmFd>,
    routes: Arc<KvmIrqRouting>,
    groups: HashMap<InterruptIndex, Arc<Box<dyn InterruptSourceGroup>>>,
    max_msi_irqs: InterruptIndex,
}

/// interrupt source as a distinct InterruptSource.
#[allow(clippy::len_without_is_empty)]
#[allow(clippy::trivially_copy_pass_by_ref)]
pub trait InterruptSourceGroup: Send + Sync {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super sure how this would work with aarch64 and I would need more time to test it out. Did you run any experiments on aarch64 as well? From the naming it looks coupled to the way x86_64 works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet. I have had a quick glance at GICv3 spec, which look similar to x86 interrupt architecture.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have refined the implementation to better support arm/arm64, could you please help to take a look? @andreeaflorescu

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point me to the changes? GitHub is not very good at showing diffs between versions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically I have moved legacy irq related code from src/interrupt/kvm/mod.rs into src/interrupt/kvm/legacy_irq.rs with better support of ARM/ARM64. For ARM/ARM64, the MSI part should be the same, and hopefully we only need to implement initialize_legacy().

    }

    #[cfg(any(target_arch = "aarch", target_arch = "aarch64"))]
    pub(super) fn initialize_legacy(
        _routes: &mut HashMap<u64, kvm_irq_routing_entry>,
    ) -> Result<()> {
        //TODO
        Ok(())
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we are reaching the real interesting parts now:)
Essentially the PR defines two sets of features:

  1. feature bit to control supported interrupt source types:
    [features]
    legacy_irq = []
    msi_irq = []
    pci_msi_irq = ["msi_irq"]
    And there's one more coming feature bit of this class:
    generic_msi_irq = ["msi_irq"]
    So we could support three types of interrupt sources: legacy, pci msi and generic msi.
  2. feature bit to control the implementations of InterruptManager/InterruptSourceGroup.
    kvm_irq = ["kvm-ioctls", "kvm-bindings"]
    The 'kvm_irq' enables a default implementation of InterruptManager/InterruptSourceGroup based KVM kernel module. If the vmm has special requirements, it may provide its own implementation by disabling the kvm_irq feature.

For the vm-virtio, there are three possible combinations:

  1. use legacy irq only. The crosvm/firecracker project uses this mode.
  2. use msi irq only. The cloud hypervisor works in this mode.
  3. use both legacy irq and msi irq. Our dragonball vmm works in this mode, and we plan to switch to msi irq only mode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a complex problem, I would really like to insist on splitting this PR. It looks like we have a few loose ends that we need to decide on. Let's first decide on the interface and have a separate PR only with that. Can you please add a PR only with the interface definition?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have different opinion here.
A PR with both interface definition and concrete implementation may give more information about the design, and it's easier for community to do experiments with the design.
If some design flaw has been discovered during experiments, we could easily change the interface design and implementation, or even revert the PR.
The vm-device is still in early stage, it would be better to share design and implementation sooner, otherwise we could not speed up the progress.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not against prototyping. This is something that we discussed about a few times already.

The proposal that everyone seemed to be on board with was to have a PR that defines the interfaces and nothing more. Functionality was to be added in subsequent PRs.

The initial PR can include external links to prototypes with the interface being used in practice. Having such a large PR introduces a few problems:

  1. Delays in reviewing and merging the code.
  2. Key design points of the interface can be missed because we are not insisting on the interface.
  3. Large PRs such as this one tend to reach around 100 comments. After a few iterations the comments get outdated, you cannot possibly go through all of them to see if they were fixed.

///
/// If the interrupt has an associated `interrupt_status` register, all bits set in `flag`
/// will be atomically ORed into the `interrupt_status` register.
fn trigger(&self, index: InterruptIndex, flags: u32) -> Result<()>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit weird that depending on the InterruptSourceType either index or flags is used. It looks to me that trigger cannot be abstract it for all interrupt types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a tradeoff. For pin-based IRQ, it almost have an associated status register. So the extra parameter flag is used to share the common code to manipulate the status register. If we skip the parameter flags, each driver will need to implement the code to manipulate status register repeatedly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code to manipulate the status register is usually one line. So I think that the benefit of having the flags parameter removed is higher than the trouble of handling this outside.

Copy link
Member Author

@jiangliu jiangliu Nov 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to hide interrupt delivery details from device drivers.
Take virtio-net as an example, the virtio-net driver only cares to trigger an interrupt to notify the guest, and the virtio transport layer decides the way to deliver the interrupt. It doesn't make sense for the virtio-net driver to maintain and manipulate the interrupt status bit flags.
Or it's not only about lines of code, but also information encapsulation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with encapsulating the information, but I don't think this should be encapsulated at the vm-device level because not all devices need flags. Virtio devices need flags and interrupt_status, so instead of forcing this interface for interrupts on all devices, we can define an interrupt wrapper in vm-virtio. Looking at the implementation of legacy devices for example, it seems like flags is never going to be used. It also looks like the only thing that legacy interrupts and MSI have in common is trigger. I've been playing a bit with separating this interface in multiple specialized interfaces. I hope I'll have a prototype by the end of this week.

@jiangliu
Copy link
Member Author

I have a few general concerns here:

  1. vm-device has too many dependencies
    It's a valid state. Currently we have dependency tree as:
    vm-device v0.1.0 (/ws/src/vmm/rust-vmm/vm-device.git)
    ├── kvm-bindings v0.2.0
    ├── kvm-ioctls v0.3.0 (https://github.com/rust-vmm/kvm-ioctls.git#681745a7)
    │ ├── kvm-bindings v0.2.0 ()
    │ ├── libc v0.2.62
    │ └── vmm-sys-util v0.2.0
    │ └── libc v0.2.62 (
    )
    ├── libc v0.2.62 ()
    ├── vm-memory v0.1.0 (https://github.com/rust-vmm/vm-memory#8669369d)
    │ ├── cast v0.2.2
    │ └── libc v0.2.62 (
    )
    └── vmm-sys-util v0.2.0 (*)
    Among those, the vm-memory could be removed by using u64 instead of GuestMemoryAddress.
    kvm-bindings/kvm-ioctls are gated by features.
  1. it looks very x86_64 and linux specific. @petrutlucian94 can you offer some insight into interrupts on Windows? Is the trait generic enough to work for your usecase as well?
    Any plan to support HyperV in addition to KVM?
  1. At the design level it looks like the current trait doesn't follow the open-closed design principle. Specifically, if you need to add a new interrupt type, you need to change the following things:
  • define a new InterruptSourceType with the name of your interrupt
  • define a new InterruptSourceConfig
    The interrupt is a relative stable subsystem, it doesn't evolve as quick as the VirtIo subsystem.
    So I choose to keep things straightforward.

I haven't have the time to dive deep into how interrupts work, but before merging this I would like to take some more time and discuss to see if we can get to a simpler Interrupt interface that can be useful for Firecracker as well. I find this of particular importance since we want the Virtio implementation to depend on the Interrupt trait.

Copy link

@michael2012z michael2012z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments mainly from the point of view of ARM.

Implement InterruptSourceGroup trait to manage x86 legacy interruts.
On x86 platforms, pin-based device interrupts connecting to the master
PIC, the slave PIC and IOAPICs are named as legacy interrupts. For
legacy interrupts, the interrupt routing logic are manged by the
PICs/IOAPICs and the interrupt group logic only takes responsibility
to enable/disable the interrupts.

Signed-off-by: Liu Jiang <[email protected]>
Signed-off-by: Bin Zha <[email protected]>
@jiangliu jiangliu force-pushed the irq_v2 branch 2 times, most recently from 613c326 to 4c57c28 Compare November 21, 2019 02:30
juliusxlh and others added 4 commits November 21, 2019 10:46
With some kvm version, setting irq_routing for non-existing legaccy
IRQs may cause system crash. So limit the number to available legacy
interrupts.

Signed-off-by: 守情 <[email protected]>
Introduce generic mechanism to support message signalled interrupts
based on KVM hypervisor.

Signed-off-by: Liu Jiang <[email protected]>
Signed-off-by: Bin Zha <[email protected]>
Implement interrupt source driver to manage PCI MSI/MSI-x interrupts.

Signed-off-by: Liu Jiang <[email protected]>
Signed-off-by: Bin Zha <[email protected]>
@jiangliu jiangliu force-pushed the irq_v2 branch 2 times, most recently from f915a2e to f3bd7b1 Compare November 21, 2019 03:30
Copy link

@liujing2 liujing2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One concern for me is, we can see that every device that uses MSI would have a irq_routing: Arc<KvmIrqRouting> which shows the whole routing information of system.
Is it available to keep such info in KvmIrqManager and when a device wants to add new routing entries,
call function to do that?

@jiangliu
Copy link
Member Author

One concern for me is, we can see that every device that uses MSI would have a irq_routing: Arc<KvmIrqRouting> which shows the whole routing information of system.
Is it available to keep such info in KvmIrqManager and when a device wants to add new routing entries,
call function to do that?

It's an implementation detail, both PciMsiIrq and KvmIrqRouting are hidden from device backend drivers.

@jiangliu
Copy link
Member Author

As we have discussed, close this one in perfer of PR #21

@jiangliu jiangliu closed this Jan 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants