amdgpu fixes for rpi5 #6947

Open
wants to merge 3 commits into
base: rpi-6.12.y

@pepijndevos pepijndevos commented Jul 8, 2025

This cleans up the commit history of the amazing work by @Coreforge to make AMD GPUs work with the Raspberry Pi 5, following instructions by @6by9: geerlingguy/raspberry-pi-pcie-devices#222 (comment) and an explanation of the different parts of the patch: geerlingguy/raspberry-pi-pcie-devices#222 (comment)

So I've made

  • one commit with all the memset changes, which to my understanding could potentially be upstreamed to mainline Linux, since they are simply more correct.
  • one commit with a miscellaneous ttm_uncached change that may need an #ifdef to make it ARM-only
  • one commit with all the volatile changes, which are not that invasive but which Coreforge suggested might also need to be wrapped in an #ifdef for mainline acceptance
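
The thread does not show what the memset and volatile changes look like in the driver. As a rough illustration only, assuming the concern is that a plain memset() or compiler-merged stores can issue accesses that a device mapping does not tolerate, the userspace sketch below fills a buffer using only naturally aligned volatile stores. The helper name `fill_io_aligned` is hypothetical and does not come from the patch set; it is a sketch of the general idea, not the code in these commits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): fill a buffer using only
 * naturally aligned stores, the way one might treat a device mapping.
 * volatile keeps the compiler from merging the stores or replacing the
 * loops with a call to memset().
 */
static void fill_io_aligned(volatile void *dst, uint8_t value, size_t len)
{
    volatile uint8_t *p = dst;

    /* Leading bytes: single-byte stores until p is 8-byte aligned. */
    while (len > 0 && ((uintptr_t)p & 7u) != 0) {
        *p++ = value;
        len--;
    }

    /* Bulk: aligned 64-bit stores of the byte repeated eight times. */
    uint64_t pattern = 0x0101010101010101ull * value;
    volatile uint64_t *q = (volatile uint64_t *)p;
    while (len >= 8) {
        *q++ = pattern;
        len -= 8;
    }

    /* Trailing bytes: single-byte stores for whatever is left. */
    p = (volatile uint8_t *)q;
    while (len > 0) {
        *p++ = value;
        len--;
    }
}

/* Self-check: fill an unaligned sub-range and verify its neighbours are untouched. */
static void fill_io_aligned_selfcheck(void)
{
    uint64_t backing[6];                 /* 48 bytes, 8-byte aligned */
    uint8_t *buf = (uint8_t *)backing;

    for (size_t i = 0; i < sizeof backing; i++)
        buf[i] = 0xEE;

    fill_io_aligned(buf + 3, 0xAB, 29);  /* covers bytes 3..31 */

    for (size_t i = 0; i < sizeof backing; i++)
        assert(buf[i] == ((i >= 3 && i < 32) ? 0xAB : 0xEE));
}
```

Inside the kernel the equivalent job would normally be left to the existing I/O accessors rather than a hand-written loop; the sketch only shows why a plain memset() and an explicit aligned fill are not interchangeable on such a mapping.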

What is not included at the moment is the whole alignment machinery, which to my understanding is more hacky and could be harder to get merged, or might require significant changes. I'm not sure how essential that change is, but if desired I could include it as a separate commit as well. Or maybe the Ampere version of that trap could be used. FWIW, in limited testing llama.cpp seems to work equally well without that patch applied.

Just to be clear, I don't claim any authorship or even understanding of these changes, and am just trying to grease the wheels of getting these changes upstreamed as far as they will go, making it easier to use GPUs on Raspberry Pi, which I have a big interest in: https://sanctuary-systems.com/sentinel-core/

@Coreforge

The alignment trap isn't needed if all userspace programs respect the alignment requirements. I guess llama.cpp might do that, so it works without it. I found that Xorg did need it (or a userspace workaround such as the memcpy library), even in the arm64 build. The Ampere version should work as well, although I haven't tried it. The Ampere version also covers kernelspace, while mine only covers userspace, so more cards might work with it without extra changes, but at the cost of some performance (whether that would be noticeable or not, I don't know).
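
To make "respecting the alignment requirements" concrete, here is a minimal userspace sketch, assuming the requirement is that every access to the device mapping be naturally aligned. The function `copy_to_io_aligned` is a hypothetical name and is not the memcpy library mentioned above; it only illustrates the kind of copy a well-behaved program would perform so that no trap is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy from ordinary RAM into a destination that is
 * treated like a device mapping. Every store to dst is naturally aligned;
 * the source may be unaligned because it is normal memory.
 */
static void copy_to_io_aligned(volatile void *dst, const void *src, size_t len)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    /* Byte stores until the destination is 8-byte aligned. */
    while (len > 0 && ((uintptr_t)d & 7u) != 0) {
        *d++ = *s++;
        len--;
    }

    /* Aligned 64-bit stores; memcpy() handles a possibly unaligned source. */
    volatile uint64_t *d64 = (volatile uint64_t *)d;
    while (len >= 8) {
        uint64_t word;
        memcpy(&word, s, sizeof word);
        *d64++ = word;
        s += 8;
        len -= 8;
    }

    /* Remaining tail, one byte at a time. */
    d = (volatile uint8_t *)d64;
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
}

/* Self-check: copy between deliberately misaligned offsets. */
static void copy_to_io_aligned_selfcheck(void)
{
    uint64_t backing[6];                 /* stands in for the device mapping */
    uint8_t *dbuf = (uint8_t *)backing;
    uint8_t src[40];

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (uint8_t)i;
    memset(dbuf, 0xEE, sizeof backing);

    copy_to_io_aligned(dbuf + 5, src + 1, 27);   /* fills dbuf[5..31] */

    for (size_t i = 0; i < sizeof backing; i++) {
        uint8_t want = (i >= 5 && i < 32) ? (uint8_t)(i - 4) : 0xEE;
        assert(dbuf[i] == want);
    }
}
```

A program that routes all its device writes through something like this never issues an unaligned access to the mapping, which is the case where the trap handler would not be exercised.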

@popcornmix
Collaborator

These would be best submitted upstream, where the devs who actually understand this driver can comment on whether the patches are correct (or could be achieved in better ways). See submitting patches.

No objection to leaving this PR here for information for other interested users, but we are unlikely to merge it as a downstream-only patch. If any patches are accepted upstream, we're generally happy to cherry-pick them to get them into trees sooner.

@pepijndevos
Author

pepijndevos commented Jul 11, 2025

The impression I got from the @6by9 comment linked above is that you could maybe guide us on what might be needed to upstream these changes.

> If you could have a first pass at tidying it up on rpi-6.12.y, and create a PR against raspberrypi/linux rpi-6.12.y, then we can give pointers on what is needed.

I'd be willing to spearhead that effort, but could use some guidance from people more familiar with kernel development, because it sounds like a more legislative process than submitting a PR ;)

@popcornmix
Collaborator

Yeah, @6by9 is probably a better guide - he's succeeded in upstreaming a number of patches recently.
