Update to cuTENSOR 2.0 #160

Merged

github-actions[bot]
Contributor

This pull request changes the compat entry for the cuTENSOR package from 1 to 1, 2.
This keeps the compat entries for earlier versions.

Note: I have not tested your package with this new compat entry.
It is your responsibility to make sure that your package tests pass before you merge this pull request.
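
For reference, a compat change of this kind would typically show up in the package's Project.toml roughly as follows. This is an illustrative sketch, not the actual diff from this pull request, and the rest of the file is omitted.

```toml
[compat]
# before this pull request: cuTENSOR = "1"
cuTENSOR = "1, 2"  # allows both the 1.x and the 2.x release series
```

Widening the entry only tells the package manager that cuTENSOR 2 may be installed; it does not by itself make the package code compatible with the new interface.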

github-actions bot force-pushed the compathelper/new_version/2024-01-19-01-22-28-486-00885519492 branch from 6506fa2 to b95e83a on January 19, 2024 01:22

ejmeitz commented Jan 31, 2024

@Jutho could you please push this through? Thanks.

Owner

Jutho commented Jan 31, 2024

I think @lkdvos checked and noticed that we cannot simply do this without having to also update our implementation/package extension, due to breaking changes.

Collaborator

lkdvos commented Feb 1, 2024

Sadly, it's quite a bit of work, since cuTENSOR has changed its interface. It's definitely somewhere on my to-do list, but for now I think cuTENSOR v1 works just fine?


ejmeitz commented Feb 1, 2024

I believe cuTENSOR restricts me to GPUArrays 9, which has a double-free memory issue when using multiple threads. I was hoping to update to GPUArrays 10, but I believe this compat entry is holding me back.

If it's a big change, don't worry: my code still runs, albeit with a bunch of errors printed out.


ejmeitz commented Feb 9, 2024

Apparently this issue can cause crashes. To be clear this only happens when using TensorOperations (and more specifically GPUArrays.jl) inside of multiple separate threads. In my case I have one thread per GPU.

Collaborator

lkdvos commented Feb 10, 2024

I started some work on moving to the new interface. I think it should be working for plain CuArrays, but I am still deciding on how to implement views/stridedviews, so for now that will have to wait.
If you try it out, do let me know if there are any obvious errors.

Collaborator

lkdvos commented Feb 10, 2024

I also just noticed that cuTENSOR 2 requires Julia 1.8, which I am not too happy about. I think this means we either need to keep two different versions of TensorOperations (one for Julia 1.6-1.7 with cuTENSOR 1 and one for Julia 1.8+ with cuTENSOR 2), or I would have to come up with a way of keeping the old code when Julia is below 1.8. Maybe we can also consider restricting to Julia 1.8, but I didn't see the need to do that here just yet.
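
One generic way to keep an old code path for older Julia versions is a load-time branch on `VERSION`. The following is only a minimal sketch of that idea and not the approach taken in this pull request; `backend_name` is a hypothetical placeholder and not a TensorOperations function.

```julia
# Hypothetical sketch: choose an implementation at load time based on the Julia version.
# `backend_name` is an illustrative placeholder standing in for version-specific code.
@static if VERSION >= v"1.8"
    # On Julia 1.8 and later, the code written against the cuTENSOR 2 interface would go here.
    backend_name() = "cuTENSOR 2 interface"
else
    # On older Julia versions, the existing cuTENSOR 1 code would be kept here.
    backend_name() = "cuTENSOR 1 interface"
end

println(backend_name())
```

Because `@static` evaluates the condition when the code is loaded, only one of the two branches is ever compiled, at the cost of maintaining both code paths in the same source tree.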


ejmeitz commented Feb 11, 2024

I'll test it out, thanks for making some changes!

For the record here is the issue on GPUArrays: JuliaGPU/GPUArrays.jl#503

lkdvos changed the title from "CompatHelper: bump compat for cuTENSOR to 2, (keep existing compat)" to "[WIP] cuTENSOR v2" on Feb 11, 2024
lkdvos force-pushed the compathelper/new_version/2024-01-19-01-22-28-486-00885519492 branch from 3c734ae to 16e865e on February 11, 2024 09:15
lkdvos mentioned this pull request on Mar 29, 2024
Collaborator

lkdvos commented Apr 29, 2024

Awaiting the result of JuliaGPU/CUDA.jl#2356 to simplify the implementation further.

Collaborator

lkdvos commented May 5, 2024

I think this is ready to go in principle. All it requires is the tagged version of the changes in cuTENSOR, so let's wait for that and then get this merged. I think we can still do a minor release upgrade for this, but we should probably get the v5 thing started asap.

lkdvos changed the title from "[WIP] cuTENSOR v2" to "Update to cuTENSOR 2.0" on May 16, 2024
lkdvos requested a review from Jutho on May 16, 2024 06:17
Collaborator

lkdvos commented May 16, 2024

The CUDA updates have been tagged, so this is only waiting for cuTENSOR to tag its updates now.

Some comments:

  • This necessarily changes the minimum Julia version to 1.8.
  • I have changed the TensorOperations version to 4.2.0, making this a minor release. I think we are technically not making any breaking changes, since these are all protected behind compat entries. In principle I could also try to finish the v5 updates as soon as possible, but I would be in favour of getting this merged asap.
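
As a rough illustration of the two points above, the affected Project.toml fields would look something like this. This is a sketch based only on the version numbers mentioned in this comment; all other fields and compat entries are omitted, and the merged file may differ.

```toml
name = "TensorOperations"
version = "4.2.0"  # minor release, per the comment above

[compat]
julia = "1.8"  # minimum Julia version, raised because cuTENSOR 2 requires it
```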

lkdvos force-pushed the compathelper/new_version/2024-01-19-01-22-28-486-00885519492 branch from decdde4 to 0bfc23b on May 28, 2024 08:23
Collaborator

lkdvos commented May 28, 2024

The cuTENSOR updates have been tagged, so this is now all good to go. @Jutho, shall I merge this?

Owner

Jutho commented May 28, 2024

Is it ok if I try to review tonight or tomorrow morning (it's mostly a means for me to see what was changed, so that I can keep track)? If I haven't managed by tomorrow lunchtime, feel free to merge.

lkdvos force-pushed the compathelper/new_version/2024-01-19-01-22-28-486-00885519492 branch from ac76ca8 to 714835a on May 29, 2024 13:47
lkdvos merged commit f047345 into master on May 29, 2024
15 checks passed
lkdvos deleted the compathelper/new_version/2024-01-19-01-22-28-486-00885519492 branch on May 29, 2024 14:34

ejmeitz commented Jun 18, 2024

Is it safe to use this?

Owner

Jutho commented Jun 18, 2024

It should be; we hope to release a v5 of TensorOperations soon.

3 participants