cuTENSOR not working with automatic differentiation #167
Comments
Hi hxjz232! I had a look at this, and it looks like this is indeed a mistake on my end. I am assuming some default argument is being filled in somewhere in the rrules, but this does not work as soon as a backend is specified that is not the default one (this is how [...]). On a separate note, I noticed that this uses VectorInterface for some of the implementations, which by default falls back to a broadcasting operation, which is not necessarily what you want for CuArrays. I'll write a fix for that and update you here once I finish it. In any case, thanks for letting me know that this is broken. I hope to have it fixed asap, as this is definitely something that is wrong on our side of things.
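To illustrate the kind of mistake described above, here is a minimal, package-free sketch of a reverse rule whose pullback silently falls back to a default backend. Every name in it (`toy_contract`, `toy_rrule_buggy`, `toy_rrule_fixed`, `DefaultToyBackend`, `GPUToyBackend`, `CALL_LOG`) is hypothetical and chosen for illustration only. It is not the TensorOperations.jl code, and plain matrix multiplication stands in for a tensor contraction.

```julia
# Hypothetical toy backends; these are NOT TensorOperations.jl types.
abstract type AbstractToyBackend end
struct DefaultToyBackend <: AbstractToyBackend end
struct GPUToyBackend <: AbstractToyBackend end

# Record which backend each low-level call used, so the bug is observable.
const CALL_LOG = Symbol[]

toy_contract(A, B, ::DefaultToyBackend) = (push!(CALL_LOG, :default); A * B)
toy_contract(A, B, ::GPUToyBackend) = (push!(CALL_LOG, :gpu); A * B)
# Convenience method that silently fills in the default backend.
toy_contract(A, B) = toy_contract(A, B, DefaultToyBackend())

# Buggy reverse rule: the pullback forgets to forward `backend`.
function toy_rrule_buggy(A, B, backend::AbstractToyBackend)
    C = toy_contract(A, B, backend)
    pullback(ΔC) = (toy_contract(ΔC, B'), toy_contract(A', ΔC))
    return C, pullback
end

# Fixed reverse rule: the pullback reuses the caller's backend.
function toy_rrule_fixed(A, B, backend::AbstractToyBackend)
    C = toy_contract(A, B, backend)
    pullback(ΔC) = (toy_contract(ΔC, B', backend), toy_contract(A', ΔC, backend))
    return C, pullback
end

A, B = rand(2, 3), rand(3, 4)

empty!(CALL_LOG)
_, pb = toy_rrule_buggy(A, B, GPUToyBackend())
pb(ones(2, 4))
buggy_log = copy(CALL_LOG)   # forward call uses :gpu, pullback calls use :default

empty!(CALL_LOG)
_, pb = toy_rrule_fixed(A, B, GPUToyBackend())
pb(ones(2, 4))
fixed_log = copy(CALL_LOG)   # all three calls use :gpu
```

In this toy both backends compute the same thing, so the mismatch only shows up in the log. With real GPU arrays, a pullback that lands on a CPU-oriented default path would be expected to error or trigger slow scalar indexing instead.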
Jutho/VectorInterface.jl#14 should also get rid of the warning message for scalar indexing with CuArrays. Feel free to re-open an issue if things are still not working the way you expect!
Hi Ikdvos, just FYI, the given code won't pass and gives (in fact the same as before) Error Message
Version Info
But if you switch to
The changes in VectorInterface were not yet tagged, but this should be resolved once this is merged: JuliaRegistries/General#105225
Yes it solves the issue! Thanks for the effort! :)
I ran into difficulties getting my tensor-calculation code to work on a GPU, and it basically comes down to backpropagating through tensor operations. Here is a simplified version of the code.
The given code runs fine if the target function had @tensor. Should I modify my code or wait for later updates? Or is getting cuTENSOR to work with back-propagation impossible to implement in principle?