Change training step to a scalar tensor so it works with CUDA graphs #842
jasooney23 started this conversation in Ideas
Replies: 0 comments
I was experimenting with the custom aggregator in the Turbulent Channel example and wanted to enable CUDA graphs for faster execution. However, `step` is currently passed as a generic `int` from `Trainer._cuda_graph_training_step`, which means that when the CUDA graph is captured, the step value at capture time is the one the graph will use on every later execution.

I.e., if my aggregator's `forward` takes `step` as an argument and the CUDA graph is captured at `step = 20`, then the aggregator will continue to execute with `step = 20`.

My simple fix is just to pass `step` as a Tensor, but I'm not sure if I should submit the change myself or just let someone bundle it as part of a bigger revision? (Sorry, it's my first time participating in open source stuff!)

Thanks 😎
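To make the behavior concrete, here is a rough pure-Python toy model of it. This is not real CUDA graph code and not the library's implementation: `ToyGraph`, `StepBuffer`, and `aggregator_forward` are hypothetical names invented for illustration. The assumption is only that a captured graph replays the calls it recorded with the arguments it saw at capture time, so a plain `int` is frozen while a fixed buffer updated in place (standing in for a scalar tensor) is re-read on every replay.

```python
class ToyGraph:
    """Toy stand-in for a captured graph: records one call with its
    arguments, then replays exactly that recorded call."""

    def __init__(self):
        self._ops = []

    def capture(self, fn, *args):
        # Arguments are stored as they are at capture time.
        self._ops.append((fn, args))

    def replay(self):
        return [fn(*args) for fn, args in self._ops]


class StepBuffer:
    """Toy stand-in for a scalar tensor: one fixed object whose
    contents are overwritten in place between replays."""

    def __init__(self, value=0):
        self.value = value

    def fill_(self, value):
        self.value = value


def aggregator_forward(step):
    # Hypothetical aggregator: simply reports the step it sees.
    return step.value if isinstance(step, StepBuffer) else step


def run(steps, capture_step=20):
    # Variant 1: step captured as a plain int (value is frozen).
    int_graph = ToyGraph()
    int_graph.capture(aggregator_forward, capture_step)

    # Variant 2: step captured as a buffer (contents can change).
    step_buf = StepBuffer(capture_step)
    buf_graph = ToyGraph()
    buf_graph.capture(aggregator_forward, step_buf)

    seen_int, seen_buf = [], []
    for step in steps:
        step_buf.fill_(step)  # update in place before each replay
        seen_int.append(int_graph.replay()[0])
        seen_buf.append(buf_graph.replay()[0])
    return seen_int, seen_buf


seen_int, seen_buf = run([20, 21, 22])
print(seen_int)  # [20, 20, 20]
print(seen_buf)  # [20, 21, 22]
```

In the toy model the `int` variant keeps reporting 20, while the buffer variant follows the real step, which is the effect the proposed change is meant to achieve.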