[MXNET-978] Higher Order Gradient Support for broadcast_to, broadcast_power, power, elemwise_mul and elemwise_sub, and add unit test function check_nth_order_binary #17754
base: master
Conversation
…r and add check check_nth_order_binary
@mxnet-bot run ci [windows-gpu, windows-cpu]
Jenkins CI successfully triggered: [windows-gpu, windows-cpu]
@mxnet-label-bot add [Autograd, Backend, pr-awaiting-review]
@apeforest @kshitij12345 could you please review the PR. Thx.
@mxnet-bot run ci [windows-gpu, windows-cpu]
Jenkins CI successfully triggered: [windows-gpu, windows-cpu]
@mxnet-bot run ci [windows-cpu]
Jenkins CI successfully triggered: [windows-cpu]
@anirudh2290 Could someone review the PR? Thx.
@tobecontinued Thanks for the PR. Apologies for the delayed response.
@kshitij12345 any update? Thx.
@tobecontinued Apologies. Will surely do that before upcoming Monday. Thank you.
Thanks for the great work.
# of the function, and in the gradient function, they manually defined derivative_(i-1)
# and used it to compute derivative_(1) * head_grad_(1).
# This may be a wrong approach, because the gradient of the first gradient should compute
# the grad of derivative_(1) * head_grad_(1) instead of the gradient of derivative_(1).
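To spell out the concern in the quoted comment: for y = f(x) with incoming head gradient h, the first backward pass returns

    g1 = f'(x) * h

so the second backward pass should differentiate the whole product,

    dg1/dx = f''(x) * h    and    dg1/dh = f'(x),

not f''(x) alone.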
I think that is taken care of (though I'll think some more to wrap my head around it). Like in the example below, grad_grad_mid considers the fact that head_grad was used in computing the first gradient:
https://github.com/apache/incubator-mxnet/blob/5fd90249cafd14910633d25d3bc370a522736ac7/src/operator/tensor/elemwise_unary_op_trig.cc#L94-L121
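A minimal sketch of that behavior, assuming the usual mxnet.autograd API (the numeric values are illustrative, not from the PR):

```python
import mxnet as mx
from mxnet import nd, autograd

x = nd.array([0.5, 1.0])
head_grad = nd.array([2.0, 3.0])
x.attach_grad()
with autograd.record():
    y = nd.sin(x)
    # first-order gradient: cos(x) * head_grad, kept differentiable
    x_grad = autograd.grad(y, x, head_grads=head_grad, create_graph=True)[0]
x_grad.backward()
# x.grad is d/dx (cos(x) * head_grad) = -sin(x) * head_grad,
# i.e. the second backward correctly carries head_grad through.
print(x.grad)
```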
I rechecked the code. You are right, check_nth_order_unary and the higher-order grads of the unary ops are correct, because we just used different ways to handle head_grad in check_nth_order_unary and check_nth_order_binary. Thx!
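For context, a hypothetical sketch (not the PR's actual helper) of the kind of second-order check a binary op like elemwise_mul needs:

```python
import mxnet as mx
from mxnet import nd, autograd

x = nd.array([1.0, 2.0, 3.0])
y = nd.array([4.0, 5.0, 6.0])
x.attach_grad()
y.attach_grad()
with autograd.record():
    z = x * y  # elemwise_mul
    # first-order grad w.r.t. x is y; keep it differentiable for the second pass
    dz_dx = autograd.grad(z, x, create_graph=True, retain_graph=True)[0]
dz_dx.backward()
print(x.grad)  # d2z/dx2   = 0
print(y.grad)  # d2z/dy dx = 1
```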
// It will get the wrong dependency for the second backward in CoreOpExecutor::GetBackwardDependency
// due to "return igrad_entries[0].node;", and fail to call CoreOpExecutor::bwd_inputs()
// and CoreOpExecutor::bwd_outputs() due to "CHECK_EQ(backward_.size(), 1U)".
// { "elemwise_mul", "_backward_mul" }
I don't know much about this, but I don't think removing an existing test would be a good idea.
@apeforest @larroy @sxjscience would be better people to make the call.
For more explanation: because the test-case failures are caused by defects in CoreOpExecutor, I want this PR to achieve only one task. Maybe we need another PR or JIRA issue to fix the CoreOpExecutor problem.
Yes. That makes sense.
However, I feel the mentioned people are in a much better position to make the call.
…to higheer-order-gradient-power-elemwise_mul_elemwise_sub