-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
[docs] add performance tip concerning overly-fused broadcast loops #49228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I will have to change the examples if #49182 is merged, as it will change the behavior here. |
This covers two orthogonal topics. One is that broadcast fusion may call a function more times than necessary. I wonder if the right place is in performance tips, like here, as that page has many examples. Perhaps with a sentence added here? A simpler example without slices might be something like this:
The second topic is aliasing & mutating broadcasts. Some |
Good point. The same might be true of the aliasing bit, in that only a brief reference is needed here with a follow-up link to a full page on alias handling & views, except (as far as I can find) I don't think this full page exists yet. I could attempt to write this as well, although I'm not sure I understand it well enough. |
ok, I left only the |
doc/src/manual/functions.md
Outdated
!!! warning | ||
It is always important to understand the mechanisms behind the magic of unique language features like fused broadcast loops. There are some occasional pitfalls which are important to avoid. For an example on when *not* to fuse broadcasted operations, see the note in [Performance Tips](@ref man-performance-unfuse). Users should also remember that due to the in-place mutation behavior of the `.=` operator, extra care should be taken to ensure correct and intended semantics when an object appears in both arguments of the operator like `x .= foo.(x)` or similar expressions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the basic point here is about how many times the function are called. I don't think it should immediately be highlighted as a problem, and I think we can explain what goes right here, so that you get the right mental model. Maybe (without the warning):
!!! warning | |
It is always important to understand the mechanisms behind the magic of unique language features like fused broadcast loops. There are some occasional pitfalls which are important to avoid. For an example on when *not* to fuse broadcasted operations, see the note in [Performance Tips](@ref man-performance-unfuse). Users should also remember that due to the in-place mutation behavior of the `.=` operator, extra care should be taken to ensure correct and intended semantics when an object appears in both arguments of the operator like `x .= foo.(x)` or similar expressions. | |
All functions in the fused broadcast are always called for every element of the result. Thus `X .+ σ .* randn.()` adds a different random number to every element of the array `X`. But also, something like `Y .= X ./ sqrt.(s)` will call `sqrt` all `length(Y)` times, even though it has only `length(s)` distinct values. It may be faster to do less computation (at the cost of allocating some storage) by avoiding fusion here -- see more in [performance tips](@ref man-performance-unfuse). |
The bit about in-place broadcasting seems to be hinting at aliasing, but IMO that's a different paragraph, and it needs to tell you facts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am certainly aiming for a neutral explanation and don't mean to highlight problems that aren't there. I do want to check --- is your comment directed at the wording in the !!! warning
under functions.md
or directed at the blurb in Performance Tips? If the latter, I would protest slightly, as the wording (emph. added)
Fewer dots: Unfuse certain intermediate broadcasts
However, it is important to remember that the fused operation will be computed at every iteration of the broadcast. This means that in some situations, particularly in the presence of composed or multidimensional broadcasts, an expression with dot calls may be computing a function more times than intended.
Is (at least to my eyes) already using a pretty neutral tone with plenty of hedging language. I think any negative framing here is already going to be pretty mitigated by its location on the page, being just below the antithetical tip More dots: Fuse vectorized operations.
Of course there are many places where the described behavior is desired (like your rand.()
example), but I don't think it's necessary to call these out explicitly.
If the comment was directed at the warning though, I'm open to any changes there. I think something, at least, should be communicated about the aliasing potential, but I wasn't sure what or where was the right place for a note like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I meant on the bit in the docs introducing broadcasting!
(Edited to use the suggestion feature)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good. I'll try another framing
@mcabbott I tried rewording again. I do think that some kind of However, point taken that the fact itself that a function recomputes for each iteration of a broadcast is just part of the design and should not be presented as a warning. I moved that part above the warning to be more neutral, and I left in the warning only the recognition that aliasing can happen and protection is not guaranteed. This matches what the docs already say at the definition of
I also added a positive example of when the mutating broadcast might be useful, if that helps. |
Others will have to comment on what to say about aliasing guarantees. There are cases where it notices and makes a preventative copy, but I don't want to accidentally make policy. It's possible that such cases also deserve mention in performance tips, julia> let x = rand(2^10)
y = similar(x)
@time x[3:end] .= x[1:end-2] # obvious copy
@time x[3:end] .= @view x[1:end-2] # special as aliasing is detected
@time y[3:end] .= @view x[1:end-2] # obvious no-copy
end;
0.000003 seconds (1 allocation: 8.125 KiB)
0.000003 seconds (1 allocation: 8.125 KiB)
0.000001 seconds
I think this is undefined -- there is no guarantee of what order elements will be executed in. (Perhaps to allow for them to be multi-threaded in future.) So if I understand the example here, it's not safe. Maybe this execution order is really a third property, separate from aliasing and how-many-executions. It may deserve a warning. I think neither I know this is more work but I wonder if you should consider splitting this PR into 2 or 3. "How many" is pretty straightforward. "What order" should probably be in docstrings too, there might be more discussion. And "when is aliasing fixed" is likely to take the longest. |
Oh no 😂, so much for my example. Ok, I will split up the PR. I'll keep this one as the "performance tip" |
Aside, I just saw this nice blog post about all such things: https://bkamins.github.io/julialang/2023/03/31/broadcast.html |
merge possibly? I think the other components in the docs for "what order do broadcasts evaluate" and "when is aliasing fixed" would be useful, but are orthogonal and shouldn't block this performance tip which is relatively self-contained. |
gentle bump. happy to modify, or if this language is not wanted in the docs I will close the PR |
bump |
Very nice, thanks @adienes! |
…49228) cautionary / explanatory documentation following the wake of https://discourse.julialang.org/t/unexpected-broadcasting-behavior-involving-eachrow/96781/88
cautionary / explanatory documentation following the wake of https://discourse.julialang.org/t/unexpected-broadcasting-behavior-involving-eachrow/96781/88