Skip to content

[docs] add performance tip concerning overly-fused broadcast loops #49228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 15, 2023

Conversation

adienes
Copy link
Member

@adienes adienes commented Apr 3, 2023

cautionary / explanatory documentation following the wake of https://discourse.julialang.org/t/unexpected-broadcasting-behavior-involving-eachrow/96781/88

@adienes
Copy link
Member Author

adienes commented Apr 3, 2023

I will have to change the examples if #49182 is merged, as it will change the behavior here.

@mcabbott
Copy link
Contributor

mcabbott commented Apr 3, 2023

This covers two orthogonal topics.

One is that broadcast fusion may call a function more times than necessary. I wonder if the right place is in performance tips, like here, as that page has many examples. Perhaps with a sentence added here?

A simpler example without slices might be something like this:

julia> let x = rand(100, 100)
           y = sum(abs2, x; dims=2)
           @btime $x ./ sqrt.($y)     # calls sqrt 100^2 times
           @btime $x ./ map(sqrt, $y) # 100 calls, but allocates length 100
       end;
  min 18.042 μs, mean 24.067 μs (2 allocations, 78.17 KiB)
  min 2.863 μs, mean 8.403 μs (3 allocations, 79.05 KiB)

The second topic is aliasing & mutating broadcasts. Some view are already un-aliased. Is this discussed anywhere in the manual?

@adienes
Copy link
Member Author

adienes commented Apr 3, 2023

I wonder if the right place is in performance tip [...] with a sentence added here?

Good point. The same might be true of the aliasing bit, in that only a brief reference is needed here with a follow-up link to a full page on alias handling & views, except (as far as I can find) I don't think this full page exists yet. I could attempt to write this as well, although I'm not sure I understand it well enough.

@adienes
Copy link
Member Author

adienes commented Apr 3, 2023

ok, I left only the !!! warning in the original location, and took your example for Performance Tips section. Should still have mention of the alias behavior, but not sure where to put it so I just left it as an empty reference right now

Comment on lines 1162 to 1163
!!! warning
It is always important to understand the mechanisms behind the magic of unique language features like fused broadcast loops. There are some occasional pitfalls which are important to avoid. For an example on when *not* to fuse broadcasted operations, see the note in [Performance Tips](@ref man-performance-unfuse). Users should also remember that due to the in-place mutation behavior of the `.=` operator, extra care should be taken to ensure correct and intended semantics when an object appears in both arguments of the operator like `x .= foo.(x)` or similar expressions.
Copy link
Contributor

@mcabbott mcabbott Apr 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the basic point here is about how many times the function are called. I don't think it should immediately be highlighted as a problem, and I think we can explain what goes right here, so that you get the right mental model. Maybe (without the warning):

Suggested change
!!! warning
It is always important to understand the mechanisms behind the magic of unique language features like fused broadcast loops. There are some occasional pitfalls which are important to avoid. For an example on when *not* to fuse broadcasted operations, see the note in [Performance Tips](@ref man-performance-unfuse). Users should also remember that due to the in-place mutation behavior of the `.=` operator, extra care should be taken to ensure correct and intended semantics when an object appears in both arguments of the operator like `x .= foo.(x)` or similar expressions.
All functions in the fused broadcast are always called for every element of the result. Thus `X .+ σ .* randn.()` adds a different random number to every element of the array `X`. But also, something like `Y .= X ./ sqrt.(s)` will call `sqrt` all `length(Y)` times, even though it has only `length(s)` distinct values. It may be faster to do less computation (at the cost of allocating some storage) by avoiding fusion here -- see more in [performance tips](@ref man-performance-unfuse).

The bit about in-place broadcasting seems to be hinting at aliasing, but IMO that's a different paragraph, and it needs to tell you facts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am certainly aiming for a neutral explanation and don't mean to highlight problems that aren't there. I do want to check --- is your comment directed at the wording in the !!! warning under functions.md or directed at the blurb in Performance Tips? If the latter, I would protest slightly, as the wording (emph. added)

Fewer dots: Unfuse certain intermediate broadcasts

However, it is important to remember that the fused operation will be computed at every iteration of the broadcast. This means that in some situations, particularly in the presence of composed or multidimensional broadcasts, an expression with dot calls may be computing a function more times than intended.

Is (at least to my eyes) already using a pretty neutral tone with plenty of hedging language. I think any negative framing here is already going to be pretty mitigated by its location on the page, being just below the antithetical tip More dots: Fuse vectorized operations. Of course there are many places where the described behavior is desired (like your rand.() example), but I don't think it's necessary to call these out explicitly.

If the comment was directed at the warning though, I'm open to any changes there. I think something, at least, should be communicated about the aliasing potential, but I wasn't sure what or where was the right place for a note like that.

Copy link
Contributor

@mcabbott mcabbott Apr 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I meant on the bit in the docs introducing broadcasting!

(Edited to use the suggestion feature)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. I'll try another framing

@adienes
Copy link
Member Author

adienes commented Apr 4, 2023

@mcabbott I tried rewording again. I do think that some kind of !!! warning is warranted---my reasoning is that all the information being communicated here is already present in the docs, so if it were only a matter of getting the facts down then this wouldn't even be an issue. The point of this PR is to draw attention to the possibility that this behavior might not be what (some) would expect, no matter what reasons they might have to hold those expectations.

However, point taken that the fact itself that a function recomputes for each iteration of a broadcast is just part of the design and should not be presented as a warning. I moved that part above the warning to be more neutral, and I left in the warning only the recognition that aliasing can happen and protection is not guaranteed. This matches what the docs already say at the definition of mightalias and I think has similar scope & severity to other warnings throughout the docs, for example the one under Random

!!! warning
Because the precise way in which random numbers are generated is considered an implementation detail, bug fixes and speed improvements may change the stream of numbers that are generated after a version change. Relying on a specific seed or generated stream of numbers during unit testing is thus discouraged - consider testing properties of the methods in question instead.

I also added a positive example of when the mutating broadcast might be useful, if that helps.

@mcabbott
Copy link
Contributor

mcabbott commented Apr 4, 2023

Others will have to comment on what to say about aliasing guarantees. There are cases where it notices and makes a preventative copy, but I don't want to accidentally make policy. It's possible that such cases also deserve mention in performance tips,

julia> let x = rand(2^10)
         y = similar(x)
         @time x[3:end] .= x[1:end-2]  # obvious copy
         @time x[3:end] .= @view x[1:end-2]  # special as aliasing is detected
         @time y[3:end] .= @view x[1:end-2]  # obvious no-copy
       end;
  0.000003 seconds (1 allocation: 8.125 KiB)
  0.000003 seconds (1 allocation: 8.125 KiB)
  0.000001 seconds

users should be very careful to understand the order of iterations

I think this is undefined -- there is no guarantee of what order elements will be executed in. (Perhaps to allow for them to be multi-threaded in future.) So if I understand the example here, it's not safe. Maybe this execution order is really a third property, separate from aliasing and how-many-executions. It may deserve a warning.

I think neither broadcast nor map currently say this in their docstrings. But I do think it's the policy that neither guarantees order.

I know this is more work but I wonder if you should consider splitting this PR into 2 or 3. "How many" is pretty straightforward. "What order" should probably be in docstrings too, there might be more discussion. And "when is aliasing fixed" is likely to take the longest.

@adienes
Copy link
Member Author

adienes commented Apr 4, 2023

there is no guarantee of what order elements will be executed in

Oh no 😂, so much for my example. Ok, I will split up the PR. I'll keep this one as the "performance tip"

@adienes adienes changed the title add hazard signs to loop fusion docs [docs] add performance tip concerning overly-fused broadcast loops Apr 4, 2023
@N5N3 N5N3 added docs This change adds or pertains to documentation broadcast Applying a function over a collection labels Apr 4, 2023
@mcabbott
Copy link
Contributor

mcabbott commented Apr 4, 2023

Aside, I just saw this nice blog post about all such things: https://bkamins.github.io/julialang/2023/03/31/broadcast.html

@adienes
Copy link
Member Author

adienes commented May 17, 2023

merge possibly? I think the other components in the docs for "what order do broadcasts evaluate" and "when is aliasing fixed" would be useful, but are orthogonal and shouldn't block this performance tip which is relatively self-contained.

@adienes
Copy link
Member Author

adienes commented Aug 1, 2023

gentle bump. happy to modify, or if this language is not wanted in the docs I will close the PR

@ViralBShah
Copy link
Member

@vtjnash @gbaraldi Can we merge?

@adienes
Copy link
Member Author

adienes commented Sep 14, 2023

bump

@timholy
Copy link
Member

timholy commented Sep 15, 2023

Very nice, thanks @adienes!

@timholy timholy merged commit b5feb45 into JuliaLang:master Sep 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection docs This change adds or pertains to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants