Various optimizations #68

Merged
merged 5 commits into from
May 4, 2023
andreasnoack
Member

@andreasnoack andreasnoack commented Apr 19, 2023

This avoids a lot of dynamic dispatch by removing abstractly typed fields: KDInternalNode and KDTree are now parametric on the node types, and AbstractArray is no longer used as a field type. At least for now, it should be fine to require Matrix and Vector, either directly or via conversion. Some timings:

Current:

julia> @btime loess($(df.Speed), $(df.Dist));
  87.375 μs (2441 allocations: 847.36 KiB)

julia> @btime predict($ft, 5:25);
  27.834 μs (811 allocations: 31.22 KiB)

New:

julia> @btime loess($(df.Speed), $(df.Dist));
  77.167 μs (1886 allocations: 813.06 KiB)

julia> @btime predict($ft, 5:25);
  1.986 μs (55 allocations: 4.06 KiB)
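
To illustrate why abstractly typed fields cause dynamic dispatch, here is a minimal sketch. The struct names (`AbstractField`, `ConcreteField`, `ParametricField`) and the helper `total` are hypothetical and are not the definitions used in Loess.jl; they only show the general pattern of requiring a concrete `Vector`, either directly or via conversion, versus making the struct parametric.

```julia
# Hypothetical types for illustration; these are not the definitions in Loess.jl.

# Abstractly typed field: the compiler cannot know the concrete array type of `xs`,
# so operations on it are dispatched at run time.
struct AbstractField
    xs::AbstractVector{Float64}
end

# Concretely typed field: `xs` is always a Vector{Float64}.
struct ConcreteField
    xs::Vector{Float64}
    # Accept any real-valued vector and convert it once, at construction time.
    ConcreteField(xs::AbstractVector{<:Real}) = new(convert(Vector{Float64}, xs))
end

# Parametric alternative: the field type becomes part of the struct's type.
struct ParametricField{V<:AbstractVector{Float64}}
    xs::V
end

# The same generic function works on all three representations.
total(m) = sum(m.xs)

a = AbstractField(collect(1.0:5.0))
c = ConcreteField(1:5)                 # the range is converted to Vector{Float64}
p = ParametricField(collect(1.0:5.0))

# Only the last two have a concrete field type that the compiler can specialize on.
isconcretetype(fieldtype(AbstractField, :xs))   # false
isconcretetype(fieldtype(ConcreteField, :xs))   # true
isconcretetype(fieldtype(typeof(p), :xs))       # true
```

All three structs give the same results; the difference is only in whether the compiler can infer the type of `m.xs` inside functions such as `total`.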

It might still be possible to improve the speed a bit, e.g. some of the nested loops are in the wrong order. However, profiling suggests that most of the time in loess is spent within qr, and for predict it's spent on allocating the arrays that store the results, so there shouldn't be much performance left on the table.
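
The comment does not say which loops are meant. Assuming that "wrong order" refers to Julia's column-major array layout, the following sketch shows the two possible orders for a nested loop over a matrix. The function names are hypothetical and the code is not taken from Loess.jl.

```julia
# Illustrative only; not code from Loess.jl.
# Julia arrays are column-major, so the first index should vary in the innermost loop.

function sum_row_inner(A::Matrix{Float64})
    s = 0.0
    for j in axes(A, 2)      # columns in the outer loop
        for i in axes(A, 1)  # rows in the inner loop: contiguous memory access
            s += A[i, j]
        end
    end
    return s
end

function sum_col_inner(A::Matrix{Float64})
    s = 0.0
    for i in axes(A, 1)      # rows in the outer loop
        for j in axes(A, 2)  # columns in the inner loop: strided memory access
            s += A[i, j]
        end
    end
    return s
end

A = ones(200, 300)
sum_row_inner(A) == sum_col_inner(A)   # same result, different memory access pattern
```

Both functions compute the same sum; on large matrices the first order is usually the faster one, although for small inputs the difference may be negligible.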

Closes #47 and #51. Supersedes #53.

Update: to avoid the extra type parameters, I've changed the representation of the tree to use Union{Nothing,KDNode} (where KDNode is the new name for KDInternalNode). The performance is the same as before.
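
A minimal sketch of the `Union{Nothing,...}` pattern for tree nodes follows. The `Node` type, its field names, and `count_nodes` are hypothetical and do not reproduce the actual `KDNode` definition; the sketch only shows how a missing child can be represented by `nothing` without adding a type parameter per child.

```julia
# Hypothetical node type for illustration; field names are assumptions,
# not the actual KDNode definition in Loess.jl.
struct Node{T}
    value::T
    left::Union{Nothing,Node{T}}    # `nothing` marks a missing child
    right::Union{Nothing,Node{T}}
end

# Convenience constructor for a leaf.
Node(value::T) where {T} = Node{T}(value, nothing, nothing)

# Count nodes; the method for `nothing` terminates the recursion.
count_nodes(::Nothing) = 0
count_nodes(n::Node) = 1 + count_nodes(n.left) + count_nodes(n.right)

tree = Node{Float64}(2.0, Node(1.0), Node{Float64}(3.0, nothing, Node(4.0)))

count_nodes(tree)
typeof(tree)   # Node{Float64}, regardless of the depth of the tree
```

Because the child fields are a small union of `Nothing` and one concrete type, the type of the tree does not grow with its depth, which is the "deeply nested types" concern that the extra type parameters would otherwise raise.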

@andreasnoack andreasnoack requested a review from devmotion April 19, 2023 21:05
codecov-commenter commented Apr 19, 2023

Codecov Report

Patch coverage: 84.09% and project coverage change: -3.27 ⚠️

Comparison is base (f02c460) 93.17% compared to head (2569c9b) 89.90%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #68      +/-   ##
==========================================
- Coverage   93.17%   89.90%   -3.27%     
==========================================
  Files           2        2              
  Lines         205      208       +3     
==========================================
- Hits          191      187       -4     
- Misses         14       21       +7     
| Impacted Files | Coverage Δ |
|---|---|
| src/Loess.jl | 83.80% <76.66%> (-4.31%) ⬇️ |
| src/kd.jl | 96.11% <100.00%> (-1.97%) ⬇️ |

@andreasnoack
Member Author

I've pushed some changes based on your comments.

Member

@devmotion devmotion left a comment

Looks good to me!

> to avoid the extra type parameters, I've changed the representation of the tree to use Union{Nothing,KDNode} (where KDNode is the new name for KDInternalNode.) The performance is the same as before.

That is, you see the same performance improvements as in the initial version of the PR? Good if we can avoid deeply nested types.

@andreasnoack
Member Author

> That is, you see the same performance improvements as in the initial version of the PR?

Indeed. The comparison was relative to the previous version of this PR, which is much faster than current master.

@andreasnoack andreasnoack merged commit add0103 into master May 4, 2023
@andreasnoack andreasnoack deleted the an/optimize branch May 4, 2023 12:21
Development

Successfully merging this pull request may close these issues.

Example runs slowly