Speed up checking of iterator compatibility #2077
CodSpeed Performance Report: Merging #2077 will not alter performance.
I'm a bit torn here. It's great to see the speed-up of course, and factoring out the control block logic makes a lot of sense too. However, there's a drawback of not relying on `lock()` anymore to get access to the data: the code is no longer thread-safe.
That's one of these things that we currently don't need: the way we use Spicy doesn't require thread-safety, and we have some other places that aren't thread-safe either. But iterators are a pretty fundamental piece to the runtime, and if we ever wanted to become thread-safe, we'd have to revert some of this (and it would probably take a while to even just notice the issue).
Let me ask: the PR does multiple optimizations. Do you know how much performance we'd lose if we switched back to access to the controlled data through `shared_ptr`+`lock`, but left the other improvements (`expired`/`owner_before`) in?
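To make the trade-off in this comment concrete, here is a minimal sketch of the two access styles under discussion. All names (`readLocked`, `readRaw`, `RawRef`) are hypothetical and are not taken from the HILTI runtime; the sketch only assumes standard `std::weak_ptr` behavior.

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Hypothetical illustration, not the HILTI runtime's code.

// Variant A: `lock()` creates a temporary shared_ptr that keeps the data
// alive for the duration of the access, at the cost of an atomic
// reference-count increment and decrement per access.
inline std::optional<int> readLocked(const std::weak_ptr<int>& w) {
    if ( auto p = w.lock() )
        return *p;

    return std::nullopt;
}

// Variant B: liveness tracking and data access are separate. The weak_ptr
// is only asked whether the owner is gone; the data is reached through a
// raw pointer.
struct RawRef {
    std::weak_ptr<void> alive; // tracks the owner's lifetime only
    const int* data;           // non-owning pointer to the payload
};

inline std::optional<int> readRaw(const RawRef& r) {
    assert(r.data != nullptr);

    if ( r.alive.expired() )
        return std::nullopt;

    // Nothing pins the owner between the check above and this dereference,
    // so another thread destroying the owner here would be a data race.
    return *r.data;
}
```

In single-threaded use both variants return the same results; the difference only matters if another thread can destroy the owner concurrently.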
At least for the [...] That said, I believe thread-safety is a red herring since even before, access to the data behind the iterator was not thread-safe (e.g., [...]). I'd still opt for accessing the data via the controlled raw pointer since 2x is a significant performance impact for loop-heavy code. With the new data structure we will have a single place to work on should we ever try to make the runtime library thread-safe.
Ok, I agree. Actually the "thread-safety" I was referring to isn't full safety, but the case where separate threads can work on independent data without causing trouble. And thinking about that again now, I believe that covers the case here as well. I had some case in mind earlier but that was indeed a red herring.
We were previously using a control block which held a weak_ptr to the protected data. This was pretty inefficient for a number of reasons:

- access to the controlled data always required a `weak_ptr::lock` which created a temporary shared_ptr copy and immediately destroyed it after access
- to check whether the control block was expired we used `lock` instead of `expired` which introduced the same overhead
- to check compatibility of iterators we compared shared_ptrs to the control data which again required full locks instead of using `owner_before`

This patch introduces a new control block data structure and uses it across all classes which previously held ad hoc implementations (`Bytes`, `Map`, `Set`, `Vector`). The main improvement is that we now separate tracking liveliness and the data, and use a better implementation for control block equality checks. With the new implementation I see throughput improvements across the board for anything needing to iterate, e.g., I see bytes iteration being up to 30x faster in trivial setups; the code from the original issue is now 10x faster.

Closes #1663.
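The idea described in this commit message can be sketched roughly as follows. This is an illustration only: `Control`, `Container`, and `Iterator` are hypothetical names, not the actual classes introduced by the patch. It assumes an empty control token whose lifetime follows the container, with iterators holding a `weak_ptr` to the token plus a raw pointer to the data.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch, not the HILTI runtime's implementation.
struct Control {}; // empty token; only its lifetime matters

class Container {
public:
    Container() : _control(std::make_shared<Control>()) {}

    std::vector<int> data;

    std::weak_ptr<Control> control() const { return _control; }

    // Replacing the token expires every iterator created before this call.
    void invalidate() { _control = std::make_shared<Control>(); }

private:
    std::shared_ptr<Control> _control;
};

class Iterator {
public:
    explicit Iterator(Container& c, std::size_t index = 0)
        : _control(c.control()), _data(&c.data), _index(index) {}

    // `expired()` only inspects the reference count; unlike `lock()` it
    // creates no temporary shared_ptr.
    bool isExpired() const { return _control.expired(); }

    // Two weak_ptrs share a control block exactly when neither is ordered
    // before the other by `owner_before`; no locking is needed.
    bool isCompatible(const Iterator& other) const {
        return ! _control.owner_before(other._control) && ! other._control.owner_before(_control);
    }

    // Access goes through the raw pointer after a cheap liveness check.
    int operator*() const {
        assert(! isExpired());
        return (*_data)[_index];
    }

private:
    std::weak_ptr<Control> _control; // tracks liveness only
    const std::vector<int>* _data;   // non-owning access to the data
    std::size_t _index;
};
```

With this layout, iterators over the same container compare as compatible, iterators over different containers do not, and calling `invalidate()` expires all outstanding iterators without touching the data itself.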
This patch repurposes the existing `hilti-rt-fiber-benchmark` to be a more general benchmark suite of HILTI runtime behaviors.
This diagnostic returns false positives in code we do not own and in ways which are hard to work around with pragmas, see e.g., https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111273.