-
Notifications
You must be signed in to change notification settings - Fork 92
perf(weave): address data loading perf issue on eval compare #4192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
46f2202
to
5bc5915
Compare
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=e0ff91d73a74f2bddded908c73577a5c11c57ed8 |
c581e64
to
66241f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with it and its really nice, I did notice one small bug which is that if you click the next example button rapidly, we get into a state where the query fires but the loading state if permanently stuck. I think this has to do with the mutation counter but i'm not sure. A very edge case, but because we don't have a way of going to a specific example (which I think we should eventually add btw), I could imagine a user spamming the next example button to get to the one they know is interesting. We might be able to get away with just loading 5 ahead or something 🤷🏻
...se3/pages/CompareEvaluationsPage/sections/ExampleCompareSection/exampleCompareSectionUtil.ts
Outdated
Show resolved
Hide resolved
...se3/pages/CompareEvaluationsPage/sections/ExampleCompareSection/exampleCompareSectionUtil.ts
Show resolved
Hide resolved
...se3/pages/CompareEvaluationsPage/sections/ExampleCompareSection/exampleCompareSectionUtil.ts
Show resolved
Hide resolved
98ba681
to
27b30c2
Compare
Great job on discovering this bug, it is a solid bug. This is because in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, love the code removal. I think the cache handling is neat. I trust that the mutation counter is common practice, and after watching your video it makes sense; @tssweeney might be curious about it!
27b30c2
to
c7f0529
Compare
In regard to the |
c7f0529
to
d29f6de
Compare
@@ -499,30 +499,6 @@ const fetchEvaluationComparisonResults = async ( | |||
}); | |||
}); | |||
|
|||
// 3.5 Populate the inputs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this, you are no long populating inputs
in
export type EvaluationComparisonResults = {
// Inputs are the intersection of all inputs used in the evaluations.
// Note, we are able to "merge" the same input digest even if it is
// used in different evaluations.
inputs: {
[rowDigest: string]: DatasetRow;
};
as such, I would either:
a) populate this field using other forms (ex. the predict_and_score input) or
b) remove this field from the type definition and force resolution of any typing issues caused from this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think (b) corresponds with this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. It was an oversight, it should be removed.
inputDigest: predictAndScoreRes.rowDigest, | ||
inputRef: predictAndScoreRes.exampleRef, | ||
output: flattenObjectPreservingWeaveTypes({output}), | ||
scores: Object.fromEntries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in @andrewtruong's upcoming PR, we can't count that the data comes from a dataset (gasp!) it might be good to at least leave a comment here as this would be a possible location to record the raw predict_and_score inputs as the presumed data row.
@@ -225,6 +227,16 @@ export const ExampleCompareSection: React.FC<{ | |||
return filteredRows[targetIndex]; | |||
}, [filteredRows, targetIndex]); | |||
|
|||
const {targetRowValue, loading: loadingInputValue} = useExampleCompareData( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my reading of the implementation of useExampleCompareData
is that filteredRows is only used to get the index using filteredRows[targetIndex]
. It might be cleaner to just pass filteredRows[targetIndex]
to useExampleCompareData
, simplifying the params and internal logic of useExampleCompareData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filteredRows should be needed to invalidate the cache in case the filter condition changes.
// Including `cacheVersion` in the dependency array ensures the memo recalculates | ||
// when it changes, even though it's not directly used in the calculation. | ||
// eslint-disable-next-line react-hooks/exhaustive-deps | ||
}, [cacheVersion, filteredRows, targetIndex]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not the biggest fan of this pattern (in particular, explicitly depending on cacheVersion
that is not used in the hook). But I would agree it is hard to get around this one. Approving, but do think there might be a cleaner way.
d29f6de
to
0d75513
Compare
Description
This addresses a part of the evaluation comparison performance issue. No the customer scenario will no longer crash.
UI wise, there is no noticeable behavior change.
Here I attach a video(internal) to explain the change.
Testing
This PR is manually tested against the customer scenario and locally.