Skip to content

perf(weave): address data loading perf issue on eval compare #4192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chance-wnb
Copy link
Contributor

@chance-wnb chance-wnb commented Apr 19, 2025

Description

This addresses a part of the evaluation comparison performance issue. No the customer scenario will no longer crash.

UI wise, there is no noticeable behavior change.

Here I attach a video(internal) to explain the change.

Testing

This PR is manually tested against the customer scenario and locally.

@chance-wnb chance-wnb requested review from a team as code owners April 19, 2025 00:27
@chance-wnb chance-wnb requested a review from gtarpenning April 19, 2025 00:27
@circle-job-mirror
Copy link

circle-job-mirror bot commented Apr 19, 2025

@chance-wnb chance-wnb force-pushed the chance/eval_perf branch 4 times, most recently from c581e64 to 66241f8 Compare April 19, 2025 18:36
Copy link
Member

@gtarpenning gtarpenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with it and its really nice, I did notice one small bug which is that if you click the next example button rapidly, we get into a state where the query fires but the loading state if permanently stuck. I think this has to do with the mutation counter but i'm not sure. A very edge case, but because we don't have a way of going to a specific example (which I think we should eventually add btw), I could imagine a user spamming the next example button to get to the one they know is interesting. We might be able to get away with just loading 5 ahead or something 🤷🏻
small-eval-perf-example-bug

@chance-wnb chance-wnb force-pushed the chance/eval_perf branch 2 times, most recently from 98ba681 to 27b30c2 Compare April 21, 2025 20:14
@chance-wnb
Copy link
Contributor Author

chance-wnb commented Apr 21, 2025

I played around with it and its really nice, I did notice one small bug which is that if you click the next example button rapidly, we get into a state where the query fires but the loading state if permanently stuck. I think this has to do with the mutation counter but i'm not sure. A very edge case, but because we don't have a way of going to a specific example (which I think we should eventually add btw), I could imagine a user spamming the next example button to get to the one they know is interesting. We might be able to get away with just loading 5 ahead or something 🤷🏻

Great job on discovering this bug, it is a solid bug. This is because in the loadRowDataIntoCache() I was passing cachedRowData.current which is the at-the-time value of the cache, which does not reflect the latest value during race conditions. My fix is to change the parameter to pass the refs (the boxed object), and the issue went away.

@chance-wnb chance-wnb requested a review from gtarpenning April 21, 2025 20:26
Copy link
Member

@gtarpenning gtarpenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, love the code removal. I think the cache handling is neat. I trust that the mutation counter is common practice, and after watching your video it makes sense; @tssweeney might be curious about it!

@chance-wnb
Copy link
Contributor Author

lgtm, love the code removal. I think the cache handling is neat. I trust that the mutation counter is common practice, and after watching your video it makes sense; @tssweeney might be curious about it!

In regard to the mutation counter thingy, I now feel that semantically giving it a better name seems to justify its usage better. now I renamed it to cacheVersion.

@@ -499,30 +499,6 @@ const fetchEvaluationComparisonResults = async (
});
});

// 3.5 Populate the inputs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this, you are no long populating inputs in

export type EvaluationComparisonResults = {
  // Inputs are the intersection of all inputs used in the evaluations.
  // Note, we are able to "merge" the same input digest even if it is
  // used in different evaluations.
  inputs: {
    [rowDigest: string]: DatasetRow;
  };

as such, I would either:
a) populate this field using other forms (ex. the predict_and_score input) or
b) remove this field from the type definition and force resolution of any typing issues caused from this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think (b) corresponds with this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. It was an oversight, it should be removed.

inputDigest: predictAndScoreRes.rowDigest,
inputRef: predictAndScoreRes.exampleRef,
output: flattenObjectPreservingWeaveTypes({output}),
scores: Object.fromEntries(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in @andrewtruong's upcoming PR, we can't count that the data comes from a dataset (gasp!) it might be good to at least leave a comment here as this would be a possible location to record the raw predict_and_score inputs as the presumed data row.

@@ -225,6 +227,16 @@ export const ExampleCompareSection: React.FC<{
return filteredRows[targetIndex];
}, [filteredRows, targetIndex]);

const {targetRowValue, loading: loadingInputValue} = useExampleCompareData(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my reading of the implementation of useExampleCompareData is that filteredRows is only used to get the index using filteredRows[targetIndex]. It might be cleaner to just pass filteredRows[targetIndex] to useExampleCompareData, simplifying the params and internal logic of useExampleCompareData

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filteredRows should be needed to invalidate the cache in case the filter condition changes.

// Including `cacheVersion` in the dependency array ensures the memo recalculates
// when it changes, even though it's not directly used in the calculation.
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [cacheVersion, filteredRows, targetIndex]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not the biggest fan of this pattern (in particular, explicitly depending on cacheVersion that is not used in the hook). But I would agree it is hard to get around this one. Approving, but do think there might be a cleaner way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants