Skip to content

Unoptimized memoization behaviour when passing multiple arguments #739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
kahoot-karl opened this issue Feb 13, 2025 · 12 comments
Open

Unoptimized memoization behaviour when passing multiple arguments #739

kahoot-karl opened this issue Feb 13, 2025 · 12 comments

Comments

@kahoot-karl
Copy link

Hey!
I am encountering undesirable memoization behaviour which doesn’t match expectations from the documentation, in particular the behaviour of WeakMapMemoize.

I read the issue at #635 carefully and this seems like a different problem/scenario.

We were happy to use weakMapMemoize so we don't need to create private instances of a selector when using them with props, like so:

useSelector(state => selectSomeData(state, id))

However, assume that selectSomeData was defined like this:

const selectSomeData = createSelector([
   heavySelectorThatDependsOnDeepChainOfSelectors,
   (state, id) => id
], (heavyResult, id) => /* combiner code */

Then the first selector and its corresponding input selectors get invalidated every time it encounters a new value of id, although heavySelectorThatDependsOnDeepChainOfSelectors only accepts one argument so the result will obviously be the same.

Quoting the documentation from https://reselect.js.org/api/weakmapmemoize/

Use Cases: This memoizer is likely best used for cases where you need to call the same selector instance with many different arguments, such as a single selector instance that is used in a list item component and called with item IDs…

This might be obvious to some, or our use case is maybe too niche, but I would have loved to know this before since we’re using a wide range of values of the second arguments that will break the memoization.

We have applied a local patch to our own code like so:

export const createCustomSelector = (inputs, combiner) => {
  return createSelector(
    inputs.map((input) => {
      const isUnary =
        input.dependencies?.every((dependency) => dependency.length <= 1) ||
        (!('dependencies' in input) && input.length <= 1);
      return isUnary ? (state) => input(state) : (state, props) => input(state, props);
    }),
    combiner
  );
};

Mass-replacing that selector in our application leads to great speed-ups.

Could you consider introducing a similar optimization into the core library? Or perhaps just clarify the docs.

Thank you

@markerikson
Copy link
Contributor

Then the first selector and its corresponding input selectors get invalidated every time it encounters a new value of id.

I don't think I follow why this would be an issue.

What matters is what the "input selectors" extract, and if the heavySelector is only extracting one argument, it should be ignoring the id argument.

Can you give an actual repro of this problem?

@kahoot-karl
Copy link
Author

kahoot-karl commented Feb 13, 2025

repro of the problem: https://codesandbox.io/p/devbox/serverless-sky-y9dhwn?workspaceId=ws_7Uz69aHwQChHg4Ja3sh53K

Just to add that this problem is both a CPU and a memory problem. CPU because the input selectors get invalidated, and memory because the unary selectors get a new memoization entry for each unique extra argument.

if the heavySelector is only extracting one argument, it should be ignoring the id argument.

Yes, but normally it should not be executed at all if passed the same argument twice.

@maxired
Copy link

maxired commented Feb 13, 2025

Hey @markerikson , I have been working with @kahoot-karl on that issue.

The cache key in weakMapMemoize is using all arguments, even if the selector end up using a single argument.
This mean there would not be a single cache, but as many caches as there is values for the second argument.

In the sentence you quoted, I am not sure if invalidated is the correct term.
What we meant is that the full selector tree will needs to be re-validated as many time as the number of different value we use as second argument, anytime the first argument (store state) gets udpated.

@markerikson
Copy link
Contributor

I do see one issue with the example itself.

Yes, all the input functions for computationSelector will get called with all the arguments, because that selector itself is getting called with all the arguments. In this specific repro, you've got:

const superHeavyComputation = jest.fn((input) => {
    return { a: "A", b: "B" };
  });

which is a problem because now the input function is always returning a new reference.

In general, though: yes, weakMapMemoize does look at all the arguments to do its memoization.

If that's a problem in your case, then I think the best approach is to override the argsMemoize option when you create the selectors, and not use weakMapMemoize on the args here.

@maxired
Copy link

maxired commented Feb 13, 2025

@markerikson I tried to come up with a repro close from reselect best practices

https://codesandbox.io/p/sandbox/gracious-jackson-x4q877

If that's a problem in your case,

yes it is a problem because we are using the same component with a different id, something like 100 times, and we update the store multiple times per second.

then I think the best approach is to override the argsMemoize

Thanks for the suggestion. That is something we are evaluating.
We were hopping for a built in solution for this, or at least a clarification in the doc that this pattern was not currenlty optimized for high number of instances.

@kahoot-karl
Copy link
Author

Overriding argsMemoize for this patch is very tricky because the memoization function cannot introspect dependencies recursively. We thought it might be attractive to support from a library that specializes in memoization pov, since this is an optimisation that could be introduced on the library-side with no drawbacks to the end user.

@markerikson
Copy link
Contributor

markerikson commented Feb 13, 2025

You mentioned that the patch resulted in "a great speedup". If you record a CPU profile, where is the extra time being spent? Is it just in comparisons? Is that the varying inputs are actually leading to complete selector recalculations, and therefore unnecessary component re-renders?

I still feel like I'm missing something about this problem description. Even if the args check runs, I would assume that the actual selectors don't run, because the input function for the heavySelector is returning the same result.

What we meant is that the full selector tree will needs to be re-validated as many time as the number of different value we use as second argument, anytime the first argument (store state) gets udpated.

That's what I'm saying. Yes, the args checks will need to rerun, because for every cache entry there are new references that were used to generate that level of caching. (And since the state is the first arg, it forms the root of the caching tree). But the input selectors ought to be extracting the same values, and so the combiner function shouldn't be running because those values haven't changed.

@kahoot-karl
Copy link
Author

@markerikson thank you for your patience with this issue so far. It make sense to try to find out why this would lead to such bad performance.

It seems like all the input selectors follow the proper rules: I added some console.log in our application and verified that no combiner functions are ran when they shouldn't.

I am speculating that the weakmapmemoization slows down things by allocating a lot of memory storing the same thing over and over, but I will try to get back to you shortly to profile and experiment a bit more.

@EskiMojo14
Copy link
Contributor

personally i would be against reselect including some sort of one-size-fits-all solution - fn.length is not reliable because of rest and defaulted parameters. (for example, createSelector(...).length is always 0)

in userland, you could perhaps have some sort of forceArity solution:

const forceArity = <Args extends any[], Return>(fn: (...args: Args) => Return, arity: Args["length"]) => (...args: Args) => fn(...args.slice(0, arity))

const selectSomeData = createSelector([
   forceArity(heavySelectorThatDependsOnDeepChainOfSelectors, 1),
   (state, id) => id
], (heavyResult, id) => /* combiner code */

but obviously that would need to be applied on a case by case basis.

@kahoot-karl
Copy link
Author

kahoot-karl commented Feb 14, 2025

Ok, I think I understand the problem now a bit better.

  • We are filling a React view with 100 items, each with their own id.
  • Every component instance has their own id, and has a useSelector((state) => selectSomeData(state, id)) call.
  • For every component added, the store is obviously updated, which means that the code inside useSelector needs to re-evaluate.
  • selectSomeData(state, id) is called with a new state, where the full tree of input selectors need to be evaluated, for each id.

So even though going through the memoized input selectors is "cheap", what is usually linear complexity, is now quadratic complexity, because for every new entry "N + 1" requires N extra passes through the full input selector tree.

With the optimization of not passing the extra argument unnecessarily, this work is only done once per entry insertion and not N times (where N is the number of entries).

We have verified that we are following best practices in the sense that each new entry only renders one React component, and that the state updates do not cause selector combiners to invalidate (as mentioned above).

We've drilled down this problem with a benchmark running in node.js and react-testing-library:

 for (let i = 0; i < 100; i++) {
   const depBefore = heavySelector.dependencyRecomputations();
   addEntry(i);
   const depAfter = heavySelector.dependencyRecomputations();
   console.log('new dep computations', depAfter - depBefore);
}

With the normal version of createSelector, this will print a higher and higher number of each iteration, whereas in our patched version preventing extra argument being passed, this prints a constant increment for each entry.

This starts becoming noticeable in the benchmark already around 50 entries and was evident at 100, where the benchmark time is about 1.5 seconds compared to 400 ms with our custom patch (after the fix, the problematic selector did not add any significant time to the benchmark, time was spent in other places).

I hope I'm making myself clear, let me know if I might be missing something, or if you need something else double-checked.

@misha-erm
Copy link

Faced the same problem. We have some selectors that are used as input for almost every other selectors so cache invalidation became a real issue for us.

Gladly such "hot" selectors in our app depend mostly on state so I created special createStateOnlySelector which drops off every argument except state

const stateOnlySelectorOptions = {
  argsMemoize(func, options) {
    const memoized = lruMemoize(func, options);

    return Object.assign(function withStateOnly(state: RootState) {
      return memoized(state);
    }, memoized);
  },
  memoize: weakMapMemoize,
} satisfies CreateSelectorOptions<typeof weakMapMemoize, typeof lruMemoize>;

export const createStateOnlySelector = createSelectorCreator(stateOnlySelectorOptions).withTypes<RootState>();

For other cases I can always just do:

const sel = createSelector([
  (state, prop1) => heavySel1(state, prop1),
  (state, prop1, prop2) => heavySel2(state, prop1, prop2)
])

but yeah, keeping track of it is not very pleasant task

@kahoot-karl
Copy link
Author

@misha-erm Here is our workaround that doesn't require you to keep track of it manually.

type CreateSelectorFunction<StateType = any> = <InputSelectors extends SelectorArray<StateType>, Result>(
  inputSelectors: [...InputSelectors],
  combiner: Combiner<InputSelectors, Result>
) => Selector<GetStateFromSelectors<InputSelectors>, Result, GetParamsFromSelectors<InputSelectors>> &
  NonNullable<unknown>;

export const createSelector: CreateSelectorFunction = (inputs, combiner) => {
  return reselectCreateSelector(
    inputs.map((input) => {
      if (input.length > 2) {
        throw new Error('More than one free argument not supported (selector can have at most 2 arguments)');
      }

      const isUnary =
        (input as OutputSelector).dependencies?.every((dependency) => dependency.length <= 1) ||
        (!('dependencies' in input) && input.length <= 1);
      return isUnary ? (state) => input(state) : (state, props) => input(state, props);
    }) as typeof inputs,
    combiner
  );
};

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants