feat: add memoization when converting tree to graph #74

admons · 2020-12-03T20:19:37Z

Ready for review
Follows CONTRIBUTING rules
Reviewed by Snyk internal team

What does this PR do?

Like #73 (which affects graphToDepTree), It adds memoization in the depTreeToGraph

For a very large tree (that was generated using the memoization from #73), it reduced the time from 20000ms to less than 200ms

FauxFaux · 2020-12-04T07:24:03Z

Like with the other PR, this appears to just leak memory forever, unless we're planning to use it perfectly in all cases. Isn't there some kind of lru-cache for this?

robcresswell · 2020-12-04T10:24:45Z

https://github.com/dominictarr/bench-lru if you do go that route

darscan · 2020-12-04T10:28:50Z

@FauxFaux I don't think either of the PRs leak memory. In both cases GC will clean up the maps themselves at some point after the operation is complete. Also, the values/keys in the maps are already in memory and shared by the output instances. What are you seeing that I'm not?

Also, I don'r see how an lru would help.. there is no global state in either of the PRs

FauxFaux · 2020-12-08T15:13:07Z

I WAS WRONG IGNORE ME.

darscan · 2020-12-08T15:17:46Z

For posterity: the concern was around how default argument initialisers are evaluated in JS (versus Python, for example, where this would indeed be problematic)

snyksec · 2020-12-08T17:51:42Z

🎉 This PR is included in version 1.22.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Shesekino · 2020-12-08T17:53:40Z

For posterity: the concern was around how default argument initialisers are evaluated in JS (versus Python, for example, where this would indeed be problematic)

because in Python arguments with a mutable default value get attached to the function as a "private member" and never get deleted? so the concern was memoizationMap never getting cleaned because nothing actively cleans it?

calvinli · 2021-01-25T19:54:24Z

Hey all,

Sorry to revive an old-ish thread but I tried this PR and it did not improve the performance of buildGraph(...) on a very large JS dependency tree. I found that it was because

if (memoizationMap.has(depTree)) {

never evaluated to true because Map.has uses Object.is semantics (same object, instead of identical object). I was only able to get a speedup (which was substantial) by changing the condition to something like

    const depTreeObjHash = objectHash(depTree, { excludeKeys: (key => (key === 'dependencies')) }) + objectHash(depTree.dependencies || {}, { excludeValues: true })
    if (memoizationMap.has(depTreeObjHash)) {
        return memoizationMap.get(depTreeObjHash);
    }

so I'm wondering whether the behavior in this PR was intentional.

darscan · 2021-02-03T20:28:17Z

Hi @calvinli . Apologies for the delay.

never evaluated to true because Map.has uses Object.is semantics

Yep, unfortunately that was intentional - this was just a small optimisation following on from #73 so it only helps in that particular case.

Some tools (like npm) can produce trees where sub trees that appear equivalent from the outside are not, so we have to use full object hashing to pick them up and preserve them properly when converting them to graphs. We can't just look at the direct deps of each dep as your example does (by using excludeValues). For example:

[email protected]
- [email protected]
  - [email protected]

[email protected]
- [email protected]
  - [email protected]

Your change would consider both instances of [email protected] as equivalent, but they are not.

Having to compare sub trees like this makes the conversion very expensive indeed. The goal is to build dep-graphs directly (instead of trees) so that we can avoid this issue entirely.

admons requested a review from darscan December 3, 2020 20:19

admons requested a review from a team as a code owner December 3, 2020 20:19

feat: add memoization when converting tree to graph

d842c25

admons force-pushed the feat/add-memoization-to-depTreeToGraph branch from 36a2848 to d842c25 Compare December 3, 2020 20:21

darscan assigned admons Dec 4, 2020

darscan approved these changes Dec 8, 2020

View reviewed changes

admons merged commit bb9169f into master Dec 8, 2020

admons deleted the feat/add-memoization-to-depTreeToGraph branch December 8, 2020 17:47

snyksec added the released label Dec 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add memoization when converting tree to graph #74

feat: add memoization when converting tree to graph #74

admons commented Dec 3, 2020

FauxFaux commented Dec 4, 2020

robcresswell commented Dec 4, 2020

darscan commented Dec 4, 2020 •

edited

Loading

FauxFaux commented Dec 8, 2020

darscan commented Dec 8, 2020

snyksec commented Dec 8, 2020

Shesekino commented Dec 8, 2020

calvinli commented Jan 25, 2021

darscan commented Feb 3, 2021

feat: add memoization when converting tree to graph #74

feat: add memoization when converting tree to graph #74

Conversation

admons commented Dec 3, 2020

What does this PR do?

FauxFaux commented Dec 4, 2020

robcresswell commented Dec 4, 2020

darscan commented Dec 4, 2020 • edited Loading

FauxFaux commented Dec 8, 2020

darscan commented Dec 8, 2020

snyksec commented Dec 8, 2020

Shesekino commented Dec 8, 2020

calvinli commented Jan 25, 2021

darscan commented Feb 3, 2021

darscan commented Dec 4, 2020 •

edited

Loading