Skip to content

Commit 2f69ea7

Browse files
committed
uv-resolver: fix graph traversal
This fixes a bug found by Konsti[1] where not all extras would propagate correctly. This could lead to improper simplification steps. The fix here is to revisit nodes in the graph unless no changes have been made to the set of enabled extras. We also add a regression test whose snapshot changes without this fix. I tried writing a test case by hand but couldn't figure it out. The original pytorch MRE in #9289 also has a different lock file with this fix, but we really shouldn't be adding more pytorch tests given how beefy they are. So I found this case using airflow, which is also beefy, but hopefully good enough. [1]: #9370 (comment)
1 parent 2f4a35c commit 2f69ea7

File tree

2 files changed

+2097
-2
lines changed

2 files changed

+2097
-2
lines changed

crates/uv-resolver/src/graph_ops.rs

Lines changed: 30 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -115,6 +115,14 @@ pub(crate) fn simplify_conflict_markers(
115115
included: bool,
116116
}
117117

118+
// Do nothing if there are no declared conflicts. Without any declared
119+
// conflicts, we know we have no conflict markers and thus nothing to
120+
// simplify by determining which extras are activated at different points
121+
// in the dependency graph.
122+
if conflicts.is_empty() {
123+
return;
124+
}
125+
118126
// The set of activated extras and groups for each node. The ROOT nodes
119127
// don't have any extras/groups activated.
120128
let mut activated: FxHashMap<NodeIndex, Vec<FxHashSet<ConflictItem>>> = FxHashMap::default();
@@ -153,10 +161,30 @@ pub(crate) fn simplify_conflict_markers(
153161
}
154162
let sets = activated.get(&parent_index).cloned().unwrap_or_default();
155163
for child_edge in graph.edges_directed(parent_index, Direction::Outgoing) {
164+
let mut change = false;
156165
for set in sets.clone() {
157-
activated.entry(child_edge.target()).or_default().push(set);
166+
let existing = activated.entry(child_edge.target()).or_default();
167+
// This is doing a linear scan for testing membership, which
168+
// is non-ideal. But it's not actually clear that there's a
169+
// strictly better alternative without a real workload being
170+
// slow because of this. Namely, we are checking whether the
171+
// _set_ being inserted is equivalent to an existing set. So
172+
// instead of, say, `Vec<FxHashSet<ConflictItem>>`, we could
173+
// have `BTreeSet<BTreeSet<ConflictItem>>`. But this in turn
174+
// makes mutating the elements in each set (done above) more
175+
// difficult and likely require more allocations.
176+
//
177+
// So if this does result in a perf slowdown on some real
178+
// work-load, I think the first step would be to re-examine
179+
// whether we're doing more work than we need to be doing. If
180+
// we aren't, then we might want a more purpose-built data
181+
// structure for this.
182+
if !existing.contains(&set) {
183+
existing.push(set);
184+
change = true;
185+
}
158186
}
159-
if seen.insert(child_edge.target()) {
187+
if seen.insert(child_edge.target()) || change {
160188
queue.push(child_edge.target());
161189
}
162190
}

0 commit comments

Comments
 (0)