Implement signaled dep tracking for partial reevaluation, use for genquery
This CL teaches Skyframe to keep track of which previously requested
unfinished deps have completed, for parent SkyFunctions which have opted
into partial reevaluation. Those SkyFunction implementations can now take
advantage of that information to avoid wasted work.
It also teaches the new genquery scope traversal SkyFunction to take
advantage of this functionality.
Some terminology used in this CL ("PartialReevaluationMailbox") takes
inspiration from "actor models" of concurrency, such as Erlang and Akka.
Those frameworks associate a "mailbox" with each "actor" (i.e. a concurrent
processor of work), to buffer incoming messages. In Skyframe, keys/nodes
correspond to these actors, and deps signaling their parents correspond
to message passing.
These mailboxes live alongside SkyKeyComputeState. SkyFunctions hoping to
take advantage of this signaling mechanism must store state somewhere,
and if they store it in the recommended way through the environment, they
must be robust in the case that state is discarded due to memory
pressure. Coupling mailboxes to that state makes implementation errors
less likely: SkyFunctions don't need to implement recovery for a "have
dep signals, have no compute state" condition, because that condition
can't be represented. Instead, their "initial evaluation" policy applies,
which is almost certainly the right thing for them to do.
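
As a rough illustration of the coupling described above, here is a minimal
sketch. It is not the code in this CL: MailboxSketch, Mailbox, State,
StateStore and evaluate are hypothetical names, and the "initialized" flag
is an assumed way of telling a fresh state apart from a resumed one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are Skyframe's real API.
final class MailboxSketch {
  /** Buffers keys of deps that completed since the parent last looked. */
  static final class Mailbox {
    private final List<String> signaled = new ArrayList<>();

    // Called on behalf of a dep when it finishes (the "message send").
    synchronized void signal(String depKey) {
      signaled.add(depKey);
    }

    // Called by the parent at the start of a reevaluation.
    synchronized List<String> drain() {
      List<String> mail = new ArrayList<>(signaled);
      signaled.clear();
      return mail;
    }
  }

  /** The mailbox lives inside the compute state, so both are dropped together. */
  static final class State {
    final Mailbox mailbox = new Mailbox();
    boolean initialized = false; // assumed marker: has an initial evaluation run?
  }

  /** Stand-in for an environment that may discard state under memory pressure. */
  static final class StateStore {
    private State state; // null means never created, or discarded

    synchronized State getOrCreate() {
      if (state == null) {
        state = new State();
      }
      return state;
    }

    synchronized void discard() {
      state = null;
    }
  }

  /** Parent-side policy: fresh state means initial evaluation, else use the mail. */
  static String evaluate(StateStore store) {
    State state = store.getOrCreate();
    List<String> mail = state.mailbox.drain();
    if (!state.initialized) {
      state.initialized = true;
      return "initial";
    }
    return "signaled:" + mail;
  }
}
```

In this sketch, discarding the state between evaluations sends the parent
back through its initial-evaluation branch, whatever signals arrive
afterwards; there is no separate "signals without state" case to handle.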
SkyFunctionEnvironment's implementation is inefficient for nodes opting
into partial reevaluation, especially when they take advantage of dep
signaling, because SkyFunctionEnvironment "batch prefetches" previously
requested deps before each reevaluation. For regular (non-partial
reevaluation) SkyFunction evaluations, this is fine, because those
evaluations tend to reread previously requested deps on every restart.
However, SkyFunctions opting into partial reevaluation, especially those
that maintain SkyKeyComputeState, and doubly especially when they take
advantage of dep signaling, are highly likely to *not* need to reread
every previously requested dep. Prefetching them is wasteful. In an
adversarial sequence of SkyFunction reevaluations, this could result in
O(|deps|^2) work, namely when the SkyFunction yields after each new dep
request and all previously requested deps are reread for each
reevaluation.
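
The quadratic bound can be checked with a small counting sketch. It models
nothing of Skyframe beyond the adversarial pattern just described;
PrefetchCostSketch and its methods are hypothetical names, and "on demand"
assumes each dep is read exactly once.

```java
// Hypothetical cost model; counts dep reads only, not real Skyframe work.
final class PrefetchCostSketch {
  /** Reads when every restart rereads all deps requested so far. */
  static long readsWithPrefetch(int deps) {
    long reads = 0;
    // The function yields after each new dep request, so the restart that
    // follows the k-th request prefetches k previously requested deps.
    for (int k = 1; k <= deps; k++) {
      reads += k;
    }
    return reads; // deps * (deps + 1) / 2
  }

  /** Reads when each dep is read once, only when the function asks for it. */
  static long readsOnDemand(int deps) {
    return deps;
  }
}
```

Under these assumptions, 1000 deps cost 500500 reads with prefetching
versus 1000 reads on demand.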
This CL takes the first step towards fixing this inefficiency. Now, the
SkyFunctionEnvironment instantiated for partial reevaluations doesn't
prefetch previously requested deps from the graph. Instead, it reads them
only as the SkyFunction rerequests them. Unfortunately, the
SkyFunctionEnvironment still needs to be aware of whether a requested dep
is new, so even the specialized environment does O(|deps|) work,
constructing a Set of the node's previously requested deps on each
restart.
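
A minimal sketch of that behavior, under stated assumptions:
LazyEnvSketch is a hypothetical stand-in for the specialized environment,
the graph is modeled as a plain Map, and a null value stands for a dep that
is not yet done.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not SkyFunctionEnvironment's actual implementation.
final class LazyEnvSketch {
  private final Map<String, String> graph;
  private final Set<String> previouslyRequested;
  final List<String> newlyRequested = new ArrayList<>();
  int graphReads = 0;

  LazyEnvSketch(Map<String, String> graph, List<String> previousDeps) {
    this.graph = graph;
    // Still O(|deps|) on each restart: the set is needed to tell new deps
    // from rerequested ones. No dep values are fetched here.
    this.previouslyRequested = new HashSet<>(previousDeps);
  }

  /** Reads a dep from the graph only when the function actually asks for it. */
  String getValue(String key) {
    if (!previouslyRequested.contains(key) && !newlyRequested.contains(key)) {
      newlyRequested.add(key);
    }
    graphReads++;
    return graph.get(key); // null models "not done yet"
  }
}
```

With three previously requested deps, a restart that rerequests only one
of them performs one graph read rather than three, although building the
set still touches all three keys.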
A subsequent CL will complete this effort.
Note that no attempt is made to reconcile the environment's
getMaxTransitiveSourceVersionSoFar functionality with partial
reevaluation. The only SkyFunction which depends on this method is
ActionSketchFunction, which has not opted into partial reevaluation. If
it did, then this could be made to work by doing a full fetch of
previously requested deps first, being aware of the performance penalty
of doing so.
However, maxTransitiveSourceVersion will be correct when a node gets
committed, because before any commit, AbstractParallelEvaluator fetches
previously requested deps -- because it needs to solve this version
problem, and because it needs to ensure that all previously requested
deps are still done, to be robust to rewinding.
PiperOrigin-RevId: 501015898
Change-Id: I43717aada069a0374a19e55574b101067e4b57db