[DESIGN] rule streaming #5251

ghost · 2021-11-30T18:06:56Z

This PR adds a design doc for the rule streaming feature I'm thinking about. /cc @snowleopard @rgrinberg @bobot. I still need to cover stale artifact deletion and special directories such as .ppx, _odoc, ... but the main idea is there.

Signed-off-by: Jeremie Dimino <[email protected]>

rgrinberg · 2021-11-30T19:06:17Z

doc/dev/rule-streaming.md

+The idea is that when we produce rules, we will be producing rules under
+a current active "mask" that tells us where we are allowed to generate
+files or directories.  Trying to produce a rule with targets not
+matched by this mask will be a runtime error.


Don't we already have schemes for a similar use case? Would it be possible to unify the two?

Absolutely. Scheme is more the implementation of such a thing, i.e. it is what we would use inside Load_rules. It is not the API that dune_rules will use. So Scheme would move to dune_engine and we would stop using it in dune_rules.

bobot · 2021-12-01T09:14:16Z

doc/dev/rule-streaming.md

+Interpreting a `library` stanza requires knowing the set of `.ml`
+files in the current directory. Knowing this requires interpreting
+`copy_files` in the current directory. So the interpretation of
+`library` stanzas will need to go under a `narrow` as well.


Thanks a lot for the write-up; masks are very similar (on purpose, I guess) to the summary of the scoped_dir we discussed. The summary would be given by the user in the scoped_dir stanza and would allow dune to tell when to go inside the scoped_dir, so I propose adding a future extension to the mask.

Suggested change

`library` stanzas will need to go under a `narrow` as well.

`library` stanzas will need to go under a `narrow` as well.

### Extended mask (V2)

Stanzas produce rules that produce targets, but they also produce public binaries and libraries, extend aliases, and list files to install in a package. The computation that loads generated dune files could be suspended on the generated public binary names, library names (public or not), aliases and package names using an extended mask.

So in the same way that the target database is populated lazily and in the right order using the mask and suspension, the public binaries database, libraries database, aliases database and package database would be populated lazily.

val narrow_library : Library_mask.t -> unit Memo.Build.t -> unit Memo.Build.t

val library_mask : ?sublib:bool -> Library_name.t -> Library_mask.t

That makes sense to me. However, do you think this should live in dune_engine? Currently dune_engine doesn't know anything about libraries or binaries.

What would be nice for a V2 is for Mask.t to be extensible, and to have helpers to build lazy databases. Those would live in dune_engine, and the specific masks in dune_rules.

I like this idea. But do you mind putting it in a separate file? I feel like the current doc is complicated enough and I still haven't covered some topics such as stale artifacts deletion.

ghost · 2021-12-02T03:20:35Z

The production of rules definitely needs to be better defined and more strict. While doing some preliminary cleanup, I found:

some dead code: Remove some useless code in odoc #5261
production of the same rules over and over: Stop generation the same rules over and over #5262

snowleopard · 2021-12-02T11:46:15Z

doc/dev/rule-streaming.md

+
+### `Dune_rules.Gen_rules`
+
+The entry of `Dune_rules.Gen_rules` is the `gen_rules` function. It's


Suggested change

The entry of `Dune_rules.Gen_rules` is the `gen_rules` function. It's

The entry point of `Dune_rules.Gen_rules` is the `gen_rules` function. It's

snowleopard · 2021-12-02T11:51:16Z

doc/dev/rule-streaming.md

+filters out the result. But for things to behave well, the unwritten
+following invariant must hold: `gen_rules ~dir:d` is allowed to
+generate rules for directory `d'` iff `gen_rules ~dir:d'` emits a call
+to `Load_rules.load_dir ~dir:d`.


At this point of reading I thought that the current behaviour and invariants are pretty crazy in terms of complexity :)

Agreed! I've been tightening the rule production API since last night and found a lot of odd stuff:

in many places we just keep generating the same rules again and again

in some places we produce rules that will never be looked at

I've added checks and I'm almost done fixing the various places where we do these things.

Yep, just looked at the PRs -- nice clean up!

And just put the last one up: #5270

snowleopard · 2021-12-02T11:54:50Z

doc/dev/rule-streaming.md

+rules when `gen_rules` is called with directory
+`_build/default/src/.foo.objs/byte`, however that would spread out the
+logic for interpreting `library` stanzas. It is much simpler to
+produce all the build rules corresponding to a `library` stanza in one


It is much simpler to produce all the build rules corresponding to a library stanza in one go.

Reading about all the complexity that this leads to, I'm not convinced that it's the optimal design. Maybe splitting up the rules for a library is in fact going to be simpler.

I tried splitting the library rules this way a few times with poor results. This problem also prevents us from adding features like targets for inferred mli's. In general, it really is an annoying limitation that forces us to "invert" the logic of how we want to write our rules.

snowleopard · 2021-12-02T12:08:24Z

@jeremiedimino This is a very helpful document! I now understand much better what's going on and why.

I'm not convinced that the proposed design is the right way to go in the long term but it does seem to solve our short-term problem. I feel like the proposed world is still too complex to survive for long without trouble, so I'd like us to find a simpler, more long-lived solution. But if we can't, then let's go with this idea.

Co-authored-by: Andrey Mokhov <[email protected]>

ghost · 2021-12-02T14:35:13Z

No problem, this write-up was useful for me as well. I've done a series of cleanups, and after #5270 we will already be in a much better place. I'll update the document to reflect the new current design.

ghost · 2021-12-02T21:39:22Z

I'm not convinced that the proposed design is the right way to go in the long term but it does seem to solve our short-term problem. I feel like the proposed world is still too complex to survive for long without trouble, so I'd like us to find a simpler, more long-lived solution. But if we can't, then let's go with this idea.

I agree. We should keep things simple. There is no need to bake in multi-dir rule production in the system. Instead, we can do it via separate utilities.

Here is a much simpler proposal:

gen_rules ~dir is only allowed to produce rules in dir

we provide the following API (I have an implem of this in a branch):

 val Rules.Produce.value : dir:Path.Build.t -> 'a Univ_map.Key.t -> 'a -> unit Memo.Build.t
 val Load_rules.load_value : dir:Path.Build.t -> 'a Univ_map.Key.t -> 'a Memo.Build.t

Then if we want to produce rules for a subtree somewhere, say for the obj dir of a library, we can generate all the rules at once in the directory where the library is defined, store the set of rules for the obj dir in a produced value, and load this value in the directories of the sub-tree. This way we are just composing simple features that are easy to reason about and there are no surprises.
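To make the produce/load idea concrete, here is a minimal self-contained sketch. It is not the branch mentioned above and not the real `Univ_map` or `Load_rules` API: directories are plain strings, rules are reduced to target names, there is no `Memo`, and `create_key`, `produce_value`, `load_value`, `gen_rules_src` and `gen_rules_obj` are hypothetical names chosen for illustration.

```ocaml
(* Hypothetical stand-ins: directories are strings, rules are target names,
   and there is no Memo monad. Not the real Univ_map or Load_rules API. *)

(* A typed key: values are injected into [exn], used as a universal type. *)
type 'a key = { inject : 'a -> exn; project : exn -> 'a option }

let create_key (type a) () : a key =
  let exception E of a in
  { inject = (fun x -> E x); project = (function E x -> Some x | _ -> None) }

(* Values produced so far, indexed by the directory that produced them. *)
let produced : (string, exn) Hashtbl.t = Hashtbl.create 16

let produce_value ~dir key v = Hashtbl.add produced dir (key.inject v)

let load_value ~dir key =
  List.find_map key.project (Hashtbl.find_all produced dir)

(* Key holding the rules of a library's obj dir, as (sub-directory, target). *)
let obj_dir_rules : (string * string) list key = create_key ()

(* Rules for "src": generate everything for the library at once and store it. *)
let gen_rules_src () =
  produce_value ~dir:"src" obj_dir_rules
    [ ("byte", "foo.cmo"); ("native", "foo.cmx") ]

(* Rules for "src/.foo.objs/<subdir>": load the value and keep our share. *)
let gen_rules_obj ~subdir =
  match load_value ~dir:"src" obj_dir_rules with
  | None -> []
  | Some rules ->
    List.filter_map (fun (d, t) -> if d = subdir then Some t else None) rules
```

In this sketch `gen_rules_src ()` has to be called by hand before `gen_rules_obj ~subdir:"byte"`; in the proposal, loading the value would presumably trigger rule production for the defining directory through `Memo`.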

ghost · 2021-12-02T23:07:23Z

Although, it's still not that different from what I proposed before. Maybe we simply haven't pushed the "pull-based" model far enough. Thinking about it again, when we generate the obj dir for a library we have several sub-directories that each correspond to a different stage (bytecode, native, dep graph). So we could think of matching on the sub-directory as matching on the stage, i.e. it's like implementing the logic for building a library as:

let build_lib stage =
  match stage with
  | Byte -> ...
  | Native -> ...
  | Dep_graph -> ...

rather than:

let build_lib () =
  build_byte ();
  build_native ();
  build_dep_graph ()
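A runnable miniature of the first variant, where `gen_rules` maps the requested sub-directory to a stage and only builds that stage. The sub-directory names, the target names and `stage_of_subdir` are assumptions made up for the sketch; rules are reduced to lists of target names.

```ocaml
type stage = Byte | Native | Dep_graph

(* Hypothetical sub-directory names of a library's obj dir. *)
let stage_of_subdir = function
  | "byte" -> Some Byte
  | "native" -> Some Native
  | "deps" -> Some Dep_graph
  | _ -> None

(* Hypothetical targets; real rule production would go in each branch. *)
let build_lib = function
  | Byte -> [ "foo.cmo" ]
  | Native -> [ "foo.cmx" ]
  | Dep_graph -> [ "foo.ml.d" ]

(* Only the stage matching the requested sub-directory is evaluated. *)
let gen_rules ~subdir =
  match stage_of_subdir subdir with
  | Some stage -> build_lib stage
  | None -> []
```

With this shape, asking for `gen_rules ~subdir:"byte"` never touches the native or dependency-graph branches.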

bobot · 2021-12-08T10:24:00Z

To push even further, the different stages seem like something that can be part of the mask.

let build_lib = narrow (stage_mask Byte) build_byte ++ narrow (stage_mask Native) build_native ++ ...

ghost · 2021-12-08T11:23:56Z

If we go in the pull-based direction, then we would forget about masks and narrowing.

To be more precise, let's formalise things a bit. Let's consider the internal computation DAG of Dune, together with an eval function to evaluate a node of this DAG.

The design I proposed in this PR is what I would call "push-based". In this model, the DAG is represented as follows:

type node =
  | Node of (value * node list) Memo.Lazy.t

then nodes are referred to via a path, and eval walks this path to find the right node:

let rec eval (Node n) path =
  let* v, deps = Memo.Lazy.force n in
  match path with
  | [] -> Memo.Build.return v
  | x :: path -> eval (List.nth deps x) path

Then the rule provider explicitly constructs this DAG. I call it "push-based" because it feels like going forward:

read the toplevel dune file
produce some rules for the toplevel directory
produce a list of nodes for recursing in sub-directories
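The push-based model can be tried out in miniature. This sketch assumes the standard library's `Lazy` in place of `Memo.Lazy` and plain strings as node values; `leaf`, `root` and the directory names are made up for illustration.

```ocaml
(* Stdlib [Lazy] stands in for [Memo.Lazy]; node values are strings. *)
type node = Node of (string * node list) Lazy.t

(* Walk [path], a list of child indices, forcing nodes along the way. *)
let rec eval (Node n) path =
  let v, deps = Lazy.force n in
  match path with
  | [] -> v
  | x :: path -> eval (List.nth deps x) path

let leaf v = Node (lazy (v, []))

(* The rule provider builds the DAG explicitly, from the top down. *)
let root =
  Node (lazy ("rules for .", [ leaf "rules for src"; leaf "rules for test" ]))
```

Here `eval root [ 1 ]` forces the root and its second child only; the first child stays unevaluated.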

In what I would call "pull-based", the rule provider doesn't explicitly construct the DAG and instead provides the "eval" function directly, which is memoised. eval calls itself recursively. I call it "pull-based" because it feels like we are going backward and pulling values as we need them. In practice, eval would start by analysing the path and dispatching to the right provider.
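A comparable miniature of the pull-based model, assuming a `Hashtbl` as a stand-in for `Memo` and paths given as lists of components. The providers, the `evaluations` counter and the directory names are hypothetical.

```ocaml
(* A Hashtbl stands in for Memo; paths are lists of components. *)
let cache : (string list, string) Hashtbl.t = Hashtbl.create 16

(* Counts how many times a provider actually ran. *)
let evaluations = ref 0

let rec eval (path : string list) : string =
  match Hashtbl.find_opt cache path with
  | Some v -> v
  | None ->
    incr evaluations;
    (* Analyse the path and dispatch to the right (hypothetical) provider;
       providers pull what they need by calling [eval] again. *)
    let v =
      match path with
      | [] -> "root"
      | [ "src" ] -> "lib foo defined in " ^ eval []
      | [ "src"; ".foo.objs" ] -> "objects of (" ^ eval [ "src" ] ^ ")"
      | _ -> "no rules"
    in
    Hashtbl.add cache path v;
    v
```

Evaluating `[ "src"; ".foo.objs" ]` pulls `[ "src" ]`, which pulls `[]`; a later request for any of the three is answered from the cache.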

gen_rules as we have it now at the tip of main is pull-based. And the design proposed in this PR is a mix between pull-based and push-based: overall, it's pull-based but inside each directory we have some limited push-based. In my last comment, what I wanted to say is: "maybe we shouldn't try to mix, and instead stick to pull-based for everything".

FTR, Memo is optimised for pull-based.

rgrinberg · 2023-02-28T18:35:58Z

Merged the proposal document into the repo

rgrinberg · 2023-08-07T11:04:57Z

I had a serious look at implementing this. I don't have a PR ready yet, but I can report on my experience so far.

The proposal has a couple of disadvantages that haven't yet been discussed:

1. Some error handling is delayed. For example, suppose we are generating under two masks:

mask (prefix "a") (fun () -> (* create a rule that generates b *)) >>>
mask (prefix "b") (fun () -> (* create a rule that generates b *))

The rules under mask prefix "a" are invalid because they are outside the mask. But if we're only loading a subset of the rules (e.g. under mask prefix "b"), we will never detect this.

2. Memoization becomes less effective. Consider our currently memoized rule loading function:

val load : dir:Path.t -> Loaded.t Memo.t

It would have to become:

val load : dir:Path.t -> Mask.t -> Loaded.t Memo.t

This means that we'll either have to memoize for every loaded mask or drop this memoization step completely. Memoizing this function with arbitrary masks might become quite expensive as there's no limit as to how many masks would be used, and there's no good way to share computation between overlapping masks.
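The lack of sharing between overlapping masks can be illustrated with a toy model. It assumes masks are plain string prefixes and a directory's rules are a fixed list of target names; `all_rules`, `runs` and the `Hashtbl` cache are made up for the sketch and are not how `Memo` works internally.

```ocaml
(* Hypothetical model: a mask is a prefix, rules are target names. *)
let all_rules = [ "a/x"; "a/b/y"; "c/z" ]

let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Memoization keyed on the (dir, mask) pair. *)
let cache : (string * string, string list) Hashtbl.t = Hashtbl.create 16

(* Counts how many times loading actually ran. *)
let runs = ref 0

let load ~dir mask =
  match Hashtbl.find_opt cache (dir, mask) with
  | Some loaded -> loaded
  | None ->
    incr runs;
    let loaded = List.filter (starts_with ~prefix:mask) all_rules in
    Hashtbl.add cache (dir, mask) loaded;
    loaded
```

Loading under `"a"` and then under `"a/b"` runs twice, even though the second result is a subset of the first; only a repeated request with the exact same mask hits the cache.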

The counterargument to both issues above is that you only pay these costs if you use this feature. When you don't use this feature, you are effectively generating everything under the mask *, in which case we can have full error handling and keep memoization just for this mask. For 2., we should also mention that load doesn't do much computationally intensive work apart from running the user callback (it merges the rules into a map and does some validation), and the user callback will remain completely memoized.
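The delayed error from point 1, and its detection when everything is loaded, can be reproduced in a toy model. It assumes a mask is a target-name prefix, a rule is just its target name, and the empty prefix plays the role of `*`; `block`, `load` and `blocks` are hypothetical.

```ocaml
(* Hypothetical model: a mask is a prefix, a rule is its target name. *)
let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* A callback producing rules under a mask. *)
type block = { prefix : string; gen : unit -> string list }

(* Load the rules overlapping [requested]; blocks whose mask does not
   overlap are skipped, so their callbacks never run. *)
let load ~requested blocks =
  List.concat_map
    (fun { prefix; gen } ->
      let overlaps =
        starts_with ~prefix:requested prefix || starts_with ~prefix requested
      in
      if not overlaps then []
      else
        List.map
          (fun target ->
            if starts_with ~prefix target then target
            else failwith ("target outside mask: " ^ target))
          (gen ()))
    blocks

(* Mirrors the example: the first block wrongly generates "b" under "a". *)
let blocks =
  [ { prefix = "a"; gen = (fun () -> [ "b" ]) }
  ; { prefix = "b"; gen = (fun () -> [ "b" ]) }
  ]
```

`load ~requested:"b" blocks` succeeds without ever running the invalid block, while `load ~requested:"" blocks` raises `Failure`.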

I'll also mention some other notable things regarding the implementation.

The type of masks must be such that we can have the following operations done quickly:

val is_empty : mask -> bool
val inter : mask -> mask -> mask

This is because when we're loading under a mask, we need to keep narrowing it as we descend through the user-provided callback. Once we've hit an empty mask, we aren't interested in loading rules any further. These constraints rule out a natural implementation for the mask such as Path.t Predicate_lang.t.
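One representation that makes both operations cheap is a single path prefix. This is only a sketch under that assumption, not a proposal for the actual type; `Under` and `is_prefix` are hypothetical names, and a real mask would likely need to be richer.

```ocaml
(* Hypothetical mask: either nothing, or everything under a path prefix
   given as a list of components. *)
type mask = Empty | Under of string list

let is_empty = function
  | Empty -> true
  | Under _ -> false

(* [is_prefix p q] holds when [p] is a component-wise prefix of [q]. *)
let rec is_prefix p q =
  match p, q with
  | [], _ -> true
  | x :: p', y :: q' -> String.equal x y && is_prefix p' q'
  | _ :: _, [] -> false

(* Two prefixes intersect only if one contains the other; the result is
   the deeper one. Cost is linear in the length of the shorter path. *)
let inter m1 m2 =
  match m1, m2 with
  | Empty, _ | _, Empty -> Empty
  | Under p, Under q ->
    if is_prefix p q then Under q
    else if is_prefix q p then Under p
    else Empty
```

Narrowing `Under [ "a" ]` with `Under [ "b" ]` gives `Empty` immediately, which is the signal to stop descending into the callback.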

In the original design doc, the mask is for both file and directory targets. I think keeping the masks separate would prove better. In addition, we should also consider a separate mask for aliases and "generated sub-directories". This will all give us more flexibility without much additional complexity.

Add design doc for rule streaming feature

3118fd7

Signed-off-by: Jeremie Dimino <[email protected]>

rgrinberg reviewed Nov 30, 2021

bobot reviewed Dec 1, 2021
