Description
Compiling files in parallel is an opportunity for a productivity boost in large BSV designs. But is it best to achieve this inside the compiler (e.g. using Haskell's Control.Concurrent
library) or outside (e.g. using a bluetcl
script to extract depdendencies and generate a parallel Makefile)?
Doing it inside the compiler would certainly be simpler for users, but how hard would it be to get right?
I've had a quick scan of the code, and my first impression is that maybe it is not difficult. I propose a high-level strategy below. Probably though, I've missed important corner cases or complications. So I'd like to solict feedback (if anyone has time) before looking much further.
The strategy has three main parts.
-
Identifying dependencies. Looks like this is done by the
transClose
function, which returns a list of(package, [import])
pairs, i.e. the import graph. -
Topological sort. Currently, the import graph is topologically sorted and reversed to give a flat list of files to compile sequentially, in order. I propose to keep the existing topological sort just to check for cycles, but to have
chkDeps
return a richer structure: the original import graph. We will still perform a topological sort, but to maximise parallelism, this will be done dynamically during the parallel build process (because we don't know in advance how long it will take to build each individual file). -
Parallel build. Currently, the files are compiled sequentially by a
foldM comp
incompile_with_deps
. I propose to replace this call tofoldM
with a new parallel build function, which operates as follows.-
First, find all leaves of the import graph, and add them to a work queue.
-
Second, create a pool of worker threads to consume from the work queue and call the
comp
function. -
Third, when a worker finishes compiling a file, remove all the incoming edges to that file from the import graph. If this exposes any new leaves, add them to the work queue. (This is the dynamic reverse topological sort.)
-
Repeat the third step until the work queue is empty and all workers are idle.
-
This is all probably obvious, but it looks like it could work. Am I missing anything important or perhaps even a show-stopper? If not, we may have time to prototype the idea, and report back our findings.