Parallel compilation of large BSV projects #165
Comments
We have just gotten an external parallel build working using bluetcl and make. The working bits in the end were very simple. First, we have a bluetcl script that generates a dependency graph of make rules thusly:

    foreach i [depend make $topfile] {
        set tgt [lindex $i 0]
        set deps [join [lindex $i 1]]
        puts [append tgt ": " $deps]
    }

Then the following rule can build any %.bo in the design, along with any Verilog modules that have been delineated using synthesis boundaries.

    %.bo:
    	bsc -verilog $(BSC_COMPILATION_FLAGS) -p $(BSC_PATH) $<

Given that this seems to be rather trivial, one would imagine that implementing this behaviour by default in the compiler would not be too difficult. Having said that it is trivial, it should be noted that this took two of us most of a week to arrive at, and there are a large number of people using a much less efficient build (which was developed by the Bluespec company, no less!), so it would be much appreciated if this behaviour were integrated and enabled with a flag.
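For readers less familiar with Tcl, the rule-generation loop above can be sketched in Python. This is only an illustration: it assumes, as the Tcl loop implies, that the dependency query yields (target, list of dependencies) pairs, and the file names below are hypothetical, not taken from any real design.

```python
def make_rules(dep_pairs):
    """Format (target, [deps]) pairs as make rules, one 'target: deps' line each."""
    return [f"{tgt}: {' '.join(deps)}" for tgt, deps in dep_pairs]


# Hypothetical dependency pairs, standing in for what the compiler would report.
pairs = [
    ("build_dir/Top.bo", ["src/Top.bsv", "build_dir/Cache.bo"]),
    ("build_dir/Cache.bo", ["src/Cache.bsv"]),
]

rules = make_rules(pairs)
print("\n".join(rules))
```

Once such rules are included in a makefile alongside a generic %.bo recipe, make's own scheduler decides what can be built concurrently (for example with `make -j`).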
@jonwoodruff where can I check out your external parallel build environment? I would like to use it for some of my projects.
@neelgala You can find the build we're using in: genDependencies.tcl is the one that asks the Bluespec compiler to return a dependency graph structure, and then generates makefile rules. This one takes the flags that should be passed to the Bluespec build in environment variables. There is a rule in Include_verilator.mk that invokes this script to generate the makefile include with these dependencies. The rules from Include_verilator.mk are below:

    .depends.mk: $(REPO)/builds/Resources/genDependencies.tcl build_dir
    	$(info generating bsv dependency graph)
    	BSC_PATH=$(BSC_PATH) \
    	BSC_DEFINES="$(BSC_DEFINES)" \
    	BSC_BUILDDIR=build_dir \
    	BSC_TOPFILE=$(REPO)/src_Testbench/Top/Top_HW_Side.bsv \
    	OUTPUTFILE=$@ \
    	$<

    include .depends.mk

    %.bo:
    	$(info building $@)
    	bsc -verilog -elab -bdir build_dir -vdir Verilog_RTL $(BSC_COMPILATION_FLAGS) -p $(BSC_PATH) $<

Sorry for the delay! We've been testing things before we merged it into our main branch.
Hi all,

It sounds like this is no longer a pressing issue for us at the moment. The Bluetcl-generated makefile is doing the job, so I'll close the issue. Of course, it still might be useful to have a parallel build feature inside the compiler itself, and I'm still interested to hear anyone's feedback on that idea. So please feel free to reopen.
Compiling files in parallel is an opportunity for a productivity boost in large BSV designs. But is it best to achieve this inside the compiler (e.g. using Haskell's `Control.Concurrent` library) or outside (e.g. using a `bluetcl` script to extract dependencies and generate a parallel Makefile)?

Doing it inside the compiler would certainly be simpler for users, but how hard would it be to get right?

I've had a quick scan of the code, and my first impression is that maybe it is not difficult. I propose a high-level strategy below. Probably, though, I've missed important corner cases or complications, so I'd like to solicit feedback (if anyone has time) before looking much further.
The strategy has three main parts.

1. Identifying dependencies. Looks like this is done by the `transClose` function, which returns a list of `(package, [import])` pairs, i.e. the import graph.

2. Topological sort. Currently, the import graph is topologically sorted and reversed to give a flat list of files to compile sequentially, in order. I propose to keep the existing topological sort just to check for cycles, but to have `chkDeps` return a richer structure: the original import graph. We will still perform a topological sort, but to maximise parallelism, this will be done dynamically during the parallel build process (because we don't know in advance how long it will take to build each individual file).

3. Parallel build. Currently, the files are compiled sequentially by a `foldM comp` in `compile_with_deps`. I propose to replace this call to `foldM` with a new parallel build function, which operates as follows.

   - First, find all leaves of the import graph, and add them to a work queue.
   - Second, create a pool of worker threads to consume from the work queue and call the `comp` function.
   - Third, when a worker finishes compiling a file, remove all the incoming edges to that file from the import graph. If this exposes any new leaves, add them to the work queue. (This is the dynamic reverse topological sort.)
   - Repeat the third step until the work queue is empty and all workers are idle.
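To make the scheduling idea concrete, here is a minimal sketch of the steps above in Python rather than Haskell, purely as an illustration and not as a proposal for the actual compiler code. It makes several assumptions: the import graph is a dictionary in which every imported package also appears as a key, `comp` is a stand-in callable for the compiler's per-file compile step, and `parallel_build` and `num_workers` are hypothetical names. Instead of literally deleting edges, it counts each package's unfinished imports, which has the same effect; and instead of an explicit work queue, it submits ready packages to a standard thread pool.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def parallel_build(import_graph, comp, num_workers=4):
    """Run comp on every package, imports first, independent packages in parallel.

    import_graph maps each package to the list of packages it imports.
    Returns the packages in the order they finished compiling.
    """
    # Count of not-yet-compiled imports per package (the remaining edges).
    pending = {pkg: len(set(imports)) for pkg, imports in import_graph.items()}
    # Reverse edges: for each package, the packages that import it.
    importers = {pkg: [] for pkg in import_graph}
    for pkg, imports in import_graph.items():
        for dep in set(imports):
            importers[dep].append(pkg)

    finished = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # First step: the leaves (no imports) are ready immediately.
        running = {pool.submit(comp, pkg): pkg
                   for pkg, count in pending.items() if count == 0}
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                pkg = running.pop(fut)
                fut.result()  # re-raise any error from comp
                finished.append(pkg)
                # Third step: drop edges into pkg and schedule newly exposed leaves.
                for parent in importers[pkg]:
                    pending[parent] -= 1
                    if pending[parent] == 0:
                        running[pool.submit(comp, parent)] = parent

    # Anything never scheduled must sit on an import cycle.
    if len(finished) != len(import_graph):
        raise ValueError("import cycle detected")
    return finished


# Hypothetical four-package design: Core and Cache can compile concurrently.
graph = {"Top": ["Cache", "Core"], "Core": ["Types"], "Cache": ["Types"], "Types": []}
order = parallel_build(graph, comp=lambda pkg: None)
print(order)
```

In this example `Types` always finishes first and `Top` last, while the relative order of `Core` and `Cache` depends on timing, which is exactly the freedom the dynamic sort is meant to exploit.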
This is all probably obvious, but it looks like it could work. Am I missing anything important or perhaps even a show-stopper? If not, we may have time to prototype the idea, and report back our findings.