-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Extend embuilder deferred building mode to ports #23924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This version requires threading the 'deferred' flag through each port definition.
tools/ports/__init__.py
Outdated
@@ -36,6 +36,7 @@ | |||
|
|||
logger = logging.getLogger('ports') | |||
|
|||
build_deferred = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This global variable is ugly but it allows this approach to work without having to edit all the ports definition files. The alternative would be to thread the deferred argument through the get()
of each port and its calls of ports.build_port
.
tools/ports/sdl2_gfx.py
Outdated
@@ -16,8 +16,6 @@ def needed(settings): | |||
|
|||
|
|||
def get(ports, settings, shared): | |||
sdl_build = os.path.join(ports.get_build_dir(), 'sdl2') | |||
assert os.path.exists(sdl_build), 'You must use SDL2 to use SDL2_gfx' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work when the actual library build is deferred, but the dependency code in ports/init.py should ensure that the dependency is met.
This does seem to work now, and cut about 4 minutes off of the build-linux stage on the critical path. seems like it might be worthwhile to consider? |
.circleci/config.yml
Outdated
@@ -499,7 +499,7 @@ jobs: | |||
- build-libs | |||
- run: | |||
name: Clean build directory | |||
command: rm -rf ~/cache/build | |||
command: rm -rf ~/cache/build ~/cache/ports-builds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this directory cache/build/ports
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go either way on that. After #23932 they are all just together under /build in their own uniquely-named directory, which seems fine to me.
tools/cache.py
Outdated
@@ -129,6 +129,10 @@ def get_lib_dir(absolute): | |||
return path | |||
|
|||
|
|||
def get_lib_subdir(): | |||
return get_lib_dir(False).relative_to(Path(get_sysroot(False), 'lib')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do this the other way around? i.e put the subdir calculation logic here in this function and then have get_lib_dir
call get_lib_subdir
?
def get_lib_dir(absolute):
ensure_setup()
libroot = Path(get_sysroot(absolute=absolute), 'lib')
return Path(libroot, get_lib_subdir()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is removed in the current version.
This allows building them simultaneously with Ninja. Split out from emscripten-core#23924
This allows building them simultaneously with Ninja. Split out from #23924
tools/ports/__init__.py
Outdated
@@ -578,9 +584,13 @@ def add_cflags(args, settings): # noqa: U100 | |||
|
|||
# Now get (i.e. build) the ports in dependency order. This is important because the | |||
# headers from one ports might be needed before we can build the next. | |||
global build_deferred | |||
assert not build_deferred, "add_cflags shouldn't be called from embuilder" | |||
build_deferred = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When are the deferred ports actually built?
If we defer the building of the port does when only compiling does that mean that the port headers will be installed N times if we do emcc -c -sUSE_SDL2
for each source file, and then once again at link time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deferred libs and ports are only built when using embuilder.py. It explicitly invokes ninja (via system_libs.build_deferred()) after all the calls to port.build.
emcc never builds anything deferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this code does nothing outside of embuilder?
Is it still needed? I find the assertion a little strange. If add_cflags is never called from embuilder then why are we needing to set build_deferred
here (which means nothing outside of embuilder).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, so here's the issue. The headers for any dependencies need to be in place in order to build object files. So (as the comment on top of the function suggests), this can normally cause those dependencies to be built to get the headers in place. When building with embuilder we want all the object files to be deferred and built by the single invocation of ninja at the end (i.e. we don't want the invocations of emcc that build the object files to cause additional object files and archives to be built). But emcc is in a different process from embuilder, so it doesn't have access to the build_deferred
global in the embuilder process. So what was happening was that some of the libraries were getting built twice (once as a result of add_cflags
inside an invocation of emcc, and once from the top-level ninja file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this use of build_deferred
is for emcc
usage and not for embuilder usage?
So emcc always installs headers in build_deffered
mode now?
Doesn't that mean that if I compile 1000 source files I will get the headers for any ports use installed 1000 separate times? Since none of the compile steps actually build the library (due to being deffered) each compile will run as if the library is completely missing and go ahead and install the headers, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locally it doesn't seem like cache lock contention is too much of an issue (after the port is unpacked, which only happens ones) but I guess the create
function is called under the lock, so the header check happens with the lock. Another env variable would certainly do the trick. It could be scoped to skipping ports as you suggested, or it could be used more generally to mean that we are running an embuilder library build. It coudl just replace the build_deferred
global variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its too gross for me I think.
At the very least we should somehow ensure this behaviour change is only for embuilder. As it stands this would effect all USE_NINJA=1
users. How about a separate USE_DEFFERED_BUILD=1
or something like that that embuilder can set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was initially not thrilled about another env var, but the fact that it can replace all the uses of this global is nice. I'll go with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this a little more. Using the env var in place of the global var here does make it a bit nicer. But I realized that we don't actually need that in order to keep the headers from being installed redundantly. we just need to omit the ports.get
for the dependency ports under embuilder. It's nicer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this works; WDYT?
@@ -280,6 +281,9 @@ def main(): | |||
if auto_tasks: | |||
print('Building targets: %s' % ' '.join(tasks)) | |||
|
|||
if USE_NINJA: | |||
os.environ['EMBUILDER_PORT_BUILD_DEFERRED'] = '1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So its not currently possible use ninja in non-deferred mode? I guess thats OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ninja still works in non-deferred mode. If you do EMCC_USE_NINJA=1 test/runner.py core0
it will build the libraries on-demand in non-deferred mode with ninja. Deferred mode only makes sense when using embuilder though.
Any more thoughts or questions on this one? |
The remaining failures here are the flaky SDL mouse and audio tests we've been seeing. Let's either land this or decide to kill it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm % nits
with get_port_variant(port_name) as port_name: | ||
ports.build_port(port_name, settings) | ||
with get_port_variant(port_name) as port_name_base: | ||
ports.build_port(port_name_base, settings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is maybe more like "full_name" ?
if not deferred: | ||
assert cachename.exists() | ||
if not os.getenv('EMBUILDER_PORT_BUILD_DEFERRED'): | ||
assert cachename.is_file() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather have this check done in tools/ports/__init__.py
if possible and continue to pass
deferred
here as an argument.
The rational being that EMBUILDER_PORT_BUILD_DEFERRED
should logically only effect ports and I don't really want post-specific stuff here in the general purpose caching code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that does make sense; the only problem is that cache.get()
is called (via cache.get_lib()
) from every port build script. So we'd have to update all of them to pass through the deferred
argument manually. Or else come up with some other weird solution (like making a fake/empty archive file that will pass the assertion but then get overwritten when we run ninja or something).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough LGTM. Maybe add a comment as to why this needs to be here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do that (the only other idea I had that didn't involve updating all the port files was creating an empty archive file that would get overwritten when ninja was run. That seemed worse to me than just having the file not be there, but if you'd prefer that let me know).
This allows embuilder build ALL to mix the ports builds together with the
system_libs builds, for even faster build-linux speed.
Headers are still installed eagerly during ports.get() (to allow dependencies to
build their object files) but does not run ninja to build the ports until
the explicit step at the end.