Skip to content

Extend embuilder deferred building mode to ports #23924

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Apr 24, 2025
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion embuilder.py
Original file line number Diff line number Diff line change
Expand Up @@ -171,7 +171,7 @@ def clear_port(port_name):

def build_port(port_name):
with get_port_variant(port_name) as port_name:
ports.build_port(port_name, settings)
ports.build_port(port_name, settings, deferred=True)


def get_system_tasks():
Expand Down
2 changes: 0 additions & 2 deletions tools/cache.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,8 +177,6 @@ def get(shortname, creator, what=None, force=False, quiet=False, deferred=False)
logger.info(message)
utils.safe_ensure_dirs(cachename.parent)
creator(str(cachename))
if not deferred:
assert cachename.exists()
if not quiet:
logger.info(' - ok')

Expand Down
13 changes: 10 additions & 3 deletions tools/ports/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@

logger = logging.getLogger('ports')

build_deferred = False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This global variable is ugly but it allows this approach to work without having to edit all the ports definition files. The alternative would be to thread the deferred argument through the get() of each port and its calls of ports.build_port.


def init_port(name, port):
ports.append(port)
Expand Down Expand Up @@ -199,7 +200,8 @@ def build_port(src_dir, output_path, port_name, includes=[], flags=[], cxxflags=
ninja_file = os.path.join(build_dir, 'build.ninja')
system_libs.ensure_sysroot()
system_libs.create_ninja_file(srcs, ninja_file, output_path, cflags=cflags)
system_libs.run_ninja(build_dir)
if not build_deferred:
system_libs.run_ninja(build_dir)
else:
commands = []
objects = []
Expand Down Expand Up @@ -525,7 +527,9 @@ def get_needed_ports(settings):
return needed


def build_port(port_name, settings):
def build_port(port_name, settings, deferred=False):
global build_deferred
build_deferred = deferred
port = ports_by_name[port_name]
port_set = OrderedSet([port])
resolve_dependencies(port_set, settings)
Expand Down Expand Up @@ -573,10 +577,13 @@ def add_cflags(args, settings): # noqa: U100

# Now get (i.e. build) the ports in dependency order. This is important because the
# headers from one ports might be needed before we can build the next.
global build_deferred
assert not build_deferred, "add_cflags shouldn't be called from embuilder"
build_deferred = True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When are the deferred ports actually built?

If we defer the building of the port does when only compiling does that mean that the port headers will be installed N times if we do emcc -c -sUSE_SDL2 for each source file, and then once again at link time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deferred libs and ports are only built when using embuilder.py. It explicitly invokes ninja (via system_libs.build_deferred()) after all the calls to port.build.
emcc never builds anything deferred.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this code does nothing outside of embuilder?

Is it still needed? I find the assertion a little strange. If add_cflags is never called from embuilder then why are we needing to set build_deferred here (which means nothing outside of embuilder).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, so here's the issue. The headers for any dependencies need to be in place in order to build object files. So (as the comment on top of the function suggests), this can normally cause those dependencies to be built to get the headers in place. When building with embuilder we want all the object files to be deferred and built by the single invocation of ninja at the end (i.e. we don't want the invocations of emcc that build the object files to cause additional object files and archives to be built). But emcc is in a different process from embuilder, so it doesn't have access to the build_deferred global in the embuilder process. So what was happening was that some of the libraries were getting built twice (once as a result of add_cflags inside an invocation of emcc, and once from the top-level ninja file).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this use of build_deferred is for emcc usage and not for embuilder usage?

So emcc always installs headers in build_deffered mode now?

Doesn't that mean that if I compile 1000 source files I will get the headers for any ports use installed 1000 separate times? Since none of the compile steps actually build the library (due to being deffered) each compile will run as if the library is completely missing and go ahead and install the headers, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locally it doesn't seem like cache lock contention is too much of an issue (after the port is unpacked, which only happens ones) but I guess the create function is called under the lock, so the header check happens with the lock. Another env variable would certainly do the trick. It could be scoped to skipping ports as you suggested, or it could be used more generally to mean that we are running an embuilder library build. It coudl just replace the build_deferred global variable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its too gross for me I think.

At the very least we should somehow ensure this behaviour change is only for embuilder. As it stands this would effect all USE_NINJA=1 users. How about a separate USE_DEFFERED_BUILD=1 or something like that that embuilder can set?

Copy link
Member Author

@dschuff dschuff Mar 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was initially not thrilled about another env var, but the fact that it can replace all the uses of this global is nice. I'll go with that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this a little more. Using the env var in place of the global var here does make it a bit nicer. But I realized that we don't actually need that in order to keep the headers from being installed redundantly. we just need to omit the ports.get for the dependency ports under embuilder. It's nicer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works; WDYT?

for port in dependency_order(needed):
port.get(Ports, settings, shared)
args += port.process_args(Ports)

build_deferred = False

def show_ports():
sorted_ports = sorted(ports, key=lambda p: p.name)
Expand Down
Loading