Description
RAR-as-a-service implementation spec
See documentation/specs/archive/rar-as-service.md for a brief background of goals and overall architecture. Although that doc was written for a previous prototype of this project, it mostly still applies outside of implementation details.
This is a detailed implementation overview of RAR-as-a-service, intended to give a big-picture view of currently completed work, design choices, etc., and lay out the critical path / blockers for PRs.
Out for review:
- 1: Node lifecycle
Completed, but blocked on PR:
- 2: Serialization (blocked on 1)
- 3: Task Execution (blocked on 2)
- 4: Caching (blocked on 2)
Needs design:
- 5: Environment snapshots (needs feedback) (blocked on 2)
6: Miscellaneous Notes
Additional related perf work:
(These aren't blocking but are still important for perf; they will be broken out into a separate issue):
Pull 11640: Support string interning / deduplication within packets
Pull 11638: Fix TaskParameterTaskItem serialization perf
Pull 11635: Optimize TaskItem cloning between Engine and Tasks
1: Node lifecycle
Pull 11383: Implement out-of-proc RAR node lifecycle
Control Flags
BuildParameters.EnableRarNode
enables both launching the out-of-proc RAR node, and configuring all RAR tasks in the build to forward execution to the node. This flag will also be set via the command line switch (naming open for discussion):
msbuild.exe -enableRarNode
Note: To avoid publicly exposing the feature until complete, -enableRarNode is currently replaced by the environment variable MSBuildRarNode.
A new node mode is added to configure MSBuild to start as the RAR node, displaying as just another msbuild.exe
process:
msbuild.exe /nodemode:3
Tasks unfortunately don't have access to BuildParameters
, so we'd still need some method to signal RAR to connect to the out-of-proc node, tied to the build and not the environment.
There's already precedent for using EngineServices for this purpose (see IsTaskInputLoggingEnabled, only used by RAR), so it seems like the obvious choice to plumb this through to the task:
// Framework/EngineServices.cs
abstract class EngineServices
{
public const int Version2 = 2;
virtual bool IsOutOfProcRarNodeEnabled => /* true when EnableRarNode is set + we successfully launched or found the node */;
}
Finally, any project can explicitly opt out of running RAR out-of-proc, even if set via the BuildParameters
flag:
<!-- Microsoft.Common.CurrentVersion.targets -->
<PropertyGroup>
<!-- Defaults to true, but actual execution is still dependent on the BuildParameters flag -->
<ResolveAssemblyReferencesOutOfProcess Condition="'$(ResolveAssemblyReferencesOutOfProcess)' == ''">true</ResolveAssemblyReferencesOutOfProcess>
</PropertyGroup>
Node launch / existence check
With BuildParameters.EnableRarNode
set, MSBuild checks for an existing named pipe instance matching the name MSBuildRarNode-{handshake}
. To avoid blocking the build, we avoid going through the sequence of creating a pipe client and performing a handshake, or relying on interprocess synchronization methods such as shared mutexes.
Instead, we can just perform a quick file probe to see if the handle exists, and continue on with the build, assuming the server will handle any potential race conditions. This looks slightly different depending on the platform:
- Windows - All named pipes are prefixed with the pipe root \\.\pipe\, and are partially treated as file system objects compatible with read/write APIs. However, attempting to directly probe the path will crash the pipe server due to undefined behavior. Instead, we need to enumerate \\.\pipe\ as a directory and match on the name.
- macOS / Linux - Sockets are represented as files and don't have an OS-defined location. We always place them under /tmp, so we can just directly call File.Exists().
If this fails to find a match, we kick off a new MSBuild process in /nodemode:3
. NodeLauncher
automatically handles locating the appropriate MSBuild executable / host + assembly for .NET Framework / .NET Core.
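As a rough illustration of the probe described above, here is a minimal sketch. The class and method names are hypothetical, and the socket directory is passed in as a parameter (defaulting to /tmp) purely so the logic can be exercised in isolation; the real node derives the pipe name from the handshake.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical helper, not the actual MSBuild implementation.
public static class RarNodeProbe
{
    private const string WindowsPipeRoot = @"\\.\pipe\";

    public static bool PipeExists(string pipeName, string unixSocketDirectory = "/tmp")
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            // Enumerate the pipe root as a directory and match on the name,
            // instead of probing the pipe path directly.
            return Directory.EnumerateFiles(WindowsPipeRoot)
                .Any(path => path.EndsWith(@"\" + pipeName, StringComparison.OrdinalIgnoreCase));
        }

        // On macOS / Linux the socket is a plain file system entry.
        return File.Exists(Path.Combine(unixSocketDirectory, pipeName));
    }
}
```

The probe is intentionally cheap and racy: a false negative only means a second node is launched, which the server side is expected to reject.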
Node (RAR Service)
When /nodemode:3
is read from command line arguments, MSBuild will set itself up as the RAR node. The RAR node uses two sets of pipe servers, connected via separate handles:
MSBuildRarNode-{handshake.ComputeHash()}
- Single-instance pipe server
- Receives messages from MSBuild execution
- Manages the setup/teardown of the node and its workers
- Initializes any shared caching and global settings
- Starts up the multi-instance pipe workers to await RAR requests
MSBuildRarNodeEndpoint-{handshake.ComputeHash()}
- Multi-instance pipe server
- Receives commands from the RAR task and executes requests
- Connection lives
- Does not manage its own lifetime, apart from observing cancellation
- Maintains an active connection with a single MSBuild node for the duration of a build
This structure is important for enforcing that only a single RAR node process can exist. Ideally we'd only need the multi-instance server, and let maxNumberOfServerInstances
(parameter for NamedPipeServerStream
) automatically handle this for us. But because multi-instance pipes can live across multiple processes on Windows, this could create a race condition where pipe server instances are split across nodes.
By setting up MSBuildRarNode-{*} before the worker pipes, the single instance acts as an implicit mutex. In the event of a race, the other node will fail to create the server and shut down.
# RAR node launches as first server instance.
❯ <path-to-msbuild>\MSBuild.exe /nodemode:3
MSBuild version <version> for .NET Framework
# Second RAR node launches, errors out.
# Should not get here in normal conditions.
❯ <path-to-msbuild>\MSBuild.exe /nologo /nodemode:3
MSBuild version <version> for .NET Framework
MSBUILD : error MSB1025: An internal failure occurred while running MSBuild.
System.InvalidOperationException: RAR node is already running.
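The "pipe as implicit mutex" idea can be sketched as follows. This is an illustrative helper with hypothetical names, not the node's actual code. It relies on NamedPipeServerStream refusing a second server instance for the same name when maxNumberOfServerInstances is 1. Both exception types are caught because the exact exception can differ by platform and by how the first instance was created.

```csharp
using System;
using System.IO;
using System.IO.Pipes;

// Hypothetical helper illustrating the single-instance guard.
public static class SingleInstanceGuard
{
    // Returns the server stream if this caller won the race, otherwise null.
    public static NamedPipeServerStream TryAcquire(string pipeName)
    {
        try
        {
            return new NamedPipeServerStream(
                pipeName,
                PipeDirection.InOut,
                1, // maxNumberOfServerInstances: only one owner may exist
                PipeTransmissionMode.Byte,
                PipeOptions.Asynchronous);
        }
        catch (Exception e) when (e is IOException || e is UnauthorizedAccessException)
        {
            // Another owner already holds the pipe name.
            return null;
        }
    }
}
```

A node that gets null back would report "RAR node is already running" and exit, as in the console output above. The winner keeps the stream alive for its whole lifetime, so the name stays reserved.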
Note: These node implementations currently live separate from the rest of MSBuild's IPC Node infrastructure, over in Microsoft.Build.Tasks
, due to RAR type dependencies for both the client and server. As such, some logic is duplicated (e.g. pipe setup, handshakes), although there's work-in-progress to share code where possible. E.g. Pull 11546: Consolidate common IPC / named pipe code
Handshake
Our Handshake
implementation needs to be independent of the build session, so the standard implementation used for out-of-proc nodes would not work.
For now, we can reuse the existing ServerNodeHandshake
, designed for the MSBuildServer
feature.
// Shared/CommunicationsUtilities.cs
// Handshake layout. Salt == hash(MSBuild tools directory)
$"{options} {salt} {fileVersionMajor} {fileVersionMinor} {fileVersionBuild} {fileVersionPrivate}".ToString(CultureInfo.InvariantCulture);
One current gap is that the salt is just a hash of the MSBuild tools directory, which would prevent a .NET Framework RAR task from communicating with a .NET Core RAR node (see the perf section for more details).
Node Client (RAR Task)
If EngineServices.IsOutOfProcRarNodeEnabled
is set and property /p:ResolveAssemblyReferencesOutOfProcess
is not overridden, the RAR task will create the client and forward execution to the server. If the connection fails for any reason (i.e. not a task failure), or exceeds the short timeout period, it will always fall back to executing in-proc.
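To make the fallback rule concrete, here is a minimal sketch with hypothetical delegate-based names. The point it illustrates is that only connection problems trigger the in-proc fallback; a task that ran on the node and returned false is a genuine task failure and is reported as-is.

```csharp
using System;
using System.IO;

// Hypothetical sketch of the decision logic; the real client uses pipes and packets.
public static class RarFallbackSketch
{
    public static bool Execute(
        Func<TimeSpan, bool> tryConnect,   // returns false or throws on connection failure / timeout
        Func<bool> executeOutOfProc,       // runs RAR on the node, returns the task result
        Func<bool> executeInProc,          // runs RAR locally, returns the task result
        TimeSpan connectTimeout)
    {
        bool connected;
        try
        {
            connected = tryConnect(connectTimeout);
        }
        catch (Exception e) when (e is IOException || e is TimeoutException)
        {
            connected = false;
        }

        // A false result from the node is a task failure, not a reason to retry in-proc.
        return connected ? executeOutOfProc() : executeInProc();
    }
}
```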
For performance purposes, each MSBuild node maintains its own global OutOfProcRarClient
which is registered via IBuildEngine4
. This is created when RAR runs for the first time on a given node, and reused on all subsequent runs.
// First run...
rarClient = new OutOfProcRarClient();
buildEngine.RegisterTaskObject(OutOfProcRarClientKey, rarClient, RegisteredTaskObjectLifetime.Build, allowEarlyCollection: false);
// Skip connection overhead + buffer allocations!
OutOfProcRarClient rarClient = (OutOfProcRarClient)buildEngine.GetRegisteredTaskObject(OutOfProcRarClientKey, RegisteredTaskObjectLifetime.Build);
This allows us to skip the handshake check for the duration of the build, and reuse buffers across every RAR task. We intentionally allow the client to be disposed between builds (e.g. when /nodereuse
is enabled) so we don't lock each pipe instance to a specific node for its lifetime.
Example 1: A multi-proc build. Each MSBuild node holds the connection until the overall build completes.
Example 2: Multiple single-proc MSBuild instances (as orchestrated by a higher-level build engine).
2: Serialization
Task parameters
RAR input / output parameters are dynamically discovered and accessed via reflection.
For context, the first idea here was to simply redeclare all of RAR's parameters and set up minimal shims for ITaskItem
inputs/outputs. While this is probably the most performant thing to do, it added a ton of boilerplate and looked like an easy point for regressions. For instance, what happens when RAR adds another input or output?
internal class RarNodeExecuteRequest : RarSerializableMessageBase
{
private bool _autoUnify;
private RarTaskItemInput[] _assemblies = [];
private string[] _candidateAssemblyFiles = [];
// ...
// 40+ lines of fields...
// ...
internal bool AutoUnify { get => _autoUnify; set => _autoUnify = value; }
internal RarTaskItemInput[] Assemblies { get => _assemblies; set => _assemblies = value; }
internal string[] CandidateAssemblyFiles { get => _candidateAssemblyFiles; set => _candidateAssemblyFiles = value; }
// ...
// Another 40+ lines of properties...
// ...
public override void Translate(ITranslator translator)
{
translator.Translate(ref _autoUnify);
translator.TranslateArray(ref _assemblies);
translator.Translate(ref _candidateAssemblyFiles);
// ...
// Not shown: Another 40+ lines, The response side, TaskItem wrappers, many more conversion utils...
// ...
}
}
...so I dug around MSBuild to see what could be reused. Luckily, TaskParameter
(currently used by the out-of-proc task host) already does the work of wrapping arbitrary types into an ITranslatable
object, and even provides an ITaskItem
shim via TaskParameterTaskItem
.
// Shared/TaskParameter.cs
internal class TaskParameter
{
private TaskParameterType _parameterType;
private TypeCode _parameterTypeCode;
private object _wrappedParameter;
// ...
private class TaskParameterTaskItem
{
// Handles cloning from arbitrary ITaskItem implementations
}
}
Note: This currently has some (fixable) perf gaps related to TaskItem cloning, but otherwise has minimal reflection overhead since it relies on optimized functions like GetType(). See: Pull 11638: Fix TaskParameterTaskItem serialization perf
This means we only need a scaled down version of the work TaskExecutionHost
does to dynamically discover RAR's input and output parameters. This involves reflection, but as long as the expensive Type.GetProperties()
is globally cached, this doesn't appear to cause a performance issue.
Implementation snippet:
class RarTaskParameters
{
private static readonly Lazy<PropertyInfo[]> s_outputProperties = new(() =>
[.. typeof(ResolveAssemblyReference).GetProperties(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
.Where(property => property.GetCustomAttribute<OutputAttribute>() != null && !property.Name.Equals(CopyLocalPropertyName, StringComparison.Ordinal))]);
private static readonly Lazy<PropertyInfo[]> s_inputProperties = new(() =>
[.. typeof(ResolveAssemblyReference).GetProperties(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
.Where(property => property.GetGetMethod() != null && property.GetSetMethod() != null)]);
// ...
{
// ...
// PropertyInfo.SetValue(ResolveAssemblyReference, TaskParameter)
property.SetValue(rarTask, parameter.WrappedParameter);
}
}
Note: Supposedly PropertyInfo.SetValue
can also be a perf slow point, but so far I haven't picked it up in profiling traces. If necessary, Delegate.CreateDelegate
is a commonly used alternative to cache the setters.
This eliminates the need for almost all custom serialization logic, outside of the main request / response wrappers and a couple special cases...
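On the PropertyInfo.SetValue note above: if setters ever do show up in traces, caching open-instance delegates is one option. The sketch below uses a stand-in SampleRarTask class and a hypothetical CachedSetters helper. It is strongly typed for brevity, whereas the real code would have to deal with object-typed wrapped parameters.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Stand-in for the task type; only here to make the sketch self-contained.
public sealed class SampleRarTask
{
    public bool AutoUnify { get; set; }
    public string[] CandidateAssemblyFiles { get; set; } = Array.Empty<string>();
}

// Hypothetical cache of property setters compiled to delegates.
public static class CachedSetters<TTarget> where TTarget : class
{
    private static readonly ConcurrentDictionary<string, Delegate> s_setters =
        new ConcurrentDictionary<string, Delegate>(StringComparer.Ordinal);

    public static Action<TTarget, TValue> Get<TValue>(string propertyName)
    {
        return (Action<TTarget, TValue>)s_setters.GetOrAdd(propertyName, name =>
        {
            MethodInfo setter = typeof(TTarget).GetProperty(name)?.GetSetMethod()
                ?? throw new ArgumentException("No public setter found for " + name);

            // Passing null as the first argument creates an open instance delegate,
            // so the target object is supplied on each call.
            return Delegate.CreateDelegate(typeof(Action<TTarget, TValue>), null, setter);
        });
    }
}
```

Usage would look like `CachedSetters<SampleRarTask>.Get<bool>("AutoUnify")(task, true)`. The reflection cost is paid once per property.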
Parameter-specific handling
CopyLocalFiles
So one "special" RAR output is the CopyLocalFiles
property. This is an ITaskItem[]
constructed by reference to other outputs produced by RAR.
Although the reference equality isn't important (TaskExecutionHost
will eventually duplicate these into separate ProjectItemInstance.TaskItem
instances when collecting outputs), serializing these duplicate outputs back to the client can add significantly to the payload size.
Luckily, CopyLocalFiles
is simply produced by looking for the key ItemMetadataNames.copyLocal
:
// Tasks/AssemblyDependency/ReferenceTable.cs
bool copyLocal = MetadataConversionUtilities.TryConvertItemMetadataToBool(
i,
ItemMetadataNames.copyLocal,
out bool found);
...so as an optimization, we can just rely on the metadata key and reconstruct the output on the client.
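A minimal sketch of the client-side reconstruction, under a few assumptions: SketchItem is a stand-in for ITaskItem, the metadata key is assumed to be the string "CopyLocal", and bool.TryParse is a simplification of the more lenient conversion RAR uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for ITaskItem so the sketch compiles without MSBuild references.
public sealed class SketchItem
{
    private readonly Dictionary<string, string> _metadata;

    public SketchItem(string itemSpec, Dictionary<string, string> metadata)
    {
        ItemSpec = itemSpec;
        _metadata = metadata;
    }

    public string ItemSpec { get; }

    public string GetMetadata(string name) =>
        _metadata.TryGetValue(name, out string value) ? value : string.Empty;
}

public static class CopyLocalReconstruction
{
    // Rebuilds the CopyLocalFiles output from the other deserialized outputs,
    // so the duplicated items never have to cross the pipe.
    public static SketchItem[] Rebuild(params SketchItem[][] otherOutputs) =>
        otherOutputs
            .SelectMany(items => items)
            .Where(item => bool.TryParse(item.GetMetadata("CopyLocal"), out bool copyLocal) && copyLocal)
            .ToArray();
}
```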
Path normalization
Any relative paths also need to be resolved to full paths against the directory of the project they are relative to - otherwise RAR will blow up at runtime, since the node's working directory is not the project's.
Currently this only applies to AppConfigFile
and StateFile
, and this is easily done ahead of time from the client.
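The client-side normalization could be as small as the following sketch. The helper name is hypothetical, and it assumes the project directory is known to the client at the point the request is built.

```csharp
using System.IO;

// Hypothetical helper applied to AppConfigFile and StateFile before sending the request.
public static class RarPathNormalizer
{
    public static string ToFullPath(string path, string projectDirectory)
    {
        // Leave unset and already-rooted paths untouched.
        if (string.IsNullOrEmpty(path) || Path.IsPathRooted(path))
        {
            return path;
        }

        return Path.GetFullPath(Path.Combine(projectDirectory, path));
    }
}
```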
3: Full Execution
By this point, all the plumbing exists for the task to execute end-to-end, so this PR is mostly just connecting the last dots.
Replaying log event messages
One challenge here is that log events need to be replayed on the client as if they occurred in-proc. So instead of passing a real IBuildEngine
implementation, the node passes a dummy which queues messages to send back to the client.
However, log events can explode the size of the final payload, and of course aren't well suited for string deduplication.
To solve this, the dummy build engine queues events into a Channel
to be processed asynchronously by the node. Once the capacity exceeds a set threshold, the current batch of messages is sent back to the client.
By buffering in chunks, log events add minimal overhead to the final payload, as the client will have already processed most messages by the time RAR completes.
Implementation snippets:
internal class RarNodeBuildEngine : IBuildEngine10
{
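// Channel<T> is abstract, so the instance comes from the factory method.
private readonly Channel<RarBuildEventArgs> _channel = Channel.CreateUnbounded<RarBuildEventArgs>(
new UnboundedChannelOptions { SingleReader = true, SingleWriter = true });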
public void LogMessageEvent(BuildMessageEventArgs e)
{
// Wraps event to support ITranslatable.
_channel.Writer.TryWrite(new RarBuildEventArgs(e));
}
internal ChannelReader<RarBuildEventArgs> BuildEventQueue => _channel.Reader;
}
class OutOfProcRarNodeEndpoint
{
// Synchronous buffer.
private readonly Queue<RarBuildEventArgs> _buildEventQueue;
// ...
// Runs in a task concurrent to the RAR task until the task completes.
{
RarBuildEventArgs buildEventArgs = await buildEngine.BuildEventQueue.ReadAsync(cancellationToken);
_buildEventQueue.Enqueue(buildEventArgs);
if (_buildEventQueue.Count == MaxBuildEventsBeforeFlush)
{
await _pipeServer.WritePacketAsync(new RarNodeBuildEvents(_buildEventQueue), cancellationToken);
_buildEventQueue.Clear();
}
}
}
4: Caching
File metadata caching
For a refresher, RAR currently has two levels of file metadata caching:
- Process-wide cache - shared across all executions.
- Task-local cache - serialized to disk.
class SystemState
{
// Process-wide
static readonly ConcurrentDictionary<string, FileState> s_processWideFileStateCache;
// Task-local
Dictionary<string, FileState> instanceLocalOutgoingFileStateCache;
// ...more local caches exist, but aren't accessed as frequently.
}
However, the more MSBuild's parallelism increases, the less effective this cache becomes:
- Scheduling is non-deterministic, so there's a high chance of a project building on a node which has not deserialized this project's cache file yet.
- On rebuilds, every node then needs to incrementally invalidate its cache, and write the updates back to disk.
- Any newly spun up processes always start with an empty cache.
We improve this scenario for free just by running RAR out-of-proc:
- Every RAR worker is able to share the same global cache instance.
- The cache lives across MSBuild invocations, even when running single-proc builds or with node reuse disabled.
- Important for higher level build engines like QuickBuild which can't use those features due to sandbox tracing + build caching.
However, we can take this even further now that we have a persistent node...
Task-level incremental caching
Given a set of task inputs, file system state, and environment, RAR will access the same set of files and produce the same outputs.
Normally, RAR does not support the concept of an incremental task. This is because its outputs are a function of both the inputs and the file system state, which makes it impossible to know which files will be accessed without first running the task.
However, if running RAR in a persistent node, we have the ability to track which files RAR accessed at execution time.
So when a RAR task completes, we store the request -> result pair into a cache which lives for the duration of the RAR node...
class RarExecutionCache
{
// TODO: Should probably key this by some project identifier as a simple way to evict old entries.
private ConcurrentDictionary<byte[], RarCachedExecution> _executionCache = new(/* custom byte[] comparer */);
}
...deferring deserialization until we've performed the lookup.
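The custom byte[] comparer mentioned in the snippet above might look like the sketch below. It assumes the key is the raw serialized request, and it uses an FNV-1a hash as an illustrative choice, not necessarily what the implementation uses.

```csharp
using System;
using System.Collections.Generic;

// Illustrative content-based comparer for byte[] dictionary keys.
public sealed class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public static readonly ByteArrayComparer Instance = new ByteArrayComparer();

    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;

        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }

        return true;
    }

    public int GetHashCode(byte[] obj)
    {
        // FNV-1a over the full buffer; wraps on overflow by design.
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in obj)
            {
                hash = (hash ^ b) * 16777619;
            }

            return (int)hash;
        }
    }
}
```

Without a comparer like this, a ConcurrentDictionary keyed by byte[] would fall back to reference equality and never hit. Hashing the full request on every lookup is linear in the payload size, so a precomputed or truncated hash may be worth considering for large requests.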
But first, we need to know the full set of files accessed + have a cache invalidation mechanism...
File access tracking
Tracking changes purely via file watchers is not performant when monitoring a large number of files and directories, and risks dropping events when the CPU is under load.
RAR has a fairly long list of potential IO calls:
// Tasks/AssemblyDependency/ResolveAssemblyReference.cs
Execute(
p => FileUtilities.FileExistsNoThrow(p),
p => FileUtilities.DirectoryExistsNoThrow(p),
(p, searchPattern) => [.. FileSystems.Default.EnumerateDirectories(p, searchPattern)],
AssemblyNameExtension.GetAssemblyNameEx,
AssemblyInformation.GetAssemblyMetadata,
#if FEATURE_WIN32_REGISTRY
(baseKey, subkey) => RegistryHelper.GetSubKeyNames(baseKey, subkey),
(baseKey, subkey) => RegistryHelper.GetDefaultValue(baseKey, subkey),
#endif
NativeMethodsShared.GetLastWriteFileUtcTime,
AssemblyInformation.GetRuntimeVersion,
#if FEATURE_WIN32_REGISTRY
(hive, view) => RegistryHelper.OpenBaseKey(hive, view),
#endif
GetAssemblyPathInGac,
AssemblyInformation.IsWinMDFile,
ReferenceTable.ReadMachineTypeFromPEHeader);
}
... but they nearly all pass through a NativeMethodsShared.GetLastWriteFileUtcTime
in order to check for cache invalidation.
This reduces to only two sets of accesses which need to be checked to determine whether a cached result is up to date:
- File modifications
- Check if existence state or the timestamp has changed
- Directory enumerations
- Check if existence state has changed
- Check if the set of files which were accessed has changed
- We do not need to check if their contents have changed since enumerations are only used to check for the existence of specific file extensions in neighboring files
And once again, SystemState
already tracks this for us:
// Tasks/AssemblyDependency/SystemState.cs
internal sealed class SystemState : StateFileBase, ITranslatable
{
// Validate file timestamps
internal Dictionary<string, FileState> instanceLocalFileStateCache = new Dictionary<string, FileState>(StringComparer.OrdinalIgnoreCase);
// Validate enumeration results
internal Dictionary<string, bool> instanceLocalDirectoryExists = new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
internal Dictionary<string, string[]> instanceLocalDirectories = new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase);
// ...
}
Invalidation
...which makes our up-to-date check as simple as just walking the caches:
// Up-to-date check
foreach (KeyValuePair<string, SystemState.FileState> kvp in cachedExecution.SystemState.instanceLocalFileStateCache)
{
if (NativeMethodsShared.GetLastWriteFileUtcTime(kvp.Key) != kvp.Value.LastModified)
{
return false;
}
}
// Repeat for instanceLocalDirectoryExists + instanceLocalDirectories
Note: A potential perf improvement here is to populate RAR's SystemState on cache miss with any IO that we've already done.
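For the "repeat for directories" step, here is a sketch of what the remaining two checks could look like. The file system calls are injected as delegates (a hypothetical shape chosen so the logic is testable in isolation), and the dictionary keys are assumed to be whatever SystemState used when it recorded the access.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the directory half of the up-to-date check.
public static class DirectoryUpToDateCheck
{
    public static bool IsUpToDate(
        IReadOnlyDictionary<string, bool> cachedDirectoryExists,
        IReadOnlyDictionary<string, string[]> cachedDirectories,
        Func<string, bool> directoryExists,
        Func<string, string[]> enumerate)
    {
        // 1. Existence state must be unchanged.
        foreach (KeyValuePair<string, bool> kvp in cachedDirectoryExists)
        {
            if (directoryExists(kvp.Key) != kvp.Value)
            {
                return false;
            }
        }

        // 2. The set of enumerated entries must be unchanged (order-insensitive).
        foreach (KeyValuePair<string, string[]> kvp in cachedDirectories)
        {
            string[] current = enumerate(kvp.Key);
            var currentSet = new HashSet<string>(current, StringComparer.OrdinalIgnoreCase);
            if (current.Length != kvp.Value.Length || !currentSet.SetEquals(kvp.Value))
            {
                return false;
            }
        }

        return true;
    }
}
```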
5: Environment snapshots (needs feedback)
The out-of-proc RAR task inherits the environment variables present at the time the RAR node launches. However, environment variables may change between builds.
Isolating this per-task is not so straightforward - RAR itself only has a few environment variables which could be refactored to take internal overrides, but there are many others in shared dependencies that are not easy to modify or comprehensively track.
RAR-specific env vars from a grep of AssemblyDependency/**
(Traits.Instance.EscapeHatches
+ Environment.GetEnvironmentVariable(_)
)
// Additional logging
"MSBUILDDUMPFRAMEWORKSUBSETLIST"
"MSBUILDLOGVERBOSERARSEARCHRESULTS"
// Flags for legacy behavior
"MSBUILDDISABLEGACRARCACHE"
"MSBUILDTARGETPATHFORRELATEDFILES"
// Opts out of a task-local metadata cache
"MSBUILDDONOTCACHERARASSEMBLYINFORMATION"
// Opts out of an additional process-wide cache
"MSBUILDDISABLEASSEMBLYFOLDERSEXCACHE"
Other nodes use a global snapshot system to temporarily swap state on the execution of a command. However, this works on the assumption of a single build command running on a node at any given time, while the RAR node can run multiple concurrent tasks.
// A snapshot of the node startup state.
_savedEnvironment = CommunicationsUtilities.GetEnvironmentVariables();
// Potential franken-vars...
foreach (KeyValuePair<string, string> kvp in _buildParameters.BuildProcessEnvironment)
{
Environment.SetEnvironmentVariable(kvp.Key, kvp.Value);
}
// ... and another potential race in cached traits.
Traits.UpdateFromEnvironment();
So as a default, we can assume that environment variables will mostly only change in between builds, and use a fallback mechanism to run the request on the client if necessary:
- If no tasks are currently running:
- On the first request, set the new environment state of the process.
- If any task is running:
- On incoming requests, check if the current set of environment variables matches
- If true, execute RAR on the node.
- Else, reject the request and allow the client to fall back to in-proc.
Notice the word default - higher-level build engines (e.g. QuickBuild) often set environment variables per-target, such as a log directory or file unique for the command. In that case, this system would leave the node running a single RAR task at a time. The higher-level build engine may also launch the RAR node directly to force global flags.
To solve this, I either propose an additional environment variable to opt-out of env var snapshotting, or scoping the snapshot to a subset of known environment variables relevant for RAR.
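The default accept/reject policy described above can be sketched as a small gate object. The names are hypothetical, variable names are compared case-insensitively (a Windows-style assumption), and the sketch only tracks the snapshot; a real node would also apply it to the process when the first request enters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical gate deciding whether a request may run on the node.
public sealed class EnvironmentGate
{
    private readonly object _lock = new object();
    private Dictionary<string, string> _current = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    private int _runningTasks;

    // Returns false when the client should fall back to in-proc execution.
    public bool TryEnter(IReadOnlyDictionary<string, string> requested)
    {
        lock (_lock)
        {
            if (_runningTasks == 0)
            {
                // No task is running: adopt the new environment state.
                _current = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
                foreach (KeyValuePair<string, string> kvp in requested)
                {
                    _current[kvp.Key] = kvp.Value;
                }
            }
            else if (!Matches(requested))
            {
                return false;
            }

            _runningTasks++;
            return true;
        }
    }

    public void Exit()
    {
        lock (_lock)
        {
            _runningTasks--;
        }
    }

    private bool Matches(IReadOnlyDictionary<string, string> requested)
    {
        if (requested.Count != _current.Count) return false;

        foreach (KeyValuePair<string, string> kvp in requested)
        {
            if (!_current.TryGetValue(kvp.Key, out string value) ||
                !string.Equals(value, kvp.Value, StringComparison.Ordinal))
            {
                return false;
            }
        }

        return true;
    }
}
```

Scoping the snapshot to a known subset of variables would amount to filtering `requested` before it reaches the gate.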
Working directory snapshot
From what I've found so far, this seems to be the only place in RAR (outside of handled task inputs) that might resolve a relative path based on the current directory:
// Tasks/AssemblyDependency/ReferenceTable.cs
private AssemblyNameExtension NameAssemblyFileReference(/* reference + file name */)
{
// ...
if (!Path.IsPathRooted(assemblyFileName))
{
reference.FullPath = Path.GetFullPath(assemblyFileName);
}
}
Though, even this can be traced back to the _assemblyFiles
input, so it may be as simple as ensuring the ItemSpec
is a full path:
// AssemblyDependency/ResolveAssemblyReference.cs
public ITaskItem[] AssemblyFiles
{
get { return _assemblyFiles; }
set { _assemblyFiles = value; }
}
6: Miscellaneous Notes
Dump for additional ideas that don't make sense to put into an item yet:
First run is slower on .NET Framework
Only on .NET Framework - the first run of a RAR task on a newly launched RAR node always appears to have an additional few hundred ms of overhead due to JIT-ing (verified via profiling). This does not reproduce at all on .NET Core, so I expect it's just inherent to the runtime.
I expect this may just auto-resolve with NGEN-ed MSBuild, and it only seems to be an issue on the first build - but it is something to watch out for. And it leads directly to my next point...
.NET Core RAR node is just faster
Before integrating the server handshake, I had mostly tested using .NET Framework MSBuild -> .NET Core RAR Node. After, I noticed that a .NET Core instance of the RAR node noticeably outperforms the .NET Framework node, in both overhead and E2E task execution time. So an area for perf improvement would be to default to launching a .NET Core RAR node (when available).
Main blockers I can think of:
- Locating the dotnet host + MSBuild assembly. The shared NodeLauncher only knows how to find the dotnet host if it exists in the parent tools directory, and of course it still needs to locate the target assembly. From what I can tell, BuildEnvironmentHelper also doesn't support this. Either way, this seems like the main problem that would need to be solved.
- Some code paths in RAR are feature-flagged for each runtime, so I'm not sure of the potential for output differences. Specifically:
- FEATURE_ASSEMBLYLOADCONTEXT
- FEATURE_WIN32_REGISTRY
- FEATURE_GAC
- As mentioned earlier, this would also need a change in the handshake definition specific to the RAR node, although that seems comparatively simple.