-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Implement extra resources for test actions #11963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
Adds support for specifying extra resources available for the bazel invocation and requirement for test actions. Using --local_extra_resources=<resource name>=<amount of resource> one can specify the resources available for tests. Specifying a tag "resources:<resource name>:<amount of resources>" for a test action will tell bazel how much of the said resource this test will require. Bazel will then limit concurrently running test actions using the same resource if there isn't enough of said resource available.
8f384dc
to
da56473
Compare
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
I think this is pretty cool. :) |
Does this relate #6477 at all? |
It seems to me that these are quite similar in the things they achieve. Or at least some of the proposed things. |
This PR has been sitting around for a while and deserves proper attention, apologies. I don't have enough expertise to feel I know how to review this well. But I should be able to route it to the right place. I'm going to chat with people to get this moving. Stay tuned. |
effectTags = {OptionEffectTag.UNKNOWN}, | ||
allowMultiple = true, | ||
help = | ||
"Explicitly set the number of extra recourses available to Bazel. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo in resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
+ "types of extra resources. Bazel will limit concurrently running test actions " | ||
+ "based on the available extra resources and the extra resources required " | ||
+ "by the test actions. " | ||
+ "Note: This is a no-op if --local_resources is set.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether Bazel should fail when both are specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't matter, --local_resources
has been removed since the revision this is based on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
/** How many extra resources an action requires for execution. */ | ||
public static final ParseableRequirement RESOURCES = | ||
ParseableRequirement.create( | ||
"resources:<str>:<int>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why an int rather than a float? CPU resources are already floats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just to match the cpu:<int>
code above. Looks like there is some inconsistency around what can be a float and what can be an int:
- Execution requirements:
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
parsesint
. - Test tags:
src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java
parsesFloat
. - Command line flags:
src/main/java/com/google/devtools/build/lib/util/ResourceConverter.java
parsesint
or aFloat
in the expression multiplier:
/** Description of the accepted inputs to the converter. */
public static final String FLAG_SYNTAX =
"an integer, or a keyword (\"auto\", \"HOST_CPUS\", \"HOST_RAM\"), optionally followed by "
+ "an operation ([-|*]<float>) eg. \"auto\", \"HOST_CPUS*.5\"";
In its current form, this change is matching the existing code: integers in execution requirements, floats in test tags, and integer on the command line. But I'm open to suggestions if we want to use slightly different conventions than what is used for CPU resources.
|
||
// De-and-reserialize & compare to only allow canonical integer formats. | ||
if (!Integer.toString(value).equals(resourceCount)) { | ||
return "must be in canonical format (e.g. '4' instead of '+04')"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why must it be in canonical format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were just trying to match behavior of the cpu:N
tag validation above. Perhaps there's a reason why that validation was originally added for the cpu:N
tag...?
return "can't be zero or negative"; | ||
} | ||
|
||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to return value
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a validator lambda that should return null
if validation passes.
@@ -322,6 +343,13 @@ private synchronized CountDownLatch acquire(ResourceSet resources) { | |||
private synchronized boolean release(ResourceSet resources) { | |||
usedCpu -= resources.getCpuUsage(); | |||
usedRam -= resources.getMemoryMb(); | |||
|
|||
for (Map.Entry<String, Float> resource : resources.getExtraResourceUsage().entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would look nicer as a forEach().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -333,6 +361,19 @@ private synchronized boolean release(ResourceSet resources) { | |||
if (usedRam < epsilon) { | |||
usedRam = 0; | |||
} | |||
|
|||
Set<String> toRemove = new HashSet<>(); | |||
for (Map.Entry<String, Float> resource : usedExtraResources.entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also use forEach here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -150,23 +151,21 @@ public ResourceSet getLocalResourceUsage(Label label, boolean usingLocalTestJobs | |||
ResourceSet testResourcesFromSize = TestTargetProperties.getResourceSetFromSize(size); | |||
|
|||
// Tests can override their CPU reservation with a "cpus:<n>" tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are a sign that the relevant portions would prefer to live in separate functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
@Override | ||
public Map.Entry<String, Float> convert(String input) throws OptionsParsingException { | ||
int pos = input.indexOf("="); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use a static Splitter instead of hand-coding. Alternatively, make this a subclass of AssignmentConverter. Or even better, make AssignmentConverter a generic method with a converter lambda.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
+ "types of extra resources. Bazel will limit concurrently running test actions " | ||
+ "based on the available extra resources and the extra resources required " | ||
+ "by the test actions. " | ||
+ "Note: This is a no-op if --local_resources is set.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't matter, --local_resources
has been removed since the revision this is based on.
effectTags = {OptionEffectTag.UNKNOWN}, | ||
allowMultiple = true, | ||
help = | ||
"Explicitly set the number of extra recourses available to Bazel. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to mention that CPU, RAM, and tests cannot be specified here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -419,7 +419,7 @@ private Spawn newCustomSpawn(String mnemonic, ImmutableMap<String, String> execu | |||
executionInfo, | |||
EmptyRunfilesSupplier.INSTANCE, | |||
action, | |||
ResourceSet.create(1, 0, 0)); | |||
ResourceSet.create(1, 0, null, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A number of these tests could use the two-parameter overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like many actually can't, because they also specify a nonzero localTestCount. In fact, looks like this was the only place where I could use ResourceSet.createWithRamCpu
instead.
Is this still under review? What's the status? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@exoson is no longer working with us, I'm picking this up.
A new PR is here: #13996 - I think this one could be closed.
I've tried to address all comments from @larsrc-google in the new PR.
return "can't be zero or negative"; | ||
} | ||
|
||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a validator lambda that should return null
if validation passes.
|
||
// De-and-reserialize & compare to only allow canonical integer formats. | ||
if (!Integer.toString(value).equals(resourceCount)) { | ||
return "must be in canonical format (e.g. '4' instead of '+04')"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were just trying to match behavior of the cpu:N
tag validation above. Perhaps there's a reason why that validation was originally added for the cpu:N
tag...?
/** How many extra resources an action requires for execution. */ | ||
public static final ParseableRequirement RESOURCES = | ||
ParseableRequirement.create( | ||
"resources:<str>:<int>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just to match the cpu:<int>
code above. Looks like there is some inconsistency around what can be a float and what can be an int:
- Execution requirements:
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
parsesint
. - Test tags:
src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java
parsesFloat
. - Command line flags:
src/main/java/com/google/devtools/build/lib/util/ResourceConverter.java
parsesint
or aFloat
in the expression multiplier:
/** Description of the accepted inputs to the converter. */
public static final String FLAG_SYNTAX =
"an integer, or a keyword (\"auto\", \"HOST_CPUS\", \"HOST_RAM\"), optionally followed by "
+ "an operation ([-|*]<float>) eg. \"auto\", \"HOST_CPUS*.5\"";
In its current form, this change is matching the existing code: integers in execution requirements, floats in test tags, and integer on the command line. But I'm open to suggestions if we want to use slightly different conventions than what is used for CPU resources.
@@ -135,7 +159,7 @@ public ResourceSet convert(String input) throws OptionsParsingException { | |||
if (memoryMb <= 0.0 || cpuUsage <= 0.0) { | |||
throw new OptionsParsingException("All resource values must be positive"); | |||
} | |||
return create(memoryMb, cpuUsage, Integer.MAX_VALUE); | |||
return create(memoryMb, cpuUsage, null, Integer.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean supporting --local_extra_resources=foo=2,bar=4
? If so, can't we already get the same behavior just with --local_extra_resources=foo=2 --local_extra_resources=bar=4
?
effectTags = {OptionEffectTag.UNKNOWN}, | ||
allowMultiple = true, | ||
help = | ||
"Explicitly set the number of extra recourses available to Bazel. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -333,6 +361,19 @@ private synchronized boolean release(ResourceSet resources) { | |||
if (usedRam < epsilon) { | |||
usedRam = 0; | |||
} | |||
|
|||
Set<String> toRemove = new HashSet<>(); | |||
for (Map.Entry<String, Float> resource : usedExtraResources.entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
this.memoryMb = memoryMb; | ||
this.cpuUsage = cpuUsage; | ||
if (extraResourceUsage == null) { | ||
this.extraResourceUsage = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latest version I converted the extraResourceUsage from Map to an ImmutableMap, and using ImmutableMap.of()
here.
/** Returns the local test count used. */ | ||
public int getLocalTestCount() { | ||
return localTestCount; | ||
} | ||
|
||
@Override | ||
public String toString() { | ||
StringBuilder sb = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -150,23 +151,21 @@ public ResourceSet getLocalResourceUsage(Label label, boolean usingLocalTestJobs | |||
ResourceSet testResourcesFromSize = TestTargetProperties.getResourceSetFromSize(size); | |||
|
|||
// Tests can override their CPU reservation with a "cpus:<n>" tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -419,7 +419,7 @@ private Spawn newCustomSpawn(String mnemonic, ImmutableMap<String, String> execu | |||
executionInfo, | |||
EmptyRunfilesSupplier.INSTANCE, | |||
action, | |||
ResourceSet.create(1, 0, 0)); | |||
ResourceSet.create(1, 0, null, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like many actually can't, because they also specify a nonzero localTestCount. In fact, looks like this was the only place where I could use ResourceSet.createWithRamCpu
instead.
We are closing this PR with respect to the comment mentioned over here #11963 (review) and also the new reference PR (#13996) . Please feel free to reach to reopen or for any further update. Thanks! |
Adds support for specifying extra resources available for the bazel
invocation and requirement for test actions. Using
--local_extra_resources=<resource name>=<amount of resource> one can
specify the resources available for tests. Specifying a tag
"resources:<resource name>:<amount of resources>" for a test action will
tell bazel how much of the said resource this test will require. Bazel
will then limit concurrently running test actions using the same
resource if there isn't enough of said resource available.
This can be used to arbitrate access to additional resources such as GPUs or embedded developer boards.