-
Notifications
You must be signed in to change notification settings - Fork 54
View alloc fixes for P3 #2980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
View alloc fixes for P3 #2980
Conversation
c2c072d
to
83bcc0c
Compare
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Using Repos:
Pull Request Author: tcclevenger |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
SCREAM_PullRequest_Autotester_Mappy # 5801 FAILED (click to see last 100 lines of console output)
SCREAM_PullRequest_Autotester_Weaver # 6027 FAILED (click to see last 100 lines of console output)
|
1a540cf
to
fa5351b
Compare
b29f360
to
acb9de5
Compare
acb9de5
to
3a85062
Compare
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Using Repos:
Pull Request Author: tcclevenger |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
SCREAM_PullRequest_Autotester_Mappy # 5808 FAILED (click to see last 100 lines of console output)
SCREAM_PullRequest_Autotester_Weaver # 6034 FAILED (click to see last 100 lines of console output)
|
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I defer to Luca though for a final say.
Do you want me to run a quick profile for each SK/not before/after this PR to be sure?
Also copying @ndkeen for awareness
@@ -361,7 +361,11 @@ class P3Microphysics : public AtmosphereProcess | |||
// 1d view scalar, size (ncol) | |||
static constexpr int num_1d_scalar = 2; //no 2d vars now, but keeping 1d struct for future expansion | |||
// 2d view packed, size (ncol, nlev_packs) | |||
#ifdef SCREAM_P3_SMALL_KERNELS | |||
static constexpr int num_2d_vector = 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose at some point we should try to see if all these views need to be different, or if we can have a few that alias each other. I guess there are a few that can alias each other, but knowing which ones would require an analysis of p3. Also, allowing aliasing put some constraints on the small kernel execution order, since rearranging them may cause data to be not what was expected.
Still, food for thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, almost certainly there could be some reduction. I'll make an issue.
I was following this PR, but was waiting for it to be ready for me to test. I can try using the profiler to verify we no longer see the cuda mallocs if ready? |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Using Repos:
Pull Request Author: tcclevenger |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
SCREAM_PullRequest_Autotester_Mappy # 5821 FAILED (click to see last 100 lines of console output)
SCREAM_PullRequest_Autotester_Weaver # 6045 PASSED (click to see last 100 lines of console output)
|
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Using Repos:
Pull Request Author: tcclevenger |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
|
Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR... |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
Using Repos:
Pull Request Author: tcclevenger |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: SCREAM_PullRequest_Autotester_Mappy
Jenkins Parameters
Build InformationTest Name: SCREAM_PullRequest_Autotester_Weaver
Jenkins Parameters
|
I haven't yet tried the profiler, but just looking at performance of some cases I've been running recently:
Would need more benchmarking to say more details. Not yet seeing an improvement overall, but could be timing noise. |
Thanks @ndkeen! Currently testing to see if this broke the pm-gpu tests. |
Ah, yep, both ne30, ne120, and ne256 cases that I tried above are not BFB with previous cases. |
fyi, i just experimented with using SMALL_KERNELS - turning on small kernels for both P3 and SHOC will yield BFB results with other cases. That is, in the checkout before this PR, both ne30 and ne120 case are BFB with/without small kernels (as we would hope). And then with checkout using this PR, same story. Not sure how much that helps the issue here. |
@ndkeen So SK is BFB with monolithic for this merge commit (and so then both are non-BFB with previous merge commit). That is good to know. |
In both small kernel and monolithic P3 there are view allocs during run phase, this PR removes these allocs.
Main changes:
Temporaries
struct for p3 temps when using SK. Then these views are added to the Buffer.get_latent_heat()
function and remove global views for latent heat values. Instead use physics constants where appropriate.(ncol, 2)
), use kokkos' scratch space to allocate a 2 bools per team.Viewing changes with "Hide whitespace" will help.